Message-ID: <399e8b83-ee7b-1b24-bd6b-74d274fcda46@linux.alibaba.com>
Date: Mon, 16 Jan 2023 10:58:50 +0800
Subject: Re: [PATCH v7 1/9] perf pmu: Add #slots literal support for arm64
To: Ian Rogers, John Garry
Cc: Xing Zhengjun, Will Deacon, James Clark, Mike Leach, Leo Yan, linux-arm-kernel@lists.infradead.org, linux-perf-users@vger.kernel.org,
linux-kernel@vger.kernel.org, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Andrew Kilroy, Shuai Xue, Zhuo Song
References: <1673601740-122788-1-git-send-email-renyu.zj@linux.alibaba.com> <1673601740-122788-2-git-send-email-renyu.zj@linux.alibaba.com>
From: Jing Zhang
X-Mailing-List: linux-kernel@vger.kernel.org

On 2023/1/15 6:15 AM, Ian Rogers wrote:
> On Fri, Jan 13, 2023 at 1:22 AM Jing Zhang wrote:
>>
>> The slots in each architecture may be different, so add #slots literal
>> to obtain the slots of different architectures, and the #slots can be
>> applied in the metric. Currently, The #slots just support for arm64,
>> and other architectures will return NAN.
>>
>> On arm64, the value of slots is from the register PMMIR_EL1.SLOT, which
>> I can read in /sys/bus/event_source/device/armv8_pmuv3_*/caps/slots.
>> PMMIR_EL1.SLOT might read as zero if the PMU version is lower than
>> ID_AA64DFR0_EL1_PMUVer_V3P4 or the STALL_SLOT event is not implemented.
>>
>> Signed-off-by: Jing Zhang
>> ---
>>  tools/perf/arch/arm64/util/pmu.c | 34 ++++++++++++++++++++++++++++++++--
>>  tools/perf/util/expr.c           |  5 +++++
>>  tools/perf/util/pmu.c            |  6 ++++++
>>  tools/perf/util/pmu.h            |  1 +
>>  4 files changed, 44 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/perf/arch/arm64/util/pmu.c b/tools/perf/arch/arm64/util/pmu.c
>> index 477e513..5f8667b 100644
>> --- a/tools/perf/arch/arm64/util/pmu.c
>> +++ b/tools/perf/arch/arm64/util/pmu.c
>> @@ -3,8 +3,9 @@
>>  #include
>>  #include "../../../util/cpumap.h"
>>  #include "../../../util/pmu.h"
>> +#include
>>
>> -const struct pmu_events_table *pmu_events_table__find(void)
>> +static struct perf_pmu *pmu_core__find_same(void)
>
> I'm not sure "find_same" is the best name here. I suspect it should be
> "find_core_pmu" which would agree with is_arm_pmu_core. Unfortunately
> "core" has become an overloaded term sometimes used interchangeably
> with CPU, hyperthread or SMT thread, it was a model name for Intel and
> it is used to distinguish a set of SMT threads running together from a
> single one. Anyway, for consistency I think perf_pmu__find_core_pmu is
> the most appropriate name (or pmu__find_core_pmu, I'm not sure why we
> get the extra perf_ prefix sometimes, in general that indicates the
> functionality is in libperf).
>

I originally chose "pmu_core__find_same" to indicate that we only deal
with homogeneous cores. Also, most of the static functions in
tools/perf/util/pmu.c use the "pmu_" prefix, so maybe we could use
"pmu_find_same_core_pmu"?

Ian, John, what do you think?

Thanks,
Jing

> Aside from that, lgtm. Thanks,
> Ian
>
>>  {
>>  	struct perf_pmu *pmu = NULL;
>>
>> @@ -19,8 +20,37 @@ const struct pmu_events_table *pmu_events_table__find(void)
>>  		if (pmu->cpus->nr != cpu__max_cpu().cpu)
>>  			return NULL;
>>
>> -		return perf_pmu__find_table(pmu);
>> +		return pmu;
>>  	}
>>
>>  	return NULL;
>>  }
>> +
>> +const struct pmu_events_table *pmu_events_table__find(void)
>> +{
>> +	struct perf_pmu *pmu = pmu_core__find_same();
>> +
>> +	if (pmu)
>> +		return perf_pmu__find_table(pmu);
>> +
>> +	return NULL;
>> +}
>> +
>> +double perf_pmu__cpu_slots_per_cycle(void)
>> +{
>> +	char path[PATH_MAX];
>> +	unsigned long long slots = 0;
>> +	struct perf_pmu *pmu = pmu_core__find_same();
>> +
>> +	if (pmu) {
>> +		scnprintf(path, PATH_MAX,
>> +			EVENT_SOURCE_DEVICE_PATH "%s/caps/slots", pmu->name);
>> +		/*
>> +		 * The value of slots is not greater than 32 bits, but sysfs__read_int
>> +		 * can't read value with 0x prefix, so use sysfs__read_ull instead.
>> +		 */
>> +		sysfs__read_ull(path, &slots);
>> +	}
>> +
>> +	return (double)slots;
>> +}
>> diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c
>> index 00dcde3..9d3076a 100644
>> --- a/tools/perf/util/expr.c
>> +++ b/tools/perf/util/expr.c
>> @@ -19,6 +19,7 @@
>>  #include
>>  #include
>>  #include
>> +#include "pmu.h"
>>
>>  #ifdef PARSER_DEBUG
>>  extern int expr_debug;
>> @@ -448,6 +449,10 @@ double expr__get_literal(const char *literal, const struct expr_scanner_ctx *ctx
>>  		result = topology->core_cpus_lists;
>>  		goto out;
>>  	}
>> +	if (!strcmp("#slots", literal)) {
>> +		result = perf_pmu__cpu_slots_per_cycle() ?: NAN;
>> +		goto out;
>> +	}
>>
>>  	pr_err("Unrecognized literal '%s'", literal);
>>  out:
>> diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
>> index 2bdeb89..cbb4fbf 100644
>> --- a/tools/perf/util/pmu.c
>> +++ b/tools/perf/util/pmu.c
>> @@ -19,6 +19,7 @@
>>  #include
>>  #include
>>  #include
>> +#include
>>  #include "debug.h"
>>  #include "evsel.h"
>>  #include "pmu.h"
>> @@ -1993,3 +1994,8 @@ int perf_pmu__cpus_match(struct perf_pmu *pmu, struct perf_cpu_map *cpus,
>>  	*ucpus_ptr = unmatched_cpus;
>>  	return 0;
>>  }
>> +
>> +double __weak perf_pmu__cpu_slots_per_cycle(void)
>> +{
>> +	return NAN;
>> +}
>> diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
>> index 69ca000..fd414ba 100644
>> --- a/tools/perf/util/pmu.h
>> +++ b/tools/perf/util/pmu.h
>> @@ -259,4 +259,5 @@ int perf_pmu__cpus_match(struct perf_pmu *pmu, struct perf_cpu_map *cpus,
>>
>>  char *pmu_find_real_name(const char *name);
>>  char *pmu_find_alias_name(const char *name);
>> +double perf_pmu__cpu_slots_per_cycle(void);
>>  #endif /* __PMU_H */
>> --
>> 1.8.3.1
>>
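
For illustration only, and not part of the patch: a minimal standalone
sketch of the behaviour the quoted code aims for, namely accepting a
slots value with or without a 0x prefix and mapping zero or unreadable
input to NAN. The helper names slots_literal_from_text() and
slots_literal_selfcheck() are hypothetical, and using strtoull() with
base 0 is an assumption made here to get prefix-aware parsing in plain
C; it does not use the perf sysfs helpers.

```c
#include <assert.h>
#include <errno.h>
#include <math.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not a perf API: convert the text of a caps/slots
 * file into the value a "#slots"-style literal would yield.
 */
static double slots_literal_from_text(const char *text)
{
	char *end;
	unsigned long long slots;

	errno = 0;
	/* Base 0 lets strtoull() accept both "8" and "0x8". */
	slots = strtoull(text, &end, 0);
	if (errno || end == text)
		return NAN;	/* unreadable or empty: treat as unknown */

	/* Zero means the slot count is not advertised. */
	return slots ? (double)slots : NAN;
}

/* Small self-check covering hex, decimal, zero and empty input. */
static void slots_literal_selfcheck(void)
{
	assert(slots_literal_from_text("0x8\n") == 8.0);
	assert(slots_literal_from_text("6") == 6.0);
	assert(isnan(slots_literal_from_text("0x0\n")));
	assert(isnan(slots_literal_from_text("")));
}
```

Returning NAN rather than 0 for a missing slot count means a metric
that divides by #slots evaluates to NAN instead of dividing by zero.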