2022-07-21 06:59:55

by Xing Zhengjun

[permalink] [raw]
Subject: [PATCH v4 0/5] Add perf stat default events for hybrid machines

From: Zhengjun Xing <[email protected]>

The patch series is to clean up the existing perf stat default and support
the perf metrics Topdown for the p-core PMU in the perf stat default. The
first 4 patches are the clean-up patch and fixing the "--detailed" issue.
The last patch adds support for the perf metrics Topdown, the perf metrics
Topdown support for e-core PMU will be implemented later separately.

Kan Liang (4):
perf stat: Revert "perf stat: Add default hybrid events"
perf evsel: Add arch_evsel__hw_name()
perf evlist: Always use arch_evlist__add_default_attrs()
perf x86 evlist: Add default hybrid events for perf stat

Zhengjun Xing (1):
perf stat: Add topdown metrics in the default perf stat on the hybrid
machine

tools/perf/arch/x86/util/evlist.c | 64 +++++++++++++++++++++++++-----
tools/perf/arch/x86/util/evsel.c | 20 ++++++++++
tools/perf/arch/x86/util/topdown.c | 51 ++++++++++++++++++++++++
tools/perf/arch/x86/util/topdown.h | 1 +
tools/perf/builtin-stat.c | 50 ++++-------------------
tools/perf/util/evlist.c | 11 +++--
tools/perf/util/evlist.h | 9 ++++-
tools/perf/util/evsel.c | 7 +++-
tools/perf/util/evsel.h | 1 +
tools/perf/util/stat-display.c | 2 +-
tools/perf/util/topdown.c | 7 ++++
tools/perf/util/topdown.h | 3 +-
12 files changed, 166 insertions(+), 60 deletions(-)

--
2.25.1


2022-07-21 07:00:49

by Xing Zhengjun

[permalink] [raw]
Subject: [PATCH v4 2/5] perf evsel: Add arch_evsel__hw_name()

From: Kan Liang <[email protected]>

The commit 55bcf6ef314a ("perf: Extend PERF_TYPE_HARDWARE and
PERF_TYPE_HW_CACHE") extends the two types to become PMU aware types for
a hybrid system. However, current evsel__hw_name doesn't take the PMU
type into account. It mistakenly returns the "unknown-hardware" for the
hardware event with a specific PMU type.

Add an Arch specific arch_evsel__hw_name() to specially handle the PMU
aware hardware event.

Currently, the extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE is only
supported by X86. Only implement the specific arch_evsel__hw_name() for
X86 in the patch.

Nothing is changed for the other Archs.

Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Zhengjun Xing <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
---
Change log:
v4:
* Adds Acked-by from Namhyung Kim <[email protected]>
* Rebase code to the latest perf/core branch
v3:
* no change since v1.

tools/perf/arch/x86/util/evsel.c | 20 ++++++++++++++++++++
tools/perf/util/evsel.c | 7 ++++++-
tools/perf/util/evsel.h | 1 +
3 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/tools/perf/arch/x86/util/evsel.c b/tools/perf/arch/x86/util/evsel.c
index 882c1a8c1ded..ea3972d785d1 100644
--- a/tools/perf/arch/x86/util/evsel.c
+++ b/tools/perf/arch/x86/util/evsel.c
@@ -66,6 +66,26 @@ bool arch_evsel__must_be_in_group(const struct evsel *evsel)
strcasestr(evsel->name, "topdown"));
}

+int arch_evsel__hw_name(struct evsel *evsel, char *bf, size_t size)
+{
+ u64 event = evsel->core.attr.config & PERF_HW_EVENT_MASK;
+ u64 pmu = evsel->core.attr.config >> PERF_PMU_TYPE_SHIFT;
+ const char *event_name;
+
+ if (event < PERF_COUNT_HW_MAX && evsel__hw_names[event])
+ event_name = evsel__hw_names[event];
+ else
+ event_name = "unknown-hardware";
+
+ /* The PMU type is not required for the non-hybrid platform. */
+ if (!pmu)
+ return scnprintf(bf, size, "%s", event_name);
+
+ return scnprintf(bf, size, "%s/%s/",
+ evsel->pmu_name ? evsel->pmu_name : "cpu",
+ event_name);
+}
+
static void ibs_l3miss_warn(void)
{
pr_warning(
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 8fea51a9cd90..8199774a1dc2 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -593,9 +593,14 @@ static int evsel__add_modifiers(struct evsel *evsel, char *bf, size_t size)
return r;
}

+int __weak arch_evsel__hw_name(struct evsel *evsel, char *bf, size_t size)
+{
+ return scnprintf(bf, size, "%s", __evsel__hw_name(evsel->core.attr.config));
+}
+
static int evsel__hw_name(struct evsel *evsel, char *bf, size_t size)
{
- int r = scnprintf(bf, size, "%s", __evsel__hw_name(evsel->core.attr.config));
+ int r = arch_evsel__hw_name(evsel, bf, size);
return r + evsel__add_modifiers(evsel, bf + r, size - r);
}

diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 92bed8e2f7d8..9ec48049ee68 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -271,6 +271,7 @@ extern const char *const evsel__hw_names[PERF_COUNT_HW_MAX];
extern const char *const evsel__sw_names[PERF_COUNT_SW_MAX];
extern char *evsel__bpf_counter_events;
bool evsel__match_bpf_counter_events(const char *name);
+int arch_evsel__hw_name(struct evsel *evsel, char *bf, size_t size);

int __evsel__hw_cache_type_op_res_name(u8 type, u8 op, u8 result, char *bf, size_t size);
const char *evsel__name(struct evsel *evsel);
--
2.25.1

2022-07-21 07:03:14

by Xing Zhengjun

[permalink] [raw]
Subject: [PATCH v4 5/5] perf stat: Add topdown metrics in the default perf stat on the hybrid machine

From: Zhengjun Xing <[email protected]>

Topdown metrics are missed in the default perf stat on the hybrid machine,
add Topdown metrics in default perf stat for hybrid systems.

Currently, we support the perf metrics Topdown for the p-core PMU in the
perf stat default, the perf metrics Topdown support for e-core PMU will be
implemented later separately. Refactor the code adds two x86 specific
functions. Widen the size of the event name column by 7 chars, so that all
metrics after the "#" become aligned again.

The perf metrics topdown feature is supported on the cpu_core of ADL. The
dedicated perf metrics counter and the fixed counter 3 are used for the
topdown events. Adding the topdown metrics doesn't trigger multiplexing.

Before:

# ./perf stat -a true

Performance counter stats for 'system wide':

53.70 msec cpu-clock # 25.736 CPUs utilized
80 context-switches # 1.490 K/sec
24 cpu-migrations # 446.951 /sec
52 page-faults # 968.394 /sec
2,788,555 cpu_core/cycles/ # 51.931 M/sec
851,129 cpu_atom/cycles/ # 15.851 M/sec
2,974,030 cpu_core/instructions/ # 55.385 M/sec
416,919 cpu_atom/instructions/ # 7.764 M/sec
586,136 cpu_core/branches/ # 10.916 M/sec
79,872 cpu_atom/branches/ # 1.487 M/sec
14,220 cpu_core/branch-misses/ # 264.819 K/sec
7,691 cpu_atom/branch-misses/ # 143.229 K/sec

0.002086438 seconds time elapsed

After:

# ./perf stat -a true

Performance counter stats for 'system wide':

61.39 msec cpu-clock # 24.874 CPUs utilized
76 context-switches # 1.238 K/sec
24 cpu-migrations # 390.968 /sec
52 page-faults # 847.097 /sec
2,753,695 cpu_core/cycles/ # 44.859 M/sec
903,899 cpu_atom/cycles/ # 14.725 M/sec
2,927,529 cpu_core/instructions/ # 47.690 M/sec
428,498 cpu_atom/instructions/ # 6.980 M/sec
581,299 cpu_core/branches/ # 9.470 M/sec
83,409 cpu_atom/branches/ # 1.359 M/sec
13,641 cpu_core/branch-misses/ # 222.216 K/sec
8,008 cpu_atom/branch-misses/ # 130.453 K/sec
14,761,308 cpu_core/slots/ # 240.466 M/sec
3,288,625 cpu_core/topdown-retiring/ # 22.3% retiring
1,323,323 cpu_core/topdown-bad-spec/ # 9.0% bad speculation
5,477,470 cpu_core/topdown-fe-bound/ # 37.1% frontend bound
4,679,199 cpu_core/topdown-be-bound/ # 31.7% backend bound
646,194 cpu_core/topdown-heavy-ops/ # 4.4% heavy operations # 17.9% light operations
1,244,999 cpu_core/topdown-br-mispredict/ # 8.4% branch mispredict # 0.5% machine clears
3,891,800 cpu_core/topdown-fetch-lat/ # 26.4% fetch latency # 10.7% fetch bandwidth
1,879,034 cpu_core/topdown-mem-bound/ # 12.7% memory bound # 19.0% Core bound

0.002467839 seconds time elapsed

Signed-off-by: Zhengjun Xing <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
---
Change log:
v4:
* Adds Acked-by from Namhyung Kim <[email protected]>
v3:
* Make the pr_warning in one line.
v2:
* Refactor arch_get_topdown_pmu_name() as Namhyung's suggestion.

tools/perf/arch/x86/util/evlist.c | 13 ++------
tools/perf/arch/x86/util/topdown.c | 51 ++++++++++++++++++++++++++++++
tools/perf/arch/x86/util/topdown.h | 1 +
tools/perf/builtin-stat.c | 14 ++------
tools/perf/util/stat-display.c | 2 +-
tools/perf/util/topdown.c | 7 ++++
tools/perf/util/topdown.h | 3 +-
7 files changed, 66 insertions(+), 25 deletions(-)

diff --git a/tools/perf/arch/x86/util/evlist.c b/tools/perf/arch/x86/util/evlist.c
index c83f8c11735f..cb59ce9b9638 100644
--- a/tools/perf/arch/x86/util/evlist.c
+++ b/tools/perf/arch/x86/util/evlist.c
@@ -3,12 +3,9 @@
#include "util/pmu.h"
#include "util/evlist.h"
#include "util/parse-events.h"
-#include "topdown.h"
#include "util/event.h"
#include "util/pmu-hybrid.h"
-
-#define TOPDOWN_L1_EVENTS "{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound}"
-#define TOPDOWN_L2_EVENTS "{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound,topdown-heavy-ops,topdown-br-mispredict,topdown-fetch-lat,topdown-mem-bound}"
+#include "topdown.h"

static int ___evlist__add_default_attrs(struct evlist *evlist,
struct perf_event_attr *attrs,
@@ -65,13 +62,7 @@ int arch_evlist__add_default_attrs(struct evlist *evlist,
if (nr_attrs)
return ___evlist__add_default_attrs(evlist, attrs, nr_attrs);

- if (!pmu_have_event("cpu", "slots"))
- return 0;
-
- if (pmu_have_event("cpu", "topdown-heavy-ops"))
- return parse_events(evlist, TOPDOWN_L2_EVENTS, NULL);
- else
- return parse_events(evlist, TOPDOWN_L1_EVENTS, NULL);
+ return topdown_parse_events(evlist);
}

struct evsel *arch_evlist__leader(struct list_head *list)
diff --git a/tools/perf/arch/x86/util/topdown.c b/tools/perf/arch/x86/util/topdown.c
index f81a7cfe4d63..67c524324125 100644
--- a/tools/perf/arch/x86/util/topdown.c
+++ b/tools/perf/arch/x86/util/topdown.c
@@ -3,9 +3,17 @@
#include "api/fs/fs.h"
#include "util/pmu.h"
#include "util/topdown.h"
+#include "util/evlist.h"
+#include "util/debug.h"
+#include "util/pmu-hybrid.h"
#include "topdown.h"
#include "evsel.h"

+#define TOPDOWN_L1_EVENTS "{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound}"
+#define TOPDOWN_L1_EVENTS_CORE "{slots,cpu_core/topdown-retiring/,cpu_core/topdown-bad-spec/,cpu_core/topdown-fe-bound/,cpu_core/topdown-be-bound/}"
+#define TOPDOWN_L2_EVENTS "{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound,topdown-heavy-ops,topdown-br-mispredict,topdown-fetch-lat,topdown-mem-bound}"
+#define TOPDOWN_L2_EVENTS_CORE "{slots,cpu_core/topdown-retiring/,cpu_core/topdown-bad-spec/,cpu_core/topdown-fe-bound/,cpu_core/topdown-be-bound/,cpu_core/topdown-heavy-ops/,cpu_core/topdown-br-mispredict/,cpu_core/topdown-fetch-lat/,cpu_core/topdown-mem-bound/}"
+
/* Check whether there is a PMU which supports the perf metrics. */
bool topdown_sys_has_perf_metrics(void)
{
@@ -73,3 +81,46 @@ bool arch_topdown_sample_read(struct evsel *leader)

return false;
}
+
+const char *arch_get_topdown_pmu_name(struct evlist *evlist, bool warn)
+{
+ const char *pmu_name;
+
+ if (!perf_pmu__has_hybrid())
+ return "cpu";
+
+ if (!evlist->hybrid_pmu_name) {
+ if (warn)
+ pr_warning("WARNING: default to use cpu_core topdown events\n");
+ evlist->hybrid_pmu_name = perf_pmu__hybrid_type_to_pmu("core");
+ }
+
+ pmu_name = evlist->hybrid_pmu_name;
+
+ return pmu_name;
+}
+
+int topdown_parse_events(struct evlist *evlist)
+{
+ const char *topdown_events;
+ const char *pmu_name;
+
+ if (!topdown_sys_has_perf_metrics())
+ return 0;
+
+ pmu_name = arch_get_topdown_pmu_name(evlist, false);
+
+ if (pmu_have_event(pmu_name, "topdown-heavy-ops")) {
+ if (!strcmp(pmu_name, "cpu_core"))
+ topdown_events = TOPDOWN_L2_EVENTS_CORE;
+ else
+ topdown_events = TOPDOWN_L2_EVENTS;
+ } else {
+ if (!strcmp(pmu_name, "cpu_core"))
+ topdown_events = TOPDOWN_L1_EVENTS_CORE;
+ else
+ topdown_events = TOPDOWN_L1_EVENTS;
+ }
+
+ return parse_events(evlist, topdown_events, NULL);
+}
diff --git a/tools/perf/arch/x86/util/topdown.h b/tools/perf/arch/x86/util/topdown.h
index 46bf9273e572..7eb81f042838 100644
--- a/tools/perf/arch/x86/util/topdown.h
+++ b/tools/perf/arch/x86/util/topdown.h
@@ -3,5 +3,6 @@
#define _TOPDOWN_H 1

bool topdown_sys_has_perf_metrics(void);
+int topdown_parse_events(struct evlist *evlist);

#endif
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 837c3ca91af1..c6b68be78f8c 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -71,6 +71,7 @@
#include "util/bpf_counter.h"
#include "util/iostat.h"
#include "util/pmu-hybrid.h"
+ #include "util/topdown.h"
#include "asm/bug.h"

#include <linux/time64.h>
@@ -1858,22 +1859,11 @@ static int add_default_attributes(void)
unsigned int max_level = 1;
char *str = NULL;
bool warn = false;
- const char *pmu_name = "cpu";
+ const char *pmu_name = arch_get_topdown_pmu_name(evsel_list, true);

if (!force_metric_only)
stat_config.metric_only = true;

- if (perf_pmu__has_hybrid()) {
- if (!evsel_list->hybrid_pmu_name) {
- pr_warning("WARNING: default to use cpu_core topdown events\n");
- evsel_list->hybrid_pmu_name = perf_pmu__hybrid_type_to_pmu("core");
- }
-
- pmu_name = evsel_list->hybrid_pmu_name;
- if (!pmu_name)
- return -1;
- }
-
if (pmu_have_event(pmu_name, topdown_metric_L2_attrs[5])) {
metric_attrs = topdown_metric_L2_attrs;
max_level = 2;
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 606f09b09226..44045565c8f8 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -374,7 +374,7 @@ static void abs_printout(struct perf_stat_config *config,
config->csv_output ? 0 : config->unit_width,
evsel->unit, config->csv_sep);

- fprintf(output, "%-*s", config->csv_output ? 0 : 25, evsel__name(evsel));
+ fprintf(output, "%-*s", config->csv_output ? 0 : 32, evsel__name(evsel));

print_cgroup(config, evsel);
}
diff --git a/tools/perf/util/topdown.c b/tools/perf/util/topdown.c
index a369f84ceb6a..1090841550f7 100644
--- a/tools/perf/util/topdown.c
+++ b/tools/perf/util/topdown.c
@@ -65,3 +65,10 @@ __weak bool arch_topdown_sample_read(struct evsel *leader __maybe_unused)
{
return false;
}
+
+__weak const char *arch_get_topdown_pmu_name(struct evlist *evlist
+ __maybe_unused,
+ bool warn __maybe_unused)
+{
+ return "cpu";
+}
diff --git a/tools/perf/util/topdown.h b/tools/perf/util/topdown.h
index 118e75281f93..f9531528c559 100644
--- a/tools/perf/util/topdown.h
+++ b/tools/perf/util/topdown.h
@@ -2,11 +2,12 @@
#ifndef TOPDOWN_H
#define TOPDOWN_H 1
#include "evsel.h"
+#include "evlist.h"

bool arch_topdown_check_group(bool *warn);
void arch_topdown_group_warn(void);
bool arch_topdown_sample_read(struct evsel *leader);
-
+const char *arch_get_topdown_pmu_name(struct evlist *evlist, bool warn);
int topdown_filter_events(const char **attr, char **str, bool use_group,
const char *pmu_name);

--
2.25.1

2022-07-21 07:22:48

by Xing Zhengjun

[permalink] [raw]
Subject: [PATCH v4 1/5] perf stat: Revert "perf stat: Add default hybrid events"

From: Kan Liang <[email protected]>

This reverts commit ac2dc29edd21 ("perf stat: Add default hybrid
events").

Between this patch and the reverted patch, the commit 6c1912898ed2
("perf parse-events: Rename parse_events_error functions") and the
commit 07eafd4e053a ("perf parse-event: Add init and exit to
parse_event_error") clean up the parse_events_error_*() codes. The
related change is also reverted.

The reverted patch is hard to be extended to support new default
events, e.g., Topdown events, and the existing "--detailed" option
on a hybrid platform.

A new solution will be proposed in the following patch to enable the
perf stat default on a hybrid platform.

Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Zhengjun Xing <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
---
Change log:
v4:
* Adds Acked-by from Namhyung Kim <[email protected]>
v3:
* no change since v1.

tools/perf/builtin-stat.c | 30 ------------------------------
1 file changed, 30 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 4ce87a8eb7d7..6ac79d95f3b5 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1685,12 +1685,6 @@ static int add_default_attributes(void)
{ .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS },
{ .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES },

-};
- struct perf_event_attr default_sw_attrs[] = {
- { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_TASK_CLOCK },
- { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CONTEXT_SWITCHES },
- { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CPU_MIGRATIONS },
- { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_PAGE_FAULTS },
};

/*
@@ -1947,30 +1941,6 @@ static int add_default_attributes(void)
}

if (!evsel_list->core.nr_entries) {
- if (perf_pmu__has_hybrid()) {
- struct parse_events_error errinfo;
- const char *hybrid_str = "cycles,instructions,branches,branch-misses";
-
- if (target__has_cpu(&target))
- default_sw_attrs[0].config = PERF_COUNT_SW_CPU_CLOCK;
-
- if (evlist__add_default_attrs(evsel_list,
- default_sw_attrs) < 0) {
- return -1;
- }
-
- parse_events_error__init(&errinfo);
- err = parse_events(evsel_list, hybrid_str, &errinfo);
- if (err) {
- fprintf(stderr,
- "Cannot set up hybrid events %s: %d\n",
- hybrid_str, err);
- parse_events_error__print(&errinfo, hybrid_str);
- }
- parse_events_error__exit(&errinfo);
- return err ? -1 : 0;
- }
-
if (target__has_cpu(&target))
default_attrs0[0].config = PERF_COUNT_SW_CPU_CLOCK;

--
2.25.1

2022-07-21 07:28:14

by Xing Zhengjun

[permalink] [raw]
Subject: [PATCH v4 3/5] perf evlist: Always use arch_evlist__add_default_attrs()

From: Kan Liang <[email protected]>

Current perf stat uses the evlist__add_default_attrs() to add the
generic default attrs, and uses arch_evlist__add_default_attrs()
to add the Arch specific default attrs, e.g., Topdown for X86.

It works well for the non-hybrid platforms. However, for a hybrid
platform, the hard code generic default attrs don't work.

Uses arch_evlist__add_default_attrs() to replace the
evlist__add_default_attrs(). The arch_evlist__add_default_attrs() is
modified to invoke the same __evlist__add_default_attrs() for the
generic default attrs. No functional change.

Add default_null_attrs[] to indicate the Arch specific attrs.
No functional change for the Arch specific default attrs either.

Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Zhengjun Xing <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
---
Change log:
v4:
* Adds Acked-by from Namhyung Kim <[email protected]>
v3:
* no change since v1.

tools/perf/arch/x86/util/evlist.c | 7 ++++++-
tools/perf/builtin-stat.c | 6 +++++-
tools/perf/util/evlist.c | 9 +++++++--
tools/perf/util/evlist.h | 7 +++++--
4 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/tools/perf/arch/x86/util/evlist.c b/tools/perf/arch/x86/util/evlist.c
index 68f681ad54c1..777bdf182a58 100644
--- a/tools/perf/arch/x86/util/evlist.c
+++ b/tools/perf/arch/x86/util/evlist.c
@@ -8,8 +8,13 @@
#define TOPDOWN_L1_EVENTS "{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound}"
#define TOPDOWN_L2_EVENTS "{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound,topdown-heavy-ops,topdown-br-mispredict,topdown-fetch-lat,topdown-mem-bound}"

-int arch_evlist__add_default_attrs(struct evlist *evlist)
+int arch_evlist__add_default_attrs(struct evlist *evlist,
+ struct perf_event_attr *attrs,
+ size_t nr_attrs)
{
+ if (nr_attrs)
+ return __evlist__add_default_attrs(evlist, attrs, nr_attrs);
+
if (!pmu_have_event("cpu", "slots"))
return 0;

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 6ac79d95f3b5..837c3ca91af1 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1777,6 +1777,9 @@ static int add_default_attributes(void)
(PERF_COUNT_HW_CACHE_OP_PREFETCH << 8) |
(PERF_COUNT_HW_CACHE_RESULT_MISS << 16) },
};
+
+ struct perf_event_attr default_null_attrs[] = {};
+
/* Set attrs if no event is selected and !null_run: */
if (stat_config.null_run)
return 0;
@@ -1958,7 +1961,8 @@ static int add_default_attributes(void)
return -1;

stat_config.topdown_level = TOPDOWN_MAX_LEVEL;
- if (arch_evlist__add_default_attrs(evsel_list) < 0)
+ /* Platform specific attrs */
+ if (evlist__add_default_attrs(evsel_list, default_null_attrs) < 0)
return -1;
}

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 48af7d379d82..efa5f006b5c6 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -342,9 +342,14 @@ int __evlist__add_default_attrs(struct evlist *evlist, struct perf_event_attr *a
return evlist__add_attrs(evlist, attrs, nr_attrs);
}

-__weak int arch_evlist__add_default_attrs(struct evlist *evlist __maybe_unused)
+__weak int arch_evlist__add_default_attrs(struct evlist *evlist,
+ struct perf_event_attr *attrs,
+ size_t nr_attrs)
{
- return 0;
+ if (!nr_attrs)
+ return 0;
+
+ return __evlist__add_default_attrs(evlist, attrs, nr_attrs);
}

struct evsel *evlist__find_tracepoint_by_id(struct evlist *evlist, int id)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 1bde9ccf4e7d..129095c0fe6d 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -107,10 +107,13 @@ static inline int evlist__add_default(struct evlist *evlist)
int __evlist__add_default_attrs(struct evlist *evlist,
struct perf_event_attr *attrs, size_t nr_attrs);

+int arch_evlist__add_default_attrs(struct evlist *evlist,
+ struct perf_event_attr *attrs,
+ size_t nr_attrs);
+
#define evlist__add_default_attrs(evlist, array) \
- __evlist__add_default_attrs(evlist, array, ARRAY_SIZE(array))
+ arch_evlist__add_default_attrs(evlist, array, ARRAY_SIZE(array))

-int arch_evlist__add_default_attrs(struct evlist *evlist);
struct evsel *arch_evlist__leader(struct list_head *list);

int evlist__add_dummy(struct evlist *evlist);
--
2.25.1

2022-07-21 07:32:22

by Xing Zhengjun

[permalink] [raw]
Subject: [PATCH v4 4/5] perf x86 evlist: Add default hybrid events for perf stat

From: Kan Liang <[email protected]>

Provide a new solution to replace the reverted commit ac2dc29edd21
("perf stat: Add default hybrid events").

For the default software attrs, nothing is changed.
For the default hardware attrs, create a new evsel for each hybrid pmu.

With the new solution, adding a new default attr will not require the
special support for the hybrid platform anymore.

Also, the "--detailed" is supported on the hybrid platform

With the patch,

./perf stat -a -ddd sleep 1

Performance counter stats for 'system wide':

32,231.06 msec cpu-clock # 32.056 CPUs utilized
529 context-switches # 16.413 /sec
32 cpu-migrations # 0.993 /sec
69 page-faults # 2.141 /sec
176,754,151 cpu_core/cycles/ # 5.484 M/sec (41.65%)
161,695,280 cpu_atom/cycles/ # 5.017 M/sec (49.92%)
48,595,992 cpu_core/instructions/ # 1.508 M/sec (49.98%)
32,363,337 cpu_atom/instructions/ # 1.004 M/sec (58.26%)
10,088,639 cpu_core/branches/ # 313.010 K/sec (58.31%)
6,390,582 cpu_atom/branches/ # 198.274 K/sec (58.26%)
846,201 cpu_core/branch-misses/ # 26.254 K/sec (66.65%)
676,477 cpu_atom/branch-misses/ # 20.988 K/sec (58.27%)
14,290,070 cpu_core/L1-dcache-loads/ # 443.363 K/sec (66.66%)
9,983,532 cpu_atom/L1-dcache-loads/ # 309.749 K/sec (58.27%)
740,725 cpu_core/L1-dcache-load-misses/ # 22.982 K/sec (66.66%)
<not supported> cpu_atom/L1-dcache-load-misses/
480,441 cpu_core/LLC-loads/ # 14.906 K/sec (66.67%)
326,570 cpu_atom/LLC-loads/ # 10.132 K/sec (58.27%)
329 cpu_core/LLC-load-misses/ # 10.208 /sec (66.68%)
0 cpu_atom/LLC-load-misses/ # 0.000 /sec (58.32%)
<not supported> cpu_core/L1-icache-loads/
21,982,491 cpu_atom/L1-icache-loads/ # 682.028 K/sec (58.43%)
4,493,189 cpu_core/L1-icache-load-misses/ # 139.406 K/sec (33.34%)
4,711,404 cpu_atom/L1-icache-load-misses/ # 146.176 K/sec (50.08%)
13,713,090 cpu_core/dTLB-loads/ # 425.462 K/sec (33.34%)
9,384,727 cpu_atom/dTLB-loads/ # 291.170 K/sec (50.08%)
157,387 cpu_core/dTLB-load-misses/ # 4.883 K/sec (33.33%)
108,328 cpu_atom/dTLB-load-misses/ # 3.361 K/sec (50.08%)
<not supported> cpu_core/iTLB-loads/
<not supported> cpu_atom/iTLB-loads/
37,655 cpu_core/iTLB-load-misses/ # 1.168 K/sec (33.32%)
61,661 cpu_atom/iTLB-load-misses/ # 1.913 K/sec (50.03%)
<not supported> cpu_core/L1-dcache-prefetches/
<not supported> cpu_atom/L1-dcache-prefetches/
<not supported> cpu_core/L1-dcache-prefetch-misses/
<not supported> cpu_atom/L1-dcache-prefetch-misses/

1.005466919 seconds time elapsed

Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Zhengjun Xing <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
---
Change log:
v4:
* Adds Acked-by from Namhyung Kim <[email protected]>
v3:
* Use evsel__new() in place of evsel__new_idx()
v2:
* The index of all new evsel will be updated when adding to the evlist,
just set 0 idx for the new evsel.

tools/perf/arch/x86/util/evlist.c | 52 ++++++++++++++++++++++++++++++-
tools/perf/util/evlist.c | 2 +-
tools/perf/util/evlist.h | 2 ++
3 files changed, 54 insertions(+), 2 deletions(-)

diff --git a/tools/perf/arch/x86/util/evlist.c b/tools/perf/arch/x86/util/evlist.c
index 777bdf182a58..c83f8c11735f 100644
--- a/tools/perf/arch/x86/util/evlist.c
+++ b/tools/perf/arch/x86/util/evlist.c
@@ -4,16 +4,66 @@
#include "util/evlist.h"
#include "util/parse-events.h"
#include "topdown.h"
+#include "util/event.h"
+#include "util/pmu-hybrid.h"

#define TOPDOWN_L1_EVENTS "{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound}"
#define TOPDOWN_L2_EVENTS "{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound,topdown-heavy-ops,topdown-br-mispredict,topdown-fetch-lat,topdown-mem-bound}"

+static int ___evlist__add_default_attrs(struct evlist *evlist,
+ struct perf_event_attr *attrs,
+ size_t nr_attrs)
+{
+ struct perf_cpu_map *cpus;
+ struct evsel *evsel, *n;
+ struct perf_pmu *pmu;
+ LIST_HEAD(head);
+ size_t i = 0;
+
+ for (i = 0; i < nr_attrs; i++)
+ event_attr_init(attrs + i);
+
+ if (!perf_pmu__has_hybrid())
+ return evlist__add_attrs(evlist, attrs, nr_attrs);
+
+ for (i = 0; i < nr_attrs; i++) {
+ if (attrs[i].type == PERF_TYPE_SOFTWARE) {
+ evsel = evsel__new(attrs + i);
+ if (evsel == NULL)
+ goto out_delete_partial_list;
+ list_add_tail(&evsel->core.node, &head);
+ continue;
+ }
+
+ perf_pmu__for_each_hybrid_pmu(pmu) {
+ evsel = evsel__new(attrs + i);
+ if (evsel == NULL)
+ goto out_delete_partial_list;
+ evsel->core.attr.config |= (__u64)pmu->type << PERF_PMU_TYPE_SHIFT;
+ cpus = perf_cpu_map__get(pmu->cpus);
+ evsel->core.cpus = cpus;
+ evsel->core.own_cpus = perf_cpu_map__get(cpus);
+ evsel->pmu_name = strdup(pmu->name);
+ list_add_tail(&evsel->core.node, &head);
+ }
+ }
+
+ evlist__splice_list_tail(evlist, &head);
+
+ return 0;
+
+out_delete_partial_list:
+ __evlist__for_each_entry_safe(&head, n, evsel)
+ evsel__delete(evsel);
+ return -1;
+}
+
int arch_evlist__add_default_attrs(struct evlist *evlist,
struct perf_event_attr *attrs,
size_t nr_attrs)
{
if (nr_attrs)
- return __evlist__add_default_attrs(evlist, attrs, nr_attrs);
+ return ___evlist__add_default_attrs(evlist, attrs, nr_attrs);

if (!pmu_have_event("cpu", "slots"))
return 0;
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index efa5f006b5c6..5ff4b9504828 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -309,7 +309,7 @@ struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide)
return evsel;
}

-static int evlist__add_attrs(struct evlist *evlist, struct perf_event_attr *attrs, size_t nr_attrs)
+int evlist__add_attrs(struct evlist *evlist, struct perf_event_attr *attrs, size_t nr_attrs)
{
struct evsel *evsel, *n;
LIST_HEAD(head);
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 129095c0fe6d..351ba2887a79 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -104,6 +104,8 @@ static inline int evlist__add_default(struct evlist *evlist)
return __evlist__add_default(evlist, true);
}

+int evlist__add_attrs(struct evlist *evlist, struct perf_event_attr *attrs, size_t nr_attrs);
+
int __evlist__add_default_attrs(struct evlist *evlist,
struct perf_event_attr *attrs, size_t nr_attrs);

--
2.25.1

2022-07-29 15:20:33

by Ian Rogers

[permalink] [raw]
Subject: Re: [PATCH v4 0/5] Add perf stat default events for hybrid machines

On Wed, Jul 20, 2022 at 11:56 PM <[email protected]> wrote:
>
> From: Zhengjun Xing <[email protected]>
>
> The patch series is to clean up the existing perf stat default and support
> the perf metrics Topdown for the p-core PMU in the perf stat default. The
> first 4 patches are the clean-up patch and fixing the "--detailed" issue.
> The last patch adds support for the perf metrics Topdown, the perf metrics
> Topdown support for e-core PMU will be implemented later separately.
>
> Kan Liang (4):
> perf stat: Revert "perf stat: Add default hybrid events"
> perf evsel: Add arch_evsel__hw_name()
> perf evlist: Always use arch_evlist__add_default_attrs()
> perf x86 evlist: Add default hybrid events for perf stat
>
> Zhengjun Xing (1):
> perf stat: Add topdown metrics in the default perf stat on the hybrid
> machine

Acked-by: Ian Rogers <[email protected]>

Thanks,
Ian

> tools/perf/arch/x86/util/evlist.c | 64 +++++++++++++++++++++++++-----
> tools/perf/arch/x86/util/evsel.c | 20 ++++++++++
> tools/perf/arch/x86/util/topdown.c | 51 ++++++++++++++++++++++++
> tools/perf/arch/x86/util/topdown.h | 1 +
> tools/perf/builtin-stat.c | 50 ++++-------------------
> tools/perf/util/evlist.c | 11 +++--
> tools/perf/util/evlist.h | 9 ++++-
> tools/perf/util/evsel.c | 7 +++-
> tools/perf/util/evsel.h | 1 +
> tools/perf/util/stat-display.c | 2 +-
> tools/perf/util/topdown.c | 7 ++++
> tools/perf/util/topdown.h | 3 +-
> 12 files changed, 166 insertions(+), 60 deletions(-)
>
> --
> 2.25.1
>

2022-07-29 16:49:06

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH v4 0/5] Add perf stat default events for hybrid machines

Em Fri, Jul 29, 2022 at 08:03:20AM -0700, Ian Rogers escreveu:
> On Wed, Jul 20, 2022 at 11:56 PM <[email protected]> wrote:
> >
> > From: Zhengjun Xing <[email protected]>
> >
> > The patch series is to clean up the existing perf stat default and support
> > the perf metrics Topdown for the p-core PMU in the perf stat default. The
> > first 4 patches are the clean-up patch and fixing the "--detailed" issue.
> > The last patch adds support for the perf metrics Topdown, the perf metrics
> > Topdown support for e-core PMU will be implemented later separately.
> >
> > Kan Liang (4):
> > perf stat: Revert "perf stat: Add default hybrid events"
> > perf evsel: Add arch_evsel__hw_name()
> > perf evlist: Always use arch_evlist__add_default_attrs()
> > perf x86 evlist: Add default hybrid events for perf stat
> >
> > Zhengjun Xing (1):
> > perf stat: Add topdown metrics in the default perf stat on the hybrid
> > machine
>
> Acked-by: Ian Rogers <[email protected]>

Thanks, applied.

- Arnaldo