2021-04-23 05:39:15

by Jin Yao

Subject: [PATCH v5 00/26] perf tool: AlderLake hybrid support series 1

AlderLake uses a hybrid architecture combining Golden Cove cores
(core cpu) and Gracemont cores (atom cpu). Each cpu type has a dedicated
event list. Some events are available only on core cpus, some only on
atom cpus, and some on both.

The kernel exports two new pmus, "cpu_core" and "cpu_atom", through sysfs:
/sys/devices/cpu_core
/sys/devices/cpu_atom

cat /sys/devices/cpu_core/cpus
0-15

cat /sys/devices/cpu_atom/cpus
16-23

In this example, core cpus are 0-15 and atom cpus are 16-23.
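
As a small illustration of how such a cpus string could be interpreted, the
sketch below checks whether a given cpu number falls inside a sysfs-style
list such as "0-15". The helper name cpulist_contains() is hypothetical and
is not part of perf, which has its own cpu map code; the sketch only assumes
the comma-separated range format shown above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not perf code): return true if 'cpu' is covered
 * by a sysfs-style cpu list such as "0-15" or "0-3,8,16-23".
 * An empty or malformed list matches nothing.
 */
static bool cpulist_contains(const char *list, int cpu)
{
	const char *p = list;

	assert(list != NULL);

	while (*p && *p != '\n') {
		char *end;
		long lo = strtol(p, &end, 10);
		long hi = lo;

		if (end == p)		/* no number here: malformed list */
			return false;
		if (*end == '-') {	/* range of the form "lo-hi" */
			p = end + 1;
			hi = strtol(p, &end, 10);
			if (end == p)
				return false;
		}
		if (cpu >= lo && cpu <= hi)
			return true;
		p = (*end == ',') ? end + 1 : end;
	}
	return false;
}
```

With the values from the example, cpu 15 would be matched by "0-15" and cpu
16 by "16-23". An empty string, as seen when all cpus of one type are
offline, matches no cpu at all.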

To enable a core-only or atom-only event:

cpu_core/<event name>/
or
cpu_atom/<event name>/

Count the 'cycles' event on core cpus:

# perf stat -e cpu_core/cycles/ -a -- sleep 1

Performance counter stats for 'system wide':

12,853,951,349 cpu_core/cycles/

1.002581249 seconds time elapsed

If an event is available on both atom and core cpus, two events
are created automatically, one per pmu.

# perf stat -e cycles -a -- sleep 1

Performance counter stats for 'system wide':

12,856,467,438 cpu_core/cycles/
6,404,634,785 cpu_atom/cycles/

1.002453013 seconds time elapsed

Grouping is supported if the events are from the same pmu; otherwise a warning
is displayed and grouping is disabled automatically.

# perf stat -e '{cpu_core/cycles/,cpu_core/instructions/}' -a -- sleep 1

Performance counter stats for 'system wide':

12,863,866,968 cpu_core/cycles/
554,795,017 cpu_core/instructions/

1.002616117 seconds time elapsed

# perf stat -e '{cpu_core/cycles/,cpu_atom/instructions/}' -a -- sleep 1
WARNING: events in group from different hybrid PMUs!
WARNING: grouped events cpus do not match, disabling group:
anon group { cpu_core/cycles/, cpu_atom/instructions/ }

Performance counter stats for 'system wide':

6,283,970 cpu_core/cycles/
765,635 cpu_atom/instructions/

1.003959036 seconds time elapsed
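
The grouping rule above can be modelled in a few lines. This is a simplified
stand-alone sketch, not perf's internal code: the function name
group_has_pmu_conflict() and the use of a plain array of PMU names are
assumptions made for illustration. A NULL entry stands for an event that does
not belong to a hybrid PMU (for example a software event).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Illustrative model of the grouping rule. pmu_names[i] is the hybrid
 * PMU name of group member i, or NULL for a non-hybrid event.
 * Returns true if two members belong to different hybrid PMUs.
 */
static bool group_has_pmu_conflict(const char *const *pmu_names, size_t n)
{
	const char *prev = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!pmu_names[i])	/* non-hybrid members never conflict */
			continue;
		if (prev && strcmp(prev, pmu_names[i]) != 0)
			return true;
		prev = pmu_names[i];
	}
	return false;
}
```

In this model, {cpu_core, cpu_core} does not conflict, while
{cpu_core, cpu_atom} does, which matches the two examples above.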

Note that the whole patchset for AlderLake hybrid support is very
large (40+ patches), so for simplicity it is split into several patch
series.

Patch series 1 only supports the basic functionality. Advanced
support for perf-c2c/perf-mem/topdown/metrics/topology header and others
will be added in follow-up patch series.

The perf tool code can also be found at:
https://github.com/yaoj/perf.git

v5:
---
- Liang Kan's patch series for AlderLake perf core support has now been
upstreamed, so the interface for the perf tool part will not change.

- '[PATCH v5 12/26] perf parse-events: Support event inside hybrid pmu':
check whether the head_config list has only one term and, if so, do the
second parsing. Drop the 'parsed' param and make parse_events__with_hybrid_pmu
return 0 when some event is found.

Move 'evsel->use_config_name = true;' to the patch
'[PATCH v5 07/26] perf stat: Uniquify hybrid event name'.

- '[PATCH v5 14/26] perf stat: Add default hybrid events':
check the result and display the error the same way as when topdown calls
parse events.

- '[PATCH v5 15/26] perf stat: Filter out unmatched aggregation for hybrid event':
use Jiri's code to filter, which is much simpler than the original.

- Some minor perf test updates.

v4:
---
- In Liang Kan's patch:
'[PATCH V6 21/25] perf: Extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE',
the user interface for hardware events and cache events is changed, so
the perf tool patches are changed as well.

- Fix an issue when atom CPUs are offlined: "/sys/bus/event_source/devices/cpu_atom/cpus"
exists but its content is empty. In this case, we can't enable the cpu_atom
PMU. See '[PATCH v4 05/25] perf pmu: Save detected hybrid pmus to a global pmu list'.

- Define 'ret' variable for return value in patch
'[PATCH v4 09/25] perf parse-events: Create two hybrid cache events'

- Directly return add_raw_hybrid() in patch
'[PATCH v4 10/25] perf parse-events: Create two hybrid raw events'

- Drop the patch 'perf pmu: Support 'cycles' and 'branches' inside
hybrid PMU'.

- Separate '[PATCH v3 12/27] perf parse-events: Support no alias assigned event
inside hybrid PMU' into two patches:
'[PATCH v4 11/25] perf parse-events: Compare with hybrid pmu name'
'[PATCH v4 12/25] perf parse-events: Support event inside hybrid pmu'.
And these two patches are improved according to Jiri's comments.

v3:
---
- Drop 'perf evlist: Hybrid event uses its own cpus'. This patch is wide
and not really necessary: the current perf framework already processes
the cpus for an evsel well, even for a hybrid evsel.

- Drop 'perf evsel: Adjust hybrid event and global event mixed group'.
The patch is a bit tricky and hard to understand. In v3, grouping is
disabled when the group members are from different PMUs, so this patch
is no longer necessary.

- Create parse-events-hybrid.c/parse-events-hybrid.h and evlist-hybrid.c/evlist-hybrid.h.
Move hybrid-related code to these files.

- Create a new patch 'perf pmu: Support 'cycles' and 'branches' inside hybrid PMU' to
support 'cycles' and 'branches' inside PMU.

- Create a new patch 'perf record: Uniquify hybrid event name' to tell the user
which pmu the event belongs to in perf-record.

- If group members are from different hybrid PMUs, show a warning and disable
grouping.

- Other refining and refactoring.

v2:
---
- Drop kernel patches (Kan posted the series "Add Alder Lake support for perf (kernel)" separately).
- Drop the patches for perf-c2c/perf-mem/topdown/metrics/topology header supports,
which will be added in series 2 or series 3.
- Simplify the arguments of __perf_pmu__new_alias() by passing
the 'struct pme_event' pointer.
- Check sysfs validity before access.
- Use pmu style event name, such as "cpu_core/cycles/".
- Move command output two chars to the right.
- Move pmu hybrid functions to the newly created pmu-hybrid.c/pmu-hybrid.h.
This is to pass the perf test python case.

Jin Yao (26):
tools headers uapi: Update tools's copy of linux/perf_event.h
perf jevents: Support unit value "cpu_core" and "cpu_atom"
perf pmu: Simplify arguments of __perf_pmu__new_alias
perf pmu: Save pmu name
perf pmu: Save detected hybrid pmus to a global pmu list
perf pmu: Add hybrid helper functions
perf stat: Uniquify hybrid event name
perf parse-events: Create two hybrid hardware events
perf parse-events: Create two hybrid cache events
perf parse-events: Create two hybrid raw events
perf parse-events: Compare with hybrid pmu name
perf parse-events: Support event inside hybrid pmu
perf record: Create two hybrid 'cycles' events by default
perf stat: Add default hybrid events
perf stat: Filter out unmatched aggregation for hybrid event
perf stat: Warn group events from different hybrid PMU
perf record: Uniquify hybrid event name
perf tests: Add hybrid cases for 'Parse event definition strings' test
perf tests: Add hybrid cases for 'Roundtrip evsel->name' test
perf tests: Skip 'Setup struct perf_event_attr' test for hybrid
perf tests: Support 'Track with sched_switch' test for hybrid
perf tests: Support 'Parse and process metrics' test for hybrid
perf tests: Support 'Session topology' test for hybrid
perf tests: Support 'Convert perf time to TSC' test for hybrid
perf tests: Skip 'perf stat metrics (shadow stat) test' for hybrid
perf Documentation: Document intel-hybrid support

include/uapi/linux/perf_event.h | 15 ++
tools/include/uapi/linux/perf_event.h | 15 ++
tools/perf/Documentation/intel-hybrid.txt | 214 +++++++++++++++++++++
tools/perf/Documentation/perf-record.txt | 1 +
tools/perf/Documentation/perf-stat.txt | 2 +
tools/perf/builtin-record.c | 47 ++++-
tools/perf/builtin-stat.c | 36 ++++
tools/perf/pmu-events/jevents.c | 2 +
tools/perf/tests/attr.c | 4 +
tools/perf/tests/evsel-roundtrip-name.c | 19 +-
tools/perf/tests/parse-events.c | 152 +++++++++++++++
tools/perf/tests/parse-metric.c | 8 +-
tools/perf/tests/perf-time-to-tsc.c | 12 ++
tools/perf/tests/shell/stat+shadow_stat.sh | 3 +
tools/perf/tests/switch-tracking.c | 6 +-
tools/perf/tests/topology.c | 13 +-
tools/perf/util/Build | 3 +
tools/perf/util/evlist-hybrid.c | 88 +++++++++
tools/perf/util/evlist-hybrid.h | 14 ++
tools/perf/util/evlist.c | 5 +-
tools/perf/util/evsel.c | 12 +-
tools/perf/util/evsel.h | 4 +-
tools/perf/util/parse-events-hybrid.c | 178 +++++++++++++++++
tools/perf/util/parse-events-hybrid.h | 23 +++
tools/perf/util/parse-events.c | 97 +++++++++-
tools/perf/util/parse-events.h | 9 +-
tools/perf/util/parse-events.y | 9 +-
tools/perf/util/pmu-hybrid.c | 89 +++++++++
tools/perf/util/pmu-hybrid.h | 22 +++
tools/perf/util/pmu.c | 64 ++++--
tools/perf/util/pmu.h | 7 +
tools/perf/util/python-ext-sources | 2 +
tools/perf/util/stat-display.c | 18 +-
33 files changed, 1143 insertions(+), 50 deletions(-)
create mode 100644 tools/perf/Documentation/intel-hybrid.txt
create mode 100644 tools/perf/util/evlist-hybrid.c
create mode 100644 tools/perf/util/evlist-hybrid.h
create mode 100644 tools/perf/util/parse-events-hybrid.c
create mode 100644 tools/perf/util/parse-events-hybrid.h
create mode 100644 tools/perf/util/pmu-hybrid.c
create mode 100644 tools/perf/util/pmu-hybrid.h

--
2.17.1


2021-04-23 05:39:16

by Jin Yao

Subject: [PATCH v5 14/26] perf stat: Add default hybrid events

Previously if '-e' is not specified in perf stat, some software events
and hardware events are added to evlist by default.

Before:

# perf stat -a -- sleep 1

Performance counter stats for 'system wide':

24,044.40 msec cpu-clock # 23.946 CPUs utilized
99 context-switches # 4.117 /sec
24 cpu-migrations # 0.998 /sec
3 page-faults # 0.125 /sec
7,000,244 cycles # 0.000 GHz
2,955,024 instructions # 0.42 insn per cycle
608,941 branches # 25.326 K/sec
31,991 branch-misses # 5.25% of all branches

1.004106859 seconds time elapsed

Among the events, cycles, instructions, branches and branch-misses
are hardware events.

On a hybrid platform, two hardware events are created for each
hardware event.

cpu_core/cycles/,
cpu_atom/cycles/,
cpu_core/instructions/,
cpu_atom/instructions/,
cpu_core/branches/,
cpu_atom/branches/,
cpu_core/branch-misses/,
cpu_atom/branch-misses/

These events are added to the evlist on a hybrid platform.

Since parse_events() already supports creating two hardware events
for one event on a hybrid platform, we just use parse_events(evlist,
"cycles,instructions,branches,branch-misses") to create the default
events and add them to evlist.
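
The naming in the list above can be modelled with a short string expansion.
This is only an illustrative sketch: parse_events() builds evsel objects
rather than strings, and the helper name expand_hybrid_events() as well as
the fixed "cpu_core"/"cpu_atom" table are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Illustrative only: expand a comma-separated event list into per-PMU
 * event names, e.g. "cycles" -> "cpu_core/cycles/,cpu_atom/cycles/".
 * Returns 0 on success, -1 if 'out' is too small.
 */
static int expand_hybrid_events(const char *events, char *out, size_t size)
{
	static const char *const pmus[] = { "cpu_core", "cpu_atom" };
	size_t used = 0;
	const char *p = events;

	assert(events && out && size > 0);
	out[0] = '\0';

	while (*p) {
		size_t len = strcspn(p, ",");	/* length of this event name */
		size_t i;

		for (i = 0; len > 0 && i < 2; i++) {
			int n = snprintf(out + used, size - used, "%s%s/%.*s/",
					 used ? "," : "", pmus[i], (int)len, p);

			if (n < 0 || (size_t)n >= size - used)
				return -1;	/* output would be truncated */
			used += (size_t)n;
		}
		p += len;
		if (*p == ',')
			p++;
	}
	return 0;
}
```

Feeding it "cycles,instructions,branches,branch-misses" yields the eight
names listed above, in the same order.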

After:

# perf stat -a -- sleep 1

Performance counter stats for 'system wide':

24,043.99 msec cpu-clock # 23.991 CPUs utilized
139 context-switches # 5.781 /sec
25 cpu-migrations # 1.040 /sec
6 page-faults # 0.250 /sec
10,381,751 cpu_core/cycles/ # 431.782 K/sec
1,264,216 cpu_atom/cycles/ # 52.579 K/sec
3,406,958 cpu_core/instructions/ # 141.697 K/sec
414,588 cpu_atom/instructions/ # 17.243 K/sec
705,149 cpu_core/branches/ # 29.327 K/sec
82,358 cpu_atom/branches/ # 3.425 K/sec
40,821 cpu_core/branch-misses/ # 1.698 K/sec
9,086 cpu_atom/branch-misses/ # 377.891 /sec

1.002228863 seconds time elapsed

We can see two events are created for one hardware event.

One TODO is that the shadow stats look a bit different: now they are just
'M/sec'.

The perf_stat__update_shadow_stats and perf_stat__print_shadow_stats
need to be improved in the future if we want to get the original shadow
stats.

Signed-off-by: Jin Yao <[email protected]>
---
v5:
- Check the result and display the error the same way as when topdown
calls parse events.

v4:
- No change.

tools/perf/builtin-stat.c | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 1255af4751c2..3ab4069ff8f0 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1626,6 +1626,12 @@ static int add_default_attributes(void)
{ .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS },
{ .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES },

+};
+ struct perf_event_attr default_sw_attrs[] = {
+ { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_TASK_CLOCK },
+ { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CONTEXT_SWITCHES },
+ { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CPU_MIGRATIONS },
+ { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_PAGE_FAULTS },
};

/*
@@ -1863,6 +1869,28 @@ static int add_default_attributes(void)
}

if (!evsel_list->core.nr_entries) {
+ if (perf_pmu__has_hybrid()) {
+ const char *hybrid_str = "cycles,instructions,branches,branch-misses";
+
+ if (target__has_cpu(&target))
+ default_sw_attrs[0].config = PERF_COUNT_SW_CPU_CLOCK;
+
+ if (evlist__add_default_attrs(evsel_list,
+ default_sw_attrs) < 0) {
+ return -1;
+ }
+
+ err = parse_events(evsel_list, hybrid_str, &errinfo);
+ if (err) {
+ fprintf(stderr,
+ "Cannot set up hybrid events %s: %d\n",
+ hybrid_str, err);
+ parse_events_print_error(&errinfo, hybrid_str);
+ return -1;
+ }
+ return err;
+ }
+
if (target__has_cpu(&target))
default_attrs0[0].config = PERF_COUNT_SW_CPU_CLOCK;

--
2.17.1

2021-04-23 05:39:16

by Jin Yao

Subject: [PATCH v5 15/26] perf stat: Filter out unmatched aggregation for hybrid event

perf-stat supports several aggregation modes, such as --per-core and
--per-socket. A hybrid event, however, may only be available
on some of the cpus. So for --per-core we need to filter out the
unavailable cores, for --per-socket the unavailable
sockets, and so on.

Before:

# perf stat --per-core -e cpu_core/cycles/ -a -- sleep 1

Performance counter stats for 'system wide':

S0-D0-C0 2 479,530 cpu_core/cycles/
S0-D0-C4 2 175,007 cpu_core/cycles/
S0-D0-C8 2 166,240 cpu_core/cycles/
S0-D0-C12 2 704,673 cpu_core/cycles/
S0-D0-C16 2 865,835 cpu_core/cycles/
S0-D0-C20 2 2,958,461 cpu_core/cycles/
S0-D0-C24 2 163,988 cpu_core/cycles/
S0-D0-C28 2 164,729 cpu_core/cycles/
S0-D0-C32 0 <not counted> cpu_core/cycles/
S0-D0-C33 0 <not counted> cpu_core/cycles/
S0-D0-C34 0 <not counted> cpu_core/cycles/
S0-D0-C35 0 <not counted> cpu_core/cycles/
S0-D0-C36 0 <not counted> cpu_core/cycles/
S0-D0-C37 0 <not counted> cpu_core/cycles/
S0-D0-C38 0 <not counted> cpu_core/cycles/
S0-D0-C39 0 <not counted> cpu_core/cycles/

1.003597211 seconds time elapsed

After:

# perf stat --per-core -e cpu_core/cycles/ -a -- sleep 1

Performance counter stats for 'system wide':

S0-D0-C0 2 210,428 cpu_core/cycles/
S0-D0-C4 2 444,830 cpu_core/cycles/
S0-D0-C8 2 435,241 cpu_core/cycles/
S0-D0-C12 2 423,976 cpu_core/cycles/
S0-D0-C16 2 859,350 cpu_core/cycles/
S0-D0-C20 2 1,559,589 cpu_core/cycles/
S0-D0-C24 2 163,924 cpu_core/cycles/
S0-D0-C28 2 376,610 cpu_core/cycles/

1.003621290 seconds time elapsed
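
A minimal stand-alone model of this filtering is sketched here, assuming each
aggregated row carries an enabled time that stays zero when the event never
ran on any cpu of that core. The struct aggr_row type and the function
filter_unmatched_rows() are hypothetical names for illustration, not perf
data structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one aggregated output row. */
struct aggr_row {
	int      core_id;	/* e.g. 0 for S0-D0-C0 */
	int      nr;		/* number of cpus aggregated into the row */
	uint64_t ena;		/* total enabled time; 0 means never counted */
	uint64_t val;		/* counter value */
};

/*
 * Compact 'rows' in place, keeping only rows with a non-zero enabled
 * time, and return how many rows remain.
 */
static size_t filter_unmatched_rows(struct aggr_row *rows, size_t n)
{
	size_t i, kept = 0;

	for (i = 0; i < n; i++) {
		if (rows[i].ena == 0)	/* event unavailable on this core */
			continue;
		rows[kept++] = rows[i];
	}
	return kept;
}
```

Applied to the "Before" output, rows such as S0-D0-C32 would be dropped and
only the rows that actually counted would be left.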

Signed-off-by: Jin Yao <[email protected]>
Co-developed-by: Jiri Olsa <[email protected]>
---
v5:
- Use Jiri's code to filter, which is much simpler than original.

v4:
- No change.

tools/perf/util/stat-display.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 5255d78b1c30..06689f128e56 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -661,6 +661,9 @@ static void print_counter_aggrdata(struct perf_stat_config *config,
if (!collect_data(config, counter, aggr_cb, &ad))
return;

+ if (perf_pmu__has_hybrid() && ad.ena == 0)
+ return;
+
nr = ad.nr;
ena = ad.ena;
run = ad.run;
--
2.17.1

2021-04-23 05:39:28

by Jin Yao

Subject: [PATCH v5 16/26] perf stat: Warn group events from different hybrid PMU

If a group has events from different hybrid PMUs,
show a warning:

"WARNING: events in group from different hybrid PMUs!"

This reminds the user not to put a core event and an atom
event into one group.

Grouping is then disabled.

# perf stat -e "{cpu_core/cycles/,cpu_atom/cycles/}" -a -- sleep 1
WARNING: events in group from different hybrid PMUs!
WARNING: grouped events cpus do not match, disabling group:
anon group { cpu_core/cycles/, cpu_atom/cycles/ }

Performance counter stats for 'system wide':

5,438,125 cpu_core/cycles/
3,914,586 cpu_atom/cycles/

1.004250966 seconds time elapsed

Signed-off-by: Jin Yao <[email protected]>
---
v5:
- No change.

v4:
- No change.

tools/perf/builtin-stat.c | 4 +++
tools/perf/util/evlist-hybrid.c | 47 ++++++++++++++++++++++++++++++
tools/perf/util/evlist-hybrid.h | 2 ++
tools/perf/util/evsel.c | 6 ++++
tools/perf/util/evsel.h | 1 +
tools/perf/util/python-ext-sources | 2 ++
6 files changed, 62 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 3ab4069ff8f0..4dfa26ff365a 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -48,6 +48,7 @@
#include "util/pmu.h"
#include "util/event.h"
#include "util/evlist.h"
+#include "util/evlist-hybrid.h"
#include "util/evsel.h"
#include "util/debug.h"
#include "util/color.h"
@@ -240,6 +241,9 @@ static void evlist__check_cpu_maps(struct evlist *evlist)
struct evsel *evsel, *pos, *leader;
char buf[1024];

+ if (evlist__has_hybrid(evlist))
+ evlist__warn_hybrid_group(evlist);
+
evlist__for_each_entry(evlist, evsel) {
leader = evsel->leader;

diff --git a/tools/perf/util/evlist-hybrid.c b/tools/perf/util/evlist-hybrid.c
index e11998526f2e..db3f5fbdebe1 100644
--- a/tools/perf/util/evlist-hybrid.c
+++ b/tools/perf/util/evlist-hybrid.c
@@ -7,6 +7,7 @@
#include "../perf.h"
#include "util/pmu-hybrid.h"
#include "util/evlist-hybrid.h"
+#include "debug.h"
#include <unistd.h>
#include <stdlib.h>
#include <linux/err.h>
@@ -39,3 +40,49 @@ int evlist__add_default_hybrid(struct evlist *evlist, bool precise)

return 0;
}
+
+static bool group_hybrid_conflict(struct evsel *leader)
+{
+ struct evsel *pos, *prev = NULL;
+
+ for_each_group_evsel(pos, leader) {
+ if (!evsel__is_hybrid(pos))
+ continue;
+
+ if (prev && strcmp(prev->pmu_name, pos->pmu_name))
+ return true;
+
+ prev = pos;
+ }
+
+ return false;
+}
+
+void evlist__warn_hybrid_group(struct evlist *evlist)
+{
+ struct evsel *evsel;
+
+ evlist__for_each_entry(evlist, evsel) {
+ if (evsel__is_group_leader(evsel) &&
+ evsel->core.nr_members > 1 &&
+ group_hybrid_conflict(evsel)) {
+ pr_warning("WARNING: events in group from "
+ "different hybrid PMUs!\n");
+ return;
+ }
+ }
+}
+
+bool evlist__has_hybrid(struct evlist *evlist)
+{
+ struct evsel *evsel;
+
+ evlist__for_each_entry(evlist, evsel) {
+ if (evsel->pmu_name &&
+ perf_pmu__is_hybrid(evsel->pmu_name)) {
+ return true;
+ }
+ }
+
+ return false;
+}
diff --git a/tools/perf/util/evlist-hybrid.h b/tools/perf/util/evlist-hybrid.h
index e25861649d8f..19f74b4c340a 100644
--- a/tools/perf/util/evlist-hybrid.h
+++ b/tools/perf/util/evlist-hybrid.h
@@ -8,5 +8,7 @@
#include <unistd.h>

int evlist__add_default_hybrid(struct evlist *evlist, bool precise);
+void evlist__warn_hybrid_group(struct evlist *evlist);
+bool evlist__has_hybrid(struct evlist *evlist);

#endif /* __PERF_EVLIST_HYBRID_H */
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 0ba4daa09453..0f64a32ea9c5 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -47,6 +47,7 @@
#include "memswap.h"
#include "util.h"
#include "hashmap.h"
+#include "pmu-hybrid.h"
#include "../perf-sys.h"
#include "util/parse-branch-options.h"
#include <internal/xyarray.h>
@@ -2797,3 +2798,8 @@ void evsel__zero_per_pkg(struct evsel *evsel)
hashmap__clear(evsel->per_pkg_mask);
}
}
+
+bool evsel__is_hybrid(struct evsel *evsel)
+{
+ return evsel->pmu_name && perf_pmu__is_hybrid(evsel->pmu_name);
+}
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index ff89196281bd..f6f90f68381b 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -453,4 +453,5 @@ struct perf_env *evsel__env(struct evsel *evsel);
int evsel__store_ids(struct evsel *evsel, struct evlist *evlist);

void evsel__zero_per_pkg(struct evsel *evsel);
+bool evsel__is_hybrid(struct evsel *evsel);
#endif /* __PERF_EVSEL_H */
diff --git a/tools/perf/util/python-ext-sources b/tools/perf/util/python-ext-sources
index 845dd46e3c61..d7c976671e3a 100644
--- a/tools/perf/util/python-ext-sources
+++ b/tools/perf/util/python-ext-sources
@@ -37,3 +37,5 @@ util/units.c
util/affinity.c
util/rwsem.c
util/hashmap.c
+util/pmu-hybrid.c
+util/fncache.c
--
2.17.1

2021-04-23 05:39:45

by Jin Yao

Subject: [PATCH v5 22/26] perf tests: Support 'Parse and process metrics' test for hybrid

Some events are not supported on hybrid, so only run a subset of the test cases there.

# ./perf test 68
68: Parse and process metrics : Ok

Signed-off-by: Jin Yao <[email protected]>
---
v5:
- Remove the perf_pmu_scan() call since it's already called in
perf_pmu__has_hybrid().

tools/perf/tests/parse-metric.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/perf/tests/parse-metric.c b/tools/perf/tests/parse-metric.c
index 4968c4106254..4f6f4904e852 100644
--- a/tools/perf/tests/parse-metric.c
+++ b/tools/perf/tests/parse-metric.c
@@ -11,6 +11,7 @@
#include "debug.h"
#include "expr.h"
#include "stat.h"
+#include "pmu.h"

static struct pmu_event pme_test[] = {
{
@@ -372,10 +373,13 @@ int test__parse_metric(struct test *test __maybe_unused, int subtest __maybe_unu
{
TEST_ASSERT_VAL("IPC failed", test_ipc() == 0);
TEST_ASSERT_VAL("frontend failed", test_frontend() == 0);
- TEST_ASSERT_VAL("cache_miss_cycles failed", test_cache_miss_cycles() == 0);
TEST_ASSERT_VAL("DCache_L2 failed", test_dcache_l2() == 0);
TEST_ASSERT_VAL("recursion fail failed", test_recursion_fail() == 0);
- TEST_ASSERT_VAL("test metric group", test_metric_group() == 0);
TEST_ASSERT_VAL("Memory bandwidth", test_memory_bandwidth() == 0);
+
+ if (!perf_pmu__has_hybrid()) {
+ TEST_ASSERT_VAL("cache_miss_cycles failed", test_cache_miss_cycles() == 0);
+ TEST_ASSERT_VAL("test metric group", test_metric_group() == 0);
+ }
return 0;
}
--
2.17.1

2021-04-23 05:40:04

by Jin Yao

Subject: [PATCH v5 18/26] perf tests: Add hybrid cases for 'Parse event definition strings' test

Add basic hybrid test cases for 'Parse event definition strings' test.

# perf test 6
6: Parse event definition strings : Ok

Signed-off-by: Jin Yao <[email protected]>
---
v5:
- No change.

tools/perf/tests/parse-events.c | 152 ++++++++++++++++++++++++++++++++
1 file changed, 152 insertions(+)

diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c
index 026c54743311..40eb08049ab2 100644
--- a/tools/perf/tests/parse-events.c
+++ b/tools/perf/tests/parse-events.c
@@ -1512,6 +1512,110 @@ static int test__all_tracepoints(struct evlist *evlist)
return test__checkevent_tracepoint_multi(evlist);
}

+static int test__hybrid_hw_event_with_pmu(struct evlist *evlist)
+{
+ struct evsel *evsel = evlist__first(evlist);
+
+ TEST_ASSERT_VAL("wrong number of entries", 1 == evlist->core.nr_entries);
+ TEST_ASSERT_VAL("wrong type", PERF_TYPE_RAW == evsel->core.attr.type);
+ TEST_ASSERT_VAL("wrong config", 0x3c == evsel->core.attr.config);
+ return 0;
+}
+
+static int test__hybrid_hw_group_event(struct evlist *evlist)
+{
+ struct evsel *evsel, *leader;
+
+ evsel = leader = evlist__first(evlist);
+ TEST_ASSERT_VAL("wrong number of entries", 2 == evlist->core.nr_entries);
+ TEST_ASSERT_VAL("wrong type", PERF_TYPE_RAW == evsel->core.attr.type);
+ TEST_ASSERT_VAL("wrong config", 0x3c == evsel->core.attr.config);
+ TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
+
+ evsel = evsel__next(evsel);
+ TEST_ASSERT_VAL("wrong type", PERF_TYPE_RAW == evsel->core.attr.type);
+ TEST_ASSERT_VAL("wrong config", 0xc0 == evsel->core.attr.config);
+ TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
+ return 0;
+}
+
+static int test__hybrid_sw_hw_group_event(struct evlist *evlist)
+{
+ struct evsel *evsel, *leader;
+
+ evsel = leader = evlist__first(evlist);
+ TEST_ASSERT_VAL("wrong number of entries", 2 == evlist->core.nr_entries);
+ TEST_ASSERT_VAL("wrong type", PERF_TYPE_SOFTWARE == evsel->core.attr.type);
+ TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
+
+ evsel = evsel__next(evsel);
+ TEST_ASSERT_VAL("wrong type", PERF_TYPE_RAW == evsel->core.attr.type);
+ TEST_ASSERT_VAL("wrong config", 0x3c == evsel->core.attr.config);
+ TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
+ return 0;
+}
+
+static int test__hybrid_hw_sw_group_event(struct evlist *evlist)
+{
+ struct evsel *evsel, *leader;
+
+ evsel = leader = evlist__first(evlist);
+ TEST_ASSERT_VAL("wrong number of entries", 2 == evlist->core.nr_entries);
+ TEST_ASSERT_VAL("wrong type", PERF_TYPE_RAW == evsel->core.attr.type);
+ TEST_ASSERT_VAL("wrong config", 0x3c == evsel->core.attr.config);
+ TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
+
+ evsel = evsel__next(evsel);
+ TEST_ASSERT_VAL("wrong type", PERF_TYPE_SOFTWARE == evsel->core.attr.type);
+ TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
+ return 0;
+}
+
+static int test__hybrid_group_modifier1(struct evlist *evlist)
+{
+ struct evsel *evsel, *leader;
+
+ evsel = leader = evlist__first(evlist);
+ TEST_ASSERT_VAL("wrong number of entries", 2 == evlist->core.nr_entries);
+ TEST_ASSERT_VAL("wrong type", PERF_TYPE_RAW == evsel->core.attr.type);
+ TEST_ASSERT_VAL("wrong config", 0x3c == evsel->core.attr.config);
+ TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
+ TEST_ASSERT_VAL("wrong exclude_user", evsel->core.attr.exclude_user);
+ TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->core.attr.exclude_kernel);
+
+ evsel = evsel__next(evsel);
+ TEST_ASSERT_VAL("wrong type", PERF_TYPE_RAW == evsel->core.attr.type);
+ TEST_ASSERT_VAL("wrong config", 0xc0 == evsel->core.attr.config);
+ TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
+ TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user);
+ TEST_ASSERT_VAL("wrong exclude_kernel", evsel->core.attr.exclude_kernel);
+ return 0;
+}
+
+static int test__hybrid_raw1(struct evlist *evlist)
+{
+ struct evsel *evsel = evlist__first(evlist);
+
+ TEST_ASSERT_VAL("wrong number of entries", 2 == evlist->core.nr_entries);
+ TEST_ASSERT_VAL("wrong type", PERF_TYPE_RAW == evsel->core.attr.type);
+ TEST_ASSERT_VAL("wrong config", 0x1a == evsel->core.attr.config);
+
+ /* The type of the second event is a random value */
+ evsel = evsel__next(evsel);
+ TEST_ASSERT_VAL("wrong config", 0x1a == evsel->core.attr.config);
+ return 0;
+}
+
+static int test__hybrid_raw2(struct evlist *evlist)
+{
+ struct evsel *evsel = evlist__first(evlist);
+
+ TEST_ASSERT_VAL("wrong number of entries", 1 == evlist->core.nr_entries);
+ TEST_ASSERT_VAL("wrong type", PERF_TYPE_RAW == evsel->core.attr.type);
+ TEST_ASSERT_VAL("wrong config", 0x1a == evsel->core.attr.config);
+ return 0;
+}
+
struct evlist_test {
const char *name;
__u32 type;
@@ -1868,6 +1972,49 @@ static struct terms_test test__terms[] = {
},
};

+static struct evlist_test test__hybrid_events[] = {
+ {
+ .name = "cpu_core/cpu-cycles/",
+ .check = test__hybrid_hw_event_with_pmu,
+ .id = 0,
+ },
+ {
+ .name = "{cpu_core/cpu-cycles/,cpu_core/instructions/}",
+ .check = test__hybrid_hw_group_event,
+ .id = 1,
+ },
+ {
+ .name = "{cpu-clock,cpu_core/cpu-cycles/}",
+ .check = test__hybrid_sw_hw_group_event,
+ .id = 2,
+ },
+ {
+ .name = "{cpu_core/cpu-cycles/,cpu-clock}",
+ .check = test__hybrid_hw_sw_group_event,
+ .id = 3,
+ },
+ {
+ .name = "{cpu_core/cpu-cycles/k,cpu_core/instructions/u}",
+ .check = test__hybrid_group_modifier1,
+ .id = 4,
+ },
+ {
+ .name = "r1a",
+ .check = test__hybrid_raw1,
+ .id = 5,
+ },
+ {
+ .name = "cpu_core/r1a/",
+ .check = test__hybrid_raw2,
+ .id = 6,
+ },
+ {
+ .name = "cpu_core/config=10,config1,config2=3,period=1000/u",
+ .check = test__checkevent_pmu,
+ .id = 7,
+ },
+};
+
static int test_event(struct evlist_test *e)
{
struct parse_events_error err;
@@ -2035,6 +2182,11 @@ do { \
ret2 = ret1; \
} while (0)

+ if (perf_pmu__has_hybrid()) {
+ TEST_EVENTS(test__hybrid_events);
+ return ret2;
+ }
+
TEST_EVENTS(test__events);

if (test_pmu())
--
2.17.1

2021-04-23 05:40:08

by Jin Yao

Subject: [PATCH v5 20/26] perf tests: Skip 'Setup struct perf_event_attr' test for hybrid

For hybrid, the attr.type consists of pmu type id + original type.
This test would need many changes. For now we temporarily
skip this test case; updating it is a TODO for the future.

Signed-off-by: Jin Yao <[email protected]>
---
v5:
- Since the test case is skipped, return TEST_SKIP.

tools/perf/tests/attr.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/tools/perf/tests/attr.c b/tools/perf/tests/attr.c
index dd39ce9b0277..9b40a25376ae 100644
--- a/tools/perf/tests/attr.c
+++ b/tools/perf/tests/attr.c
@@ -34,6 +34,7 @@
#include "event.h"
#include "util.h"
#include "tests.h"
+#include "pmu.h"

#define ENV "PERF_TEST_ATTR"

@@ -184,6 +185,9 @@ int test__attr(struct test *test __maybe_unused, int subtest __maybe_unused)
char path_dir[PATH_MAX];
char *exec_path;

+ if (perf_pmu__has_hybrid())
+ return TEST_SKIP;
+
/* First try development tree tests. */
if (!lstat("./tests", &st))
return run_dir("./tests", "./perf");
--
2.17.1

2021-04-23 05:40:12

by Jin Yao

Subject: [PATCH v5 21/26] perf tests: Support 'Track with sched_switch' test for hybrid

On a hybrid platform, "cycles:u" creates two "cycles" events,
so the number of events in the evlist is not what the next test
steps expect. For now we just use one event, "cpu_core/cycles/u", for hybrid.

# ./perf test 35
35: Track with sched_switch : Ok

Signed-off-by: Jin Yao <[email protected]>
---
v5:
- Drop the variable 'hybrid' and use 'if (perf_pmu__has_hybrid())'
directly.

tools/perf/tests/switch-tracking.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/tools/perf/tests/switch-tracking.c b/tools/perf/tests/switch-tracking.c
index 3ebaa758df77..62c0ec21aaa8 100644
--- a/tools/perf/tests/switch-tracking.c
+++ b/tools/perf/tests/switch-tracking.c
@@ -18,6 +18,7 @@
#include "record.h"
#include "tests.h"
#include "util/mmap.h"
+#include "pmu.h"

static int spin_sleep(void)
{
@@ -371,7 +372,10 @@ int test__switch_tracking(struct test *test __maybe_unused, int subtest __maybe_
cpu_clocks_evsel = evlist__last(evlist);

/* Second event */
- err = parse_events(evlist, "cycles:u", NULL);
+ if (perf_pmu__has_hybrid())
+ err = parse_events(evlist, "cpu_core/cycles/u", NULL);
+ else
+ err = parse_events(evlist, "cycles:u", NULL);
if (err) {
pr_debug("Failed to parse event cycles:u\n");
goto out_err;
--
2.17.1

2021-04-23 05:40:33

by Jin Yao

Subject: [PATCH v5 24/26] perf tests: Support 'Convert perf time to TSC' test for hybrid

On a hybrid platform, "cycles:u" creates two "cycles" events,
so the second evsel in the evlist also needs initialization.

With this patch,

# ./perf test 71
71: Convert perf time to TSC : Ok

Signed-off-by: Jin Yao <[email protected]>
---
v5:
- Drop the variable 'hybrid' and use 'if (perf_pmu__has_hybrid())'.

tools/perf/tests/perf-time-to-tsc.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/tools/perf/tests/perf-time-to-tsc.c b/tools/perf/tests/perf-time-to-tsc.c
index 680c3cffb128..85d75b9b25a1 100644
--- a/tools/perf/tests/perf-time-to-tsc.c
+++ b/tools/perf/tests/perf-time-to-tsc.c
@@ -20,6 +20,7 @@
#include "tsc.h"
#include "mmap.h"
#include "tests.h"
+#include "pmu.h"

#define CHECK__(x) { \
while ((x) < 0) { \
@@ -88,6 +89,17 @@ int test__perf_time_to_tsc(struct test *test __maybe_unused, int subtest __maybe
evsel->core.attr.disabled = 1;
evsel->core.attr.enable_on_exec = 0;

+ /*
+ * For hybrid "cycles:u", it creates two events.
+ * Init the second evsel here.
+ */
+ if (perf_pmu__has_hybrid()) {
+ evsel = evsel__next(evsel);
+ evsel->core.attr.comm = 1;
+ evsel->core.attr.disabled = 1;
+ evsel->core.attr.enable_on_exec = 0;
+ }
+
CHECK__(evlist__open(evlist));

CHECK__(evlist__mmap(evlist, UINT_MAX));
--
2.17.1

2021-04-23 05:40:37

by Jin Yao

Subject: [PATCH v5 25/26] perf tests: Skip 'perf stat metrics (shadow stat) test' for hybrid

Currently we don't support shadow stat for hybrid.

root@ssp-pwrt-002:~# ./perf stat -e cycles,instructions -a -- sleep 1

Performance counter stats for 'system wide':

12,883,109,591 cpu_core/cycles/
6,405,163,221 cpu_atom/cycles/
555,553,778 cpu_core/instructions/
841,158,734 cpu_atom/instructions/

1.002644773 seconds time elapsed

No shadow stat 'insn per cycle' is reported now. We will support
it later; for now, just skip the 'perf stat metrics (shadow stat) test'.

Signed-off-by: Jin Yao <[email protected]>
---
v5:
- No change.

tools/perf/tests/shell/stat+shadow_stat.sh | 3 +++
1 file changed, 3 insertions(+)

diff --git a/tools/perf/tests/shell/stat+shadow_stat.sh b/tools/perf/tests/shell/stat+shadow_stat.sh
index ebebd3596cf9..e6e35fc6c882 100755
--- a/tools/perf/tests/shell/stat+shadow_stat.sh
+++ b/tools/perf/tests/shell/stat+shadow_stat.sh
@@ -7,6 +7,9 @@ set -e
# skip if system-wide mode is forbidden
perf stat -a true > /dev/null 2>&1 || exit 2

+# skip if on hybrid platform
+perf stat -a -e cycles sleep 1 2>&1 | grep -e cpu_core && exit 2
+
test_global_aggr()
{
perf stat -a --no-big-num -e cycles,instructions sleep 1 2>&1 | \
--
2.17.1
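
For reference, the 'insn per cycle' shadow stat is simply instructions divided by cycles. The following is a minimal sketch of how such a ratio could be computed per PMU from the counts shown above; the helper name insn_per_cycle is hypothetical and is not part of perf.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a perf function): instructions per cycle
 * for the counts of a single PMU. Returns 0.0 when no cycles were
 * counted, so the caller never divides by zero.
 */
double insn_per_cycle(uint64_t instructions, uint64_t cycles)
{
	if (cycles == 0)
		return 0.0;

	return (double)instructions / (double)cycles;
}
```

Fed with the cpu_core and cpu_atom counts from the example output in this patch, the ratios come out at roughly 0.043 and 0.131 respectively. Keeping the two PMUs separate matters here, because summing the counts would mix two different core types.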

2021-04-23 05:40:46

by Jin Yao

Subject: [PATCH v5 26/26] perf Documentation: Document intel-hybrid support

Add some words and examples to help explain
Intel hybrid perf support.

Signed-off-by: Jin Yao <[email protected]>
---
v5:
- No change.

tools/perf/Documentation/intel-hybrid.txt | 214 ++++++++++++++++++++++
tools/perf/Documentation/perf-record.txt | 1 +
tools/perf/Documentation/perf-stat.txt | 2 +
3 files changed, 217 insertions(+)
create mode 100644 tools/perf/Documentation/intel-hybrid.txt

diff --git a/tools/perf/Documentation/intel-hybrid.txt b/tools/perf/Documentation/intel-hybrid.txt
new file mode 100644
index 000000000000..07f0aa3bf682
--- /dev/null
+++ b/tools/perf/Documentation/intel-hybrid.txt
@@ -0,0 +1,214 @@
+Intel hybrid support
+--------------------
+Support for Intel hybrid events within perf tools.
+
+Some Intel platforms, such as AlderLake, are hybrid platforms that consist
+of atom cpus and core cpus. Each cpu type has a dedicated event list.
+Some events are available only on core cpus, some only on atom cpus,
+and some are available on both.
+
+Kernel exports two new cpu pmus via sysfs:
+/sys/devices/cpu_core
+/sys/devices/cpu_atom
+
+The 'cpus' files are created under the directories. For example,
+
+cat /sys/devices/cpu_core/cpus
+0-15
+
+cat /sys/devices/cpu_atom/cpus
+16-23
+
+It indicates cpu0-cpu15 are core cpus and cpu16-cpu23 are atom cpus.
+
+Quickstart
+
+List hybrid event
+-----------------
+
+As before, use perf-list to list the symbolic event.
+
+perf list
+
+inst_retired.any
+ [Fixed Counter: Counts the number of instructions retired. Unit: cpu_atom]
+inst_retired.any
+ [Number of instructions retired. Fixed Counter - architectural event. Unit: cpu_core]
+
+The 'Unit: xxx' is added to the brief description to indicate which pmu
+the event belongs to. The same event name can be supported on
+different pmus.
+
+Enable hybrid event with a specific pmu
+---------------------------------------
+
+To enable a core only event or atom only event, the following syntax is supported:
+
+ cpu_core/<event name>/
+or
+ cpu_atom/<event name>/
+
+For example, count the 'cycles' event on core cpus.
+
+ perf stat -e cpu_core/cycles/
+
+Create two events for one hardware event automatically
+------------------------------------------------------
+
+When creating one event and the event is available on both atom and core,
+two events are created automatically. One is for atom, the other is for
+core. Most hardware events and cache events are available on both
+cpu_core and cpu_atom.
+
+Hardware events have pre-defined configs (e.g. 0 for cycles).
+But on a hybrid platform, the kernel needs to know where the event comes
+from (atom or core). The original perf event type PERF_TYPE_HARDWARE
+can't carry pmu information, so this type is now extended to be a PMU-aware
+type. The PMU type ID is stored at attr.config[63:32].
+
+PMU type ID is retrieved from sysfs.
+/sys/devices/cpu_atom/type
+/sys/devices/cpu_core/type
+
+The new attr.config layout for PERF_TYPE_HARDWARE:
+
+PERF_TYPE_HARDWARE: 0xEEEEEEEE000000AA
+ AA: hardware event ID
+ EEEEEEEE: PMU type ID
+
+Cache events are similar. The type PERF_TYPE_HW_CACHE is extended to be
+a PMU-aware type. The PMU type ID is stored at attr.config[63:32].
+
+The new attr.config layout for PERF_TYPE_HW_CACHE:
+
+PERF_TYPE_HW_CACHE: 0xEEEEEEEE00DDCCBB
+ BB: hardware cache ID
+ CC: hardware cache op ID
+ DD: hardware cache op result ID
+ EEEEEEEE: PMU type ID
+
+When enabling a hardware event without a specified pmu, such as
+perf stat -e cycles -a (system-wide in this example), two events
+are created automatically.
+
+ ------------------------------------------------------------
+ perf_event_attr:
+ size 120
+ config 0x400000000
+ sample_type IDENTIFIER
+ read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
+ disabled 1
+ inherit 1
+ exclude_guest 1
+ ------------------------------------------------------------
+
+and
+
+ ------------------------------------------------------------
+ perf_event_attr:
+ size 120
+ config 0x800000000
+ sample_type IDENTIFIER
+ read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
+ disabled 1
+ inherit 1
+ exclude_guest 1
+ ------------------------------------------------------------
+
+type 0 is PERF_TYPE_HARDWARE.
+0x4 in 0x400000000 indicates it's the cpu_core pmu.
+0x8 in 0x800000000 indicates it's the cpu_atom pmu (the atom pmu type id is not fixed).
+
+The kernel creates 'cycles' (0x400000000) on cpu0-cpu15 (core cpus),
+and creates 'cycles' (0x800000000) on cpu16-cpu23 (atom cpus).
+
+The perf-stat result displays two events:
+
+ Performance counter stats for 'system wide':
+
+ 6,744,979 cpu_core/cycles/
+ 1,965,552 cpu_atom/cycles/
+
+The first 'cycles' is the core event, the second 'cycles' is the atom event.
+
+Thread mode example:
+--------------------
+
+perf-stat reports the scaled counts for hybrid events, with a percentage
+displayed. The percentage is the event's running time/enabling time.
+
+For example, 'triad_loop' runs on cpu16 (atom core), yet we can see the
+scaled value for core cycles is 233,066,666 and the percentage is 0.43%.
+
+perf stat -e cycles -- taskset -c 16 ./triad_loop
+
+As before, two events are created.
+
+------------------------------------------------------------
+perf_event_attr:
+ size 120
+ config 0x400000000
+ sample_type IDENTIFIER
+ read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
+ disabled 1
+ inherit 1
+ enable_on_exec 1
+ exclude_guest 1
+------------------------------------------------------------
+
+and
+
+------------------------------------------------------------
+perf_event_attr:
+ size 120
+ config 0x800000000
+ sample_type IDENTIFIER
+ read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
+ disabled 1
+ inherit 1
+ enable_on_exec 1
+ exclude_guest 1
+------------------------------------------------------------
+
+ Performance counter stats for 'taskset -c 16 ./triad_loop':
+
+ 233,066,666 cpu_core/cycles/ (0.43%)
+ 604,097,080 cpu_atom/cycles/ (99.57%)
+
+perf-record:
+------------
+
+If no '-e' is specified in perf record on a hybrid platform,
+it creates two default 'cycles' events and adds them to the event list. One
+is for core, the other is for atom.
+
+perf-stat:
+----------
+
+If no '-e' is specified in perf stat on a hybrid platform,
+besides software events, the following events are created and
+added to the event list in order.
+
+cpu_core/cycles/,
+cpu_atom/cycles/,
+cpu_core/instructions/,
+cpu_atom/instructions/,
+cpu_core/branches/,
+cpu_atom/branches/,
+cpu_core/branch-misses/,
+cpu_atom/branch-misses/
+
+Of course, both perf-stat and perf-record support enabling
+a hybrid event with a specific pmu.
+
+e.g.
+perf stat -e cpu_core/cycles/
+perf stat -e cpu_atom/cycles/
+perf stat -e cpu_core/r1a/
+perf stat -e cpu_atom/L1-icache-loads/
+perf stat -e cpu_core/cycles/,cpu_atom/instructions/
+perf stat -e '{cpu_core/cycles/,cpu_core/instructions/}'
+
+But '{cpu_core/cycles/,cpu_atom/instructions/}' will print a
+warning and disable grouping, because the pmus in the group do
+not match (cpu_core vs. cpu_atom).
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index f3161c9673e9..d71bac847936 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -695,6 +695,7 @@ measurements:
wait -n ${perf_pid}
exit $?

+include::intel-hybrid.txt[]

SEE ALSO
--------
diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 6ec5960b08c3..aedb05f56c35 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -550,6 +550,8 @@ The fields are in this order:

Additional metrics may be printed with all earlier fields being empty.

+include::intel-hybrid.txt[]
+
SEE ALSO
--------
linkperf:perf-top[1], linkperf:perf-list[1]
--
2.17.1
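
The attr.config layouts and the count scaling described in intel-hybrid.txt above can be sketched in C as follows. This is an illustration under stated assumptions, not the perf or kernel implementation: all function and macro names are hypothetical, and the scaling formula (raw count multiplied by enabled time over running time, truncated) is an assumption about how the scaled value relates to the displayed percentage.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 63:32 of attr.config hold the PMU type ID (macro name is hypothetical). */
#define HYBRID_PMU_TYPE_SHIFT 32

/* PERF_TYPE_HARDWARE layout: 0xEEEEEEEE000000AA (AA = hardware event ID). */
uint64_t hybrid_hw_config(uint32_t pmu_type, uint8_t hw_id)
{
	return ((uint64_t)pmu_type << HYBRID_PMU_TYPE_SHIFT) | hw_id;
}

/* PERF_TYPE_HW_CACHE layout: 0xEEEEEEEE00DDCCBB (BB = cache, CC = op, DD = result). */
uint64_t hybrid_cache_config(uint32_t pmu_type, uint8_t cache_id,
			     uint8_t op_id, uint8_t result_id)
{
	return ((uint64_t)pmu_type << HYBRID_PMU_TYPE_SHIFT) |
	       ((uint64_t)result_id << 16) |
	       ((uint64_t)op_id << 8) |
	       cache_id;
}

/* Recover the PMU type ID from the upper 32 bits of a config value. */
uint32_t hybrid_config_pmu_type(uint64_t config)
{
	return (uint32_t)(config >> HYBRID_PMU_TYPE_SHIFT);
}

/*
 * Assumed scaling rule: extrapolate a raw count to the full enabled
 * time. Returns 0 if the event never ran.
 */
uint64_t scale_count(uint64_t raw, uint64_t enabled, uint64_t running)
{
	if (running == 0)
		return 0;

	return (uint64_t)((double)raw * (double)enabled / (double)running);
}

/* The percentage printed by perf-stat: running time / enabled time. */
double running_percent(uint64_t enabled, uint64_t running)
{
	if (enabled == 0)
		return 0.0;

	return 100.0 * (double)running / (double)enabled;
}
```

With the PMU type IDs from the example (4 for cpu_core, 8 for cpu_atom) and hardware event ID 0 for cycles, hybrid_hw_config() yields 0x400000000 and 0x800000000, matching the config values in the perf_event_attr dumps. On a real system the type IDs should be read from /sys/devices/cpu_core/type and /sys/devices/cpu_atom/type rather than hard-coded.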

2021-04-23 05:41:12

by Jin Yao

Subject: [PATCH v5 19/26] perf tests: Add hybrid cases for 'Roundtrip evsel->name' test

On a hybrid platform, two hybrid events are created for each hw event.

For example,

evsel->idx evsel__name(evsel)
0 cycles
1 cycles
2 instructions
3 instructions
...

So for comparing the evsel name on hybrid, the evsel->idx
needs to be divided by 2.

# ./perf test 14
14: Roundtrip evsel->name : Ok

Signed-off-by: Jin Yao <[email protected]>
---
v5:
- No change.

tools/perf/tests/evsel-roundtrip-name.c | 19 ++++++++++++-------
1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/tools/perf/tests/evsel-roundtrip-name.c b/tools/perf/tests/evsel-roundtrip-name.c
index f7f3e5b4c180..b74cf80d1f10 100644
--- a/tools/perf/tests/evsel-roundtrip-name.c
+++ b/tools/perf/tests/evsel-roundtrip-name.c
@@ -4,6 +4,7 @@
#include "parse-events.h"
#include "tests.h"
#include "debug.h"
+#include "pmu.h"
#include <errno.h>
#include <linux/kernel.h>

@@ -62,7 +63,8 @@ static int perf_evsel__roundtrip_cache_name_test(void)
return ret;
}

-static int __perf_evsel__name_array_test(const char *names[], int nr_names)
+static int __perf_evsel__name_array_test(const char *names[], int nr_names,
+ int distance)
{
int i, err;
struct evsel *evsel;
@@ -82,9 +84,9 @@ static int __perf_evsel__name_array_test(const char *names[], int nr_names)

err = 0;
evlist__for_each_entry(evlist, evsel) {
- if (strcmp(evsel__name(evsel), names[evsel->idx])) {
+ if (strcmp(evsel__name(evsel), names[evsel->idx / distance])) {
--err;
- pr_debug("%s != %s\n", evsel__name(evsel), names[evsel->idx]);
+ pr_debug("%s != %s\n", evsel__name(evsel), names[evsel->idx / distance]);
}
}

@@ -93,18 +95,21 @@ static int __perf_evsel__name_array_test(const char *names[], int nr_names)
return err;
}

-#define perf_evsel__name_array_test(names) \
- __perf_evsel__name_array_test(names, ARRAY_SIZE(names))
+#define perf_evsel__name_array_test(names, distance) \
+ __perf_evsel__name_array_test(names, ARRAY_SIZE(names), distance)

int test__perf_evsel__roundtrip_name_test(struct test *test __maybe_unused, int subtest __maybe_unused)
{
int err = 0, ret = 0;

- err = perf_evsel__name_array_test(evsel__hw_names);
+ if (perf_pmu__has_hybrid())
+ return perf_evsel__name_array_test(evsel__hw_names, 2);
+
+ err = perf_evsel__name_array_test(evsel__hw_names, 1);
if (err)
ret = err;

- err = __perf_evsel__name_array_test(evsel__sw_names, PERF_COUNT_SW_DUMMY + 1);
+ err = __perf_evsel__name_array_test(evsel__sw_names, PERF_COUNT_SW_DUMMY + 1, 1);
if (err)
ret = err;

--
2.17.1
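
The index mapping used by this patch can be illustrated in isolation. The sketch below assumes that every name in the table expands to 'distance' consecutive evsels (2 on hybrid, 1 otherwise); the helper expected_name is hypothetical and only mirrors the names[evsel->idx / distance] lookup.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the table entry an evsel with index
 * 'idx' should be compared against, when each name expands to
 * 'distance' consecutive evsels. Returns NULL for invalid input or
 * when the index falls outside the table.
 */
const char *expected_name(const char *const names[], size_t nr_names,
			  int idx, int distance)
{
	size_t slot;

	if (idx < 0 || distance <= 0)
		return NULL;

	slot = (size_t)idx / (size_t)distance;

	return slot < nr_names ? names[slot] : NULL;
}
```

With a table of { "cycles", "instructions" } and a distance of 2, indexes 0 and 1 map to "cycles" and indexes 2 and 3 map to "instructions", which is the layout shown in the commit message.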

2021-04-23 05:41:25

by Jin Yao

Subject: [PATCH v5 23/26] perf tests: Support 'Session topology' test for hybrid

Force creation of a single event, "cpu_core/cycles/", by default;
otherwise the check 'if (evlist->core.nr_entries == 1)' in
evlist__valid_sample_type would fail.

# ./perf test 41
41: Session topology : Ok

Signed-off-by: Jin Yao <[email protected]>
---
v5:
- Add "TEST_ASSERT_VAL session->evlist".

tools/perf/tests/topology.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/tools/perf/tests/topology.c b/tools/perf/tests/topology.c
index 050489807a47..ec4e3b21b831 100644
--- a/tools/perf/tests/topology.c
+++ b/tools/perf/tests/topology.c
@@ -8,6 +8,7 @@
#include "session.h"
#include "evlist.h"
#include "debug.h"
+#include "pmu.h"
#include <linux/err.h>

#define TEMPL "/tmp/perf-test-XXXXXX"
@@ -40,8 +41,16 @@ static int session_write_header(char *path)
session = perf_session__new(&data, false, NULL);
TEST_ASSERT_VAL("can't get session", !IS_ERR(session));

- session->evlist = evlist__new_default();
- TEST_ASSERT_VAL("can't get evlist", session->evlist);
+ if (!perf_pmu__has_hybrid()) {
+ session->evlist = evlist__new_default();
+ TEST_ASSERT_VAL("can't get evlist", session->evlist);
+ } else {
+ struct parse_events_error err;
+
+ session->evlist = evlist__new();
+ TEST_ASSERT_VAL("can't get evlist", session->evlist);
+ parse_events(session->evlist, "cpu_core/cycles/", &err);
+ }

perf_header__set_feat(&session->header, HEADER_CPU_TOPOLOGY);
perf_header__set_feat(&session->header, HEADER_NRCPUS);
--
2.17.1

2021-04-23 05:41:53

by Jin Yao

Subject: [PATCH v5 17/26] perf record: Uniquify hybrid event name

For perf-record, it would be useful to tell the user which pmu the
event belongs to.

For example,

# perf record -a -- sleep 1
# perf report

# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 106 of event 'cpu_core/cycles/'
# Event count (approx.): 22043448
#
# Overhead Command Shared Object Symbol
# ........ ............ ....................... ............................
#
...

Signed-off-by: Jin Yao <[email protected]>
---
v5:
- No change.

v4:
- No change.

tools/perf/builtin-record.c | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 6af46c6a4fd8..3337b5f93336 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1605,6 +1605,32 @@ static void hit_auxtrace_snapshot_trigger(struct record *rec)
}
}

+static void record__uniquify_name(struct record *rec)
+{
+ struct evsel *pos;
+ struct evlist *evlist = rec->evlist;
+ char *new_name;
+ int ret;
+
+ if (!perf_pmu__has_hybrid())
+ return;
+
+ evlist__for_each_entry(evlist, pos) {
+ if (!evsel__is_hybrid(pos))
+ continue;
+
+ if (strchr(pos->name, '/'))
+ continue;
+
+ ret = asprintf(&new_name, "%s/%s/",
+ pos->pmu_name, pos->name);
+ if (ret > 0) {
+ free(pos->name);
+ pos->name = new_name;
+ }
+ }
+}
+
static int __cmd_record(struct record *rec, int argc, const char **argv)
{
int err;
@@ -1709,6 +1735,8 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
if (data->is_pipe && rec->evlist->core.nr_entries == 1)
rec->opts.sample_id = true;

+ record__uniquify_name(rec);
+
if (record__open(rec) != 0) {
err = -1;
goto out_child;
--
2.17.1
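
The renaming rule in this patch ("<pmu>/<name>/", unless the name already contains a '/') can be tried outside perf with a small stand-alone sketch. The helper uniquify_event_name is hypothetical. It uses snprintf() and malloc() instead of asprintf() so that it builds without _GNU_SOURCE; note that asprintf() returns -1 on failure and leaves the output pointer undefined, so any caller has to test for that before using the result.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-alone helper (not a perf function): return a
 * newly allocated "<pmu>/<name>/" string, or a plain copy of 'name'
 * if it already contains a '/'. Returns NULL on failure. The caller
 * frees the result.
 */
char *uniquify_event_name(const char *pmu_name, const char *name)
{
	char *out;
	size_t size;
	int len;

	if (strchr(name, '/')) {
		/* Already carries a pmu prefix: just duplicate it. */
		size = strlen(name) + 1;
		out = malloc(size);
		if (out)
			memcpy(out, name, size);
		return out;
	}

	/* First pass computes the length, second pass formats. */
	len = snprintf(NULL, 0, "%s/%s/", pmu_name, name);
	if (len < 0)
		return NULL;

	out = malloc((size_t)len + 1);
	if (!out)
		return NULL;

	snprintf(out, (size_t)len + 1, "%s/%s/", pmu_name, name);
	return out;
}
```

For example, uniquify_event_name("cpu_core", "cycles") produces "cpu_core/cycles/", the form shown in the perf report header above, while a name such as "cpu_atom/cycles/" is returned unchanged.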

2021-04-26 20:42:21

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH v5 00/26] perf tool: AlderLake hybrid support series 1

Em Fri, Apr 23, 2021 at 01:35:15PM +0800, Jin Yao escreveu:
> AlderLake uses a hybrid architecture utilizing Golden Cove cores
> (core cpu) and Gracemont cores (atom cpu). Each cpu has dedicated
> event list. Some events are available on core cpu, some events
> are available on atom cpu and some events can be available on both.
>
> Kernel exports new pmus "cpu_core" and "cpu_atom" through sysfs:
> /sys/devices/cpu_core
> /sys/devices/cpu_atom

[acme@five perf]$ b4 am -t -s -l --cc-trailers [email protected]
Looking up https://lore.kernel.org/r/20210423053541.12521-1-yao.jin%40linux.intel.com
Grabbing thread from lore.kernel.org/lkml
Analyzing 29 messages in the thread
---
Writing ./v5_20210423_yao_jin_perf_tool_alderlake_hybrid_support_series_1.mbx
[PATCH v5 01/26] tools headers uapi: Update tools's copy of linux/perf_event.h
+ Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
+ Link: https://lore.kernel.org/r/[email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
<SNIP>
---
Total patches: 26
---
Cover: ./v5_20210423_yao_jin_perf_tool_alderlake_hybrid_support_series_1.cover
Link: https://lore.kernel.org/r/[email protected]
Base: not found
git am ./v5_20210423_yao_jin_perf_tool_alderlake_hybrid_support_series_1.mbx
[acme@five perf]$ git am ./v5_20210423_yao_jin_perf_tool_alderlake_hybrid_support_series_1.mbx
Applying: tools headers uapi: Update tools's copy of linux/perf_event.h
Applying: perf jevents: Support unit value "cpu_core" and "cpu_atom"
Applying: perf pmu: Simplify arguments of __perf_pmu__new_alias
Applying: perf pmu: Save pmu name
Applying: perf pmu: Save detected hybrid pmus to a global pmu list
Applying: perf pmu: Add hybrid helper functions
Applying: perf stat: Uniquify hybrid event name
error: patch failed: tools/perf/builtin-stat.c:68
error: tools/perf/builtin-stat.c: patch does not apply
error: patch failed: tools/perf/util/stat-display.c:17
error: tools/perf/util/stat-display.c: patch does not apply
Patch failed at 0007 perf stat: Uniquify hybrid event name
hint: Use 'git am --show-current-patch=diff' to see the failed patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
[acme@five perf]$

[acme@five perf]$ git am --show-current-patch=diff | patch -p1
patching file tools/perf/builtin-stat.c
Hunk #1 FAILED at 68.
Hunk #2 succeeded at 2402 (offset 24 lines).
1 out of 2 hunks FAILED -- saving rejects to file tools/perf/builtin-stat.c.rej
patching file tools/perf/util/evsel.h
Hunk #1 succeeded at 116 (offset 1 line).
patching file tools/perf/util/parse-events.c
patching file tools/perf/util/stat-display.c
Hunk #1 FAILED at 17.
Hunk #2 succeeded at 538 (offset 6 lines).
Hunk #3 succeeded at 553 (offset 6 lines).
1 out of 3 hunks FAILED -- saving rejects to file tools/perf/util/stat-display.c.rej
[acme@five perf]$ vim tools/perf/builtin-stat.c.rej
[acme@five perf]$ cat tools/perf/builtin-stat.c.rej
--- tools/perf/builtin-stat.c
+++ tools/perf/builtin-stat.c
@@ -68,6 +68,7 @@
#include "util/affinity.h"
#include "util/pfm.h"
#include "util/bpf_counter.h"
+#include "util/pmu-hybrid.h"
#include "asm/bug.h"

#include <linux/time64.h>
[acme@five perf]$ cat tools/perf/util/stat-display.c.rej
--- tools/perf/util/stat-display.c
+++ tools/perf/util/stat-display.c
@@ -17,6 +17,7 @@
#include "cgroup.h"
#include <api/fs/fs.h>
#include "util.h"
+#include "pmu-hybrid.h"

#define CNTR_NOT_SUPPORTED "<not supported>"
#define CNTR_NOT_COUNTED "<not counted>"
[acme@five perf]$


It's clashing with some BPF changes by Song that are still under review
but that I have in my tmp.perf/core branch so that I can test build it while
I wait for Jiri to say if Song addressed all his comments.

So after you address the new round of comments for v5, please
rebase on tmp.perf/core or, at that point, perf/core.

- Arnaldo

2021-04-27 00:53:10

by Jin Yao

Subject: Re: [PATCH v5 00/26] perf tool: AlderLake hybrid support series 1

Hi Arnaldo,

On 4/27/2021 4:41 AM, Arnaldo Carvalho de Melo wrote:
> Em Fri, Apr 23, 2021 at 01:35:15PM +0800, Jin Yao escreveu:
>> AlderLake uses a hybrid architecture utilizing Golden Cove cores
>> (core cpu) and Gracemont cores (atom cpu). Each cpu has dedicated
>> event list. Some events are available on core cpu, some events
>> are available on atom cpu and some events can be available on both.
>>
>> Kernel exports new pmus "cpu_core" and "cpu_atom" through sysfs:
>> /sys/devices/cpu_core
>> /sys/devices/cpu_atom
>
> [acme@five perf]$ b4 am -t -s -l --cc-trailers [email protected]
> Looking up https://lore.kernel.org/r/20210423053541.12521-1-yao.jin%40linux.intel.com
> Grabbing thread from lore.kernel.org/lkml
> Analyzing 29 messages in the thread
> ---
> Writing ./v5_20210423_yao_jin_perf_tool_alderlake_hybrid_support_series_1.mbx
> [PATCH v5 01/26] tools headers uapi: Update tools's copy of linux/perf_event.h
> + Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
> + Link: https://lore.kernel.org/r/[email protected]
> + Cc: [email protected]
> + Cc: [email protected]
> + Cc: [email protected]
> + Cc: [email protected]
> + Cc: [email protected]
> + Cc: [email protected]
> + Cc: [email protected]
> + Cc: [email protected]
> + Cc: [email protected]
> <SNIP>
> ---
> Total patches: 26
> ---
> Cover: ./v5_20210423_yao_jin_perf_tool_alderlake_hybrid_support_series_1.cover
> Link: https://lore.kernel.org/r/[email protected]
> Base: not found
> git am ./v5_20210423_yao_jin_perf_tool_alderlake_hybrid_support_series_1.mbx
> [acme@five perf]$ git am ./v5_20210423_yao_jin_perf_tool_alderlake_hybrid_support_series_1.mbx
> Applying: tools headers uapi: Update tools's copy of linux/perf_event.h
> Applying: perf jevents: Support unit value "cpu_core" and "cpu_atom"
> Applying: perf pmu: Simplify arguments of __perf_pmu__new_alias
> Applying: perf pmu: Save pmu name
> Applying: perf pmu: Save detected hybrid pmus to a global pmu list
> Applying: perf pmu: Add hybrid helper functions
> Applying: perf stat: Uniquify hybrid event name
> error: patch failed: tools/perf/builtin-stat.c:68
> error: tools/perf/builtin-stat.c: patch does not apply
> error: patch failed: tools/perf/util/stat-display.c:17
> error: tools/perf/util/stat-display.c: patch does not apply
> Patch failed at 0007 perf stat: Uniquify hybrid event name
> hint: Use 'git am --show-current-patch=diff' to see the failed patch
> When you have resolved this problem, run "git am --continue".
> If you prefer to skip this patch, run "git am --skip" instead.
> To restore the original branch and stop patching, run "git am --abort".
> [acme@five perf]$
>
> [acme@five perf]$ git am --show-current-patch=diff | patch -p1
> patching file tools/perf/builtin-stat.c
> Hunk #1 FAILED at 68.
> Hunk #2 succeeded at 2402 (offset 24 lines).
> 1 out of 2 hunks FAILED -- saving rejects to file tools/perf/builtin-stat.c.rej
> patching file tools/perf/util/evsel.h
> Hunk #1 succeeded at 116 (offset 1 line).
> patching file tools/perf/util/parse-events.c
> patching file tools/perf/util/stat-display.c
> Hunk #1 FAILED at 17.
> Hunk #2 succeeded at 538 (offset 6 lines).
> Hunk #3 succeeded at 553 (offset 6 lines).
> 1 out of 3 hunks FAILED -- saving rejects to file tools/perf/util/stat-display.c.rej
> [acme@five perf]$ vim tools/perf/builtin-stat.c.rej
> [acme@five perf]$ cat tools/perf/builtin-stat.c.rej
> --- tools/perf/builtin-stat.c
> +++ tools/perf/builtin-stat.c
> @@ -68,6 +68,7 @@
> #include "util/affinity.h"
> #include "util/pfm.h"
> #include "util/bpf_counter.h"
> +#include "util/pmu-hybrid.h"
> #include "asm/bug.h"
>
> #include <linux/time64.h>
> [acme@five perf]$ cat tools/perf/util/stat-display.c.rej
> --- tools/perf/util/stat-display.c
> +++ tools/perf/util/stat-display.c
> @@ -17,6 +17,7 @@
> #include "cgroup.h"
> #include <api/fs/fs.h>
> #include "util.h"
> +#include "pmu-hybrid.h"
>
> #define CNTR_NOT_SUPPORTED "<not supported>"
> #define CNTR_NOT_COUNTED "<not counted>"
> [acme@five perf]$
>
>
> Its clashing with some BPF changes by Song that are still under review
> but I have in my tmp.perf/core branch so that I can test build it while
> I wait for Jiri to say if Song addressed all his comments.
>
> So after you address the new round of comments for v5 you can please
> rebase on tmp.perf/core or, at that point, perf/core.
>
> - Arnaldo
>

v6 is ready now and is based on perf/core. It has only a minor update (adding a new test case
for cache events with a pmu prefix). There are no other updates in v6.

I'm OK to rebase v6 on tmp.perf/core first and then post the new series.

Thanks
Jin Yao