2021-05-27 14:45:45

by Jin Yao

Subject: [PATCH v2 0/8] perf: Support perf-mem/perf-c2c for AlderLake

AlderLake uses a hybrid architecture utilizing Golden Cove cores
(core CPU) and Gracemont cores (atom CPU). This patchset supports
perf-mem and perf-c2c for AlderLake.

v2:
---
- Use mem_loads_name__init to keep original behavior for non-hybrid platform.
- Move x86 specific perf_mem_events[] to arch/x86/util/mem-events.c.
- Move mem-store event to a new patch.
- Add a new patch to fix wrong verbose output for recording events
- Add a new patch to disable 'mem-loads-aux' group before reporting

Jin Yao (8):
perf tools: Check mem-loads auxiliary event
perf tools: Support pmu prefix for mem-load event
perf tools: Support pmu prefix for mem-store event
perf tools: Check if mem_events is supported for hybrid platform
perf mem: Support record for hybrid platform
perf mem: Fix wrong verbose output for recording events
perf mem: Disable 'mem-loads-aux' group before reporting
perf c2c: Support record for hybrid platform

tools/perf/arch/arm64/util/mem-events.c | 2 +-
tools/perf/arch/powerpc/util/mem-events.c | 2 +-
tools/perf/arch/x86/util/mem-events.c | 54 ++++++++++--
tools/perf/builtin-c2c.c | 40 +++++----
tools/perf/builtin-mem.c | 51 ++++++-----
tools/perf/builtin-report.c | 2 +
tools/perf/util/evlist.c | 25 ++++++
tools/perf/util/evlist.h | 1 +
tools/perf/util/mem-events.c | 101 ++++++++++++++++++++--
tools/perf/util/mem-events.h | 4 +-
10 files changed, 225 insertions(+), 57 deletions(-)

--
2.17.1


2021-05-27 14:47:38

by Jin Yao

Subject: [PATCH v2 6/8] perf mem: Fix wrong verbose output for recording events

Current code:

for (j = 0; j < argc; j++, i++)
rec_argv[i] = argv[j];

if (verbose > 0) {
pr_debug("calling: record ");

while (rec_argv[j]) {
pr_debug("%s ", rec_argv[j]);
j++;
}
pr_debug("\n");
}

The entries of argv[] are appended to the end of rec_argv[], not
copied to its beginning. After the copy loop j equals argc, so
rec_argv[j] doesn't point to the first event.

Now we record the start index and end index for events in rec_argv[],
and print them if verbose is enabled.

Signed-off-by: Jin Yao <[email protected]>
---
v2:
- New in v2.

tools/perf/builtin-mem.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 6b633df458c2..0fd2a74dbaca 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -65,6 +65,7 @@ static const char * const *record_mem_usage = __usage;
static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
{
int rec_argc, i = 0, j, tmp_nr = 0;
+ int start, end;
const char **rec_argv;
char **rec_tmp;
int ret;
@@ -144,9 +145,11 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
if (mem->data_page_size)
rec_argv[i++] = "--data-page-size";

+ start = i;
ret = perf_mem_events__record_args(rec_argv, &i, rec_tmp, &tmp_nr);
if (ret)
goto out;
+ end = i;

if (all_user)
rec_argv[i++] = "--all-user";
@@ -160,10 +163,9 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
if (verbose > 0) {
pr_debug("calling: record ");

- while (rec_argv[j]) {
+ for (j = start; j < end; j++)
pr_debug("%s ", rec_argv[j]);
- j++;
- }
+
pr_debug("\n");
}

--
2.17.1
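
As an illustration of the fix in 6/8 (not part of the patch), the following is a minimal sketch that joins only the [start, end) slice of an argument vector, the way the corrected verbose loop walks only the event arguments. The helper name join_args() and the sample argument vector are assumptions made for this example; perf itself prints each entry with pr_debug().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a perf function): copy argv[start..end) into
 * buf, each entry followed by one space, like pr_debug("%s ", ...).
 * Stops early if the next entry would not fit. Returns the length written.
 */
static size_t join_args(const char **argv, int start, int end,
			char *buf, size_t size)
{
	size_t len = 0;

	assert(buf && size > 0);

	for (int j = start; j < end; j++) {
		size_t n = strlen(argv[j]);

		/* Need room for the entry, one space and the final NUL. */
		if (len + n + 1 >= size)
			break;

		memcpy(buf + len, argv[j], n);
		len += n;
		buf[len++] = ' ';
	}

	buf[len] = '\0';
	return len;
}
```

With start recorded before the event arguments are appended and end recorded right after, the output covers exactly the "-e ..." entries, regardless of how many options precede or follow them.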

2021-05-27 14:48:57

by Jin Yao

Subject: [PATCH v2 4/8] perf tools: Check if mem_events is supported for hybrid platform

Check if the mem_events ('mem-loads' and 'mem-stores') exist
in the sysfs path.

On Alderlake, the hybrid CPU PMUs are "cpu_core" and "cpu_atom".
Check for the existence of the following paths:

/sys/devices/cpu_atom/events/mem-loads
/sys/devices/cpu_atom/events/mem-stores
/sys/devices/cpu_core/events/mem-loads
/sys/devices/cpu_core/events/mem-stores

If the path exists, the mem_event is supported.

Signed-off-by: Jin Yao <[email protected]>
---
v2:
- Use 'e->supported |= perf_mem_event__supported(mnt, sysfs_name);'

tools/perf/util/mem-events.c | 32 ++++++++++++++++++++++++++------
1 file changed, 26 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index c736eaded06c..69dcac730ada 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -12,6 +12,8 @@
#include "mem-events.h"
#include "debug.h"
#include "symbol.h"
+#include "pmu.h"
+#include "pmu-hybrid.h"

unsigned int perf_mem_events__loads_ldlat = 30;

@@ -100,6 +102,15 @@ int perf_mem_events__parse(const char *str)
return -1;
}

+static bool perf_mem_event__supported(const char *mnt, char *sysfs_name)
+{
+ char path[PATH_MAX];
+ struct stat st;
+
+ scnprintf(path, PATH_MAX, "%s/devices/%s", mnt, sysfs_name);
+ return !stat(path, &st);
+}
+
int perf_mem_events__init(void)
{
const char *mnt = sysfs__mount();
@@ -110,9 +121,9 @@ int perf_mem_events__init(void)
return -ENOENT;

for (j = 0; j < PERF_MEM_EVENTS__MAX; j++) {
- char path[PATH_MAX];
struct perf_mem_event *e = perf_mem_events__ptr(j);
- struct stat st;
+ struct perf_pmu *pmu;
+ char sysfs_name[100];

/*
* If the event entry isn't valid, skip initialization
@@ -121,11 +132,20 @@ int perf_mem_events__init(void)
if (!e->tag)
continue;

- scnprintf(path, PATH_MAX, "%s/devices/%s",
- mnt, e->sysfs_name);
+ if (!perf_pmu__has_hybrid()) {
+ scnprintf(sysfs_name, sizeof(sysfs_name),
+ e->sysfs_name, "cpu");
+ e->supported = perf_mem_event__supported(mnt, sysfs_name);
+ } else {
+ perf_pmu__for_each_hybrid_pmu(pmu) {
+ scnprintf(sysfs_name, sizeof(sysfs_name),
+ e->sysfs_name, pmu->name);
+ e->supported |= perf_mem_event__supported(mnt, sysfs_name);
+ }
+ }

- if (!stat(path, &st))
- e->supported = found = true;
+ if (e->supported)
+ found = true;
}

return found ? 0 : -ENOENT;
--
2.17.1
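
The sysfs check described in 4/8 can be sketched as a standalone program fragment. This is only an illustration under assumptions: the helper names below are hypothetical, the event path is built from separate pmu and event strings rather than from perf's e->sysfs_name format string, and the PMU list is passed in by the caller instead of being walked with perf_pmu__for_each_hybrid_pmu().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: build "<mnt>/devices/<pmu>/events/<event>".
 * Returns false if the path did not fit into buf.
 */
static bool mem_event_path(char *buf, size_t size, const char *mnt,
			   const char *pmu, const char *event)
{
	int n;

	assert(buf && size > 0);
	n = snprintf(buf, size, "%s/devices/%s/events/%s", mnt, pmu, event);
	return n >= 0 && (size_t)n < size;
}

/* An event counts as supported when its sysfs file exists. */
static bool mem_event_supported(const char *mnt, const char *pmu,
				const char *event)
{
	char path[4096];
	struct stat st;

	if (!mem_event_path(path, sizeof(path), mnt, pmu, event))
		return false;

	return stat(path, &st) == 0;
}

/* OR the result over every PMU, mirroring 'e->supported |= ...'. */
static bool mem_event_supported_any(const char *mnt, const char **pmus,
				    int nr_pmus, const char *event)
{
	bool supported = false;

	for (int i = 0; i < nr_pmus; i++)
		supported |= mem_event_supported(mnt, pmus[i], event);

	return supported;
}
```

Calling mem_event_supported_any("/sys", pmus, 2, "mem-loads") with pmus set to "cpu_core" and "cpu_atom" would report the event as supported if either PMU exposes it, which is the effect of the '|=' introduced in v2.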

2021-05-27 14:49:29

by Jin Yao

Subject: [PATCH v2 7/8] perf mem: Disable 'mem-loads-aux' group before reporting

On some platforms, such as Alderlake, the 'mem-loads' event must be
used together with 'mem-loads-aux' within a group, and 'mem-loads-aux'
must be the group leader. Now we disable this group before reporting
because 'mem-loads-aux' is just an auxiliary event. It doesn't carry
any valid memory load result. Showing 'mem-loads-aux' + 'mem-loads'
as a group in the report would require many changes, all of them
unnecessary.

Signed-off-by: Jin Yao <[email protected]>
---
v2:
- New in v2.

tools/perf/builtin-report.c | 2 ++
tools/perf/util/evlist.c | 25 +++++++++++++++++++++++++
tools/perf/util/evlist.h | 1 +
3 files changed, 28 insertions(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 36f9ccfeb38a..bc5c393021dc 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -934,6 +934,8 @@ static int __cmd_report(struct report *rep)
return ret;
}

+ evlist__check_mem_load_aux(session->evlist);
+
if (rep->stats_mode)
return stats_print(rep);

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 6ea3e677dc1e..6ba9664089bd 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -2161,3 +2161,28 @@ int evlist__scnprintf_evsels(struct evlist *evlist, size_t size, char *bf)

return printed;
}
+
+void evlist__check_mem_load_aux(struct evlist *evlist)
+{
+ struct evsel *leader, *evsel, *pos;
+
+ /*
+ * For some platforms, the 'mem-loads' event is required to use
+ * together with 'mem-loads-aux' within a group and 'mem-loads-aux'
+ * must be the group leader. Now we disable this group before reporting
+ * because 'mem-loads-aux' is just an auxiliary event. It doesn't carry
+ * any valid memory load information.
+ */
+ evlist__for_each_entry(evlist, evsel) {
+ leader = evsel->leader;
+ if (leader == evsel)
+ continue;
+
+ if (leader->name && strstr(leader->name, "mem-loads-aux")) {
+ for_each_group_evsel(pos, leader) {
+ pos->leader = pos;
+ pos->core.nr_members = 0;
+ }
+ }
+ }
+}
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index a8b97b50cceb..2073cfa79f79 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -367,4 +367,5 @@ int evlist__ctlfd_ack(struct evlist *evlist);
struct evsel *evlist__find_evsel(struct evlist *evlist, int idx);

int evlist__scnprintf_evsels(struct evlist *evlist, size_t size, char *bf);
+void evlist__check_mem_load_aux(struct evlist *evlist);
#endif /* __PERF_EVLIST_H */
--
2.17.1
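
The group handling in 7/8 can be illustrated with a simplified model. This is a sketch under assumptions, not perf code: struct sketch_evsel and its fields are hypothetical stand-ins for struct evsel, and the events live in a plain array instead of an evlist. The idea it shows is the same as in the patch: when a group is led by a 'mem-loads-aux' event, every member is made its own leader so the report treats the events as standalone.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for perf's struct evsel; field names are hypothetical. */
struct sketch_evsel {
	const char *name;
	struct sketch_evsel *leader;	/* points to itself when ungrouped */
	int nr_members;			/* group size, kept on the leader */
};

/*
 * Break up every group led by a "mem-loads-aux" event so that each
 * member is reported as a standalone event.
 */
static void ungroup_mem_loads_aux(struct sketch_evsel *evsels, size_t nr)
{
	assert(evsels || nr == 0);

	for (size_t i = 0; i < nr; i++) {
		struct sketch_evsel *leader = evsels[i].leader;

		if (leader == &evsels[i])
			continue;	/* a leader or an ungrouped event */

		if (!leader->name || !strstr(leader->name, "mem-loads-aux"))
			continue;	/* some other group, leave it alone */

		/* Detach every event that still points at this leader. */
		for (size_t k = 0; k < nr; k++) {
			if (evsels[k].leader == leader && &evsels[k] != leader) {
				evsels[k].leader = &evsels[k];
				evsels[k].nr_members = 0;
			}
		}
		leader->nr_members = 0;
	}
}
```

Groups whose leader name does not contain "mem-loads-aux" are left untouched, so ordinary user-defined groups would still be reported as groups.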

2021-05-27 15:05:50

by Jin Yao

Subject: [PATCH v2 2/8] perf tools: Support pmu prefix for mem-load event

The perf_mem_events__name() can generate the mem-load event name.
It uses a variable 'mem_loads_name__init' to avoid generating the
event name every time (because perf_pmu__scan takes some time).

The perf_mem_events__name() assumes the pmu is "cpu", but that is not
correct for hybrid platforms. On Alderlake, the pmu is "cpu_core" or
"cpu_atom".

Introduce a new parameter 'pmu_name' in perf_mem_events__name
to let the caller specify a pmu name.

Since such event names are x86 specific, move
perf_mem_events[] to arch/x86/util/mem-events.c.

We still keep the variable 'mem_loads_name__init' but it's only
used when pmu_name is NULL (compatible with the original behavior). When
pmu_name is not NULL (e.g. "cpu_core"), the event name is regenerated
on every call. Caching for that case can be implemented in a follow-up
patch.

Signed-off-by: Jin Yao <[email protected]>
---
v2:
- Move perf_mem_events[] to x86 specific file.
- Create x86 specific perf_mem_events__ptr().
- Use mem_loads_name__init for keeping original behavior
on non-hybrid platform.

tools/perf/arch/arm64/util/mem-events.c | 2 +-
tools/perf/arch/powerpc/util/mem-events.c | 2 +-
tools/perf/arch/x86/util/mem-events.c | 35 ++++++++++++++++++-----
tools/perf/builtin-c2c.c | 4 +--
tools/perf/builtin-mem.c | 4 +--
tools/perf/util/mem-events.c | 4 +--
tools/perf/util/mem-events.h | 2 +-
7 files changed, 37 insertions(+), 16 deletions(-)

diff --git a/tools/perf/arch/arm64/util/mem-events.c b/tools/perf/arch/arm64/util/mem-events.c
index 2a2497372671..be41721b9aa1 100644
--- a/tools/perf/arch/arm64/util/mem-events.c
+++ b/tools/perf/arch/arm64/util/mem-events.c
@@ -20,7 +20,7 @@ struct perf_mem_event *perf_mem_events__ptr(int i)
return &perf_mem_events[i];
}

-char *perf_mem_events__name(int i)
+char *perf_mem_events__name(int i, char *pmu_name __maybe_unused)
{
struct perf_mem_event *e = perf_mem_events__ptr(i);

diff --git a/tools/perf/arch/powerpc/util/mem-events.c b/tools/perf/arch/powerpc/util/mem-events.c
index 07fb5e049488..4120fafe0be4 100644
--- a/tools/perf/arch/powerpc/util/mem-events.c
+++ b/tools/perf/arch/powerpc/util/mem-events.c
@@ -3,7 +3,7 @@
#include "mem-events.h"

/* PowerPC does not support 'ldlat' parameter. */
-char *perf_mem_events__name(int i)
+char *perf_mem_events__name(int i, char *pmu_name __maybe_unused)
{
if (i == PERF_MEM_EVENTS__LOAD)
return (char *) "cpu/mem-loads/";
diff --git a/tools/perf/arch/x86/util/mem-events.c b/tools/perf/arch/x86/util/mem-events.c
index e79232e3f2a0..f9e444a4fe70 100644
--- a/tools/perf/arch/x86/util/mem-events.c
+++ b/tools/perf/arch/x86/util/mem-events.c
@@ -7,7 +7,23 @@ static char mem_loads_name[100];
static bool mem_loads_name__init;

#define MEM_LOADS_AUX 0x8203
-#define MEM_LOADS_AUX_NAME "{cpu/mem-loads-aux/,cpu/mem-loads,ldlat=%u/pp}:S"
+#define MEM_LOADS_AUX_NAME "{%s/mem-loads-aux/,%s/mem-loads,ldlat=%u/}:P"
+
+#define E(t, n, s) { .tag = t, .name = n, .sysfs_name = s }
+
+static struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
+ E("ldlat-loads", "%s/mem-loads,ldlat=%u/P", "%s/events/mem-loads"),
+ E("ldlat-stores", "cpu/mem-stores/P", "cpu/events/mem-stores"),
+ E(NULL, NULL, NULL),
+};
+
+struct perf_mem_event *perf_mem_events__ptr(int i)
+{
+ if (i >= PERF_MEM_EVENTS__MAX)
+ return NULL;
+
+ return &perf_mem_events[i];
+}

bool is_mem_loads_aux_event(struct evsel *leader)
{
@@ -22,7 +38,7 @@ bool is_mem_loads_aux_event(struct evsel *leader)
return leader->core.attr.config == MEM_LOADS_AUX;
}

-char *perf_mem_events__name(int i)
+char *perf_mem_events__name(int i, char *pmu_name)
{
struct perf_mem_event *e = perf_mem_events__ptr(i);

@@ -30,17 +46,22 @@ char *perf_mem_events__name(int i)
return NULL;

if (i == PERF_MEM_EVENTS__LOAD) {
- if (mem_loads_name__init)
+ if (mem_loads_name__init && !pmu_name)
return mem_loads_name;

- mem_loads_name__init = true;
+ if (!pmu_name) {
+ mem_loads_name__init = true;
+ pmu_name = (char *)"cpu";
+ }

- if (pmu_have_event("cpu", "mem-loads-aux")) {
+ if (pmu_have_event(pmu_name, "mem-loads-aux")) {
scnprintf(mem_loads_name, sizeof(mem_loads_name),
- MEM_LOADS_AUX_NAME, perf_mem_events__loads_ldlat);
+ MEM_LOADS_AUX_NAME, pmu_name, pmu_name,
+ perf_mem_events__loads_ldlat);
} else {
scnprintf(mem_loads_name, sizeof(mem_loads_name),
- e->name, perf_mem_events__loads_ldlat);
+ e->name, pmu_name,
+ perf_mem_events__loads_ldlat);
}
return mem_loads_name;
}
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index e3b9d63077ef..a4fd375acdd1 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -2971,13 +2971,13 @@ static int perf_c2c__record(int argc, const char **argv)

if (!e->supported) {
pr_err("failed: event '%s' not supported\n",
- perf_mem_events__name(j));
+ perf_mem_events__name(j, NULL));
free(rec_argv);
return -1;
}

rec_argv[i++] = "-e";
- rec_argv[i++] = perf_mem_events__name(j);
+ rec_argv[i++] = perf_mem_events__name(j, NULL);
}

if (all_user)
diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index cdd2b9f643f6..03795bf49d51 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -135,13 +135,13 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)

if (!e->supported) {
pr_err("failed: event '%s' not supported\n",
- perf_mem_events__name(j));
+ perf_mem_events__name(j, NULL));
free(rec_argv);
return -1;
}

rec_argv[i++] = "-e";
- rec_argv[i++] = perf_mem_events__name(j);
+ rec_argv[i++] = perf_mem_events__name(j, NULL);
}

if (all_user)
diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index f93a852ad838..c736eaded06c 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -37,7 +37,7 @@ struct perf_mem_event * __weak perf_mem_events__ptr(int i)
return &perf_mem_events[i];
}

-char * __weak perf_mem_events__name(int i)
+char * __weak perf_mem_events__name(int i, char *pmu_name __maybe_unused)
{
struct perf_mem_event *e = perf_mem_events__ptr(i);

@@ -141,7 +141,7 @@ void perf_mem_events__list(void)
fprintf(stderr, "%-13s%-*s%s\n",
e->tag ?: "",
verbose > 0 ? 25 : 0,
- verbose > 0 ? perf_mem_events__name(j) : "",
+ verbose > 0 ? perf_mem_events__name(j, NULL) : "",
e->supported ? ": available" : "");
}
}
diff --git a/tools/perf/util/mem-events.h b/tools/perf/util/mem-events.h
index cacdebd65b8a..a3fa19093fd2 100644
--- a/tools/perf/util/mem-events.h
+++ b/tools/perf/util/mem-events.h
@@ -38,7 +38,7 @@ extern unsigned int perf_mem_events__loads_ldlat;
int perf_mem_events__parse(const char *str);
int perf_mem_events__init(void);

-char *perf_mem_events__name(int i);
+char *perf_mem_events__name(int i, char *pmu_name);
struct perf_mem_event *perf_mem_events__ptr(int i);
bool is_mem_loads_aux_event(struct evsel *leader);

--
2.17.1
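
To make the naming scheme of 2/8 concrete, here is a minimal sketch of how the mem-loads event string depends on the pmu name. It reuses the two format strings from the patch, but the function mem_loads_name() and its has_aux parameter are assumptions for this example: in perf the decision comes from pmu_have_event(pmu_name, "mem-loads-aux"), which is not modelled here, and the caching through mem_loads_name__init is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format the mem-loads event string for one PMU.
 * A NULL pmu_name falls back to "cpu", the non-hybrid default.
 * has_aux says whether the PMU exposes a 'mem-loads-aux' event.
 * Returns what snprintf() returns.
 */
static int mem_loads_name(char *buf, size_t size, const char *pmu_name,
			  bool has_aux, unsigned int ldlat)
{
	assert(buf && size > 0);

	if (!pmu_name)
		pmu_name = "cpu";

	if (has_aux)
		return snprintf(buf, size,
				"{%s/mem-loads-aux/,%s/mem-loads,ldlat=%u/}:P",
				pmu_name, pmu_name, ldlat);

	return snprintf(buf, size, "%s/mem-loads,ldlat=%u/P", pmu_name, ldlat);
}
```

For example, mem_loads_name(buf, sizeof(buf), "cpu_core", true, 30) yields the grouped form with the auxiliary event as leader, while a pmu without 'mem-loads-aux' gets the plain "<pmu>/mem-loads,ldlat=30/P" form.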

2021-05-27 15:05:56

by Jin Yao

Subject: [PATCH v2 3/8] perf tools: Support pmu prefix for mem-store event

Enabling the mem-store event doesn't require an auxiliary event,
so just build an event name string with the pmu prefix.

Signed-off-by: Jin Yao <[email protected]>
---
v2:
- New in v2.

tools/perf/arch/x86/util/mem-events.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/tools/perf/arch/x86/util/mem-events.c b/tools/perf/arch/x86/util/mem-events.c
index f9e444a4fe70..5214370ca4e4 100644
--- a/tools/perf/arch/x86/util/mem-events.c
+++ b/tools/perf/arch/x86/util/mem-events.c
@@ -5,6 +5,7 @@

static char mem_loads_name[100];
static bool mem_loads_name__init;
+static char mem_stores_name[100];

#define MEM_LOADS_AUX 0x8203
#define MEM_LOADS_AUX_NAME "{%s/mem-loads-aux/,%s/mem-loads,ldlat=%u/}:P"
@@ -13,7 +14,7 @@ static bool mem_loads_name__init;

static struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
E("ldlat-loads", "%s/mem-loads,ldlat=%u/P", "%s/events/mem-loads"),
- E("ldlat-stores", "cpu/mem-stores/P", "cpu/events/mem-stores"),
+ E("ldlat-stores", "%s/mem-stores/P", "%s/events/mem-stores"),
E(NULL, NULL, NULL),
};

@@ -66,5 +67,14 @@ char *perf_mem_events__name(int i, char *pmu_name)
return mem_loads_name;
}

+ if (i == PERF_MEM_EVENTS__STORE) {
+ if (!pmu_name)
+ pmu_name = (char *)"cpu";
+
+ scnprintf(mem_stores_name, sizeof(mem_stores_name),
+ e->name, pmu_name);
+ return mem_stores_name;
+ }
+
return (char *)e->name;
}
--
2.17.1

2021-05-27 15:06:16

by Jin Yao

Subject: [PATCH v2 8/8] perf c2c: Support record for hybrid platform

Support 'perf c2c record' for hybrid platforms. On a hybrid platform
such as Alderlake, executing 'perf c2c record' actually calls:

record -W -d --phys-data --sample-cpu
-e {cpu_core/mem-loads-aux/,cpu_core/mem-loads,ldlat=30/}:P
-e cpu_atom/mem-loads,ldlat=30/P
-e cpu_core/mem-stores/P
-e cpu_atom/mem-stores/P

Signed-off-by: Jin Yao <[email protected]>
---
v2:
- For hybrid, rec_argc = argc + 11 * perf_pmu__hybrid_pmu_num().
- Directly 'free(rec_tmp[i])', don't need to check NULL.

tools/perf/builtin-c2c.c | 40 +++++++++++++++++++++++-----------------
1 file changed, 23 insertions(+), 17 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index a4fd375acdd1..6dea37f141b2 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -42,6 +42,8 @@
#include "ui/ui.h"
#include "ui/progress.h"
#include "../perf.h"
+#include "pmu.h"
+#include "pmu-hybrid.h"

struct c2c_hists {
struct hists hists;
@@ -2907,8 +2909,9 @@ static const char * const *record_mem_usage = __usage_record;

static int perf_c2c__record(int argc, const char **argv)
{
- int rec_argc, i = 0, j;
+ int rec_argc, i = 0, j, rec_tmp_nr = 0;
const char **rec_argv;
+ char **rec_tmp;
int ret;
bool all_user = false, all_kernel = false;
bool event_set = false;
@@ -2932,11 +2935,21 @@ static int perf_c2c__record(int argc, const char **argv)
argc = parse_options(argc, argv, options, record_mem_usage,
PARSE_OPT_KEEP_UNKNOWN);

- rec_argc = argc + 11; /* max number of arguments */
+ if (!perf_pmu__has_hybrid())
+ rec_argc = argc + 11; /* max number of arguments */
+ else
+ rec_argc = argc + 11 * perf_pmu__hybrid_pmu_num();
+
rec_argv = calloc(rec_argc + 1, sizeof(char *));
if (!rec_argv)
return -1;

+ rec_tmp = calloc(rec_argc + 1, sizeof(char *));
+ if (!rec_tmp) {
+ free(rec_argv);
+ return -1;
+ }
+
rec_argv[i++] = "record";

if (!event_set) {
@@ -2964,21 +2977,9 @@ static int perf_c2c__record(int argc, const char **argv)
rec_argv[i++] = "--phys-data";
rec_argv[i++] = "--sample-cpu";

- for (j = 0; j < PERF_MEM_EVENTS__MAX; j++) {
- e = perf_mem_events__ptr(j);
- if (!e->record)
- continue;
-
- if (!e->supported) {
- pr_err("failed: event '%s' not supported\n",
- perf_mem_events__name(j, NULL));
- free(rec_argv);
- return -1;
- }
-
- rec_argv[i++] = "-e";
- rec_argv[i++] = perf_mem_events__name(j, NULL);
- }
+ ret = perf_mem_events__record_args(rec_argv, &i, rec_tmp, &rec_tmp_nr);
+ if (ret)
+ goto out;

if (all_user)
rec_argv[i++] = "--all-user";
@@ -3002,6 +3003,11 @@ static int perf_c2c__record(int argc, const char **argv)
}

ret = cmd_record(i, rec_argv);
+out:
+ for (i = 0; i < rec_tmp_nr; i++)
+ free(rec_tmp[i]);
+
+ free(rec_tmp);
free(rec_argv);
return ret;
}
--
2.17.1

2021-05-27 15:06:27

by Jin Yao

Subject: [PATCH v2 5/8] perf mem: Support record for hybrid platform

Support 'perf mem record' for hybrid platforms. On a hybrid platform
such as Alderlake, executing 'perf mem record' actually calls:

record -e {cpu_core/mem-loads-aux/,cpu_core/mem-loads,ldlat=30/}:P
-e cpu_atom/mem-loads,ldlat=30/P
-e cpu_core/mem-stores/P
-e cpu_atom/mem-stores/P

Signed-off-by: Jin Yao <[email protected]>
---
v2:
- For hybrid, rec_argc = argc + 9 * perf_pmu__hybrid_pmu_num().
- Directly 'free(rec_tmp[i])', don't need to check NULL.

tools/perf/builtin-mem.c | 43 ++++++++++++++----------
tools/perf/util/mem-events.c | 65 ++++++++++++++++++++++++++++++++++++
tools/perf/util/mem-events.h | 2 ++
3 files changed, 93 insertions(+), 17 deletions(-)

diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 03795bf49d51..6b633df458c2 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -18,6 +18,8 @@
#include "util/dso.h"
#include "util/map.h"
#include "util/symbol.h"
+#include "util/pmu.h"
+#include "util/pmu-hybrid.h"
#include <linux/err.h>

#define MEM_OPERATION_LOAD 0x1
@@ -62,8 +64,9 @@ static const char * const *record_mem_usage = __usage;

static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
{
- int rec_argc, i = 0, j;
+ int rec_argc, i = 0, j, tmp_nr = 0;
const char **rec_argv;
+ char **rec_tmp;
int ret;
bool all_user = false, all_kernel = false;
struct perf_mem_event *e;
@@ -87,11 +90,24 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
argc = parse_options(argc, argv, options, record_mem_usage,
PARSE_OPT_KEEP_UNKNOWN);

- rec_argc = argc + 9; /* max number of arguments */
+ if (!perf_pmu__has_hybrid())
+ rec_argc = argc + 9; /* max number of arguments */
+ else
+ rec_argc = argc + 9 * perf_pmu__hybrid_pmu_num();
+
rec_argv = calloc(rec_argc + 1, sizeof(char *));
if (!rec_argv)
return -1;

+ /*
+ * Save the allocated event name strings.
+ */
+ rec_tmp = calloc(rec_argc + 1, sizeof(char *));
+ if (!rec_tmp) {
+ free(rec_argv);
+ return -1;
+ }
+
rec_argv[i++] = "record";

e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD_STORE);
@@ -128,21 +144,9 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
if (mem->data_page_size)
rec_argv[i++] = "--data-page-size";

- for (j = 0; j < PERF_MEM_EVENTS__MAX; j++) {
- e = perf_mem_events__ptr(j);
- if (!e->record)
- continue;
-
- if (!e->supported) {
- pr_err("failed: event '%s' not supported\n",
- perf_mem_events__name(j, NULL));
- free(rec_argv);
- return -1;
- }
-
- rec_argv[i++] = "-e";
- rec_argv[i++] = perf_mem_events__name(j, NULL);
- }
+ ret = perf_mem_events__record_args(rec_argv, &i, rec_tmp, &tmp_nr);
+ if (ret)
+ goto out;

if (all_user)
rec_argv[i++] = "--all-user";
@@ -164,6 +168,11 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
}

ret = cmd_record(i, rec_argv);
+out:
+ for (i = 0; i < tmp_nr; i++)
+ free(rec_tmp[i]);
+
+ free(rec_tmp);
free(rec_argv);
return ret;
}
diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index 69dcac730ada..f38c0da9e698 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -166,6 +166,71 @@ void perf_mem_events__list(void)
}
}

+static void perf_mem_events__print_unsupport_hybrid(struct perf_mem_event *e,
+ int idx)
+{
+ const char *mnt = sysfs__mount();
+ char sysfs_name[100];
+ struct perf_pmu *pmu;
+
+ perf_pmu__for_each_hybrid_pmu(pmu) {
+ scnprintf(sysfs_name, sizeof(sysfs_name), e->sysfs_name,
+ pmu->name);
+ if (!perf_mem_event__supported(mnt, sysfs_name)) {
+ pr_err("failed: event '%s' not supported\n",
+ perf_mem_events__name(idx, pmu->name));
+ }
+ }
+}
+
+int perf_mem_events__record_args(const char **rec_argv, int *argv_nr,
+ char **rec_tmp, int *tmp_nr)
+{
+ int i = *argv_nr, k = 0;
+ struct perf_mem_event *e;
+ struct perf_pmu *pmu;
+ char *s;
+
+ for (int j = 0; j < PERF_MEM_EVENTS__MAX; j++) {
+ e = perf_mem_events__ptr(j);
+ if (!e->record)
+ continue;
+
+ if (!perf_pmu__has_hybrid()) {
+ if (!e->supported) {
+ pr_err("failed: event '%s' not supported\n",
+ perf_mem_events__name(j, NULL));
+ return -1;
+ }
+
+ rec_argv[i++] = "-e";
+ rec_argv[i++] = perf_mem_events__name(j, NULL);
+ } else {
+ if (!e->supported) {
+ perf_mem_events__print_unsupport_hybrid(e, j);
+ return -1;
+ }
+
+ perf_pmu__for_each_hybrid_pmu(pmu) {
+ rec_argv[i++] = "-e";
+ s = perf_mem_events__name(j, pmu->name);
+ if (s) {
+ s = strdup(s);
+ if (!s)
+ return -1;
+
+ rec_argv[i++] = s;
+ rec_tmp[k++] = s;
+ }
+ }
+ }
+ }
+
+ *argv_nr = i;
+ *tmp_nr = k;
+ return 0;
+}
+
static const char * const tlb_access[] = {
"N/A",
"HIT",
diff --git a/tools/perf/util/mem-events.h b/tools/perf/util/mem-events.h
index a3fa19093fd2..916242f8020a 100644
--- a/tools/perf/util/mem-events.h
+++ b/tools/perf/util/mem-events.h
@@ -43,6 +43,8 @@ struct perf_mem_event *perf_mem_events__ptr(int i);
bool is_mem_loads_aux_event(struct evsel *leader);

void perf_mem_events__list(void);
+int perf_mem_events__record_args(const char **rec_argv, int *argv_nr,
+ char **rec_tmp, int *tmp_nr);

int perf_mem__tlb_scnprintf(char *out, size_t sz, struct mem_info *mem_info);
int perf_mem__lvl_scnprintf(char *out, size_t sz, struct mem_info *mem_info);
--
2.17.1
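
The argument building in 5/8 follows a simple ownership pattern: per-PMU event names are heap copies, and a second array remembers them so they can be freed after the record command has run. The sketch below illustrates that pattern for the mem-stores event only. The function add_store_events() and its parameters are assumptions for this example, and it deliberately differs from the posted patch in two ways: it uses malloc() plus memcpy() instead of strdup() to stay within ISO C, and it updates the counters even on allocation failure so the caller's cleanup loop can free what was already allocated.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: append "-e <pmu>/mem-stores/P" for each PMU.
 * Heap-allocated names are also stored in tmp[] so the caller can free
 * them later. Returns 0 on success, -1 on allocation failure.
 */
static int add_store_events(const char **rec_argv, int *argv_nr,
			    char **tmp, int *tmp_nr,
			    const char **pmus, int nr_pmus)
{
	int i, k, ret = 0;

	assert(rec_argv && argv_nr && tmp && tmp_nr);
	i = *argv_nr;
	k = *tmp_nr;

	for (int p = 0; p < nr_pmus; p++) {
		char name[100];
		size_t len;
		char *s;

		snprintf(name, sizeof(name), "%s/mem-stores/P", pmus[p]);
		len = strlen(name) + 1;

		s = malloc(len);
		if (!s) {
			ret = -1;
			break;
		}
		memcpy(s, name, len);

		rec_argv[i++] = "-e";
		rec_argv[i++] = s;	/* borrowed by the argument vector */
		tmp[k++] = s;		/* owned here, freed by the caller */
	}

	*argv_nr = i;
	*tmp_nr = k;
	return ret;
}
```

The caller would size both arrays for the worst case (the patch uses argc + 9 * perf_pmu__hybrid_pmu_num() entries for 'perf mem'), then free every tmp[] entry followed by both arrays once the record command returns.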

2021-05-31 22:44:45

by Jiri Olsa

Subject: Re: [PATCH v2 6/8] perf mem: Fix wrong verbose output for recording events

On Thu, May 27, 2021 at 08:16:08AM +0800, Jin Yao wrote:
> Current code:
>
> for (j = 0; j < argc; j++, i++)
> rec_argv[i] = argv[j];
>
> if (verbose > 0) {
> pr_debug("calling: record ");
>
> while (rec_argv[j]) {
> pr_debug("%s ", rec_argv[j]);
> j++;
> }
> pr_debug("\n");
> }
>
> The entries of argv[] are copied to the end of rec_argv[], not
> copied to the beginning of rec_argv[]. So the index j at
> rec_argv[] doesn't point to the first event.
>
> Now we record the start index and end index for events in rec_argv[],
> and print them if verbose is enabled.
>
> Signed-off-by: Jin Yao <[email protected]>
> ---
> v2:
> - New in v2.
>
> tools/perf/builtin-mem.c | 8 +++++---

hi,
do we need the same in c2c as well?

jirka

> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
> index 6b633df458c2..0fd2a74dbaca 100644
> --- a/tools/perf/builtin-mem.c
> +++ b/tools/perf/builtin-mem.c
> @@ -65,6 +65,7 @@ static const char * const *record_mem_usage = __usage;
> static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
> {
> int rec_argc, i = 0, j, tmp_nr = 0;
> + int start, end;
> const char **rec_argv;
> char **rec_tmp;
> int ret;
> @@ -144,9 +145,11 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
> if (mem->data_page_size)
> rec_argv[i++] = "--data-page-size";
>
> + start = i;
> ret = perf_mem_events__record_args(rec_argv, &i, rec_tmp, &tmp_nr);
> if (ret)
> goto out;
> + end = i;
>
> if (all_user)
> rec_argv[i++] = "--all-user";
> @@ -160,10 +163,9 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
> if (verbose > 0) {
> pr_debug("calling: record ");
>
> - while (rec_argv[j]) {
> + for (j = start; j < end; j++)
> pr_debug("%s ", rec_argv[j]);
> - j++;
> - }
> +
> pr_debug("\n");
> }
>
> --
> 2.17.1
>

2021-05-31 22:51:30

by Jiri Olsa

Subject: Re: [PATCH v2 0/8] perf: Support perf-mem/perf-c2c for AlderLake

On Thu, May 27, 2021 at 08:16:02AM +0800, Jin Yao wrote:
> AlderLake uses a hybrid architecture utilizing Golden Cove cores
> (core CPU) and Gracemont cores (atom CPU). This patchset supports
> perf-mem and perf-c2c for AlderLake.
>
> v2:
> ---
> - Use mem_loads_name__init to keep original behavior for non-hybrid platform.
> - Move x86 specific perf_mem_events[] to arch/x86/util/mem-events.c.
> - Move mem-store event to a new patch.
> - Add a new patch to fix wrong verbose output for recording events
> - Add a new patch to disable 'mem-loads-aux' group before reporting

Acked-by: Jiri Olsa <[email protected]>

thanks,
jirka

>
> Jin Yao (8):
> perf tools: Check mem-loads auxiliary event
> perf tools: Support pmu prefix for mem-load event
> perf tools: Support pmu prefix for mem-store event
> perf tools: Check if mem_events is supported for hybrid platform
> perf mem: Support record for hybrid platform
> perf mem: Fix wrong verbose output for recording events
> perf mem: Disable 'mem-loads-aux' group before reporting
> perf c2c: Support record for hybrid platform
>
> tools/perf/arch/arm64/util/mem-events.c | 2 +-
> tools/perf/arch/powerpc/util/mem-events.c | 2 +-
> tools/perf/arch/x86/util/mem-events.c | 54 ++++++++++--
> tools/perf/builtin-c2c.c | 40 +++++----
> tools/perf/builtin-mem.c | 51 ++++++-----
> tools/perf/builtin-report.c | 2 +
> tools/perf/util/evlist.c | 25 ++++++
> tools/perf/util/evlist.h | 1 +
> tools/perf/util/mem-events.c | 101 ++++++++++++++++++++--
> tools/perf/util/mem-events.h | 4 +-
> 10 files changed, 225 insertions(+), 57 deletions(-)
>
> --
> 2.17.1
>

2021-06-01 01:15:01

by Jin Yao

Subject: Re: [PATCH v2 6/8] perf mem: Fix wrong verbose output for recording events

Hi Jiri,

On 6/1/2021 6:42 AM, Jiri Olsa wrote:
> On Thu, May 27, 2021 at 08:16:08AM +0800, Jin Yao wrote:
>> Current code:
>>
>> for (j = 0; j < argc; j++, i++)
>> rec_argv[i] = argv[j];
>>
>> if (verbose > 0) {
>> pr_debug("calling: record ");
>>
>> while (rec_argv[j]) {
>> pr_debug("%s ", rec_argv[j]);
>> j++;
>> }
>> pr_debug("\n");
>> }
>>
>> The entries of argv[] are copied to the end of rec_argv[], not
>> copied to the beginning of rec_argv[]. So the index j at
>> rec_argv[] doesn't point to the first event.
>>
>> Now we record the start index and end index for events in rec_argv[],
>> and print them if verbose is enabled.
>>
>> Signed-off-by: Jin Yao <[email protected]>
>> ---
>> v2:
>> - New in v2.
>>
>> tools/perf/builtin-mem.c | 8 +++++---
>
> hi,
> do we need the same in c2c as well?
>
> jirka
>

perf c2c is a bit different. It sets 'j = 0;' before the 'while' loop, so it prints all of rec_argv[].

In a test:

# perf c2c record -vvv -a -- sleep 1

calling: record -W -d --phys-data --sample-cpu -e
{cpu_core/mem-loads-aux/,cpu_core/mem-loads,ldlat=30/}:P -e cpu_atom/mem-loads,ldlat=30/P -e
cpu_core/mem-stores/P -e cpu_atom/mem-stores/P -a sleep 1

The verbose output looks OK.

Thanks
Jin Yao

>> 1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
>> index 6b633df458c2..0fd2a74dbaca 100644
>> --- a/tools/perf/builtin-mem.c
>> +++ b/tools/perf/builtin-mem.c
>> @@ -65,6 +65,7 @@ static const char * const *record_mem_usage = __usage;
>> static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
>> {
>> int rec_argc, i = 0, j, tmp_nr = 0;
>> + int start, end;
>> const char **rec_argv;
>> char **rec_tmp;
>> int ret;
>> @@ -144,9 +145,11 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
>> if (mem->data_page_size)
>> rec_argv[i++] = "--data-page-size";
>>
>> + start = i;
>> ret = perf_mem_events__record_args(rec_argv, &i, rec_tmp, &tmp_nr);
>> if (ret)
>> goto out;
>> + end = i;
>>
>> if (all_user)
>> rec_argv[i++] = "--all-user";
>> @@ -160,10 +163,9 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
>> if (verbose > 0) {
>> pr_debug("calling: record ");
>>
>> - while (rec_argv[j]) {
>> + for (j = start; j < end; j++)
>> pr_debug("%s ", rec_argv[j]);
>> - j++;
>> - }
>> +
>> pr_debug("\n");
>> }
>>
>> --
>> 2.17.1
>>
>

2021-06-01 14:10:13

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH v2 0/8] perf: Support perf-mem/perf-c2c for AlderLake

Em Tue, Jun 01, 2021 at 12:49:34AM +0200, Jiri Olsa escreveu:
> On Thu, May 27, 2021 at 08:16:02AM +0800, Jin Yao wrote:
> > AlderLake uses a hybrid architecture utilizing Golden Cove cores
> > (core CPU) and Gracemont cores (atom CPU). This patchset supports
> > perf-mem and perf-c2c for AlderLake.
> >
> > v2:
> > ---
> > - Use mem_loads_name__init to keep original behavior for non-hybrid platform.
> > - Move x86 specific perf_mem_events[] to arch/x86/util/mem-events.c.
> > - Move mem-store event to a new patch.
> > - Add a new patch to fix wrong verbose output for recording events
> > - Add a new patch to disable 'mem-loads-aux' group before reporting
>
> Acked-by: Jiri Olsa <[email protected]>

Thanks, applied.

- Arnaldo


> thanks,
> jirka
>
> >
> > Jin Yao (8):
> > perf tools: Check mem-loads auxiliary event
> > perf tools: Support pmu prefix for mem-load event
> > perf tools: Support pmu prefix for mem-store event
> > perf tools: Check if mem_events is supported for hybrid platform
> > perf mem: Support record for hybrid platform
> > perf mem: Fix wrong verbose output for recording events
> > perf mem: Disable 'mem-loads-aux' group before reporting
> > perf c2c: Support record for hybrid platform
> >
> > tools/perf/arch/arm64/util/mem-events.c | 2 +-
> > tools/perf/arch/powerpc/util/mem-events.c | 2 +-
> > tools/perf/arch/x86/util/mem-events.c | 54 ++++++++++--
> > tools/perf/builtin-c2c.c | 40 +++++----
> > tools/perf/builtin-mem.c | 51 ++++++-----
> > tools/perf/builtin-report.c | 2 +
> > tools/perf/util/evlist.c | 25 ++++++
> > tools/perf/util/evlist.h | 1 +
> > tools/perf/util/mem-events.c | 101 ++++++++++++++++++++--
> > tools/perf/util/mem-events.h | 4 +-
> > 10 files changed, 225 insertions(+), 57 deletions(-)
> >
> > --
> > 2.17.1
> >
>

--

- Arnaldo