2023-12-13 19:53:16

by Liang, Kan

[permalink] [raw]
Subject: [PATCH V3 0/7] Clean up perf mem

From: Kan Liang <[email protected]>

Changes since V2:
- Fix the Arm64 building error (Leo)
- Add two new patches to clean up perf_mem_events__record_args()
and perf_pmus__num_mem_pmus() (Leo)

Changes since V1:
- Fix strcmp of PMU name checking (Ravi)
- Fix "/," typo (Ian)
- Rename several functions with perf_pmu__mem_events prefix. (Ian)
- Fold the header removal patch into the patch where the cleanups made.
(Arnaldo)
- Add reviewed-by and tested-by from Ian and Ravi

As discussed in the below thread, the patch set is to clean up perf mem.
https://lore.kernel.org/lkml/[email protected]/

Introduce generic functions perf_mem_events__ptr(),
perf_mem_events__name() ,and is_mem_loads_aux_event() to replace the
ARCH specific ones.
Simplify the perf_mem_event__supported().

Only keeps the ARCH-specific perf_mem_events array in the corresponding
mem-events.c for each ARCH.

There is no functional change.

The patch set touches almost all the ARCHs, Intel, AMD, ARM, Power and
etc. But I can only test it on two Intel platforms.
Please give it try, if you have machines with other ARCHs.

Here are the test results:
Intel hybrid machine:

$perf mem record -e list
ldlat-loads : available
ldlat-stores : available

$perf mem record -e ldlat-loads -v --ldlat 50
calling: record -e cpu_atom/mem-loads,ldlat=50/P -e cpu_core/mem-loads,ldlat=50/P

$perf mem record -v
calling: record -e cpu_atom/mem-loads,ldlat=30/P -e cpu_atom/mem-stores/P -e cpu_core/mem-loads,ldlat=30/P -e cpu_core/mem-stores/P

$perf mem record -t store -v
calling: record -e cpu_atom/mem-stores/P -e cpu_core/mem-stores/P


Intel SPR:
$perf mem record -e list
ldlat-loads : available
ldlat-stores : available

$perf mem record -e ldlat-loads -v --ldlat 50
calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=50/}:P

$perf mem record -v
calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=30/}:P -e cpu/mem-stores/P

$perf mem record -t store -v
calling: record -e cpu/mem-stores/P

Kan Liang (7):
perf mem: Add mem_events into the supported perf_pmu
perf mem: Clean up perf_mem_events__ptr()
perf mem: Clean up perf_mem_events__name()
perf mem: Clean up perf_mem_event__supported()
perf mem: Clean up is_mem_loads_aux_event()
perf mem: Clean up perf_mem_events__record_args()
perf mem: Clean up perf_pmus__num_mem_pmus()

tools/perf/arch/arm/util/pmu.c | 3 +
tools/perf/arch/arm64/util/mem-events.c | 39 +---
tools/perf/arch/arm64/util/mem-events.h | 7 +
tools/perf/arch/powerpc/util/mem-events.c | 13 +-
tools/perf/arch/powerpc/util/mem-events.h | 7 +
tools/perf/arch/powerpc/util/pmu.c | 11 ++
tools/perf/arch/s390/util/pmu.c | 3 +
tools/perf/arch/x86/util/mem-events.c | 99 ++--------
tools/perf/arch/x86/util/mem-events.h | 10 +
tools/perf/arch/x86/util/pmu.c | 19 +-
tools/perf/builtin-c2c.c | 45 ++---
tools/perf/builtin-mem.c | 48 ++---
tools/perf/util/mem-events.c | 217 +++++++++++++---------
tools/perf/util/mem-events.h | 19 +-
tools/perf/util/pmu.c | 4 +-
tools/perf/util/pmu.h | 7 +
tools/perf/util/pmus.c | 6 -
tools/perf/util/pmus.h | 1 -
18 files changed, 278 insertions(+), 280 deletions(-)
create mode 100644 tools/perf/arch/arm64/util/mem-events.h
create mode 100644 tools/perf/arch/powerpc/util/mem-events.h
create mode 100644 tools/perf/arch/powerpc/util/pmu.c
create mode 100644 tools/perf/arch/x86/util/mem-events.h

--
2.35.1


2023-12-13 19:53:23

by Liang, Kan

[permalink] [raw]
Subject: [PATCH V3 1/7] perf mem: Add mem_events into the supported perf_pmu

From: Kan Liang <[email protected]>

With the mem_events, perf doesn't need to read sysfs for each PMU to
find the mem-events-supported PMU. The patch also makes it possible to
clean up the related __weak functions later.

The patch is only to add the mem_events into the perf_pmu for all ARCHs.
It will be used in the later cleanup patches.

Reviewed-by: Ian Rogers <[email protected]>
Tested-by: Ravi Bangoria <[email protected]>
Suggested-by: Leo Yan <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
tools/perf/arch/arm/util/pmu.c | 3 +++
tools/perf/arch/arm64/util/mem-events.c | 7 ++++---
tools/perf/arch/arm64/util/mem-events.h | 7 +++++++
tools/perf/arch/s390/util/pmu.c | 3 +++
tools/perf/arch/x86/util/mem-events.c | 4 ++--
tools/perf/arch/x86/util/mem-events.h | 9 +++++++++
tools/perf/arch/x86/util/pmu.c | 7 +++++++
tools/perf/util/mem-events.c | 2 +-
tools/perf/util/mem-events.h | 1 +
tools/perf/util/pmu.c | 4 +++-
tools/perf/util/pmu.h | 7 +++++++
11 files changed, 47 insertions(+), 7 deletions(-)
create mode 100644 tools/perf/arch/arm64/util/mem-events.h
create mode 100644 tools/perf/arch/x86/util/mem-events.h

diff --git a/tools/perf/arch/arm/util/pmu.c b/tools/perf/arch/arm/util/pmu.c
index 7f3af3b97f3b..8b7cb68ba1a8 100644
--- a/tools/perf/arch/arm/util/pmu.c
+++ b/tools/perf/arch/arm/util/pmu.c
@@ -13,6 +13,7 @@
#include "hisi-ptt.h"
#include "../../../util/pmu.h"
#include "../../../util/cs-etm.h"
+#include "../../arm64/util/mem-events.h"

void perf_pmu__arch_init(struct perf_pmu *pmu __maybe_unused)
{
@@ -26,6 +27,8 @@ void perf_pmu__arch_init(struct perf_pmu *pmu __maybe_unused)
pmu->selectable = true;
pmu->is_uncore = false;
pmu->perf_event_attr_init_default = arm_spe_pmu_default_config;
+ if (!strcmp(pmu->name, "arm_spe_0"))
+ pmu->mem_events = perf_mem_events_arm;
} else if (strstarts(pmu->name, HISI_PTT_PMU_NAME)) {
pmu->selectable = true;
#endif
diff --git a/tools/perf/arch/arm64/util/mem-events.c b/tools/perf/arch/arm64/util/mem-events.c
index 3bcc5c7035c2..edf8207f7812 100644
--- a/tools/perf/arch/arm64/util/mem-events.c
+++ b/tools/perf/arch/arm64/util/mem-events.c
@@ -1,10 +1,11 @@
// SPDX-License-Identifier: GPL-2.0
-#include "map_symbol.h"
+#include "util/map_symbol.h"
+#include "util/mem-events.h"
#include "mem-events.h"

#define E(t, n, s) { .tag = t, .name = n, .sysfs_name = s }

-static struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
+struct perf_mem_event perf_mem_events_arm[PERF_MEM_EVENTS__MAX] = {
E("spe-load", "arm_spe_0/ts_enable=1,pa_enable=1,load_filter=1,store_filter=0,min_latency=%u/", "arm_spe_0"),
E("spe-store", "arm_spe_0/ts_enable=1,pa_enable=1,load_filter=0,store_filter=1/", "arm_spe_0"),
E("spe-ldst", "arm_spe_0/ts_enable=1,pa_enable=1,load_filter=1,store_filter=1,min_latency=%u/", "arm_spe_0"),
@@ -17,7 +18,7 @@ struct perf_mem_event *perf_mem_events__ptr(int i)
if (i >= PERF_MEM_EVENTS__MAX)
return NULL;

- return &perf_mem_events[i];
+ return &perf_mem_events_arm[i];
}

const char *perf_mem_events__name(int i, const char *pmu_name __maybe_unused)
diff --git a/tools/perf/arch/arm64/util/mem-events.h b/tools/perf/arch/arm64/util/mem-events.h
new file mode 100644
index 000000000000..5fc50be4be38
--- /dev/null
+++ b/tools/perf/arch/arm64/util/mem-events.h
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ARM64_MEM_EVENTS_H
+#define _ARM64_MEM_EVENTS_H
+
+extern struct perf_mem_event perf_mem_events_arm[PERF_MEM_EVENTS__MAX];
+
+#endif /* _ARM64_MEM_EVENTS_H */
diff --git a/tools/perf/arch/s390/util/pmu.c b/tools/perf/arch/s390/util/pmu.c
index 886c30e001fa..225d7dc2379c 100644
--- a/tools/perf/arch/s390/util/pmu.c
+++ b/tools/perf/arch/s390/util/pmu.c
@@ -19,4 +19,7 @@ void perf_pmu__arch_init(struct perf_pmu *pmu)
!strcmp(pmu->name, S390_PMUPAI_EXT) ||
!strcmp(pmu->name, S390_PMUCPUM_CF))
pmu->selectable = true;
+
+ if (pmu->is_core)
+ pmu->mem_events = perf_mem_events;
}
diff --git a/tools/perf/arch/x86/util/mem-events.c b/tools/perf/arch/x86/util/mem-events.c
index 191b372f9a2d..2b81d229982c 100644
--- a/tools/perf/arch/x86/util/mem-events.c
+++ b/tools/perf/arch/x86/util/mem-events.c
@@ -16,13 +16,13 @@ static char mem_stores_name[100];

#define E(t, n, s) { .tag = t, .name = n, .sysfs_name = s }

-static struct perf_mem_event perf_mem_events_intel[PERF_MEM_EVENTS__MAX] = {
+struct perf_mem_event perf_mem_events_intel[PERF_MEM_EVENTS__MAX] = {
E("ldlat-loads", "%s/mem-loads,ldlat=%u/P", "%s/events/mem-loads"),
E("ldlat-stores", "%s/mem-stores/P", "%s/events/mem-stores"),
E(NULL, NULL, NULL),
};

-static struct perf_mem_event perf_mem_events_amd[PERF_MEM_EVENTS__MAX] = {
+struct perf_mem_event perf_mem_events_amd[PERF_MEM_EVENTS__MAX] = {
E(NULL, NULL, NULL),
E(NULL, NULL, NULL),
E("mem-ldst", "ibs_op//", "ibs_op"),
diff --git a/tools/perf/arch/x86/util/mem-events.h b/tools/perf/arch/x86/util/mem-events.h
new file mode 100644
index 000000000000..3959e427f482
--- /dev/null
+++ b/tools/perf/arch/x86/util/mem-events.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _X86_MEM_EVENTS_H
+#define _X86_MEM_EVENTS_H
+
+extern struct perf_mem_event perf_mem_events_intel[PERF_MEM_EVENTS__MAX];
+
+extern struct perf_mem_event perf_mem_events_amd[PERF_MEM_EVENTS__MAX];
+
+#endif /* _X86_MEM_EVENTS_H */
diff --git a/tools/perf/arch/x86/util/pmu.c b/tools/perf/arch/x86/util/pmu.c
index 469555ae9b3c..cd22e80e5657 100644
--- a/tools/perf/arch/x86/util/pmu.c
+++ b/tools/perf/arch/x86/util/pmu.c
@@ -15,6 +15,7 @@
#include "../../../util/pmu.h"
#include "../../../util/fncache.h"
#include "../../../util/pmus.h"
+#include "mem-events.h"
#include "env.h"

void perf_pmu__arch_init(struct perf_pmu *pmu __maybe_unused)
@@ -30,6 +31,12 @@ void perf_pmu__arch_init(struct perf_pmu *pmu __maybe_unused)
pmu->selectable = true;
}
#endif
+
+ if (x86__is_amd_cpu()) {
+ if (!strcmp(pmu->name, "ibs_op"))
+ pmu->mem_events = perf_mem_events_amd;
+ } else if (pmu->is_core)
+ pmu->mem_events = perf_mem_events_intel;
}

int perf_pmus__num_mem_pmus(void)
diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index 3a2e3687878c..0a8f415f5efe 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -19,7 +19,7 @@ unsigned int perf_mem_events__loads_ldlat = 30;

#define E(t, n, s) { .tag = t, .name = n, .sysfs_name = s }

-static struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
+struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
E("ldlat-loads", "cpu/mem-loads,ldlat=%u/P", "cpu/events/mem-loads"),
E("ldlat-stores", "cpu/mem-stores/P", "cpu/events/mem-stores"),
E(NULL, NULL, NULL),
diff --git a/tools/perf/util/mem-events.h b/tools/perf/util/mem-events.h
index b40ad6ea93fc..8c5694b2d0b0 100644
--- a/tools/perf/util/mem-events.h
+++ b/tools/perf/util/mem-events.h
@@ -34,6 +34,7 @@ enum {
};

extern unsigned int perf_mem_events__loads_ldlat;
+extern struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX];

int perf_mem_events__parse(const char *str);
int perf_mem_events__init(void);
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 3c9609944a2f..3d4373b8ab63 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -986,8 +986,10 @@ static int pmu_max_precise(int dirfd, struct perf_pmu *pmu)
}

void __weak
-perf_pmu__arch_init(struct perf_pmu *pmu __maybe_unused)
+perf_pmu__arch_init(struct perf_pmu *pmu)
{
+ if (pmu->is_core)
+ pmu->mem_events = perf_mem_events;
}

struct perf_pmu *perf_pmu__lookup(struct list_head *pmus, int dirfd, const char *name)
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index 424c3fee0949..e35d985206db 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -10,6 +10,8 @@
#include <stdio.h>
#include "parse-events.h"
#include "pmu-events/pmu-events.h"
+#include "map_symbol.h"
+#include "mem-events.h"

struct evsel_config_term;
struct perf_cpu_map;
@@ -162,6 +164,11 @@ struct perf_pmu {
*/
bool exclude_guest;
} missing_features;
+
+ /**
+ * @mem_events: List of the supported mem events
+ */
+ struct perf_mem_event *mem_events;
};

/** @perf_pmu__fake: A special global PMU used for testing. */
--
2.35.1

2023-12-13 19:53:25

by Liang, Kan

[permalink] [raw]
Subject: [PATCH V3 2/7] perf mem: Clean up perf_mem_events__ptr()

From: Kan Liang <[email protected]>

The mem_events can be retrieved from the struct perf_pmu now. An ARCH
specific perf_mem_events__ptr() is not required anymore. Remove all of
them.

The Intel hybrid has multiple mem-events-supported PMUs. But they share
the same mem_events. Other ARCHs only support one mem-events-supported
PMU. In the configuration, it's good enough to only configure the
mem_events for one PMU. Add perf_mem_events_find_pmu() which returns the
first mem-events-supported PMU.

In the perf_mem_events__init(), the perf_pmus__scan() is not required
anymore. It avoids checking the sysfs for every PMU on the system.

Make the perf_mem_events__record_args() more generic. Remove the
perf_mem_events__print_unsupport_hybrid().

Since pmu is added as a new parameter, rename perf_mem_events__ptr() to
perf_pmu__mem_events_ptr(). Several other functions also do a similar
rename.

Reviewed-by: Ian Rogers <[email protected]>
Tested-by: Ravi Bangoria <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
tools/perf/arch/arm64/util/mem-events.c | 10 +--
tools/perf/arch/x86/util/mem-events.c | 18 ++---
tools/perf/builtin-c2c.c | 28 +++++--
tools/perf/builtin-mem.c | 28 +++++--
tools/perf/util/mem-events.c | 103 ++++++++++++------------
tools/perf/util/mem-events.h | 9 ++-
6 files changed, 104 insertions(+), 92 deletions(-)

diff --git a/tools/perf/arch/arm64/util/mem-events.c b/tools/perf/arch/arm64/util/mem-events.c
index edf8207f7812..d3e69a520c2b 100644
--- a/tools/perf/arch/arm64/util/mem-events.c
+++ b/tools/perf/arch/arm64/util/mem-events.c
@@ -13,17 +13,9 @@ struct perf_mem_event perf_mem_events_arm[PERF_MEM_EVENTS__MAX] = {

static char mem_ev_name[100];

-struct perf_mem_event *perf_mem_events__ptr(int i)
-{
- if (i >= PERF_MEM_EVENTS__MAX)
- return NULL;
-
- return &perf_mem_events_arm[i];
-}
-
const char *perf_mem_events__name(int i, const char *pmu_name __maybe_unused)
{
- struct perf_mem_event *e = perf_mem_events__ptr(i);
+ struct perf_mem_event *e = &perf_mem_events_arm[i];

if (i >= PERF_MEM_EVENTS__MAX)
return NULL;
diff --git a/tools/perf/arch/x86/util/mem-events.c b/tools/perf/arch/x86/util/mem-events.c
index 2b81d229982c..5fb41d50118d 100644
--- a/tools/perf/arch/x86/util/mem-events.c
+++ b/tools/perf/arch/x86/util/mem-events.c
@@ -28,17 +28,6 @@ struct perf_mem_event perf_mem_events_amd[PERF_MEM_EVENTS__MAX] = {
E("mem-ldst", "ibs_op//", "ibs_op"),
};

-struct perf_mem_event *perf_mem_events__ptr(int i)
-{
- if (i >= PERF_MEM_EVENTS__MAX)
- return NULL;
-
- if (x86__is_amd_cpu())
- return &perf_mem_events_amd[i];
-
- return &perf_mem_events_intel[i];
-}
-
bool is_mem_loads_aux_event(struct evsel *leader)
{
struct perf_pmu *pmu = perf_pmus__find("cpu");
@@ -54,7 +43,12 @@ bool is_mem_loads_aux_event(struct evsel *leader)

const char *perf_mem_events__name(int i, const char *pmu_name)
{
- struct perf_mem_event *e = perf_mem_events__ptr(i);
+ struct perf_mem_event *e;
+
+ if (x86__is_amd_cpu())
+ e = &perf_mem_events_amd[i];
+ else
+ e = &perf_mem_events_intel[i];

if (!e)
return NULL;
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index f78eea9e2153..838481505e08 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -3215,12 +3215,19 @@ static int parse_record_events(const struct option *opt,
const char *str, int unset __maybe_unused)
{
bool *event_set = (bool *) opt->value;
+ struct perf_pmu *pmu;
+
+ pmu = perf_mem_events_find_pmu();
+ if (!pmu) {
+ pr_err("failed: there is no PMU that supports perf c2c\n");
+ exit(-1);
+ }

if (!strcmp(str, "list")) {
- perf_mem_events__list();
+ perf_pmu__mem_events_list(pmu);
exit(0);
}
- if (perf_mem_events__parse(str))
+ if (perf_pmu__mem_events_parse(pmu, str))
exit(-1);

*event_set = true;
@@ -3245,6 +3252,7 @@ static int perf_c2c__record(int argc, const char **argv)
bool all_user = false, all_kernel = false;
bool event_set = false;
struct perf_mem_event *e;
+ struct perf_pmu *pmu;
struct option options[] = {
OPT_CALLBACK('e', "event", &event_set, "event",
"event selector. Use 'perf c2c record -e list' to list available events",
@@ -3256,7 +3264,13 @@ static int perf_c2c__record(int argc, const char **argv)
OPT_END()
};

- if (perf_mem_events__init()) {
+ pmu = perf_mem_events_find_pmu();
+ if (!pmu) {
+ pr_err("failed: no PMU supports the memory events\n");
+ return -1;
+ }
+
+ if (perf_pmu__mem_events_init(pmu)) {
pr_err("failed: memory events not supported\n");
return -1;
}
@@ -3280,7 +3294,7 @@ static int perf_c2c__record(int argc, const char **argv)
rec_argv[i++] = "record";

if (!event_set) {
- e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD_STORE);
+ e = perf_pmu__mem_events_ptr(pmu, PERF_MEM_EVENTS__LOAD_STORE);
/*
* The load and store operations are required, use the event
* PERF_MEM_EVENTS__LOAD_STORE if it is supported.
@@ -3289,15 +3303,15 @@ static int perf_c2c__record(int argc, const char **argv)
e->record = true;
rec_argv[i++] = "-W";
} else {
- e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD);
+ e = perf_pmu__mem_events_ptr(pmu, PERF_MEM_EVENTS__LOAD);
e->record = true;

- e = perf_mem_events__ptr(PERF_MEM_EVENTS__STORE);
+ e = perf_pmu__mem_events_ptr(pmu, PERF_MEM_EVENTS__STORE);
e->record = true;
}
}

- e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD);
+ e = perf_pmu__mem_events_ptr(pmu, PERF_MEM_EVENTS__LOAD);
if (e->record)
rec_argv[i++] = "-W";

diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 51499c20da01..ef64bae77ca7 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -43,12 +43,19 @@ static int parse_record_events(const struct option *opt,
const char *str, int unset __maybe_unused)
{
struct perf_mem *mem = *(struct perf_mem **)opt->value;
+ struct perf_pmu *pmu;
+
+ pmu = perf_mem_events_find_pmu();
+ if (!pmu) {
+ pr_err("failed: there is no PMU that supports perf mem\n");
+ exit(-1);
+ }

if (!strcmp(str, "list")) {
- perf_mem_events__list();
+ perf_pmu__mem_events_list(pmu);
exit(0);
}
- if (perf_mem_events__parse(str))
+ if (perf_pmu__mem_events_parse(pmu, str))
exit(-1);

mem->operation = 0;
@@ -72,6 +79,7 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
int ret;
bool all_user = false, all_kernel = false;
struct perf_mem_event *e;
+ struct perf_pmu *pmu;
struct option options[] = {
OPT_CALLBACK('e', "event", &mem, "event",
"event selector. use 'perf mem record -e list' to list available events",
@@ -84,7 +92,13 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
OPT_END()
};

- if (perf_mem_events__init()) {
+ pmu = perf_mem_events_find_pmu();
+ if (!pmu) {
+ pr_err("failed: no PMU supports the memory events\n");
+ return -1;
+ }
+
+ if (perf_pmu__mem_events_init(pmu)) {
pr_err("failed: memory events not supported\n");
return -1;
}
@@ -113,7 +127,7 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)

rec_argv[i++] = "record";

- e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD_STORE);
+ e = perf_pmu__mem_events_ptr(pmu, PERF_MEM_EVENTS__LOAD_STORE);

/*
* The load and store operations are required, use the event
@@ -126,17 +140,17 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
rec_argv[i++] = "-W";
} else {
if (mem->operation & MEM_OPERATION_LOAD) {
- e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD);
+ e = perf_pmu__mem_events_ptr(pmu, PERF_MEM_EVENTS__LOAD);
e->record = true;
}

if (mem->operation & MEM_OPERATION_STORE) {
- e = perf_mem_events__ptr(PERF_MEM_EVENTS__STORE);
+ e = perf_pmu__mem_events_ptr(pmu, PERF_MEM_EVENTS__STORE);
e->record = true;
}
}

- e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD);
+ e = perf_pmu__mem_events_ptr(pmu, PERF_MEM_EVENTS__LOAD);
if (e->record)
rec_argv[i++] = "-W";

diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index 0a8f415f5efe..27a33dc44964 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -29,17 +29,42 @@ struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
static char mem_loads_name[100];
static bool mem_loads_name__init;

-struct perf_mem_event * __weak perf_mem_events__ptr(int i)
+struct perf_mem_event *perf_pmu__mem_events_ptr(struct perf_pmu *pmu, int i)
{
- if (i >= PERF_MEM_EVENTS__MAX)
+ if (i >= PERF_MEM_EVENTS__MAX || !pmu)
return NULL;

- return &perf_mem_events[i];
+ return &pmu->mem_events[i];
+}
+
+static struct perf_pmu *perf_pmus__scan_mem(struct perf_pmu *pmu)
+{
+ while ((pmu = perf_pmus__scan(pmu)) != NULL) {
+ if (pmu->mem_events)
+ return pmu;
+ }
+ return NULL;
+}
+
+struct perf_pmu *perf_mem_events_find_pmu(void)
+{
+ /*
+ * The current perf mem doesn't support per-PMU configuration.
+ * The exact same configuration is applied to all the
+ * mem_events supported PMUs.
+ * Return the first mem_events supported PMU.
+ *
+ * Notes: The only case which may support multiple mem_events
+ * supported PMUs is Intel hybrid. The exact same mem_events
+ * is shared among the PMUs. Only configure the first PMU
+ * is good enough as well.
+ */
+ return perf_pmus__scan_mem(NULL);
}

const char * __weak perf_mem_events__name(int i, const char *pmu_name __maybe_unused)
{
- struct perf_mem_event *e = perf_mem_events__ptr(i);
+ struct perf_mem_event *e = &perf_mem_events[i];

if (!e)
return NULL;
@@ -61,7 +86,7 @@ __weak bool is_mem_loads_aux_event(struct evsel *leader __maybe_unused)
return false;
}

-int perf_mem_events__parse(const char *str)
+int perf_pmu__mem_events_parse(struct perf_pmu *pmu, const char *str)
{
char *tok, *saveptr = NULL;
bool found = false;
@@ -79,7 +104,7 @@ int perf_mem_events__parse(const char *str)

while (tok) {
for (j = 0; j < PERF_MEM_EVENTS__MAX; j++) {
- struct perf_mem_event *e = perf_mem_events__ptr(j);
+ struct perf_mem_event *e = perf_pmu__mem_events_ptr(pmu, j);

if (!e->tag)
continue;
@@ -112,7 +137,7 @@ static bool perf_mem_event__supported(const char *mnt, struct perf_pmu *pmu,
return !stat(path, &st);
}

-int perf_mem_events__init(void)
+int perf_pmu__mem_events_init(struct perf_pmu *pmu)
{
const char *mnt = sysfs__mount();
bool found = false;
@@ -122,8 +147,7 @@ int perf_mem_events__init(void)
return -ENOENT;

for (j = 0; j < PERF_MEM_EVENTS__MAX; j++) {
- struct perf_mem_event *e = perf_mem_events__ptr(j);
- struct perf_pmu *pmu = NULL;
+ struct perf_mem_event *e = perf_pmu__mem_events_ptr(pmu, j);

/*
* If the event entry isn't valid, skip initialization
@@ -132,29 +156,20 @@ int perf_mem_events__init(void)
if (!e->tag)
continue;

- /*
- * Scan all PMUs not just core ones, since perf mem/c2c on
- * platforms like AMD uses IBS OP PMU which is independent
- * of core PMU.
- */
- while ((pmu = perf_pmus__scan(pmu)) != NULL) {
- e->supported |= perf_mem_event__supported(mnt, pmu, e);
- if (e->supported) {
- found = true;
- break;
- }
- }
+ e->supported |= perf_mem_event__supported(mnt, pmu, e);
+ if (e->supported)
+ found = true;
}

return found ? 0 : -ENOENT;
}

-void perf_mem_events__list(void)
+void perf_pmu__mem_events_list(struct perf_pmu *pmu)
{
int j;

for (j = 0; j < PERF_MEM_EVENTS__MAX; j++) {
- struct perf_mem_event *e = perf_mem_events__ptr(j);
+ struct perf_mem_event *e = perf_pmu__mem_events_ptr(pmu, j);

fprintf(stderr, "%-*s%-*s%s",
e->tag ? 13 : 0,
@@ -165,50 +180,32 @@ void perf_mem_events__list(void)
}
}

-static void perf_mem_events__print_unsupport_hybrid(struct perf_mem_event *e,
- int idx)
-{
- const char *mnt = sysfs__mount();
- struct perf_pmu *pmu = NULL;
-
- while ((pmu = perf_pmus__scan(pmu)) != NULL) {
- if (!perf_mem_event__supported(mnt, pmu, e)) {
- pr_err("failed: event '%s' not supported\n",
- perf_mem_events__name(idx, pmu->name));
- }
- }
-}
-
int perf_mem_events__record_args(const char **rec_argv, int *argv_nr,
char **rec_tmp, int *tmp_nr)
{
const char *mnt = sysfs__mount();
+ struct perf_pmu *pmu = NULL;
int i = *argv_nr, k = 0;
struct perf_mem_event *e;

- for (int j = 0; j < PERF_MEM_EVENTS__MAX; j++) {
- e = perf_mem_events__ptr(j);
- if (!e->record)
- continue;

- if (perf_pmus__num_mem_pmus() == 1) {
- if (!e->supported) {
- pr_err("failed: event '%s' not supported\n",
- perf_mem_events__name(j, NULL));
- return -1;
- }
+ while ((pmu = perf_pmus__scan_mem(pmu)) != NULL) {
+ for (int j = 0; j < PERF_MEM_EVENTS__MAX; j++) {
+ e = perf_pmu__mem_events_ptr(pmu, j);

- rec_argv[i++] = "-e";
- rec_argv[i++] = perf_mem_events__name(j, NULL);
- } else {
- struct perf_pmu *pmu = NULL;
+ if (!e->record)
+ continue;

if (!e->supported) {
- perf_mem_events__print_unsupport_hybrid(e, j);
+ pr_err("failed: event '%s' not supported\n",
+ perf_mem_events__name(j, pmu->name));
return -1;
}

- while ((pmu = perf_pmus__scan(pmu)) != NULL) {
+ if (perf_pmus__num_mem_pmus() == 1) {
+ rec_argv[i++] = "-e";
+ rec_argv[i++] = perf_mem_events__name(j, NULL);
+ } else {
const char *s = perf_mem_events__name(j, pmu->name);

if (!perf_mem_event__supported(mnt, pmu, e))
diff --git a/tools/perf/util/mem-events.h b/tools/perf/util/mem-events.h
index 8c5694b2d0b0..0ad301a2e424 100644
--- a/tools/perf/util/mem-events.h
+++ b/tools/perf/util/mem-events.h
@@ -36,14 +36,15 @@ enum {
extern unsigned int perf_mem_events__loads_ldlat;
extern struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX];

-int perf_mem_events__parse(const char *str);
-int perf_mem_events__init(void);
+int perf_pmu__mem_events_parse(struct perf_pmu *pmu, const char *str);
+int perf_pmu__mem_events_init(struct perf_pmu *pmu);

const char *perf_mem_events__name(int i, const char *pmu_name);
-struct perf_mem_event *perf_mem_events__ptr(int i);
+struct perf_mem_event *perf_pmu__mem_events_ptr(struct perf_pmu *pmu, int i);
+struct perf_pmu *perf_mem_events_find_pmu(void);
bool is_mem_loads_aux_event(struct evsel *leader);

-void perf_mem_events__list(void);
+void perf_pmu__mem_events_list(struct perf_pmu *pmu);
int perf_mem_events__record_args(const char **rec_argv, int *argv_nr,
char **rec_tmp, int *tmp_nr);

--
2.35.1

2023-12-13 19:53:47

by Liang, Kan

[permalink] [raw]
Subject: [PATCH V3 3/7] perf mem: Clean up perf_mem_events__name()

From: Kan Liang <[email protected]>

Introduce a generic perf_mem_events__name(). Remove the ARCH-specific
one.

The mem_load events may have a different format. Add ldlat and aux_event
in the struct perf_mem_event to indicate the format and the extra aux
event.

Add perf_mem_events_intel_aux[] to support the extra mem_load_aux event.

Rename perf_mem_events__name to perf_pmu__mem_events_name.

Reviewed-by: Ian Rogers <[email protected]>
Tested-by: Ravi Bangoria <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
tools/perf/arch/arm64/util/mem-events.c | 26 ++-------
tools/perf/arch/powerpc/util/mem-events.c | 13 ++---
tools/perf/arch/powerpc/util/mem-events.h | 7 +++
tools/perf/arch/powerpc/util/pmu.c | 11 ++++
tools/perf/arch/x86/util/mem-events.c | 70 +++++------------------
tools/perf/arch/x86/util/mem-events.h | 1 +
tools/perf/arch/x86/util/pmu.c | 8 ++-
tools/perf/util/mem-events.c | 60 +++++++++++++------
tools/perf/util/mem-events.h | 3 +-
9 files changed, 93 insertions(+), 106 deletions(-)
create mode 100644 tools/perf/arch/powerpc/util/mem-events.h
create mode 100644 tools/perf/arch/powerpc/util/pmu.c

diff --git a/tools/perf/arch/arm64/util/mem-events.c b/tools/perf/arch/arm64/util/mem-events.c
index d3e69a520c2b..96460c46640a 100644
--- a/tools/perf/arch/arm64/util/mem-events.c
+++ b/tools/perf/arch/arm64/util/mem-events.c
@@ -3,28 +3,10 @@
#include "util/mem-events.h"
#include "mem-events.h"

-#define E(t, n, s) { .tag = t, .name = n, .sysfs_name = s }
+#define E(t, n, s, l, a) { .tag = t, .name = n, .sysfs_name = s, .ldlat = l, .aux_event = a }

struct perf_mem_event perf_mem_events_arm[PERF_MEM_EVENTS__MAX] = {
- E("spe-load", "arm_spe_0/ts_enable=1,pa_enable=1,load_filter=1,store_filter=0,min_latency=%u/", "arm_spe_0"),
- E("spe-store", "arm_spe_0/ts_enable=1,pa_enable=1,load_filter=0,store_filter=1/", "arm_spe_0"),
- E("spe-ldst", "arm_spe_0/ts_enable=1,pa_enable=1,load_filter=1,store_filter=1,min_latency=%u/", "arm_spe_0"),
+ E("spe-load", "%s/ts_enable=1,pa_enable=1,load_filter=1,store_filter=0,min_latency=%u/", "arm_spe_0", true, 0),
+ E("spe-store", "%s/ts_enable=1,pa_enable=1,load_filter=0,store_filter=1/", "arm_spe_0", false, 0),
+ E("spe-ldst", "%s/ts_enable=1,pa_enable=1,load_filter=1,store_filter=1,min_latency=%u/", "arm_spe_0", true, 0),
};
-
-static char mem_ev_name[100];
-
-const char *perf_mem_events__name(int i, const char *pmu_name __maybe_unused)
-{
- struct perf_mem_event *e = &perf_mem_events_arm[i];
-
- if (i >= PERF_MEM_EVENTS__MAX)
- return NULL;
-
- if (i == PERF_MEM_EVENTS__LOAD || i == PERF_MEM_EVENTS__LOAD_STORE)
- scnprintf(mem_ev_name, sizeof(mem_ev_name),
- e->name, perf_mem_events__loads_ldlat);
- else /* PERF_MEM_EVENTS__STORE */
- scnprintf(mem_ev_name, sizeof(mem_ev_name), e->name);
-
- return mem_ev_name;
-}
diff --git a/tools/perf/arch/powerpc/util/mem-events.c b/tools/perf/arch/powerpc/util/mem-events.c
index 78b986e5268d..b7883e38950f 100644
--- a/tools/perf/arch/powerpc/util/mem-events.c
+++ b/tools/perf/arch/powerpc/util/mem-events.c
@@ -2,11 +2,10 @@
#include "map_symbol.h"
#include "mem-events.h"

-/* PowerPC does not support 'ldlat' parameter. */
-const char *perf_mem_events__name(int i, const char *pmu_name __maybe_unused)
-{
- if (i == PERF_MEM_EVENTS__LOAD)
- return "cpu/mem-loads/";
+#define E(t, n, s, l, a) { .tag = t, .name = n, .sysfs_name = s, .ldlat = l, .aux_event = a }

- return "cpu/mem-stores/";
-}
+struct perf_mem_event perf_mem_events_power[PERF_MEM_EVENTS__MAX] = {
+ E("ldlat-loads", "%s/mem-loads/", "cpu/events/mem-loads", false, 0),
+ E("ldlat-stores", "%s/mem-stores/", "cpu/events/mem-stores", false, 0),
+ E(NULL, NULL, NULL, false, 0),
+};
diff --git a/tools/perf/arch/powerpc/util/mem-events.h b/tools/perf/arch/powerpc/util/mem-events.h
new file mode 100644
index 000000000000..6acc3d1b6873
--- /dev/null
+++ b/tools/perf/arch/powerpc/util/mem-events.h
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _POWER_MEM_EVENTS_H
+#define _POWER_MEM_EVENTS_H
+
+extern struct perf_mem_event perf_mem_events_power[PERF_MEM_EVENTS__MAX];
+
+#endif /* _POWER_MEM_EVENTS_H */
diff --git a/tools/perf/arch/powerpc/util/pmu.c b/tools/perf/arch/powerpc/util/pmu.c
new file mode 100644
index 000000000000..168173f88ddb
--- /dev/null
+++ b/tools/perf/arch/powerpc/util/pmu.c
@@ -0,0 +1,11 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <string.h>
+
+#include "../../../util/pmu.h"
+
+void perf_pmu__arch_init(struct perf_pmu *pmu)
+{
+ if (pmu->is_core)
+ pmu->mem_events = perf_mem_events_power;
+}
diff --git a/tools/perf/arch/x86/util/mem-events.c b/tools/perf/arch/x86/util/mem-events.c
index 5fb41d50118d..f0e66a0151a0 100644
--- a/tools/perf/arch/x86/util/mem-events.c
+++ b/tools/perf/arch/x86/util/mem-events.c
@@ -7,25 +7,26 @@
#include "linux/string.h"
#include "env.h"

-static char mem_loads_name[100];
-static bool mem_loads_name__init;
-static char mem_stores_name[100];
-
#define MEM_LOADS_AUX 0x8203
-#define MEM_LOADS_AUX_NAME "{%s/mem-loads-aux/,%s/mem-loads,ldlat=%u/}:P"

-#define E(t, n, s) { .tag = t, .name = n, .sysfs_name = s }
+#define E(t, n, s, l, a) { .tag = t, .name = n, .sysfs_name = s, .ldlat = l, .aux_event = a }

struct perf_mem_event perf_mem_events_intel[PERF_MEM_EVENTS__MAX] = {
- E("ldlat-loads", "%s/mem-loads,ldlat=%u/P", "%s/events/mem-loads"),
- E("ldlat-stores", "%s/mem-stores/P", "%s/events/mem-stores"),
- E(NULL, NULL, NULL),
+ E("ldlat-loads", "%s/mem-loads,ldlat=%u/P", "%s/events/mem-loads", true, 0),
+ E("ldlat-stores", "%s/mem-stores/P", "%s/events/mem-stores", false, 0),
+ E(NULL, NULL, NULL, false, 0),
+};
+
+struct perf_mem_event perf_mem_events_intel_aux[PERF_MEM_EVENTS__MAX] = {
+ E("ldlat-loads", "{%s/mem-loads-aux/,%s/mem-loads,ldlat=%u/}:P", "%s/events/mem-loads", true, MEM_LOADS_AUX),
+ E("ldlat-stores", "%s/mem-stores/P", "%s/events/mem-stores", false, 0),
+ E(NULL, NULL, NULL, false, 0),
};

struct perf_mem_event perf_mem_events_amd[PERF_MEM_EVENTS__MAX] = {
- E(NULL, NULL, NULL),
- E(NULL, NULL, NULL),
- E("mem-ldst", "ibs_op//", "ibs_op"),
+ E(NULL, NULL, NULL, false, 0),
+ E(NULL, NULL, NULL, false, 0),
+ E("mem-ldst", "%s//", "ibs_op", false, 0),
};

bool is_mem_loads_aux_event(struct evsel *leader)
@@ -40,48 +41,3 @@ bool is_mem_loads_aux_event(struct evsel *leader)

return leader->core.attr.config == MEM_LOADS_AUX;
}
-
-const char *perf_mem_events__name(int i, const char *pmu_name)
-{
- struct perf_mem_event *e;
-
- if (x86__is_amd_cpu())
- e = &perf_mem_events_amd[i];
- else
- e = &perf_mem_events_intel[i];
-
- if (!e)
- return NULL;
-
- if (i == PERF_MEM_EVENTS__LOAD) {
- if (mem_loads_name__init && !pmu_name)
- return mem_loads_name;
-
- if (!pmu_name) {
- mem_loads_name__init = true;
- pmu_name = "cpu";
- }
-
- if (perf_pmus__have_event(pmu_name, "mem-loads-aux")) {
- scnprintf(mem_loads_name, sizeof(mem_loads_name),
- MEM_LOADS_AUX_NAME, pmu_name, pmu_name,
- perf_mem_events__loads_ldlat);
- } else {
- scnprintf(mem_loads_name, sizeof(mem_loads_name),
- e->name, pmu_name,
- perf_mem_events__loads_ldlat);
- }
- return mem_loads_name;
- }
-
- if (i == PERF_MEM_EVENTS__STORE) {
- if (!pmu_name)
- pmu_name = "cpu";
-
- scnprintf(mem_stores_name, sizeof(mem_stores_name),
- e->name, pmu_name);
- return mem_stores_name;
- }
-
- return e->name;
-}
diff --git a/tools/perf/arch/x86/util/mem-events.h b/tools/perf/arch/x86/util/mem-events.h
index 3959e427f482..f55c8d3b7d59 100644
--- a/tools/perf/arch/x86/util/mem-events.h
+++ b/tools/perf/arch/x86/util/mem-events.h
@@ -3,6 +3,7 @@
#define _X86_MEM_EVENTS_H

extern struct perf_mem_event perf_mem_events_intel[PERF_MEM_EVENTS__MAX];
+extern struct perf_mem_event perf_mem_events_intel_aux[PERF_MEM_EVENTS__MAX];

extern struct perf_mem_event perf_mem_events_amd[PERF_MEM_EVENTS__MAX];

diff --git a/tools/perf/arch/x86/util/pmu.c b/tools/perf/arch/x86/util/pmu.c
index cd22e80e5657..0f49ff13cfe2 100644
--- a/tools/perf/arch/x86/util/pmu.c
+++ b/tools/perf/arch/x86/util/pmu.c
@@ -35,8 +35,12 @@ void perf_pmu__arch_init(struct perf_pmu *pmu __maybe_unused)
if (x86__is_amd_cpu()) {
if (!strcmp(pmu->name, "ibs_op"))
pmu->mem_events = perf_mem_events_amd;
- } else if (pmu->is_core)
- pmu->mem_events = perf_mem_events_intel;
+ } else if (pmu->is_core) {
+ if (perf_pmu__have_event(pmu, "mem-loads-aux"))
+ pmu->mem_events = perf_mem_events_intel_aux;
+ else
+ pmu->mem_events = perf_mem_events_intel;
+ }
}

int perf_pmus__num_mem_pmus(void)
diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index 27a33dc44964..51e53e33df03 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -17,17 +17,17 @@

unsigned int perf_mem_events__loads_ldlat = 30;

-#define E(t, n, s) { .tag = t, .name = n, .sysfs_name = s }
+#define E(t, n, s, l, a) { .tag = t, .name = n, .sysfs_name = s, .ldlat = l, .aux_event = a }

struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
- E("ldlat-loads", "cpu/mem-loads,ldlat=%u/P", "cpu/events/mem-loads"),
- E("ldlat-stores", "cpu/mem-stores/P", "cpu/events/mem-stores"),
- E(NULL, NULL, NULL),
+ E("ldlat-loads", "%s/mem-loads,ldlat=%u/P", "cpu/events/mem-loads", true, 0),
+ E("ldlat-stores", "%s/mem-stores/P", "cpu/events/mem-stores", false, 0),
+ E(NULL, NULL, NULL, false, 0),
};
#undef E

static char mem_loads_name[100];
-static bool mem_loads_name__init;
+static char mem_stores_name[100];

struct perf_mem_event *perf_pmu__mem_events_ptr(struct perf_pmu *pmu, int i)
{
@@ -62,23 +62,49 @@ struct perf_pmu *perf_mem_events_find_pmu(void)
return perf_pmus__scan_mem(NULL);
}

-const char * __weak perf_mem_events__name(int i, const char *pmu_name __maybe_unused)
+static const char *perf_pmu__mem_events_name(int i, struct perf_pmu *pmu)
{
- struct perf_mem_event *e = &perf_mem_events[i];
+ struct perf_mem_event *e;

+ if (i >= PERF_MEM_EVENTS__MAX || !pmu)
+ return NULL;
+
+ e = &pmu->mem_events[i];
if (!e)
return NULL;

- if (i == PERF_MEM_EVENTS__LOAD) {
- if (!mem_loads_name__init) {
- mem_loads_name__init = true;
- scnprintf(mem_loads_name, sizeof(mem_loads_name),
- e->name, perf_mem_events__loads_ldlat);
+ if (i == PERF_MEM_EVENTS__LOAD || i == PERF_MEM_EVENTS__LOAD_STORE) {
+ if (e->ldlat) {
+ if (!e->aux_event) {
+ /* ARM and Most of Intel */
+ scnprintf(mem_loads_name, sizeof(mem_loads_name),
+ e->name, pmu->name,
+ perf_mem_events__loads_ldlat);
+ } else {
+ /* Intel with mem-loads-aux event */
+ scnprintf(mem_loads_name, sizeof(mem_loads_name),
+ e->name, pmu->name, pmu->name,
+ perf_mem_events__loads_ldlat);
+ }
+ } else {
+ if (!e->aux_event) {
+ /* AMD and POWER */
+ scnprintf(mem_loads_name, sizeof(mem_loads_name),
+ e->name, pmu->name);
+ } else
+ return NULL;
}
+
return mem_loads_name;
}

- return e->name;
+ if (i == PERF_MEM_EVENTS__STORE) {
+ scnprintf(mem_stores_name, sizeof(mem_stores_name),
+ e->name, pmu->name);
+ return mem_stores_name;
+ }
+
+ return NULL;
}

__weak bool is_mem_loads_aux_event(struct evsel *leader __maybe_unused)
@@ -175,7 +201,7 @@ void perf_pmu__mem_events_list(struct perf_pmu *pmu)
e->tag ? 13 : 0,
e->tag ? : "",
e->tag && verbose > 0 ? 25 : 0,
- e->tag && verbose > 0 ? perf_mem_events__name(j, NULL) : "",
+ e->tag && verbose > 0 ? perf_pmu__mem_events_name(j, pmu) : "",
e->supported ? ": available\n" : "");
}
}
@@ -198,15 +224,15 @@ int perf_mem_events__record_args(const char **rec_argv, int *argv_nr,

if (!e->supported) {
pr_err("failed: event '%s' not supported\n",
- perf_mem_events__name(j, pmu->name));
+ perf_pmu__mem_events_name(j, pmu));
return -1;
}

if (perf_pmus__num_mem_pmus() == 1) {
rec_argv[i++] = "-e";
- rec_argv[i++] = perf_mem_events__name(j, NULL);
+ rec_argv[i++] = perf_pmu__mem_events_name(j, pmu);
} else {
- const char *s = perf_mem_events__name(j, pmu->name);
+ const char *s = perf_pmu__mem_events_name(j, pmu);

if (!perf_mem_event__supported(mnt, pmu, e))
continue;
diff --git a/tools/perf/util/mem-events.h b/tools/perf/util/mem-events.h
index 0ad301a2e424..79d342768d12 100644
--- a/tools/perf/util/mem-events.h
+++ b/tools/perf/util/mem-events.h
@@ -14,6 +14,8 @@
struct perf_mem_event {
bool record;
bool supported;
+ bool ldlat;
+ u32 aux_event;
const char *tag;
const char *name;
const char *sysfs_name;
@@ -39,7 +41,6 @@ extern struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX];
int perf_pmu__mem_events_parse(struct perf_pmu *pmu, const char *str);
int perf_pmu__mem_events_init(struct perf_pmu *pmu);

-const char *perf_mem_events__name(int i, const char *pmu_name);
struct perf_mem_event *perf_pmu__mem_events_ptr(struct perf_pmu *pmu, int i);
struct perf_pmu *perf_mem_events_find_pmu(void);
bool is_mem_loads_aux_event(struct evsel *leader);
--
2.35.1

2023-12-13 19:53:52

by Liang, Kan

[permalink] [raw]
Subject: [PATCH V3 4/7] perf mem: Clean up perf_mem_event__supported()

From: Kan Liang <[email protected]>

For some ARCHs, e.g., ARM and AMD, to get the availability of the
mem-events, perf checks the existence of a specific PMU. For the other
ARCHs, e.g., Intel and Power, perf has to check the existence of some
specific events.

The current perf only iterates the mem-events-supported PMUs. It's not
required to check the existence of a specific PMU anymore.

Rename sysfs_name to event_name, which stores the specific mem-events.
Perf only needs to check those events for the availability of the
mem-events.

Rename perf_mem_event__supported to perf_pmu__mem_events_supported.

Reviewed-by: Ian Rogers <[email protected]>
Tested-by: Ravi Bangoria <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
tools/perf/arch/arm64/util/mem-events.c | 8 ++++----
tools/perf/arch/powerpc/util/mem-events.c | 8 ++++----
tools/perf/arch/x86/util/mem-events.c | 20 ++++++++++----------
tools/perf/util/mem-events.c | 22 ++++++++++++----------
tools/perf/util/mem-events.h | 2 +-
5 files changed, 31 insertions(+), 29 deletions(-)

diff --git a/tools/perf/arch/arm64/util/mem-events.c b/tools/perf/arch/arm64/util/mem-events.c
index 96460c46640a..9f8da7937255 100644
--- a/tools/perf/arch/arm64/util/mem-events.c
+++ b/tools/perf/arch/arm64/util/mem-events.c
@@ -3,10 +3,10 @@
#include "util/mem-events.h"
#include "mem-events.h"

-#define E(t, n, s, l, a) { .tag = t, .name = n, .sysfs_name = s, .ldlat = l, .aux_event = a }
+#define E(t, n, s, l, a) { .tag = t, .name = n, .event_name = s, .ldlat = l, .aux_event = a }

struct perf_mem_event perf_mem_events_arm[PERF_MEM_EVENTS__MAX] = {
- E("spe-load", "%s/ts_enable=1,pa_enable=1,load_filter=1,store_filter=0,min_latency=%u/", "arm_spe_0", true, 0),
- E("spe-store", "%s/ts_enable=1,pa_enable=1,load_filter=0,store_filter=1/", "arm_spe_0", false, 0),
- E("spe-ldst", "%s/ts_enable=1,pa_enable=1,load_filter=1,store_filter=1,min_latency=%u/", "arm_spe_0", true, 0),
+ E("spe-load", "%s/ts_enable=1,pa_enable=1,load_filter=1,store_filter=0,min_latency=%u/", NULL, true, 0),
+ E("spe-store", "%s/ts_enable=1,pa_enable=1,load_filter=0,store_filter=1/", NULL, false, 0),
+ E("spe-ldst", "%s/ts_enable=1,pa_enable=1,load_filter=1,store_filter=1,min_latency=%u/", NULL, true, 0),
};
diff --git a/tools/perf/arch/powerpc/util/mem-events.c b/tools/perf/arch/powerpc/util/mem-events.c
index b7883e38950f..72a6ac2b52f5 100644
--- a/tools/perf/arch/powerpc/util/mem-events.c
+++ b/tools/perf/arch/powerpc/util/mem-events.c
@@ -2,10 +2,10 @@
#include "map_symbol.h"
#include "mem-events.h"

-#define E(t, n, s, l, a) { .tag = t, .name = n, .sysfs_name = s, .ldlat = l, .aux_event = a }
+#define E(t, n, s, l, a) { .tag = t, .name = n, .event_name = s, .ldlat = l, .aux_event = a }

struct perf_mem_event perf_mem_events_power[PERF_MEM_EVENTS__MAX] = {
- E("ldlat-loads", "%s/mem-loads/", "cpu/events/mem-loads", false, 0),
- E("ldlat-stores", "%s/mem-stores/", "cpu/events/mem-stores", false, 0),
- E(NULL, NULL, NULL, false, 0),
+ E("ldlat-loads", "%s/mem-loads/", "mem-loads", false, 0),
+ E("ldlat-stores", "%s/mem-stores/", "mem-stores", false, 0),
+ E(NULL, NULL, NULL, false, 0),
};
diff --git a/tools/perf/arch/x86/util/mem-events.c b/tools/perf/arch/x86/util/mem-events.c
index f0e66a0151a0..b776d849fc64 100644
--- a/tools/perf/arch/x86/util/mem-events.c
+++ b/tools/perf/arch/x86/util/mem-events.c
@@ -9,24 +9,24 @@

#define MEM_LOADS_AUX 0x8203

-#define E(t, n, s, l, a) { .tag = t, .name = n, .sysfs_name = s, .ldlat = l, .aux_event = a }
+#define E(t, n, s, l, a) { .tag = t, .name = n, .event_name = s, .ldlat = l, .aux_event = a }

struct perf_mem_event perf_mem_events_intel[PERF_MEM_EVENTS__MAX] = {
- E("ldlat-loads", "%s/mem-loads,ldlat=%u/P", "%s/events/mem-loads", true, 0),
- E("ldlat-stores", "%s/mem-stores/P", "%s/events/mem-stores", false, 0),
- E(NULL, NULL, NULL, false, 0),
+ E("ldlat-loads", "%s/mem-loads,ldlat=%u/P", "mem-loads", true, 0),
+ E("ldlat-stores", "%s/mem-stores/P", "mem-stores", false, 0),
+ E(NULL, NULL, NULL, false, 0),
};

struct perf_mem_event perf_mem_events_intel_aux[PERF_MEM_EVENTS__MAX] = {
- E("ldlat-loads", "{%s/mem-loads-aux/,%s/mem-loads,ldlat=%u/}:P", "%s/events/mem-loads", true, MEM_LOADS_AUX),
- E("ldlat-stores", "%s/mem-stores/P", "%s/events/mem-stores", false, 0),
- E(NULL, NULL, NULL, false, 0),
+ E("ldlat-loads", "{%s/mem-loads-aux/,%s/mem-loads,ldlat=%u/}:P", "mem-loads", true, MEM_LOADS_AUX),
+ E("ldlat-stores", "%s/mem-stores/P", "mem-stores", false, 0),
+ E(NULL, NULL, NULL, false, 0),
};

struct perf_mem_event perf_mem_events_amd[PERF_MEM_EVENTS__MAX] = {
- E(NULL, NULL, NULL, false, 0),
- E(NULL, NULL, NULL, false, 0),
- E("mem-ldst", "%s//", "ibs_op", false, 0),
+ E(NULL, NULL, NULL, false, 0),
+ E(NULL, NULL, NULL, false, 0),
+ E("mem-ldst", "%s//", NULL, false, 0),
};

bool is_mem_loads_aux_event(struct evsel *leader)
diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index 51e53e33df03..32890848bb3d 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -17,12 +17,12 @@

unsigned int perf_mem_events__loads_ldlat = 30;

-#define E(t, n, s, l, a) { .tag = t, .name = n, .sysfs_name = s, .ldlat = l, .aux_event = a }
+#define E(t, n, s, l, a) { .tag = t, .name = n, .event_name = s, .ldlat = l, .aux_event = a }

struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
- E("ldlat-loads", "%s/mem-loads,ldlat=%u/P", "cpu/events/mem-loads", true, 0),
- E("ldlat-stores", "%s/mem-stores/P", "cpu/events/mem-stores", false, 0),
- E(NULL, NULL, NULL, false, 0),
+ E("ldlat-loads", "%s/mem-loads,ldlat=%u/P", "mem-loads", true, 0),
+ E("ldlat-stores", "%s/mem-stores/P", "mem-stores", false, 0),
+ E(NULL, NULL, NULL, false, 0),
};
#undef E

@@ -151,15 +151,17 @@ int perf_pmu__mem_events_parse(struct perf_pmu *pmu, const char *str)
return -1;
}

-static bool perf_mem_event__supported(const char *mnt, struct perf_pmu *pmu,
+static bool perf_pmu__mem_events_supported(const char *mnt, struct perf_pmu *pmu,
struct perf_mem_event *e)
{
- char sysfs_name[100];
char path[PATH_MAX];
struct stat st;

- scnprintf(sysfs_name, sizeof(sysfs_name), e->sysfs_name, pmu->name);
- scnprintf(path, PATH_MAX, "%s/devices/%s", mnt, sysfs_name);
+ if (!e->event_name)
+ return true;
+
+ scnprintf(path, PATH_MAX, "%s/devices/%s/events/%s", mnt, pmu->name, e->event_name);
+
return !stat(path, &st);
}

@@ -182,7 +184,7 @@ int perf_pmu__mem_events_init(struct perf_pmu *pmu)
if (!e->tag)
continue;

- e->supported |= perf_mem_event__supported(mnt, pmu, e);
+ e->supported |= perf_pmu__mem_events_supported(mnt, pmu, e);
if (e->supported)
found = true;
}
@@ -234,7 +236,7 @@ int perf_mem_events__record_args(const char **rec_argv, int *argv_nr,
} else {
const char *s = perf_pmu__mem_events_name(j, pmu);

- if (!perf_mem_event__supported(mnt, pmu, e))
+ if (!perf_pmu__mem_events_supported(mnt, pmu, e))
continue;

rec_argv[i++] = "-e";
diff --git a/tools/perf/util/mem-events.h b/tools/perf/util/mem-events.h
index 79d342768d12..f817a507b106 100644
--- a/tools/perf/util/mem-events.h
+++ b/tools/perf/util/mem-events.h
@@ -18,7 +18,7 @@ struct perf_mem_event {
u32 aux_event;
const char *tag;
const char *name;
- const char *sysfs_name;
+ const char *event_name;
};

struct mem_info {
--
2.35.1

2023-12-13 19:54:07

by Liang, Kan

[permalink] [raw]
Subject: [PATCH V3 7/7] perf mem: Clean up perf_pmus__num_mem_pmus()

From: Kan Liang <[email protected]>

The number of mem PMUs can be calculated by searching the
perf_pmus__scan_mem().

Remove the ARCH specific perf_pmus__num_mem_pmus()

Signed-off-by: Kan Liang <[email protected]>
---
tools/perf/arch/x86/util/pmu.c | 10 ----------
tools/perf/builtin-c2c.c | 2 +-
tools/perf/builtin-mem.c | 2 +-
tools/perf/util/mem-events.c | 14 ++++++++++++++
tools/perf/util/mem-events.h | 1 +
tools/perf/util/pmus.c | 6 ------
tools/perf/util/pmus.h | 1 -
7 files changed, 17 insertions(+), 19 deletions(-)

diff --git a/tools/perf/arch/x86/util/pmu.c b/tools/perf/arch/x86/util/pmu.c
index 0f49ff13cfe2..c3d89d6ba1bf 100644
--- a/tools/perf/arch/x86/util/pmu.c
+++ b/tools/perf/arch/x86/util/pmu.c
@@ -42,13 +42,3 @@ void perf_pmu__arch_init(struct perf_pmu *pmu __maybe_unused)
pmu->mem_events = perf_mem_events_intel;
}
}
-
-int perf_pmus__num_mem_pmus(void)
-{
- /* AMD uses IBS OP pmu and not a core PMU for perf mem/c2c */
- if (x86__is_amd_cpu())
- return 1;
-
- /* Intel uses core pmus for perf mem/c2c */
- return perf_pmus__num_core_pmus();
-}
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 3bcb903b6b38..16b40f5d43db 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -3278,7 +3278,7 @@ static int perf_c2c__record(int argc, const char **argv)
PARSE_OPT_KEEP_UNKNOWN);

/* Max number of arguments multiplied by number of PMUs that can support them. */
- rec_argc = argc + 11 * perf_pmus__num_mem_pmus();
+ rec_argc = argc + 11 * (perf_pmu__mem_events_num_mem_pmus(pmu) + 1);

rec_argv = calloc(rec_argc + 1, sizeof(char *));
if (!rec_argv)
diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 1d92e309c97c..5b851e64e4a1 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -106,7 +106,7 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
PARSE_OPT_KEEP_UNKNOWN);

/* Max number of arguments multiplied by number of PMUs that can support them. */
- rec_argc = argc + 9 * perf_pmus__num_mem_pmus();
+ rec_argc = argc + 9 * (perf_pmu__mem_events_num_mem_pmus(pmu) + 1);

if (mem->cpu_list)
rec_argc += 2;
diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index a20611b4fb1b..637cbd4a7bfb 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -62,6 +62,20 @@ struct perf_pmu *perf_mem_events_find_pmu(void)
return perf_pmus__scan_mem(NULL);
}

+/**
+ * perf_pmu__mem_events_num_mem_pmus - Get the number of mem PMUs since the given pmu
+ * @pmu: Start pmu. If it's NULL, search the entire PMU list.
+ */
+int perf_pmu__mem_events_num_mem_pmus(struct perf_pmu *pmu)
+{
+ int num = 0;
+
+ while ((pmu = perf_pmus__scan_mem(pmu)) != NULL)
+ num++;
+
+ return num;
+}
+
static const char *perf_pmu__mem_events_name(int i, struct perf_pmu *pmu)
{
struct perf_mem_event *e;
diff --git a/tools/perf/util/mem-events.h b/tools/perf/util/mem-events.h
index c97cd3caa766..15d5f0320d27 100644
--- a/tools/perf/util/mem-events.h
+++ b/tools/perf/util/mem-events.h
@@ -43,6 +43,7 @@ int perf_pmu__mem_events_init(struct perf_pmu *pmu);

struct perf_mem_event *perf_pmu__mem_events_ptr(struct perf_pmu *pmu, int i);
struct perf_pmu *perf_mem_events_find_pmu(void);
+int perf_pmu__mem_events_num_mem_pmus(struct perf_pmu *pmu);
bool is_mem_loads_aux_event(struct evsel *leader);

void perf_pmu__mem_events_list(struct perf_pmu *pmu);
diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
index ce4931461741..16505071d362 100644
--- a/tools/perf/util/pmus.c
+++ b/tools/perf/util/pmus.c
@@ -345,12 +345,6 @@ const struct perf_pmu *perf_pmus__pmu_for_pmu_filter(const char *str)
return NULL;
}

-int __weak perf_pmus__num_mem_pmus(void)
-{
- /* All core PMUs are for mem events. */
- return perf_pmus__num_core_pmus();
-}
-
/** Struct for ordering events as output in perf list. */
struct sevent {
/** PMU for event. */
diff --git a/tools/perf/util/pmus.h b/tools/perf/util/pmus.h
index 4c67153ac257..94d2a08d894b 100644
--- a/tools/perf/util/pmus.h
+++ b/tools/perf/util/pmus.h
@@ -17,7 +17,6 @@ struct perf_pmu *perf_pmus__scan_core(struct perf_pmu *pmu);

const struct perf_pmu *perf_pmus__pmu_for_pmu_filter(const char *str);

-int perf_pmus__num_mem_pmus(void);
void perf_pmus__print_pmu_events(const struct print_callbacks *print_cb, void *print_state);
bool perf_pmus__have_event(const char *pname, const char *name);
int perf_pmus__num_core_pmus(void);
--
2.35.1

2023-12-13 19:54:12

by Liang, Kan

[permalink] [raw]
Subject: [PATCH V3 5/7] perf mem: Clean up is_mem_loads_aux_event()

From: Kan Liang <[email protected]>

The aux_event can be retrieved from the perf_pmu now. Implement a
generic support.

Reviewed-by: Ian Rogers <[email protected]>
Tested-by: Ravi Bangoria <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
tools/perf/arch/x86/util/mem-events.c | 23 ++++-------------------
tools/perf/util/mem-events.c | 14 ++++++++++++--
2 files changed, 16 insertions(+), 21 deletions(-)

diff --git a/tools/perf/arch/x86/util/mem-events.c b/tools/perf/arch/x86/util/mem-events.c
index b776d849fc64..62df03e91c7e 100644
--- a/tools/perf/arch/x86/util/mem-events.c
+++ b/tools/perf/arch/x86/util/mem-events.c
@@ -1,11 +1,9 @@
// SPDX-License-Identifier: GPL-2.0
-#include "util/pmu.h"
-#include "util/pmus.h"
-#include "util/env.h"
-#include "map_symbol.h"
-#include "mem-events.h"
#include "linux/string.h"
-#include "env.h"
+#include "util/map_symbol.h"
+#include "util/mem-events.h"
+#include "mem-events.h"
+

#define MEM_LOADS_AUX 0x8203

@@ -28,16 +26,3 @@ struct perf_mem_event perf_mem_events_amd[PERF_MEM_EVENTS__MAX] = {
E(NULL, NULL, NULL, false, 0),
E("mem-ldst", "%s//", NULL, false, 0),
};
-
-bool is_mem_loads_aux_event(struct evsel *leader)
-{
- struct perf_pmu *pmu = perf_pmus__find("cpu");
-
- if (!pmu)
- pmu = perf_pmus__find("cpu_core");
-
- if (pmu && !perf_pmu__have_event(pmu, "mem-loads-aux"))
- return false;
-
- return leader->core.attr.config == MEM_LOADS_AUX;
-}
diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index 32890848bb3d..7d7df3d0b2b9 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -107,9 +107,19 @@ static const char *perf_pmu__mem_events_name(int i, struct perf_pmu *pmu)
return NULL;
}

-__weak bool is_mem_loads_aux_event(struct evsel *leader __maybe_unused)
+bool is_mem_loads_aux_event(struct evsel *leader)
{
- return false;
+ struct perf_pmu *pmu = leader->pmu;
+ struct perf_mem_event *e;
+
+ if (!pmu || !pmu->mem_events)
+ return false;
+
+ e = &pmu->mem_events[PERF_MEM_EVENTS__LOAD];
+ if (!e->aux_event)
+ return false;
+
+ return leader->core.attr.config == e->aux_event;
}

int perf_pmu__mem_events_parse(struct perf_pmu *pmu, const char *str)
--
2.35.1

2023-12-13 19:54:35

by Liang, Kan

[permalink] [raw]
Subject: [PATCH V3 6/7] perf mem: Clean up perf_mem_events__record_args()

From: Kan Liang <[email protected]>

The current code iterates all memory PMUs. It doesn't matter if the
system has only one memory PMU or multiple PMUs. The check of
perf_pmus__num_mem_pmus() is not required anymore.

The rec_tmp is not used in c2c and mem. Removing them as well.

Suggested-by: Leo Yan <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
tools/perf/builtin-c2c.c | 15 ++-------------
tools/perf/builtin-mem.c | 18 ++----------------
tools/perf/util/mem-events.c | 34 ++++++++++++----------------------
tools/perf/util/mem-events.h | 3 +--
4 files changed, 17 insertions(+), 53 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 838481505e08..3bcb903b6b38 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -3245,9 +3245,8 @@ static const char * const *record_mem_usage = __usage_record;

static int perf_c2c__record(int argc, const char **argv)
{
- int rec_argc, i = 0, j, rec_tmp_nr = 0;
+ int rec_argc, i = 0, j;
const char **rec_argv;
- char **rec_tmp;
int ret;
bool all_user = false, all_kernel = false;
bool event_set = false;
@@ -3285,12 +3284,6 @@ static int perf_c2c__record(int argc, const char **argv)
if (!rec_argv)
return -1;

- rec_tmp = calloc(rec_argc + 1, sizeof(char *));
- if (!rec_tmp) {
- free(rec_argv);
- return -1;
- }
-
rec_argv[i++] = "record";

if (!event_set) {
@@ -3319,7 +3312,7 @@ static int perf_c2c__record(int argc, const char **argv)
rec_argv[i++] = "--phys-data";
rec_argv[i++] = "--sample-cpu";

- ret = perf_mem_events__record_args(rec_argv, &i, rec_tmp, &rec_tmp_nr);
+ ret = perf_mem_events__record_args(rec_argv, &i);
if (ret)
goto out;

@@ -3346,10 +3339,6 @@ static int perf_c2c__record(int argc, const char **argv)

ret = cmd_record(i, rec_argv);
out:
- for (i = 0; i < rec_tmp_nr; i++)
- free(rec_tmp[i]);
-
- free(rec_tmp);
free(rec_argv);
return ret;
}
diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index ef64bae77ca7..1d92e309c97c 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -72,10 +72,9 @@ static const char * const *record_mem_usage = __usage;

static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
{
- int rec_argc, i = 0, j, tmp_nr = 0;
+ int rec_argc, i = 0, j;
int start, end;
const char **rec_argv;
- char **rec_tmp;
int ret;
bool all_user = false, all_kernel = false;
struct perf_mem_event *e;
@@ -116,15 +115,6 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
if (!rec_argv)
return -1;

- /*
- * Save the allocated event name strings.
- */
- rec_tmp = calloc(rec_argc + 1, sizeof(char *));
- if (!rec_tmp) {
- free(rec_argv);
- return -1;
- }
-
rec_argv[i++] = "record";

e = perf_pmu__mem_events_ptr(pmu, PERF_MEM_EVENTS__LOAD_STORE);
@@ -163,7 +153,7 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
rec_argv[i++] = "--data-page-size";

start = i;
- ret = perf_mem_events__record_args(rec_argv, &i, rec_tmp, &tmp_nr);
+ ret = perf_mem_events__record_args(rec_argv, &i);
if (ret)
goto out;
end = i;
@@ -193,10 +183,6 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)

ret = cmd_record(i, rec_argv);
out:
- for (i = 0; i < tmp_nr; i++)
- free(rec_tmp[i]);
-
- free(rec_tmp);
free(rec_argv);
return ret;
}
diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index 7d7df3d0b2b9..a20611b4fb1b 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -218,14 +218,14 @@ void perf_pmu__mem_events_list(struct perf_pmu *pmu)
}
}

-int perf_mem_events__record_args(const char **rec_argv, int *argv_nr,
- char **rec_tmp, int *tmp_nr)
+int perf_mem_events__record_args(const char **rec_argv, int *argv_nr)
{
const char *mnt = sysfs__mount();
struct perf_pmu *pmu = NULL;
- int i = *argv_nr, k = 0;
struct perf_mem_event *e;
-
+ int i = *argv_nr;
+ const char *s;
+ char *copy;

while ((pmu = perf_pmus__scan_mem(pmu)) != NULL) {
for (int j = 0; j < PERF_MEM_EVENTS__MAX; j++) {
@@ -240,30 +240,20 @@ int perf_mem_events__record_args(const char **rec_argv, int *argv_nr,
return -1;
}

- if (perf_pmus__num_mem_pmus() == 1) {
- rec_argv[i++] = "-e";
- rec_argv[i++] = perf_pmu__mem_events_name(j, pmu);
- } else {
- const char *s = perf_pmu__mem_events_name(j, pmu);
-
- if (!perf_pmu__mem_events_supported(mnt, pmu, e))
- continue;
+ s = perf_pmu__mem_events_name(j, pmu);
+ if (!s || !perf_pmu__mem_events_supported(mnt, pmu, e))
+ continue;

- rec_argv[i++] = "-e";
- if (s) {
- char *copy = strdup(s);
- if (!copy)
- return -1;
+ copy = strdup(s);
+ if (!copy)
+ return -1;

- rec_argv[i++] = copy;
- rec_tmp[k++] = copy;
- }
- }
+ rec_argv[i++] = "-e";
+ rec_argv[i++] = copy;
}
}

*argv_nr = i;
- *tmp_nr = k;
return 0;
}

diff --git a/tools/perf/util/mem-events.h b/tools/perf/util/mem-events.h
index f817a507b106..c97cd3caa766 100644
--- a/tools/perf/util/mem-events.h
+++ b/tools/perf/util/mem-events.h
@@ -46,8 +46,7 @@ struct perf_pmu *perf_mem_events_find_pmu(void);
bool is_mem_loads_aux_event(struct evsel *leader);

void perf_pmu__mem_events_list(struct perf_pmu *pmu);
-int perf_mem_events__record_args(const char **rec_argv, int *argv_nr,
- char **rec_tmp, int *tmp_nr);
+int perf_mem_events__record_args(const char **rec_argv, int *argv_nr);

int perf_mem__tlb_scnprintf(char *out, size_t sz, struct mem_info *mem_info);
int perf_mem__lvl_scnprintf(char *out, size_t sz, struct mem_info *mem_info);
--
2.35.1

2023-12-16 03:30:49

by Leo Yan

[permalink] [raw]
Subject: Re: [PATCH V3 0/7] Clean up perf mem

On Wed, Dec 13, 2023 at 11:51:47AM -0800, [email protected] wrote:

[...]

> Introduce generic functions perf_mem_events__ptr(),
> perf_mem_events__name() ,and is_mem_loads_aux_event() to replace the
> ARCH specific ones.
> Simplify the perf_mem_event__supported().
>
> Only keeps the ARCH-specific perf_mem_events array in the corresponding
> mem-events.c for each ARCH.
>
> There is no functional change.
>
> The patch set touches almost all the ARCHs, Intel, AMD, ARM, Power and
> etc. But I can only test it on two Intel platforms.
> Please give it try, if you have machines with other ARCHs.

This patch series is fine for me:

Reviewed-by: Leo Yan <[email protected]>

I only compiled successfully it on my Arm64 machine but don't test
it due to I have no chance to access a machine with Arm SPE.

James, could you test it? Thanks a lot!

> Here are the test results:
> Intel hybrid machine:
>
> $perf mem record -e list
> ldlat-loads : available
> ldlat-stores : available
>
> $perf mem record -e ldlat-loads -v --ldlat 50
> calling: record -e cpu_atom/mem-loads,ldlat=50/P -e cpu_core/mem-loads,ldlat=50/P
>
> $perf mem record -v
> calling: record -e cpu_atom/mem-loads,ldlat=30/P -e cpu_atom/mem-stores/P -e cpu_core/mem-loads,ldlat=30/P -e cpu_core/mem-stores/P
>
> $perf mem record -t store -v
> calling: record -e cpu_atom/mem-stores/P -e cpu_core/mem-stores/P
>
>
> Intel SPR:
> $perf mem record -e list
> ldlat-loads : available
> ldlat-stores : available
>
> $perf mem record -e ldlat-loads -v --ldlat 50
> calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=50/}:P
>
> $perf mem record -v
> calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=30/}:P -e cpu/mem-stores/P
>
> $perf mem record -t store -v
> calling: record -e cpu/mem-stores/P

2023-12-19 08:50:16

by kajoljain

[permalink] [raw]
Subject: Re: [PATCH V3 1/7] perf mem: Add mem_events into the supported perf_pmu

Patch looks fine to me.

Reviewed-by: Kajol Jain<[email protected]>
Tested-by: Kajol Jain<[email protected]>

Thanks,
Kajol Jain

On 12/14/23 01:21, [email protected] wrote:
> From: Kan Liang <[email protected]>
>
> With the mem_events, perf doesn't need to read sysfs for each PMU to
> find the mem-events-supported PMU. The patch also makes it possible to
> clean up the related __weak functions later.
>
> The patch is only to add the mem_events into the perf_pmu for all ARCHs.
> It will be used in the later cleanup patches.
>
> Reviewed-by: Ian Rogers <[email protected]>
> Tested-by: Ravi Bangoria <[email protected]>
> Suggested-by: Leo Yan <[email protected]>
> Signed-off-by: Kan Liang <[email protected]>
> ---
> tools/perf/arch/arm/util/pmu.c | 3 +++
> tools/perf/arch/arm64/util/mem-events.c | 7 ++++---
> tools/perf/arch/arm64/util/mem-events.h | 7 +++++++
> tools/perf/arch/s390/util/pmu.c | 3 +++
> tools/perf/arch/x86/util/mem-events.c | 4 ++--
> tools/perf/arch/x86/util/mem-events.h | 9 +++++++++
> tools/perf/arch/x86/util/pmu.c | 7 +++++++
> tools/perf/util/mem-events.c | 2 +-
> tools/perf/util/mem-events.h | 1 +
> tools/perf/util/pmu.c | 4 +++-
> tools/perf/util/pmu.h | 7 +++++++
> 11 files changed, 47 insertions(+), 7 deletions(-)
> create mode 100644 tools/perf/arch/arm64/util/mem-events.h
> create mode 100644 tools/perf/arch/x86/util/mem-events.h
>
> diff --git a/tools/perf/arch/arm/util/pmu.c b/tools/perf/arch/arm/util/pmu.c
> index 7f3af3b97f3b..8b7cb68ba1a8 100644
> --- a/tools/perf/arch/arm/util/pmu.c
> +++ b/tools/perf/arch/arm/util/pmu.c
> @@ -13,6 +13,7 @@
> #include "hisi-ptt.h"
> #include "../../../util/pmu.h"
> #include "../../../util/cs-etm.h"
> +#include "../../arm64/util/mem-events.h"
>
> void perf_pmu__arch_init(struct perf_pmu *pmu __maybe_unused)
> {
> @@ -26,6 +27,8 @@ void perf_pmu__arch_init(struct perf_pmu *pmu __maybe_unused)
> pmu->selectable = true;
> pmu->is_uncore = false;
> pmu->perf_event_attr_init_default = arm_spe_pmu_default_config;
> + if (!strcmp(pmu->name, "arm_spe_0"))
> + pmu->mem_events = perf_mem_events_arm;
> } else if (strstarts(pmu->name, HISI_PTT_PMU_NAME)) {
> pmu->selectable = true;
> #endif
> diff --git a/tools/perf/arch/arm64/util/mem-events.c b/tools/perf/arch/arm64/util/mem-events.c
> index 3bcc5c7035c2..edf8207f7812 100644
> --- a/tools/perf/arch/arm64/util/mem-events.c
> +++ b/tools/perf/arch/arm64/util/mem-events.c
> @@ -1,10 +1,11 @@
> // SPDX-License-Identifier: GPL-2.0
> -#include "map_symbol.h"
> +#include "util/map_symbol.h"
> +#include "util/mem-events.h"
> #include "mem-events.h"
>
> #define E(t, n, s) { .tag = t, .name = n, .sysfs_name = s }
>
> -static struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
> +struct perf_mem_event perf_mem_events_arm[PERF_MEM_EVENTS__MAX] = {
> E("spe-load", "arm_spe_0/ts_enable=1,pa_enable=1,load_filter=1,store_filter=0,min_latency=%u/", "arm_spe_0"),
> E("spe-store", "arm_spe_0/ts_enable=1,pa_enable=1,load_filter=0,store_filter=1/", "arm_spe_0"),
> E("spe-ldst", "arm_spe_0/ts_enable=1,pa_enable=1,load_filter=1,store_filter=1,min_latency=%u/", "arm_spe_0"),
> @@ -17,7 +18,7 @@ struct perf_mem_event *perf_mem_events__ptr(int i)
> if (i >= PERF_MEM_EVENTS__MAX)
> return NULL;
>
> - return &perf_mem_events[i];
> + return &perf_mem_events_arm[i];
> }
>
> const char *perf_mem_events__name(int i, const char *pmu_name __maybe_unused)
> diff --git a/tools/perf/arch/arm64/util/mem-events.h b/tools/perf/arch/arm64/util/mem-events.h
> new file mode 100644
> index 000000000000..5fc50be4be38
> --- /dev/null
> +++ b/tools/perf/arch/arm64/util/mem-events.h
> @@ -0,0 +1,7 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ARM64_MEM_EVENTS_H
> +#define _ARM64_MEM_EVENTS_H
> +
> +extern struct perf_mem_event perf_mem_events_arm[PERF_MEM_EVENTS__MAX];
> +
> +#endif /* _ARM64_MEM_EVENTS_H */
> diff --git a/tools/perf/arch/s390/util/pmu.c b/tools/perf/arch/s390/util/pmu.c
> index 886c30e001fa..225d7dc2379c 100644
> --- a/tools/perf/arch/s390/util/pmu.c
> +++ b/tools/perf/arch/s390/util/pmu.c
> @@ -19,4 +19,7 @@ void perf_pmu__arch_init(struct perf_pmu *pmu)
> !strcmp(pmu->name, S390_PMUPAI_EXT) ||
> !strcmp(pmu->name, S390_PMUCPUM_CF))
> pmu->selectable = true;
> +
> + if (pmu->is_core)
> + pmu->mem_events = perf_mem_events;
> }
> diff --git a/tools/perf/arch/x86/util/mem-events.c b/tools/perf/arch/x86/util/mem-events.c
> index 191b372f9a2d..2b81d229982c 100644
> --- a/tools/perf/arch/x86/util/mem-events.c
> +++ b/tools/perf/arch/x86/util/mem-events.c
> @@ -16,13 +16,13 @@ static char mem_stores_name[100];
>
> #define E(t, n, s) { .tag = t, .name = n, .sysfs_name = s }
>
> -static struct perf_mem_event perf_mem_events_intel[PERF_MEM_EVENTS__MAX] = {
> +struct perf_mem_event perf_mem_events_intel[PERF_MEM_EVENTS__MAX] = {
> E("ldlat-loads", "%s/mem-loads,ldlat=%u/P", "%s/events/mem-loads"),
> E("ldlat-stores", "%s/mem-stores/P", "%s/events/mem-stores"),
> E(NULL, NULL, NULL),
> };
>
> -static struct perf_mem_event perf_mem_events_amd[PERF_MEM_EVENTS__MAX] = {
> +struct perf_mem_event perf_mem_events_amd[PERF_MEM_EVENTS__MAX] = {
> E(NULL, NULL, NULL),
> E(NULL, NULL, NULL),
> E("mem-ldst", "ibs_op//", "ibs_op"),
> diff --git a/tools/perf/arch/x86/util/mem-events.h b/tools/perf/arch/x86/util/mem-events.h
> new file mode 100644
> index 000000000000..3959e427f482
> --- /dev/null
> +++ b/tools/perf/arch/x86/util/mem-events.h
> @@ -0,0 +1,9 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _X86_MEM_EVENTS_H
> +#define _X86_MEM_EVENTS_H
> +
> +extern struct perf_mem_event perf_mem_events_intel[PERF_MEM_EVENTS__MAX];
> +
> +extern struct perf_mem_event perf_mem_events_amd[PERF_MEM_EVENTS__MAX];
> +
> +#endif /* _X86_MEM_EVENTS_H */
> diff --git a/tools/perf/arch/x86/util/pmu.c b/tools/perf/arch/x86/util/pmu.c
> index 469555ae9b3c..cd22e80e5657 100644
> --- a/tools/perf/arch/x86/util/pmu.c
> +++ b/tools/perf/arch/x86/util/pmu.c
> @@ -15,6 +15,7 @@
> #include "../../../util/pmu.h"
> #include "../../../util/fncache.h"
> #include "../../../util/pmus.h"
> +#include "mem-events.h"
> #include "env.h"
>
> void perf_pmu__arch_init(struct perf_pmu *pmu __maybe_unused)
> @@ -30,6 +31,12 @@ void perf_pmu__arch_init(struct perf_pmu *pmu __maybe_unused)
> pmu->selectable = true;
> }
> #endif
> +
> + if (x86__is_amd_cpu()) {
> + if (!strcmp(pmu->name, "ibs_op"))
> + pmu->mem_events = perf_mem_events_amd;
> + } else if (pmu->is_core)
> + pmu->mem_events = perf_mem_events_intel;
> }
>
> int perf_pmus__num_mem_pmus(void)
> diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
> index 3a2e3687878c..0a8f415f5efe 100644
> --- a/tools/perf/util/mem-events.c
> +++ b/tools/perf/util/mem-events.c
> @@ -19,7 +19,7 @@ unsigned int perf_mem_events__loads_ldlat = 30;
>
> #define E(t, n, s) { .tag = t, .name = n, .sysfs_name = s }
>
> -static struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
> +struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
> E("ldlat-loads", "cpu/mem-loads,ldlat=%u/P", "cpu/events/mem-loads"),
> E("ldlat-stores", "cpu/mem-stores/P", "cpu/events/mem-stores"),
> E(NULL, NULL, NULL),
> diff --git a/tools/perf/util/mem-events.h b/tools/perf/util/mem-events.h
> index b40ad6ea93fc..8c5694b2d0b0 100644
> --- a/tools/perf/util/mem-events.h
> +++ b/tools/perf/util/mem-events.h
> @@ -34,6 +34,7 @@ enum {
> };
>
> extern unsigned int perf_mem_events__loads_ldlat;
> +extern struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX];
>
> int perf_mem_events__parse(const char *str);
> int perf_mem_events__init(void);
> diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
> index 3c9609944a2f..3d4373b8ab63 100644
> --- a/tools/perf/util/pmu.c
> +++ b/tools/perf/util/pmu.c
> @@ -986,8 +986,10 @@ static int pmu_max_precise(int dirfd, struct perf_pmu *pmu)
> }
>
> void __weak
> -perf_pmu__arch_init(struct perf_pmu *pmu __maybe_unused)
> +perf_pmu__arch_init(struct perf_pmu *pmu)
> {
> + if (pmu->is_core)
> + pmu->mem_events = perf_mem_events;
> }
>
> struct perf_pmu *perf_pmu__lookup(struct list_head *pmus, int dirfd, const char *name)
> diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
> index 424c3fee0949..e35d985206db 100644
> --- a/tools/perf/util/pmu.h
> +++ b/tools/perf/util/pmu.h
> @@ -10,6 +10,8 @@
> #include <stdio.h>
> #include "parse-events.h"
> #include "pmu-events/pmu-events.h"
> +#include "map_symbol.h"
> +#include "mem-events.h"
>
> struct evsel_config_term;
> struct perf_cpu_map;
> @@ -162,6 +164,11 @@ struct perf_pmu {
> */
> bool exclude_guest;
> } missing_features;
> +
> + /**
> + * @mem_events: List of the supported mem events
> + */
> + struct perf_mem_event *mem_events;
> };
>
> /** @perf_pmu__fake: A special global PMU used for testing. */

2023-12-19 09:27:12

by kajoljain

[permalink] [raw]
Subject: Re: [PATCH V3 0/7] Clean up perf mem

Hi,
I was trying to test this patchset on powerpc.

After applying it on top of acme's perf-tools-next branch, I am getting
below error:

INSTALL libsubcmd_headers
INSTALL libperf_headers
INSTALL libsymbol_headers
INSTALL libapi_headers
INSTALL libbpf_headers
CC arch/powerpc/util/mem-events.o
In file included from arch/powerpc/util/mem-events.c:3:
arch/powerpc/util/mem-events.h:5:52: error: ‘PERF_MEM_EVENTS__MAX’
undeclared here (not in a function)
5 | extern struct perf_mem_event
perf_mem_events_power[PERF_MEM_EVENTS__MAX];
|
^~~~~~~~~~~~~~~~~~~~
make[6]: *** [/home/kajol/linux/tools/build/Makefile.build:105:
arch/powerpc/util/mem-events.o] Error 1
make[5]: *** [/home/kajol/linux/tools/build/Makefile.build:158: util]
Error 2
make[4]: *** [/home/kajol/linux/tools/build/Makefile.build:158: powerpc]
Error 2
make[3]: *** [/home/kajol/linux/tools/build/Makefile.build:158: arch]
Error 2
make[3]: *** Waiting for unfinished jobs....
make[2]: *** [Makefile.perf:693: perf-in.o] Error 2
make[1]: *** [Makefile.perf:251: sub-make] Error 2
make: *** [Makefile:70: all] Error 2

It seems some headerfiles are missing from arch/powerpc/util/mem-
events.c

Thanks,
Kajol Jain

On 12/14/23 01:21, [email protected] wrote:
> From: Kan Liang <[email protected]>
>
> Changes since V2:
> - Fix the Arm64 building error (Leo)
> - Add two new patches to clean up perf_mem_events__record_args()
> and perf_pmus__num_mem_pmus() (Leo)
>
> Changes since V1:
> - Fix strcmp of PMU name checking (Ravi)
> - Fix "/," typo (Ian)
> - Rename several functions with perf_pmu__mem_events prefix. (Ian)
> - Fold the header removal patch into the patch where the cleanups made.
> (Arnaldo)
> - Add reviewed-by and tested-by from Ian and Ravi
>
> As discussed in the below thread, the patch set is to clean up perf mem.
> https://lore.kernel.org/lkml/[email protected]/
>
> Introduce generic functions perf_mem_events__ptr(),
> perf_mem_events__name() ,and is_mem_loads_aux_event() to replace the
> ARCH specific ones.
> Simplify the perf_mem_event__supported().
>
> Only keeps the ARCH-specific perf_mem_events array in the corresponding
> mem-events.c for each ARCH.
>
> There is no functional change.
>
> The patch set touches almost all the ARCHs, Intel, AMD, ARM, Power and
> etc. But I can only test it on two Intel platforms.
> Please give it try, if you have machines with other ARCHs.
>
> Here are the test results:
> Intel hybrid machine:
>
> $perf mem record -e list
> ldlat-loads : available
> ldlat-stores : available
>
> $perf mem record -e ldlat-loads -v --ldlat 50
> calling: record -e cpu_atom/mem-loads,ldlat=50/P -e cpu_core/mem-loads,ldlat=50/P
>
> $perf mem record -v
> calling: record -e cpu_atom/mem-loads,ldlat=30/P -e cpu_atom/mem-stores/P -e cpu_core/mem-loads,ldlat=30/P -e cpu_core/mem-stores/P
>
> $perf mem record -t store -v
> calling: record -e cpu_atom/mem-stores/P -e cpu_core/mem-stores/P
>
>
> Intel SPR:
> $perf mem record -e list
> ldlat-loads : available
> ldlat-stores : available
>
> $perf mem record -e ldlat-loads -v --ldlat 50
> calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=50/}:P
>
> $perf mem record -v
> calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=30/}:P -e cpu/mem-stores/P
>
> $perf mem record -t store -v
> calling: record -e cpu/mem-stores/P
>
> Kan Liang (7):
> perf mem: Add mem_events into the supported perf_pmu
> perf mem: Clean up perf_mem_events__ptr()
> perf mem: Clean up perf_mem_events__name()
> perf mem: Clean up perf_mem_event__supported()
> perf mem: Clean up is_mem_loads_aux_event()
> perf mem: Clean up perf_mem_events__record_args()
> perf mem: Clean up perf_pmus__num_mem_pmus()
>
> tools/perf/arch/arm/util/pmu.c | 3 +
> tools/perf/arch/arm64/util/mem-events.c | 39 +---
> tools/perf/arch/arm64/util/mem-events.h | 7 +
> tools/perf/arch/powerpc/util/mem-events.c | 13 +-
> tools/perf/arch/powerpc/util/mem-events.h | 7 +
> tools/perf/arch/powerpc/util/pmu.c | 11 ++
> tools/perf/arch/s390/util/pmu.c | 3 +
> tools/perf/arch/x86/util/mem-events.c | 99 ++--------
> tools/perf/arch/x86/util/mem-events.h | 10 +
> tools/perf/arch/x86/util/pmu.c | 19 +-
> tools/perf/builtin-c2c.c | 45 ++---
> tools/perf/builtin-mem.c | 48 ++---
> tools/perf/util/mem-events.c | 217 +++++++++++++---------
> tools/perf/util/mem-events.h | 19 +-
> tools/perf/util/pmu.c | 4 +-
> tools/perf/util/pmu.h | 7 +
> tools/perf/util/pmus.c | 6 -
> tools/perf/util/pmus.h | 1 -
> 18 files changed, 278 insertions(+), 280 deletions(-)
> create mode 100644 tools/perf/arch/arm64/util/mem-events.h
> create mode 100644 tools/perf/arch/powerpc/util/mem-events.h
> create mode 100644 tools/perf/arch/powerpc/util/pmu.c
> create mode 100644 tools/perf/arch/x86/util/mem-events.h
>

2023-12-19 14:26:12

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH V3 0/7] Clean up perf mem



On 2023-12-19 4:26 a.m., kajoljain wrote:
> Hi,
> I was trying to test this patchset on powerpc.
>
> After applying it on top of acme's perf-tools-next branch, I am getting
> below error:
>
> INSTALL libsubcmd_headers
> INSTALL libperf_headers
> INSTALL libsymbol_headers
> INSTALL libapi_headers
> INSTALL libbpf_headers
> CC arch/powerpc/util/mem-events.o
> In file included from arch/powerpc/util/mem-events.c:3:
> arch/powerpc/util/mem-events.h:5:52: error: ‘PERF_MEM_EVENTS__MAX’
> undeclared here (not in a function)
> 5 | extern struct perf_mem_event
> perf_mem_events_power[PERF_MEM_EVENTS__MAX];
> |
> ^~~~~~~~~~~~~~~~~~~~
> make[6]: *** [/home/kajol/linux/tools/build/Makefile.build:105:
> arch/powerpc/util/mem-events.o] Error 1
> make[5]: *** [/home/kajol/linux/tools/build/Makefile.build:158: util]
> Error 2
> make[4]: *** [/home/kajol/linux/tools/build/Makefile.build:158: powerpc]
> Error 2
> make[3]: *** [/home/kajol/linux/tools/build/Makefile.build:158: arch]
> Error 2
> make[3]: *** Waiting for unfinished jobs....
> make[2]: *** [Makefile.perf:693: perf-in.o] Error 2
> make[1]: *** [Makefile.perf:251: sub-make] Error 2
> make: *** [Makefile:70: all] Error 2
>
> It seems some headerfiles are missing from arch/powerpc/util/mem-
> events.c
>

Leo updated the headerfiles for ARM. https://termbin.com/0dkn

I guess powerpc has to do the same thing. Could you please try the below
patch?

diff --git a/tools/perf/arch/powerpc/util/mem-events.c
b/tools/perf/arch/powerpc/util/mem-events.c
index 72a6ac2b52f5..765d4a054b0a 100644
--- a/tools/perf/arch/powerpc/util/mem-events.c
+++ b/tools/perf/arch/powerpc/util/mem-events.c
@@ -1,5 +1,6 @@
// SPDX-License-Identifier: GPL-2.0
-#include "map_symbol.h"
+#include "util/map_symbol.h"
+#include "util/mem-events.h"
#include "mem-events.h"

#define E(t, n, s, l, a) { .tag = t, .name = n, .event_name = s, .ldlat
= l, .aux_event = a }

Thanks,
Kan

> Thanks,
> Kajol Jain
>
> On 12/14/23 01:21, [email protected] wrote:
>> From: Kan Liang <[email protected]>
>>
>> Changes since V2:
>> - Fix the Arm64 building error (Leo)
>> - Add two new patches to clean up perf_mem_events__record_args()
>> and perf_pmus__num_mem_pmus() (Leo)
>>
>> Changes since V1:
>> - Fix strcmp of PMU name checking (Ravi)
>> - Fix "/," typo (Ian)
>> - Rename several functions with perf_pmu__mem_events prefix. (Ian)
>> - Fold the header removal patch into the patch where the cleanups made.
>> (Arnaldo)
>> - Add reviewed-by and tested-by from Ian and Ravi
>>
>> As discussed in the below thread, the patch set is to clean up perf mem.
>> https://lore.kernel.org/lkml/[email protected]/
>>
>> Introduce generic functions perf_mem_events__ptr(),
>> perf_mem_events__name() ,and is_mem_loads_aux_event() to replace the
>> ARCH specific ones.
>> Simplify the perf_mem_event__supported().
>>
>> Only keeps the ARCH-specific perf_mem_events array in the corresponding
>> mem-events.c for each ARCH.
>>
>> There is no functional change.
>>
>> The patch set touches almost all the ARCHs, Intel, AMD, ARM, Power and
>> etc. But I can only test it on two Intel platforms.
>> Please give it try, if you have machines with other ARCHs.
>>
>> Here are the test results:
>> Intel hybrid machine:
>>
>> $perf mem record -e list
>> ldlat-loads : available
>> ldlat-stores : available
>>
>> $perf mem record -e ldlat-loads -v --ldlat 50
>> calling: record -e cpu_atom/mem-loads,ldlat=50/P -e cpu_core/mem-loads,ldlat=50/P
>>
>> $perf mem record -v
>> calling: record -e cpu_atom/mem-loads,ldlat=30/P -e cpu_atom/mem-stores/P -e cpu_core/mem-loads,ldlat=30/P -e cpu_core/mem-stores/P
>>
>> $perf mem record -t store -v
>> calling: record -e cpu_atom/mem-stores/P -e cpu_core/mem-stores/P
>>
>>
>> Intel SPR:
>> $perf mem record -e list
>> ldlat-loads : available
>> ldlat-stores : available
>>
>> $perf mem record -e ldlat-loads -v --ldlat 50
>> calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=50/}:P
>>
>> $perf mem record -v
>> calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=30/}:P -e cpu/mem-stores/P
>>
>> $perf mem record -t store -v
>> calling: record -e cpu/mem-stores/P
>>
>> Kan Liang (7):
>> perf mem: Add mem_events into the supported perf_pmu
>> perf mem: Clean up perf_mem_events__ptr()
>> perf mem: Clean up perf_mem_events__name()
>> perf mem: Clean up perf_mem_event__supported()
>> perf mem: Clean up is_mem_loads_aux_event()
>> perf mem: Clean up perf_mem_events__record_args()
>> perf mem: Clean up perf_pmus__num_mem_pmus()
>>
>> tools/perf/arch/arm/util/pmu.c | 3 +
>> tools/perf/arch/arm64/util/mem-events.c | 39 +---
>> tools/perf/arch/arm64/util/mem-events.h | 7 +
>> tools/perf/arch/powerpc/util/mem-events.c | 13 +-
>> tools/perf/arch/powerpc/util/mem-events.h | 7 +
>> tools/perf/arch/powerpc/util/pmu.c | 11 ++
>> tools/perf/arch/s390/util/pmu.c | 3 +
>> tools/perf/arch/x86/util/mem-events.c | 99 ++--------
>> tools/perf/arch/x86/util/mem-events.h | 10 +
>> tools/perf/arch/x86/util/pmu.c | 19 +-
>> tools/perf/builtin-c2c.c | 45 ++---
>> tools/perf/builtin-mem.c | 48 ++---
>> tools/perf/util/mem-events.c | 217 +++++++++++++---------
>> tools/perf/util/mem-events.h | 19 +-
>> tools/perf/util/pmu.c | 4 +-
>> tools/perf/util/pmu.h | 7 +
>> tools/perf/util/pmus.c | 6 -
>> tools/perf/util/pmus.h | 1 -
>> 18 files changed, 278 insertions(+), 280 deletions(-)
>> create mode 100644 tools/perf/arch/arm64/util/mem-events.h
>> create mode 100644 tools/perf/arch/powerpc/util/mem-events.h
>> create mode 100644 tools/perf/arch/powerpc/util/pmu.c
>> create mode 100644 tools/perf/arch/x86/util/mem-events.h
>>
>

2024-01-02 20:08:41

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH V3 0/7] Clean up perf mem

Hi Kajol Jain

On 2023-12-19 9:15 a.m., Liang, Kan wrote:
>
>
> On 2023-12-19 4:26 a.m., kajoljain wrote:
>> Hi,
>> I was trying to test this patchset on powerpc.
>>
>> After applying it on top of acme's perf-tools-next branch, I am getting
>> below error:
>>
>> INSTALL libsubcmd_headers
>> INSTALL libperf_headers
>> INSTALL libsymbol_headers
>> INSTALL libapi_headers
>> INSTALL libbpf_headers
>> CC arch/powerpc/util/mem-events.o
>> In file included from arch/powerpc/util/mem-events.c:3:
>> arch/powerpc/util/mem-events.h:5:52: error: ‘PERF_MEM_EVENTS__MAX’
>> undeclared here (not in a function)
>> 5 | extern struct perf_mem_event
>> perf_mem_events_power[PERF_MEM_EVENTS__MAX];
>> |
>> ^~~~~~~~~~~~~~~~~~~~
>> make[6]: *** [/home/kajol/linux/tools/build/Makefile.build:105:
>> arch/powerpc/util/mem-events.o] Error 1
>> make[5]: *** [/home/kajol/linux/tools/build/Makefile.build:158: util]
>> Error 2
>> make[4]: *** [/home/kajol/linux/tools/build/Makefile.build:158: powerpc]
>> Error 2
>> make[3]: *** [/home/kajol/linux/tools/build/Makefile.build:158: arch]
>> Error 2
>> make[3]: *** Waiting for unfinished jobs....
>> make[2]: *** [Makefile.perf:693: perf-in.o] Error 2
>> make[1]: *** [Makefile.perf:251: sub-make] Error 2
>> make: *** [Makefile:70: all] Error 2
>>
>> It seems some headerfiles are missing from arch/powerpc/util/mem-
>> events.c
>>
>
> Leo updated the headerfiles for ARM. https://termbin.com/0dkn
>
> I guess powerpc has to do the same thing. Could you please try the below
> patch?


Does the patch work on powerpc?


Thanks,
Kan
>
> diff --git a/tools/perf/arch/powerpc/util/mem-events.c
> b/tools/perf/arch/powerpc/util/mem-events.c
> index 72a6ac2b52f5..765d4a054b0a 100644
> --- a/tools/perf/arch/powerpc/util/mem-events.c
> +++ b/tools/perf/arch/powerpc/util/mem-events.c
> @@ -1,5 +1,6 @@
> // SPDX-License-Identifier: GPL-2.0
> -#include "map_symbol.h"
> +#include "util/map_symbol.h"
> +#include "util/mem-events.h"
> #include "mem-events.h"
>
> #define E(t, n, s, l, a) { .tag = t, .name = n, .event_name = s, .ldlat
> = l, .aux_event = a }
>
> Thanks,
> Kan
>
>> Thanks,
>> Kajol Jain
>>
>> On 12/14/23 01:21, [email protected] wrote:
>>> From: Kan Liang <[email protected]>
>>>
>>> Changes since V2:
>>> - Fix the Arm64 building error (Leo)
>>> - Add two new patches to clean up perf_mem_events__record_args()
>>> and perf_pmus__num_mem_pmus() (Leo)
>>>
>>> Changes since V1:
>>> - Fix strcmp of PMU name checking (Ravi)
>>> - Fix "/," typo (Ian)
>>> - Rename several functions with perf_pmu__mem_events prefix. (Ian)
>>> - Fold the header removal patch into the patch where the cleanups made.
>>> (Arnaldo)
>>> - Add reviewed-by and tested-by from Ian and Ravi
>>>
>>> As discussed in the below thread, the patch set is to clean up perf mem.
>>> https://lore.kernel.org/lkml/[email protected]/
>>>
>>> Introduce generic functions perf_mem_events__ptr(),
>>> perf_mem_events__name() ,and is_mem_loads_aux_event() to replace the
>>> ARCH specific ones.
>>> Simplify the perf_mem_event__supported().
>>>
>>> Only keeps the ARCH-specific perf_mem_events array in the corresponding
>>> mem-events.c for each ARCH.
>>>
>>> There is no functional change.
>>>
>>> The patch set touches almost all the ARCHs, Intel, AMD, ARM, Power and
>>> etc. But I can only test it on two Intel platforms.
>>> Please give it try, if you have machines with other ARCHs.
>>>
>>> Here are the test results:
>>> Intel hybrid machine:
>>>
>>> $perf mem record -e list
>>> ldlat-loads : available
>>> ldlat-stores : available
>>>
>>> $perf mem record -e ldlat-loads -v --ldlat 50
>>> calling: record -e cpu_atom/mem-loads,ldlat=50/P -e cpu_core/mem-loads,ldlat=50/P
>>>
>>> $perf mem record -v
>>> calling: record -e cpu_atom/mem-loads,ldlat=30/P -e cpu_atom/mem-stores/P -e cpu_core/mem-loads,ldlat=30/P -e cpu_core/mem-stores/P
>>>
>>> $perf mem record -t store -v
>>> calling: record -e cpu_atom/mem-stores/P -e cpu_core/mem-stores/P
>>>
>>>
>>> Intel SPR:
>>> $perf mem record -e list
>>> ldlat-loads : available
>>> ldlat-stores : available
>>>
>>> $perf mem record -e ldlat-loads -v --ldlat 50
>>> calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=50/}:P
>>>
>>> $perf mem record -v
>>> calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=30/}:P -e cpu/mem-stores/P
>>>
>>> $perf mem record -t store -v
>>> calling: record -e cpu/mem-stores/P
>>>
>>> Kan Liang (7):
>>> perf mem: Add mem_events into the supported perf_pmu
>>> perf mem: Clean up perf_mem_events__ptr()
>>> perf mem: Clean up perf_mem_events__name()
>>> perf mem: Clean up perf_mem_event__supported()
>>> perf mem: Clean up is_mem_loads_aux_event()
>>> perf mem: Clean up perf_mem_events__record_args()
>>> perf mem: Clean up perf_pmus__num_mem_pmus()
>>>
>>> tools/perf/arch/arm/util/pmu.c | 3 +
>>> tools/perf/arch/arm64/util/mem-events.c | 39 +---
>>> tools/perf/arch/arm64/util/mem-events.h | 7 +
>>> tools/perf/arch/powerpc/util/mem-events.c | 13 +-
>>> tools/perf/arch/powerpc/util/mem-events.h | 7 +
>>> tools/perf/arch/powerpc/util/pmu.c | 11 ++
>>> tools/perf/arch/s390/util/pmu.c | 3 +
>>> tools/perf/arch/x86/util/mem-events.c | 99 ++--------
>>> tools/perf/arch/x86/util/mem-events.h | 10 +
>>> tools/perf/arch/x86/util/pmu.c | 19 +-
>>> tools/perf/builtin-c2c.c | 45 ++---
>>> tools/perf/builtin-mem.c | 48 ++---
>>> tools/perf/util/mem-events.c | 217 +++++++++++++---------
>>> tools/perf/util/mem-events.h | 19 +-
>>> tools/perf/util/pmu.c | 4 +-
>>> tools/perf/util/pmu.h | 7 +
>>> tools/perf/util/pmus.c | 6 -
>>> tools/perf/util/pmus.h | 1 -
>>> 18 files changed, 278 insertions(+), 280 deletions(-)
>>> create mode 100644 tools/perf/arch/arm64/util/mem-events.h
>>> create mode 100644 tools/perf/arch/powerpc/util/mem-events.h
>>> create mode 100644 tools/perf/arch/powerpc/util/pmu.c
>>> create mode 100644 tools/perf/arch/x86/util/mem-events.h
>>>
>>
>

2024-01-05 06:40:35

by kajoljain

[permalink] [raw]
Subject: Re: [PATCH V3 0/7] Clean up perf mem



On 1/3/24 01:38, Liang, Kan wrote:
> Hi Kajol Jain
>
> On 2023-12-19 9:15 a.m., Liang, Kan wrote:
>>
>>
>> On 2023-12-19 4:26 a.m., kajoljain wrote:
>>> Hi,
>>> I was trying to test this patchset on powerpc.
>>>
>>> After applying it on top of acme's perf-tools-next branch, I am getting
>>> below error:
>>>
>>> INSTALL libsubcmd_headers
>>> INSTALL libperf_headers
>>> INSTALL libsymbol_headers
>>> INSTALL libapi_headers
>>> INSTALL libbpf_headers
>>> CC arch/powerpc/util/mem-events.o
>>> In file included from arch/powerpc/util/mem-events.c:3:
>>> arch/powerpc/util/mem-events.h:5:52: error: ‘PERF_MEM_EVENTS__MAX’
>>> undeclared here (not in a function)
>>> 5 | extern struct perf_mem_event
>>> perf_mem_events_power[PERF_MEM_EVENTS__MAX];
>>> |
>>> ^~~~~~~~~~~~~~~~~~~~
>>> make[6]: *** [/home/kajol/linux/tools/build/Makefile.build:105:
>>> arch/powerpc/util/mem-events.o] Error 1
>>> make[5]: *** [/home/kajol/linux/tools/build/Makefile.build:158: util]
>>> Error 2
>>> make[4]: *** [/home/kajol/linux/tools/build/Makefile.build:158: powerpc]
>>> Error 2
>>> make[3]: *** [/home/kajol/linux/tools/build/Makefile.build:158: arch]
>>> Error 2
>>> make[3]: *** Waiting for unfinished jobs....
>>> make[2]: *** [Makefile.perf:693: perf-in.o] Error 2
>>> make[1]: *** [Makefile.perf:251: sub-make] Error 2
>>> make: *** [Makefile:70: all] Error 2
>>>
>>> It seems some headerfiles are missing from arch/powerpc/util/mem-
>>> events.c
>>>
>>
>> Leo updated the headerfiles for ARM. https://termbin.com/0dkn
>>
>> I guess powerpc has to do the same thing. Could you please try the below
>> patch?
>
>
> Does the patch work on powerpc?

Hi Kan,
Sorry I went for vacation so couldn't update. Yes this fix works. But
we have another issue, actually this patch set changes uses ldlat
attribute. But ldlat is not supported in powerpc because of which perf
mem is failing in powerpc.

I am looking into a work around to fix this issue. I will update the fix.

Thanks,
Kajol Jain


>
>
> Thanks,
> Kan
>>
>> diff --git a/tools/perf/arch/powerpc/util/mem-events.c
>> b/tools/perf/arch/powerpc/util/mem-events.c
>> index 72a6ac2b52f5..765d4a054b0a 100644
>> --- a/tools/perf/arch/powerpc/util/mem-events.c
>> +++ b/tools/perf/arch/powerpc/util/mem-events.c
>> @@ -1,5 +1,6 @@
>> // SPDX-License-Identifier: GPL-2.0
>> -#include "map_symbol.h"
>> +#include "util/map_symbol.h"
>> +#include "util/mem-events.h"
>> #include "mem-events.h"
>>
>> #define E(t, n, s, l, a) { .tag = t, .name = n, .event_name = s, .ldlat
>> = l, .aux_event = a }
>>
>> Thanks,
>> Kan
>>
>>> Thanks,
>>> Kajol Jain
>>>
>>> On 12/14/23 01:21, [email protected] wrote:
>>>> From: Kan Liang <[email protected]>
>>>>
>>>> Changes since V2:
>>>> - Fix the Arm64 building error (Leo)
>>>> - Add two new patches to clean up perf_mem_events__record_args()
>>>> and perf_pmus__num_mem_pmus() (Leo)
>>>>
>>>> Changes since V1:
>>>> - Fix strcmp of PMU name checking (Ravi)
>>>> - Fix "/," typo (Ian)
>>>> - Rename several functions with perf_pmu__mem_events prefix. (Ian)
>>>> - Fold the header removal patch into the patch where the cleanups made.
>>>> (Arnaldo)
>>>> - Add reviewed-by and tested-by from Ian and Ravi
>>>>
>>>> As discussed in the below thread, the patch set is to clean up perf mem.
>>>> https://lore.kernel.org/lkml/[email protected]/
>>>>
>>>> Introduce generic functions perf_mem_events__ptr(),
>>>> perf_mem_events__name() ,and is_mem_loads_aux_event() to replace the
>>>> ARCH specific ones.
>>>> Simplify the perf_mem_event__supported().
>>>>
>>>> Only keeps the ARCH-specific perf_mem_events array in the corresponding
>>>> mem-events.c for each ARCH.
>>>>
>>>> There is no functional change.
>>>>
>>>> The patch set touches almost all the ARCHs, Intel, AMD, ARM, Power and
>>>> etc. But I can only test it on two Intel platforms.
>>>> Please give it try, if you have machines with other ARCHs.
>>>>
>>>> Here are the test results:
>>>> Intel hybrid machine:
>>>>
>>>> $perf mem record -e list
>>>> ldlat-loads : available
>>>> ldlat-stores : available
>>>>
>>>> $perf mem record -e ldlat-loads -v --ldlat 50
>>>> calling: record -e cpu_atom/mem-loads,ldlat=50/P -e cpu_core/mem-loads,ldlat=50/P
>>>>
>>>> $perf mem record -v
>>>> calling: record -e cpu_atom/mem-loads,ldlat=30/P -e cpu_atom/mem-stores/P -e cpu_core/mem-loads,ldlat=30/P -e cpu_core/mem-stores/P
>>>>
>>>> $perf mem record -t store -v
>>>> calling: record -e cpu_atom/mem-stores/P -e cpu_core/mem-stores/P
>>>>
>>>>
>>>> Intel SPR:
>>>> $perf mem record -e list
>>>> ldlat-loads : available
>>>> ldlat-stores : available
>>>>
>>>> $perf mem record -e ldlat-loads -v --ldlat 50
>>>> calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=50/}:P
>>>>
>>>> $perf mem record -v
>>>> calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=30/}:P -e cpu/mem-stores/P
>>>>
>>>> $perf mem record -t store -v
>>>> calling: record -e cpu/mem-stores/P
>>>>
>>>> Kan Liang (7):
>>>> perf mem: Add mem_events into the supported perf_pmu
>>>> perf mem: Clean up perf_mem_events__ptr()
>>>> perf mem: Clean up perf_mem_events__name()
>>>> perf mem: Clean up perf_mem_event__supported()
>>>> perf mem: Clean up is_mem_loads_aux_event()
>>>> perf mem: Clean up perf_mem_events__record_args()
>>>> perf mem: Clean up perf_pmus__num_mem_pmus()
>>>>
>>>> tools/perf/arch/arm/util/pmu.c | 3 +
>>>> tools/perf/arch/arm64/util/mem-events.c | 39 +---
>>>> tools/perf/arch/arm64/util/mem-events.h | 7 +
>>>> tools/perf/arch/powerpc/util/mem-events.c | 13 +-
>>>> tools/perf/arch/powerpc/util/mem-events.h | 7 +
>>>> tools/perf/arch/powerpc/util/pmu.c | 11 ++
>>>> tools/perf/arch/s390/util/pmu.c | 3 +
>>>> tools/perf/arch/x86/util/mem-events.c | 99 ++--------
>>>> tools/perf/arch/x86/util/mem-events.h | 10 +
>>>> tools/perf/arch/x86/util/pmu.c | 19 +-
>>>> tools/perf/builtin-c2c.c | 45 ++---
>>>> tools/perf/builtin-mem.c | 48 ++---
>>>> tools/perf/util/mem-events.c | 217 +++++++++++++---------
>>>> tools/perf/util/mem-events.h | 19 +-
>>>> tools/perf/util/pmu.c | 4 +-
>>>> tools/perf/util/pmu.h | 7 +
>>>> tools/perf/util/pmus.c | 6 -
>>>> tools/perf/util/pmus.h | 1 -
>>>> 18 files changed, 278 insertions(+), 280 deletions(-)
>>>> create mode 100644 tools/perf/arch/arm64/util/mem-events.h
>>>> create mode 100644 tools/perf/arch/powerpc/util/mem-events.h
>>>> create mode 100644 tools/perf/arch/powerpc/util/pmu.c
>>>> create mode 100644 tools/perf/arch/x86/util/mem-events.h
>>>>
>>>
>>

2024-01-05 14:38:26

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH V3 0/7] Clean up perf mem



On 2024-01-05 1:38 a.m., kajoljain wrote:
>
>
> On 1/3/24 01:38, Liang, Kan wrote:
>> Hi Kajol Jain
>>
>> On 2023-12-19 9:15 a.m., Liang, Kan wrote:
>>>
>>>
>>> On 2023-12-19 4:26 a.m., kajoljain wrote:
>>>> Hi,
>>>> I was trying to test this patchset on powerpc.
>>>>
>>>> After applying it on top of acme's perf-tools-next branch, I am getting
>>>> below error:
>>>>
>>>> INSTALL libsubcmd_headers
>>>> INSTALL libperf_headers
>>>> INSTALL libsymbol_headers
>>>> INSTALL libapi_headers
>>>> INSTALL libbpf_headers
>>>> CC arch/powerpc/util/mem-events.o
>>>> In file included from arch/powerpc/util/mem-events.c:3:
>>>> arch/powerpc/util/mem-events.h:5:52: error: ‘PERF_MEM_EVENTS__MAX’
>>>> undeclared here (not in a function)
>>>> 5 | extern struct perf_mem_event
>>>> perf_mem_events_power[PERF_MEM_EVENTS__MAX];
>>>> |
>>>> ^~~~~~~~~~~~~~~~~~~~
>>>> make[6]: *** [/home/kajol/linux/tools/build/Makefile.build:105:
>>>> arch/powerpc/util/mem-events.o] Error 1
>>>> make[5]: *** [/home/kajol/linux/tools/build/Makefile.build:158: util]
>>>> Error 2
>>>> make[4]: *** [/home/kajol/linux/tools/build/Makefile.build:158: powerpc]
>>>> Error 2
>>>> make[3]: *** [/home/kajol/linux/tools/build/Makefile.build:158: arch]
>>>> Error 2
>>>> make[3]: *** Waiting for unfinished jobs....
>>>> make[2]: *** [Makefile.perf:693: perf-in.o] Error 2
>>>> make[1]: *** [Makefile.perf:251: sub-make] Error 2
>>>> make: *** [Makefile:70: all] Error 2
>>>>
>>>> It seems some headerfiles are missing from arch/powerpc/util/mem-
>>>> events.c
>>>>
>>>
>>> Leo updated the headerfiles for ARM. https://termbin.com/0dkn
>>>
>>> I guess powerpc has to do the same thing. Could you please try the below
>>> patch?
>>
>>
>> Does the patch work on powerpc?
>
> Hi Kan,
> Sorry I went for vacation so couldn't update. Yes this fix works.

Thanks for the update.

> But
> we have another issue, actually this patch set changes uses ldlat
> attribute. But ldlat is not supported in powerpc because of which perf
> mem is failing in powerpc.

For powerpc, the patch 3 introduced a perf_mem_events_power, which
doesn't have ldlat. But it only be assigned to the pmu->is_core. I'm not
sure if it's the problem.
Also, S390 still uses the default perf_mem_events, which includes ldlat.
I'm not sure if S390 supports the ldlat.

Thanks,
Kan
>
> I am looking into a work around to fix this issue. I will update the fix.
>
> Thanks,
> Kajol Jain
>
>
>>
>>
>> Thanks,
>> Kan
>>>
>>> diff --git a/tools/perf/arch/powerpc/util/mem-events.c
>>> b/tools/perf/arch/powerpc/util/mem-events.c
>>> index 72a6ac2b52f5..765d4a054b0a 100644
>>> --- a/tools/perf/arch/powerpc/util/mem-events.c
>>> +++ b/tools/perf/arch/powerpc/util/mem-events.c
>>> @@ -1,5 +1,6 @@
>>> // SPDX-License-Identifier: GPL-2.0
>>> -#include "map_symbol.h"
>>> +#include "util/map_symbol.h"
>>> +#include "util/mem-events.h"
>>> #include "mem-events.h"
>>>
>>> #define E(t, n, s, l, a) { .tag = t, .name = n, .event_name = s, .ldlat
>>> = l, .aux_event = a }
>>>
>>> Thanks,
>>> Kan
>>>
>>>> Thanks,
>>>> Kajol Jain
>>>>
>>>> On 12/14/23 01:21, [email protected] wrote:
>>>>> From: Kan Liang <[email protected]>
>>>>>
>>>>> Changes since V2:
>>>>> - Fix the Arm64 building error (Leo)
>>>>> - Add two new patches to clean up perf_mem_events__record_args()
>>>>> and perf_pmus__num_mem_pmus() (Leo)
>>>>>
>>>>> Changes since V1:
>>>>> - Fix strcmp of PMU name checking (Ravi)
>>>>> - Fix "/," typo (Ian)
>>>>> - Rename several functions with perf_pmu__mem_events prefix. (Ian)
>>>>> - Fold the header removal patch into the patch where the cleanups made.
>>>>> (Arnaldo)
>>>>> - Add reviewed-by and tested-by from Ian and Ravi
>>>>>
>>>>> As discussed in the below thread, the patch set is to clean up perf mem.
>>>>> https://lore.kernel.org/lkml/[email protected]/
>>>>>
>>>>> Introduce generic functions perf_mem_events__ptr(),
>>>>> perf_mem_events__name() ,and is_mem_loads_aux_event() to replace the
>>>>> ARCH specific ones.
>>>>> Simplify the perf_mem_event__supported().
>>>>>
>>>>> Only keeps the ARCH-specific perf_mem_events array in the corresponding
>>>>> mem-events.c for each ARCH.
>>>>>
>>>>> There is no functional change.
>>>>>
>>>>> The patch set touches almost all the ARCHs, Intel, AMD, ARM, Power and
>>>>> etc. But I can only test it on two Intel platforms.
>>>>> Please give it try, if you have machines with other ARCHs.
>>>>>
>>>>> Here are the test results:
>>>>> Intel hybrid machine:
>>>>>
>>>>> $perf mem record -e list
>>>>> ldlat-loads : available
>>>>> ldlat-stores : available
>>>>>
>>>>> $perf mem record -e ldlat-loads -v --ldlat 50
>>>>> calling: record -e cpu_atom/mem-loads,ldlat=50/P -e cpu_core/mem-loads,ldlat=50/P
>>>>>
>>>>> $perf mem record -v
>>>>> calling: record -e cpu_atom/mem-loads,ldlat=30/P -e cpu_atom/mem-stores/P -e cpu_core/mem-loads,ldlat=30/P -e cpu_core/mem-stores/P
>>>>>
>>>>> $perf mem record -t store -v
>>>>> calling: record -e cpu_atom/mem-stores/P -e cpu_core/mem-stores/P
>>>>>
>>>>>
>>>>> Intel SPR:
>>>>> $perf mem record -e list
>>>>> ldlat-loads : available
>>>>> ldlat-stores : available
>>>>>
>>>>> $perf mem record -e ldlat-loads -v --ldlat 50
>>>>> calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=50/}:P
>>>>>
>>>>> $perf mem record -v
>>>>> calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=30/}:P -e cpu/mem-stores/P
>>>>>
>>>>> $perf mem record -t store -v
>>>>> calling: record -e cpu/mem-stores/P
>>>>>
>>>>> Kan Liang (7):
>>>>> perf mem: Add mem_events into the supported perf_pmu
>>>>> perf mem: Clean up perf_mem_events__ptr()
>>>>> perf mem: Clean up perf_mem_events__name()
>>>>> perf mem: Clean up perf_mem_event__supported()
>>>>> perf mem: Clean up is_mem_loads_aux_event()
>>>>> perf mem: Clean up perf_mem_events__record_args()
>>>>> perf mem: Clean up perf_pmus__num_mem_pmus()
>>>>>
>>>>> tools/perf/arch/arm/util/pmu.c | 3 +
>>>>> tools/perf/arch/arm64/util/mem-events.c | 39 +---
>>>>> tools/perf/arch/arm64/util/mem-events.h | 7 +
>>>>> tools/perf/arch/powerpc/util/mem-events.c | 13 +-
>>>>> tools/perf/arch/powerpc/util/mem-events.h | 7 +
>>>>> tools/perf/arch/powerpc/util/pmu.c | 11 ++
>>>>> tools/perf/arch/s390/util/pmu.c | 3 +
>>>>> tools/perf/arch/x86/util/mem-events.c | 99 ++--------
>>>>> tools/perf/arch/x86/util/mem-events.h | 10 +
>>>>> tools/perf/arch/x86/util/pmu.c | 19 +-
>>>>> tools/perf/builtin-c2c.c | 45 ++---
>>>>> tools/perf/builtin-mem.c | 48 ++---
>>>>> tools/perf/util/mem-events.c | 217 +++++++++++++---------
>>>>> tools/perf/util/mem-events.h | 19 +-
>>>>> tools/perf/util/pmu.c | 4 +-
>>>>> tools/perf/util/pmu.h | 7 +
>>>>> tools/perf/util/pmus.c | 6 -
>>>>> tools/perf/util/pmus.h | 1 -
>>>>> 18 files changed, 278 insertions(+), 280 deletions(-)
>>>>> create mode 100644 tools/perf/arch/arm64/util/mem-events.h
>>>>> create mode 100644 tools/perf/arch/powerpc/util/mem-events.h
>>>>> create mode 100644 tools/perf/arch/powerpc/util/pmu.c
>>>>> create mode 100644 tools/perf/arch/x86/util/mem-events.h
>>>>>
>>>>
>>>
>

2024-01-07 04:08:50

by Leo Yan

[permalink] [raw]
Subject: Re: [PATCH V3 0/7] Clean up perf mem

On Wed, Dec 13, 2023 at 11:51:47AM -0800, [email protected] wrote:

[...]

> Introduce generic functions perf_mem_events__ptr(),
> perf_mem_events__name() ,and is_mem_loads_aux_event() to replace the
> ARCH specific ones.
> Simplify the perf_mem_event__supported().
>
> Only keeps the ARCH-specific perf_mem_events array in the corresponding
> mem-events.c for each ARCH.
>
> There is no functional change.
>
> The patch set touches almost all the ARCHs, Intel, AMD, ARM, Power and
> etc. But I can only test it on two Intel platforms.
> Please give it try, if you have machines with other ARCHs.
>
> Here are the test results:
> Intel hybrid machine:
>
> $perf mem record -e list
> ldlat-loads : available
> ldlat-stores : available
>
> $perf mem record -e ldlat-loads -v --ldlat 50
> calling: record -e cpu_atom/mem-loads,ldlat=50/P -e cpu_core/mem-loads,ldlat=50/P
>
> $perf mem record -v
> calling: record -e cpu_atom/mem-loads,ldlat=30/P -e cpu_atom/mem-stores/P -e cpu_core/mem-loads,ldlat=30/P -e cpu_core/mem-stores/P
>
> $perf mem record -t store -v
> calling: record -e cpu_atom/mem-stores/P -e cpu_core/mem-stores/P
>
>
> Intel SPR:
> $perf mem record -e list
> ldlat-loads : available
> ldlat-stores : available
>
> $perf mem record -e ldlat-loads -v --ldlat 50
> calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=50/}:P
>
> $perf mem record -v
> calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=30/}:P -e cpu/mem-stores/P
>
> $perf mem record -t store -v
> calling: record -e cpu/mem-stores/P

After applying this series, below tests pass with Arm SPE:

# ./perf c2c record -- /home/leoy/false_sharing.exe 2
# ./perf c2c report

# ./perf mem record -e list
# ./perf mem record -e spe-load -v --ldlat 50
# ./perf mem record -v
# ./perf mem report
# ./perf mem record -t store -v
# ./perf mem report

Tested-by: Leo Yan <[email protected]>

2024-01-09 14:03:04

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH V3 0/7] Clean up perf mem



On 2024-01-06 11:08 p.m., Leo Yan wrote:
> On Wed, Dec 13, 2023 at 11:51:47AM -0800, [email protected] wrote:
>
> [...]
>
>> Introduce generic functions perf_mem_events__ptr(),
>> perf_mem_events__name() ,and is_mem_loads_aux_event() to replace the
>> ARCH specific ones.
>> Simplify the perf_mem_event__supported().
>>
>> Only keeps the ARCH-specific perf_mem_events array in the corresponding
>> mem-events.c for each ARCH.
>>
>> There is no functional change.
>>
>> The patch set touches almost all the ARCHs, Intel, AMD, ARM, Power and
>> etc. But I can only test it on two Intel platforms.
>> Please give it try, if you have machines with other ARCHs.
>>
>> Here are the test results:
>> Intel hybrid machine:
>>
>> $perf mem record -e list
>> ldlat-loads : available
>> ldlat-stores : available
>>
>> $perf mem record -e ldlat-loads -v --ldlat 50
>> calling: record -e cpu_atom/mem-loads,ldlat=50/P -e cpu_core/mem-loads,ldlat=50/P
>>
>> $perf mem record -v
>> calling: record -e cpu_atom/mem-loads,ldlat=30/P -e cpu_atom/mem-stores/P -e cpu_core/mem-loads,ldlat=30/P -e cpu_core/mem-stores/P
>>
>> $perf mem record -t store -v
>> calling: record -e cpu_atom/mem-stores/P -e cpu_core/mem-stores/P
>>
>>
>> Intel SPR:
>> $perf mem record -e list
>> ldlat-loads : available
>> ldlat-stores : available
>>
>> $perf mem record -e ldlat-loads -v --ldlat 50
>> calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=50/}:P
>>
>> $perf mem record -v
>> calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=30/}:P -e cpu/mem-stores/P
>>
>> $perf mem record -t store -v
>> calling: record -e cpu/mem-stores/P
>
> After applying this series, below tests pass with Arm SPE:
>
> # ./perf c2c record -- /home/leoy/false_sharing.exe 2
> # ./perf c2c report
>
> # ./perf mem record -e list
> # ./perf mem record -e spe-load -v --ldlat 50
> # ./perf mem record -v
> # ./perf mem report
> # ./perf mem record -t store -v
> # ./perf mem report
>
> Tested-by: Leo Yan <[email protected]>
>

Thanks Leo.

Kan


2024-01-16 13:23:41

by kajoljain

[permalink] [raw]
Subject: Re: [PATCH V3 2/7] perf mem: Clean up perf_mem_events__ptr()

patch looks fine to me.

Reviwed-by: Kajol Jain <[email protected]>
Tested-by: Kajol jain <[email protected]>

Thanks,
Kajol Jain

On 12/14/23 01:21, [email protected] wrote:
> From: Kan Liang <[email protected]>
>
> The mem_events can be retrieved from the struct perf_pmu now. An ARCH
> specific perf_mem_events__ptr() is not required anymore. Remove all of
> them.
>
> The Intel hybrid has multiple mem-events-supported PMUs. But they share
> the same mem_events. Other ARCHs only support one mem-events-supported
> PMU. In the configuration, it's good enough to only configure the
> mem_events for one PMU. Add perf_mem_events_find_pmu() which returns the
> first mem-events-supported PMU.
>
> In the perf_mem_events__init(), the perf_pmus__scan() is not required
> anymore. It avoids checking the sysfs for every PMU on the system.
>
> Make the perf_mem_events__record_args() more generic. Remove the
> perf_mem_events__print_unsupport_hybrid().
>
> Since pmu is added as a new parameter, rename perf_mem_events__ptr() to
> perf_pmu__mem_events_ptr(). Several other functions also do a similar
> rename.
>
> Reviewed-by: Ian Rogers <[email protected]>
> Tested-by: Ravi Bangoria <[email protected]>
> Signed-off-by: Kan Liang <[email protected]>
> ---
> tools/perf/arch/arm64/util/mem-events.c | 10 +--
> tools/perf/arch/x86/util/mem-events.c | 18 ++---
> tools/perf/builtin-c2c.c | 28 +++++--
> tools/perf/builtin-mem.c | 28 +++++--
> tools/perf/util/mem-events.c | 103 ++++++++++++------------
> tools/perf/util/mem-events.h | 9 ++-
> 6 files changed, 104 insertions(+), 92 deletions(-)
>
> diff --git a/tools/perf/arch/arm64/util/mem-events.c b/tools/perf/arch/arm64/util/mem-events.c
> index edf8207f7812..d3e69a520c2b 100644
> --- a/tools/perf/arch/arm64/util/mem-events.c
> +++ b/tools/perf/arch/arm64/util/mem-events.c
> @@ -13,17 +13,9 @@ struct perf_mem_event perf_mem_events_arm[PERF_MEM_EVENTS__MAX] = {
>
> static char mem_ev_name[100];
>
> -struct perf_mem_event *perf_mem_events__ptr(int i)
> -{
> - if (i >= PERF_MEM_EVENTS__MAX)
> - return NULL;
> -
> - return &perf_mem_events_arm[i];
> -}
> -
> const char *perf_mem_events__name(int i, const char *pmu_name __maybe_unused)
> {
> - struct perf_mem_event *e = perf_mem_events__ptr(i);
> + struct perf_mem_event *e = &perf_mem_events_arm[i];
>
> if (i >= PERF_MEM_EVENTS__MAX)
> return NULL;
> diff --git a/tools/perf/arch/x86/util/mem-events.c b/tools/perf/arch/x86/util/mem-events.c
> index 2b81d229982c..5fb41d50118d 100644
> --- a/tools/perf/arch/x86/util/mem-events.c
> +++ b/tools/perf/arch/x86/util/mem-events.c
> @@ -28,17 +28,6 @@ struct perf_mem_event perf_mem_events_amd[PERF_MEM_EVENTS__MAX] = {
> E("mem-ldst", "ibs_op//", "ibs_op"),
> };
>
> -struct perf_mem_event *perf_mem_events__ptr(int i)
> -{
> - if (i >= PERF_MEM_EVENTS__MAX)
> - return NULL;
> -
> - if (x86__is_amd_cpu())
> - return &perf_mem_events_amd[i];
> -
> - return &perf_mem_events_intel[i];
> -}
> -
> bool is_mem_loads_aux_event(struct evsel *leader)
> {
> struct perf_pmu *pmu = perf_pmus__find("cpu");
> @@ -54,7 +43,12 @@ bool is_mem_loads_aux_event(struct evsel *leader)
>
> const char *perf_mem_events__name(int i, const char *pmu_name)
> {
> - struct perf_mem_event *e = perf_mem_events__ptr(i);
> + struct perf_mem_event *e;
> +
> + if (x86__is_amd_cpu())
> + e = &perf_mem_events_amd[i];
> + else
> + e = &perf_mem_events_intel[i];
>
> if (!e)
> return NULL;
> diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
> index f78eea9e2153..838481505e08 100644
> --- a/tools/perf/builtin-c2c.c
> +++ b/tools/perf/builtin-c2c.c
> @@ -3215,12 +3215,19 @@ static int parse_record_events(const struct option *opt,
> const char *str, int unset __maybe_unused)
> {
> bool *event_set = (bool *) opt->value;
> + struct perf_pmu *pmu;
> +
> + pmu = perf_mem_events_find_pmu();
> + if (!pmu) {
> + pr_err("failed: there is no PMU that supports perf c2c\n");
> + exit(-1);
> + }
>
> if (!strcmp(str, "list")) {
> - perf_mem_events__list();
> + perf_pmu__mem_events_list(pmu);
> exit(0);
> }
> - if (perf_mem_events__parse(str))
> + if (perf_pmu__mem_events_parse(pmu, str))
> exit(-1);
>
> *event_set = true;
> @@ -3245,6 +3252,7 @@ static int perf_c2c__record(int argc, const char **argv)
> bool all_user = false, all_kernel = false;
> bool event_set = false;
> struct perf_mem_event *e;
> + struct perf_pmu *pmu;
> struct option options[] = {
> OPT_CALLBACK('e', "event", &event_set, "event",
> "event selector. Use 'perf c2c record -e list' to list available events",
> @@ -3256,7 +3264,13 @@ static int perf_c2c__record(int argc, const char **argv)
> OPT_END()
> };
>
> - if (perf_mem_events__init()) {
> + pmu = perf_mem_events_find_pmu();
> + if (!pmu) {
> + pr_err("failed: no PMU supports the memory events\n");
> + return -1;
> + }
> +
> + if (perf_pmu__mem_events_init(pmu)) {
> pr_err("failed: memory events not supported\n");
> return -1;
> }
> @@ -3280,7 +3294,7 @@ static int perf_c2c__record(int argc, const char **argv)
> rec_argv[i++] = "record";
>
> if (!event_set) {
> - e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD_STORE);
> + e = perf_pmu__mem_events_ptr(pmu, PERF_MEM_EVENTS__LOAD_STORE);
> /*
> * The load and store operations are required, use the event
> * PERF_MEM_EVENTS__LOAD_STORE if it is supported.
> @@ -3289,15 +3303,15 @@ static int perf_c2c__record(int argc, const char **argv)
> e->record = true;
> rec_argv[i++] = "-W";
> } else {
> - e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD);
> + e = perf_pmu__mem_events_ptr(pmu, PERF_MEM_EVENTS__LOAD);
> e->record = true;
>
> - e = perf_mem_events__ptr(PERF_MEM_EVENTS__STORE);
> + e = perf_pmu__mem_events_ptr(pmu, PERF_MEM_EVENTS__STORE);
> e->record = true;
> }
> }
>
> - e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD);
> + e = perf_pmu__mem_events_ptr(pmu, PERF_MEM_EVENTS__LOAD);
> if (e->record)
> rec_argv[i++] = "-W";
>
> diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
> index 51499c20da01..ef64bae77ca7 100644
> --- a/tools/perf/builtin-mem.c
> +++ b/tools/perf/builtin-mem.c
> @@ -43,12 +43,19 @@ static int parse_record_events(const struct option *opt,
> const char *str, int unset __maybe_unused)
> {
> struct perf_mem *mem = *(struct perf_mem **)opt->value;
> + struct perf_pmu *pmu;
> +
> + pmu = perf_mem_events_find_pmu();
> + if (!pmu) {
> + pr_err("failed: there is no PMU that supports perf mem\n");
> + exit(-1);
> + }
>
> if (!strcmp(str, "list")) {
> - perf_mem_events__list();
> + perf_pmu__mem_events_list(pmu);
> exit(0);
> }
> - if (perf_mem_events__parse(str))
> + if (perf_pmu__mem_events_parse(pmu, str))
> exit(-1);
>
> mem->operation = 0;
> @@ -72,6 +79,7 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
> int ret;
> bool all_user = false, all_kernel = false;
> struct perf_mem_event *e;
> + struct perf_pmu *pmu;
> struct option options[] = {
> OPT_CALLBACK('e', "event", &mem, "event",
> "event selector. use 'perf mem record -e list' to list available events",
> @@ -84,7 +92,13 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
> OPT_END()
> };
>
> - if (perf_mem_events__init()) {
> + pmu = perf_mem_events_find_pmu();
> + if (!pmu) {
> + pr_err("failed: no PMU supports the memory events\n");
> + return -1;
> + }
> +
> + if (perf_pmu__mem_events_init(pmu)) {
> pr_err("failed: memory events not supported\n");
> return -1;
> }
> @@ -113,7 +127,7 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
>
> rec_argv[i++] = "record";
>
> - e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD_STORE);
> + e = perf_pmu__mem_events_ptr(pmu, PERF_MEM_EVENTS__LOAD_STORE);
>
> /*
> * The load and store operations are required, use the event
> @@ -126,17 +140,17 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
> rec_argv[i++] = "-W";
> } else {
> if (mem->operation & MEM_OPERATION_LOAD) {
> - e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD);
> + e = perf_pmu__mem_events_ptr(pmu, PERF_MEM_EVENTS__LOAD);
> e->record = true;
> }
>
> if (mem->operation & MEM_OPERATION_STORE) {
> - e = perf_mem_events__ptr(PERF_MEM_EVENTS__STORE);
> + e = perf_pmu__mem_events_ptr(pmu, PERF_MEM_EVENTS__STORE);
> e->record = true;
> }
> }
>
> - e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD);
> + e = perf_pmu__mem_events_ptr(pmu, PERF_MEM_EVENTS__LOAD);
> if (e->record)
> rec_argv[i++] = "-W";
>
> diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
> index 0a8f415f5efe..27a33dc44964 100644
> --- a/tools/perf/util/mem-events.c
> +++ b/tools/perf/util/mem-events.c
> @@ -29,17 +29,42 @@ struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
> static char mem_loads_name[100];
> static bool mem_loads_name__init;
>
> -struct perf_mem_event * __weak perf_mem_events__ptr(int i)
> +struct perf_mem_event *perf_pmu__mem_events_ptr(struct perf_pmu *pmu, int i)
> {
> - if (i >= PERF_MEM_EVENTS__MAX)
> + if (i >= PERF_MEM_EVENTS__MAX || !pmu)
> return NULL;
>
> - return &perf_mem_events[i];
> + return &pmu->mem_events[i];
> +}
> +
> +static struct perf_pmu *perf_pmus__scan_mem(struct perf_pmu *pmu)
> +{
> + while ((pmu = perf_pmus__scan(pmu)) != NULL) {
> + if (pmu->mem_events)
> + return pmu;
> + }
> + return NULL;
> +}
> +
> +struct perf_pmu *perf_mem_events_find_pmu(void)
> +{
> + /*
> + * The current perf mem doesn't support per-PMU configuration.
> + * The exact same configuration is applied to all the
> + * mem_events supported PMUs.
> + * Return the first mem_events supported PMU.
> + *
> + * Notes: The only case which may support multiple mem_events
> + * supported PMUs is Intel hybrid. The exact same mem_events
> + * is shared among the PMUs. Only configure the first PMU
> + * is good enough as well.
> + */
> + return perf_pmus__scan_mem(NULL);
> }
>
> const char * __weak perf_mem_events__name(int i, const char *pmu_name __maybe_unused)
> {
> - struct perf_mem_event *e = perf_mem_events__ptr(i);
> + struct perf_mem_event *e = &perf_mem_events[i];
>
> if (!e)
> return NULL;
> @@ -61,7 +86,7 @@ __weak bool is_mem_loads_aux_event(struct evsel *leader __maybe_unused)
> return false;
> }
>
> -int perf_mem_events__parse(const char *str)
> +int perf_pmu__mem_events_parse(struct perf_pmu *pmu, const char *str)
> {
> char *tok, *saveptr = NULL;
> bool found = false;
> @@ -79,7 +104,7 @@ int perf_mem_events__parse(const char *str)
>
> while (tok) {
> for (j = 0; j < PERF_MEM_EVENTS__MAX; j++) {
> - struct perf_mem_event *e = perf_mem_events__ptr(j);
> + struct perf_mem_event *e = perf_pmu__mem_events_ptr(pmu, j);
>
> if (!e->tag)
> continue;
> @@ -112,7 +137,7 @@ static bool perf_mem_event__supported(const char *mnt, struct perf_pmu *pmu,
> return !stat(path, &st);
> }
>
> -int perf_mem_events__init(void)
> +int perf_pmu__mem_events_init(struct perf_pmu *pmu)
> {
> const char *mnt = sysfs__mount();
> bool found = false;
> @@ -122,8 +147,7 @@ int perf_mem_events__init(void)
> return -ENOENT;
>
> for (j = 0; j < PERF_MEM_EVENTS__MAX; j++) {
> - struct perf_mem_event *e = perf_mem_events__ptr(j);
> - struct perf_pmu *pmu = NULL;
> + struct perf_mem_event *e = perf_pmu__mem_events_ptr(pmu, j);
>
> /*
> * If the event entry isn't valid, skip initialization
> @@ -132,29 +156,20 @@ int perf_mem_events__init(void)
> if (!e->tag)
> continue;
>
> - /*
> - * Scan all PMUs not just core ones, since perf mem/c2c on
> - * platforms like AMD uses IBS OP PMU which is independent
> - * of core PMU.
> - */
> - while ((pmu = perf_pmus__scan(pmu)) != NULL) {
> - e->supported |= perf_mem_event__supported(mnt, pmu, e);
> - if (e->supported) {
> - found = true;
> - break;
> - }
> - }
> + e->supported |= perf_mem_event__supported(mnt, pmu, e);
> + if (e->supported)
> + found = true;
> }
>
> return found ? 0 : -ENOENT;
> }
>
> -void perf_mem_events__list(void)
> +void perf_pmu__mem_events_list(struct perf_pmu *pmu)
> {
> int j;
>
> for (j = 0; j < PERF_MEM_EVENTS__MAX; j++) {
> - struct perf_mem_event *e = perf_mem_events__ptr(j);
> + struct perf_mem_event *e = perf_pmu__mem_events_ptr(pmu, j);
>
> fprintf(stderr, "%-*s%-*s%s",
> e->tag ? 13 : 0,
> @@ -165,50 +180,32 @@ void perf_mem_events__list(void)
> }
> }
>
> -static void perf_mem_events__print_unsupport_hybrid(struct perf_mem_event *e,
> - int idx)
> -{
> - const char *mnt = sysfs__mount();
> - struct perf_pmu *pmu = NULL;
> -
> - while ((pmu = perf_pmus__scan(pmu)) != NULL) {
> - if (!perf_mem_event__supported(mnt, pmu, e)) {
> - pr_err("failed: event '%s' not supported\n",
> - perf_mem_events__name(idx, pmu->name));
> - }
> - }
> -}
> -
> int perf_mem_events__record_args(const char **rec_argv, int *argv_nr,
> char **rec_tmp, int *tmp_nr)
> {
> const char *mnt = sysfs__mount();
> + struct perf_pmu *pmu = NULL;
> int i = *argv_nr, k = 0;
> struct perf_mem_event *e;
>
> - for (int j = 0; j < PERF_MEM_EVENTS__MAX; j++) {
> - e = perf_mem_events__ptr(j);
> - if (!e->record)
> - continue;
>
> - if (perf_pmus__num_mem_pmus() == 1) {
> - if (!e->supported) {
> - pr_err("failed: event '%s' not supported\n",
> - perf_mem_events__name(j, NULL));
> - return -1;
> - }
> + while ((pmu = perf_pmus__scan_mem(pmu)) != NULL) {
> + for (int j = 0; j < PERF_MEM_EVENTS__MAX; j++) {
> + e = perf_pmu__mem_events_ptr(pmu, j);
>
> - rec_argv[i++] = "-e";
> - rec_argv[i++] = perf_mem_events__name(j, NULL);
> - } else {
> - struct perf_pmu *pmu = NULL;
> + if (!e->record)
> + continue;
>
> if (!e->supported) {
> - perf_mem_events__print_unsupport_hybrid(e, j);
> + pr_err("failed: event '%s' not supported\n",
> + perf_mem_events__name(j, pmu->name));
> return -1;
> }
>
> - while ((pmu = perf_pmus__scan(pmu)) != NULL) {
> + if (perf_pmus__num_mem_pmus() == 1) {
> + rec_argv[i++] = "-e";
> + rec_argv[i++] = perf_mem_events__name(j, NULL);
> + } else {
> const char *s = perf_mem_events__name(j, pmu->name);
>
> if (!perf_mem_event__supported(mnt, pmu, e))
> diff --git a/tools/perf/util/mem-events.h b/tools/perf/util/mem-events.h
> index 8c5694b2d0b0..0ad301a2e424 100644
> --- a/tools/perf/util/mem-events.h
> +++ b/tools/perf/util/mem-events.h
> @@ -36,14 +36,15 @@ enum {
> extern unsigned int perf_mem_events__loads_ldlat;
> extern struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX];
>
> -int perf_mem_events__parse(const char *str);
> -int perf_mem_events__init(void);
> +int perf_pmu__mem_events_parse(struct perf_pmu *pmu, const char *str);
> +int perf_pmu__mem_events_init(struct perf_pmu *pmu);
>
> const char *perf_mem_events__name(int i, const char *pmu_name);
> -struct perf_mem_event *perf_mem_events__ptr(int i);
> +struct perf_mem_event *perf_pmu__mem_events_ptr(struct perf_pmu *pmu, int i);
> +struct perf_pmu *perf_mem_events_find_pmu(void);
> bool is_mem_loads_aux_event(struct evsel *leader);
>
> -void perf_mem_events__list(void);
> +void perf_pmu__mem_events_list(struct perf_pmu *pmu);
> int perf_mem_events__record_args(const char **rec_argv, int *argv_nr,
> char **rec_tmp, int *tmp_nr);
>

2024-01-16 14:00:31

by kajoljain

[permalink] [raw]
Subject: Re: [PATCH V3 3/7] perf mem: Clean up perf_mem_events__name()

Hi Kan,
With this patch perf mem is failing in powerpc with below error:

[command]# ./perf mem record ls
event syntax error: '../mem-loads,ldlat=30/P'
\___ Bad event or PMU

Unable to find PMU or event on a PMU of 'cpu'

Initial error:
event syntax error: '../mem-loads,ldlat=30/P'
\___ unknown term 'ldlat' for pmu 'cpu'

This issue is happening as powerpc doesn't support ldlat parameter. And
this patch missing build of pmu.c file.

I am able to fix build and ldlat issue with below changes:

diff --git a/tools/perf/arch/powerpc/util/Build
b/tools/perf/arch/powerpc/util/Build
index 9889245c555c..1d323f3a3322 100644
--- a/tools/perf/arch/powerpc/util/Build
+++ b/tools/perf/arch/powerpc/util/Build
@@ -2,6 +2,7 @@ perf-y += header.o
perf-$(CONFIG_LIBTRACEEVENT) += kvm-stat.o
perf-y += perf_regs.o
perf-y += mem-events.o
+perf-y += pmu.o
perf-y += sym-handling.o
perf-y += evsel.o
perf-y += event.o
diff --git a/tools/perf/arch/powerpc/util/mem-events.c
b/tools/perf/arch/powerpc/util/mem-events.c
index b7883e38950f..9140cdb1bbfb 100644
--- a/tools/perf/arch/powerpc/util/mem-events.c
+++ b/tools/perf/arch/powerpc/util/mem-events.c
@@ -1,5 +1,6 @@
// SPDX-License-Identifier: GPL-2.0
-#include "map_symbol.h"
+#include "util/map_symbol.h"
+#include "util/mem-events.h"
#include "mem-events.h"

#define E(t, n, s, l, a) { .tag = t, .name = n, .sysfs_name = s, .ldlat
= l, .aux_event = a }
diff --git a/tools/perf/arch/powerpc/util/pmu.c
b/tools/perf/arch/powerpc/util/pmu.c
index 168173f88ddb..554675deef7b 100644
--- a/tools/perf/arch/powerpc/util/pmu.c
+++ b/tools/perf/arch/powerpc/util/pmu.c
@@ -3,6 +3,7 @@
#include <string.h>

#include "../../../util/pmu.h"
+#include "mem-events.h"

void perf_pmu__arch_init(struct perf_pmu *pmu)
{

Thanks,
Kajol Jain


On 12/14/23 01:21, [email protected] wrote:
> From: Kan Liang <[email protected]>
>
> Introduce a generic perf_mem_events__name(). Remove the ARCH-specific
> one.
>
> The mem_load events may have a different format. Add ldlat and aux_event
> in the struct perf_mem_event to indicate the format and the extra aux
> event.
>
> Add perf_mem_events_intel_aux[] to support the extra mem_load_aux event.
>
> Rename perf_mem_events__name to perf_pmu__mem_events_name.
>
> Reviewed-by: Ian Rogers <[email protected]>
> Tested-by: Ravi Bangoria <[email protected]>
> Signed-off-by: Kan Liang <[email protected]>
> ---
> tools/perf/arch/arm64/util/mem-events.c | 26 ++-------
> tools/perf/arch/powerpc/util/mem-events.c | 13 ++---
> tools/perf/arch/powerpc/util/mem-events.h | 7 +++
> tools/perf/arch/powerpc/util/pmu.c | 11 ++++
> tools/perf/arch/x86/util/mem-events.c | 70 +++++------------------
> tools/perf/arch/x86/util/mem-events.h | 1 +
> tools/perf/arch/x86/util/pmu.c | 8 ++-
> tools/perf/util/mem-events.c | 60 +++++++++++++------
> tools/perf/util/mem-events.h | 3 +-
> 9 files changed, 93 insertions(+), 106 deletions(-)
> create mode 100644 tools/perf/arch/powerpc/util/mem-events.h
> create mode 100644 tools/perf/arch/powerpc/util/pmu.c
>
> diff --git a/tools/perf/arch/arm64/util/mem-events.c b/tools/perf/arch/arm64/util/mem-events.c
> index d3e69a520c2b..96460c46640a 100644
> --- a/tools/perf/arch/arm64/util/mem-events.c
> +++ b/tools/perf/arch/arm64/util/mem-events.c
> @@ -3,28 +3,10 @@
> #include "util/mem-events.h"
> #include "mem-events.h"
>
> -#define E(t, n, s) { .tag = t, .name = n, .sysfs_name = s }
> +#define E(t, n, s, l, a) { .tag = t, .name = n, .sysfs_name = s, .ldlat = l, .aux_event = a }
>
> struct perf_mem_event perf_mem_events_arm[PERF_MEM_EVENTS__MAX] = {
> - E("spe-load", "arm_spe_0/ts_enable=1,pa_enable=1,load_filter=1,store_filter=0,min_latency=%u/", "arm_spe_0"),
> - E("spe-store", "arm_spe_0/ts_enable=1,pa_enable=1,load_filter=0,store_filter=1/", "arm_spe_0"),
> - E("spe-ldst", "arm_spe_0/ts_enable=1,pa_enable=1,load_filter=1,store_filter=1,min_latency=%u/", "arm_spe_0"),
> + E("spe-load", "%s/ts_enable=1,pa_enable=1,load_filter=1,store_filter=0,min_latency=%u/", "arm_spe_0", true, 0),
> + E("spe-store", "%s/ts_enable=1,pa_enable=1,load_filter=0,store_filter=1/", "arm_spe_0", false, 0),
> + E("spe-ldst", "%s/ts_enable=1,pa_enable=1,load_filter=1,store_filter=1,min_latency=%u/", "arm_spe_0", true, 0),
> };
> -
> -static char mem_ev_name[100];
> -
> -const char *perf_mem_events__name(int i, const char *pmu_name __maybe_unused)
> -{
> - struct perf_mem_event *e = &perf_mem_events_arm[i];
> -
> - if (i >= PERF_MEM_EVENTS__MAX)
> - return NULL;
> -
> - if (i == PERF_MEM_EVENTS__LOAD || i == PERF_MEM_EVENTS__LOAD_STORE)
> - scnprintf(mem_ev_name, sizeof(mem_ev_name),
> - e->name, perf_mem_events__loads_ldlat);
> - else /* PERF_MEM_EVENTS__STORE */
> - scnprintf(mem_ev_name, sizeof(mem_ev_name), e->name);
> -
> - return mem_ev_name;
> -}
> diff --git a/tools/perf/arch/powerpc/util/mem-events.c b/tools/perf/arch/powerpc/util/mem-events.c
> index 78b986e5268d..b7883e38950f 100644
> --- a/tools/perf/arch/powerpc/util/mem-events.c
> +++ b/tools/perf/arch/powerpc/util/mem-events.c
> @@ -2,11 +2,10 @@
> #include "map_symbol.h"
> #include "mem-events.h"
>
> -/* PowerPC does not support 'ldlat' parameter. */
> -const char *perf_mem_events__name(int i, const char *pmu_name __maybe_unused)
> -{
> - if (i == PERF_MEM_EVENTS__LOAD)
> - return "cpu/mem-loads/";
> +#define E(t, n, s, l, a) { .tag = t, .name = n, .sysfs_name = s, .ldlat = l, .aux_event = a }
>
> - return "cpu/mem-stores/";
> -}
> +struct perf_mem_event perf_mem_events_power[PERF_MEM_EVENTS__MAX] = {
> + E("ldlat-loads", "%s/mem-loads/", "cpu/events/mem-loads", false, 0),
> + E("ldlat-stores", "%s/mem-stores/", "cpu/events/mem-stores", false, 0),
> + E(NULL, NULL, NULL, false, 0),
> +};
> diff --git a/tools/perf/arch/powerpc/util/mem-events.h b/tools/perf/arch/powerpc/util/mem-events.h
> new file mode 100644
> index 000000000000..6acc3d1b6873
> --- /dev/null
> +++ b/tools/perf/arch/powerpc/util/mem-events.h
> @@ -0,0 +1,7 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _POWER_MEM_EVENTS_H
> +#define _POWER_MEM_EVENTS_H
> +
> +extern struct perf_mem_event perf_mem_events_power[PERF_MEM_EVENTS__MAX];
> +
> +#endif /* _POWER_MEM_EVENTS_H */
> diff --git a/tools/perf/arch/powerpc/util/pmu.c b/tools/perf/arch/powerpc/util/pmu.c
> new file mode 100644
> index 000000000000..168173f88ddb
> --- /dev/null
> +++ b/tools/perf/arch/powerpc/util/pmu.c
> @@ -0,0 +1,11 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include <string.h>
> +
> +#include "../../../util/pmu.h"
> +
> +void perf_pmu__arch_init(struct perf_pmu *pmu)
> +{
> + if (pmu->is_core)
> + pmu->mem_events = perf_mem_events_power;
> +}
> diff --git a/tools/perf/arch/x86/util/mem-events.c b/tools/perf/arch/x86/util/mem-events.c
> index 5fb41d50118d..f0e66a0151a0 100644
> --- a/tools/perf/arch/x86/util/mem-events.c
> +++ b/tools/perf/arch/x86/util/mem-events.c
> @@ -7,25 +7,26 @@
> #include "linux/string.h"
> #include "env.h"
>
> -static char mem_loads_name[100];
> -static bool mem_loads_name__init;
> -static char mem_stores_name[100];
> -
> #define MEM_LOADS_AUX 0x8203
> -#define MEM_LOADS_AUX_NAME "{%s/mem-loads-aux/,%s/mem-loads,ldlat=%u/}:P"
>
> -#define E(t, n, s) { .tag = t, .name = n, .sysfs_name = s }
> +#define E(t, n, s, l, a) { .tag = t, .name = n, .sysfs_name = s, .ldlat = l, .aux_event = a }
>
> struct perf_mem_event perf_mem_events_intel[PERF_MEM_EVENTS__MAX] = {
> - E("ldlat-loads", "%s/mem-loads,ldlat=%u/P", "%s/events/mem-loads"),
> - E("ldlat-stores", "%s/mem-stores/P", "%s/events/mem-stores"),
> - E(NULL, NULL, NULL),
> + E("ldlat-loads", "%s/mem-loads,ldlat=%u/P", "%s/events/mem-loads", true, 0),
> + E("ldlat-stores", "%s/mem-stores/P", "%s/events/mem-stores", false, 0),
> + E(NULL, NULL, NULL, false, 0),
> +};
> +
> +struct perf_mem_event perf_mem_events_intel_aux[PERF_MEM_EVENTS__MAX] = {
> + E("ldlat-loads", "{%s/mem-loads-aux/,%s/mem-loads,ldlat=%u/}:P", "%s/events/mem-loads", true, MEM_LOADS_AUX),
> + E("ldlat-stores", "%s/mem-stores/P", "%s/events/mem-stores", false, 0),
> + E(NULL, NULL, NULL, false, 0),
> };
>
> struct perf_mem_event perf_mem_events_amd[PERF_MEM_EVENTS__MAX] = {
> - E(NULL, NULL, NULL),
> - E(NULL, NULL, NULL),
> - E("mem-ldst", "ibs_op//", "ibs_op"),
> + E(NULL, NULL, NULL, false, 0),
> + E(NULL, NULL, NULL, false, 0),
> + E("mem-ldst", "%s//", "ibs_op", false, 0),
> };
>
> bool is_mem_loads_aux_event(struct evsel *leader)
> @@ -40,48 +41,3 @@ bool is_mem_loads_aux_event(struct evsel *leader)
>
> return leader->core.attr.config == MEM_LOADS_AUX;
> }
> -
> -const char *perf_mem_events__name(int i, const char *pmu_name)
> -{
> - struct perf_mem_event *e;
> -
> - if (x86__is_amd_cpu())
> - e = &perf_mem_events_amd[i];
> - else
> - e = &perf_mem_events_intel[i];
> -
> - if (!e)
> - return NULL;
> -
> - if (i == PERF_MEM_EVENTS__LOAD) {
> - if (mem_loads_name__init && !pmu_name)
> - return mem_loads_name;
> -
> - if (!pmu_name) {
> - mem_loads_name__init = true;
> - pmu_name = "cpu";
> - }
> -
> - if (perf_pmus__have_event(pmu_name, "mem-loads-aux")) {
> - scnprintf(mem_loads_name, sizeof(mem_loads_name),
> - MEM_LOADS_AUX_NAME, pmu_name, pmu_name,
> - perf_mem_events__loads_ldlat);
> - } else {
> - scnprintf(mem_loads_name, sizeof(mem_loads_name),
> - e->name, pmu_name,
> - perf_mem_events__loads_ldlat);
> - }
> - return mem_loads_name;
> - }
> -
> - if (i == PERF_MEM_EVENTS__STORE) {
> - if (!pmu_name)
> - pmu_name = "cpu";
> -
> - scnprintf(mem_stores_name, sizeof(mem_stores_name),
> - e->name, pmu_name);
> - return mem_stores_name;
> - }
> -
> - return e->name;
> -}
> diff --git a/tools/perf/arch/x86/util/mem-events.h b/tools/perf/arch/x86/util/mem-events.h
> index 3959e427f482..f55c8d3b7d59 100644
> --- a/tools/perf/arch/x86/util/mem-events.h
> +++ b/tools/perf/arch/x86/util/mem-events.h
> @@ -3,6 +3,7 @@
> #define _X86_MEM_EVENTS_H
>
> extern struct perf_mem_event perf_mem_events_intel[PERF_MEM_EVENTS__MAX];
> +extern struct perf_mem_event perf_mem_events_intel_aux[PERF_MEM_EVENTS__MAX];
>
> extern struct perf_mem_event perf_mem_events_amd[PERF_MEM_EVENTS__MAX];
>
> diff --git a/tools/perf/arch/x86/util/pmu.c b/tools/perf/arch/x86/util/pmu.c
> index cd22e80e5657..0f49ff13cfe2 100644
> --- a/tools/perf/arch/x86/util/pmu.c
> +++ b/tools/perf/arch/x86/util/pmu.c
> @@ -35,8 +35,12 @@ void perf_pmu__arch_init(struct perf_pmu *pmu __maybe_unused)
> if (x86__is_amd_cpu()) {
> if (!strcmp(pmu->name, "ibs_op"))
> pmu->mem_events = perf_mem_events_amd;
> - } else if (pmu->is_core)
> - pmu->mem_events = perf_mem_events_intel;
> + } else if (pmu->is_core) {
> + if (perf_pmu__have_event(pmu, "mem-loads-aux"))
> + pmu->mem_events = perf_mem_events_intel_aux;
> + else
> + pmu->mem_events = perf_mem_events_intel;
> + }
> }
>
> int perf_pmus__num_mem_pmus(void)
> diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
> index 27a33dc44964..51e53e33df03 100644
> --- a/tools/perf/util/mem-events.c
> +++ b/tools/perf/util/mem-events.c
> @@ -17,17 +17,17 @@
>
> unsigned int perf_mem_events__loads_ldlat = 30;
>
> -#define E(t, n, s) { .tag = t, .name = n, .sysfs_name = s }
> +#define E(t, n, s, l, a) { .tag = t, .name = n, .sysfs_name = s, .ldlat = l, .aux_event = a }
>
> struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
> - E("ldlat-loads", "cpu/mem-loads,ldlat=%u/P", "cpu/events/mem-loads"),
> - E("ldlat-stores", "cpu/mem-stores/P", "cpu/events/mem-stores"),
> - E(NULL, NULL, NULL),
> + E("ldlat-loads", "%s/mem-loads,ldlat=%u/P", "cpu/events/mem-loads", true, 0),
> + E("ldlat-stores", "%s/mem-stores/P", "cpu/events/mem-stores", false, 0),
> + E(NULL, NULL, NULL, false, 0),
> };
> #undef E
>
> static char mem_loads_name[100];
> -static bool mem_loads_name__init;
> +static char mem_stores_name[100];
>
> struct perf_mem_event *perf_pmu__mem_events_ptr(struct perf_pmu *pmu, int i)
> {
> @@ -62,23 +62,49 @@ struct perf_pmu *perf_mem_events_find_pmu(void)
> return perf_pmus__scan_mem(NULL);
> }
>
> -const char * __weak perf_mem_events__name(int i, const char *pmu_name __maybe_unused)
> +static const char *perf_pmu__mem_events_name(int i, struct perf_pmu *pmu)
> {
> - struct perf_mem_event *e = &perf_mem_events[i];
> + struct perf_mem_event *e;
>
> + if (i >= PERF_MEM_EVENTS__MAX || !pmu)
> + return NULL;
> +
> + e = &pmu->mem_events[i];
> if (!e)
> return NULL;
>
> - if (i == PERF_MEM_EVENTS__LOAD) {
> - if (!mem_loads_name__init) {
> - mem_loads_name__init = true;
> - scnprintf(mem_loads_name, sizeof(mem_loads_name),
> - e->name, perf_mem_events__loads_ldlat);
> + if (i == PERF_MEM_EVENTS__LOAD || i == PERF_MEM_EVENTS__LOAD_STORE) {
> + if (e->ldlat) {
> + if (!e->aux_event) {
> + /* ARM and Most of Intel */
> + scnprintf(mem_loads_name, sizeof(mem_loads_name),
> + e->name, pmu->name,
> + perf_mem_events__loads_ldlat);
> + } else {
> + /* Intel with mem-loads-aux event */
> + scnprintf(mem_loads_name, sizeof(mem_loads_name),
> + e->name, pmu->name, pmu->name,
> + perf_mem_events__loads_ldlat);
> + }
> + } else {
> + if (!e->aux_event) {
> + /* AMD and POWER */
> + scnprintf(mem_loads_name, sizeof(mem_loads_name),
> + e->name, pmu->name);
> + } else
> + return NULL;
> }
> +
> return mem_loads_name;
> }
>
> - return e->name;
> + if (i == PERF_MEM_EVENTS__STORE) {
> + scnprintf(mem_stores_name, sizeof(mem_stores_name),
> + e->name, pmu->name);
> + return mem_stores_name;
> + }
> +
> + return NULL;
> }
>
> __weak bool is_mem_loads_aux_event(struct evsel *leader __maybe_unused)
> @@ -175,7 +201,7 @@ void perf_pmu__mem_events_list(struct perf_pmu *pmu)
> e->tag ? 13 : 0,
> e->tag ? : "",
> e->tag && verbose > 0 ? 25 : 0,
> - e->tag && verbose > 0 ? perf_mem_events__name(j, NULL) : "",
> + e->tag && verbose > 0 ? perf_pmu__mem_events_name(j, pmu) : "",
> e->supported ? ": available\n" : "");
> }
> }
> @@ -198,15 +224,15 @@ int perf_mem_events__record_args(const char **rec_argv, int *argv_nr,
>
> if (!e->supported) {
> pr_err("failed: event '%s' not supported\n",
> - perf_mem_events__name(j, pmu->name));
> + perf_pmu__mem_events_name(j, pmu));
> return -1;
> }
>
> if (perf_pmus__num_mem_pmus() == 1) {
> rec_argv[i++] = "-e";
> - rec_argv[i++] = perf_mem_events__name(j, NULL);
> + rec_argv[i++] = perf_pmu__mem_events_name(j, pmu);
> } else {
> - const char *s = perf_mem_events__name(j, pmu->name);
> + const char *s = perf_pmu__mem_events_name(j, pmu);
>
> if (!perf_mem_event__supported(mnt, pmu, e))
> continue;
> diff --git a/tools/perf/util/mem-events.h b/tools/perf/util/mem-events.h
> index 0ad301a2e424..79d342768d12 100644
> --- a/tools/perf/util/mem-events.h
> +++ b/tools/perf/util/mem-events.h
> @@ -14,6 +14,8 @@
> struct perf_mem_event {
> bool record;
> bool supported;
> + bool ldlat;
> + u32 aux_event;
> const char *tag;
> const char *name;
> const char *sysfs_name;
> @@ -39,7 +41,6 @@ extern struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX];
> int perf_pmu__mem_events_parse(struct perf_pmu *pmu, const char *str);
> int perf_pmu__mem_events_init(struct perf_pmu *pmu);
>
> -const char *perf_mem_events__name(int i, const char *pmu_name);
> struct perf_mem_event *perf_pmu__mem_events_ptr(struct perf_pmu *pmu, int i);
> struct perf_pmu *perf_mem_events_find_pmu(void);
> bool is_mem_loads_aux_event(struct evsel *leader);

2024-01-16 14:06:28

by kajoljain

[permalink] [raw]
Subject: Re: [PATCH V3 0/7] Clean up perf mem



On 1/5/24 20:08, Liang, Kan wrote:
>
>
> On 2024-01-05 1:38 a.m., kajoljain wrote:
>>
>>
>> On 1/3/24 01:38, Liang, Kan wrote:
>>> Hi Kajol Jain
>>>
>>> On 2023-12-19 9:15 a.m., Liang, Kan wrote:
>>>>
>>>>
>>>> On 2023-12-19 4:26 a.m., kajoljain wrote:
>>>>> Hi,
>>>>> I was trying to test this patchset on powerpc.
>>>>>
>>>>> After applying it on top of acme's perf-tools-next branch, I am getting
>>>>> below error:
>>>>>
>>>>> INSTALL libsubcmd_headers
>>>>> INSTALL libperf_headers
>>>>> INSTALL libsymbol_headers
>>>>> INSTALL libapi_headers
>>>>> INSTALL libbpf_headers
>>>>> CC arch/powerpc/util/mem-events.o
>>>>> In file included from arch/powerpc/util/mem-events.c:3:
>>>>> arch/powerpc/util/mem-events.h:5:52: error: ‘PERF_MEM_EVENTS__MAX’
>>>>> undeclared here (not in a function)
>>>>> 5 | extern struct perf_mem_event
>>>>> perf_mem_events_power[PERF_MEM_EVENTS__MAX];
>>>>> |
>>>>> ^~~~~~~~~~~~~~~~~~~~
>>>>> make[6]: *** [/home/kajol/linux/tools/build/Makefile.build:105:
>>>>> arch/powerpc/util/mem-events.o] Error 1
>>>>> make[5]: *** [/home/kajol/linux/tools/build/Makefile.build:158: util]
>>>>> Error 2
>>>>> make[4]: *** [/home/kajol/linux/tools/build/Makefile.build:158: powerpc]
>>>>> Error 2
>>>>> make[3]: *** [/home/kajol/linux/tools/build/Makefile.build:158: arch]
>>>>> Error 2
>>>>> make[3]: *** Waiting for unfinished jobs....
>>>>> make[2]: *** [Makefile.perf:693: perf-in.o] Error 2
>>>>> make[1]: *** [Makefile.perf:251: sub-make] Error 2
>>>>> make: *** [Makefile:70: all] Error 2
>>>>>
>>>>> It seems some headerfiles are missing from arch/powerpc/util/mem-
>>>>> events.c
>>>>>
>>>>
>>>> Leo updated the headerfiles for ARM. https://termbin.com/0dkn
>>>>
>>>> I guess powerpc has to do the same thing. Could you please try the below
>>>> patch?
>>>
>>>
>>> Does the patch work on powerpc?
>>
>> Hi Kan,
>> Sorry I went for vacation so couldn't update. Yes this fix works.
>
> Thanks for the update.
>
>> But
>> we have another issue, actually this patch set changes uses ldlat
>> attribute. But ldlat is not supported in powerpc because of which perf
>> mem is failing in powerpc.
>
> For powerpc, the patch 3 introduced a perf_mem_events_power, which
> doesn't have ldlat. But it only be assigned to the pmu->is_core. I'm not
> sure if it's the problem.

Hi Kan,
Correct there were some small issues with patch 3, I added fix for that.

> Also, S390 still uses the default perf_mem_events, which includes ldlat.
> I'm not sure if S390 supports the ldlat.

I checked it, I didn't find ldlat parameter defined in arch/s390
directory. I think its better to make default ldlat value as false
in tools/perf/util/mem-events.c file.

Thanks,
Kajol Jain

>
> Thanks,
> Kan
>>
>> I am looking into a work around to fix this issue. I will update the fix.
>>
>> Thanks,
>> Kajol Jain
>>
>>
>>>
>>>
>>> Thanks,
>>> Kan
>>>>
>>>> diff --git a/tools/perf/arch/powerpc/util/mem-events.c
>>>> b/tools/perf/arch/powerpc/util/mem-events.c
>>>> index 72a6ac2b52f5..765d4a054b0a 100644
>>>> --- a/tools/perf/arch/powerpc/util/mem-events.c
>>>> +++ b/tools/perf/arch/powerpc/util/mem-events.c
>>>> @@ -1,5 +1,6 @@
>>>> // SPDX-License-Identifier: GPL-2.0
>>>> -#include "map_symbol.h"
>>>> +#include "util/map_symbol.h"
>>>> +#include "util/mem-events.h"
>>>> #include "mem-events.h"
>>>>
>>>> #define E(t, n, s, l, a) { .tag = t, .name = n, .event_name = s, .ldlat
>>>> = l, .aux_event = a }
>>>>
>>>> Thanks,
>>>> Kan
>>>>
>>>>> Thanks,
>>>>> Kajol Jain
>>>>>
>>>>> On 12/14/23 01:21, [email protected] wrote:
>>>>>> From: Kan Liang <[email protected]>
>>>>>>
>>>>>> Changes since V2:
>>>>>> - Fix the Arm64 building error (Leo)
>>>>>> - Add two new patches to clean up perf_mem_events__record_args()
>>>>>> and perf_pmus__num_mem_pmus() (Leo)
>>>>>>
>>>>>> Changes since V1:
>>>>>> - Fix strcmp of PMU name checking (Ravi)
>>>>>> - Fix "/," typo (Ian)
>>>>>> - Rename several functions with perf_pmu__mem_events prefix. (Ian)
>>>>>> - Fold the header removal patch into the patch where the cleanups made.
>>>>>> (Arnaldo)
>>>>>> - Add reviewed-by and tested-by from Ian and Ravi
>>>>>>
>>>>>> As discussed in the below thread, the patch set is to clean up perf mem.
>>>>>> https://lore.kernel.org/lkml/[email protected]/
>>>>>>
>>>>>> Introduce generic functions perf_mem_events__ptr(),
>>>>>> perf_mem_events__name() ,and is_mem_loads_aux_event() to replace the
>>>>>> ARCH specific ones.
>>>>>> Simplify the perf_mem_event__supported().
>>>>>>
>>>>>> Only keeps the ARCH-specific perf_mem_events array in the corresponding
>>>>>> mem-events.c for each ARCH.
>>>>>>
>>>>>> There is no functional change.
>>>>>>
>>>>>> The patch set touches almost all the ARCHs, Intel, AMD, ARM, Power and
>>>>>> etc. But I can only test it on two Intel platforms.
>>>>>> Please give it try, if you have machines with other ARCHs.
>>>>>>
>>>>>> Here are the test results:
>>>>>> Intel hybrid machine:
>>>>>>
>>>>>> $perf mem record -e list
>>>>>> ldlat-loads : available
>>>>>> ldlat-stores : available
>>>>>>
>>>>>> $perf mem record -e ldlat-loads -v --ldlat 50
>>>>>> calling: record -e cpu_atom/mem-loads,ldlat=50/P -e cpu_core/mem-loads,ldlat=50/P
>>>>>>
>>>>>> $perf mem record -v
>>>>>> calling: record -e cpu_atom/mem-loads,ldlat=30/P -e cpu_atom/mem-stores/P -e cpu_core/mem-loads,ldlat=30/P -e cpu_core/mem-stores/P
>>>>>>
>>>>>> $perf mem record -t store -v
>>>>>> calling: record -e cpu_atom/mem-stores/P -e cpu_core/mem-stores/P
>>>>>>
>>>>>>
>>>>>> Intel SPR:
>>>>>> $perf mem record -e list
>>>>>> ldlat-loads : available
>>>>>> ldlat-stores : available
>>>>>>
>>>>>> $perf mem record -e ldlat-loads -v --ldlat 50
>>>>>> calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=50/}:P
>>>>>>
>>>>>> $perf mem record -v
>>>>>> calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=30/}:P -e cpu/mem-stores/P
>>>>>>
>>>>>> $perf mem record -t store -v
>>>>>> calling: record -e cpu/mem-stores/P
>>>>>>
>>>>>> Kan Liang (7):
>>>>>> perf mem: Add mem_events into the supported perf_pmu
>>>>>> perf mem: Clean up perf_mem_events__ptr()
>>>>>> perf mem: Clean up perf_mem_events__name()
>>>>>> perf mem: Clean up perf_mem_event__supported()
>>>>>> perf mem: Clean up is_mem_loads_aux_event()
>>>>>> perf mem: Clean up perf_mem_events__record_args()
>>>>>> perf mem: Clean up perf_pmus__num_mem_pmus()
>>>>>>
>>>>>> tools/perf/arch/arm/util/pmu.c | 3 +
>>>>>> tools/perf/arch/arm64/util/mem-events.c | 39 +---
>>>>>> tools/perf/arch/arm64/util/mem-events.h | 7 +
>>>>>> tools/perf/arch/powerpc/util/mem-events.c | 13 +-
>>>>>> tools/perf/arch/powerpc/util/mem-events.h | 7 +
>>>>>> tools/perf/arch/powerpc/util/pmu.c | 11 ++
>>>>>> tools/perf/arch/s390/util/pmu.c | 3 +
>>>>>> tools/perf/arch/x86/util/mem-events.c | 99 ++--------
>>>>>> tools/perf/arch/x86/util/mem-events.h | 10 +
>>>>>> tools/perf/arch/x86/util/pmu.c | 19 +-
>>>>>> tools/perf/builtin-c2c.c | 45 ++---
>>>>>> tools/perf/builtin-mem.c | 48 ++---
>>>>>> tools/perf/util/mem-events.c | 217 +++++++++++++---------
>>>>>> tools/perf/util/mem-events.h | 19 +-
>>>>>> tools/perf/util/pmu.c | 4 +-
>>>>>> tools/perf/util/pmu.h | 7 +
>>>>>> tools/perf/util/pmus.c | 6 -
>>>>>> tools/perf/util/pmus.h | 1 -
>>>>>> 18 files changed, 278 insertions(+), 280 deletions(-)
>>>>>> create mode 100644 tools/perf/arch/arm64/util/mem-events.h
>>>>>> create mode 100644 tools/perf/arch/powerpc/util/mem-events.h
>>>>>> create mode 100644 tools/perf/arch/powerpc/util/pmu.c
>>>>>> create mode 100644 tools/perf/arch/x86/util/mem-events.h
>>>>>>
>>>>>
>>>>
>>
>

2024-01-16 16:37:28

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH V3 0/7] Clean up perf mem



On 2024-01-16 9:05 a.m., kajoljain wrote:
>> For powerpc, the patch 3 introduced a perf_mem_events_power, which
>> doesn't have ldlat. But it only be assigned to the pmu->is_core. I'm not
>> sure if it's the problem.
> Hi Kan,
> Correct there were some small issues with patch 3, I added fix for that.
>

Thanks Kajol Jain! I will fold your fix into V4.

>> Also, S390 still uses the default perf_mem_events, which includes ldlat.
>> I'm not sure if S390 supports the ldlat.
> I checked it, I didn't find ldlat parameter defined in arch/s390
> directory. I think its better to make default ldlat value as false
> in tools/perf/util/mem-events.c file.

The s390 may not be the only user for the default perf_mem_events[] in
the tools/perf/util/mem-events.c. We probably cannot change the default
value.
We may share the perf_mem_events_power[] between powerpc and s390. (We
did the similar share for arm and arm64.)

How about the below patch (not tested.)

diff --git a/tools/perf/arch/s390/util/pmu.c
b/tools/perf/arch/s390/util/pmu.c
index 225d7dc2379c..411034c984bb 100644
--- a/tools/perf/arch/s390/util/pmu.c
+++ b/tools/perf/arch/s390/util/pmu.c
@@ -8,6 +8,7 @@
#include <string.h>

#include "../../../util/pmu.h"
+#include "../../powerpc/util/mem-events.h"

#define S390_PMUPAI_CRYPTO "pai_crypto"
#define S390_PMUPAI_EXT "pai_ext"
@@ -21,5 +22,5 @@ void perf_pmu__arch_init(struct perf_pmu *pmu)
pmu->selectable = true;

if (pmu->is_core)
- pmu->mem_events = perf_mem_events;
+ pmu->mem_events = perf_mem_events_power;
}



However, the original s390 code doesn't include any s390 specific code
for perf_mem. So I thought it uses the default perf_mem_events[].
Is there something I missed?

Or does the s390 even support mem events? If not, I may remove the
mem_events from s390.

Thanks,
Kan

2024-01-23 05:31:05

by kajoljain

[permalink] [raw]
Subject: Re: [PATCH V3 0/7] Clean up perf mem



On 1/16/24 22:07, Liang, Kan wrote:
>
>
> On 2024-01-16 9:05 a.m., kajoljain wrote:
>>> For powerpc, the patch 3 introduced a perf_mem_events_power, which
>>> doesn't have ldlat. But it only be assigned to the pmu->is_core. I'm not
>>> sure if it's the problem.
>> Hi Kan,
>> Correct there were some small issues with patch 3, I added fix for that.
>>
>
> Thanks Kajol Jain! I will fold your fix into V4.
>
>>> Also, S390 still uses the default perf_mem_events, which includes ldlat.
>>> I'm not sure if S390 supports the ldlat.
>> I checked it, I didn't find ldlat parameter defined in arch/s390
>> directory. I think its better to make default ldlat value as false
>> in tools/perf/util/mem-events.c file.
>
> The s390 may not be the only user for the default perf_mem_events[] in
> the tools/perf/util/mem-events.c. We probably cannot change the default
> value.
> We may share the perf_mem_events_power[] between powerpc and s390. (We
> did the similar share for arm and arm64.)
>
> How about the below patch (not tested.)
>
> diff --git a/tools/perf/arch/s390/util/pmu.c
> b/tools/perf/arch/s390/util/pmu.c
> index 225d7dc2379c..411034c984bb 100644
> --- a/tools/perf/arch/s390/util/pmu.c
> +++ b/tools/perf/arch/s390/util/pmu.c
> @@ -8,6 +8,7 @@
> #include <string.h>
>
> #include "../../../util/pmu.h"
> +#include "../../powerpc/util/mem-events.h"
>
> #define S390_PMUPAI_CRYPTO "pai_crypto"
> #define S390_PMUPAI_EXT "pai_ext"
> @@ -21,5 +22,5 @@ void perf_pmu__arch_init(struct perf_pmu *pmu)
> pmu->selectable = true;
>
> if (pmu->is_core)
> - pmu->mem_events = perf_mem_events;
> + pmu->mem_events = perf_mem_events_power;
> }
>
>
>
> However, the original s390 code doesn't include any s390 specific code
> for perf_mem. So I thought it uses the default perf_mem_events[].
> Is there something I missed?
>
> Or does the s390 even support mem events? If not, I may remove the
> mem_events from s390.

Hi Kan,
I don't have s390 system to do testing. But from my end I am fine
with the changes.

Thanks,
Kajol Jain

>
> Thanks,
> Kan

2024-01-23 06:00:44

by Thomas Richter

[permalink] [raw]
Subject: Re: [PATCH V3 0/7] Clean up perf mem

On 1/23/24 06:30, kajoljain wrote:
>
>
> On 1/16/24 22:07, Liang, Kan wrote:
>>
>>
>> On 2024-01-16 9:05 a.m., kajoljain wrote:
>>>> For powerpc, the patch 3 introduced a perf_mem_events_power, which
>>>> doesn't have ldlat. But it only be assigned to the pmu->is_core. I'm not
>>>> sure if it's the problem.
>>> Hi Kan,
>>> Correct there were some small issues with patch 3, I added fix for that.
>>>
>>
>> Thanks Kajol Jain! I will fold your fix into V4.
>>
>>>> Also, S390 still uses the default perf_mem_events, which includes ldlat.
>>>> I'm not sure if S390 supports the ldlat.
>>> I checked it, I didn't find ldlat parameter defined in arch/s390
>>> directory. I think its better to make default ldlat value as false
>>> in tools/perf/util/mem-events.c file.
>>
>> The s390 may not be the only user for the default perf_mem_events[] in
>> the tools/perf/util/mem-events.c. We probably cannot change the default
>> value.
>> We may share the perf_mem_events_power[] between powerpc and s390. (We
>> did the similar share for arm and arm64.)
>>
>> How about the below patch (not tested.)
>>
>> diff --git a/tools/perf/arch/s390/util/pmu.c
>> b/tools/perf/arch/s390/util/pmu.c
>> index 225d7dc2379c..411034c984bb 100644
>> --- a/tools/perf/arch/s390/util/pmu.c
>> +++ b/tools/perf/arch/s390/util/pmu.c
>> @@ -8,6 +8,7 @@
>> #include <string.h>
>>
>> #include "../../../util/pmu.h"
>> +#include "../../powerpc/util/mem-events.h"
>>
>> #define S390_PMUPAI_CRYPTO "pai_crypto"
>> #define S390_PMUPAI_EXT "pai_ext"
>> @@ -21,5 +22,5 @@ void perf_pmu__arch_init(struct perf_pmu *pmu)
>> pmu->selectable = true;
>>
>> if (pmu->is_core)
>> - pmu->mem_events = perf_mem_events;
>> + pmu->mem_events = perf_mem_events_power;
>> }
>>
>>
>>
>> However, the original s390 code doesn't include any s390 specific code
>> for perf_mem. So I thought it uses the default perf_mem_events[].
>> Is there something I missed?
>>
>> Or does the s390 even support mem events? If not, I may remove the
>> mem_events from s390.
>
> Hi Kan,
> I don't have s390 system to do testing. But from my end I am fine
> with the changes.
>
> Thanks,
> Kajol Jain
>

s390 does not support perf mem at all. Right now it is save to remove it from s390.
Thanks

>>
>> Thanks,
>> Kan
>

--
Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
--
IBM Deutschland Research & Development GmbH

Vorsitzender des Aufsichtsrats: Wolfgang Wendt

Geschäftsführung: David Faller

Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294


2024-01-23 16:51:39

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH V3 0/7] Clean up perf mem



On 2024-01-23 12:56 a.m., Thomas Richter wrote:
> On 1/23/24 06:30, kajoljain wrote:
>>
>>
>> On 1/16/24 22:07, Liang, Kan wrote:
>>>
>>>
>>> On 2024-01-16 9:05 a.m., kajoljain wrote:
>>>>> For powerpc, the patch 3 introduced a perf_mem_events_power, which
>>>>> doesn't have ldlat. But it only be assigned to the pmu->is_core. I'm not
>>>>> sure if it's the problem.
>>>> Hi Kan,
>>>> Correct there were some small issues with patch 3, I added fix for that.
>>>>
>>>
>>> Thanks Kajol Jain! I will fold your fix into V4.
>>>
>>>>> Also, S390 still uses the default perf_mem_events, which includes ldlat.
>>>>> I'm not sure if S390 supports the ldlat.
>>>> I checked it, I didn't find ldlat parameter defined in arch/s390
>>>> directory. I think its better to make default ldlat value as false
>>>> in tools/perf/util/mem-events.c file.
>>>
>>> The s390 may not be the only user for the default perf_mem_events[] in
>>> the tools/perf/util/mem-events.c. We probably cannot change the default
>>> value.
>>> We may share the perf_mem_events_power[] between powerpc and s390. (We
>>> did the similar share for arm and arm64.)
>>>
>>> How about the below patch (not tested.)
>>>
>>> diff --git a/tools/perf/arch/s390/util/pmu.c
>>> b/tools/perf/arch/s390/util/pmu.c
>>> index 225d7dc2379c..411034c984bb 100644
>>> --- a/tools/perf/arch/s390/util/pmu.c
>>> +++ b/tools/perf/arch/s390/util/pmu.c
>>> @@ -8,6 +8,7 @@
>>> #include <string.h>
>>>
>>> #include "../../../util/pmu.h"
>>> +#include "../../powerpc/util/mem-events.h"
>>>
>>> #define S390_PMUPAI_CRYPTO "pai_crypto"
>>> #define S390_PMUPAI_EXT "pai_ext"
>>> @@ -21,5 +22,5 @@ void perf_pmu__arch_init(struct perf_pmu *pmu)
>>> pmu->selectable = true;
>>>
>>> if (pmu->is_core)
>>> - pmu->mem_events = perf_mem_events;
>>> + pmu->mem_events = perf_mem_events_power;
>>> }
>>>
>>>
>>>
>>> However, the original s390 code doesn't include any s390 specific code
>>> for perf_mem. So I thought it uses the default perf_mem_events[].
>>> Is there something I missed?
>>>
>>> Or does the s390 even support mem events? If not, I may remove the
>>> mem_events from s390.
>>
>> Hi Kan,
>> I don't have s390 system to do testing. But from my end I am fine
>> with the changes.
>>
>> Thanks,
>> Kajol Jain
>>
>
> s390 does not support perf mem at all. Right now it is save to remove it from s390.

Thanks for the confirmation!

Thanks,
Kan

> Thanks
>
>>>
>>> Thanks,
>>> Kan
>>
>