2022-05-23 07:39:51

by Ravi Bangoria

[permalink] [raw]
Subject: [PATCH v4 0/5] perf/amd: Zen4 IBS extensions support (tool changes)

Kernel side of changes have already been applied to tip/perf/core.
This series contains only perf tool changes.

v3: https://lore.kernel.org/lkml/[email protected]
v3->v4:
- Pmu name won't repeat for different core types on heterogeneous
systems. Remove core_type field from HEADER_PMU_CAPS.
- Include rational to change code besides header sync in patch #4
description.
- Add Acked-by Ian Rogers (patch #1 and #3).

Original cover letter:

IBS support has been enhanced with two new features in upcoming uarch:
1. DataSrc extension and 2. L3 Miss Filtering capability. Both are
indicated by CPUID_Fn8000001B_EAX bit 11.

DataSrc extension provides additional data source details for tagged
load/store operations. Add support for these new bits in perf report/
script raw-dump.

IBS L3 miss filtering works by tagging an instruction on IBS counter
overflow and generating an NMI if the tagged instruction causes an L3
miss. Samples without an L3 miss are discarded and counter is reset
with random value (between 1-15 for fetch pmu and 1-127 for op pmu).
This helps in reducing sampling overhead when user is interested only
in such samples. One of the use case of such filtered samples is to
feed data to page-migration daemon in tiered memory systems.

Add support for L3 miss filtering in IBS driver via new pmu attribute
"l3missonly". Example usage:

# perf record -a -e ibs_op/l3missonly=1/ --raw-samples sleep 5
# perf report -D

Some important points to keep in mind while using L3 miss filtering:
1. Hw internally reset sampling period when tagged instruction does
not cause L3 miss. But there is no way to reconstruct aggregated
sampling period when this happens.
2. L3 miss is not the actual event being counted. Rather, IBS will
count fetch, cycles or uOps depending on the configuration. Thus
sampling period have no direct connection to L3 misses.

1st causes sampling period skew. Thus, I've added warning message at
perf record:

# perf record -c 10000 -C 0 -e ibs_op/l3missonly=1/
WARNING: Hw internally resets sampling period when L3 Miss Filtering is enabled
and tagged operation does not cause L3 Miss. This causes sampling period skew.

User can configure smaller sampling period to get more samples while
using l3missonly.

Ravi Bangoria (5):
perf record ibs: Warn about sampling period skew
perf header: Parse non-cpu pmu capabilities
perf/x86/ibs: Add new IBS register bits into header
perf tool ibs: Sync amd ibs header file
perf script ibs: Support new IBS bits in raw trace dump

arch/x86/include/asm/amd-ibs.h | 16 +-
tools/arch/x86/include/asm/amd-ibs.h | 16 +-
.../Documentation/perf.data-file-format.txt | 17 ++
tools/perf/arch/x86/util/evsel.c | 50 +++++
tools/perf/util/amd-sample-raw.c | 68 ++++++-
tools/perf/util/env.c | 47 +++++
tools/perf/util/env.h | 10 +
tools/perf/util/evsel.c | 7 +
tools/perf/util/evsel.h | 1 +
tools/perf/util/header.c | 185 ++++++++++++++++++
tools/perf/util/header.h | 1 +
tools/perf/util/pmu.c | 15 +-
tools/perf/util/pmu.h | 2 +
13 files changed, 411 insertions(+), 24 deletions(-)

--
2.31.1



2022-05-23 07:44:13

by Ravi Bangoria

[permalink] [raw]
Subject: [PATCH v4 1/5] perf record ibs: Warn about sampling period skew

Samples without an L3 miss are discarded and counter is reset with
random value (between 1-15 for fetch pmu and 1-127 for op pmu) when
IBS L3 miss filtering is enabled. This causes a sampling period skew
but there is no way to reconstruct aggregated sampling period. So
print a warning at perf record if user sets l3missonly=1.

Ex:
# perf record -c 10000 -C 0 -e ibs_op/l3missonly=1/
WARNING: Hw internally resets sampling period when L3 Miss Filtering is enabled
and tagged operation does not cause L3 Miss. This causes sampling period skew.

Signed-off-by: Ravi Bangoria <[email protected]>
Acked-by: Ian Rogers <[email protected]>
---
tools/perf/arch/x86/util/evsel.c | 50 ++++++++++++++++++++++++++++++++
tools/perf/util/evsel.c | 7 +++++
tools/perf/util/evsel.h | 1 +
3 files changed, 58 insertions(+)

diff --git a/tools/perf/arch/x86/util/evsel.c b/tools/perf/arch/x86/util/evsel.c
index ac2899a25b7a..e8ff4ddb53c9 100644
--- a/tools/perf/arch/x86/util/evsel.c
+++ b/tools/perf/arch/x86/util/evsel.c
@@ -4,6 +4,8 @@
#include "util/evsel.h"
#include "util/env.h"
#include "linux/string.h"
+#include "util/pmu.h"
+#include "util/debug.h"

void arch_evsel__set_sample_weight(struct evsel *evsel)
{
@@ -29,3 +31,51 @@ void arch_evsel__fixup_new_cycles(struct perf_event_attr *attr)

free(env.cpuid);
}
+
+static void ibs_l3miss_warn(void)
+{
+ pr_warning(
+"WARNING: Hw internally resets sampling period when L3 Miss Filtering is enabled\n"
+"and tagged operation does not cause L3 Miss. This causes sampling period skew.\n");
+}
+
+void arch__post_evsel_config(struct evsel *evsel, struct perf_event_attr *attr)
+{
+ struct perf_pmu *evsel_pmu, *ibs_fetch_pmu, *ibs_op_pmu;
+ static int warned_once;
+ /* 0: Uninitialized, 1: Yes, -1: No */
+ static int is_amd;
+
+ if (warned_once || is_amd == -1)
+ return;
+
+ if (!is_amd) {
+ struct perf_env *env = evsel__env(evsel);
+
+ if (!perf_env__cpuid(env) || !env->cpuid ||
+ !strstarts(env->cpuid, "AuthenticAMD")) {
+ is_amd = -1;
+ return;
+ }
+ is_amd = 1;
+ }
+
+ evsel_pmu = evsel__find_pmu(evsel);
+ if (!evsel_pmu)
+ return;
+
+ ibs_fetch_pmu = perf_pmu__find("ibs_fetch");
+ ibs_op_pmu = perf_pmu__find("ibs_op");
+
+ if (ibs_fetch_pmu && ibs_fetch_pmu->type == evsel_pmu->type) {
+ if (attr->config & (1ULL << 59)) {
+ ibs_l3miss_warn();
+ warned_once = 1;
+ }
+ } else if (ibs_op_pmu && ibs_op_pmu->type == evsel_pmu->type) {
+ if (attr->config & (1ULL << 16)) {
+ ibs_l3miss_warn();
+ warned_once = 1;
+ }
+ }
+}
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 2a1729e7aee4..e9be9b94a062 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1064,6 +1064,11 @@ void __weak arch_evsel__fixup_new_cycles(struct perf_event_attr *attr __maybe_un
{
}

+void __weak arch__post_evsel_config(struct evsel *evsel __maybe_unused,
+ struct perf_event_attr *attr __maybe_unused)
+{
+}
+
static void evsel__set_default_freq_period(struct record_opts *opts,
struct perf_event_attr *attr)
{
@@ -1339,6 +1344,8 @@ void evsel__config(struct evsel *evsel, struct record_opts *opts,
*/
if (evsel__is_dummy_event(evsel))
evsel__reset_sample_bit(evsel, BRANCH_STACK);
+
+ arch__post_evsel_config(evsel, attr);
}

int evsel__set_filter(struct evsel *evsel, const char *filter)
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 041b42d33bf5..207de5082ee9 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -281,6 +281,7 @@ void evsel__set_sample_id(struct evsel *evsel, bool use_sample_identifier);

void arch_evsel__set_sample_weight(struct evsel *evsel);
void arch_evsel__fixup_new_cycles(struct perf_event_attr *attr);
+void arch__post_evsel_config(struct evsel *evsel, struct perf_event_attr *attr);

int evsel__set_filter(struct evsel *evsel, const char *filter);
int evsel__append_tp_filter(struct evsel *evsel, const char *filter);
--
2.31.1


2022-05-23 07:59:53

by Ravi Bangoria

[permalink] [raw]
Subject: [PATCH v4 2/5] perf header: Parse non-cpu pmu capabilities

Pmus advertise their capabilities via sysfs attribute files but
perf tool currently parses only core(cpu) pmu capabilities. Add
support for parsing non-cpu pmu capabilities.

Signed-off-by: Ravi Bangoria <[email protected]>
---
.../Documentation/perf.data-file-format.txt | 17 ++
tools/perf/util/env.c | 47 +++++
tools/perf/util/env.h | 10 +
tools/perf/util/header.c | 185 ++++++++++++++++++
tools/perf/util/header.h | 1 +
tools/perf/util/pmu.c | 15 +-
tools/perf/util/pmu.h | 2 +
7 files changed, 273 insertions(+), 4 deletions(-)

diff --git a/tools/perf/Documentation/perf.data-file-format.txt b/tools/perf/Documentation/perf.data-file-format.txt
index f56d0e0fbff6..64dddaecda21 100644
--- a/tools/perf/Documentation/perf.data-file-format.txt
+++ b/tools/perf/Documentation/perf.data-file-format.txt
@@ -435,6 +435,23 @@ struct {
} [nr_pmu];
};

+ HEADER_PMU_CAPS = 32,
+
+ List of pmu capabilities (except cpu pmu which is already
+ covered by HEADER_CPU_PMU_CAPS / HEADER_HYBRID_CPU_PMU_CAPS)
+
+struct {
+ u32 nr_pmus;
+ struct {
+ char pmu_name[];
+ u32 nr_caps;
+ struct {
+ char name[];
+ char value[];
+ } [nr_caps];
+ } [nr_pmus];
+};
+
other bits are reserved and should ignored for now
HEADER_FEAT_BITS = 256,

diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 579e44c59914..3ecdd7683e45 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -180,6 +180,7 @@ static void perf_env__purge_bpf(struct perf_env *env __maybe_unused)
void perf_env__exit(struct perf_env *env)
{
int i;
+ u32 j;

perf_env__purge_bpf(env);
perf_env__purge_cgroups(env);
@@ -222,6 +223,14 @@ void perf_env__exit(struct perf_env *env)
zfree(&env->hybrid_cpc_nodes[i].pmu_name);
}
zfree(&env->hybrid_cpc_nodes);
+
+ for (i = 0; i < env->nr_pmus_with_caps; i++) {
+ zfree(&env->env_pmu_caps[i].pmu_name);
+ for (j = 0; j < env->env_pmu_caps[i].nr_caps; j++)
+ zfree(&env->env_pmu_caps[i].pmu_caps[j]);
+ zfree(&env->env_pmu_caps[i].pmu_caps);
+ }
+ zfree(&env->env_pmu_caps);
}

void perf_env__init(struct perf_env *env)
@@ -527,3 +536,41 @@ int perf_env__numa_node(struct perf_env *env, struct perf_cpu cpu)

return cpu.cpu >= 0 && cpu.cpu < env->nr_numa_map ? env->numa_map[cpu.cpu] : -1;
}
+
+char *perf_env__find_pmu_cap(struct perf_env *env, const char *pmu_name,
+ const char *cap)
+{
+ struct env_pmu_caps *env_pmu_caps = env->env_pmu_caps;
+ char *cap_eq;
+ int cap_size;
+ char **ptr;
+ int i;
+ u32 j;
+
+ if (!pmu_name || !cap)
+ return NULL;
+
+ cap_size = strlen(cap);
+ cap_eq = zalloc(cap_size + 2);
+ if (!cap_eq)
+ return NULL;
+
+ memcpy(cap_eq, cap, cap_size);
+ cap_eq[cap_size] = '=';
+
+ for (i = 0; i < env->nr_pmus_with_caps; i++) {
+ if (strcmp(env_pmu_caps[i].pmu_name, pmu_name))
+ continue;
+
+ ptr = env_pmu_caps[i].pmu_caps;
+
+ for (j = 0; j < env_pmu_caps[i].nr_caps; j++) {
+ if (!strncmp(ptr[j], cap_eq, cap_size + 1)) {
+ free(cap_eq);
+ return &ptr[j][cap_size + 1];
+ }
+ }
+ }
+ free(cap_eq);
+ return NULL;
+}
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index a3541f98e1fc..a21d62ee9de8 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -50,6 +50,12 @@ struct hybrid_cpc_node {
char *pmu_name;
};

+struct env_pmu_caps {
+ char *pmu_name;
+ u32 nr_caps;
+ char **pmu_caps;
+};
+
struct perf_env {
char *hostname;
char *os_release;
@@ -75,6 +81,7 @@ struct perf_env {
int nr_cpu_pmu_caps;
int nr_hybrid_nodes;
int nr_hybrid_cpc_nodes;
+ int nr_pmus_with_caps;
char *cmdline;
const char **cmdline_argv;
char *sibling_cores;
@@ -95,6 +102,7 @@ struct perf_env {
unsigned long long memory_bsize;
struct hybrid_node *hybrid_nodes;
struct hybrid_cpc_node *hybrid_cpc_nodes;
+ struct env_pmu_caps *env_pmu_caps;
#ifdef HAVE_LIBBPF_SUPPORT
/*
* bpf_info_lock protects bpf rbtrees. This is needed because the
@@ -172,4 +180,6 @@ bool perf_env__insert_btf(struct perf_env *env, struct btf_node *btf_node);
struct btf_node *perf_env__find_btf(struct perf_env *env, __u32 btf_id);

int perf_env__numa_node(struct perf_env *env, struct perf_cpu cpu);
+char *perf_env__find_pmu_cap(struct perf_env *env, const char *pmu_name,
+ const char *cap);
#endif /* __PERF_ENV_H */
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index a27132e5a5ef..b4505dbb9f4a 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -1580,6 +1580,67 @@ static int write_hybrid_cpu_pmu_caps(struct feat_fd *ff,
return 0;
}

+/*
+ * File format:
+ *
+ * struct {
+ * u32 nr_pmus;
+ * struct {
+ * char pmu_name[];
+ * u32 nr_caps;
+ * struct {
+ * char name[];
+ * char value[];
+ * } [nr_caps];
+ * } [nr_pmus];
+ * };
+ */
+static int write_pmu_caps(struct feat_fd *ff, struct evlist *evlist __maybe_unused)
+{
+ struct perf_pmu_caps *caps = NULL;
+ struct perf_pmu *pmu = NULL;
+ u32 nr_pmus = 0;
+ int ret;
+
+ while ((pmu = perf_pmu__scan(pmu))) {
+ if (!pmu->name || !strncmp(pmu->name, "cpu", 3) ||
+ perf_pmu__caps_parse(pmu) <= 0)
+ continue;
+ nr_pmus++;
+ }
+
+ ret = do_write(ff, &nr_pmus, sizeof(nr_pmus));
+ if (ret < 0)
+ return ret;
+
+ if (!nr_pmus)
+ return 0;
+
+ while ((pmu = perf_pmu__scan(pmu))) {
+ if (!pmu->name || !strncmp(pmu->name, "cpu", 3) || !pmu->nr_caps)
+ continue;
+
+ ret = do_write_string(ff, pmu->name);
+ if (ret < 0)
+ return ret;
+
+ ret = do_write(ff, &pmu->nr_caps, sizeof(pmu->nr_caps));
+ if (ret < 0)
+ return ret;
+
+ list_for_each_entry(caps, &pmu->caps, list) {
+ ret = do_write_string(ff, caps->name);
+ if (ret < 0)
+ return ret;
+
+ ret = do_write_string(ff, caps->value);
+ if (ret < 0)
+ return ret;
+ }
+ }
+ return 0;
+}
+
static void print_hostname(struct feat_fd *ff, FILE *fp)
{
fprintf(fp, "# hostname : %s\n", ff->ph->env.hostname);
@@ -2209,6 +2270,32 @@ static void print_mem_topology(struct feat_fd *ff, FILE *fp)
}
}

+static void print_pmu_caps(struct feat_fd *ff, FILE *fp)
+{
+ struct env_pmu_caps *env_pmu_caps = ff->ph->env.env_pmu_caps;
+ int nr_pmus_with_caps = ff->ph->env.nr_pmus_with_caps;
+ const char *delimiter = "";
+ char **ptr;
+ int i;
+ u32 j;
+
+ if (!nr_pmus_with_caps)
+ return;
+
+ for (i = 0; i < nr_pmus_with_caps; i++) {
+ fprintf(fp, "# %s pmu capabilities: ", env_pmu_caps[i].pmu_name);
+
+ ptr = env_pmu_caps[i].pmu_caps;
+
+ delimiter = "";
+ for (j = 0; j < env_pmu_caps[i].nr_caps; j++) {
+ fprintf(fp, "%s%s", delimiter, ptr[j]);
+ delimiter = ", ";
+ }
+ fprintf(fp, "\n");
+ }
+}
+
static int __event_process_build_id(struct perf_record_header_build_id *bev,
char *filename,
struct perf_session *session)
@@ -3319,6 +3406,103 @@ static int process_hybrid_cpu_pmu_caps(struct feat_fd *ff,
return ret;
}

+static int __process_pmu_caps(struct feat_fd *ff, struct env_pmu_caps *env_pmu_caps)
+{
+ u32 nr_caps = env_pmu_caps->nr_caps;
+ int name_size, value_size;
+ char *name, *value, *ptr;
+ u32 i;
+
+ env_pmu_caps->pmu_caps = zalloc(sizeof(char *) * nr_caps);
+ if (!env_pmu_caps->pmu_caps)
+ return -1;
+
+ for (i = 0; i < nr_caps; i++) {
+ name = do_read_string(ff);
+ if (!name)
+ goto error;
+
+ value = do_read_string(ff);
+ if (!value)
+ goto free_name;
+
+ name_size = strlen(name);
+ value_size = strlen(value);
+ ptr = zalloc(sizeof(char) * (name_size + value_size + 2));
+ if (!ptr)
+ goto free_value;
+
+ memcpy(ptr, name, name_size);
+ ptr[name_size] = '=';
+ memcpy(ptr + name_size + 1, value, value_size);
+ env_pmu_caps->pmu_caps[i] = ptr;
+
+ free(value);
+ free(name);
+ }
+ return 0;
+
+free_value:
+ free(value);
+free_name:
+ free(name);
+error:
+ for (; i > 0; i--)
+ free(env_pmu_caps->pmu_caps[i - 1]);
+ free(env_pmu_caps->pmu_caps);
+ return -1;
+}
+
+static int process_pmu_caps(struct feat_fd *ff, void *data __maybe_unused)
+{
+ struct env_pmu_caps *env_pmu_caps;
+ u32 nr_pmus;
+ u32 i, j;
+
+ ff->ph->env.nr_pmus_with_caps = 0;
+ ff->ph->env.env_pmu_caps = NULL;
+
+ if (do_read_u32(ff, &nr_pmus))
+ return -1;
+
+ if (!nr_pmus)
+ return 0;
+
+ env_pmu_caps = zalloc(sizeof(struct env_pmu_caps) * nr_pmus);
+ if (!env_pmu_caps)
+ return -ENOMEM;
+
+ for (i = 0; i < nr_pmus; i++) {
+ env_pmu_caps[i].pmu_name = do_read_string(ff);
+ if (!env_pmu_caps[i].pmu_name)
+ goto error;
+
+ if (do_read_u32(ff, &env_pmu_caps[i].nr_caps))
+ goto free_pmu_name;
+
+ if (!__process_pmu_caps(ff, &env_pmu_caps[i]))
+ continue;
+
+free_pmu_name:
+ free(env_pmu_caps[i].pmu_name);
+ goto error;
+ }
+
+ ff->ph->env.nr_pmus_with_caps = nr_pmus;
+ ff->ph->env.env_pmu_caps = env_pmu_caps;
+ return 0;
+
+error:
+ for (; i > 0; i--) {
+ free(env_pmu_caps[i - 1].pmu_name);
+ for (j = 0; j < env_pmu_caps[i - 1].nr_caps; j++)
+ free(env_pmu_caps[i - 1].pmu_caps[j]);
+ free(env_pmu_caps[i - 1].pmu_caps);
+ }
+ free(env_pmu_caps);
+ return -1;
+}
+
#define FEAT_OPR(n, func, __full_only) \
[HEADER_##n] = { \
.name = __stringify(n), \
@@ -3382,6 +3566,7 @@ const struct perf_header_feature_ops feat_ops[HEADER_LAST_FEATURE] = {
FEAT_OPR(CLOCK_DATA, clock_data, false),
FEAT_OPN(HYBRID_TOPOLOGY, hybrid_topology, true),
FEAT_OPR(HYBRID_CPU_PMU_CAPS, hybrid_cpu_pmu_caps, false),
+ FEAT_OPR(PMU_CAPS, pmu_caps, false),
};

struct header_print_data {
diff --git a/tools/perf/util/header.h b/tools/perf/util/header.h
index 0eb4bc29a5a4..e9a067bb8b9e 100644
--- a/tools/perf/util/header.h
+++ b/tools/perf/util/header.h
@@ -47,6 +47,7 @@ enum {
HEADER_CLOCK_DATA,
HEADER_HYBRID_TOPOLOGY,
HEADER_HYBRID_CPU_PMU_CAPS,
+ HEADER_PMU_CAPS,
HEADER_LAST_FEATURE,
HEADER_FEAT_BITS = 256,
};
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 9a1c7e63e663..8d599acb7569 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -1890,16 +1890,22 @@ int perf_pmu__caps_parse(struct perf_pmu *pmu)
const char *sysfs = sysfs__mountpoint();
DIR *caps_dir;
struct dirent *evt_ent;
- int nr_caps = 0;
+
+ if (pmu->caps_initialized)
+ return pmu->nr_caps;

if (!sysfs)
return -1;

+ pmu->nr_caps = 0;
+
snprintf(caps_path, PATH_MAX,
"%s" EVENT_SOURCE_DEVICE_PATH "%s/caps", sysfs, pmu->name);

- if (stat(caps_path, &st) < 0)
+ if (stat(caps_path, &st) < 0) {
+ pmu->caps_initialized = true;
return 0; /* no error if caps does not exist */
+ }

caps_dir = opendir(caps_path);
if (!caps_dir)
@@ -1926,13 +1932,14 @@ int perf_pmu__caps_parse(struct perf_pmu *pmu)
continue;
}

- nr_caps++;
+ pmu->nr_caps++;
fclose(file);
}

closedir(caps_dir);

- return nr_caps;
+ pmu->caps_initialized = true;
+ return pmu->nr_caps;
}

void perf_pmu__warn_invalid_config(struct perf_pmu *pmu, __u64 config,
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index 541889fa9f9c..4b45fd8da5a3 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -46,6 +46,8 @@ struct perf_pmu {
struct perf_cpu_map *cpus;
struct list_head format; /* HEAD struct perf_pmu_format -> list */
struct list_head aliases; /* HEAD struct perf_pmu_alias -> list */
+ bool caps_initialized;
+ u32 nr_caps;
struct list_head caps; /* HEAD struct perf_pmu_caps -> list */
struct list_head list; /* ELEM */
struct list_head hybrid_list;
--
2.31.1


2022-05-23 08:15:06

by Ravi Bangoria

[permalink] [raw]
Subject: [PATCH v4 4/5] perf tool ibs: Sync amd ibs header file

IBS support has been enhanced with two new features in upcoming uarch:
1. DataSrc extension and 2. L3 miss filtering. Additional set of bits
has been introduced in IBS registers to exploit these features.
New bits are already defining in arch/x86/ header. Sync it with tools
header file. Also rename existing ibs_op_data field 'data_src' to
'data_src_lo'.

Signed-off-by: Ravi Bangoria <[email protected]>
---
tools/arch/x86/include/asm/amd-ibs.h | 16 ++++++++++------
tools/perf/util/amd-sample-raw.c | 4 ++--
2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/tools/arch/x86/include/asm/amd-ibs.h b/tools/arch/x86/include/asm/amd-ibs.h
index 765e9e752d03..9a3312e12e2e 100644
--- a/tools/arch/x86/include/asm/amd-ibs.h
+++ b/tools/arch/x86/include/asm/amd-ibs.h
@@ -29,7 +29,10 @@ union ibs_fetch_ctl {
rand_en:1, /* 57: random tagging enable */
fetch_l2_miss:1,/* 58: L2 miss for sampled fetch
* (needs IbsFetchComp) */
- reserved:5; /* 59-63: reserved */
+ l3_miss_only:1, /* 59: Collect L3 miss samples only */
+ fetch_oc_miss:1,/* 60: Op cache miss for the sampled fetch */
+ fetch_l3_miss:1,/* 61: L3 cache miss for the sampled fetch */
+ reserved:2; /* 62-63: reserved */
};
};

@@ -38,14 +41,14 @@ union ibs_op_ctl {
__u64 val;
struct {
__u64 opmaxcnt:16, /* 0-15: periodic op max. count */
- reserved0:1, /* 16: reserved */
+ l3_miss_only:1, /* 16: Collect L3 miss samples only */
op_en:1, /* 17: op sampling enable */
op_val:1, /* 18: op sample valid */
cnt_ctl:1, /* 19: periodic op counter control */
opmaxcnt_ext:7, /* 20-26: upper 7 bits of periodic op maximum count */
- reserved1:5, /* 27-31: reserved */
+ reserved0:5, /* 27-31: reserved */
opcurcnt:27, /* 32-58: periodic op counter current count */
- reserved2:5; /* 59-63: reserved */
+ reserved1:5; /* 59-63: reserved */
};
};

@@ -71,11 +74,12 @@ union ibs_op_data {
union ibs_op_data2 {
__u64 val;
struct {
- __u64 data_src:3, /* 0-2: data source */
+ __u64 data_src_lo:3, /* 0-2: data source low */
reserved0:1, /* 3: reserved */
rmt_node:1, /* 4: destination node */
cache_hit_st:1, /* 5: cache hit state */
- reserved1:57; /* 5-63: reserved */
+ data_src_hi:2, /* 6-7: data source high */
+ reserved1:56; /* 8-63: reserved */
};
};

diff --git a/tools/perf/util/amd-sample-raw.c b/tools/perf/util/amd-sample-raw.c
index d19d765195c5..3b623ea6ee7e 100644
--- a/tools/perf/util/amd-sample-raw.c
+++ b/tools/perf/util/amd-sample-raw.c
@@ -98,9 +98,9 @@ static void pr_ibs_op_data2(union ibs_op_data2 reg)
};

printf("ibs_op_data2:\t%016llx %sRmtNode %d%s\n", reg.val,
- reg.data_src == 2 ? (reg.cache_hit_st ? "CacheHitSt 1=O-State "
+ reg.data_src_lo == 2 ? (reg.cache_hit_st ? "CacheHitSt 1=O-State "
: "CacheHitSt 0=M-state ") : "",
- reg.rmt_node, data_src_str[reg.data_src]);
+ reg.rmt_node, data_src_str[reg.data_src_lo]);
}

static void pr_ibs_op_data3(union ibs_op_data3 reg)
--
2.31.1


2022-05-23 09:04:52

by Jiri Olsa

[permalink] [raw]
Subject: Re: [PATCH v4 2/5] perf header: Parse non-cpu pmu capabilities

On Mon, May 23, 2022 at 09:09:42AM +0530, Ravi Bangoria wrote:

SNIP

> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> index a27132e5a5ef..b4505dbb9f4a 100644
> --- a/tools/perf/util/header.c
> +++ b/tools/perf/util/header.c
> @@ -1580,6 +1580,67 @@ static int write_hybrid_cpu_pmu_caps(struct feat_fd *ff,
> return 0;
> }
>
> +/*
> + * File format:
> + *
> + * struct {
> + * u32 nr_pmus;
> + * struct {
> + * char pmu_name[];
> + * u32 nr_caps;
> + * struct {
> + * char name[];
> + * char value[];
> + * } [nr_caps];
> + * } [nr_pmus];
> + * };
> + */
> +static int write_pmu_caps(struct feat_fd *ff, struct evlist *evlist __maybe_unused)
> +{
> + struct perf_pmu_caps *caps = NULL;
> + struct perf_pmu *pmu = NULL;
> + u32 nr_pmus = 0;
> + int ret;
> +
> + while ((pmu = perf_pmu__scan(pmu))) {
> + if (!pmu->name || !strncmp(pmu->name, "cpu", 3) ||

should we check for the hybrid names as well?
there's a helper perf_pmu__is_hybrid,
aybe you can use that

jirka


> + perf_pmu__caps_parse(pmu) <= 0)
> + continue;
> + nr_pmus++;
> + }
> +
> + ret = do_write(ff, &nr_pmus, sizeof(nr_pmus));
> + if (ret < 0)
> + return ret;
> +
> + if (!nr_pmus)
> + return 0;
> +
> + while ((pmu = perf_pmu__scan(pmu))) {
> + if (!pmu->name || !strncmp(pmu->name, "cpu", 3) || !pmu->nr_caps)
> + continue;
> +
> + ret = do_write_string(ff, pmu->name);
> + if (ret < 0)
> + return ret;
> +
> + ret = do_write(ff, &pmu->nr_caps, sizeof(pmu->nr_caps));
> + if (ret < 0)
> + return ret;
> +
> + list_for_each_entry(caps, &pmu->caps, list) {
> + ret = do_write_string(ff, caps->name);
> + if (ret < 0)
> + return ret;
> +
> + ret = do_write_string(ff, caps->value);
> + if (ret < 0)
> + return ret;
> + }
> + }
> + return 0;
> +}
> +
> static void print_hostname(struct feat_fd *ff, FILE *fp)
> {
> fprintf(fp, "# hostname : %s\n", ff->ph->env.hostname);
> @@ -2209,6 +2270,32 @@ static void print_mem_topology(struct feat_fd *ff, FILE *fp)
> }
> }
>
> +static void print_pmu_caps(struct feat_fd *ff, FILE *fp)
> +{
> + struct env_pmu_caps *env_pmu_caps = ff->ph->env.env_pmu_caps;
> + int nr_pmus_with_caps = ff->ph->env.nr_pmus_with_caps;
> + const char *delimiter = "";
> + char **ptr;
> + int i;
> + u32 j;
> +
> + if (!nr_pmus_with_caps)
> + return;
> +
> + for (i = 0; i < nr_pmus_with_caps; i++) {
> + fprintf(fp, "# %s pmu capabilities: ", env_pmu_caps[i].pmu_name);
> +
> + ptr = env_pmu_caps[i].pmu_caps;
> +
> + delimiter = "";
> + for (j = 0; j < env_pmu_caps[i].nr_caps; j++) {
> + fprintf(fp, "%s%s", delimiter, ptr[j]);
> + delimiter = ", ";
> + }
> + fprintf(fp, "\n");
> + }
> +}
> +
> static int __event_process_build_id(struct perf_record_header_build_id *bev,
> char *filename,
> struct perf_session *session)
> @@ -3319,6 +3406,103 @@ static int process_hybrid_cpu_pmu_caps(struct feat_fd *ff,
> return ret;
> }
>
> +static int __process_pmu_caps(struct feat_fd *ff, struct env_pmu_caps *env_pmu_caps)
> +{
> + u32 nr_caps = env_pmu_caps->nr_caps;
> + int name_size, value_size;
> + char *name, *value, *ptr;
> + u32 i;
> +
> + env_pmu_caps->pmu_caps = zalloc(sizeof(char *) * nr_caps);
> + if (!env_pmu_caps->pmu_caps)
> + return -1;
> +
> + for (i = 0; i < nr_caps; i++) {
> + name = do_read_string(ff);
> + if (!name)
> + goto error;
> +
> + value = do_read_string(ff);
> + if (!value)
> + goto free_name;
> +
> + name_size = strlen(name);
> + value_size = strlen(value);
> + ptr = zalloc(sizeof(char) * (name_size + value_size + 2));
> + if (!ptr)
> + goto free_value;
> +
> + memcpy(ptr, name, name_size);
> + ptr[name_size] = '=';
> + memcpy(ptr + name_size + 1, value, value_size);
> + env_pmu_caps->pmu_caps[i] = ptr;
> +
> + free(value);
> + free(name);
> + }
> + return 0;
> +
> +free_value:
> + free(value);
> +free_name:
> + free(name);
> +error:
> + for (; i > 0; i--)
> + free(env_pmu_caps->pmu_caps[i - 1]);
> + free(env_pmu_caps->pmu_caps);
> + return -1;
> +}
> +
> +static int process_pmu_caps(struct feat_fd *ff, void *data __maybe_unused)
> +{
> + struct env_pmu_caps *env_pmu_caps;
> + u32 nr_pmus;
> + u32 i, j;
> +
> + ff->ph->env.nr_pmus_with_caps = 0;
> + ff->ph->env.env_pmu_caps = NULL;
> +
> + if (do_read_u32(ff, &nr_pmus))
> + return -1;
> +
> + if (!nr_pmus)
> + return 0;
> +
> + env_pmu_caps = zalloc(sizeof(struct env_pmu_caps) * nr_pmus);
> + if (!env_pmu_caps)
> + return -ENOMEM;
> +
> + for (i = 0; i < nr_pmus; i++) {
> + env_pmu_caps[i].pmu_name = do_read_string(ff);
> + if (!env_pmu_caps[i].pmu_name)
> + goto error;
> +
> + if (do_read_u32(ff, &env_pmu_caps[i].nr_caps))
> + goto free_pmu_name;
> +
> + if (!__process_pmu_caps(ff, &env_pmu_caps[i]))
> + continue;
> +
> +free_pmu_name:
> + free(env_pmu_caps[i].pmu_name);
> + goto error;
> + }
> +
> + ff->ph->env.nr_pmus_with_caps = nr_pmus;
> + ff->ph->env.env_pmu_caps = env_pmu_caps;
> + return 0;
> +
> +error:
> + for (; i > 0; i--) {
> + free(env_pmu_caps[i - 1].pmu_name);
> + for (j = 0; j < env_pmu_caps[i - 1].nr_caps; j++)
> + free(env_pmu_caps[i - 1].pmu_caps[j]);
> + free(env_pmu_caps[i - 1].pmu_caps);
> + }
> + free(env_pmu_caps);
> + return -1;
> +}
> +
> #define FEAT_OPR(n, func, __full_only) \
> [HEADER_##n] = { \
> .name = __stringify(n), \
> @@ -3382,6 +3566,7 @@ const struct perf_header_feature_ops feat_ops[HEADER_LAST_FEATURE] = {
> FEAT_OPR(CLOCK_DATA, clock_data, false),
> FEAT_OPN(HYBRID_TOPOLOGY, hybrid_topology, true),
> FEAT_OPR(HYBRID_CPU_PMU_CAPS, hybrid_cpu_pmu_caps, false),
> + FEAT_OPR(PMU_CAPS, pmu_caps, false),
> };
>
> struct header_print_data {
> diff --git a/tools/perf/util/header.h b/tools/perf/util/header.h
> index 0eb4bc29a5a4..e9a067bb8b9e 100644
> --- a/tools/perf/util/header.h
> +++ b/tools/perf/util/header.h
> @@ -47,6 +47,7 @@ enum {
> HEADER_CLOCK_DATA,
> HEADER_HYBRID_TOPOLOGY,
> HEADER_HYBRID_CPU_PMU_CAPS,
> + HEADER_PMU_CAPS,
> HEADER_LAST_FEATURE,
> HEADER_FEAT_BITS = 256,
> };
> diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
> index 9a1c7e63e663..8d599acb7569 100644
> --- a/tools/perf/util/pmu.c
> +++ b/tools/perf/util/pmu.c
> @@ -1890,16 +1890,22 @@ int perf_pmu__caps_parse(struct perf_pmu *pmu)
> const char *sysfs = sysfs__mountpoint();
> DIR *caps_dir;
> struct dirent *evt_ent;
> - int nr_caps = 0;
> +
> + if (pmu->caps_initialized)
> + return pmu->nr_caps;
>
> if (!sysfs)
> return -1;
>
> + pmu->nr_caps = 0;
> +
> snprintf(caps_path, PATH_MAX,
> "%s" EVENT_SOURCE_DEVICE_PATH "%s/caps", sysfs, pmu->name);
>
> - if (stat(caps_path, &st) < 0)
> + if (stat(caps_path, &st) < 0) {
> + pmu->caps_initialized = true;
> return 0; /* no error if caps does not exist */
> + }
>
> caps_dir = opendir(caps_path);
> if (!caps_dir)
> @@ -1926,13 +1932,14 @@ int perf_pmu__caps_parse(struct perf_pmu *pmu)
> continue;
> }
>
> - nr_caps++;
> + pmu->nr_caps++;
> fclose(file);
> }
>
> closedir(caps_dir);
>
> - return nr_caps;
> + pmu->caps_initialized = true;
> + return pmu->nr_caps;
> }
>
> void perf_pmu__warn_invalid_config(struct perf_pmu *pmu, __u64 config,
> diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
> index 541889fa9f9c..4b45fd8da5a3 100644
> --- a/tools/perf/util/pmu.h
> +++ b/tools/perf/util/pmu.h
> @@ -46,6 +46,8 @@ struct perf_pmu {
> struct perf_cpu_map *cpus;
> struct list_head format; /* HEAD struct perf_pmu_format -> list */
> struct list_head aliases; /* HEAD struct perf_pmu_alias -> list */
> + bool caps_initialized;
> + u32 nr_caps;
> struct list_head caps; /* HEAD struct perf_pmu_caps -> list */
> struct list_head list; /* ELEM */
> struct list_head hybrid_list;
> --
> 2.31.1
>

2022-05-23 13:45:10

by Ravi Bangoria

[permalink] [raw]
Subject: Re: [PATCH v4 2/5] perf header: Parse non-cpu pmu capabilities

On 23-May-22 2:34 PM, Jiri Olsa wrote:
> On Mon, May 23, 2022 at 09:09:42AM +0530, Ravi Bangoria wrote:
>
> SNIP
>
>> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
>> index a27132e5a5ef..b4505dbb9f4a 100644
>> --- a/tools/perf/util/header.c
>> +++ b/tools/perf/util/header.c
>> @@ -1580,6 +1580,67 @@ static int write_hybrid_cpu_pmu_caps(struct feat_fd *ff,
>> return 0;
>> }
>>
>> +/*
>> + * File format:
>> + *
>> + * struct {
>> + * u32 nr_pmus;
>> + * struct {
>> + * char pmu_name[];
>> + * u32 nr_caps;
>> + * struct {
>> + * char name[];
>> + * char value[];
>> + * } [nr_caps];
>> + * } [nr_pmus];
>> + * };
>> + */
>> +static int write_pmu_caps(struct feat_fd *ff, struct evlist *evlist __maybe_unused)
>> +{
>> + struct perf_pmu_caps *caps = NULL;
>> + struct perf_pmu *pmu = NULL;
>> + u32 nr_pmus = 0;
>> + int ret;
>> +
>> + while ((pmu = perf_pmu__scan(pmu))) {
>> + if (!pmu->name || !strncmp(pmu->name, "cpu", 3) ||
>
> should we check for the hybrid names as well?
> there's a helper perf_pmu__is_hybrid,
> aybe you can use that

Yup. Will include those.

Thanks,
Ravi

2022-05-25 19:53:42

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH v4 2/5] perf header: Parse non-cpu pmu capabilities



On 5/22/2022 11:39 PM, Ravi Bangoria wrote:
> Pmus advertise their capabilities via sysfs attribute files but
> perf tool currently parses only core(cpu) pmu capabilities. Add
> support for parsing non-cpu pmu capabilities.
>
> Signed-off-by: Ravi Bangoria <[email protected]>
> ---
> .../Documentation/perf.data-file-format.txt | 17 ++
> tools/perf/util/env.c | 47 +++++
> tools/perf/util/env.h | 10 +
> tools/perf/util/header.c | 185 ++++++++++++++++++
> tools/perf/util/header.h | 1 +
> tools/perf/util/pmu.c | 15 +-
> tools/perf/util/pmu.h | 2 +
> 7 files changed, 273 insertions(+), 4 deletions(-)
>
> diff --git a/tools/perf/Documentation/perf.data-file-format.txt b/tools/perf/Documentation/perf.data-file-format.txt
> index f56d0e0fbff6..64dddaecda21 100644
> --- a/tools/perf/Documentation/perf.data-file-format.txt
> +++ b/tools/perf/Documentation/perf.data-file-format.txt
> @@ -435,6 +435,23 @@ struct {
> } [nr_pmu];
> };
>
> + HEADER_PMU_CAPS = 32,
> +
> + List of pmu capabilities (except cpu pmu which is already
> + covered by HEADER_CPU_PMU_CAPS / HEADER_HYBRID_CPU_PMU_CAPS)
> +
> +struct {
> + u32 nr_pmus;
> + struct {
> + char pmu_name[];
> + u32 nr_caps;
> + struct {
> + char name[];
> + char value[];
> + } [nr_caps];
> + } [nr_pmus];
> +};
> +
> other bits are reserved and should ignored for now
> HEADER_FEAT_BITS = 256,
>
> diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
> index 579e44c59914..3ecdd7683e45 100644
> --- a/tools/perf/util/env.c
> +++ b/tools/perf/util/env.c
> @@ -180,6 +180,7 @@ static void perf_env__purge_bpf(struct perf_env *env __maybe_unused)
> void perf_env__exit(struct perf_env *env)
> {
> int i;
> + u32 j;
>
> perf_env__purge_bpf(env);
> perf_env__purge_cgroups(env);
> @@ -222,6 +223,14 @@ void perf_env__exit(struct perf_env *env)
> zfree(&env->hybrid_cpc_nodes[i].pmu_name);
> }
> zfree(&env->hybrid_cpc_nodes);
> +
> + for (i = 0; i < env->nr_pmus_with_caps; i++) {
> + zfree(&env->env_pmu_caps[i].pmu_name);
> + for (j = 0; j < env->env_pmu_caps[i].nr_caps; j++)
> + zfree(&env->env_pmu_caps[i].pmu_caps[j]);
> + zfree(&env->env_pmu_caps[i].pmu_caps);
> + }
> + zfree(&env->env_pmu_caps);
> }
>
> void perf_env__init(struct perf_env *env)
> @@ -527,3 +536,41 @@ int perf_env__numa_node(struct perf_env *env, struct perf_cpu cpu)
>
> return cpu.cpu >= 0 && cpu.cpu < env->nr_numa_map ? env->numa_map[cpu.cpu] : -1;
> }
> +
> +char *perf_env__find_pmu_cap(struct perf_env *env, const char *pmu_name,
> + const char *cap)
> +{
> + struct env_pmu_caps *env_pmu_caps = env->env_pmu_caps;
> + char *cap_eq;
> + int cap_size;
> + char **ptr;
> + int i;
> + u32 j;
> +
> + if (!pmu_name || !cap)
> + return NULL;
> +
> + cap_size = strlen(cap);
> + cap_eq = zalloc(cap_size + 2);
> + if (!cap_eq)
> + return NULL;
> +
> + memcpy(cap_eq, cap, cap_size);
> + cap_eq[cap_size] = '=';
> +
> + for (i = 0; i < env->nr_pmus_with_caps; i++) {
> + if (strcmp(env_pmu_caps[i].pmu_name, pmu_name))
> + continue;
> +
> + ptr = env_pmu_caps[i].pmu_caps;
> +
> + for (j = 0; j < env_pmu_caps[i].nr_caps; j++) {
> + if (!strncmp(ptr[j], cap_eq, cap_size + 1)) {
> + free(cap_eq);
> + return &ptr[j][cap_size + 1];
> + }
> + }
> + }
> + free(cap_eq);
> + return NULL;
> +}
> diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
> index a3541f98e1fc..a21d62ee9de8 100644
> --- a/tools/perf/util/env.h
> +++ b/tools/perf/util/env.h
> @@ -50,6 +50,12 @@ struct hybrid_cpc_node {
> char *pmu_name;
> };
>
> +struct env_pmu_caps {
> + char *pmu_name;
> + u32 nr_caps;
> + char **pmu_caps;
> +};
> +
> struct perf_env {
> char *hostname;
> char *os_release;
> @@ -75,6 +81,7 @@ struct perf_env {
> int nr_cpu_pmu_caps;
> int nr_hybrid_nodes;
> int nr_hybrid_cpc_nodes;
> + int nr_pmus_with_caps;
> char *cmdline;
> const char **cmdline_argv;
> char *sibling_cores;
> @@ -95,6 +102,7 @@ struct perf_env {
> unsigned long long memory_bsize;
> struct hybrid_node *hybrid_nodes;
> struct hybrid_cpc_node *hybrid_cpc_nodes;
> + struct env_pmu_caps *env_pmu_caps;
> #ifdef HAVE_LIBBPF_SUPPORT
> /*
> * bpf_info_lock protects bpf rbtrees. This is needed because the
> @@ -172,4 +180,6 @@ bool perf_env__insert_btf(struct perf_env *env, struct btf_node *btf_node);
> struct btf_node *perf_env__find_btf(struct perf_env *env, __u32 btf_id);
>
> int perf_env__numa_node(struct perf_env *env, struct perf_cpu cpu);
> +char *perf_env__find_pmu_cap(struct perf_env *env, const char *pmu_name,
> + const char *cap);
> #endif /* __PERF_ENV_H */
> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> index a27132e5a5ef..b4505dbb9f4a 100644
> --- a/tools/perf/util/header.c
> +++ b/tools/perf/util/header.c
> @@ -1580,6 +1580,67 @@ static int write_hybrid_cpu_pmu_caps(struct feat_fd *ff,
> return 0;
> }
>
> +/*
> + * File format:
> + *
> + * struct {
> + * u32 nr_pmus;
> + * struct {
> + * char pmu_name[];
> + * u32 nr_caps;
> + * struct {
> + * char name[];
> + * char value[];
> + * } [nr_caps];
> + * } [nr_pmus];
> + * };
> + */
> +static int write_pmu_caps(struct feat_fd *ff, struct evlist *evlist __maybe_unused)
> +{
> + struct perf_pmu_caps *caps = NULL;
> + struct perf_pmu *pmu = NULL;
> + u32 nr_pmus = 0;
> + int ret;
> +
> + while ((pmu = perf_pmu__scan(pmu))) {
> + if (!pmu->name || !strncmp(pmu->name, "cpu", 3) ||
> + perf_pmu__caps_parse(pmu) <= 0)
> + continue;
> + nr_pmus++;
> + }
> +
> + ret = do_write(ff, &nr_pmus, sizeof(nr_pmus));
> + if (ret < 0)
> + return ret;
> +
> + if (!nr_pmus)
> + return 0;
> +
> + while ((pmu = perf_pmu__scan(pmu))) {
> + if (!pmu->name || !strncmp(pmu->name, "cpu", 3) || !pmu->nr_caps)
> + continue;
> +
> + ret = do_write_string(ff, pmu->name);
> + if (ret < 0)
> + return ret;
> +
> + ret = do_write(ff, &pmu->nr_caps, sizeof(pmu->nr_caps));
> + if (ret < 0)
> + return ret;
> +
> + list_for_each_entry(caps, &pmu->caps, list) {
> + ret = do_write_string(ff, caps->name);
> + if (ret < 0)
> + return ret;
> +
> + ret = do_write_string(ff, caps->value);
> + if (ret < 0)
> + return ret;
> + }
> + }

The write_per_cpu_pmu_caps() also does a similar thing. Can we factor
out a generic write_pmu_caps() which works for both cpu and non-cpu pmu
capabilities?

It seems the print_pmu_caps()/process_pmu_caps() can also does similar
factor out.


Actually, more aggressively, why not use the HEADER_PMU_CAPS to replace
the HEADER_HYBRID_CPU_PMU_CAPS? The HEADER_HYBRID_CPU_PMU_CAPS is the
last header feature. It seems doable. We can always write/print the
"cpu_" kind of PMU first to be compatible with the old tools.

Thanks,
Kan

> + return 0;
> +}
> +
> static void print_hostname(struct feat_fd *ff, FILE *fp)
> {
> fprintf(fp, "# hostname : %s\n", ff->ph->env.hostname);
> @@ -2209,6 +2270,32 @@ static void print_mem_topology(struct feat_fd *ff, FILE *fp)
> }
> }
>
> +static void print_pmu_caps(struct feat_fd *ff, FILE *fp)
> +{
> + struct env_pmu_caps *env_pmu_caps = ff->ph->env.env_pmu_caps;
> + int nr_pmus_with_caps = ff->ph->env.nr_pmus_with_caps;
> + const char *delimiter = "";
> + char **ptr;
> + int i;
> + u32 j;
> +
> + if (!nr_pmus_with_caps)
> + return;
> +
> + for (i = 0; i < nr_pmus_with_caps; i++) {
> + fprintf(fp, "# %s pmu capabilities: ", env_pmu_caps[i].pmu_name);
> +
> + ptr = env_pmu_caps[i].pmu_caps;
> +
> + delimiter = "";
> + for (j = 0; j < env_pmu_caps[i].nr_caps; j++) {
> + fprintf(fp, "%s%s", delimiter, ptr[j]);
> + delimiter = ", ";
> + }
> + fprintf(fp, "\n");
> + }
> +}
> +
> static int __event_process_build_id(struct perf_record_header_build_id *bev,
> char *filename,
> struct perf_session *session)
> @@ -3319,6 +3406,103 @@ static int process_hybrid_cpu_pmu_caps(struct feat_fd *ff,
> return ret;
> }
>
> +static int __process_pmu_caps(struct feat_fd *ff, struct env_pmu_caps *env_pmu_caps)
> +{
> + u32 nr_caps = env_pmu_caps->nr_caps;
> + int name_size, value_size;
> + char *name, *value, *ptr;
> + u32 i;
> +
> + env_pmu_caps->pmu_caps = zalloc(sizeof(char *) * nr_caps);
> + if (!env_pmu_caps->pmu_caps)
> + return -1;
> +
> + for (i = 0; i < nr_caps; i++) {
> + name = do_read_string(ff);
> + if (!name)
> + goto error;
> +
> + value = do_read_string(ff);
> + if (!value)
> + goto free_name;
> +
> + name_size = strlen(name);
> + value_size = strlen(value);
> + ptr = zalloc(sizeof(char) * (name_size + value_size + 2));
> + if (!ptr)
> + goto free_value;
> +
> + memcpy(ptr, name, name_size);
> + ptr[name_size] = '=';
> + memcpy(ptr + name_size + 1, value, value_size);
> + env_pmu_caps->pmu_caps[i] = ptr;
> +
> + free(value);
> + free(name);
> + }
> + return 0;
> +
> +free_value:
> + free(value);
> +free_name:
> + free(name);
> +error:
> + for (; i > 0; i--)
> + free(env_pmu_caps->pmu_caps[i - 1]);
> + free(env_pmu_caps->pmu_caps);
> + return -1;
> +}
> +
> +static int process_pmu_caps(struct feat_fd *ff, void *data __maybe_unused)
> +{
> + struct env_pmu_caps *env_pmu_caps;
> + u32 nr_pmus;
> + u32 i, j;
> +
> + ff->ph->env.nr_pmus_with_caps = 0;
> + ff->ph->env.env_pmu_caps = NULL;
> +
> + if (do_read_u32(ff, &nr_pmus))
> + return -1;
> +
> + if (!nr_pmus)
> + return 0;
> +
> + env_pmu_caps = zalloc(sizeof(struct env_pmu_caps) * nr_pmus);
> + if (!env_pmu_caps)
> + return -ENOMEM;
> +
> + for (i = 0; i < nr_pmus; i++) {
> + env_pmu_caps[i].pmu_name = do_read_string(ff);
> + if (!env_pmu_caps[i].pmu_name)
> + goto error;
> +
> + if (do_read_u32(ff, &env_pmu_caps[i].nr_caps))
> + goto free_pmu_name;
> +
> + if (!__process_pmu_caps(ff, &env_pmu_caps[i]))
> + continue;
> +
> +free_pmu_name:
> + free(env_pmu_caps[i].pmu_name);
> + goto error;
> + }
> +
> + ff->ph->env.nr_pmus_with_caps = nr_pmus;
> + ff->ph->env.env_pmu_caps = env_pmu_caps;
> + return 0;
> +
> +error:
> + for (; i > 0; i--) {
> + free(env_pmu_caps[i - 1].pmu_name);
> + for (j = 0; j < env_pmu_caps[i - 1].nr_caps; j++)
> + free(env_pmu_caps[i - 1].pmu_caps[j]);
> + free(env_pmu_caps[i - 1].pmu_caps);
> + }
> + free(env_pmu_caps);
> + return -1;
> +}
> +
> #define FEAT_OPR(n, func, __full_only) \
> [HEADER_##n] = { \
> .name = __stringify(n), \
> @@ -3382,6 +3566,7 @@ const struct perf_header_feature_ops feat_ops[HEADER_LAST_FEATURE] = {
> FEAT_OPR(CLOCK_DATA, clock_data, false),
> FEAT_OPN(HYBRID_TOPOLOGY, hybrid_topology, true),
> FEAT_OPR(HYBRID_CPU_PMU_CAPS, hybrid_cpu_pmu_caps, false),
> + FEAT_OPR(PMU_CAPS, pmu_caps, false),
> };
>
> struct header_print_data {
> diff --git a/tools/perf/util/header.h b/tools/perf/util/header.h
> index 0eb4bc29a5a4..e9a067bb8b9e 100644
> --- a/tools/perf/util/header.h
> +++ b/tools/perf/util/header.h
> @@ -47,6 +47,7 @@ enum {
> HEADER_CLOCK_DATA,
> HEADER_HYBRID_TOPOLOGY,
> HEADER_HYBRID_CPU_PMU_CAPS,
> + HEADER_PMU_CAPS,
> HEADER_LAST_FEATURE,
> HEADER_FEAT_BITS = 256,
> };
> diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
> index 9a1c7e63e663..8d599acb7569 100644
> --- a/tools/perf/util/pmu.c
> +++ b/tools/perf/util/pmu.c
> @@ -1890,16 +1890,22 @@ int perf_pmu__caps_parse(struct perf_pmu *pmu)
> const char *sysfs = sysfs__mountpoint();
> DIR *caps_dir;
> struct dirent *evt_ent;
> - int nr_caps = 0;
> +
> + if (pmu->caps_initialized)
> + return pmu->nr_caps;
>
> if (!sysfs)
> return -1;
>
> + pmu->nr_caps = 0;
> +
> snprintf(caps_path, PATH_MAX,
> "%s" EVENT_SOURCE_DEVICE_PATH "%s/caps", sysfs, pmu->name);
>
> - if (stat(caps_path, &st) < 0)
> + if (stat(caps_path, &st) < 0) {
> + pmu->caps_initialized = true;
> return 0; /* no error if caps does not exist */
> + }
>
> caps_dir = opendir(caps_path);
> if (!caps_dir)
> @@ -1926,13 +1932,14 @@ int perf_pmu__caps_parse(struct perf_pmu *pmu)
> continue;
> }
>
> - nr_caps++;
> + pmu->nr_caps++;
> fclose(file);
> }
>
> closedir(caps_dir);
>
> - return nr_caps;
> + pmu->caps_initialized = true;
> + return pmu->nr_caps;
> }
>
> void perf_pmu__warn_invalid_config(struct perf_pmu *pmu, __u64 config,
> diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
> index 541889fa9f9c..4b45fd8da5a3 100644
> --- a/tools/perf/util/pmu.h
> +++ b/tools/perf/util/pmu.h
> @@ -46,6 +46,8 @@ struct perf_pmu {
> struct perf_cpu_map *cpus;
> struct list_head format; /* HEAD struct perf_pmu_format -> list */
> struct list_head aliases; /* HEAD struct perf_pmu_alias -> list */
> + bool caps_initialized;
> + u32 nr_caps;
> struct list_head caps; /* HEAD struct perf_pmu_caps -> list */
> struct list_head list; /* ELEM */
> struct list_head hybrid_list;

2022-05-27 08:54:12

by Ravi Bangoria

[permalink] [raw]
Subject: Re: [PATCH v4 2/5] perf header: Parse non-cpu pmu capabilities

On 26-May-22 9:25 PM, Liang, Kan wrote:
>
>
> On 5/26/2022 11:08 AM, Ravi Bangoria wrote:
>> Hi Kan,
>>
>> [...]
>>
>>>> +static int write_pmu_caps(struct feat_fd *ff, struct evlist *evlist __maybe_unused)
>>>> +{
>>>> +    struct perf_pmu_caps *caps = NULL;
>>>> +    struct perf_pmu *pmu = NULL;
>>>> +    u32 nr_pmus = 0;
>>>> +    int ret;
>>>> +
>>>> +    while ((pmu = perf_pmu__scan(pmu))) {
>>>> +        if (!pmu->name || !strncmp(pmu->name, "cpu", 3) ||
>>>> +            perf_pmu__caps_parse(pmu) <= 0)
>>>> +            continue;
>>>> +        nr_pmus++;
>>>> +    }
>>>> +
>>>> +    ret = do_write(ff, &nr_pmus, sizeof(nr_pmus));
>>>> +    if (ret < 0)
>>>> +        return ret;
>>>> +
>>>> +    if (!nr_pmus)
>>>> +        return 0;
>>>> +
>>>> +    while ((pmu = perf_pmu__scan(pmu))) {
>>>> +        if (!pmu->name || !strncmp(pmu->name, "cpu", 3) || !pmu->nr_caps)
>>>> +            continue;
>>>> +
>>>> +        ret = do_write_string(ff, pmu->name);
>>>> +        if (ret < 0)
>>>> +            return ret;
>>>> +
>>>> +        ret = do_write(ff, &pmu->nr_caps, sizeof(pmu->nr_caps));
>>>> +        if (ret < 0)
>>>> +            return ret;
>>>> +
>>>> +        list_for_each_entry(caps, &pmu->caps, list) {
>>>> +            ret = do_write_string(ff, caps->name);
>>>> +            if (ret < 0)
>>>> +                return ret;
>>>> +
>>>> +            ret = do_write_string(ff, caps->value);
>>>> +            if (ret < 0)
>>>> +                return ret;
>>>> +        }
>>>> +    }
>>>
>>> The write_per_cpu_pmu_caps() also does a similar thing. Can we factor out a generic write_pmu_caps() which works for both cpu and non-cpu pmu capabilities?
>>
>> I might be able to do this but..
>>
>>> It seems the print_pmu_caps()/process_pmu_caps() can also does similar factor out.
>>
>> not this, see below..
>>
>>> Actually, more aggressively, why not use the HEADER_PMU_CAPS to replace the HEADER_HYBRID_CPU_PMU_CAPS? The HEADER_HYBRID_CPU_PMU_CAPS is the last header feature. It seems doable. We can always write/print the "cpu_" kind of PMU first to be compatible with the old tools.
>>
>> There are some differences in how capabilities are stored in perf.data header
>> as well as perf_env. In case of HEADER_CPU_PMU_CAPS or
>> HEADER_HYBRID_CPU_PMU_CAPS, all capabilities are stored in a single string
>> separated by NULL character.
>
> I think this is the format for the internal string, not the format of the perf.data header.

Yeah I just realized that after replying. Anyway, thanks for clarifying.
Will change internal format of HEADER_HYBRID_CPU_PMU_CAPS (as well as
HEADER_CPU_PMU_CAPS), and replace HEADER_HYBRID_CPU_PMU_CAPS with
HEADER_PMU_CAPS.

Thanks,
Ravi

2022-05-27 10:16:54

by Ravi Bangoria

[permalink] [raw]
Subject: Re: [PATCH v4 2/5] perf header: Parse non-cpu pmu capabilities

Hi Kan,

[...]

>> +static int write_pmu_caps(struct feat_fd *ff, struct evlist *evlist __maybe_unused)
>> +{
>> +    struct perf_pmu_caps *caps = NULL;
>> +    struct perf_pmu *pmu = NULL;
>> +    u32 nr_pmus = 0;
>> +    int ret;
>> +
>> +    while ((pmu = perf_pmu__scan(pmu))) {
>> +        if (!pmu->name || !strncmp(pmu->name, "cpu", 3) ||
>> +            perf_pmu__caps_parse(pmu) <= 0)
>> +            continue;
>> +        nr_pmus++;
>> +    }
>> +
>> +    ret = do_write(ff, &nr_pmus, sizeof(nr_pmus));
>> +    if (ret < 0)
>> +        return ret;
>> +
>> +    if (!nr_pmus)
>> +        return 0;
>> +
>> +    while ((pmu = perf_pmu__scan(pmu))) {
>> +        if (!pmu->name || !strncmp(pmu->name, "cpu", 3) || !pmu->nr_caps)
>> +            continue;
>> +
>> +        ret = do_write_string(ff, pmu->name);
>> +        if (ret < 0)
>> +            return ret;
>> +
>> +        ret = do_write(ff, &pmu->nr_caps, sizeof(pmu->nr_caps));
>> +        if (ret < 0)
>> +            return ret;
>> +
>> +        list_for_each_entry(caps, &pmu->caps, list) {
>> +            ret = do_write_string(ff, caps->name);
>> +            if (ret < 0)
>> +                return ret;
>> +
>> +            ret = do_write_string(ff, caps->value);
>> +            if (ret < 0)
>> +                return ret;
>> +        }
>> +    }
>
> The write_per_cpu_pmu_caps() also does a similar thing. Can we factor out a generic write_pmu_caps() which works for both cpu and non-cpu pmu capabilities?

I might be able to do this but..

> It seems the print_pmu_caps()/process_pmu_caps() can also does similar factor out.

not this, see below..

> Actually, more aggressively, why not use the HEADER_PMU_CAPS to replace the HEADER_HYBRID_CPU_PMU_CAPS? The HEADER_HYBRID_CPU_PMU_CAPS is the last header feature. It seems doable. We can always write/print the "cpu_" kind of PMU first to be compatible with the old tools.

There are some differences in how capabilities are stored in perf.data header
as well as perf_env. In case of HEADER_CPU_PMU_CAPS or
HEADER_HYBRID_CPU_PMU_CAPS, all capabilities are stored in a single string
separated by NULL character. Whereas, in case of HEADER_PMU_CAPS, they are
stored as an array of strings. The reason for this difference is, searching
in an array is far easier compared to searching in a NULL separated string.
So, I don't think I can extend HEADER_HYBRID_CPU_PMU_CAPS as HEADER_PMU_CAPS
without adding complexity in perf_env__find_pmu_cap().

Thanks for the review,
Ravi

2022-05-28 18:31:59

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH v4 2/5] perf header: Parse non-cpu pmu capabilities



On 5/26/2022 11:08 AM, Ravi Bangoria wrote:
> Hi Kan,
>
> [...]
>
>>> +static int write_pmu_caps(struct feat_fd *ff, struct evlist *evlist __maybe_unused)
>>> +{
>>> +    struct perf_pmu_caps *caps = NULL;
>>> +    struct perf_pmu *pmu = NULL;
>>> +    u32 nr_pmus = 0;
>>> +    int ret;
>>> +
>>> +    while ((pmu = perf_pmu__scan(pmu))) {
>>> +        if (!pmu->name || !strncmp(pmu->name, "cpu", 3) ||
>>> +            perf_pmu__caps_parse(pmu) <= 0)
>>> +            continue;
>>> +        nr_pmus++;
>>> +    }
>>> +
>>> +    ret = do_write(ff, &nr_pmus, sizeof(nr_pmus));
>>> +    if (ret < 0)
>>> +        return ret;
>>> +
>>> +    if (!nr_pmus)
>>> +        return 0;
>>> +
>>> +    while ((pmu = perf_pmu__scan(pmu))) {
>>> +        if (!pmu->name || !strncmp(pmu->name, "cpu", 3) || !pmu->nr_caps)
>>> +            continue;
>>> +
>>> +        ret = do_write_string(ff, pmu->name);
>>> +        if (ret < 0)
>>> +            return ret;
>>> +
>>> +        ret = do_write(ff, &pmu->nr_caps, sizeof(pmu->nr_caps));
>>> +        if (ret < 0)
>>> +            return ret;
>>> +
>>> +        list_for_each_entry(caps, &pmu->caps, list) {
>>> +            ret = do_write_string(ff, caps->name);
>>> +            if (ret < 0)
>>> +                return ret;
>>> +
>>> +            ret = do_write_string(ff, caps->value);
>>> +            if (ret < 0)
>>> +                return ret;
>>> +        }
>>> +    }
>>
>> The write_per_cpu_pmu_caps() also does a similar thing. Can we factor out a generic write_pmu_caps() which works for both cpu and non-cpu pmu capabilities?
>
> I might be able to do this but..
>
>> It seems the print_pmu_caps()/process_pmu_caps() can also does similar factor out.
>
> not this, see below..
>
>> Actually, more aggressively, why not use the HEADER_PMU_CAPS to replace the HEADER_HYBRID_CPU_PMU_CAPS? The HEADER_HYBRID_CPU_PMU_CAPS is the last header feature. It seems doable. We can always write/print the "cpu_" kind of PMU first to be compatible with the old tools.
>
> There are some differences in how capabilities are stored in perf.data header
> as well as perf_env. In case of HEADER_CPU_PMU_CAPS or
> HEADER_HYBRID_CPU_PMU_CAPS, all capabilities are stored in a single string
> separated by NULL character.

I think this is the format for the internal string, not the format of
the perf.data header.

For the perf.data, here is the existing format for the
HEADER_HYBRID_CPU_PMU_CAPS.

struct {
u32 nr_pmu;
struct {
u32 nr_cpu_pmu_caps;
{
char name[];
char value[];
} [nr_cpu_pmu_caps];
char pmu_name[];
} [nr_pmu];
};

Here is your proposal.

+struct {
+ u32 nr_pmus;
+ struct {
+ char pmu_name[];
+ u32 nr_caps;
+ struct {
+ char name[];
+ char value[];
+ } [nr_caps];
+ } [nr_pmus];
+};

From my understanding, they are the same. (It doesn't matter where we
put the char pmu_name[];)

That's also why I think we should merge the HEADER_HYBRID_CPU_PMU_CAPS
and HEADER_PMU_CAPS. I don't think it make senses to basically handle
the same thing with different codes.


> Whereas, in case of HEADER_PMU_CAPS, they are
> stored as an array of strings. The reason for this difference is, searching
> in an array is far easier compared to searching in a NULL separated string.

I think the hybrid_cpc_node can be replaced by the env_pmu_caps.
Then you don't need to modify the perf_env__find_pmu_cap().

Thanks,
Kan

> So, I don't think I can extend HEADER_HYBRID_CPU_PMU_CAPS as HEADER_PMU_CAPS
> without adding complexity in perf_env__find_pmu_cap().
>
> Thanks for the review,
> Ravi