2021-12-08 02:46:14

by Ian Rogers

Subject: [PATCH 00/22] Refactor perf cpumap

The perf cpu map API has various functions that take a cpumap and an
index in order to look up the cpu. The problem with this is that an
index belonging to a different cpumap may be passed, causing problems
like aggregation on the wrong CPU:
https://lore.kernel.org/lkml/[email protected]/

This patch set refactors the cpu map API, greatly reducing its size and
passing the cpu explicitly (rather than the cpumap and index pair) to
the functions that need it. Comments are added at the same time.

Ian Rogers (22):
libperf: Add comments to perf_cpu_map.
perf stat: Add aggr creators that are passed a cpu.
perf stat: Switch aggregation to use for_each loop
perf stat: Switch to cpu version of cpu_map__get
perf cpumap: Switch cpu_map__build_map to cpu function
perf cpumap: Remove map+index get_socket
perf cpumap: Remove map+index get_die
perf cpumap: Remove map+index get_core
perf cpumap: Remove map+index get_node
perf cpumap: Add comments to aggr_cpu_id
perf cpumap: Remove unused cpu_map__socket
perf cpumap: Simplify equal function name.
perf cpumap: Rename empty functions.
perf cpumap: Document cpu__get_node and remove redundant function
perf cpumap: Remove map from function names that don't use a map.
perf cpumap: Remove cpu_map__cpu, use libperf function.
perf cpumap: Refactor cpu_map__build_map
perf cpumap: Rename cpu_map__get_X_aggr_by_cpu functions
perf cpumap: Move 'has' function to libperf
perf cpumap: Add some comments to cpu_aggr_map
perf cpumap: Trim the cpu_aggr_map
perf stat: Fix memory leak in check_per_pkg

tools/lib/perf/cpumap.c | 7 +-
tools/lib/perf/include/internal/cpumap.h | 9 +-
tools/lib/perf/include/perf/cpumap.h | 1 +
tools/perf/arch/arm/util/cs-etm.c | 16 +-
tools/perf/builtin-ftrace.c | 2 +-
tools/perf/builtin-sched.c | 6 +-
tools/perf/builtin-stat.c | 273 ++++++++++++-----------
tools/perf/tests/topology.c | 10 +-
tools/perf/util/cpumap.c | 182 ++++++---------
tools/perf/util/cpumap.h | 102 ++++++---
tools/perf/util/cputopo.c | 2 +-
tools/perf/util/env.c | 6 +-
tools/perf/util/stat-display.c | 69 +++---
tools/perf/util/stat.c | 9 +-
tools/perf/util/stat.h | 3 +-
15 files changed, 361 insertions(+), 336 deletions(-)

--
2.34.1.400.ga245620fadb-goog
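
The index/CPU confusion described in the cover letter can be shown with a
small standalone sketch. Everything in it is hypothetical: the struct is a
reduced stand-in for perf_cpu_map, and the topology (18 CPUs per socket) is
assumed purely for illustration. It is not code from the series.

```c
#include <assert.h>

/* Hypothetical, reduced stand-in for struct perf_cpu_map. */
struct sketch_cpu_map {
	int nr;          /* number of entries in map */
	const int *map;  /* sorted CPU numbers */
};

/* Assumed topology: CPUs 0-17 on socket 0, CPUs 18-35 on socket 1. */
static int sketch_socket_of_cpu(int cpu)
{
	return cpu / 18;
}

/* Buggy pattern: the index is mistaken for the CPU number. */
static int sketch_socket_by_index_bug(const struct sketch_cpu_map *m, int idx)
{
	(void)m;
	return sketch_socket_of_cpu(idx);
}

/* Intended pattern: load the CPU number from the map first. */
static int sketch_socket_by_cpu(const struct sketch_cpu_map *m, int idx)
{
	assert(idx >= 0 && idx < m->nr);
	return sketch_socket_of_cpu(m->map[idx]);
}
```

With a map holding CPUs 0 and 18, index 1 refers to CPU 18. The buggy
variant attributes it to socket 0, while the intended variant attributes it
to socket 1.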



2021-12-08 02:46:19

by Ian Rogers

Subject: [PATCH 01/22] libperf: Add comments to perf_cpu_map.

A commonly observed problem is confusing the index with the CPU value;
documentation should hopefully reduce this type of problem.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/lib/perf/include/internal/cpumap.h | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/tools/lib/perf/include/internal/cpumap.h b/tools/lib/perf/include/internal/cpumap.h
index 840d4032587b..1c1726f4a04e 100644
--- a/tools/lib/perf/include/internal/cpumap.h
+++ b/tools/lib/perf/include/internal/cpumap.h
@@ -4,9 +4,16 @@

#include <linux/refcount.h>

+/**
+ * A sized, reference counted, sorted array of integers representing CPU
+ * numbers. This is commonly used to capture which CPUs a PMU is associated
+ * with.
+ */
struct perf_cpu_map {
refcount_t refcnt;
+ /** Length of the map array. */
int nr;
+ /** The CPU values. */
int map[];
};

--
2.34.1.400.ga245620fadb-goog
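
As a rough illustration of the layout the new comments describe (a sized,
reference counted, sorted array with a flexible array member), the sketch
below builds such a map in user space. The names are hypothetical and a
plain int stands in for refcount_t; libperf's real constructors are not
shown here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical analogue of struct perf_cpu_map. */
struct demo_cpu_map {
	int refcnt;  /* plain int standing in for refcount_t */
	int nr;      /* length of the map array */
	int map[];   /* the CPU values, kept sorted */
};

static int demo_cmp_int(const void *a, const void *b)
{
	int x = *(const int *)a, y = *(const int *)b;

	return (x > y) - (x < y);
}

/* Allocate header and array in one block, copy the CPUs and sort them. */
static struct demo_cpu_map *demo_cpu_map__new(const int *cpus, int nr)
{
	struct demo_cpu_map *m;

	assert(nr >= 0);
	m = malloc(sizeof(*m) + sizeof(int) * (size_t)nr);
	if (!m)
		return NULL;

	m->refcnt = 1;
	m->nr = nr;
	if (nr > 0) {
		memcpy(m->map, cpus, sizeof(int) * (size_t)nr);
		qsort(m->map, (size_t)nr, sizeof(int), demo_cmp_int);
	}
	return m;
}

/* Drop a reference and free the map when the last one goes away. */
static void demo_cpu_map__put(struct demo_cpu_map *m)
{
	if (m && --m->refcnt == 0)
		free(m);
}
```

Because the array is sorted, map[idx] is the idx-th smallest CPU number in
the map, which in general differs from idx itself.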


2021-12-08 02:46:20

by Ian Rogers

Subject: [PATCH 02/22] perf stat: Add aggr creators that are passed a cpu.

The cpu_map and index can get confused. Add variants of the cpu_map__get
routines that are passed a cpu. Make the existing cpu_map__get routines
use the new functions, with a view to removing them once they are no
longer used.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/util/cpumap.c | 79 +++++++++++++++++++++++-----------------
tools/perf/util/cpumap.h | 6 ++-
2 files changed, 51 insertions(+), 34 deletions(-)

diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 87d3eca9b872..49fba2c53822 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -128,21 +128,23 @@ int cpu_map__get_socket_id(int cpu)
return ret ?: value;
}

-struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx,
- void *data __maybe_unused)
+struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data __maybe_unused)
{
- int cpu;
struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();

- if (idx > map->nr)
- return id;
-
- cpu = map->map[idx];
-
id.socket = cpu_map__get_socket_id(cpu);
return id;
}

+struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx,
+ void *data)
+{
+ if (idx < 0 || idx > map->nr)
+ return cpu_map__empty_aggr_cpu_id();
+
+ return cpu_map__get_socket_aggr_by_cpu(map->map[idx], data);
+}
+
static int cmp_aggr_cpu_id(const void *a_pointer, const void *b_pointer)
{
struct aggr_cpu_id *a = (struct aggr_cpu_id *)a_pointer;
@@ -200,15 +202,10 @@ int cpu_map__get_die_id(int cpu)
return ret ?: value;
}

-struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *data)
+struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data)
{
- int cpu, die;
- struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
-
- if (idx > map->nr)
- return id;
-
- cpu = map->map[idx];
+ struct aggr_cpu_id id;
+ int die;

die = cpu_map__get_die_id(cpu);
/* There is no die_id on legacy system. */
@@ -220,7 +217,7 @@ struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *dat
* with the socket ID and then add die to
* make a unique ID.
*/
- id = cpu_map__get_socket(map, idx, data);
+ id = cpu_map__get_socket_aggr_by_cpu(cpu, data);
if (cpu_map__aggr_cpu_id_is_empty(id))
return id;

@@ -228,6 +225,15 @@ struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *dat
return id;
}

+struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx,
+ void *data)
+{
+ if (idx < 0 || idx > map->nr)
+ return cpu_map__empty_aggr_cpu_id();
+
+ return cpu_map__get_die_aggr_by_cpu(map->map[idx], data);
+}
+
int cpu_map__get_core_id(int cpu)
{
int value, ret = cpu__get_topology_int(cpu, "core_id", &value);
@@ -239,20 +245,13 @@ int cpu_map__get_node_id(int cpu)
return cpu__get_node(cpu);
}

-struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *data)
+struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data)
{
- int cpu;
- struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
-
- if (idx > map->nr)
- return id;
-
- cpu = map->map[idx];
-
- cpu = cpu_map__get_core_id(cpu);
+ struct aggr_cpu_id id;
+ int core = cpu_map__get_core_id(cpu);

/* cpu_map__get_die returns a struct with socket and die set*/
- id = cpu_map__get_die(map, idx, data);
+ id = cpu_map__get_die_aggr_by_cpu(cpu, data);
if (cpu_map__aggr_cpu_id_is_empty(id))
return id;

@@ -260,19 +259,33 @@ struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *da
* core_id is relative to socket and die, we need a global id.
* So we combine the result from cpu_map__get_die with the core id
*/
- id.core = cpu;
+ id.core = core;
return id;
+
}

-struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data __maybe_unused)
+struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *data)
+{
+ if (idx < 0 || idx > map->nr)
+ return cpu_map__empty_aggr_cpu_id();
+
+ return cpu_map__get_core_aggr_by_cpu(map->map[idx], data);
+}
+
+struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data __maybe_unused)
{
struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();

+ id.node = cpu_map__get_node_id(cpu);
+ return id;
+}
+
+struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data)
+{
if (idx < 0 || idx >= map->nr)
- return id;
+ return cpu_map__empty_aggr_cpu_id();

- id.node = cpu_map__get_node_id(map->map[idx]);
- return id;
+ return cpu_map__get_node_aggr_by_cpu(map->map[idx], data);
}

int cpu_map__build_socket_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **sockp)
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index a27eeaf086e8..c62d67704425 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -31,13 +31,17 @@ size_t cpu_map__snprint(struct perf_cpu_map *map, char *buf, size_t size);
size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size);
size_t cpu_map__fprintf(struct perf_cpu_map *map, FILE *fp);
int cpu_map__get_socket_id(int cpu);
+struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data);
struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx, void *data);
int cpu_map__get_die_id(int cpu);
+struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data);
struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *data);
int cpu_map__get_core_id(int cpu);
+struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data);
struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *data);
int cpu_map__get_node_id(int cpu);
-struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data);
+struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data);
+struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data);
int cpu_map__build_socket_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **sockp);
int cpu_map__build_die_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **diep);
int cpu_map__build_core_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **corep);
--
2.34.1.400.ga245620fadb-goog
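
The shape of this patch (a by-cpu function doing the work, with the old
map+index function reduced to a bounds check and a delegation) can be
sketched as follows. The types and names are hypothetical reductions, and
the topology is assumed. Note that several wrappers in the diff above test
idx > map->nr while the node wrapper tests idx >= map->nr; the sketch uses
>=, since index nr is already one past the end of the array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced analogue of struct aggr_cpu_id. */
struct ex_aggr_id {
	int socket;
	int die;
};

struct ex_cpu_map {
	int nr;
	const int *map;
};

static struct ex_aggr_id ex_empty_id(void)
{
	struct ex_aggr_id id = { -1, -1 };

	return id;
}

static bool ex_id_is_empty(struct ex_aggr_id id)
{
	return id.socket == -1 && id.die == -1;
}

/* Assumed topology for illustration: 18 CPUs per socket. */
static int ex_socket_id(int cpu)
{
	return cpu / 18;
}

/* New style: the caller passes the CPU number directly. */
static struct ex_aggr_id ex_get_socket_aggr_by_cpu(int cpu)
{
	struct ex_aggr_id id = ex_empty_id();

	id.socket = ex_socket_id(cpu);
	return id;
}

/* Old style kept as a thin wrapper: validate idx, then delegate. */
static struct ex_aggr_id ex_get_socket(const struct ex_cpu_map *map, int idx)
{
	assert(map != NULL);
	if (idx < 0 || idx >= map->nr)
		return ex_empty_id();

	return ex_get_socket_aggr_by_cpu(map->map[idx]);
}
```

Keeping the wrapper lets existing callers compile unchanged while they are
migrated one at a time, which is how the later patches in the series
proceed.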


2021-12-08 02:46:25

by Ian Rogers

Subject: [PATCH 03/22] perf stat: Switch aggregation to use for_each loop

Tidy up the use of cpu and index to hopefully make the code less
error-prone. Avoid unused-variable warnings with (void) casts, which
will be removed in a later patch.

In aggr_update_shadow, the perf_cpu_map is switched from the evlist's
to the counter's cpu map, so that the index matches the map it is
applied to. This addresses a problem with uncore counters that have a
cpumap like:
  $ cat /sys/devices/uncore_imc_0/cpumask
  0,18
Counts are no longer aggregated based on the index of those values in
the cpumap (0 and 1) but on the actual CPU (0 and 18), thereby
correcting metric calculations in per-socket mode for counters without
a full cpumask.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/util/stat-display.c | 48 +++++++++++++++++++---------------
1 file changed, 27 insertions(+), 21 deletions(-)

diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 588601000f3f..efab39a759ff 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -330,8 +330,8 @@ static void print_metric_header(struct perf_stat_config *config,
static int first_shadow_cpu(struct perf_stat_config *config,
struct evsel *evsel, struct aggr_cpu_id id)
{
- struct evlist *evlist = evsel->evlist;
- int i;
+ struct perf_cpu_map *cpus;
+ int cpu, idx;

if (config->aggr_mode == AGGR_NONE)
return id.core;
@@ -339,14 +339,11 @@ static int first_shadow_cpu(struct perf_stat_config *config,
if (!config->aggr_get_id)
return 0;

- for (i = 0; i < evsel__nr_cpus(evsel); i++) {
- int cpu2 = evsel__cpus(evsel)->map[i];
-
- if (cpu_map__compare_aggr_cpu_id(
- config->aggr_get_id(config, evlist->core.cpus, cpu2),
- id)) {
- return cpu2;
- }
+ cpus = evsel__cpus(evsel);
+ perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
+ if (cpu_map__compare_aggr_cpu_id(config->aggr_get_id(config, cpus, idx),
+ id))
+ return cpu;
}
return 0;
}
@@ -516,20 +513,23 @@ static void printout(struct perf_stat_config *config, struct aggr_cpu_id id, int
static void aggr_update_shadow(struct perf_stat_config *config,
struct evlist *evlist)
{
- int cpu, s;
+ int cpu, idx, s;
struct aggr_cpu_id s2, id;
u64 val;
struct evsel *counter;
+ struct perf_cpu_map *cpus;

for (s = 0; s < config->aggr_map->nr; s++) {
id = config->aggr_map->map[s];
evlist__for_each_entry(evlist, counter) {
+ cpus = evsel__cpus(counter);
val = 0;
- for (cpu = 0; cpu < evsel__nr_cpus(counter); cpu++) {
- s2 = config->aggr_get_id(config, evlist->core.cpus, cpu);
+ perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
+ (void)cpu;
+ s2 = config->aggr_get_id(config, cpus, idx);
if (!cpu_map__compare_aggr_cpu_id(s2, id))
continue;
- val += perf_counts(counter->counts, cpu, 0)->val;
+ val += perf_counts(counter->counts, idx, 0)->val;
}
perf_stat__update_shadow_stats(counter, val,
first_shadow_cpu(config, counter, id),
@@ -634,18 +634,21 @@ static void aggr_cb(struct perf_stat_config *config,
struct evsel *counter, void *data, bool first)
{
struct aggr_data *ad = data;
- int cpu;
+ int idx, cpu;
+ struct perf_cpu_map *cpus;
struct aggr_cpu_id s2;

- for (cpu = 0; cpu < evsel__nr_cpus(counter); cpu++) {
+ cpus = evsel__cpus(counter);
+ perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
struct perf_counts_values *counts;

- s2 = config->aggr_get_id(config, evsel__cpus(counter), cpu);
+ (void)cpu;
+ s2 = config->aggr_get_id(config, cpus, idx);
if (!cpu_map__compare_aggr_cpu_id(s2, ad->id))
continue;
if (first)
ad->nr++;
- counts = perf_counts(counter->counts, cpu, 0);
+ counts = perf_counts(counter->counts, idx, 0);
/*
* When any result is bad, make them all to give
* consistent output in interval mode.
@@ -1208,10 +1211,13 @@ static void print_percore_thread(struct perf_stat_config *config,
{
int s;
struct aggr_cpu_id s2, id;
+ struct perf_cpu_map *cpus;
bool first = true;
+ int idx, cpu;

- for (int i = 0; i < evsel__nr_cpus(counter); i++) {
- s2 = config->aggr_get_id(config, evsel__cpus(counter), i);
+ cpus = evsel__cpus(counter);
+ perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
+ s2 = config->aggr_get_id(config, cpus, idx);
for (s = 0; s < config->aggr_map->nr; s++) {
id = config->aggr_map->map[s];
if (cpu_map__compare_aggr_cpu_id(s2, id))
@@ -1220,7 +1226,7 @@ static void print_percore_thread(struct perf_stat_config *config,

print_counter_aggrdata(config, counter, s,
prefix, false,
- &first, i);
+ &first, cpu);
}
}

--
2.34.1.400.ga245620fadb-goog
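
The loop shape adopted in this patch yields both the CPU number and its
index on every iteration: the index selects the slot in the per-counter
counts, and the CPU number decides which aggregate the count belongs to. A
minimal sketch, using a hypothetical macro in the spirit of
perf_cpu_map__for_each_cpu and an assumed topology of 18 CPUs per socket:

```c
#include <assert.h>
#include <stddef.h>

struct it_cpu_map {
	int nr;
	const int *map;
};

/* Hypothetical analogue of perf_cpu_map__for_each_cpu(cpu, idx, cpus). */
#define it_for_each_cpu(cpu, idx, cpus)                               \
	for ((idx) = 0;                                                \
	     (idx) < (cpus)->nr && ((cpu) = (cpus)->map[(idx)], 1);   \
	     (idx)++)

/* Assumed topology for illustration. */
static int it_socket_of_cpu(int cpu)
{
	return cpu / 18;
}

/*
 * Sum the counts of every CPU in the map that sits on the given socket.
 * counts[] has one slot per map entry, so it is indexed by idx, while the
 * socket test uses the CPU number.
 */
static unsigned long long it_sum_for_socket(const struct it_cpu_map *cpus,
					    const unsigned long long *counts,
					    int socket)
{
	unsigned long long val = 0;
	int cpu, idx;

	assert(cpus != NULL && counts != NULL);
	it_for_each_cpu(cpu, idx, cpus) {
		if (it_socket_of_cpu(cpu) != socket)
			continue;
		val += counts[idx];
	}
	return val;
}
```

For the uncore cpumask 0,18 from the commit message, the count stored at
index 1 belongs to CPU 18 and therefore to socket 1 under the assumed
topology.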


2021-12-08 02:46:27

by Ian Rogers

Subject: [PATCH 04/22] perf stat: Switch to cpu version of cpu_map__get

Avoid bugs where the wrong index is passed with the cpu_map.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/builtin-stat.c | 93 +++++++++++++++++++---------------
tools/perf/util/stat-display.c | 11 ++--
tools/perf/util/stat.h | 3 +-
3 files changed, 57 insertions(+), 50 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 7974933dbc77..cbccd6038109 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1299,69 +1299,63 @@ static struct option stat_options[] = {
};

static struct aggr_cpu_id perf_stat__get_socket(struct perf_stat_config *config __maybe_unused,
- struct perf_cpu_map *map, int cpu)
+ int cpu)
{
- return cpu_map__get_socket(map, cpu, NULL);
+ return cpu_map__get_socket_aggr_by_cpu(cpu, /*data=*/NULL);
}

static struct aggr_cpu_id perf_stat__get_die(struct perf_stat_config *config __maybe_unused,
- struct perf_cpu_map *map, int cpu)
+ int cpu)
{
- return cpu_map__get_die(map, cpu, NULL);
+ return cpu_map__get_die_aggr_by_cpu(cpu, /*data=*/NULL);
}

static struct aggr_cpu_id perf_stat__get_core(struct perf_stat_config *config __maybe_unused,
- struct perf_cpu_map *map, int cpu)
+ int cpu)
{
- return cpu_map__get_core(map, cpu, NULL);
+ return cpu_map__get_core_aggr_by_cpu(cpu, /*data=*/NULL);
}

static struct aggr_cpu_id perf_stat__get_node(struct perf_stat_config *config __maybe_unused,
- struct perf_cpu_map *map, int cpu)
+ int cpu)
{
- return cpu_map__get_node(map, cpu, NULL);
+ return cpu_map__get_node_aggr_by_cpu(cpu, /*data=*/NULL);
}

static struct aggr_cpu_id perf_stat__get_aggr(struct perf_stat_config *config,
- aggr_get_id_t get_id, struct perf_cpu_map *map, int idx)
+ aggr_get_id_t get_id, int cpu)
{
- int cpu;
struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();

- if (idx >= map->nr)
- return id;
-
- cpu = map->map[idx];
-
if (cpu_map__aggr_cpu_id_is_empty(config->cpus_aggr_map->map[cpu]))
- config->cpus_aggr_map->map[cpu] = get_id(config, map, idx);
+ config->cpus_aggr_map->map[cpu] = get_id(config, cpu);

id = config->cpus_aggr_map->map[cpu];
return id;
}

static struct aggr_cpu_id perf_stat__get_socket_cached(struct perf_stat_config *config,
- struct perf_cpu_map *map, int idx)
+ int cpu)
{
- return perf_stat__get_aggr(config, perf_stat__get_socket, map, idx);
+ return perf_stat__get_aggr(config, perf_stat__get_socket, cpu);
}

static struct aggr_cpu_id perf_stat__get_die_cached(struct perf_stat_config *config,
- struct perf_cpu_map *map, int idx)
+ int cpu)
{
- return perf_stat__get_aggr(config, perf_stat__get_die, map, idx);
+ return perf_stat__get_aggr(config, perf_stat__get_die, cpu);
}

static struct aggr_cpu_id perf_stat__get_core_cached(struct perf_stat_config *config,
- struct perf_cpu_map *map, int idx)
+ int cpu)
{
- return perf_stat__get_aggr(config, perf_stat__get_core, map, idx);
+ return perf_stat__get_aggr(config, perf_stat__get_core, cpu);
}

static struct aggr_cpu_id perf_stat__get_node_cached(struct perf_stat_config *config,
- struct perf_cpu_map *map, int idx)
+ int cpu)
{
- return perf_stat__get_aggr(config, perf_stat__get_node, map, idx);
+ return perf_stat__get_aggr(config, perf_stat__get_node, cpu);
}

static bool term_percore_set(void)
@@ -1459,8 +1453,9 @@ static void perf_stat__exit_aggr_mode(void)
stat_config.cpus_aggr_map = NULL;
}

-static inline int perf_env__get_cpu(struct perf_env *env, struct perf_cpu_map *map, int idx)
+static inline int perf_env__get_cpu(void *data, struct perf_cpu_map *map, int idx)
{
+ struct perf_env *env = data;
int cpu;

if (idx > map->nr)
@@ -1474,10 +1469,9 @@ static inline int perf_env__get_cpu(struct perf_env *env, struct perf_cpu_map *m
return cpu;
}

-static struct aggr_cpu_id perf_env__get_socket(struct perf_cpu_map *map, int idx, void *data)
+static struct aggr_cpu_id perf_env__get_socket_aggr_by_cpu(int cpu, void *data)
{
struct perf_env *env = data;
- int cpu = perf_env__get_cpu(env, map, idx);
struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();

if (cpu != -1)
@@ -1486,11 +1480,15 @@ static struct aggr_cpu_id perf_env__get_socket(struct perf_cpu_map *map, int idx
return id;
}

-static struct aggr_cpu_id perf_env__get_die(struct perf_cpu_map *map, int idx, void *data)
+static struct aggr_cpu_id perf_env__get_socket(struct perf_cpu_map *map, int idx, void *data)
+{
+ return perf_env__get_socket_aggr_by_cpu(perf_env__get_cpu(data, map, idx), data);
+}
+
+static struct aggr_cpu_id perf_env__get_die_aggr_by_cpu(int cpu, void *data)
{
struct perf_env *env = data;
struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
- int cpu = perf_env__get_cpu(env, map, idx);

if (cpu != -1) {
/*
@@ -1505,11 +1503,15 @@ static struct aggr_cpu_id perf_env__get_die(struct perf_cpu_map *map, int idx, v
return id;
}

-static struct aggr_cpu_id perf_env__get_core(struct perf_cpu_map *map, int idx, void *data)
+static struct aggr_cpu_id perf_env__get_die(struct perf_cpu_map *map, int idx, void *data)
+{
+ return perf_env__get_die_aggr_by_cpu(perf_env__get_cpu(data, map, idx), data);
+}
+
+static struct aggr_cpu_id perf_env__get_core_aggr_by_cpu(int cpu, void *data)
{
struct perf_env *env = data;
struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
- int cpu = perf_env__get_cpu(env, map, idx);

if (cpu != -1) {
/*
@@ -1525,15 +1527,24 @@ static struct aggr_cpu_id perf_env__get_core(struct perf_cpu_map *map, int idx,
return id;
}

-static struct aggr_cpu_id perf_env__get_node(struct perf_cpu_map *map, int idx, void *data)
+static struct aggr_cpu_id perf_env__get_core(struct perf_cpu_map *map, int idx, void *data)
+{
+ return perf_env__get_core_aggr_by_cpu(perf_env__get_cpu(data, map, idx), data);
+}
+
+static struct aggr_cpu_id perf_env__get_node_aggr_by_cpu(int cpu, void *data)
{
- int cpu = perf_env__get_cpu(data, map, idx);
struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();

id.node = perf_env__numa_node(data, cpu);
return id;
}

+static struct aggr_cpu_id perf_env__get_node(struct perf_cpu_map *map, int idx, void *data)
+{
+ return perf_env__get_node_aggr_by_cpu(perf_env__get_cpu(data, map, idx), data);
+}
+
static int perf_env__build_socket_map(struct perf_env *env, struct perf_cpu_map *cpus,
struct cpu_aggr_map **sockp)
{
@@ -1559,26 +1570,26 @@ static int perf_env__build_node_map(struct perf_env *env, struct perf_cpu_map *c
}

static struct aggr_cpu_id perf_stat__get_socket_file(struct perf_stat_config *config __maybe_unused,
- struct perf_cpu_map *map, int idx)
+ int cpu)
{
- return perf_env__get_socket(map, idx, &perf_stat.session->header.env);
+ return perf_env__get_socket_aggr_by_cpu(cpu, &perf_stat.session->header.env);
}
static struct aggr_cpu_id perf_stat__get_die_file(struct perf_stat_config *config __maybe_unused,
- struct perf_cpu_map *map, int idx)
+ int cpu)
{
- return perf_env__get_die(map, idx, &perf_stat.session->header.env);
+ return perf_env__get_die_aggr_by_cpu(cpu, &perf_stat.session->header.env);
}

static struct aggr_cpu_id perf_stat__get_core_file(struct perf_stat_config *config __maybe_unused,
- struct perf_cpu_map *map, int idx)
+ int cpu)
{
- return perf_env__get_core(map, idx, &perf_stat.session->header.env);
+ return perf_env__get_core_aggr_by_cpu(cpu, &perf_stat.session->header.env);
}

static struct aggr_cpu_id perf_stat__get_node_file(struct perf_stat_config *config __maybe_unused,
- struct perf_cpu_map *map, int idx)
+ int cpu)
{
- return perf_env__get_node(map, idx, &perf_stat.session->header.env);
+ return perf_env__get_node_aggr_by_cpu(cpu, &perf_stat.session->header.env);
}

static int perf_stat_init_aggr_mode_file(struct perf_stat *st)
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index efab39a759ff..6c40b91d5e32 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -341,8 +341,7 @@ static int first_shadow_cpu(struct perf_stat_config *config,

cpus = evsel__cpus(evsel);
perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
- if (cpu_map__compare_aggr_cpu_id(config->aggr_get_id(config, cpus, idx),
- id))
+ if (cpu_map__compare_aggr_cpu_id(config->aggr_get_id(config, cpu), id))
return cpu;
}
return 0;
@@ -525,8 +524,7 @@ static void aggr_update_shadow(struct perf_stat_config *config,
cpus = evsel__cpus(counter);
val = 0;
perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
- (void)cpu;
- s2 = config->aggr_get_id(config, cpus, idx);
+ s2 = config->aggr_get_id(config, cpu);
if (!cpu_map__compare_aggr_cpu_id(s2, id))
continue;
val += perf_counts(counter->counts, idx, 0)->val;
@@ -642,8 +640,7 @@ static void aggr_cb(struct perf_stat_config *config,
perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
struct perf_counts_values *counts;

- (void)cpu;
- s2 = config->aggr_get_id(config, cpus, idx);
+ s2 = config->aggr_get_id(config, cpu);
if (!cpu_map__compare_aggr_cpu_id(s2, ad->id))
continue;
if (first)
@@ -1217,7 +1214,7 @@ static void print_percore_thread(struct perf_stat_config *config,

cpus = evsel__cpus(counter);
perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
- s2 = config->aggr_get_id(config, cpus, idx);
+ s2 = config->aggr_get_id(config, cpu);
for (s = 0; s < config->aggr_map->nr; s++) {
id = config->aggr_map->map[s];
if (cpu_map__compare_aggr_cpu_id(s2, id))
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index 32c8527de347..32cf24186229 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -108,8 +108,7 @@ struct runtime_stat {
struct rblist value_list;
};

-typedef struct aggr_cpu_id (*aggr_get_id_t)(struct perf_stat_config *config,
- struct perf_cpu_map *m, int cpu);
+typedef struct aggr_cpu_id (*aggr_get_id_t)(struct perf_stat_config *config, int cpu);

struct perf_stat_config {
enum aggr_mode aggr_mode;
--
2.34.1.400.ga245620fadb-goog
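
The cached lookup that perf_stat__get_aggr performs after this patch can be
sketched with a callback of the new (config, cpu) shape. All names below
are hypothetical, the topology is assumed, and the bounds check on cpu is
an addition of the sketch rather than something shown in the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced analogue of struct aggr_cpu_id. */
struct c_aggr_id {
	int socket;
};

/* Hypothetical config: the cache has one slot per possible CPU number. */
struct c_config {
	struct c_aggr_id *cache;
	int nr;
};

/* Analogue of the new aggr_get_id_t: no map, just the CPU number. */
typedef struct c_aggr_id (*c_get_id_t)(struct c_config *config, int cpu);

static int c_lookups; /* how often the slow path ran */

/* Slow path; the topology (18 CPUs per socket) is assumed. */
static struct c_aggr_id c_get_socket(struct c_config *config, int cpu)
{
	struct c_aggr_id id;

	(void)config;
	c_lookups++;
	id.socket = cpu / 18;
	return id;
}

/* Fill the slot for this CPU on first use, then serve it from the cache. */
static struct c_aggr_id c_get_aggr(struct c_config *config, c_get_id_t get_id,
				   int cpu)
{
	struct c_aggr_id empty = { -1 };

	assert(config != NULL && get_id != NULL);
	if (cpu < 0 || cpu >= config->nr)
		return empty;

	if (config->cache[cpu].socket == -1)
		config->cache[cpu] = get_id(config, cpu);

	return config->cache[cpu];
}
```

Since the cache is indexed by CPU number rather than by map index, it has
to be sized by the largest CPU number that can occur, not by the number of
entries in any particular cpu map.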


2021-12-08 02:46:29

by Ian Rogers

Subject: [PATCH 05/22] perf cpumap: Switch cpu_map__build_map to cpu function

Avoid the error-prone cpu_map + idx variant. Remove the now unused functions.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/builtin-stat.c | 28 ++++------------------------
tools/perf/util/cpumap.c | 12 ++++++------
tools/perf/util/cpumap.h | 2 +-
3 files changed, 11 insertions(+), 31 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index cbccd6038109..79a435573b44 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1480,11 +1480,6 @@ static struct aggr_cpu_id perf_env__get_socket_aggr_by_cpu(int cpu, void *data)
return id;
}

-static struct aggr_cpu_id perf_env__get_socket(struct perf_cpu_map *map, int idx, void *data)
-{
- return perf_env__get_socket_aggr_by_cpu(perf_env__get_cpu(data, map, idx), data);
-}
-
static struct aggr_cpu_id perf_env__get_die_aggr_by_cpu(int cpu, void *data)
{
struct perf_env *env = data;
@@ -1503,11 +1498,6 @@ static struct aggr_cpu_id perf_env__get_die_aggr_by_cpu(int cpu, void *data)
return id;
}

-static struct aggr_cpu_id perf_env__get_die(struct perf_cpu_map *map, int idx, void *data)
-{
- return perf_env__get_die_aggr_by_cpu(perf_env__get_cpu(data, map, idx), data);
-}
-
static struct aggr_cpu_id perf_env__get_core_aggr_by_cpu(int cpu, void *data)
{
struct perf_env *env = data;
@@ -1527,11 +1517,6 @@ static struct aggr_cpu_id perf_env__get_core_aggr_by_cpu(int cpu, void *data)
return id;
}

-static struct aggr_cpu_id perf_env__get_core(struct perf_cpu_map *map, int idx, void *data)
-{
- return perf_env__get_core_aggr_by_cpu(perf_env__get_cpu(data, map, idx), data);
-}
-
static struct aggr_cpu_id perf_env__get_node_aggr_by_cpu(int cpu, void *data)
{
struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
@@ -1540,33 +1525,28 @@ static struct aggr_cpu_id perf_env__get_node_aggr_by_cpu(int cpu, void *data)
return id;
}

-static struct aggr_cpu_id perf_env__get_node(struct perf_cpu_map *map, int idx, void *data)
-{
- return perf_env__get_node_aggr_by_cpu(perf_env__get_cpu(data, map, idx), data);
-}
-
static int perf_env__build_socket_map(struct perf_env *env, struct perf_cpu_map *cpus,
struct cpu_aggr_map **sockp)
{
- return cpu_map__build_map(cpus, sockp, perf_env__get_socket, env);
+ return cpu_map__build_map(cpus, sockp, perf_env__get_socket_aggr_by_cpu, env);
}

static int perf_env__build_die_map(struct perf_env *env, struct perf_cpu_map *cpus,
struct cpu_aggr_map **diep)
{
- return cpu_map__build_map(cpus, diep, perf_env__get_die, env);
+ return cpu_map__build_map(cpus, diep, perf_env__get_die_aggr_by_cpu, env);
}

static int perf_env__build_core_map(struct perf_env *env, struct perf_cpu_map *cpus,
struct cpu_aggr_map **corep)
{
- return cpu_map__build_map(cpus, corep, perf_env__get_core, env);
+ return cpu_map__build_map(cpus, corep, perf_env__get_core_aggr_by_cpu, env);
}

static int perf_env__build_node_map(struct perf_env *env, struct perf_cpu_map *cpus,
struct cpu_aggr_map **nodep)
{
- return cpu_map__build_map(cpus, nodep, perf_env__get_node, env);
+ return cpu_map__build_map(cpus, nodep, perf_env__get_node_aggr_by_cpu, env);
}

static struct aggr_cpu_id perf_stat__get_socket_file(struct perf_stat_config *config __maybe_unused,
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 49fba2c53822..feaf34b25efc 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -163,7 +163,7 @@ static int cmp_aggr_cpu_id(const void *a_pointer, const void *b_pointer)
}

int cpu_map__build_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **res,
- struct aggr_cpu_id (*f)(struct perf_cpu_map *map, int cpu, void *data),
+ struct aggr_cpu_id (*f)(int cpu, void *data),
void *data)
{
int nr = cpus->nr;
@@ -178,7 +178,7 @@ int cpu_map__build_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **res,
c->nr = 0;

for (cpu = 0; cpu < nr; cpu++) {
- s1 = f(cpus, cpu, data);
+ s1 = f(cpu, data);
for (s2 = 0; s2 < c->nr; s2++) {
if (cpu_map__compare_aggr_cpu_id(s1, c->map[s2]))
break;
@@ -290,22 +290,22 @@ struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *da

int cpu_map__build_socket_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **sockp)
{
- return cpu_map__build_map(cpus, sockp, cpu_map__get_socket, NULL);
+ return cpu_map__build_map(cpus, sockp, cpu_map__get_socket_aggr_by_cpu, NULL);
}

int cpu_map__build_die_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **diep)
{
- return cpu_map__build_map(cpus, diep, cpu_map__get_die, NULL);
+ return cpu_map__build_map(cpus, diep, cpu_map__get_die_aggr_by_cpu, NULL);
}

int cpu_map__build_core_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **corep)
{
- return cpu_map__build_map(cpus, corep, cpu_map__get_core, NULL);
+ return cpu_map__build_map(cpus, corep, cpu_map__get_core_aggr_by_cpu, NULL);
}

int cpu_map__build_node_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **numap)
{
- return cpu_map__build_map(cpus, numap, cpu_map__get_node, NULL);
+ return cpu_map__build_map(cpus, numap, cpu_map__get_node_aggr_by_cpu, NULL);
}

/* setup simple routines to easily access node numbers given a cpu number */
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index c62d67704425..9648816c4255 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -63,7 +63,7 @@ int cpu__max_present_cpu(void);
int cpu__get_node(int cpu);

int cpu_map__build_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **res,
- struct aggr_cpu_id (*f)(struct perf_cpu_map *map, int cpu, void *data),
+ struct aggr_cpu_id (*f)(int cpu, void *data),
void *data);

int cpu_map__cpu(struct perf_cpu_map *cpus, int idx);
--
2.34.1.400.ga245620fadb-goog
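
A map builder driven by a callback of the new (cpu, data) shape might look
like the sketch below. It is a simplified, hypothetical version: ids are
plain ints rather than struct aggr_cpu_id, and the output is a caller
supplied array. The sketch reads the CPU number from map[idx] before
invoking the callback; that lookup is a choice made here so that the
callback always receives a CPU number, and it is not a quote of the hunk
above.

```c
#include <assert.h>
#include <stddef.h>

struct b_cpu_map {
	int nr;
	const int *map;
};

/* Callback in the style of the new signature: CPU number plus opaque data. */
typedef int (*b_get_id_t)(int cpu, void *data);

/* Example callback; data points at an assumed CPUs-per-socket value. */
static int b_socket_of_cpu(int cpu, void *data)
{
	int cpus_per_socket = *(int *)data;

	return cpu / cpus_per_socket;
}

/*
 * Collect the distinct ids produced by f for every CPU in the map.
 * out must have room for cpus->nr entries; returns how many were written.
 */
static int b_build_map(const struct b_cpu_map *cpus, int *out,
		       b_get_id_t f, void *data)
{
	int n = 0;

	assert(cpus != NULL && out != NULL && f != NULL);
	for (int idx = 0; idx < cpus->nr; idx++) {
		int id = f(cpus->map[idx], data);
		int j;

		/* Skip ids that were already recorded. */
		for (j = 0; j < n; j++) {
			if (out[j] == id)
				break;
		}
		if (j == n)
			out[n++] = id;
	}
	return n;
}
```

For a map containing only CPUs 18 and 19 and 18 CPUs per socket, the
builder records a single id, socket 1. Passing the loop index to the
callback instead would have produced socket 0.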


2021-12-08 02:46:32

by Ian Rogers

Subject: [PATCH 06/22] perf cpumap: Remove map+index get_socket

Migrate the final users to the appropriate cpu variant.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/tests/topology.c | 2 +-
tools/perf/util/cpumap.c | 9 ---------
tools/perf/util/cpumap.h | 1 -
tools/perf/util/stat.c | 2 +-
4 files changed, 2 insertions(+), 12 deletions(-)

diff --git a/tools/perf/tests/topology.c b/tools/perf/tests/topology.c
index 869986139146..69a64074b897 100644
--- a/tools/perf/tests/topology.c
+++ b/tools/perf/tests/topology.c
@@ -150,7 +150,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)

// Test that socket ID contains only socket
for (i = 0; i < map->nr; i++) {
- id = cpu_map__get_socket(map, i, NULL);
+ id = cpu_map__get_socket_aggr_by_cpu(perf_cpu_map__cpu(map, i), NULL);
TEST_ASSERT_VAL("Socket map - Socket ID doesn't match",
session->header.env.cpu[map->map[i]].socket_id == id.socket);

diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index feaf34b25efc..342a5eaee9d3 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -136,15 +136,6 @@ struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data __maybe_u
return id;
}

-struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx,
- void *data)
-{
- if (idx < 0 || idx > map->nr)
- return cpu_map__empty_aggr_cpu_id();
-
- return cpu_map__get_socket_aggr_by_cpu(map->map[idx], data);
-}
-
static int cmp_aggr_cpu_id(const void *a_pointer, const void *b_pointer)
{
struct aggr_cpu_id *a = (struct aggr_cpu_id *)a_pointer;
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index 9648816c4255..a53af24301d2 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -32,7 +32,6 @@ size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size);
size_t cpu_map__fprintf(struct perf_cpu_map *map, FILE *fp);
int cpu_map__get_socket_id(int cpu);
struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data);
-struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx, void *data);
int cpu_map__get_die_id(int cpu);
struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data);
struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *data);
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 09ea334586f2..9eca1111fa52 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -328,7 +328,7 @@ static int check_per_pkg(struct evsel *counter,
if (!(vals->run && vals->ena))
return 0;

- s = cpu_map__get_socket(cpus, cpu, NULL).socket;
+ s = cpu_map__get_socket_id(cpu);
if (s < 0)
return -1;

--
2.34.1.400.ga245620fadb-goog


2021-12-08 02:46:34

by Ian Rogers

Subject: [PATCH 07/22] perf cpumap: Remove map+index get_die

Migrate the final users to the appropriate cpu variant.

Signed-off-by: Ian Rogers <[email protected]>
---
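Note, not part of the commit message: the wrapper removed here rejects an
index only when `idx > map->nr`, so `idx == map->nr` would read one element
past the end of the array; the node wrapper removed in patch 09 uses `>=`.
Below is a minimal sketch of the bounds check a map+index lookup needs. The
`demo_cpu_map` type and `demo_cpu_map__cpu` helper are hypothetical stand-ins
for illustration and are not the libperf definitions.

```c
#include <stddef.h>

/* Hypothetical stand-in for a CPU map: nr entries, each a CPU number. */
struct demo_cpu_map {
	int nr;
	const int *map;
};

/*
 * Return the CPU number stored at idx, or -1 when idx is out of range.
 * Valid indices are 0 .. nr - 1, so idx == nr must be rejected as well.
 */
static int demo_cpu_map__cpu(const struct demo_cpu_map *cpus, int idx)
{
	if (!cpus || idx < 0 || idx >= cpus->nr)
		return -1;
	return cpus->map[idx];
}
```

With the index resolved to a CPU number once, in one checked place, the
topology helpers only ever need the CPU number, which is the direction this
series takes.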
tools/perf/tests/topology.c | 2 +-
tools/perf/util/cpumap.c | 9 ---------
tools/perf/util/cpumap.h | 1 -
tools/perf/util/stat.c | 2 +-
4 files changed, 2 insertions(+), 12 deletions(-)

diff --git a/tools/perf/tests/topology.c b/tools/perf/tests/topology.c
index 69a64074b897..ce085b6f379b 100644
--- a/tools/perf/tests/topology.c
+++ b/tools/perf/tests/topology.c
@@ -136,7 +136,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)

// Test that die ID contains socket and die
for (i = 0; i < map->nr; i++) {
- id = cpu_map__get_die(map, i, NULL);
+ id = cpu_map__get_die_aggr_by_cpu(perf_cpu_map__cpu(map, i), NULL);
TEST_ASSERT_VAL("Die map - Socket ID doesn't match",
session->header.env.cpu[map->map[i]].socket_id == id.socket);

diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 342a5eaee9d3..ff91c32da688 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -216,15 +216,6 @@ struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data)
return id;
}

-struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx,
- void *data)
-{
- if (idx < 0 || idx > map->nr)
- return cpu_map__empty_aggr_cpu_id();
-
- return cpu_map__get_die_aggr_by_cpu(map->map[idx], data);
-}
-
int cpu_map__get_core_id(int cpu)
{
int value, ret = cpu__get_topology_int(cpu, "core_id", &value);
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index a53af24301d2..365ed69699e1 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -34,7 +34,6 @@ int cpu_map__get_socket_id(int cpu);
struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data);
int cpu_map__get_die_id(int cpu);
struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data);
-struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *data);
int cpu_map__get_core_id(int cpu);
struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data);
struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *data);
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 9eca1111fa52..5ed99bcfe91e 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -336,7 +336,7 @@ static int check_per_pkg(struct evsel *counter,
* On multi-die system, die_id > 0. On no-die system, die_id = 0.
* We use hashmap(socket, die) to check the used socket+die pair.
*/
- d = cpu_map__get_die(cpus, cpu, NULL).die;
+ d = cpu_map__get_die_id(cpu);
if (d < 0)
return -1;

--
2.34.1.400.ga245620fadb-goog


2021-12-08 02:46:39

by Ian Rogers

Subject: [PATCH 08/22] perf cpumap: Remove map+index get_core

Migrate the final users to the appropriate cpu variant.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/tests/topology.c | 2 +-
tools/perf/util/cpumap.c | 8 --------
tools/perf/util/cpumap.h | 1 -
3 files changed, 1 insertion(+), 10 deletions(-)

diff --git a/tools/perf/tests/topology.c b/tools/perf/tests/topology.c
index ce085b6f379b..9a671670415a 100644
--- a/tools/perf/tests/topology.c
+++ b/tools/perf/tests/topology.c
@@ -121,7 +121,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)

// Test that core ID contains socket, die and core
for (i = 0; i < map->nr; i++) {
- id = cpu_map__get_core(map, i, NULL);
+ id = cpu_map__get_core_aggr_by_cpu(perf_cpu_map__cpu(map, i), NULL);
TEST_ASSERT_VAL("Core map - Core ID doesn't match",
session->header.env.cpu[map->map[i]].core_id == id.core);

diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index ff91c32da688..e8149bcf8bfa 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -246,14 +246,6 @@ struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data)

}

-struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *data)
-{
- if (idx < 0 || idx > map->nr)
- return cpu_map__empty_aggr_cpu_id();
-
- return cpu_map__get_core_aggr_by_cpu(map->map[idx], data);
-}
-
struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data __maybe_unused)
{
struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index 365ed69699e1..7e1829468bd6 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -36,7 +36,6 @@ int cpu_map__get_die_id(int cpu);
struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data);
int cpu_map__get_core_id(int cpu);
struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data);
-struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *data);
int cpu_map__get_node_id(int cpu);
struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data);
struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data);
--
2.34.1.400.ga245620fadb-goog


2021-12-08 02:46:44

by Ian Rogers

Subject: [PATCH 09/22] perf cpumap: Remove map+index get_node

Migrate the final users to the appropriate cpu variant.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/tests/topology.c | 2 +-
tools/perf/util/cpumap.c | 8 --------
tools/perf/util/cpumap.h | 1 -
3 files changed, 1 insertion(+), 10 deletions(-)

diff --git a/tools/perf/tests/topology.c b/tools/perf/tests/topology.c
index 9a671670415a..5992b323c4f5 100644
--- a/tools/perf/tests/topology.c
+++ b/tools/perf/tests/topology.c
@@ -162,7 +162,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)

// Test that node ID contains only node
for (i = 0; i < map->nr; i++) {
- id = cpu_map__get_node(map, i, NULL);
+ id = cpu_map__get_node_aggr_by_cpu(perf_cpu_map__cpu(map, i), NULL);
TEST_ASSERT_VAL("Node map - Node ID doesn't match",
cpu__get_node(map->map[i]) == id.node);
TEST_ASSERT_VAL("Node map - Socket is set", id.socket == -1);
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index e8149bcf8bfa..f67b2e7aac13 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -254,14 +254,6 @@ struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data __maybe_unu
return id;
}

-struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data)
-{
- if (idx < 0 || idx >= map->nr)
- return cpu_map__empty_aggr_cpu_id();
-
- return cpu_map__get_node_aggr_by_cpu(map->map[idx], data);
-}
-
int cpu_map__build_socket_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **sockp)
{
return cpu_map__build_map(cpus, sockp, cpu_map__get_socket_aggr_by_cpu, NULL);
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index 7e1829468bd6..f0121dd4fdcb 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -38,7 +38,6 @@ int cpu_map__get_core_id(int cpu);
struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data);
int cpu_map__get_node_id(int cpu);
struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data);
-struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data);
int cpu_map__build_socket_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **sockp);
int cpu_map__build_die_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **diep);
int cpu_map__build_core_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **corep);
--
2.34.1.400.ga245620fadb-goog


2021-12-08 02:46:47

by Ian Rogers

Subject: [PATCH 10/22] perf cpumap: Add comments to aggr_cpu_id

This code is already tested in topology.c.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/util/cpumap.h | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index f0121dd4fdcb..edd93e1db36a 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -7,11 +7,20 @@
#include <internal/cpumap.h>
#include <perf/cpumap.h>

+/** Identify where counts are aggregated, -1 implies not to aggregate. */
struct aggr_cpu_id {
+ /** A value in the range 0 to number of threads. */
int thread;
+ /** The numa node X as read from /sys/devices/system/node/nodeX. */
int node;
+ /**
+ * The socket number as read from
+ * /sys/devices/system/cpu/cpuX/topology/physical_package_id.
+ */
int socket;
+ /** The die id as read from /sys/devices/system/cpu/cpuX/topology/die_id. */
int die;
+ /** The core id as read from /sys/devices/system/cpu/cpuX/topology/core_id. */
int core;
};

--
2.34.1.400.ga245620fadb-goog


2021-12-08 02:46:50

by Ian Rogers

Subject: [PATCH 11/22] perf cpumap: Remove unused cpu_map__socket

The function is unused, so remove it.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/util/cpumap.h | 7 -------
1 file changed, 7 deletions(-)

diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index edd93e1db36a..22e53fd54657 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -53,13 +53,6 @@ int cpu_map__build_core_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **cor
int cpu_map__build_node_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **nodep);
const struct perf_cpu_map *cpu_map__online(void); /* thread unsafe */

-static inline int cpu_map__socket(struct perf_cpu_map *sock, int s)
-{
- if (!sock || s > sock->nr || s < 0)
- return 0;
- return sock->map[s];
-}
-
int cpu__setup_cpunode_map(void);

int cpu__max_node(void);
--
2.34.1.400.ga245620fadb-goog


2021-12-08 02:46:53

by Ian Rogers

Subject: [PATCH 12/22] perf cpumap: Simplify equal function name.

Rename cpu_map__compare_aggr_cpu_id to aggr_cpu_id__equal. The cpu_map
part of the name is misleading, and "equal" describes the function
better than "compare".
Pass the struct by const pointer rather than by value, given the number
of fields in aggr_cpu_id.

Signed-off-by: Ian Rogers <[email protected]>
---
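Note, not part of the commit message: a minimal standalone sketch of the
pointer-based comparison, for readers without the tree at hand. The struct
fields and `aggr_cpu_id__equal` follow the diff below; `demo_find_aggr_id` is
a hypothetical helper added only to show a typical caller.

```c
#include <stdbool.h>

/* Same fields as struct aggr_cpu_id in cpumap.h; -1 means "not set". */
struct aggr_cpu_id {
	int thread;
	int node;
	int socket;
	int die;
	int core;
};

/* Field-by-field equality; pointers avoid copying five ints per argument. */
static bool aggr_cpu_id__equal(const struct aggr_cpu_id *a,
			       const struct aggr_cpu_id *b)
{
	return a->thread == b->thread &&
	       a->node == b->node &&
	       a->socket == b->socket &&
	       a->die == b->die &&
	       a->core == b->core;
}

/* Hypothetical helper: index of the first entry equal to id, or -1. */
static int demo_find_aggr_id(const struct aggr_cpu_id *ids, int nr,
			     const struct aggr_cpu_id *id)
{
	int i;

	for (i = 0; i < nr; i++) {
		if (aggr_cpu_id__equal(&ids[i], id))
			return i;
	}
	return -1;
}
```

Callers that get an id back by value, such as the aggr_get_id callback, need
a named local before they can take its address, which is why
first_shadow_cpu gains a `cpu_id` variable in this patch.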
tools/perf/util/cpumap.c | 14 +++++++-------
tools/perf/util/cpumap.h | 2 +-
tools/perf/util/stat-display.c | 18 ++++++++++--------
3 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index f67b2e7aac13..8fa00a6221c8 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -171,7 +171,7 @@ int cpu_map__build_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **res,
for (cpu = 0; cpu < nr; cpu++) {
s1 = f(cpu, data);
for (s2 = 0; s2 < c->nr; s2++) {
- if (cpu_map__compare_aggr_cpu_id(s1, c->map[s2]))
+ if (aggr_cpu_id__equal(&s1, &c->map[s2]))
break;
}
if (s2 == c->nr) {
@@ -593,13 +593,13 @@ const struct perf_cpu_map *cpu_map__online(void) /* thread unsafe */
return online;
}

-bool cpu_map__compare_aggr_cpu_id(struct aggr_cpu_id a, struct aggr_cpu_id b)
+bool aggr_cpu_id__equal(const struct aggr_cpu_id *a, const struct aggr_cpu_id *b)
{
- return a.thread == b.thread &&
- a.node == b.node &&
- a.socket == b.socket &&
- a.die == b.die &&
- a.core == b.core;
+ return a->thread == b->thread &&
+ a->node == b->node &&
+ a->socket == b->socket &&
+ a->die == b->die &&
+ a->core == b->core;
}

bool cpu_map__aggr_cpu_id_is_empty(struct aggr_cpu_id a)
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index 22e53fd54657..652b76c69376 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -67,7 +67,7 @@ int cpu_map__build_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **res,
int cpu_map__cpu(struct perf_cpu_map *cpus, int idx);
bool cpu_map__has(struct perf_cpu_map *cpus, int cpu);

-bool cpu_map__compare_aggr_cpu_id(struct aggr_cpu_id a, struct aggr_cpu_id b);
+bool aggr_cpu_id__equal(const struct aggr_cpu_id *a, const struct aggr_cpu_id *b);
bool cpu_map__aggr_cpu_id_is_empty(struct aggr_cpu_id a);
struct aggr_cpu_id cpu_map__empty_aggr_cpu_id(void);

diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 6c40b91d5e32..0241436bb1fb 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -328,20 +328,22 @@ static void print_metric_header(struct perf_stat_config *config,
}

static int first_shadow_cpu(struct perf_stat_config *config,
- struct evsel *evsel, struct aggr_cpu_id id)
+ struct evsel *evsel, const struct aggr_cpu_id *id)
{
struct perf_cpu_map *cpus;
int cpu, idx;

if (config->aggr_mode == AGGR_NONE)
- return id.core;
+ return id->core;

if (!config->aggr_get_id)
return 0;

cpus = evsel__cpus(evsel);
perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
- if (cpu_map__compare_aggr_cpu_id(config->aggr_get_id(config, cpu), id))
+ struct aggr_cpu_id cpu_id = config->aggr_get_id(config, cpu);
+
+ if (aggr_cpu_id__equal(&cpu_id, id))
return cpu;
}
return 0;
@@ -501,7 +503,7 @@ static void printout(struct perf_stat_config *config, struct aggr_cpu_id id, int
}

perf_stat__print_shadow_stats(config, counter, uval,
- first_shadow_cpu(config, counter, id),
+ first_shadow_cpu(config, counter, &id),
&out, &config->metric_events, st);
if (!config->csv_output && !config->metric_only) {
print_noise(config, counter, noise);
@@ -525,12 +527,12 @@ static void aggr_update_shadow(struct perf_stat_config *config,
val = 0;
perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
s2 = config->aggr_get_id(config, cpu);
- if (!cpu_map__compare_aggr_cpu_id(s2, id))
+ if (!aggr_cpu_id__equal(&s2, &id))
continue;
val += perf_counts(counter->counts, idx, 0)->val;
}
perf_stat__update_shadow_stats(counter, val,
- first_shadow_cpu(config, counter, id),
+ first_shadow_cpu(config, counter, &id),
&rt_stat);
}
}
@@ -641,7 +643,7 @@ static void aggr_cb(struct perf_stat_config *config,
struct perf_counts_values *counts;

s2 = config->aggr_get_id(config, cpu);
- if (!cpu_map__compare_aggr_cpu_id(s2, ad->id))
+ if (!aggr_cpu_id__equal(&s2, &ad->id))
continue;
if (first)
ad->nr++;
@@ -1217,7 +1219,7 @@ static void print_percore_thread(struct perf_stat_config *config,
s2 = config->aggr_get_id(config, cpu);
for (s = 0; s < config->aggr_map->nr; s++) {
id = config->aggr_map->map[s];
- if (cpu_map__compare_aggr_cpu_id(s2, id))
+ if (aggr_cpu_id__equal(&s2, &id))
break;
}

--
2.34.1.400.ga245620fadb-goog


2021-12-08 02:46:54

by Ian Rogers

Subject: [PATCH 13/22] perf cpumap: Rename empty functions.

Remove cpu_map from the function names, as no cpu_map is used. Pass the
id by const pointer rather than by value to avoid unnecessary copying.

Signed-off-by: Ian Rogers <[email protected]>
---
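Note, not part of the commit message: a minimal standalone sketch of the
renamed pair, assuming the five-field struct from cpumap.h. It shows the
pattern used throughout the diff below: start from an all -1 id, then set
only the fields the aggregation level cares about.

```c
#include <stdbool.h>

/* Same fields as struct aggr_cpu_id in cpumap.h; -1 means "not set". */
struct aggr_cpu_id {
	int thread;
	int node;
	int socket;
	int die;
	int core;
};

/* An id with every field unset. */
static struct aggr_cpu_id aggr_cpu_id__empty(void)
{
	struct aggr_cpu_id ret = {
		.thread = -1,
		.node = -1,
		.socket = -1,
		.die = -1,
		.core = -1,
	};
	return ret;
}

/* True only if every field is still -1. */
static bool aggr_cpu_id__is_empty(const struct aggr_cpu_id *a)
{
	return a->thread == -1 &&
	       a->node == -1 &&
	       a->socket == -1 &&
	       a->die == -1 &&
	       a->core == -1;
}
```

Setting a single field, for example `id.core = cpu`, is enough to make the
id non-empty.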
tools/perf/builtin-stat.c | 12 ++++++------
tools/perf/util/cpumap.c | 24 ++++++++++++------------
tools/perf/util/cpumap.h | 4 ++--
tools/perf/util/stat-display.c | 10 +++++-----
4 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 79a435573b44..a3575b27015b 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1325,9 +1325,9 @@ static struct aggr_cpu_id perf_stat__get_node(struct perf_stat_config *config __
static struct aggr_cpu_id perf_stat__get_aggr(struct perf_stat_config *config,
aggr_get_id_t get_id, int cpu)
{
- struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
+ struct aggr_cpu_id id = aggr_cpu_id__empty();

- if (cpu_map__aggr_cpu_id_is_empty(config->cpus_aggr_map->map[cpu]))
+ if (aggr_cpu_id__is_empty(&config->cpus_aggr_map->map[cpu]))
config->cpus_aggr_map->map[cpu] = get_id(config, cpu);

id = config->cpus_aggr_map->map[cpu];
@@ -1472,7 +1472,7 @@ static inline int perf_env__get_cpu(void *data, struct perf_cpu_map *map, int id
static struct aggr_cpu_id perf_env__get_socket_aggr_by_cpu(int cpu, void *data)
{
struct perf_env *env = data;
- struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
+ struct aggr_cpu_id id = aggr_cpu_id__empty();

if (cpu != -1)
id.socket = env->cpu[cpu].socket_id;
@@ -1483,7 +1483,7 @@ static struct aggr_cpu_id perf_env__get_socket_aggr_by_cpu(int cpu, void *data)
static struct aggr_cpu_id perf_env__get_die_aggr_by_cpu(int cpu, void *data)
{
struct perf_env *env = data;
- struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
+ struct aggr_cpu_id id = aggr_cpu_id__empty();

if (cpu != -1) {
/*
@@ -1501,7 +1501,7 @@ static struct aggr_cpu_id perf_env__get_die_aggr_by_cpu(int cpu, void *data)
static struct aggr_cpu_id perf_env__get_core_aggr_by_cpu(int cpu, void *data)
{
struct perf_env *env = data;
- struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
+ struct aggr_cpu_id id = aggr_cpu_id__empty();

if (cpu != -1) {
/*
@@ -1519,7 +1519,7 @@ static struct aggr_cpu_id perf_env__get_core_aggr_by_cpu(int cpu, void *data)

static struct aggr_cpu_id perf_env__get_node_aggr_by_cpu(int cpu, void *data)
{
- struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
+ struct aggr_cpu_id id = aggr_cpu_id__empty();

id.node = perf_env__numa_node(data, cpu);
return id;
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 8fa00a6221c8..b3e1304aca0c 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -104,7 +104,7 @@ struct cpu_aggr_map *cpu_aggr_map__empty_new(int nr)

cpus->nr = nr;
for (i = 0; i < nr; i++)
- cpus->map[i] = cpu_map__empty_aggr_cpu_id();
+ cpus->map[i] = aggr_cpu_id__empty();

refcount_set(&cpus->refcnt, 1);
}
@@ -130,7 +130,7 @@ int cpu_map__get_socket_id(int cpu)

struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data __maybe_unused)
{
- struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
+ struct aggr_cpu_id id = aggr_cpu_id__empty();

id.socket = cpu_map__get_socket_id(cpu);
return id;
@@ -209,7 +209,7 @@ struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data)
* make a unique ID.
*/
id = cpu_map__get_socket_aggr_by_cpu(cpu, data);
- if (cpu_map__aggr_cpu_id_is_empty(id))
+ if (aggr_cpu_id__is_empty(&id))
return id;

id.die = die;
@@ -234,7 +234,7 @@ struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data)

/* cpu_map__get_die returns a struct with socket and die set*/
id = cpu_map__get_die_aggr_by_cpu(cpu, data);
- if (cpu_map__aggr_cpu_id_is_empty(id))
+ if (aggr_cpu_id__is_empty(&id))
return id;

/*
@@ -248,7 +248,7 @@ struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data)

struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data __maybe_unused)
{
- struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
+ struct aggr_cpu_id id = aggr_cpu_id__empty();

id.node = cpu_map__get_node_id(cpu);
return id;
@@ -602,16 +602,16 @@ bool aggr_cpu_id__equal(const struct aggr_cpu_id *a, const struct aggr_cpu_id *b
a->core == b->core;
}

-bool cpu_map__aggr_cpu_id_is_empty(struct aggr_cpu_id a)
+bool aggr_cpu_id__is_empty(const struct aggr_cpu_id *a)
{
- return a.thread == -1 &&
- a.node == -1 &&
- a.socket == -1 &&
- a.die == -1 &&
- a.core == -1;
+ return a->thread == -1 &&
+ a->node == -1 &&
+ a->socket == -1 &&
+ a->die == -1 &&
+ a->core == -1;
}

-struct aggr_cpu_id cpu_map__empty_aggr_cpu_id(void)
+struct aggr_cpu_id aggr_cpu_id__empty(void)
{
struct aggr_cpu_id ret = {
.thread = -1,
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index 652b76c69376..9589b0001a28 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -68,7 +68,7 @@ int cpu_map__cpu(struct perf_cpu_map *cpus, int idx);
bool cpu_map__has(struct perf_cpu_map *cpus, int cpu);

bool aggr_cpu_id__equal(const struct aggr_cpu_id *a, const struct aggr_cpu_id *b);
-bool cpu_map__aggr_cpu_id_is_empty(struct aggr_cpu_id a);
-struct aggr_cpu_id cpu_map__empty_aggr_cpu_id(void);
+bool aggr_cpu_id__is_empty(const struct aggr_cpu_id *a);
+struct aggr_cpu_id aggr_cpu_id__empty(void);

#endif /* __PERF_CPUMAP_H */
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 0241436bb1fb..870b1db71fbc 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -698,7 +698,7 @@ static void print_counter_aggrdata(struct perf_stat_config *config,

uval = val * counter->scale;
if (cpu != -1) {
- id = cpu_map__empty_aggr_cpu_id();
+ id = aggr_cpu_id__empty();
id.core = cpu;
}
printout(config, id, nr, counter, uval,
@@ -780,7 +780,7 @@ static struct perf_aggr_thread_value *sort_aggr_thread(
continue;

buf[i].counter = counter;
- buf[i].id = cpu_map__empty_aggr_cpu_id();
+ buf[i].id = aggr_cpu_id__empty();
buf[i].id.thread = thread;
buf[i].uval = uval;
buf[i].val = val;
@@ -868,7 +868,7 @@ static void print_counter_aggr(struct perf_stat_config *config,
fprintf(output, "%s", prefix);

uval = cd.avg * counter->scale;
- printout(config, cpu_map__empty_aggr_cpu_id(), 0, counter, uval, prefix, cd.avg_running,
+ printout(config, aggr_cpu_id__empty(), 0, counter, uval, prefix, cd.avg_running,
cd.avg_enabled, cd.avg, &rt_stat);
if (!metric_only)
fprintf(output, "\n");
@@ -911,7 +911,7 @@ static void print_counter(struct perf_stat_config *config,
fprintf(output, "%s", prefix);

uval = val * counter->scale;
- id = cpu_map__empty_aggr_cpu_id();
+ id = aggr_cpu_id__empty();
id.core = cpu;
printout(config, id, 0, counter, uval, prefix,
run, ena, 1.0, &rt_stat);
@@ -938,7 +938,7 @@ static void print_no_aggr_metric(struct perf_stat_config *config,
if (prefix)
fputs(prefix, config->output);
evlist__for_each_entry(evlist, counter) {
- id = cpu_map__empty_aggr_cpu_id();
+ id = aggr_cpu_id__empty();
id.core = cpu;
if (first) {
aggr_printout(config, counter, id, 0);
--
2.34.1.400.ga245620fadb-goog


2021-12-08 02:46:58

by Ian Rogers

Subject: [PATCH 16/22] perf cpumap: Remove cpu_map__cpu, use libperf function.

Switch the remaining few users of cpu_map__cpu to perf_cpu_map__cpu and
remove the function.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/builtin-ftrace.c | 2 +-
tools/perf/util/cpumap.c | 9 ++-------
tools/perf/util/cpumap.h | 1 -
3 files changed, 3 insertions(+), 9 deletions(-)

diff --git a/tools/perf/builtin-ftrace.c b/tools/perf/builtin-ftrace.c
index 87cb11a7a3ee..5a65dc7a1580 100644
--- a/tools/perf/builtin-ftrace.c
+++ b/tools/perf/builtin-ftrace.c
@@ -303,7 +303,7 @@ static int set_tracing_cpumask(struct perf_cpu_map *cpumap)
int ret;
int last_cpu;

- last_cpu = cpu_map__cpu(cpumap, cpumap->nr - 1);
+ last_cpu = perf_cpu_map__cpu(cpumap, cpumap->nr - 1);
mask_size = last_cpu / 4 + 2; /* one more byte for EOS */
mask_size += last_cpu / 32; /* ',' is needed for every 32th cpus */

diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index e0d7f1da5858..32f9fc2dd389 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -485,11 +485,6 @@ bool cpu_map__has(struct perf_cpu_map *cpus, int cpu)
return perf_cpu_map__idx(cpus, cpu) != -1;
}

-int cpu_map__cpu(struct perf_cpu_map *cpus, int idx)
-{
- return cpus->map[idx];
-}
-
size_t cpu_map__snprint(struct perf_cpu_map *map, char *buf, size_t size)
{
int i, cpu, start = -1;
@@ -547,7 +542,7 @@ size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size)
int i, cpu;
char *ptr = buf;
unsigned char *bitmap;
- int last_cpu = cpu_map__cpu(map, map->nr - 1);
+ int last_cpu = perf_cpu_map__cpu(map, map->nr - 1);

if (buf == NULL)
return 0;
@@ -559,7 +554,7 @@ size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size)
}

for (i = 0; i < map->nr; i++) {
- cpu = cpu_map__cpu(map, i);
+ cpu = perf_cpu_map__cpu(map, i);
bitmap[cpu / 8] |= 1 << (cpu % 8);
}

diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index a053bf31a3f0..87545bcd461d 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -80,7 +80,6 @@ int cpu_map__build_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **res,
struct aggr_cpu_id (*f)(int cpu, void *data),
void *data);

-int cpu_map__cpu(struct perf_cpu_map *cpus, int idx);
bool cpu_map__has(struct perf_cpu_map *cpus, int cpu);

bool aggr_cpu_id__equal(const struct aggr_cpu_id *a, const struct aggr_cpu_id *b);
--
2.34.1.400.ga245620fadb-goog


2021-12-08 02:47:00

by Ian Rogers

Subject: [PATCH 15/22] perf cpumap: Remove map from function names that don't use a map.

Rename the functions to use the cpu__ prefix and document them, for
consistency.

Signed-off-by: Ian Rogers <[email protected]>
---
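Note, not part of the commit message: a rough standalone sketch of what the
renamed helpers do, assuming only the sysfs paths named in the new comments.
`demo_read_topology_int` and `demo_get_core_id` are hypothetical names; the
real code goes through perf's sysfs helpers and uses the GNU `?:` shorthand.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read one integer from
 * /sys/devices/system/cpu/cpu<cpu>/topology/<name>.
 * Returns 0 on success, -1 if the file is missing or unparsable.
 */
static int demo_read_topology_int(int cpu, const char *name, int *value)
{
	char path[256];
	FILE *fp;
	int ok;

	snprintf(path, sizeof(path),
		 "/sys/devices/system/cpu/cpu%d/topology/%s", cpu, name);
	fp = fopen(path, "r");
	if (!fp)
		return -1;
	ok = fscanf(fp, "%d", value) == 1;
	fclose(fp);
	return ok ? 0 : -1;
}

/* Same shape as the renamed helpers: the error if the read fails, else the id. */
static int demo_get_core_id(int cpu)
{
	int value;
	int ret = demo_read_topology_int(cpu, "core_id", &value);

	return ret ? ret : value;
}
```

Because these helpers take only a CPU number, callers such as check_per_pkg
no longer need a map in hand to look up a socket or die.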
tools/perf/util/cpumap.c | 12 ++++++------
tools/perf/util/cpumap.h | 19 ++++++++++++++++---
tools/perf/util/env.c | 6 +++---
tools/perf/util/stat.c | 4 ++--
4 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 1626b0991408..e0d7f1da5858 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -126,7 +126,7 @@ static int cpu__get_topology_int(int cpu, const char *name, int *value)
return sysfs__read_int(path, value);
}

-int cpu_map__get_socket_id(int cpu)
+int cpu__get_socket_id(int cpu)
{
int value, ret = cpu__get_topology_int(cpu, "physical_package_id", &value);
return ret ?: value;
@@ -136,7 +136,7 @@ struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data __maybe_u
{
struct aggr_cpu_id id = aggr_cpu_id__empty();

- id.socket = cpu_map__get_socket_id(cpu);
+ id.socket = cpu__get_socket_id(cpu);
return id;
}

@@ -190,7 +190,7 @@ int cpu_map__build_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **res,
return 0;
}

-int cpu_map__get_die_id(int cpu)
+int cpu__get_die_id(int cpu)
{
int value, ret = cpu__get_topology_int(cpu, "die_id", &value);

@@ -202,7 +202,7 @@ struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data)
struct aggr_cpu_id id;
int die;

- die = cpu_map__get_die_id(cpu);
+ die = cpu__get_die_id(cpu);
/* There is no die_id on legacy system. */
if (die == -1)
die = 0;
@@ -220,7 +220,7 @@ struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data)
return id;
}

-int cpu_map__get_core_id(int cpu)
+int cpu__get_core_id(int cpu)
{
int value, ret = cpu__get_topology_int(cpu, "core_id", &value);
return ret ?: value;
@@ -229,7 +229,7 @@ int cpu_map__get_core_id(int cpu)
struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data)
{
struct aggr_cpu_id id;
- int core = cpu_map__get_core_id(cpu);
+ int core = cpu__get_core_id(cpu);

/* cpu_map__get_die returns a struct with socket and die set*/
id = cpu_map__get_die_aggr_by_cpu(cpu, data);
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index f849f01c5860..a053bf31a3f0 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -39,11 +39,8 @@ struct perf_cpu_map *cpu_map__new_data(struct perf_record_cpu_map_data *data);
size_t cpu_map__snprint(struct perf_cpu_map *map, char *buf, size_t size);
size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size);
size_t cpu_map__fprintf(struct perf_cpu_map *map, FILE *fp);
-int cpu_map__get_socket_id(int cpu);
struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data);
-int cpu_map__get_die_id(int cpu);
struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data);
-int cpu_map__get_core_id(int cpu);
struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data);
struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data);
int cpu_map__build_socket_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **sockp);
@@ -62,6 +59,22 @@ int cpu__max_present_cpu(void);
* /sys/devices/system/node/nodeX for the given CPU.
*/
int cpu__get_node(int cpu);
+/**
+ * cpu__get_socket_id - Returns the socket number as read from
+ * /sys/devices/system/cpu/cpuX/topology/physical_package_id for the given CPU.
+ */
+int cpu__get_socket_id(int cpu);
+/**
+ * cpu__get_die_id - Returns the die id as read from
+ * /sys/devices/system/cpu/cpuX/topology/die_id for the given CPU.
+ */
+int cpu__get_die_id(int cpu);
+/**
+ * cpu__get_core_id - Returns the core id as read from
+ * /sys/devices/system/cpu/cpuX/topology/core_id for the given CPU.
+ */
+int cpu__get_core_id(int cpu);
+

int cpu_map__build_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **res,
struct aggr_cpu_id (*f)(int cpu, void *data),
diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index b9904896eb97..fd12c0dcaefb 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -302,9 +302,9 @@ int perf_env__read_cpu_topology_map(struct perf_env *env)
return -ENOMEM;

for (cpu = 0; cpu < nr_cpus; ++cpu) {
- env->cpu[cpu].core_id = cpu_map__get_core_id(cpu);
- env->cpu[cpu].socket_id = cpu_map__get_socket_id(cpu);
- env->cpu[cpu].die_id = cpu_map__get_die_id(cpu);
+ env->cpu[cpu].core_id = cpu__get_core_id(cpu);
+ env->cpu[cpu].socket_id = cpu__get_socket_id(cpu);
+ env->cpu[cpu].die_id = cpu__get_die_id(cpu);
}

env->nr_cpus_avail = nr_cpus;
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 5ed99bcfe91e..5c24aca0968c 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -328,7 +328,7 @@ static int check_per_pkg(struct evsel *counter,
if (!(vals->run && vals->ena))
return 0;

- s = cpu_map__get_socket_id(cpu);
+ s = cpu__get_socket_id(cpu);
if (s < 0)
return -1;

@@ -336,7 +336,7 @@ static int check_per_pkg(struct evsel *counter,
* On multi-die system, die_id > 0. On no-die system, die_id = 0.
* We use hashmap(socket, die) to check the used socket+die pair.
*/
- d = cpu_map__get_die_id(cpu);
+ d = cpu__get_die_id(cpu);
if (d < 0)
return -1;

--
2.34.1.400.ga245620fadb-goog


2021-12-08 02:47:03

by Ian Rogers

Subject: [PATCH 14/22] perf cpumap: Document cpu__get_node and remove redundant function

cpu_map__get_node_id isn't used externally and merely delegates to
cpu__get_node.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/util/cpumap.c | 11 +++++------
tools/perf/util/cpumap.h | 5 ++++-
2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index b3e1304aca0c..1626b0991408 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -16,6 +16,10 @@
static int max_cpu_num;
static int max_present_cpu_num;
static int max_node_num;
+/**
+ * The numa node X as read from /sys/devices/system/node/nodeX indexed by the
+ * CPU number.
+ */
static int *cpunode_map;

static struct perf_cpu_map *cpu_map__from_entries(struct cpu_map_entries *cpus)
@@ -222,11 +226,6 @@ int cpu_map__get_core_id(int cpu)
return ret ?: value;
}

-int cpu_map__get_node_id(int cpu)
-{
- return cpu__get_node(cpu);
-}
-
struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data)
{
struct aggr_cpu_id id;
@@ -250,7 +249,7 @@ struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data __maybe_unu
{
struct aggr_cpu_id id = aggr_cpu_id__empty();

- id.node = cpu_map__get_node_id(cpu);
+ id.node = cpu__get_node(cpu);
return id;
}

diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index 9589b0001a28..f849f01c5860 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -45,7 +45,6 @@ int cpu_map__get_die_id(int cpu);
struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data);
int cpu_map__get_core_id(int cpu);
struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data);
-int cpu_map__get_node_id(int cpu);
struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data);
int cpu_map__build_socket_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **sockp);
int cpu_map__build_die_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **diep);
@@ -58,6 +57,10 @@ int cpu__setup_cpunode_map(void);
int cpu__max_node(void);
int cpu__max_cpu(void);
int cpu__max_present_cpu(void);
+/**
+ * cpu__get_node - Returns the numa node X as read from
+ * /sys/devices/system/node/nodeX for the given CPU.
+ */
int cpu__get_node(int cpu);

int cpu_map__build_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **res,
--
2.34.1.400.ga245620fadb-goog


2021-12-08 02:47:05

by Ian Rogers

Subject: [PATCH 17/22] perf cpumap: Refactor cpu_map__build_map

Turn it into cpu_aggr_map__new, which is passed the helper function.
Refactor the builtin-stat calls to pass function pointers explicitly,
reducing some copy-pasted code.

Signed-off-by: Ian Rogers <[email protected]>
---
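Note, not part of the commit message: a simplified standalone sketch of the
idea behind building an aggregation map from a callback. Every name prefixed
with `demo_` is hypothetical, the four-CPUs-per-socket topology is made up,
and the real function also handles details (such as reference counting) that
are left out here.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Same fields as struct aggr_cpu_id in cpumap.h; -1 means "not set". */
struct aggr_cpu_id {
	int thread, node, socket, die, core;
};

/* Callback type: map a CPU number to the id it aggregates into. */
typedef struct aggr_cpu_id (*demo_get_id_t)(int cpu, void *data);

static bool demo_id_equal(const struct aggr_cpu_id *a,
			  const struct aggr_cpu_id *b)
{
	return a->thread == b->thread && a->node == b->node &&
	       a->socket == b->socket && a->die == b->die &&
	       a->core == b->core;
}

/*
 * Call get_id() for each CPU in cpus[0 .. nr_cpus - 1] and keep each
 * distinct id once. Returns a malloc'ed array (caller frees) and stores
 * the number of distinct ids in *nr_ids; returns NULL if allocation fails.
 */
static struct aggr_cpu_id *demo_aggr_ids__new(const int *cpus, int nr_cpus,
					       demo_get_id_t get_id, void *data,
					       int *nr_ids)
{
	struct aggr_cpu_id *ids = calloc(nr_cpus > 0 ? nr_cpus : 1, sizeof(*ids));
	int i, j, n = 0;

	if (!ids)
		return NULL;
	for (i = 0; i < nr_cpus; i++) {
		struct aggr_cpu_id id = get_id(cpus[i], data);

		for (j = 0; j < n; j++) {
			if (demo_id_equal(&id, &ids[j]))
				break;
		}
		if (j == n)
			ids[n++] = id;
	}
	*nr_ids = n;
	return ids;
}

/* Made-up topology for the demo: four CPUs per socket. */
static struct aggr_cpu_id demo_socket_of(int cpu, void *data)
{
	struct aggr_cpu_id id = { -1, -1, -1, -1, -1 };

	(void)data;
	id.socket = cpu / 4;
	return id;
}
```

Swapping `demo_socket_of` for a die, core or node callback changes the
aggregation level without touching the builder, which is the copy-paste
reduction this patch is after.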
tools/perf/builtin-stat.c | 188 ++++++++++++++++++++------------------
tools/perf/util/cpumap.c | 59 +++++-------
tools/perf/util/cpumap.h | 16 ++--
3 files changed, 131 insertions(+), 132 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index a3575b27015b..e318b41b67cc 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1298,6 +1298,17 @@ static struct option stat_options[] = {
OPT_END()
};

+static const char *const aggr_mode__string[] = {
+ [AGGR_CORE] = "core",
+ [AGGR_DIE] = "die",
+ [AGGR_GLOBAL] = "global",
+ [AGGR_NODE] = "node",
+ [AGGR_NONE] = "none",
+ [AGGR_SOCKET] = "socket",
+ [AGGR_THREAD] = "thread",
+ [AGGR_UNSET] = "unset",
+};
+
static struct aggr_cpu_id perf_stat__get_socket(struct perf_stat_config *config __maybe_unused,
int cpu)
{
@@ -1370,54 +1381,67 @@ static bool term_percore_set(void)
return false;
}

-static int perf_stat_init_aggr_mode(void)
+static aggr_cpu_id_get_t aggr_mode__get_aggr(enum aggr_mode aggr_mode)
{
- int nr;
+ switch (aggr_mode) {
+ case AGGR_SOCKET:
+ return cpu_map__get_socket_aggr_by_cpu;
+ case AGGR_DIE:
+ return cpu_map__get_die_aggr_by_cpu;
+ case AGGR_CORE:
+ return cpu_map__get_core_aggr_by_cpu;
+ case AGGR_NODE:
+ return cpu_map__get_node_aggr_by_cpu;
+ case AGGR_NONE:
+ if (term_percore_set())
+ return cpu_map__get_core_aggr_by_cpu;
+
+ return NULL;
+ case AGGR_GLOBAL:
+ case AGGR_THREAD:
+ case AGGR_UNSET:
+ default:
+ return NULL;
+ }
+}

- switch (stat_config.aggr_mode) {
+static aggr_get_id_t aggr_mode__get_id(enum aggr_mode aggr_mode)
+{
+ switch (aggr_mode) {
case AGGR_SOCKET:
- if (cpu_map__build_socket_map(evsel_list->core.cpus, &stat_config.aggr_map)) {
- perror("cannot build socket map");
- return -1;
- }
- stat_config.aggr_get_id = perf_stat__get_socket_cached;
- break;
+ return perf_stat__get_socket_cached;
case AGGR_DIE:
- if (cpu_map__build_die_map(evsel_list->core.cpus, &stat_config.aggr_map)) {
- perror("cannot build die map");
- return -1;
- }
- stat_config.aggr_get_id = perf_stat__get_die_cached;
- break;
+ return perf_stat__get_die_cached;
case AGGR_CORE:
- if (cpu_map__build_core_map(evsel_list->core.cpus, &stat_config.aggr_map)) {
- perror("cannot build core map");
- return -1;
- }
- stat_config.aggr_get_id = perf_stat__get_core_cached;
- break;
+ return perf_stat__get_core_cached;
case AGGR_NODE:
- if (cpu_map__build_node_map(evsel_list->core.cpus, &stat_config.aggr_map)) {
- perror("cannot build core map");
- return -1;
- }
- stat_config.aggr_get_id = perf_stat__get_node_cached;
- break;
+ return perf_stat__get_node_cached;
case AGGR_NONE:
if (term_percore_set()) {
- if (cpu_map__build_core_map(evsel_list->core.cpus,
- &stat_config.aggr_map)) {
- perror("cannot build core map");
- return -1;
- }
- stat_config.aggr_get_id = perf_stat__get_core_cached;
+ return perf_stat__get_core_cached;
}
- break;
+ return NULL;
case AGGR_GLOBAL:
case AGGR_THREAD:
case AGGR_UNSET:
default:
- break;
+ return NULL;
+ }
+}
+
+static int perf_stat_init_aggr_mode(void)
+{
+ int nr;
+ aggr_cpu_id_get_t f = aggr_mode__get_aggr(stat_config.aggr_mode);
+
+ if (f) {
+ stat_config.aggr_map = cpu_aggr_map__new(evsel_list->core.cpus,
+ f, /*data=*/NULL);
+ if (!stat_config.aggr_map) {
+ pr_err("cannot build %s map", aggr_mode__string[stat_config.aggr_mode]);
+ return -1;
+ }
+ stat_config.aggr_get_id = aggr_mode__get_id(stat_config.aggr_mode);
}

/*
@@ -1525,30 +1549,6 @@ static struct aggr_cpu_id perf_env__get_node_aggr_by_cpu(int cpu, void *data)
return id;
}

-static int perf_env__build_socket_map(struct perf_env *env, struct perf_cpu_map *cpus,
- struct cpu_aggr_map **sockp)
-{
- return cpu_map__build_map(cpus, sockp, perf_env__get_socket_aggr_by_cpu, env);
-}
-
-static int perf_env__build_die_map(struct perf_env *env, struct perf_cpu_map *cpus,
- struct cpu_aggr_map **diep)
-{
- return cpu_map__build_map(cpus, diep, perf_env__get_die_aggr_by_cpu, env);
-}
-
-static int perf_env__build_core_map(struct perf_env *env, struct perf_cpu_map *cpus,
- struct cpu_aggr_map **corep)
-{
- return cpu_map__build_map(cpus, corep, perf_env__get_core_aggr_by_cpu, env);
-}
-
-static int perf_env__build_node_map(struct perf_env *env, struct perf_cpu_map *cpus,
- struct cpu_aggr_map **nodep)
-{
- return cpu_map__build_map(cpus, nodep, perf_env__get_node_aggr_by_cpu, env);
-}
-
static struct aggr_cpu_id perf_stat__get_socket_file(struct perf_stat_config *config __maybe_unused,
int cpu)
{
@@ -1572,47 +1572,61 @@ static struct aggr_cpu_id perf_stat__get_node_file(struct perf_stat_config *conf
return perf_env__get_node_aggr_by_cpu(cpu, &perf_stat.session->header.env);
}

-static int perf_stat_init_aggr_mode_file(struct perf_stat *st)
+static aggr_cpu_id_get_t aggr_mode__get_aggr_file(enum aggr_mode aggr_mode)
{
- struct perf_env *env = &st->session->header.env;
+ switch (aggr_mode) {
+ case AGGR_SOCKET:
+ return perf_env__get_socket_aggr_by_cpu;
+ case AGGR_DIE:
+ return perf_env__get_die_aggr_by_cpu;
+ case AGGR_CORE:
+ return perf_env__get_core_aggr_by_cpu;
+ case AGGR_NODE:
+ return perf_env__get_node_aggr_by_cpu;
+ case AGGR_NONE:
+ case AGGR_GLOBAL:
+ case AGGR_THREAD:
+ case AGGR_UNSET:
+ default:
+ return NULL;
+ }
+}

- switch (stat_config.aggr_mode) {
+static aggr_get_id_t aggr_mode__get_id_file(enum aggr_mode aggr_mode)
+{
+ switch (aggr_mode) {
case AGGR_SOCKET:
- if (perf_env__build_socket_map(env, evsel_list->core.cpus, &stat_config.aggr_map)) {
- perror("cannot build socket map");
- return -1;
- }
- stat_config.aggr_get_id = perf_stat__get_socket_file;
- break;
+ return perf_stat__get_socket_file;
case AGGR_DIE:
- if (perf_env__build_die_map(env, evsel_list->core.cpus, &stat_config.aggr_map)) {
- perror("cannot build die map");
- return -1;
- }
- stat_config.aggr_get_id = perf_stat__get_die_file;
- break;
+ return perf_stat__get_die_file;
case AGGR_CORE:
- if (perf_env__build_core_map(env, evsel_list->core.cpus, &stat_config.aggr_map)) {
- perror("cannot build core map");
- return -1;
- }
- stat_config.aggr_get_id = perf_stat__get_core_file;
- break;
+ return perf_stat__get_core_file;
case AGGR_NODE:
- if (perf_env__build_node_map(env, evsel_list->core.cpus, &stat_config.aggr_map)) {
- perror("cannot build core map");
- return -1;
- }
- stat_config.aggr_get_id = perf_stat__get_node_file;
- break;
+ return perf_stat__get_node_file;
case AGGR_NONE:
case AGGR_GLOBAL:
case AGGR_THREAD:
case AGGR_UNSET:
default:
- break;
+ return NULL;
}
+}
+
+static int perf_stat_init_aggr_mode_file(struct perf_stat *st)
+{
+ struct perf_env *env = &st->session->header.env;

+ aggr_cpu_id_get_t f = aggr_mode__get_aggr_file(stat_config.aggr_mode);
+
+ if (!f)
+ return 0;
+
+ stat_config.aggr_map = cpu_aggr_map__new(evsel_list->core.cpus, f, env);
+ if (!stat_config.aggr_map) {
+ pr_err("cannot build %s map", aggr_mode__string[stat_config.aggr_mode]);
+ return -1;
+ }
+ stat_config.aggr_get_id = aggr_mode__get_id_file(stat_config.aggr_mode);
return 0;
}

diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 32f9fc2dd389..ba4468f691c8 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -140,7 +140,7 @@ struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data __maybe_u
return id;
}

-static int cmp_aggr_cpu_id(const void *a_pointer, const void *b_pointer)
+static int aggr_cpu_id__cmp(const void *a_pointer, const void *b_pointer)
{
struct aggr_cpu_id *a = (struct aggr_cpu_id *)a_pointer;
struct aggr_cpu_id *b = (struct aggr_cpu_id *)b_pointer;
@@ -157,37 +157,40 @@ static int cmp_aggr_cpu_id(const void *a_pointer, const void *b_pointer)
return a->thread - b->thread;
}

-int cpu_map__build_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **res,
- struct aggr_cpu_id (*f)(int cpu, void *data),
- void *data)
+struct cpu_aggr_map *cpu_aggr_map__new(const struct perf_cpu_map *cpus,
+ aggr_cpu_id_get_t f,
+ void *data)
{
- int nr = cpus->nr;
- struct cpu_aggr_map *c = cpu_aggr_map__empty_new(nr);
- int cpu, s2;
- struct aggr_cpu_id s1;
+ int cpu, idx;
+ struct cpu_aggr_map *c = cpu_aggr_map__empty_new(cpus->nr);

if (!c)
- return -1;
+ return NULL;

/* Reset size as it may only be partially filled */
c->nr = 0;

- for (cpu = 0; cpu < nr; cpu++) {
- s1 = f(cpu, data);
- for (s2 = 0; s2 < c->nr; s2++) {
- if (aggr_cpu_id__equal(&s1, &c->map[s2]))
+ perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
+ bool duplicate = false;
+ struct aggr_cpu_id cpu_id = f(cpu, data);
+
+ for (int j = 0; j < c->nr; j++) {
+ if (aggr_cpu_id__equal(&cpu_id, &c->map[j])) {
+ duplicate = true;
break;
+ }
}
- if (s2 == c->nr) {
- c->map[c->nr] = s1;
+ if (!duplicate) {
+ c->map[c->nr] = cpu_id;
c->nr++;
}
}
+
/* ensure we process id in increasing order */
- qsort(c->map, c->nr, sizeof(struct aggr_cpu_id), cmp_aggr_cpu_id);
+ qsort(c->map, c->nr, sizeof(struct aggr_cpu_id), aggr_cpu_id__cmp);
+
+ return c;

- *res = c;
- return 0;
}

int cpu__get_die_id(int cpu)
@@ -253,26 +256,6 @@ struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data __maybe_unu
return id;
}

-int cpu_map__build_socket_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **sockp)
-{
- return cpu_map__build_map(cpus, sockp, cpu_map__get_socket_aggr_by_cpu, NULL);
-}
-
-int cpu_map__build_die_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **diep)
-{
- return cpu_map__build_map(cpus, diep, cpu_map__get_die_aggr_by_cpu, NULL);
-}
-
-int cpu_map__build_core_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **corep)
-{
- return cpu_map__build_map(cpus, corep, cpu_map__get_core_aggr_by_cpu, NULL);
-}
-
-int cpu_map__build_node_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **numap)
-{
- return cpu_map__build_map(cpus, numap, cpu_map__get_node_aggr_by_cpu, NULL);
-}
-
/* setup simple routines to easily access node numbers given a cpu number */
static int get_max_num(char *path, int *max)
{
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index 87545bcd461d..02e8c80fea0a 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -43,10 +43,6 @@ struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data);
struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data);
struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data);
struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data);
-int cpu_map__build_socket_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **sockp);
-int cpu_map__build_die_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **diep);
-int cpu_map__build_core_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **corep);
-int cpu_map__build_node_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **nodep);
const struct perf_cpu_map *cpu_map__online(void); /* thread unsafe */

int cpu__setup_cpunode_map(void);
@@ -75,10 +71,16 @@ int cpu__get_die_id(int cpu);
*/
int cpu__get_core_id(int cpu);

+typedef struct aggr_cpu_id (*aggr_cpu_id_get_t)(int cpu, void *data);

-int cpu_map__build_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **res,
- struct aggr_cpu_id (*f)(int cpu, void *data),
- void *data);
+/**
+ * cpu_aggr_map__new - Create a cpu_aggr_map with an aggr_cpu_id for each cpu in
+ * cpus. The aggr_cpu_id is created with 'f' that may have a data value passed
+ * to it. The cpu_aggr_map is sorted with duplicate values removed.
+ */
+struct cpu_aggr_map *cpu_aggr_map__new(const struct perf_cpu_map *cpus,
+ aggr_cpu_id_get_t f,
+ void *data);

bool cpu_map__has(struct perf_cpu_map *cpus, int cpu);

--
2.34.1.400.ga245620fadb-goog
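
As an illustration of the dedup-then-sort shape that cpu_aggr_map__new
takes in the patch above, here is a minimal standalone sketch. The names
(demo_id, demo_map, demo_map__new, demo_id_by_socket) are hypothetical
stand-ins rather than the perf types, and the topology of two CPUs per
socket is an assumption made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct aggr_cpu_id: one field only. */
struct demo_id {
	int socket;
};

/* Hypothetical stand-in for struct cpu_aggr_map. */
struct demo_map {
	int nr;
	struct demo_id map[];
};

typedef struct demo_id (*demo_id_get_t)(int cpu, void *data);

static int demo_id__cmp(const void *a, const void *b)
{
	const struct demo_id *x = a, *y = b;

	return (x->socket > y->socket) - (x->socket < y->socket);
}

/* Assumed topology for the example: two CPUs per socket. */
static struct demo_id demo_id_by_socket(int cpu, void *data)
{
	struct demo_id id = { .socket = cpu / 2 };

	(void)data;
	return id;
}

/* Build a sorted, duplicate-free map from the CPU numbers in cpus[]. */
static struct demo_map *demo_map__new(const int *cpus, int nr_cpus,
				      demo_id_get_t f, void *data)
{
	struct demo_map *c;

	assert(cpus && f && nr_cpus >= 0);
	c = malloc(sizeof(*c) + sizeof(c->map[0]) * nr_cpus);
	if (!c)
		return NULL;
	/* Only the unique ids are counted. */
	c->nr = 0;

	for (int i = 0; i < nr_cpus; i++) {
		/* The callback receives the CPU number, not the index. */
		struct demo_id id = f(cpus[i], data);
		bool duplicate = false;

		for (int j = 0; j < c->nr; j++) {
			if (c->map[j].socket == id.socket) {
				duplicate = true;
				break;
			}
		}
		if (!duplicate)
			c->map[c->nr++] = id;
	}
	/* Sort so ids are processed in increasing order. */
	qsort(c->map, c->nr, sizeof(c->map[0]), demo_id__cmp);
	return c;
}
```

Returning the map directly lets a caller test a single pointer for
failure, which is the shape perf_stat_init_aggr_mode uses in the patch.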


2021-12-08 02:47:12

by Ian Rogers

Subject: [PATCH 18/22] perf cpumap: Rename cpu_map__get_X_aggr_by_cpu functions

The functions don't use a cpu_map so reduce them to being like
constructors of aggr_cpu_id.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/builtin-stat.c | 18 +++++++++---------
tools/perf/tests/topology.c | 8 ++++----
tools/perf/util/cpumap.c | 14 +++++++-------
tools/perf/util/cpumap.h | 29 +++++++++++++++++++++++++----
4 files changed, 45 insertions(+), 24 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index e318b41b67cc..46097c46f345 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1312,25 +1312,25 @@ static const char *const aggr_mode__string[] = {
static struct aggr_cpu_id perf_stat__get_socket(struct perf_stat_config *config __maybe_unused,
int cpu)
{
- return cpu_map__get_socket_aggr_by_cpu(cpu, /*data=*/NULL);
+ return aggr_cpu_id__socket(cpu, /*data=*/NULL);
}

static struct aggr_cpu_id perf_stat__get_die(struct perf_stat_config *config __maybe_unused,
int cpu)
{
- return cpu_map__get_die_aggr_by_cpu(cpu, /*data=*/NULL);
+ return aggr_cpu_id__die(cpu, /*data=*/NULL);
}

static struct aggr_cpu_id perf_stat__get_core(struct perf_stat_config *config __maybe_unused,
int cpu)
{
- return cpu_map__get_core_aggr_by_cpu(cpu, /*data=*/NULL);
+ return aggr_cpu_id__core(cpu, /*data=*/NULL);
}

static struct aggr_cpu_id perf_stat__get_node(struct perf_stat_config *config __maybe_unused,
int cpu)
{
- return cpu_map__get_node_aggr_by_cpu(cpu, /*data=*/NULL);
+ return aggr_cpu_id__node(cpu, /*data=*/NULL);
}

static struct aggr_cpu_id perf_stat__get_aggr(struct perf_stat_config *config,
@@ -1385,16 +1385,16 @@ static aggr_cpu_id_get_t aggr_mode__get_aggr(enum aggr_mode aggr_mode)
{
switch (aggr_mode) {
case AGGR_SOCKET:
- return cpu_map__get_socket_aggr_by_cpu;
+ return aggr_cpu_id__socket;
case AGGR_DIE:
- return cpu_map__get_die_aggr_by_cpu;
+ return aggr_cpu_id__die;
case AGGR_CORE:
- return cpu_map__get_core_aggr_by_cpu;
+ return aggr_cpu_id__core;
case AGGR_NODE:
- return cpu_map__get_node_aggr_by_cpu;
+ return aggr_cpu_id__node;
case AGGR_NONE:
if (term_percore_set())
- return cpu_map__get_core_aggr_by_cpu;
+ return aggr_cpu_id__core;

return NULL;
case AGGR_GLOBAL:
diff --git a/tools/perf/tests/topology.c b/tools/perf/tests/topology.c
index 5992b323c4f5..0cb7b015b4b9 100644
--- a/tools/perf/tests/topology.c
+++ b/tools/perf/tests/topology.c
@@ -121,7 +121,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)

// Test that core ID contains socket, die and core
for (i = 0; i < map->nr; i++) {
- id = cpu_map__get_core_aggr_by_cpu(perf_cpu_map__cpu(map, i), NULL);
+ id = aggr_cpu_id__core(perf_cpu_map__cpu(map, i), NULL);
TEST_ASSERT_VAL("Core map - Core ID doesn't match",
session->header.env.cpu[map->map[i]].core_id == id.core);

@@ -136,7 +136,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)

// Test that die ID contains socket and die
for (i = 0; i < map->nr; i++) {
- id = cpu_map__get_die_aggr_by_cpu(perf_cpu_map__cpu(map, i), NULL);
+ id = aggr_cpu_id__die(perf_cpu_map__cpu(map, i), NULL);
TEST_ASSERT_VAL("Die map - Socket ID doesn't match",
session->header.env.cpu[map->map[i]].socket_id == id.socket);

@@ -150,7 +150,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)

// Test that socket ID contains only socket
for (i = 0; i < map->nr; i++) {
- id = cpu_map__get_socket_aggr_by_cpu(perf_cpu_map__cpu(map, i), NULL);
+ id = aggr_cpu_id__socket(perf_cpu_map__cpu(map, i), NULL);
TEST_ASSERT_VAL("Socket map - Socket ID doesn't match",
session->header.env.cpu[map->map[i]].socket_id == id.socket);

@@ -162,7 +162,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)

// Test that node ID contains only node
for (i = 0; i < map->nr; i++) {
- id = cpu_map__get_node_aggr_by_cpu(perf_cpu_map__cpu(map, i), NULL);
+ id = aggr_cpu_id__node(perf_cpu_map__cpu(map, i), NULL);
TEST_ASSERT_VAL("Node map - Node ID doesn't match",
cpu__get_node(map->map[i]) == id.node);
TEST_ASSERT_VAL("Node map - Socket is set", id.socket == -1);
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index ba4468f691c8..0e325559c33c 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -132,7 +132,7 @@ int cpu__get_socket_id(int cpu)
return ret ?: value;
}

-struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data __maybe_unused)
+struct aggr_cpu_id aggr_cpu_id__socket(int cpu, void *data __maybe_unused)
{
struct aggr_cpu_id id = aggr_cpu_id__empty();

@@ -200,7 +200,7 @@ int cpu__get_die_id(int cpu)
return ret ?: value;
}

-struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data)
+struct aggr_cpu_id aggr_cpu_id__die(int cpu, void *data)
{
struct aggr_cpu_id id;
int die;
@@ -215,7 +215,7 @@ struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data)
* with the socket ID and then add die to
* make a unique ID.
*/
- id = cpu_map__get_socket_aggr_by_cpu(cpu, data);
+ id = aggr_cpu_id__socket(cpu, data);
if (aggr_cpu_id__is_empty(&id))
return id;

@@ -229,13 +229,13 @@ int cpu__get_core_id(int cpu)
return ret ?: value;
}

-struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data)
+struct aggr_cpu_id aggr_cpu_id__core(int cpu, void *data)
{
struct aggr_cpu_id id;
int core = cpu__get_core_id(cpu);

- /* cpu_map__get_die returns a struct with socket and die set*/
- id = cpu_map__get_die_aggr_by_cpu(cpu, data);
+ /* aggr_cpu_id__die returns a struct with socket and die set*/
+ id = aggr_cpu_id__die(cpu, data);
if (aggr_cpu_id__is_empty(&id))
return id;

@@ -248,7 +248,7 @@ struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data)

}

-struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data __maybe_unused)
+struct aggr_cpu_id aggr_cpu_id__node(int cpu, void *data __maybe_unused)
{
struct aggr_cpu_id id = aggr_cpu_id__empty();

diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index 02e8c80fea0a..15043e764fa6 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -39,10 +39,6 @@ struct perf_cpu_map *cpu_map__new_data(struct perf_record_cpu_map_data *data);
size_t cpu_map__snprint(struct perf_cpu_map *map, char *buf, size_t size);
size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size);
size_t cpu_map__fprintf(struct perf_cpu_map *map, FILE *fp);
-struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data);
-struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data);
-struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data);
-struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data);
const struct perf_cpu_map *cpu_map__online(void); /* thread unsafe */

int cpu__setup_cpunode_map(void);
@@ -88,4 +84,29 @@ bool aggr_cpu_id__equal(const struct aggr_cpu_id *a, const struct aggr_cpu_id *b
bool aggr_cpu_id__is_empty(const struct aggr_cpu_id *a);
struct aggr_cpu_id aggr_cpu_id__empty(void);

+
+/**
+ * aggr_cpu_id__socket - Create an aggr_cpu_id with the socket populated with
+ * the socket for cpu. The function signature is compatible with
+ * aggr_cpu_id_get_t.
+ */
+struct aggr_cpu_id aggr_cpu_id__socket(int cpu, void *data);
+/**
+ * aggr_cpu_id__die - Create an aggr_cpu_id with the die and socket populated
+ * with the die and socket for cpu. The function signature is compatible with
+ * aggr_cpu_id_get_t.
+ */
+struct aggr_cpu_id aggr_cpu_id__die(int cpu, void *data);
+/**
+ * aggr_cpu_id__core - Create an aggr_cpu_id with the core, die and socket
+ * populated with the core, die and socket for cpu. The function signature is
+ * compatible with aggr_cpu_id_get_t.
+ */
+struct aggr_cpu_id aggr_cpu_id__core(int cpu, void *data);
+/**
+ * aggr_cpu_id__node - Create an aggr_cpu_id with the numa node populated for
+ * cpu. The function signature is compatible with aggr_cpu_id_get_t.
+ */
+struct aggr_cpu_id aggr_cpu_id__node(int cpu, void *data);
+
#endif /* __PERF_CPUMAP_H */
--
2.34.1.400.ga245620fadb-goog
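
The constructors renamed above build on one another: the core id starts
from the die id, which starts from the socket id, and each level returns
early if the level below came back empty. A minimal sketch of that
layering follows. The demo_ names are hypothetical, the void *data
argument of aggr_cpu_id_get_t is left out for brevity, and the topology
(four CPUs per socket, two per die) is an assumption for the example
rather than anything read from sysfs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct aggr_cpu_id. */
struct demo_aggr_id {
	int socket;
	int die;
	int core;
};

static struct demo_aggr_id demo_aggr_id__empty(void)
{
	struct demo_aggr_id id = { .socket = -1, .die = -1, .core = -1 };

	return id;
}

static bool demo_aggr_id__is_empty(const struct demo_aggr_id *a)
{
	assert(a);
	return a->socket == -1 && a->die == -1 && a->core == -1;
}

/* Assumed topology: 4 CPUs per socket; a negative CPU is unknown. */
static struct demo_aggr_id demo_aggr_id__socket(int cpu)
{
	struct demo_aggr_id id = demo_aggr_id__empty();

	if (cpu >= 0)
		id.socket = cpu / 4;
	return id;
}

/* Die ids are only unique within a socket, so start from the socket. */
static struct demo_aggr_id demo_aggr_id__die(int cpu)
{
	struct demo_aggr_id id = demo_aggr_id__socket(cpu);

	if (demo_aggr_id__is_empty(&id))
		return id;
	id.die = (cpu % 4) / 2;
	return id;
}

/* Core ids are only unique within a die, so start from the die. */
static struct demo_aggr_id demo_aggr_id__core(int cpu)
{
	struct demo_aggr_id id = demo_aggr_id__die(cpu);

	if (demo_aggr_id__is_empty(&id))
		return id;
	id.core = cpu % 2;
	return id;
}
```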


2021-12-08 02:47:14

by Ian Rogers

Subject: [PATCH 19/22] perf cpumap: Move 'has' function to libperf

Move cpu_map__has to libperf as perf_cpu_map__has. Make the cpu map
argument const for consistency with the rest of the API and modify
perf_cpu_map__idx accordingly.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/lib/perf/cpumap.c | 7 ++++++-
tools/lib/perf/include/internal/cpumap.h | 2 +-
tools/lib/perf/include/perf/cpumap.h | 1 +
tools/perf/arch/arm/util/cs-etm.c | 16 ++++++++--------
tools/perf/builtin-sched.c | 6 +++---
tools/perf/tests/topology.c | 2 +-
tools/perf/util/cpumap.c | 5 -----
tools/perf/util/cpumap.h | 2 --
tools/perf/util/cputopo.c | 2 +-
9 files changed, 21 insertions(+), 22 deletions(-)

diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c
index adaad3dddf6e..3c36a06771af 100644
--- a/tools/lib/perf/cpumap.c
+++ b/tools/lib/perf/cpumap.c
@@ -268,7 +268,7 @@ bool perf_cpu_map__empty(const struct perf_cpu_map *map)
return map ? map->map[0] == -1 : true;
}

-int perf_cpu_map__idx(struct perf_cpu_map *cpus, int cpu)
+int perf_cpu_map__idx(const struct perf_cpu_map *cpus, int cpu)
{
int low = 0, high = cpus->nr;

@@ -288,6 +288,11 @@ int perf_cpu_map__idx(struct perf_cpu_map *cpus, int cpu)
return -1;
}

+bool perf_cpu_map__has(const struct perf_cpu_map *cpus, int cpu)
+{
+ return perf_cpu_map__idx(cpus, cpu) != -1;
+}
+
int perf_cpu_map__max(struct perf_cpu_map *map)
{
// cpu_map__trim_new() qsort()s it, cpu_map__default_new() sorts it as well.
diff --git a/tools/lib/perf/include/internal/cpumap.h b/tools/lib/perf/include/internal/cpumap.h
index 1c1726f4a04e..6a0ff0abcbc4 100644
--- a/tools/lib/perf/include/internal/cpumap.h
+++ b/tools/lib/perf/include/internal/cpumap.h
@@ -21,6 +21,6 @@ struct perf_cpu_map {
#define MAX_NR_CPUS 2048
#endif

-int perf_cpu_map__idx(struct perf_cpu_map *cpus, int cpu);
+int perf_cpu_map__idx(const struct perf_cpu_map *cpus, int cpu);

#endif /* __LIBPERF_INTERNAL_CPUMAP_H */
diff --git a/tools/lib/perf/include/perf/cpumap.h b/tools/lib/perf/include/perf/cpumap.h
index 7c27766ea0bf..3f1c0afa3ccd 100644
--- a/tools/lib/perf/include/perf/cpumap.h
+++ b/tools/lib/perf/include/perf/cpumap.h
@@ -20,6 +20,7 @@ LIBPERF_API int perf_cpu_map__cpu(const struct perf_cpu_map *cpus, int idx);
LIBPERF_API int perf_cpu_map__nr(const struct perf_cpu_map *cpus);
LIBPERF_API bool perf_cpu_map__empty(const struct perf_cpu_map *map);
LIBPERF_API int perf_cpu_map__max(struct perf_cpu_map *map);
+LIBPERF_API bool perf_cpu_map__has(const struct perf_cpu_map *map, int cpu);

#define perf_cpu_map__for_each_cpu(cpu, idx, cpus) \
for ((idx) = 0, (cpu) = perf_cpu_map__cpu(cpus, idx); \
diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
index 293a23bf8be3..76c66780617c 100644
--- a/tools/perf/arch/arm/util/cs-etm.c
+++ b/tools/perf/arch/arm/util/cs-etm.c
@@ -204,8 +204,8 @@ static int cs_etm_set_option(struct auxtrace_record *itr,

/* Set option of each CPU we have */
for (i = 0; i < cpu__max_cpu(); i++) {
- if (!cpu_map__has(event_cpus, i) ||
- !cpu_map__has(online_cpus, i))
+ if (!perf_cpu_map__has(event_cpus, i) ||
+ !perf_cpu_map__has(online_cpus, i))
continue;

if (option & BIT(ETM_OPT_CTXTID)) {
@@ -542,8 +542,8 @@ cs_etm_info_priv_size(struct auxtrace_record *itr __maybe_unused,
/* cpu map is not empty, we have specific CPUs to work with */
if (!perf_cpu_map__empty(event_cpus)) {
for (i = 0; i < cpu__max_cpu(); i++) {
- if (!cpu_map__has(event_cpus, i) ||
- !cpu_map__has(online_cpus, i))
+ if (!perf_cpu_map__has(event_cpus, i) ||
+ !perf_cpu_map__has(online_cpus, i))
continue;

if (cs_etm_is_ete(itr, i))
@@ -556,7 +556,7 @@ cs_etm_info_priv_size(struct auxtrace_record *itr __maybe_unused,
} else {
/* get configuration for all CPUs in the system */
for (i = 0; i < cpu__max_cpu(); i++) {
- if (!cpu_map__has(online_cpus, i))
+ if (!perf_cpu_map__has(online_cpus, i))
continue;

if (cs_etm_is_ete(itr, i))
@@ -741,8 +741,8 @@ static int cs_etm_info_fill(struct auxtrace_record *itr,
} else {
/* Make sure all specified CPUs are online */
for (i = 0; i < perf_cpu_map__nr(event_cpus); i++) {
- if (cpu_map__has(event_cpus, i) &&
- !cpu_map__has(online_cpus, i))
+ if (perf_cpu_map__has(event_cpus, i) &&
+ !perf_cpu_map__has(online_cpus, i))
return -EINVAL;
}

@@ -763,7 +763,7 @@ static int cs_etm_info_fill(struct auxtrace_record *itr,
offset = CS_ETM_SNAPSHOT + 1;

for (i = 0; i < cpu__max_cpu() && offset < priv_size; i++)
- if (cpu_map__has(cpu_map, i))
+ if (perf_cpu_map__has(cpu_map, i))
cs_etm_get_metadata(i, &offset, itr, info);

perf_cpu_map__put(online_cpus);
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 4527f632ebe4..9da1da4749c9 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -1617,10 +1617,10 @@ static int map_switch_event(struct perf_sched *sched, struct evsel *evsel,
if (curr_thread && thread__has_color(curr_thread))
pid_color = COLOR_PIDS;

- if (sched->map.cpus && !cpu_map__has(sched->map.cpus, cpu))
+ if (sched->map.cpus && !perf_cpu_map__has(sched->map.cpus, cpu))
continue;

- if (sched->map.color_cpus && cpu_map__has(sched->map.color_cpus, cpu))
+ if (sched->map.color_cpus && perf_cpu_map__has(sched->map.color_cpus, cpu))
cpu_color = COLOR_CPUS;

if (cpu != this_cpu)
@@ -1639,7 +1639,7 @@ static int map_switch_event(struct perf_sched *sched, struct evsel *evsel,
color_fprintf(stdout, color, " ");
}

- if (sched->map.cpus && !cpu_map__has(sched->map.cpus, this_cpu))
+ if (sched->map.cpus && !perf_cpu_map__has(sched->map.cpus, this_cpu))
goto out;

timestamp__scnprintf_usec(timestamp, stimestamp, sizeof(stimestamp));
diff --git a/tools/perf/tests/topology.c b/tools/perf/tests/topology.c
index 0cb7b015b4b9..cb29ea7ec409 100644
--- a/tools/perf/tests/topology.c
+++ b/tools/perf/tests/topology.c
@@ -112,7 +112,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
TEST_ASSERT_VAL("Session header CPU map not set", session->header.env.cpu);

for (i = 0; i < session->header.env.nr_cpus_avail; i++) {
- if (!cpu_map__has(map, i))
+ if (!perf_cpu_map__has(map, i))
continue;
pr_debug("CPU %d, core %d, socket %d\n", i,
session->header.env.cpu[i].core_id,
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 0e325559c33c..8a72ee996722 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -463,11 +463,6 @@ int cpu__setup_cpunode_map(void)
return 0;
}

-bool cpu_map__has(struct perf_cpu_map *cpus, int cpu)
-{
- return perf_cpu_map__idx(cpus, cpu) != -1;
-}
-
size_t cpu_map__snprint(struct perf_cpu_map *map, char *buf, size_t size)
{
int i, cpu, start = -1;
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index 15043e764fa6..832fc53f3c11 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -78,8 +78,6 @@ struct cpu_aggr_map *cpu_aggr_map__new(const struct perf_cpu_map *cpus,
aggr_cpu_id_get_t f,
void *data);

-bool cpu_map__has(struct perf_cpu_map *cpus, int cpu);
-
bool aggr_cpu_id__equal(const struct aggr_cpu_id *a, const struct aggr_cpu_id *b);
bool aggr_cpu_id__is_empty(const struct aggr_cpu_id *a);
struct aggr_cpu_id aggr_cpu_id__empty(void);
diff --git a/tools/perf/util/cputopo.c b/tools/perf/util/cputopo.c
index 51b429c86f98..8affb37d90e7 100644
--- a/tools/perf/util/cputopo.c
+++ b/tools/perf/util/cputopo.c
@@ -218,7 +218,7 @@ struct cpu_topology *cpu_topology__new(void)
tp->core_cpus_list = addr;

for (i = 0; i < nr; i++) {
- if (!cpu_map__has(map, i))
+ if (!perf_cpu_map__has(map, i))
continue;

ret = build_cpu_topology(tp, i);
--
2.34.1.400.ga245620fadb-goog
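
perf_cpu_map__has in the patch above is a thin wrapper over the binary
search in perf_cpu_map__idx, which relies on the map being sorted. A
minimal standalone sketch of that pairing, using a plain int array
instead of struct perf_cpu_map; the demo_ names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Return the index of cpu in the sorted array cpus[0..nr), or -1. */
static int demo_cpus__idx(const int *cpus, int nr, int cpu)
{
	int low = 0, high = nr;

	assert(cpus || nr == 0);
	while (low < high) {
		int idx = low + (high - low) / 2;

		if (cpus[idx] == cpu)
			return idx;
		if (cpus[idx] > cpu)
			high = idx;
		else
			low = idx + 1;
	}
	return -1;
}

/* Membership test: the argument is a CPU number, not an index. */
static bool demo_cpus__has(const int *cpus, int nr, int cpu)
{
	return demo_cpus__idx(cpus, nr, cpu) != -1;
}
```

Because neither function modifies the array, taking it as const costs
nothing and lets callers that only hold a const map use the helpers.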


2021-12-08 02:47:21

by Ian Rogers

Subject: [PATCH 21/22] perf cpumap: Trim the cpu_aggr_map

cpu_aggr_map__new removes duplicates; when this happens, shrink the
array.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/util/cpumap.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 8a72ee996722..985c87f1f1ca 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -185,7 +185,12 @@ struct cpu_aggr_map *cpu_aggr_map__new(const struct perf_cpu_map *cpus,
c->nr++;
}
}
-
+ /* Trim. */
+ if (c->nr != cpus->nr) {
+ c = realloc(c, sizeof(struct cpu_aggr_map) + sizeof(struct aggr_cpu_id) * c->nr);
+ if (!c)
+ return NULL;
+ }
/* ensure we process id in increasing order */
qsort(c->map, c->nr, sizeof(struct aggr_cpu_id), aggr_cpu_id__cmp);

--
2.34.1.400.ga245620fadb-goog
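
One point about the trim above: when realloc fails it returns NULL and
leaves the original block allocated, so assigning the result straight
back to c and returning NULL means the original map can no longer be
freed. A possible variant, sketched here with hypothetical demo_ names
and an int payload rather than the perf types, keeps the untrimmed map
when the shrink fails, since the trim is only an optimisation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct cpu_aggr_map. */
struct demo_map {
	int nr;
	int map[];
};

/*
 * Shrink the allocation to fit m->nr entries. If realloc fails the
 * original block is still valid, so return it unchanged.
 */
static struct demo_map *demo_map__trim(struct demo_map *m)
{
	struct demo_map *trimmed;

	assert(m && m->nr >= 0);
	trimmed = realloc(m, sizeof(*m) + sizeof(m->map[0]) * m->nr);
	return trimmed ? trimmed : m;
}
```

The caller always gets back a usable map and frees whichever pointer is
returned.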


2021-12-08 02:47:24

by Ian Rogers

Subject: [PATCH 20/22] perf cpumap: Add some comments to cpu_aggr_map

Add comments to struct cpu_aggr_map and move cpu_aggr_map__empty_new to
be with the other cpu_aggr_map functions.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/util/cpumap.h | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index 832fc53f3c11..8acef8ff8753 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -24,16 +24,18 @@ struct aggr_cpu_id {
int core;
};

+/** A collection of aggr_cpu_id values, the "built" version is sorted and uniqued. */
struct cpu_aggr_map {
refcount_t refcnt;
+ /** Number of valid entries. */
int nr;
+ /** The entries. */
struct aggr_cpu_id map[];
};

struct perf_record_cpu_map_data;

struct perf_cpu_map *perf_cpu_map__empty_new(int nr);
-struct cpu_aggr_map *cpu_aggr_map__empty_new(int nr);

struct perf_cpu_map *cpu_map__new_data(struct perf_record_cpu_map_data *data);
size_t cpu_map__snprint(struct perf_cpu_map *map, char *buf, size_t size);
@@ -67,6 +69,12 @@ int cpu__get_die_id(int cpu);
*/
int cpu__get_core_id(int cpu);

+/**
+ * cpu_aggr_map__empty_new - Create a cpu_aggr_map of size nr with every entry
+ * being empty.
+ */
+struct cpu_aggr_map *cpu_aggr_map__empty_new(int nr);
+
typedef struct aggr_cpu_id (*aggr_cpu_id_get_t)(int cpu, void *data);

/**
--
2.34.1.400.ga245620fadb-goog


2021-12-08 02:47:25

by Ian Rogers

Subject: [PATCH 22/22] perf stat: Fix memory leak in check_per_pkg

If the key is already present then free the key used for lookup.

Found with:
$ perf stat -M IO_Read_BW /bin/true

==1749112==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 32 byte(s) in 4 object(s) allocated from:
#0 0x7f6f6fa7d7cf in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
#1 0x55acecd9d7a6 in check_per_pkg util/stat.c:343
#2 0x55acecd9d9c5 in process_counter_values util/stat.c:365
#3 0x55acecd9e0ab in process_counter_maps util/stat.c:421
#4 0x55acecd9e292 in perf_stat_process_counter util/stat.c:443
#5 0x55aceca8553e in read_counters ./tools/perf/builtin-stat.c:470
#6 0x55aceca88fe3 in __run_perf_stat ./tools/perf/builtin-stat.c:1023
#7 0x55aceca89146 in run_perf_stat ./tools/perf/builtin-stat.c:1048
#8 0x55aceca90858 in cmd_stat ./tools/perf/builtin-stat.c:2555
#9 0x55acecc05fa5 in run_builtin ./tools/perf/perf.c:313
#10 0x55acecc064fe in handle_internal_command ./tools/perf/perf.c:365
#11 0x55acecc068bb in run_argv ./tools/perf/perf.c:409
#12 0x55acecc070aa in main ./tools/perf/perf.c:539

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/util/stat.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 5c24aca0968c..c69b221f5e3e 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -345,9 +345,10 @@ static int check_per_pkg(struct evsel *counter,
return -ENOMEM;

*key = (uint64_t)d << 32 | s;
- if (hashmap__find(mask, (void *)key, NULL))
+ if (hashmap__find(mask, (void *)key, NULL)) {
*skip = true;
- else
+ free(key);
+ } else
ret = hashmap__add(mask, (void *)key, (void *)1);

return ret;
--
2.34.1.400.ga245620fadb-goog
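
The ownership rule behind this fix can be shown in isolation: a key
allocated for lookup is handed to the container only if it is added, so
it must be freed on every path where it is not. The sketch below uses a
small hand-rolled set instead of the hashmap used in util/stat.c; all
demo_ names and the fixed capacity are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_KEYS 16

/* Hypothetical set that owns the heap-allocated keys stored in it. */
struct demo_set {
	uint64_t *keys[DEMO_MAX_KEYS];
	int nr;
};

static bool demo_set__find(const struct demo_set *s, uint64_t key)
{
	for (int i = 0; i < s->nr; i++) {
		if (*s->keys[i] == key)
			return true;
	}
	return false;
}

/* Takes ownership of key on success, returns -1 if the set is full. */
static int demo_set__add(struct demo_set *s, uint64_t *key)
{
	if (s->nr == DEMO_MAX_KEYS)
		return -1;
	s->keys[s->nr++] = key;
	return 0;
}

static void demo_set__exit(struct demo_set *s)
{
	for (int i = 0; i < s->nr; i++)
		free(s->keys[i]);
	s->nr = 0;
}

/* Set *skip when the (die, socket) pair has been seen before. */
static int demo_check_per_pkg(struct demo_set *seen, int die, int socket,
			      bool *skip)
{
	uint64_t *key = malloc(sizeof(*key));
	int ret = 0;

	assert(seen && skip);
	if (!key)
		return -1;

	*key = (uint64_t)die << 32 | (uint32_t)socket;
	if (demo_set__find(seen, *key)) {
		*skip = true;
		/* Not stored, so this function still owns the key. */
		free(key);
	} else {
		*skip = false;
		ret = demo_set__add(seen, key);
		if (ret)
			free(key);
	}
	return ret;
}
```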


2021-12-08 12:06:14

by John Garry

Subject: Re: [PATCH 01/22] libperf: Add comments to perf_cpu_map.

On 08/12/2021 02:45, Ian Rogers wrote:
> diff --git a/tools/lib/perf/include/internal/cpumap.h b/tools/lib/perf/include/internal/cpumap.h
> index 840d4032587b..1c1726f4a04e 100644
> --- a/tools/lib/perf/include/internal/cpumap.h
> +++ b/tools/lib/perf/include/internal/cpumap.h
> @@ -4,9 +4,16 @@
>
> #include <linux/refcount.h>
>
> +/**
> + * A sized, reference counted, sorted array of integers representing CPU
> + * numbers. This is commonly used to capture which CPUs a PMU is associated
> + * with.
> + */
> struct perf_cpu_map {
> refcount_t refcnt;
> + /** Length of the map array. */
> int nr;
> + /** The CPU values. */
> int map[];

Would simply using more distinct names for the variables help, instead of
or in addition to comments?

Developers don't always check the comments where a struct is defined when
they think the meaning can be judged intuitively.

Thanks,
John


2021-12-08 12:50:49

by John Garry

Subject: Re: [PATCH 02/22] perf stat: Add aggr creators that are passed a cpu.

On 08/12/2021 02:45, Ian Rogers wrote:
> The cpu_map and index can get confused. Add variants of the cpu_map__get
> routines that are passed a cpu. Make the existing cpu_map__get routines
> use the new functions with a view to remove them when no longer used.
>
> Signed-off-by: Ian Rogers<[email protected]>
> ---
> tools/perf/util/cpumap.c | 79 +++++++++++++++++++++++-----------------
> tools/perf/util/cpumap.h | 6 ++-
> 2 files changed, 51 insertions(+), 34 deletions(-)
>
> diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
> index 87d3eca9b872..49fba2c53822 100644
> --- a/tools/perf/util/cpumap.c
> +++ b/tools/perf/util/cpumap.c
> @@ -128,21 +128,23 @@ int cpu_map__get_socket_id(int cpu)
> return ret ?: value;
> }
>
> -struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx,
> - void *data __maybe_unused)
> +struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data __maybe_unused)
> {
> - int cpu;
> struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
>
> - if (idx > map->nr)
> - return id;
> -
> - cpu = map->map[idx];
> -
> id.socket = cpu_map__get_socket_id(cpu);
> return id;
> }
>
> +struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx,

This is code added in this patch, so is "idx" a cpu map index? That's
what the commit message implies.

Regardless of this, you add code here and then remove it later in the
series. Can you arrange the series such that any code added in the
series is not removed (later in that series)? That's a general practice
we adhere to.

Thanks,
John

2021-12-08 14:34:30

by Ian Rogers

Subject: Re: [PATCH 01/22] libperf: Add comments to perf_cpu_map.

On Wed, Dec 8, 2021 at 4:06 AM John Garry <[email protected]> wrote:
>
> On 08/12/2021 02:45, Ian Rogers wrote:
> > diff --git a/tools/lib/perf/include/internal/cpumap.h b/tools/lib/perf/include/internal/cpumap.h
> > index 840d4032587b..1c1726f4a04e 100644
> > --- a/tools/lib/perf/include/internal/cpumap.h
> > +++ b/tools/lib/perf/include/internal/cpumap.h
> > @@ -4,9 +4,16 @@
> >
> > #include <linux/refcount.h>
> >
> > +/**
> > + * A sized, reference counted, sorted array of integers representing CPU
> > + * numbers. This is commonly used to capture which CPUs a PMU is associated
> > + * with.
> > + */
> > struct perf_cpu_map {
> > refcount_t refcnt;
> > + /** Length of the map array. */
> > int nr;
> > + /** The CPU values. */
> > int map[];
>
> would simply more distinct names for the variables help instead of or in
> addition to comments?

Thanks John! I agree. The phrase that is often used is intention
revealing names. The kernel style for naming is to be brief:
https://www.kernel.org/doc/html/v4.10/process/coding-style.html#naming
These names are both brief. nr is a little unusual, of course an
integer is a number - size and length are common names in situations
like these. In this case number makes sense as it is the number of
CPUs in the array, and there is a certain readability in saying number
of CPUs and not length or size of CPUs. I do have an issue with the name
map: it is always a smell when a variable is named after a data type.
Given the convention in the context of this code I decided to leave
it. Something like array_of_cpu_values would be more intention
revealing but when run through the variable name shrinkifier could end
up as just being array, which would be little better than map.

The guidance on comments is that they are good and to focus on the
what of what the code is doing:
https://www.kernel.org/doc/html/v4.10/process/coding-style.html#commenting
refcnt was intention revealing enough and so I didn't add a comment to it.

> Generally developers don't always check comments where the struct is
> defined when the meaning could be judged intuitively

Agreed. I think there could be a follow up to change to better names.
As I was lacking a better suggestion I think for the time being, and
in this patch set, we can keep things as they are.

Thanks,
Ian

> Thanks,
> John
>

2021-12-08 15:09:45

by Ian Rogers

Subject: Re: [PATCH 01/22] libperf: Add comments to perf_cpu_map.

On Wed, Dec 8, 2021 at 6:34 AM Ian Rogers <[email protected]> wrote:
>
> On Wed, Dec 8, 2021 at 4:06 AM John Garry <[email protected]> wrote:
> >
> > On 08/12/2021 02:45, Ian Rogers wrote:
> > > diff --git a/tools/lib/perf/include/internal/cpumap.h b/tools/lib/perf/include/internal/cpumap.h
> > > index 840d4032587b..1c1726f4a04e 100644
> > > --- a/tools/lib/perf/include/internal/cpumap.h
> > > +++ b/tools/lib/perf/include/internal/cpumap.h
> > > @@ -4,9 +4,16 @@
> > >
> > > #include <linux/refcount.h>
> > >
> > > +/**
> > > + * A sized, reference counted, sorted array of integers representing CPU
> > > + * numbers. This is commonly used to capture which CPUs a PMU is associated
> > > + * with.
> > > + */
> > > struct perf_cpu_map {
> > > refcount_t refcnt;
> > > + /** Length of the map array. */
> > > int nr;
> > > + /** The CPU values. */
> > > int map[];
> >
> > would simply more distinct names for the variables help instead of or in
> > addition to comments?
>
> Thanks John! I agree. The phrase that is often used is intention
> revealing names. The kernel style for naming is to be brief:
> https://www.kernel.org/doc/html/v4.10/process/coding-style.html#naming
> These names are both brief. nr is a little unusual, of course an
> integer is a number - size and length are common names in situations
> like these. In this case number makes sense as it is the number of
> CPUs in the array, and there is a certain readability in saying number
> of CPUs and not length or size of CPUs. The name map I have issue
> with, it is always a smell if you are calling a variable a data type.
> Given the convention in the context of this code I decided to leave
> it. Something like array_of_cpu_values would be more intention
> revealing but when run through the variable name shrinkifier could end
> up as just being array, which would be little better than map.
>
> The guidance on comments is that they are good and to focus on the
> what of what the code is doing:
> https://www.kernel.org/doc/html/v4.10/process/coding-style.html#commenting
> refcnt was intention revealing enough and so I didn't add a comment to it.
>
> > Generally developers don't always check comments where the struct is
> > defined when the meaning could be judged intuitively
>
> Agreed. I think there could be a follow up to change to better names.
> As I was lacking a better suggestion I think for the time being, and
> in this patch set, we can keep things as they are.

A related follow up could be to switch perf_cpu_map to the more
conventional cpu_set_t:
https://man7.org/linux/man-pages/man3/CPU_SET.3.html
However, that wouldn't allow the reference count to be alongside the contents.

Thanks,
Ian

> Thanks,
> Ian
>
> > Thanks,
> > John
> >
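
For comparison, the cpu_set_t alternative mentioned above could look roughly like the following. This is a sketch under assumptions: the toy_ helpers are hypothetical, while the CPU_* macros are the glibc ones documented in CPU_SET(3), which need _GNU_SOURCE. A cpu_set_t is a fixed-size bitmask, so a reference count would have to live in a separate wrapper object.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical helper: fill a cpu_set_t from an array of CPU numbers. */
static void toy_cpu_set_from_array(cpu_set_t *set, const int *cpus, int nr)
{
	assert(nr >= 0);

	CPU_ZERO(set);
	for (int i = 0; i < nr; i++) {
		/* CPU numbers outside the fixed-size mask are skipped. */
		if (cpus[i] >= 0 && cpus[i] < CPU_SETSIZE)
			CPU_SET(cpus[i], set);
	}
}

/* Membership is a bit test rather than a search of a sorted array. */
static bool toy_cpu_set_has(const cpu_set_t *set, int cpu)
{
	return cpu >= 0 && cpu < CPU_SETSIZE && CPU_ISSET(cpu, set);
}
```

Note that a bitmask loses the notion of an index into the map, which parts of perf rely on when pairing a CPU with its per-CPU counter slot.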

2021-12-08 17:59:57

by Mathieu Poirier

Subject: Re: [PATCH 19/22] perf cpumap: Move 'has' function to libperf

On Tue, Dec 07, 2021 at 06:46:04PM -0800, Ian Rogers wrote:
> Make the cpu map argument const for consistency with the rest of the
> API. Modify cpu_map__idx accordingly.
>
> Signed-off-by: Ian Rogers <[email protected]>
> ---
> tools/lib/perf/cpumap.c | 7 ++++++-
> tools/lib/perf/include/internal/cpumap.h | 2 +-
> tools/lib/perf/include/perf/cpumap.h | 1 +
> tools/perf/arch/arm/util/cs-etm.c | 16 ++++++++--------

For the coresight part:

Reviewed-by: Mathieu Poirier <[email protected]>

> tools/perf/builtin-sched.c | 6 +++---
> tools/perf/tests/topology.c | 2 +-
> tools/perf/util/cpumap.c | 5 -----
> tools/perf/util/cpumap.h | 2 --
> tools/perf/util/cputopo.c | 2 +-
> 9 files changed, 21 insertions(+), 22 deletions(-)
>
> diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c
> index adaad3dddf6e..3c36a06771af 100644
> --- a/tools/lib/perf/cpumap.c
> +++ b/tools/lib/perf/cpumap.c
> @@ -268,7 +268,7 @@ bool perf_cpu_map__empty(const struct perf_cpu_map *map)
> return map ? map->map[0] == -1 : true;
> }
>
> -int perf_cpu_map__idx(struct perf_cpu_map *cpus, int cpu)
> +int perf_cpu_map__idx(const struct perf_cpu_map *cpus, int cpu)
> {
> int low = 0, high = cpus->nr;
>
> @@ -288,6 +288,11 @@ int perf_cpu_map__idx(struct perf_cpu_map *cpus, int cpu)
> return -1;
> }
>
> +bool perf_cpu_map__has(const struct perf_cpu_map *cpus, int cpu)
> +{
> + return perf_cpu_map__idx(cpus, cpu) != -1;
> +}
> +
> int perf_cpu_map__max(struct perf_cpu_map *map)
> {
> // cpu_map__trim_new() qsort()s it, cpu_map__default_new() sorts it as well.
> diff --git a/tools/lib/perf/include/internal/cpumap.h b/tools/lib/perf/include/internal/cpumap.h
> index 1c1726f4a04e..6a0ff0abcbc4 100644
> --- a/tools/lib/perf/include/internal/cpumap.h
> +++ b/tools/lib/perf/include/internal/cpumap.h
> @@ -21,6 +21,6 @@ struct perf_cpu_map {
> #define MAX_NR_CPUS 2048
> #endif
>
> -int perf_cpu_map__idx(struct perf_cpu_map *cpus, int cpu);
> +int perf_cpu_map__idx(const struct perf_cpu_map *cpus, int cpu);
>
> #endif /* __LIBPERF_INTERNAL_CPUMAP_H */
> diff --git a/tools/lib/perf/include/perf/cpumap.h b/tools/lib/perf/include/perf/cpumap.h
> index 7c27766ea0bf..3f1c0afa3ccd 100644
> --- a/tools/lib/perf/include/perf/cpumap.h
> +++ b/tools/lib/perf/include/perf/cpumap.h
> @@ -20,6 +20,7 @@ LIBPERF_API int perf_cpu_map__cpu(const struct perf_cpu_map *cpus, int idx);
> LIBPERF_API int perf_cpu_map__nr(const struct perf_cpu_map *cpus);
> LIBPERF_API bool perf_cpu_map__empty(const struct perf_cpu_map *map);
> LIBPERF_API int perf_cpu_map__max(struct perf_cpu_map *map);
> +LIBPERF_API bool perf_cpu_map__has(const struct perf_cpu_map *map, int cpu);
>
> #define perf_cpu_map__for_each_cpu(cpu, idx, cpus) \
> for ((idx) = 0, (cpu) = perf_cpu_map__cpu(cpus, idx); \
> diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
> index 293a23bf8be3..76c66780617c 100644
> --- a/tools/perf/arch/arm/util/cs-etm.c
> +++ b/tools/perf/arch/arm/util/cs-etm.c
> @@ -204,8 +204,8 @@ static int cs_etm_set_option(struct auxtrace_record *itr,
>
> /* Set option of each CPU we have */
> for (i = 0; i < cpu__max_cpu(); i++) {
> - if (!cpu_map__has(event_cpus, i) ||
> - !cpu_map__has(online_cpus, i))
> + if (!perf_cpu_map__has(event_cpus, i) ||
> + !perf_cpu_map__has(online_cpus, i))
> continue;
>
> if (option & BIT(ETM_OPT_CTXTID)) {
> @@ -542,8 +542,8 @@ cs_etm_info_priv_size(struct auxtrace_record *itr __maybe_unused,
> /* cpu map is not empty, we have specific CPUs to work with */
> if (!perf_cpu_map__empty(event_cpus)) {
> for (i = 0; i < cpu__max_cpu(); i++) {
> - if (!cpu_map__has(event_cpus, i) ||
> - !cpu_map__has(online_cpus, i))
> + if (!perf_cpu_map__has(event_cpus, i) ||
> + !perf_cpu_map__has(online_cpus, i))
> continue;
>
> if (cs_etm_is_ete(itr, i))
> @@ -556,7 +556,7 @@ cs_etm_info_priv_size(struct auxtrace_record *itr __maybe_unused,
> } else {
> /* get configuration for all CPUs in the system */
> for (i = 0; i < cpu__max_cpu(); i++) {
> - if (!cpu_map__has(online_cpus, i))
> + if (!perf_cpu_map__has(online_cpus, i))
> continue;
>
> if (cs_etm_is_ete(itr, i))
> @@ -741,8 +741,8 @@ static int cs_etm_info_fill(struct auxtrace_record *itr,
> } else {
> /* Make sure all specified CPUs are online */
> for (i = 0; i < perf_cpu_map__nr(event_cpus); i++) {
> - if (cpu_map__has(event_cpus, i) &&
> - !cpu_map__has(online_cpus, i))
> + if (perf_cpu_map__has(event_cpus, i) &&
> + !perf_cpu_map__has(online_cpus, i))
> return -EINVAL;
> }
>
> @@ -763,7 +763,7 @@ static int cs_etm_info_fill(struct auxtrace_record *itr,
> offset = CS_ETM_SNAPSHOT + 1;
>
> for (i = 0; i < cpu__max_cpu() && offset < priv_size; i++)
> - if (cpu_map__has(cpu_map, i))
> + if (perf_cpu_map__has(cpu_map, i))
> cs_etm_get_metadata(i, &offset, itr, info);
>
> perf_cpu_map__put(online_cpus);
> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> index 4527f632ebe4..9da1da4749c9 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c
> @@ -1617,10 +1617,10 @@ static int map_switch_event(struct perf_sched *sched, struct evsel *evsel,
> if (curr_thread && thread__has_color(curr_thread))
> pid_color = COLOR_PIDS;
>
> - if (sched->map.cpus && !cpu_map__has(sched->map.cpus, cpu))
> + if (sched->map.cpus && !perf_cpu_map__has(sched->map.cpus, cpu))
> continue;
>
> - if (sched->map.color_cpus && cpu_map__has(sched->map.color_cpus, cpu))
> + if (sched->map.color_cpus && perf_cpu_map__has(sched->map.color_cpus, cpu))
> cpu_color = COLOR_CPUS;
>
> if (cpu != this_cpu)
> @@ -1639,7 +1639,7 @@ static int map_switch_event(struct perf_sched *sched, struct evsel *evsel,
> color_fprintf(stdout, color, " ");
> }
>
> - if (sched->map.cpus && !cpu_map__has(sched->map.cpus, this_cpu))
> + if (sched->map.cpus && !perf_cpu_map__has(sched->map.cpus, this_cpu))
> goto out;
>
> timestamp__scnprintf_usec(timestamp, stimestamp, sizeof(stimestamp));
> diff --git a/tools/perf/tests/topology.c b/tools/perf/tests/topology.c
> index 0cb7b015b4b9..cb29ea7ec409 100644
> --- a/tools/perf/tests/topology.c
> +++ b/tools/perf/tests/topology.c
> @@ -112,7 +112,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
> TEST_ASSERT_VAL("Session header CPU map not set", session->header.env.cpu);
>
> for (i = 0; i < session->header.env.nr_cpus_avail; i++) {
> - if (!cpu_map__has(map, i))
> + if (!perf_cpu_map__has(map, i))
> continue;
> pr_debug("CPU %d, core %d, socket %d\n", i,
> session->header.env.cpu[i].core_id,
> diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
> index 0e325559c33c..8a72ee996722 100644
> --- a/tools/perf/util/cpumap.c
> +++ b/tools/perf/util/cpumap.c
> @@ -463,11 +463,6 @@ int cpu__setup_cpunode_map(void)
> return 0;
> }
>
> -bool cpu_map__has(struct perf_cpu_map *cpus, int cpu)
> -{
> - return perf_cpu_map__idx(cpus, cpu) != -1;
> -}
> -
> size_t cpu_map__snprint(struct perf_cpu_map *map, char *buf, size_t size)
> {
> int i, cpu, start = -1;
> diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
> index 15043e764fa6..832fc53f3c11 100644
> --- a/tools/perf/util/cpumap.h
> +++ b/tools/perf/util/cpumap.h
> @@ -78,8 +78,6 @@ struct cpu_aggr_map *cpu_aggr_map__new(const struct perf_cpu_map *cpus,
> aggr_cpu_id_get_t f,
> void *data);
>
> -bool cpu_map__has(struct perf_cpu_map *cpus, int cpu);
> -
> bool aggr_cpu_id__equal(const struct aggr_cpu_id *a, const struct aggr_cpu_id *b);
> bool aggr_cpu_id__is_empty(const struct aggr_cpu_id *a);
> struct aggr_cpu_id aggr_cpu_id__empty(void);
> diff --git a/tools/perf/util/cputopo.c b/tools/perf/util/cputopo.c
> index 51b429c86f98..8affb37d90e7 100644
> --- a/tools/perf/util/cputopo.c
> +++ b/tools/perf/util/cputopo.c
> @@ -218,7 +218,7 @@ struct cpu_topology *cpu_topology__new(void)
> tp->core_cpus_list = addr;
>
> for (i = 0; i < nr; i++) {
> - if (!cpu_map__has(map, i))
> + if (!perf_cpu_map__has(map, i))
> continue;
>
> ret = build_cpu_topology(tp, i);
> --
> 2.34.1.400.ga245620fadb-goog
>
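
The relationship between the two libperf functions in the quoted patch can be sketched standalone: the 'has' test is a binary search for the CPU's index in a sorted array, compared against -1. The toy_cpu_map type and function names below are hypothetical and only mirror the shape of the quoted code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct perf_cpu_map (no refcount). */
struct toy_cpu_map {
	int nr;
	const int *map;	/* CPU numbers, sorted ascending */
};

/* Return the index of cpu in the map, or -1 if it is absent. */
static int toy_cpu_map__idx(const struct toy_cpu_map *cpus, int cpu)
{
	int low = 0, high = cpus->nr;

	while (low < high) {
		int idx = low + (high - low) / 2;
		int cpu_at_idx = cpus->map[idx];

		if (cpu_at_idx == cpu)
			return idx;
		if (cpu_at_idx > cpu)
			high = idx;
		else
			low = idx + 1;
	}
	return -1;
}

/* Membership test: takes a CPU number, never an index. */
static bool toy_cpu_map__has(const struct toy_cpu_map *cpus, int cpu)
{
	return toy_cpu_map__idx(cpus, cpu) != -1;
}
```

Because neither function modifies the map, both can take a const pointer, which is the consistency change the commit message describes.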

2021-12-10 19:08:23

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH 01/22] libperf: Add comments to perf_cpu_map.

Em Wed, Dec 08, 2021 at 06:34:14AM -0800, Ian Rogers escreveu:
> On Wed, Dec 8, 2021 at 4:06 AM John Garry <[email protected]> wrote:
> >
> > On 08/12/2021 02:45, Ian Rogers wrote:
> > > diff --git a/tools/lib/perf/include/internal/cpumap.h b/tools/lib/perf/include/internal/cpumap.h
> > > index 840d4032587b..1c1726f4a04e 100644
> > > --- a/tools/lib/perf/include/internal/cpumap.h
> > > +++ b/tools/lib/perf/include/internal/cpumap.h
> > > @@ -4,9 +4,16 @@
> > >
> > > #include <linux/refcount.h>
> > >
> > > +/**
> > > + * A sized, reference counted, sorted array of integers representing CPU
> > > + * numbers. This is commonly used to capture which CPUs a PMU is associated
> > > + * with.
> > > + */
> > > struct perf_cpu_map {
> > > refcount_t refcnt;
> > > + /** Length of the map array. */
> > > int nr;
> > > + /** The CPU values. */
> > > int map[];
> >
> > would simply more distinct names for the variables help instead of or in
> > addition to comments?

Well, in this case the typical usage doesn't help, as 'struct
perf_cpu_map' variables are often named simply "map" where they should be
named cpu_map, so we would have:

cpu_map->nr

And all should be obvious, no? Otherwise we would have redundant 'cpu',
like:

cpu_map->nr_cpus

And 'map' should really be entries, so:

cpu_map->entries[index];

Would be clear enough, no?

> Thanks John! I agree. The phrase that is often used is intention
> revealing names. The kernel style for naming is to be brief:
> https://www.kernel.org/doc/html/v4.10/process/coding-style.html#naming
> These names are both brief. nr is a little unusual, of course an
> integer is a number - size and length are common names in situations
> like these. In this case number makes sense as it is the number of
> CPUs in the array, and there is a certain readability in saying number
> of CPUs and not length or size of CPUs. The name map I have issue
> with, it is always a smell if you are calling a variable a data type.
> Given the convention in the context of this code I decided to leave
> it. Something like array_of_cpu_values would be more intention
> revealing but when run through the variable name shrinkifier could end
> up as just being array, which would be little better than map.
>
> The guidance on comments is that they are good and to focus on the
> what of what the code is doing:
> https://www.kernel.org/doc/html/v4.10/process/coding-style.html#commenting
> refcnt was intention revealing enough and so I didn't add a comment to it.
>
> > Generally developers don't always check comments where the struct is
> > defined when the meaning could be judged intuitively
>
> Agreed. I think there could be a follow up to change to better names.
> As I was lacking a better suggestion I think for the time being, and
> in this patch set, we can keep things as they are.
>
> Thanks,
> Ian
>
> > Thanks,
> > John
> >

--

- Arnaldo
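
To make the naming suggestion above concrete, here is a sketch of how the struct might read with 'entries' in place of 'map'. It is illustrative only: toy_cpu_map is a hypothetical type, a plain int stands in for refcount_t, and no such rename is part of this series.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical renamed layout; perf itself uses refcount_t for refcnt. */
struct toy_cpu_map {
	int refcnt;
	int nr;		/* number of CPUs in entries[] */
	int entries[];	/* the CPU values */
};

/* Allocate the header and the flexible array in a single block. */
static struct toy_cpu_map *toy_cpu_map__new(const int *cpus, int nr)
{
	struct toy_cpu_map *cpu_map;

	if (nr < 0)
		return NULL;

	cpu_map = malloc(sizeof(*cpu_map) + sizeof(cpu_map->entries[0]) * nr);
	if (!cpu_map)
		return NULL;

	cpu_map->refcnt = 1;
	cpu_map->nr = nr;
	if (nr > 0)
		memcpy(cpu_map->entries, cpus, sizeof(cpu_map->entries[0]) * nr);
	return cpu_map;
}
```

With the variable named cpu_map, accesses read as cpu_map->nr and cpu_map->entries[idx], as in the reply above.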

2021-12-10 19:10:24

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH 02/22] perf stat: Add aggr creators that are passed a cpu.

Em Tue, Dec 07, 2021 at 06:45:47PM -0800, Ian Rogers escreveu:
> The cpu_map and index can get confused. Add variants of the cpu_map__get
> routines that are passed a cpu. Make the existing cpu_map__get routines
> use the new functions with a view to remove them when no longer used.

Looks ok from a quick look.

> Signed-off-by: Ian Rogers <[email protected]>
> ---
> tools/perf/util/cpumap.c | 79 +++++++++++++++++++++++-----------------
> tools/perf/util/cpumap.h | 6 ++-
> 2 files changed, 51 insertions(+), 34 deletions(-)
>
> diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
> index 87d3eca9b872..49fba2c53822 100644
> --- a/tools/perf/util/cpumap.c
> +++ b/tools/perf/util/cpumap.c
> @@ -128,21 +128,23 @@ int cpu_map__get_socket_id(int cpu)
> return ret ?: value;
> }
>
> -struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx,
> - void *data __maybe_unused)
> +struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data __maybe_unused)
> {
> - int cpu;
> struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
>
> - if (idx > map->nr)
> - return id;
> -
> - cpu = map->map[idx];
> -
> id.socket = cpu_map__get_socket_id(cpu);
> return id;
> }
>
> +struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx,
> + void *data)
> +{
> + if (idx < 0 || idx > map->nr)
> + return cpu_map__empty_aggr_cpu_id();
> +
> + return cpu_map__get_socket_aggr_by_cpu(map->map[idx], data);
> +}
> +
> static int cmp_aggr_cpu_id(const void *a_pointer, const void *b_pointer)
> {
> struct aggr_cpu_id *a = (struct aggr_cpu_id *)a_pointer;
> @@ -200,15 +202,10 @@ int cpu_map__get_die_id(int cpu)
> return ret ?: value;
> }
>
> -struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *data)
> +struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data)
> {
> - int cpu, die;
> - struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
> -
> - if (idx > map->nr)
> - return id;
> -
> - cpu = map->map[idx];
> + struct aggr_cpu_id id;
> + int die;
>
> die = cpu_map__get_die_id(cpu);
> /* There is no die_id on legacy system. */
> @@ -220,7 +217,7 @@ struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *dat
> * with the socket ID and then add die to
> * make a unique ID.
> */
> - id = cpu_map__get_socket(map, idx, data);
> + id = cpu_map__get_socket_aggr_by_cpu(cpu, data);
> if (cpu_map__aggr_cpu_id_is_empty(id))
> return id;
>
> @@ -228,6 +225,15 @@ struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *dat
> return id;
> }
>
> +struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx,
> + void *data)
> +{
> + if (idx < 0 || idx > map->nr)
> + return cpu_map__empty_aggr_cpu_id();
> +
> + return cpu_map__get_die_aggr_by_cpu(map->map[idx], data);
> +}
> +
> int cpu_map__get_core_id(int cpu)
> {
> int value, ret = cpu__get_topology_int(cpu, "core_id", &value);
> @@ -239,20 +245,13 @@ int cpu_map__get_node_id(int cpu)
> return cpu__get_node(cpu);
> }
>
> -struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *data)
> +struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data)
> {
> - int cpu;
> - struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
> -
> - if (idx > map->nr)
> - return id;
> -
> - cpu = map->map[idx];
> -
> - cpu = cpu_map__get_core_id(cpu);
> + struct aggr_cpu_id id;
> + int core = cpu_map__get_core_id(cpu);
>
> /* cpu_map__get_die returns a struct with socket and die set*/
> - id = cpu_map__get_die(map, idx, data);
> + id = cpu_map__get_die_aggr_by_cpu(cpu, data);
> if (cpu_map__aggr_cpu_id_is_empty(id))
> return id;
>
> @@ -260,19 +259,33 @@ struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *da
> * core_id is relative to socket and die, we need a global id.
> * So we combine the result from cpu_map__get_die with the core id
> */
> - id.core = cpu;
> + id.core = core;
> return id;
> +
> }
>
> -struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data __maybe_unused)
> +struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *data)
> +{
> + if (idx < 0 || idx > map->nr)
> + return cpu_map__empty_aggr_cpu_id();
> +
> + return cpu_map__get_core_aggr_by_cpu(map->map[idx], data);
> +}
> +
> +struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data __maybe_unused)
> {
> struct aggr_cpu_id id = cpu_map__empty_aggr_cpu_id();
>
> + id.node = cpu_map__get_node_id(cpu);
> + return id;
> +}
> +
> +struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data)
> +{
> if (idx < 0 || idx >= map->nr)
> - return id;
> + return cpu_map__empty_aggr_cpu_id();
>
> - id.node = cpu_map__get_node_id(map->map[idx]);
> - return id;
> + return cpu_map__get_node_aggr_by_cpu(map->map[idx], data);
> }
>
> int cpu_map__build_socket_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **sockp)
> diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
> index a27eeaf086e8..c62d67704425 100644
> --- a/tools/perf/util/cpumap.h
> +++ b/tools/perf/util/cpumap.h
> @@ -31,13 +31,17 @@ size_t cpu_map__snprint(struct perf_cpu_map *map, char *buf, size_t size);
> size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size);
> size_t cpu_map__fprintf(struct perf_cpu_map *map, FILE *fp);
> int cpu_map__get_socket_id(int cpu);
> +struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data);
> struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx, void *data);
> int cpu_map__get_die_id(int cpu);
> +struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data);
> struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *data);
> int cpu_map__get_core_id(int cpu);
> +struct aggr_cpu_id cpu_map__get_core_aggr_by_cpu(int cpu, void *data);
> struct aggr_cpu_id cpu_map__get_core(struct perf_cpu_map *map, int idx, void *data);
> int cpu_map__get_node_id(int cpu);
> -struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data);
> +struct aggr_cpu_id cpu_map__get_node_aggr_by_cpu(int cpu, void *data);
> +struct aggr_cpu_id cpu_map__get_node(struct perf_cpu_map *map, int idx, void *data);
> int cpu_map__build_socket_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **sockp);
> int cpu_map__build_die_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **diep);
> int cpu_map__build_core_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **corep);
> --
> 2.34.1.400.ga245620fadb-goog

--

- Arnaldo
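
The pattern in the patch quoted above, a by-cpu function plus a thin map+index wrapper, can be sketched as follows. All names are hypothetical, and the topology lookup is replaced by an assumed fixed 18 CPUs per socket. The wrapper here rejects idx == nr with '>=', as the node variant in the quoted patch does.

```c
/* Hypothetical stand-ins for perf_cpu_map and aggr_cpu_id. */
struct toy_cpu_map {
	int nr;
	const int *map;
};

struct toy_aggr_id {
	int socket;
};

/* Assumption for illustration: sockets are contiguous blocks of 18 CPUs. */
#define TOY_CPUS_PER_SOCKET 18

static struct toy_aggr_id toy_aggr_id__empty(void)
{
	struct toy_aggr_id id = { .socket = -1 };

	return id;
}

/* New style: the caller passes the CPU number itself. */
static struct toy_aggr_id toy_get_socket_aggr_by_cpu(int cpu)
{
	struct toy_aggr_id id = toy_aggr_id__empty();

	if (cpu >= 0)
		id.socket = cpu / TOY_CPUS_PER_SOCKET;
	return id;
}

/* Old style kept as a wrapper: validate the index, then delegate. */
static struct toy_aggr_id toy_get_socket(const struct toy_cpu_map *map, int idx)
{
	if (idx < 0 || idx >= map->nr)
		return toy_aggr_id__empty();

	return toy_get_socket_aggr_by_cpu(map->map[idx]);
}
```

Once every caller passes a CPU directly, the wrapper has no users and can be dropped, which is what later patches in the series do.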

2021-12-11 19:24:56

by Jiri Olsa

Subject: Re: [PATCH 21/22] perf cpumap: Trim the cpu_aggr_map

On Tue, Dec 07, 2021 at 06:46:06PM -0800, Ian Rogers wrote:
> cpu_aggr_map__new removes duplicates, when this happens shrink the
> array.
>
> Signed-off-by: Ian Rogers <[email protected]>
> ---
> tools/perf/util/cpumap.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
> index 8a72ee996722..985c87f1f1ca 100644
> --- a/tools/perf/util/cpumap.c
> +++ b/tools/perf/util/cpumap.c
> @@ -185,7 +185,12 @@ struct cpu_aggr_map *cpu_aggr_map__new(const struct perf_cpu_map *cpus,
> c->nr++;
> }
> }
> -
> + /* Trim. */
> + if (c->nr != cpus->nr) {
> + c = realloc(c, sizeof(struct cpu_aggr_map) + sizeof(struct aggr_cpu_id) * c->nr);
> + if (!c)
> + return NULL;
> + }

curious.. we should do this, but did you detect some big waste in here?

thanks,
jirka

> /* ensure we process id in increasing order */
> qsort(c->map, c->nr, sizeof(struct aggr_cpu_id), aggr_cpu_id__cmp);
>
> --
> 2.34.1.400.ga245620fadb-goog
>
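
A general note on the trim idiom discussed above: when realloc() fails it returns NULL and leaves the original block allocated, so assigning its result straight back to the only pointer loses that block. One possible variant, sketched with a hypothetical toy_aggr_map type, keeps the untrimmed block if shrinking fails.

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct cpu_aggr_map. */
struct toy_aggr_map {
	int nr;
	int map[];
};

static struct toy_aggr_map *toy_aggr_map__alloc(int nr)
{
	struct toy_aggr_map *c = malloc(sizeof(*c) + sizeof(c->map[0]) * nr);

	if (c)
		c->nr = nr;
	return c;
}

/*
 * Shrink the allocation to nr_used entries (nr_used <= c->nr).
 * If realloc() fails, the old, larger block is still valid, so keep it.
 */
static struct toy_aggr_map *toy_aggr_map__trim(struct toy_aggr_map *c, int nr_used)
{
	struct toy_aggr_map *trimmed;

	if (nr_used == c->nr)
		return c;

	c->nr = nr_used;
	trimmed = realloc(c, sizeof(*c) + sizeof(c->map[0]) * nr_used);
	return trimmed ? trimmed : c;
}
```

Whether the saving matters depends on how many duplicates are removed; with per-socket aggregation on a large machine most entries are duplicates.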


2021-12-11 19:25:03

by Jiri Olsa

Subject: Re: [PATCH 03/22] perf stat: Switch aggregation to use for_each loop

On Tue, Dec 07, 2021 at 06:45:48PM -0800, Ian Rogers wrote:
> Tidy up the use of cpu and index to hopefully make the code less error
> prone. Avoid unused warnings with (void) which will be removed in a
> later patch.
>
> In aggr_update_shadow, the perf_cpu_map is switched from
> the evlist to the counter's cpu map, so the index is appropriate. This
> addresses a problem where uncore counts, with a cpumap like:
> $ cat /sys/devices/uncore_imc_0/cpumask
> 0,18
> Don't aggregate counts in CPUs based on the index of those values in the
> cpumap (0 and 1) but on the actual CPU (0 and 18). Thereby correcting
> metric calculations in per-socket mode for counters with without a full
> cpumask.
>
> Signed-off-by: Ian Rogers <[email protected]>
> ---
> tools/perf/util/stat-display.c | 48 +++++++++++++++++++---------------
> 1 file changed, 27 insertions(+), 21 deletions(-)
>
> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> index 588601000f3f..efab39a759ff 100644
> --- a/tools/perf/util/stat-display.c
> +++ b/tools/perf/util/stat-display.c
> @@ -330,8 +330,8 @@ static void print_metric_header(struct perf_stat_config *config,
> static int first_shadow_cpu(struct perf_stat_config *config,
> struct evsel *evsel, struct aggr_cpu_id id)
> {
> - struct evlist *evlist = evsel->evlist;
> - int i;
> + struct perf_cpu_map *cpus;
> + int cpu, idx;
>
> if (config->aggr_mode == AGGR_NONE)
> return id.core;
> @@ -339,14 +339,11 @@ static int first_shadow_cpu(struct perf_stat_config *config,
> if (!config->aggr_get_id)
> return 0;
>
> - for (i = 0; i < evsel__nr_cpus(evsel); i++) {
> - int cpu2 = evsel__cpus(evsel)->map[i];
> -
> - if (cpu_map__compare_aggr_cpu_id(
> - config->aggr_get_id(config, evlist->core.cpus, cpu2),
> - id)) {
> - return cpu2;
> - }
> + cpus = evsel__cpus(evsel);
> + perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
> + if (cpu_map__compare_aggr_cpu_id(config->aggr_get_id(config, cpus, idx),
> + id))
> + return cpu;

so this looks strange, you pass idx instead of cpu2 to aggr_get_id,
which takes idx as 3rd argument, so it looks like it was broken now,
should this be a separate fix?

also the original code for some reason passed evlist->core.cpus
to aggr_get_id, which might differ from evsel's cpus

same for aggr_update_shadow change

jirka
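
The index-versus-CPU confusion discussed above can be reproduced with the cpumask from the commit message (0,18). The sketch below is illustrative: the toy_ names are hypothetical, the for_each macro only imitates the shape of perf_cpu_map__for_each_cpu, and 18 CPUs per socket is an assumption.

```c
/* Hypothetical stand-in for struct perf_cpu_map. */
struct toy_cpu_map {
	int nr;
	const int *map;
};

static int toy_cpu_map__cpu(const struct toy_cpu_map *cpus, int idx)
{
	return (idx >= 0 && idx < cpus->nr) ? cpus->map[idx] : -1;
}

/* Yields the CPU number and its index together, so neither is misused. */
#define toy_cpu_map__for_each_cpu(cpu, idx, cpus)		\
	for ((idx) = 0, (cpu) = toy_cpu_map__cpu(cpus, idx);	\
	     (idx) < (cpus)->nr;				\
	     (idx)++, (cpu) = toy_cpu_map__cpu(cpus, idx))

/* Assumption for illustration: sockets are contiguous blocks of 18 CPUs. */
#define TOY_CPUS_PER_SOCKET 18

static int toy_socket_of_cpu(int cpu)
{
	return cpu / TOY_CPUS_PER_SOCKET;
}

/* Correct: counts[idx] belongs to the CPU stored at that index. */
static void toy_aggregate_by_cpu(const struct toy_cpu_map *cpus,
				 const long *counts, long *per_socket)
{
	int cpu, idx;

	toy_cpu_map__for_each_cpu(cpu, idx, cpus)
		per_socket[toy_socket_of_cpu(cpu)] += counts[idx];
}

/* Buggy: treats the index as if it were the CPU number. */
static void toy_aggregate_by_index(const struct toy_cpu_map *cpus,
				   const long *counts, long *per_socket)
{
	for (int idx = 0; idx < cpus->nr; idx++)
		per_socket[toy_socket_of_cpu(idx)] += counts[idx];
}
```

With the map {0, 18}, the buggy variant attributes both counts to socket 0, because indices 0 and 1 both fall in the first socket.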


2021-12-11 19:25:12

by Jiri Olsa

Subject: Re: [PATCH 06/22] perf cpumap: Remove map+index get_socket

On Tue, Dec 07, 2021 at 06:45:51PM -0800, Ian Rogers wrote:
> Migrate final users to appropriate cpu variant.
>
> Signed-off-by: Ian Rogers <[email protected]>
> ---
> tools/perf/tests/topology.c | 2 +-
> tools/perf/util/cpumap.c | 9 ---------
> tools/perf/util/cpumap.h | 1 -
> tools/perf/util/stat.c | 2 +-
> 4 files changed, 2 insertions(+), 12 deletions(-)
>
> diff --git a/tools/perf/tests/topology.c b/tools/perf/tests/topology.c
> index 869986139146..69a64074b897 100644
> --- a/tools/perf/tests/topology.c
> +++ b/tools/perf/tests/topology.c
> @@ -150,7 +150,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
>
> // Test that socket ID contains only socket
> for (i = 0; i < map->nr; i++) {
> - id = cpu_map__get_socket(map, i, NULL);
> + id = cpu_map__get_socket_aggr_by_cpu(perf_cpu_map__cpu(map, i), NULL);

you could also use the perf_cpu_map__for_each_cpu in here?
same for the following changes

jirka

> TEST_ASSERT_VAL("Socket map - Socket ID doesn't match",
> session->header.env.cpu[map->map[i]].socket_id == id.socket);
>
> diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
> index feaf34b25efc..342a5eaee9d3 100644
> --- a/tools/perf/util/cpumap.c
> +++ b/tools/perf/util/cpumap.c
> @@ -136,15 +136,6 @@ struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data __maybe_u
> return id;
> }
>
> -struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx,
> - void *data)
> -{
> - if (idx < 0 || idx > map->nr)
> - return cpu_map__empty_aggr_cpu_id();
> -
> - return cpu_map__get_socket_aggr_by_cpu(map->map[idx], data);
> -}
> -
> static int cmp_aggr_cpu_id(const void *a_pointer, const void *b_pointer)
> {
> struct aggr_cpu_id *a = (struct aggr_cpu_id *)a_pointer;
> diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
> index 9648816c4255..a53af24301d2 100644
> --- a/tools/perf/util/cpumap.h
> +++ b/tools/perf/util/cpumap.h
> @@ -32,7 +32,6 @@ size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size);
> size_t cpu_map__fprintf(struct perf_cpu_map *map, FILE *fp);
> int cpu_map__get_socket_id(int cpu);
> struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data);
> -struct aggr_cpu_id cpu_map__get_socket(struct perf_cpu_map *map, int idx, void *data);
> int cpu_map__get_die_id(int cpu);
> struct aggr_cpu_id cpu_map__get_die_aggr_by_cpu(int cpu, void *data);
> struct aggr_cpu_id cpu_map__get_die(struct perf_cpu_map *map, int idx, void *data);
> diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
> index 09ea334586f2..9eca1111fa52 100644
> --- a/tools/perf/util/stat.c
> +++ b/tools/perf/util/stat.c
> @@ -328,7 +328,7 @@ static int check_per_pkg(struct evsel *counter,
> if (!(vals->run && vals->ena))
> return 0;
>
> - s = cpu_map__get_socket(cpus, cpu, NULL).socket;
> + s = cpu_map__get_socket_id(cpu);
> if (s < 0)
> return -1;
>
> --
> 2.34.1.400.ga245620fadb-goog
>


2021-12-11 19:25:19

by Jiri Olsa

Subject: Re: [PATCH 17/22] perf cpumap: Refactor cpu_map__build_map

On Tue, Dec 07, 2021 at 06:46:02PM -0800, Ian Rogers wrote:

SNIP

> - perror("cannot build core map");
> - return -1;
> - }
> - stat_config.aggr_get_id = perf_stat__get_core_file;
> - break;
> + return perf_stat__get_core_file;
> case AGGR_NODE:
> - if (perf_env__build_node_map(env, evsel_list->core.cpus, &stat_config.aggr_map)) {
> - perror("cannot build core map");
> - return -1;
> - }
> - stat_config.aggr_get_id = perf_stat__get_node_file;
> - break;
> + return perf_stat__get_node_file;
> case AGGR_NONE:
> case AGGR_GLOBAL:
> case AGGR_THREAD:
> case AGGR_UNSET:
> default:
> - break;
> + return NULL;
> }
> +}
> +
> +static int perf_stat_init_aggr_mode_file(struct perf_stat *st)
> +{
> + struct perf_env *env = &st->session->header.env;
>
> + aggr_cpu_id_get_t f = aggr_mode__get_aggr_file(stat_config.aggr_mode);

we use get_id for aggr_get_id_t, maybe we could use it instead of 'f' in
here as well

> +
> + if (!f)
> + return 0;
> +
> + stat_config.aggr_map = cpu_aggr_map__new(evsel_list->core.cpus, f, env);
> + if (!stat_config.aggr_map) {
> + pr_err("cannot build %s map", aggr_mode__string[stat_config.aggr_mode]);
> + return -1;
> + }
> + stat_config.aggr_get_id = aggr_mode__get_id_file(stat_config.aggr_mode);
> return 0;
> }
>
> diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
> index 32f9fc2dd389..ba4468f691c8 100644
> --- a/tools/perf/util/cpumap.c
> +++ b/tools/perf/util/cpumap.c
> @@ -140,7 +140,7 @@ struct aggr_cpu_id cpu_map__get_socket_aggr_by_cpu(int cpu, void *data __maybe_u
> return id;
> }
>
> -static int cmp_aggr_cpu_id(const void *a_pointer, const void *b_pointer)
> +static int aggr_cpu_id__cmp(const void *a_pointer, const void *b_pointer)
> {
> struct aggr_cpu_id *a = (struct aggr_cpu_id *)a_pointer;
> struct aggr_cpu_id *b = (struct aggr_cpu_id *)b_pointer;
> @@ -157,37 +157,40 @@ static int cmp_aggr_cpu_id(const void *a_pointer, const void *b_pointer)
> return a->thread - b->thread;
> }
>
> -int cpu_map__build_map(struct perf_cpu_map *cpus, struct cpu_aggr_map **res,
> - struct aggr_cpu_id (*f)(int cpu, void *data),
> - void *data)
> +struct cpu_aggr_map *cpu_aggr_map__new(const struct perf_cpu_map *cpus,
> + aggr_cpu_id_get_t f,
> + void *data)

same here

thanks,
jirka

> {
> - int nr = cpus->nr;
> - struct cpu_aggr_map *c = cpu_aggr_map__empty_new(nr);
> - int cpu, s2;
> - struct aggr_cpu_id s1;
> + int cpu, idx;
> + struct cpu_aggr_map *c = cpu_aggr_map__empty_new(cpus->nr);
>
> if (!c)
> - return -1;
> + return NULL;
>
> /* Reset size as it may only be partially filled */
> c->nr = 0;
>
> - for (cpu = 0; cpu < nr; cpu++) {
> - s1 = f(cpu, data);
> - for (s2 = 0; s2 < c->nr; s2++) {
> - if (aggr_cpu_id__equal(&s1, &c->map[s2]))
> + perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
> + bool duplicate = false;
> + struct aggr_cpu_id cpu_id = f(cpu, data);
> +
> + for (int j = 0; j < c->nr; j++) {
> + if (aggr_cpu_id__equal(&cpu_id, &c->map[j])) {
> + duplicate = true;
> break;
> + }
> }
> - if (s2 == c->nr) {
> - c->map[c->nr] = s1;
> + if (!duplicate) {
> + c->map[c->nr] = cpu_id;
> c->nr++;
> }
> }
> +
> /* ensure we process id in increasing order */
> - qsort(c->map, c->nr, sizeof(struct aggr_cpu_id), cmp_aggr_cpu_id);
> + qsort(c->map, c->nr, sizeof(struct aggr_cpu_id), aggr_cpu_id__cmp);
> +
> + return c;
>
> - *res = c;
> - return 0;
> }
>

SNIP
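
The dedupe-then-sort flow of cpu_aggr_map__new quoted above can be tried out in isolation. This is a minimal sketch, not the perf code: `struct id`, `id__cmp`, `id_get_t`, `socket_id_of` and `build_unique_ids` are hypothetical stand-ins, and the 4-CPUs-per-socket layout is an assumption made only for illustration.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for struct aggr_cpu_id. */
struct id {
	int socket;
	int core;
};

/* Hypothetical callback type: maps a CPU number to its aggregation id. */
typedef struct id (*id_get_t)(int cpu, void *data);

/* Order by socket first, then by core. */
int id__cmp(const void *a_pointer, const void *b_pointer)
{
	const struct id *a = a_pointer;
	const struct id *b = b_pointer;

	if (a->socket != b->socket)
		return a->socket - b->socket;
	return a->core - b->core;
}

/* Example getter, assuming 4 CPUs per socket (illustration only). */
struct id socket_id_of(int cpu, void *data)
{
	struct id id = { .socket = cpu / 4, .core = 0 };

	(void)data;
	return id;
}

/*
 * Fill out[] (capacity nr_cpus) with the unique ids of cpus[], sorted in
 * increasing order. Returns the number of unique ids.
 */
int build_unique_ids(const int *cpus, int nr_cpus, id_get_t get_id,
		     void *data, struct id *out)
{
	int nr = 0;

	for (int i = 0; i < nr_cpus; i++) {
		/* The getter receives the CPU number, never the index i. */
		struct id cur = get_id(cpus[i], data);
		bool duplicate = false;

		for (int j = 0; j < nr; j++) {
			if (id__cmp(&cur, &out[j]) == 0) {
				duplicate = true;
				break;
			}
		}
		if (!duplicate)
			out[nr++] = cur;
	}
	/* Ensure ids are processed in increasing order. */
	qsort(out, nr, sizeof(out[0]), id__cmp);
	return nr;
}
```

Calling `build_unique_ids` with the CPUs 5, 0, 4, 1 and `socket_id_of` collapses the four CPUs into two socket ids, sorted as socket 0 then socket 1.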


2021-12-13 08:56:42

by John Garry

Subject: Re: [PATCH 01/22] libperf: Add comments to perf_cpu_map.

On 10/12/2021 19:08, Arnaldo Carvalho de Melo wrote:
>>>> +/**
>>>> + * A sized, reference counted, sorted array of integers representing CPU
>>>> + * numbers. This is commonly used to capture which CPUs a PMU is associated
>>>> + * with.
>>>> + */
>>>> struct perf_cpu_map {
>>>> refcount_t refcnt;
>>>> + /** Length of the map array. */
>>>> int nr;

I'd have s/nr/len/, as it means the map length, as opposed to the confusing
nr, which could mean the number of cpus in the host or something else. The
new comment also uses "Length".

>>>> + /** The CPU values. */
>>>> int map[];
>>> would simply using more distinct names for the variables help, instead of
>>> or in addition to comments?
> Well, in this case the typical usage doesn't help, as 'struct
> perf_cpu_map' are being used simply as "map"

There are a lot of instances to change ... but I am all for using
consistent and meaningful variable / argument names per type.

> where it should be cpu_map,
> so we would have:
>
> cpu_map->nr
>
> And all should be obvious, no? Otherwise we would have redundant 'cpu',
> like:
>
> cpu_map->nr_cpus
>
> And 'map' should really be entries, so:
>
> cpu_map->entries[index];
>
> Would be clear enough, no?
>
>> Thanks John! I agree. The phrase that is often used is intention
>> revealing names. The kernel style for naming is to be brief:


2021-12-13 11:39:41

by James Clark

Subject: Re: [PATCH 00/22] Refactor perf cpumap



On 08/12/2021 02:45, Ian Rogers wrote:
> Perf cpu map has various functions where a cpumap and index are passed
> in order to load the cpu. A problem with this is that the wrong index
> may be passed for the cpumap, causing problems like aggregation on the
> wrong CPU:
> https://lore.kernel.org/lkml/[email protected]/
>
> This patch set refactors the cpu map API, greatly reducing it and
> explicitly passing the cpu (rather than the pair) to functions that
> need it. Comments are added at the same time.
>
> Ian Rogers (22):
> libperf: Add comments to perf_cpu_map.
> perf stat: Add aggr creators that are passed a cpu.
> perf stat: Switch aggregation to use for_each loop
> perf stat: Switch to cpu version of cpu_map__get
> perf cpumap: Switch cpu_map__build_map to cpu function
> perf cpumap: Remove map+index get_socket
> perf cpumap: Remove map+index get_die
> perf cpumap: Remove map+index get_core
> perf cpumap: Remove map+index get_node
> perf cpumap: Add comments to aggr_cpu_id
> perf cpumap: Remove unused cpu_map__socket
> perf cpumap: Simplify equal function name.
> perf cpumap: Rename empty functions.
> perf cpumap: Document cpu__get_node and remove redundant function
> perf cpumap: Remove map from function names that don't use a map.
> perf cpumap: Remove cpu_map__cpu, use libperf function.
> perf cpumap: Refactor cpu_map__build_map
> perf cpumap: Rename cpu_map__get_X_aggr_by_cpu functions
> perf cpumap: Move 'has' function to libperf
> perf cpumap: Add some comments to cpu_aggr_map
> perf cpumap: Trim the cpu_aggr_map
> perf stat: Fix memory leak in check_per_pkg
>
> tools/lib/perf/cpumap.c | 7 +-
> tools/lib/perf/include/internal/cpumap.h | 9 +-
> tools/lib/perf/include/perf/cpumap.h | 1 +
> tools/perf/arch/arm/util/cs-etm.c | 16 +-
> tools/perf/builtin-ftrace.c | 2 +-
> tools/perf/builtin-sched.c | 6 +-
> tools/perf/builtin-stat.c | 273 ++++++++++++-----------
> tools/perf/tests/topology.c | 10 +-
> tools/perf/util/cpumap.c | 182 ++++++---------
> tools/perf/util/cpumap.h | 102 ++++++---
> tools/perf/util/cputopo.c | 2 +-
> tools/perf/util/env.c | 6 +-
> tools/perf/util/stat-display.c | 69 +++---
> tools/perf/util/stat.c | 9 +-
> tools/perf/util/stat.h | 3 +-
> 15 files changed, 361 insertions(+), 336 deletions(-)
>

For the whole set:

Reviewed-by: James Clark <[email protected]>

I didn't see any obvious issues with mixing up aggregation modes or CPU/idx types. Also
gave perf stat a test in the different modes and didn't see an issue.

But I'm wondering if it's possible to go further and add a struct around the CPU int so that the
compiler checks for correctness instead. It still seems quite easy to mix up index and
CPU, for example these functions are subtly different, but both use int:

LIBPERF_API int perf_cpu_map__cpu(const struct perf_cpu_map *cpus, int idx);
LIBPERF_API bool perf_cpu_map__has(const struct perf_cpu_map *map, int cpu);

Something like this would make it impossible to make a mistake:

struct cpu { int cpu; };

I mean it's more of a coincidence that CPUs can be identified by an integer, but they are more
of an object than an integer, so it could make sense to wrap it. But maybe it could be quite
cumbersome to use and be overkill.
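
A minimal sketch of the wrapper idea, to show what the compiler would then catch. All names here (`struct cpu_num`, `struct cpu_map`, `cpu_map__cpu`, `cpu_map__has`) are hypothetical and are not the libperf API; the point is only that an index stays a plain int while a CPU becomes a distinct type.

```c
#include <stdbool.h>

/* Hypothetical wrapper: a CPU number is no longer interchangeable with an index. */
struct cpu_num {
	int cpu;
};

/* Hypothetical stand-in for a CPU map; not libperf's struct perf_cpu_map. */
struct cpu_map {
	int nr;                         /* number of entries */
	const struct cpu_num *entries;  /* the CPU values */
};

/* Takes an index into the map and returns the CPU stored there (-1 if out of range). */
struct cpu_num cpu_map__cpu(const struct cpu_map *map, int idx)
{
	struct cpu_num invalid = { .cpu = -1 };

	if (idx < 0 || idx >= map->nr)
		return invalid;
	return map->entries[idx];
}

/* Takes a CPU, not an index: passing a plain int here fails to compile. */
bool cpu_map__has(const struct cpu_map *map, struct cpu_num cpu)
{
	for (int i = 0; i < map->nr; i++) {
		if (map->entries[i].cpu == cpu.cpu)
			return true;
	}
	return false;
}
```

With these signatures, `cpu_map__has(map, idx)` is rejected at compile time because an int is not a `struct cpu_num`, whereas `cpu_map__has(map, cpu_map__cpu(map, idx))` is accepted. The cost is the extra `.cpu` at every place that needs the raw number, which is the cumbersomeness mentioned above.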


2021-12-13 16:10:18

by Ian Rogers

Subject: Re: [PATCH 00/22] Refactor perf cpumap

On Mon, Dec 13, 2021 at 3:39 AM James Clark <[email protected]> wrote:
> For the whole set:
>
> Reviewed-by: James Clark <[email protected]>
>
> I didn't see any obvious issues with mixing up aggregation modes or CPU/idx types. Also
> gave perf stat a test in the different modes and didn't see an issue.
>
> But I'm wondering if it's possible to go further and add a struct around the CPU int so that the
> compiler checks for correctness instead. It still seems quite easy to mix up index and
> CPU, for example these functions are subtly different, but both use int:
>
> LIBPERF_API int perf_cpu_map__cpu(const struct perf_cpu_map *cpus, int idx);
> LIBPERF_API bool perf_cpu_map__has(const struct perf_cpu_map *map, int cpu);
>
> Something like this would make it impossible to make a mistake:
>
> struct cpu { int cpu; };
>
> I mean it's more of a coincidence that CPUs can be identified by an integer, but they are more
> of an object than an integer, so it could make sense to wrap it. But maybe it could be quite
> cumbersome to use and be overkill.

Thanks James! I am working on a v2 patch set and will have a go at
adding this to the end.

Ian

2021-12-13 16:12:02

by Ian Rogers

Subject: Re: [PATCH 21/22] perf cpumap: Trim the cpu_aggr_map

On Sat, Dec 11, 2021 at 11:24 AM Jiri Olsa <[email protected]> wrote:
>
> On Tue, Dec 07, 2021 at 06:46:06PM -0800, Ian Rogers wrote:
> > cpu_aggr_map__new removes duplicates, when this happens shrink the
> > array.
> >
> > Signed-off-by: Ian Rogers <[email protected]>
> > ---
> > tools/perf/util/cpumap.c | 7 ++++++-
> > 1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
> > index 8a72ee996722..985c87f1f1ca 100644
> > --- a/tools/perf/util/cpumap.c
> > +++ b/tools/perf/util/cpumap.c
> > @@ -185,7 +185,12 @@ struct cpu_aggr_map *cpu_aggr_map__new(const struct perf_cpu_map *cpus,
> > c->nr++;
> > }
> > }
> > -
> > + /* Trim. */
> > + if (c->nr != cpus->nr) {
> > + c = realloc(c, sizeof(struct cpu_aggr_map) + sizeof(struct aggr_cpu_id) * c->nr);
> > + if (!c)
> > + return NULL;
> > + }
>
> curious.. we should do this, but did you detect some big waste in here?

No real size implications, but I was after coaxing address sanitizer
into detecting potential index out of bounds problems.

Thanks,
Ian

> thanks,
> jirka
>
> > /* ensure we process id in increasing order */
> > qsort(c->map, c->nr, sizeof(struct aggr_cpu_id), aggr_cpu_id__cmp);
> >
> > --
> > 2.34.1.400.ga245620fadb-goog
> >
>
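
The trimming step discussed in this message can be sketched on its own. Shrinking the allocation to the entries actually used means that an access past the trimmed length falls outside the heap block, which is the kind of error address sanitizer reports. The types and helpers below (`struct id`, `struct id_map`, `id_map__new`, `id_map__trim`) are hypothetical stand-ins, not the perf code. Unlike the quoted hunk, this sketch keeps the old pointer when realloc fails, since realloc leaves the original block allocated in that case.

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct aggr_cpu_id. */
struct id {
	int socket;
};

/* Hypothetical stand-in for struct cpu_aggr_map. */
struct id_map {
	int nr;           /* number of entries in map[] */
	struct id map[];  /* flexible array member */
};

/* Allocate a zeroed map with room for nr entries. */
struct id_map *id_map__new(int nr)
{
	struct id_map *m;

	if (nr < 0)
		return NULL;
	m = calloc(1, sizeof(*m) + sizeof(m->map[0]) * nr);
	if (m)
		m->nr = nr;
	return m;
}

/*
 * Shrink the allocation to `used` entries. If realloc fails, the original
 * (larger) map is returned untouched, so nothing is leaked or lost.
 */
struct id_map *id_map__trim(struct id_map *m, int used)
{
	struct id_map *tmp;

	if (used < 0 || used >= m->nr)
		return m;

	tmp = realloc(m, sizeof(*m) + sizeof(m->map[0]) * used);
	if (!tmp)
		return m;

	tmp->nr = used;
	return tmp;
}
```

After `m = id_map__trim(m, 2)` on a four-entry map, reading `m->map[3]` is a heap overflow that a build with `-fsanitize=address` can flag; before the trim the same read would have landed inside the allocation and gone unnoticed.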

2021-12-13 16:17:22

by Ian Rogers

Subject: Re: [PATCH 03/22] perf stat: Switch aggregation to use for_each loop

On Sat, Dec 11, 2021 at 11:25 AM Jiri Olsa <[email protected]> wrote:
>
> On Tue, Dec 07, 2021 at 06:45:48PM -0800, Ian Rogers wrote:
> > Tidy up the use of cpu and index to hopefully make the code less error
> > prone. Avoid unused warnings with (void) which will be removed in a
> > later patch.
> >
> > In aggr_update_shadow, the perf_cpu_map is switched from
> > the evlist to the counter's cpu map, so the index is appropriate. This
> > addresses a problem where uncore counts, with a cpumap like:
> > $ cat /sys/devices/uncore_imc_0/cpumask
> > 0,18
> > Don't aggregate counts in CPUs based on the index of those values in the
> > cpumap (0 and 1) but on the actual CPU (0 and 18). Thereby correcting
> > metric calculations in per-socket mode for counters without a full
> > cpumask.
> >
> > Signed-off-by: Ian Rogers <[email protected]>
> > ---
> > tools/perf/util/stat-display.c | 48 +++++++++++++++++++---------------
> > 1 file changed, 27 insertions(+), 21 deletions(-)
> >
> > diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> > index 588601000f3f..efab39a759ff 100644
> > --- a/tools/perf/util/stat-display.c
> > +++ b/tools/perf/util/stat-display.c
> > @@ -330,8 +330,8 @@ static void print_metric_header(struct perf_stat_config *config,
> > static int first_shadow_cpu(struct perf_stat_config *config,
> > struct evsel *evsel, struct aggr_cpu_id id)
> > {
> > - struct evlist *evlist = evsel->evlist;
> > - int i;
> > + struct perf_cpu_map *cpus;
> > + int cpu, idx;
> >
> > if (config->aggr_mode == AGGR_NONE)
> > return id.core;
> > @@ -339,14 +339,11 @@ static int first_shadow_cpu(struct perf_stat_config *config,
> > if (!config->aggr_get_id)
> > return 0;
> >
> > - for (i = 0; i < evsel__nr_cpus(evsel); i++) {
> > - int cpu2 = evsel__cpus(evsel)->map[i];
> > -
> > - if (cpu_map__compare_aggr_cpu_id(
> > - config->aggr_get_id(config, evlist->core.cpus, cpu2),
> > - id)) {
> > - return cpu2;
> > - }
> > + cpus = evsel__cpus(evsel);
> > + perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
> > + if (cpu_map__compare_aggr_cpu_id(config->aggr_get_id(config, cpus, idx),
> > + id))
> > + return cpu;
>
> so this looks strange, you pass idx instead of cpu2 to aggr_get_id,
> which takes idx as 3rd argument, so it looks like it was broken now,
> should this be a separate fix?

Yep, I tried to cover this in the commit message, but agree a separate
patch would be clearer. The aggregation is currently broken on
anything other than CPU 0 or when the CPU mask covers every CPU - the
case for something like topdown, hence this not being spotted.

> also the original code for some reason passed evlist->core.cpus
> to aggr_get_id, which might differ from evsel's cpus

Part of the same fix.

> same for aggr_update_shadow change

In this case the cpu is really an index and so the change is just
renaming one to the other for the sake of clarity.

Thanks,
Ian

> jirka
>
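
The index-versus-CPU mix-up described in this exchange can be reproduced with a cpumask of 0,18. The sketch below is illustrative only: `socket_of_cpu` and `sum_for_socket` are hypothetical helpers, and the topology (CPUs 0-17 on socket 0, CPUs 18-35 on socket 1) is an assumption chosen to match that mask.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed topology: CPUs 0-17 sit on socket 0, CPUs 18-35 on socket 1. */
int socket_of_cpu(int cpu)
{
	return cpu < 18 ? 0 : 1;
}

/*
 * Sum the per-entry counts that belong to `socket`. With by_index set, the
 * position in the map is (wrongly) treated as the CPU number, reproducing
 * the mistake; otherwise the CPU stored at that position is used.
 */
long sum_for_socket(const int *cpus, const long *counts, size_t nr,
		    int socket, bool by_index)
{
	long total = 0;

	for (size_t idx = 0; idx < nr; idx++) {
		int key = by_index ? (int)idx : cpus[idx];

		if (socket_of_cpu(key) == socket)
			total += counts[idx];
	}
	return total;
}
```

With `cpus = {0, 18}` and `counts = {100, 200}`, keying on the CPU attributes 100 to socket 0 and 200 to socket 1. Keying on the index treats the entries as CPUs 0 and 1, so all 300 lands on socket 0 and socket 1 reads as 0. When the map covers every CPU the index and the CPU coincide, which is why the problem is easy to miss.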

2021-12-13 22:06:58

by Ian Rogers

Subject: Re: [PATCH 00/22] Refactor perf cpumap

On Mon, Dec 13, 2021 at 8:10 AM Ian Rogers <[email protected]> wrote:
>
> On Mon, Dec 13, 2021 at 3:39 AM James Clark <[email protected]> wrote:
> > For the whole set:
> >
> > Reviewed-by: James Clark <[email protected]>
> >
> > I didn't see any obvious issues with mixing up aggregation modes or CPU/idx types. Also
> > gave perf stat a test in the different modes and didn't see an issue.
> >
> > But I'm wondering if it's possible to go further and add a struct around the CPU int so that the
> > compiler checks for correctness instead. It still seems quite easy to mix up index and
> > CPU, for example these functions are subtly different, but both use int:
> >
> > LIBPERF_API int perf_cpu_map__cpu(const struct perf_cpu_map *cpus, int idx);
> > LIBPERF_API bool perf_cpu_map__has(const struct perf_cpu_map *map, int cpu);
> >
> > Something like this would make it impossible to make a mistake:
> >
> > struct cpu { int cpu; };
> >
> > I mean it's more of a coincidence that CPUs can be identified by an integer, but they are more
> > of an object than an integer, so it could make sense to wrap it. But maybe it could be quite
> > cumbersome to use and be overkill.
>
> Thanks James! I am working on a v2 patch set and will have a go at
> adding this to the end.
>
> Ian

I was checking on the style issues around wrapping an int with a
struct, and it is preferred style to enforce strict type checking (by
way of an old post):
https://lore.kernel.org/all/[email protected]/

Thanks,
Ian