2021-08-21 09:21:49

by Riccardo Mancini

Subject: [RFC PATCH v1 00/37] perf: use workqueue for evlist operations

Hi,

this patchset adds multithreading through the workqueue to the most
important evlist operations: enable, disable, close, and open.
Their multithreaded implementations are then used in perf-record through
a new option, '--threads' (example invocations below).
It depends on the workqueue patchset (v3):
https://lore.kernel.org/lkml/[email protected]/
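
For illustration, an invocation could look like this (the option takes
an optional worker count via OPT_UINTEGER_OPTARG; defaulting to one
worker per monitored cpu when the count is omitted is my assumption):

	$ perf record --threads -a -- sleep 1
	$ perf record --threads=4 -a -- sleep 1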

In each operation, each worker is assigned a cpu and pins itself to that
cpu to perform the operation on all evsels. If enough threads are
provided, the underlying threads are pinned to the cpus; otherwise, they
just change their affinity temporarily.

Parallelization of enable, disable, and close is pretty straightforward,
while open requires more work to separate the actual open from all
fallback mechanisms.
In the multithreaded implementation of open, each thread runs until it
either finishes or hits an error. When all threads finish, the main
process checks for errors. If it finds one, it applies a fallback and
resumes the open on all cpus from where they encountered the error
(see the sketch below).
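
In pseudo-C, that loop could look roughly as follows (a sketch only:
the helper names are illustrative, not the actual functions added by
this series):

	do {
		retry = false;
		queue_one_open_work_per_cpu(evlist); /* opens run in parallel */
		flush_scheduled_work();              /* wait for all workers */
		err = first_worker_error(evlist);
		if (err) {
			if (apply_fallback(evlist, err)) /* e.g. missing feature */
				retry = true; /* resume from where each cpu stopped */
			else
				return err;   /* no fallback left: report the error */
		}
	} while (retry);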

I have tested the main fallback mechanisms (precise_ip, ignore missing
thread, fd limit increase), but not all of the missing-feature ones.
I also ran perf test, with no errors. Below you can find the skipped
tests (I omitted successful results for brevity):
$ sudo ./perf test
23: Watchpoint :
23.1: Read Only Watchpoint : Skip (missing hardware support)
58: builtin clang support : Skip (not compiled in)
63: Test libpfm4 support : Skip (not compiled in)
89: perf stat --bpf-counters test : Skip
90: Check Arm CoreSight trace data recording and synthesized samples: Skip

I know the patchset is huge, but I didn't have time to split it (and
my time is running out). In any case, I tried to keep related patches
close together. It is organized as follows:
- 1 - 3: remove the cpu iterator inside evsel to simplify
parallelization. In doing so, the cpumap idx and max methods are
improved based on the assumption that the cpumap is ordered.
- 4 - 5: preparation patches for adding affinities to threadpool.
- 6 - 8: add affinity support to threadpool and workqueue (preparation
for adding workqueue to evsel).
- 9: preparation for adding workqueue to evsel.
- 10 - 13: add multithreading to evlist enable, disable, and close.
- 14 - 27: preparation for adding multithreading to evlist__open.
- 28: add multithreading to evlist__open.
- 29 - 34: use multithreaded evlist operations in perf-record.
- 35 - 37: improve the evlist-open-close benchmark, adding multithreading
and detailed output.

I'll be happy to split it if necessary in the future, with the goal of
merging my GSoC work, but, for now, I need to send it as is to include it
in my final report.

Below are some experimental results of evlist-open-close benchmark run on:
- laptop (2 cores + hyperthreading)
- vm (16 vCPUs)

The command line was:
$ ./perf bench internals evlist-open-close [-j] <options>

where [-j] was included only in the specified rows and <options> refers
to the column label:
- "" (dummy): open one dummy event on all cpus.
- "-n 100": open 100 dummy events on all cpus.
- "-e '{cs,cycles}'": open the "{cs,cycles}" group on all cpus.
- "-u 0": open one dummy event on all cpus for all processes of root user
(~300 on my laptop; ~360 on my VM).

Results (average open-close time in usec):
machine  configuration     (dummy)        -n 100  -e '{cs,cycles}'           -u 0
laptop   perf/core       980 +- 130  10514 +- 313      31950 +- 526   14529 +- 241
laptop   this w/o -j     698 +- 102  11302 +- 283      31807 +- 448   13885 +- 143
laptop   this w/ -j     2233 +- 261   5434 +- 386      13586 +- 443    9465 +- 568
vm       perf/core      9818 +-  94  89993 +- 941               N/A  266414 +-1431
vm       this w/o -j    5245 +-  88  82806 +- 922               N/A  260416 +-1563
vm       this w/ -j    37787 +- 748  54844 +-1089               N/A  101088 +-1900

Comments:
- opening one dummy event in single-threaded mode is faster than on
perf/core, probably due to the changes in how evsel cpus are iterated.
- opening one event in multithreaded mode is not worth it.
- in all other cases, multithreaded mode is between 1.5x and 2.5x faster.

Time breakdown on my laptop:
One dummy event per cpu:
$ ./perf bench internals evlist-open-close -d
# Running 'internals/evlist-open-close' benchmark:
Number of workers: 1
Number of cpus: 4
Number of threads: 1
Number of events: 1 (4 fds)
Number of iterations: 100
Average open-close took: 879.250 usec (+- 103.238 usec)
init took: 0.040 usec (+- 0.020 usec)
open took: 28.900 usec (+- 6.675 usec)
mmap took: 415.870 usec (+- 17.391 usec)
enable took: 240.950 usec (+- 92.641 usec)
disable took: 64.670 usec (+- 9.722 usec)
munmap took: 16.450 usec (+- 3.571 usec)
close took: 112.100 usec (+- 25.465 usec)
fini took: 0.220 usec (+- 0.042 usec)
$ ./perf bench internals evlist-open-close -j -d
# Running 'internals/evlist-open-close' benchmark:
Number of workers: 4
Number of cpus: 4
Number of threads: 1
Number of events: 1 (4 fds)
Number of iterations: 100
Average open-close took: 1979.670 usec (+- 271.772 usec)
init took: 30.860 usec (+- 1.190 usec)
open took: 552.040 usec (+- 96.166 usec)
mmap took: 428.970 usec (+- 9.492 usec)
enable took: 222.740 usec (+- 56.112 usec)
disable took: 191.990 usec (+- 55.029 usec)
munmap took: 13.670 usec (+- 0.754 usec)
close took: 155.660 usec (+- 44.079 usec)
fini took: 383.520 usec (+- 87.476 usec)

Comments:
- the overhead comes from open (spinning up threads) and fini
(terminating threads). There could be some improvements there (e.g.
not waiting for the threads to spawn).
- enable, disable, and close also take longer due to the overhead of
assigning the tasks to the workers.

One dummy event per process per cpu:
$ ./perf bench internals evlist-open-close -d -u0
# Running 'internals/evlist-open-close' benchmark:
Number of workers: 1
Number of cpus: 4
Number of threads: 295
Number of events: 1 (1180 fds)
Number of iterations: 100
Average open-close took: 15101.380 usec (+- 247.959 usec)
init took: 0.010 usec (+- 0.010 usec)
open took: 4224.460 usec (+- 119.028 usec)
mmap took: 3235.210 usec (+- 55.867 usec)
enable took: 2359.570 usec (+- 44.923 usec)
disable took: 2321.100 usec (+- 80.779 usec)
munmap took: 304.440 usec (+- 11.558 usec)
close took: 2655.920 usec (+- 59.089 usec)
fini took: 0.380 usec (+- 0.051 usec)
$ ./perf bench internals evlist-open-close -j -d -u0
# Running 'internals/evlist-open-close' benchmark:
Number of workers: 4
Number of cpus: 4
Number of threads: 298
Number of events: 1 (1192 fds)
Number of iterations: 100
Average open-close took: 10321.060 usec (+- 771.875 usec)
init took: 31.530 usec (+- 0.721 usec)
open took: 2849.870 usec (+- 533.019 usec)
mmap took: 3267.810 usec (+- 87.465 usec)
enable took: 1041.160 usec (+- 66.324 usec)
disable took: 1176.970 usec (+- 134.291 usec)
munmap took: 253.680 usec (+- 4.525 usec)
close took: 1204.550 usec (+- 101.284 usec)
fini took: 495.260 usec (+- 136.661 usec)

Comments:
- mmap/munmap are not parallelized and account for 20% of the time.
- open time is reduced by only 33%, held back by the overhead of
spinning up threads, which is around half a millisecond.
- enable, disable, and close times are halved.

It is not always worth using multithreading in evlist operations, but
in the good cases the improvements can be significant.
For this reason, we could include a heuristic to decide whether to use
it or not (e.g. use it only if there are more than X event fds to open);
a possible sketch follows.
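
A minimal sketch of such a heuristic (the helper name and the threshold
are hypothetical and would need experimental tuning):

	static bool evlist__should_multithread(struct evlist *evlist)
	{
		/* rough estimate of the number of fds to open */
		int nr_fds = perf_cpu_map__nr(evlist->core.all_cpus) *
			     evlist->core.nr_entries;

		return nr_fds > 100;	/* hypothetical threshold */
	}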

It'd be great to see more experiments like these on a bigger machine, and
in a more realistic scenario (instead of dummy events).

Thanks,
Riccardo

Riccardo Mancini (37):
libperf cpumap: improve idx function
libperf cpumap: improve max function
perf evlist: replace evsel__cpu_iter* functions with evsel__find_cpu
perf util: add mmap_cpu_mask__duplicate function
perf util/mmap: add missing bitops.h header
perf workqueue: add affinities to threadpool
perf workqueue: add support for setting affinities to workers
perf workqueue: add method to execute work on specific CPU
perf python: add workqueue dependency
perf evlist: add multithreading helper
perf evlist: add multithreading to evlist__disable
perf evlist: add multithreading to evlist__enable
perf evlist: add multithreading to evlist__close
perf evsel: remove retry_sample_id goto label
perf evsel: separate open preparation from open itself
perf evsel: save open flags in evsel
perf evsel: separate missing feature disabling from evsel__open_cpu
perf evsel: add evsel__prepare_open function
perf evsel: separate missing feature detection from evsel__open_cpu
perf evsel: separate rlimit increase from evsel__open_cpu
perf evsel: move ignore_missing_thread to fallback code
perf evsel: move test_attr__open to success path in evsel__open_cpu
perf evsel: move bpf_counter__install_pe to success path in
evsel__open_cpu
perf evsel: handle precise_ip fallback in evsel__open_cpu
perf evsel: move event open in evsel__open_cpu to separate function
perf evsel: add evsel__open_per_cpu_no_fallback function
perf evlist: add evlist__for_each_entry_from macro
perf evlist: add multithreading to evlist__open
perf evlist: add custom fallback to evlist__open
perf record: use evlist__open_custom
tools lib/subcmd: add OPT_UINTEGER_OPTARG option type
perf record: add --threads option
perf record: pin threads to monitored cpus if enough threads available
perf record: apply multithreading in init and fini phases
perf test/evlist-open-close: add multithreading
perf test/evlist-open-close: use inline func to convert timeval to
usec
perf test/evlist-open-close: add detailed output mode

tools/lib/perf/cpumap.c | 27 +-
tools/lib/subcmd/parse-options.h | 1 +
tools/perf/Documentation/perf-record.txt | 9 +
tools/perf/bench/evlist-open-close.c | 116 +++++-
tools/perf/builtin-record.c | 128 ++++--
tools/perf/builtin-stat.c | 24 +-
tools/perf/tests/workqueue.c | 87 ++++-
tools/perf/util/evlist.c | 464 +++++++++++++++++-----
tools/perf/util/evlist.h | 44 ++-
tools/perf/util/evsel.c | 473 ++++++++++++++---------
tools/perf/util/evsel.h | 36 +-
tools/perf/util/mmap.c | 12 +
tools/perf/util/mmap.h | 4 +
tools/perf/util/python-ext-sources | 2 +
tools/perf/util/record.h | 3 +
tools/perf/util/workqueue/threadpool.c | 70 ++++
tools/perf/util/workqueue/threadpool.h | 7 +
tools/perf/util/workqueue/workqueue.c | 154 +++++++-
tools/perf/util/workqueue/workqueue.h | 18 +
19 files changed, 1334 insertions(+), 345 deletions(-)

--
2.31.1


2021-08-21 09:21:53

by Riccardo Mancini

Subject: [RFC PATCH v1 11/37] perf evlist: add multithreading to evlist__disable

In this patch, evlist__for_each_evsel_cpu is used in evlist__disable to
allow it to run in parallel.
The 'disable immediate events last' logic is preserved through the
helper's positive return values: the callback returns 1 when it sees an
immediate evsel, so the caller knows whether a second pass is needed.

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/util/evlist.c | 57 +++++++++++++++++++++-------------------
1 file changed, 30 insertions(+), 27 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index f9fdbd85a163ee97..5c82246813c51c51 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -486,41 +486,44 @@ int evlist__for_each_evsel_cpu(struct evlist *evlist, evsel__cpu_func func, void

}

+struct evlist_disable_args {
+ char *evsel_name;
+ int imm;
+};
+
+static int __evlist__disable_evsel_cpu_func(struct evlist *evlist __maybe_unused,
+ struct evsel *pos, int cpu, void *_args)
+{
+ int ret = 0;
+ struct evlist_disable_args *args = _args;
+
+ if (evsel__strcmp(pos, args->evsel_name))
+ return 0;
+ if (pos->disabled || !evsel__is_group_leader(pos) || !pos->core.fd)
+ return 0;
+ if (pos->immediate)
+ ret = 1;
+ if (pos->immediate != args->imm)
+ return ret;
+ evsel__disable_cpu(pos, cpu);
+
+ return ret;
+}
+
+
static void __evlist__disable(struct evlist *evlist, char *evsel_name)
{
struct evsel *pos;
- struct affinity affinity;
- int cpu, i, imm = 0, cpu_idx;
- bool has_imm = false;
-
- if (affinity__setup(&affinity) < 0)
- return;
+ int ret;
+ struct evlist_disable_args args = { .evsel_name = evsel_name };

/* Disable 'immediate' events last */
- for (imm = 0; imm <= 1; imm++) {
- evlist__for_each_cpu(evlist, i, cpu) {
- affinity__set(&affinity, cpu);
-
- evlist__for_each_entry(evlist, pos) {
- if (evsel__strcmp(pos, evsel_name))
- continue;
- cpu_idx = evsel__find_cpu(pos, cpu);
- if (cpu_idx < 0)
- continue;
- if (pos->disabled || !evsel__is_group_leader(pos) || !pos->core.fd)
- continue;
- if (pos->immediate)
- has_imm = true;
- if (pos->immediate != imm)
- continue;
- evsel__disable_cpu(pos, cpu_idx);
- }
- }
- if (!has_imm)
+ for (args.imm = 0; args.imm <= 1; args.imm++) {
+ ret = evlist__for_each_evsel_cpu(evlist, __evlist__disable_evsel_cpu_func, &args);
+ if (!ret) // does not have immediate evsels
break;
}

- affinity__cleanup(&affinity);
evlist__for_each_entry(evlist, pos) {
if (evsel__strcmp(pos, evsel_name))
continue;
--
2.31.1

2021-08-21 09:22:03

by Riccardo Mancini

Subject: [RFC PATCH v1 12/37] perf evlist: add multithreading to evlist__enable

In this patch, evlist__for_each_evsel_cpu is used in evlist__enable to
allow it to run in parallel.

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/util/evlist.c | 41 +++++++++++++---------------------------
1 file changed, 13 insertions(+), 28 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 5c82246813c51c51..fbe2d6ed8ecc8f21 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -552,37 +552,22 @@ void evlist__disable_evsel(struct evlist *evlist, char *evsel_name)
__evlist__disable(evlist, evsel_name);
}

-static void __evlist__enable(struct evlist *evlist, char *evsel_name)
+static int __evlist__enable_evsel_cpu_func(struct evlist *evlist __maybe_unused,
+ struct evsel *pos, int cpu, void *args)
{
- struct evsel *pos;
- struct affinity affinity;
- int cpu, i, cpu_idx;
+ char *evsel_name = args;

- if (affinity__setup(&affinity) < 0)
- return;
-
- evlist__for_each_cpu(evlist, i, cpu) {
- affinity__set(&affinity, cpu);
+ if (evsel__strcmp(pos, evsel_name))
+ return 0;
+ if (!evsel__is_group_leader(pos) || !pos->core.fd)
+ return 0;
+ evsel__enable_cpu(pos, cpu);
+ return 0;
+}

- evlist__for_each_entry(evlist, pos) {
- if (evsel__strcmp(pos, evsel_name))
- continue;
- cpu_idx = evsel__find_cpu(pos, cpu);
- if (cpu_idx < 0)
- continue;
- if (!evsel__is_group_leader(pos) || !pos->core.fd)
- continue;
- evsel__enable_cpu(pos, cpu_idx);
- }
- }
- affinity__cleanup(&affinity);
- evlist__for_each_entry(evlist, pos) {
- if (evsel__strcmp(pos, evsel_name))
- continue;
- if (!evsel__is_group_leader(pos) || !pos->core.fd)
- continue;
- pos->disabled = false;
- }
+static void __evlist__enable(struct evlist *evlist, char *evsel_name)
+{
+ evlist__for_each_evsel_cpu(evlist, __evlist__enable_evsel_cpu_func, evsel_name);

/*
* Even single event sets the 'enabled' for evlist,
--
2.31.1

2021-08-21 09:22:04

by Riccardo Mancini

Subject: [RFC PATCH v1 13/37] perf evlist: add multithreading to evlist__close

In this patch, evlist__for_each_evsel_cpu is used in evlist__close to
allow it to run in parallel.
Note that the per-cpu close now iterates evsels in list order instead of
reverse order, since the helper iterates forward; the fd/id freeing loop
still runs in reverse.

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/util/evlist.c | 23 +++++++++--------------
1 file changed, 9 insertions(+), 14 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index fbe2d6ed8ecc8f21..f0a839107e8849bf 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1340,11 +1340,17 @@ void evlist__set_selected(struct evlist *evlist, struct evsel *evsel)
evlist->selected = evsel;
}

+static int __evlist__close_evsel_cpu_func(struct evlist *evlist __maybe_unused,
+ struct evsel *evsel, int cpu,
+ void *args __maybe_unused)
+{
+ perf_evsel__close_cpu(&evsel->core, cpu);
+ return 0;
+}
+
void evlist__close(struct evlist *evlist)
{
struct evsel *evsel;
- struct affinity affinity;
- int cpu, i, cpu_idx;

/*
* With perf record core.cpus is usually NULL.
@@ -1356,19 +1362,8 @@ void evlist__close(struct evlist *evlist)
return;
}

- if (affinity__setup(&affinity) < 0)
- return;
- evlist__for_each_cpu(evlist, i, cpu) {
- affinity__set(&affinity, cpu);
+ evlist__for_each_evsel_cpu(evlist, __evlist__close_evsel_cpu_func, NULL);

- evlist__for_each_entry_reverse(evlist, evsel) {
- cpu_idx = evsel__find_cpu(evsel, cpu);
- if (cpu_idx < 0)
- continue;
- perf_evsel__close_cpu(&evsel->core, cpu_idx);
- }
- }
- affinity__cleanup(&affinity);
evlist__for_each_entry_reverse(evlist, evsel) {
perf_evsel__free_fd(&evsel->core);
perf_evsel__free_id(&evsel->core);
--
2.31.1

2021-08-21 09:22:12

by Riccardo Mancini

Subject: [RFC PATCH v1 14/37] perf evsel: remove retry_sample_id goto label

As far as I can tell, there is no good reason, apart from optimization,
to have retry_sample_id separate from fallback_missing_features.
Probably, this label was added to avoid reapplying workarounds for
missing features that had already been applied.
However, missing features that have been added later have not used this
optimization, always jumping to fallback_missing_features and reapplying
all missing features.

This patch removes that label, replacing it with
fallback_missing_features.

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/util/evsel.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index f61e5dd53f5d2859..7b4bb3229a16524e 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1825,7 +1825,6 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
evsel->core.attr.bpf_event = 0;
if (perf_missing_features.branch_hw_idx)
evsel->core.attr.branch_sample_type &= ~PERF_SAMPLE_BRANCH_HW_INDEX;
-retry_sample_id:
if (perf_missing_features.sample_id_all)
evsel->core.attr.sample_id_all = 0;

@@ -2006,7 +2005,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
} else if (!perf_missing_features.sample_id_all) {
perf_missing_features.sample_id_all = true;
pr_debug2_peo("switching off sample_id_all\n");
- goto retry_sample_id;
+ goto fallback_missing_features;
} else if (!perf_missing_features.lbr_flags &&
(evsel->core.attr.branch_sample_type &
(PERF_SAMPLE_BRANCH_NO_CYCLES |
--
2.31.1

2021-08-21 09:22:15

by Riccardo Mancini

Subject: [RFC PATCH v1 04/37] perf util: add mmap_cpu_mask__duplicate function

This patch adds a new function in util/mmap.c to duplicate a mmap_cpu_mask.

This new function will be used in the following patches.
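
A minimal usage sketch (assuming an existing mask named 'original'; the
caller owns the duplicated bits and must free them):

	struct mmap_cpu_mask clone;

	if (mmap_cpu_mask__duplicate(&original, &clone))
		return -ENOMEM;
	/* ... use clone ... */
	bitmap_free(clone.bits);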

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/util/mmap.c | 12 ++++++++++++
tools/perf/util/mmap.h | 3 +++
2 files changed, 15 insertions(+)

diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
index ab7108d22428b527..9e9a447682cc962c 100644
--- a/tools/perf/util/mmap.c
+++ b/tools/perf/util/mmap.c
@@ -350,3 +350,15 @@ int perf_mmap__push(struct mmap *md, void *to,
out:
return rc;
}
+
+int mmap_cpu_mask__duplicate(struct mmap_cpu_mask *original,
+ struct mmap_cpu_mask *clone)
+{
+ clone->nbits = original->nbits;
+ clone->bits = bitmap_alloc(original->nbits);
+ if (!clone->bits)
+ return -ENOMEM;
+
+ memcpy(clone->bits, original->bits, MMAP_CPU_MASK_BYTES(original));
+ return 0;
+}
diff --git a/tools/perf/util/mmap.h b/tools/perf/util/mmap.h
index 9d5f589f02ae70e1..b4923e587fd7749c 100644
--- a/tools/perf/util/mmap.h
+++ b/tools/perf/util/mmap.h
@@ -63,4 +63,7 @@ size_t mmap__mmap_len(struct mmap *map);

void mmap_cpu_mask__scnprintf(struct mmap_cpu_mask *mask, const char *tag);

+int mmap_cpu_mask__duplicate(struct mmap_cpu_mask *original,
+ struct mmap_cpu_mask *clone);
+
#endif /*__PERF_MMAP_H */
--
2.31.1

2021-08-21 09:22:19

by Riccardo Mancini

Subject: [RFC PATCH v1 15/37] perf evsel: separate open preparation from open itself

This is a preparatory patch for the following patches, with the goal of
separating, in evsel__open_cpu, the actual perf_event_open, which could
be performed in parallel, from the existing fallback mechanisms, which
should be handled sequentially.

This patch separates the first lines of evsel__open_cpu into a new
__evsel__prepare_open function.

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/util/evsel.c | 45 +++++++++++++++++++++++++++++++----------
1 file changed, 34 insertions(+), 11 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 7b4bb3229a16524e..ddf324e2e17a0951 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1746,22 +1746,20 @@ static int perf_event_open(struct evsel *evsel,
return fd;
}

-static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
- struct perf_thread_map *threads,
- int start_cpu, int end_cpu)
+
+static struct perf_cpu_map *empty_cpu_map;
+static struct perf_thread_map *empty_thread_map;
+
+static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
+ struct perf_thread_map *threads)
{
- int cpu, thread, nthreads;
- unsigned long flags = PERF_FLAG_FD_CLOEXEC;
- int pid = -1, err, old_errno;
- enum { NO_CHANGE, SET_TO_MAX, INCREASED_MAX } set_rlimit = NO_CHANGE;
+ int nthreads;

if ((perf_missing_features.write_backward && evsel->core.attr.write_backward) ||
(perf_missing_features.aux_output && evsel->core.attr.aux_output))
return -EINVAL;

if (cpus == NULL) {
- static struct perf_cpu_map *empty_cpu_map;
-
if (empty_cpu_map == NULL) {
empty_cpu_map = perf_cpu_map__dummy_new();
if (empty_cpu_map == NULL)
@@ -1772,8 +1770,6 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
}

if (threads == NULL) {
- static struct perf_thread_map *empty_thread_map;
-
if (empty_thread_map == NULL) {
empty_thread_map = thread_map__new_by_tid(-1);
if (empty_thread_map == NULL)
@@ -1792,6 +1788,33 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
perf_evsel__alloc_fd(&evsel->core, cpus->nr, nthreads) < 0)
return -ENOMEM;

+ return 0;
+}
+
+static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
+ struct perf_thread_map *threads,
+ int start_cpu, int end_cpu)
+{
+ int cpu, thread, nthreads;
+ unsigned long flags = PERF_FLAG_FD_CLOEXEC;
+ int pid = -1, err, old_errno;
+ enum { NO_CHANGE, SET_TO_MAX, INCREASED_MAX } set_rlimit = NO_CHANGE;
+
+ err = __evsel__prepare_open(evsel, cpus, threads);
+ if (err)
+ return err;
+
+ if (cpus == NULL)
+ cpus = empty_cpu_map;
+
+ if (threads == NULL)
+ threads = empty_thread_map;
+
+ if (evsel->core.system_wide)
+ nthreads = 1;
+ else
+ nthreads = threads->nr;
+
if (evsel->cgrp) {
flags |= PERF_FLAG_PID_CGROUP;
pid = evsel->cgrp->fd;
--
2.31.1

2021-08-21 09:22:32

by Riccardo Mancini

Subject: [RFC PATCH v1 16/37] perf evsel: save open flags in evsel

This patch caches the flags used in perf_event_open inside evsel, so
that they can be set in __evsel__prepare_open (this will be useful
in following patches, when the fallback mechanisms will be handled
outside the open itself).
This also optimizes the code by not having to recompute them every time.

Since flags are now saved in evsel, the flags argument in
perf_event_open is removed.

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/util/evsel.c | 24 ++++++++++++------------
tools/perf/util/evsel.h | 1 +
2 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index ddf324e2e17a0951..509a2970a94b3142 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1710,17 +1710,16 @@ static void display_attr(struct perf_event_attr *attr)
}

static int perf_event_open(struct evsel *evsel,
- pid_t pid, int cpu, int group_fd,
- unsigned long flags)
+ pid_t pid, int cpu, int group_fd)
{
int precise_ip = evsel->core.attr.precise_ip;
int fd;

while (1) {
pr_debug2_peo("sys_perf_event_open: pid %d cpu %d group_fd %d flags %#lx",
- pid, cpu, group_fd, flags);
+ pid, cpu, group_fd, evsel->open_flags);

- fd = sys_perf_event_open(&evsel->core.attr, pid, cpu, group_fd, flags);
+ fd = sys_perf_event_open(&evsel->core.attr, pid, cpu, group_fd, evsel->open_flags);
if (fd >= 0)
break;

@@ -1788,6 +1787,10 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
perf_evsel__alloc_fd(&evsel->core, cpus->nr, nthreads) < 0)
return -ENOMEM;

+ evsel->open_flags = PERF_FLAG_FD_CLOEXEC;
+ if (evsel->cgrp)
+ evsel->open_flags |= PERF_FLAG_PID_CGROUP;
+
return 0;
}

@@ -1796,7 +1799,6 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
int start_cpu, int end_cpu)
{
int cpu, thread, nthreads;
- unsigned long flags = PERF_FLAG_FD_CLOEXEC;
int pid = -1, err, old_errno;
enum { NO_CHANGE, SET_TO_MAX, INCREASED_MAX } set_rlimit = NO_CHANGE;

@@ -1815,10 +1817,8 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
else
nthreads = threads->nr;

- if (evsel->cgrp) {
- flags |= PERF_FLAG_PID_CGROUP;
+ if (evsel->cgrp)
pid = evsel->cgrp->fd;
- }

fallback_missing_features:
if (perf_missing_features.weight_struct) {
@@ -1832,7 +1832,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
evsel->core.attr.clockid = 0;
}
if (perf_missing_features.cloexec)
- flags &= ~(unsigned long)PERF_FLAG_FD_CLOEXEC;
+ evsel->open_flags &= ~(unsigned long)PERF_FLAG_FD_CLOEXEC;
if (perf_missing_features.mmap2)
evsel->core.attr.mmap2 = 0;
if (perf_missing_features.exclude_guest)
@@ -1866,7 +1866,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
test_attr__ready();

fd = perf_event_open(evsel, pid, cpus->map[cpu],
- group_fd, flags);
+ group_fd);

FD(evsel, cpu, thread) = fd;

@@ -1874,7 +1874,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,

if (unlikely(test_attr__enabled)) {
test_attr__open(&evsel->core.attr, pid, cpus->map[cpu],
- fd, group_fd, flags);
+ fd, group_fd, evsel->open_flags);
}

if (fd < 0) {
@@ -2012,7 +2012,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
perf_missing_features.clockid = true;
pr_debug2_peo("switching off use_clockid\n");
goto fallback_missing_features;
- } else if (!perf_missing_features.cloexec && (flags & PERF_FLAG_FD_CLOEXEC)) {
+ } else if (!perf_missing_features.cloexec && (evsel->open_flags & PERF_FLAG_FD_CLOEXEC)) {
perf_missing_features.cloexec = true;
pr_debug2_peo("switching off cloexec flag\n");
goto fallback_missing_features;
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index eabccce406886320..1c0057e80d080f2f 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -149,6 +149,7 @@ struct evsel {
struct bperf_leader_bpf *leader_skel;
struct bperf_follower_bpf *follower_skel;
};
+ unsigned long open_flags;
};

struct perf_missing_features {
--
2.31.1

2021-08-21 09:22:36

by Riccardo Mancini

Subject: [RFC PATCH v1 08/37] perf workqueue: add method to execute work on specific CPU

This patch adds the possibility to schedule a work item on a specific
CPU. There are two cases (see the sketch after this list):
- a thread is pinned to the requested CPU using the new functions
workqueue_set_affinity_cpu and workqueue_set_affinities_cpu;
- no thread is pinned to the requested CPU. In this case, affinity is
set before (and cleared after) executing the work.
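
A usage sketch of the two cases (the work items, worker index, and cpu
numbers are illustrative):

	/* pin worker 0 to cpu 2: work queued on cpu 2 is handled by
	 * that worker without further affinity changes */
	err = workqueue_set_affinity_cpu(wq, 0, 2);

	err = queue_work_on(2, wq, &work_a); /* runs on the pinned worker */
	err = queue_work_on(5, wq, &work_b); /* no worker pinned to cpu 5:
						affinity is set around the
						work and cleared afterwards */
	err = flush_workqueue(wq);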

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/util/workqueue/workqueue.c | 133 +++++++++++++++++++++++++-
tools/perf/util/workqueue/workqueue.h | 12 +++
2 files changed, 144 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/workqueue/workqueue.c b/tools/perf/util/workqueue/workqueue.c
index 61f1b6c41deba031..650170a6a11f56bd 100644
--- a/tools/perf/util/workqueue/workqueue.c
+++ b/tools/perf/util/workqueue/workqueue.c
@@ -10,9 +10,12 @@
#include <linux/string.h>
#include <linux/zalloc.h>
#include <linux/kernel.h>
+#include <linux/bitmap.h>
#include "debug.h"
#include <internal/lib.h>
#include "workqueue.h"
+#include <perf/cpumap.h>
+#include "util/affinity.h"

struct workqueue_struct *global_wq;

@@ -43,6 +46,10 @@ struct workqueue_struct {
struct worker **workers; /* array of all workers */
struct worker *next_worker; /* next worker to choose (round robin) */
int first_stopped_worker; /* next worker to start if needed */
+ struct {
+ int *map; /* maps cpu to thread idx */
+ int size; /* size of the map array */
+ } cpu_to_tidx_map;
};

static const char * const workqueue_errno_str[] = {
@@ -429,6 +436,7 @@ static void worker_thread(int tidx, struct task_struct *task)
struct workqueue_struct *create_workqueue(int nr_threads)
{
int ret, err = 0;
+ int nr_cpus = sysconf(_SC_NPROCESSORS_ONLN);
struct workqueue_struct *wq = zalloc(sizeof(struct workqueue_struct));

if (!wq) {
@@ -449,10 +457,18 @@ struct workqueue_struct *create_workqueue(int nr_threads)
goto out_delete_pool;
}

+ wq->cpu_to_tidx_map.size = nr_cpus;
+ wq->cpu_to_tidx_map.map = calloc(nr_cpus, sizeof(*wq->cpu_to_tidx_map.map));
+ if (!wq->cpu_to_tidx_map.map) {
+ err = -ENOMEM;
+ goto out_free_workers;
+ }
+ memset(wq->cpu_to_tidx_map.map, -1, nr_cpus * sizeof(*wq->cpu_to_tidx_map.map));
+
ret = pthread_mutex_init(&wq->lock, NULL);
if (ret) {
err = -ret;
- goto out_free_workers;
+ goto out_free_cpu_to_idx_map;
}

ret = pthread_cond_init(&wq->idle_cond, NULL);
@@ -494,6 +510,8 @@ struct workqueue_struct *create_workqueue(int nr_threads)
pthread_mutex_destroy(&wq->lock);
out_free_workers:
free(wq->workers);
+out_free_cpu_to_idx_map:
+ free(wq->cpu_to_tidx_map.map);
out_delete_pool:
threadpool__delete(wq->pool);
out_free_wq:
@@ -552,6 +570,7 @@ int destroy_workqueue(struct workqueue_struct *wq)
wq->msg_pipe[1] = -1;

zfree(&wq->workers);
+ zfree(&wq->cpu_to_tidx_map.map);
free(wq);
return err;
}
@@ -779,6 +798,118 @@ int workqueue_set_affinity(struct workqueue_struct *wq, int tidx,
return wq->pool_errno ? -WORKQUEUE_ERROR__POOLAFFINITY : 0;
}

+/**
+ * workqueue_set_affinity_cpu - set affinity to @cpu to thread @tidx in @wq->pool
+ *
+ * If cpu is -1, then affinity is set to all online processors.
+ */
+int workqueue_set_affinity_cpu(struct workqueue_struct *wq, int tidx, int cpu)
+{
+ struct mmap_cpu_mask affinity;
+ int i, err;
+
+ if (cpu >= 0)
+ affinity.nbits = cpu+1;
+ else
+ affinity.nbits = wq->cpu_to_tidx_map.size;
+
+ affinity.bits = bitmap_alloc(affinity.nbits);
+ if (!affinity.bits) {
+ pr_debug2("Failed allocation of bitmapset\n");
+ return -ENOMEM;
+ }
+
+ if (cpu >= 0)
+ test_and_set_bit(cpu, affinity.bits);
+ else
+ bitmap_fill(affinity.bits, affinity.nbits);
+
+ err = workqueue_set_affinity(wq, tidx, &affinity);
+ if (err)
+ goto out;
+
+ // find and unset this thread from the map
+ for (i = 0; i < wq->cpu_to_tidx_map.size; i++) {
+ if (wq->cpu_to_tidx_map.map[i] == tidx)
+ wq->cpu_to_tidx_map.map[i] = -1;
+ }
+
+ if (cpu >= 0)
+ wq->cpu_to_tidx_map.map[cpu] = tidx;
+
+out:
+ bitmap_free(affinity.bits);
+ return err;
+}
+
+/**
+ * workqueue_set_affinities_cpu - set single-cpu affinities to all threads in @wq->pool
+ */
+int workqueue_set_affinities_cpu(struct workqueue_struct *wq,
+ struct perf_cpu_map *cpus)
+{
+ int cpu, idx, err;
+
+ if (perf_cpu_map__nr(cpus) > threadpool__size(wq->pool))
+ return -EINVAL;
+
+
+ perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
+ err = workqueue_set_affinity_cpu(wq, idx, cpu);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
+struct cpu_bound_work {
+ struct work_struct work;
+ int cpu;
+ struct work_struct *original_work;
+};
+
+static void set_affinity_and_execute(struct work_struct *work)
+{
+ struct cpu_bound_work *cpu_bound_work = container_of(work, struct cpu_bound_work, work);
+ struct affinity affinity;
+
+ if (affinity__setup(&affinity) < 0)
+ goto out;
+
+ affinity__set(&affinity, cpu_bound_work->cpu);
+ cpu_bound_work->original_work->func(cpu_bound_work->original_work);
+ affinity__cleanup(&affinity);
+
+out:
+ free(cpu_bound_work);
+}
+
+/**
+ * queue_work_on - execute @work on @cpu
+ *
+ * The work is assigned to the worker pinned to @cpu, if any.
+ * Otherwise, affinity is set before running the work and unset after.
+ */
+int queue_work_on(int cpu, struct workqueue_struct *wq, struct work_struct *work)
+{
+ struct cpu_bound_work *cpu_bound_work;
+ int tidx = wq->cpu_to_tidx_map.map[cpu];
+
+ if (tidx >= 0)
+ return queue_work_on_worker(tidx, wq, work);
+
+ cpu_bound_work = malloc(sizeof(*cpu_bound_work));
+ if (!cpu_bound_work)
+ return -ENOMEM;
+
+ init_work(&cpu_bound_work->work);
+ cpu_bound_work->work.func = set_affinity_and_execute;
+ cpu_bound_work->cpu = cpu;
+ cpu_bound_work->original_work = work;
+ return queue_work(wq, &cpu_bound_work->work);
+}
+
/**
* init_work - initialize the @work struct
*/
diff --git a/tools/perf/util/workqueue/workqueue.h b/tools/perf/util/workqueue/workqueue.h
index dc6baee138b22ab2..a91a37e367b62d02 100644
--- a/tools/perf/util/workqueue/workqueue.h
+++ b/tools/perf/util/workqueue/workqueue.h
@@ -25,6 +25,7 @@ extern int workqueue_nr_threads(struct workqueue_struct *wq);

extern int queue_work(struct workqueue_struct *wq, struct work_struct *work);
extern int queue_work_on_worker(int tidx, struct workqueue_struct *wq, struct work_struct *work);
+extern int queue_work_on(int cpu, struct workqueue_struct *wq, struct work_struct *work);

extern int flush_workqueue(struct workqueue_struct *wq);

@@ -32,6 +33,9 @@ extern int workqueue_set_affinities(struct workqueue_struct *wq,
struct mmap_cpu_mask *affinities);
extern int workqueue_set_affinity(struct workqueue_struct *wq, int tidx,
struct mmap_cpu_mask *affinity);
+extern int workqueue_set_affinity_cpu(struct workqueue_struct *wq, int tidx, int cpu);
+extern int workqueue_set_affinities_cpu(struct workqueue_struct *wq,
+ struct perf_cpu_map *cpus);

extern void init_work(struct work_struct *work);

@@ -82,6 +86,14 @@ static inline int schedule_work_on_worker(int tidx, struct work_struct *work)
return queue_work_on_worker(tidx, global_wq, work);
}

+/**
+ * schedule_work_on - queue @work to be executed on @cpu by global_wq
+ */
+static inline int schedule_work_on(int cpu, struct work_struct *work)
+{
+ return queue_work_on(cpu, global_wq, work);
+}
+
/**
* flush_scheduled_work - ensure that any scheduled work in global_wq has run to completion
*/
--
2.31.1

2021-08-21 09:22:38

by Riccardo Mancini

Subject: [RFC PATCH v1 09/37] perf python: add workqueue dependency

This patch adds util/workqueue/*.c to python-ext-sources, which is
needed for the following patch, where workqueue will be used in
util/evlist.c.

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/util/python-ext-sources | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/python-ext-sources b/tools/perf/util/python-ext-sources
index d7c976671e3a131a..7453f65c6f7a4396 100644
--- a/tools/perf/util/python-ext-sources
+++ b/tools/perf/util/python-ext-sources
@@ -39,3 +39,5 @@ util/rwsem.c
util/hashmap.c
util/pmu-hybrid.c
util/fncache.c
+util/workqueue/threadpool.c
+util/workqueue/workqueue.c
--
2.31.1

2021-08-21 09:22:43

by Riccardo Mancini

Subject: [RFC PATCH v1 10/37] perf evlist: add multithreading helper

This patch adds the function evlist__for_each_evsel_cpu, which executes
the given function on each evsel, for each cpu.

If perf_singlethreaded is unset, this function will use a workqueue to
execute the function.

This helper function will be used in the following patches.
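
A minimal usage sketch (the callback below is illustrative; it mirrors
what the following patches do for enable):

	static int enable_one(struct evlist *evlist __maybe_unused,
			      struct evsel *evsel, int cpu_idx,
			      void *args __maybe_unused)
	{
		evsel__enable_cpu(evsel, cpu_idx);
		return 0;
	}

	/* runs enable_one on every evsel/cpu pair, in parallel through
	 * the global workqueue unless perf_singlethreaded is set */
	int err = evlist__for_each_evsel_cpu(evlist, enable_one, NULL);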

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/util/evlist.c | 117 +++++++++++++++++++++++++++++++++++++++
tools/perf/util/evlist.h | 14 +++++
2 files changed, 131 insertions(+)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 3d55d9a53b9f4875..f9fdbd85a163ee97 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -27,6 +27,8 @@
#include "util/perf_api_probe.h"
#include "util/evsel_fprintf.h"
#include "util/evlist-hybrid.h"
+#include "util/util.h"
+#include "util/workqueue/workqueue.h"
#include <signal.h>
#include <unistd.h>
#include <sched.h>
@@ -41,6 +43,7 @@
#include <sys/prctl.h>

#include <linux/bitops.h>
+#include <linux/kernel.h>
#include <linux/hash.h>
#include <linux/log2.h>
#include <linux/err.h>
@@ -369,6 +372,120 @@ static int evlist__is_enabled(struct evlist *evlist)
return false;
}

+struct evlist_work {
+ struct work_struct work;
+ struct evlist *evlist;
+ int cpu;
+ evsel__cpu_func func;
+ void *args;
+ int ret;
+};
+
+static void evlist__for_each_evsel_cpu_thread_func(struct work_struct *_work)
+{
+ struct evlist_work *work = container_of(_work, struct evlist_work, work);
+ int cpu_idx, ret;
+ struct evsel *pos;
+
+ work->ret = 0;
+ evlist__for_each_entry(work->evlist, pos) {
+ cpu_idx = evsel__find_cpu(pos, work->cpu);
+ if (cpu_idx < 0)
+ continue;
+ ret = work->func(work->evlist, pos, cpu_idx, work->args);
+ if (ret) {
+ work->ret = ret;
+ if (ret < 0) // error
+ return;
+ }
+ }
+}
+
+static int evlist__for_each_evsel_cpu_multithreaded(struct evlist *evlist,
+ evsel__cpu_func func, void *args)
+{
+ int i, cpu, ret;
+ struct evlist_work *works;
+ char errbuf[WORKQUEUE_STRERR_BUFSIZE];
+
+ works = calloc(perf_cpu_map__nr(evlist->core.all_cpus), sizeof(*works));
+ perf_cpu_map__for_each_cpu(cpu, i, evlist->core.all_cpus) {
+ init_work(&works[i].work);
+ works[i].evlist = evlist;
+ works[i].work.func = evlist__for_each_evsel_cpu_thread_func;
+ works[i].cpu = cpu;
+ works[i].func = func;
+ works[i].args = args;
+ works[i].ret = 0;
+
+ ret = schedule_work_on(cpu, &works[i].work);
+ if (ret) {
+ workqueue_strerror(global_wq, ret, errbuf, sizeof(errbuf));
+ pr_debug("schedule_work: %s\n", errbuf);
+ break;
+ }
+ }
+
+ ret = flush_scheduled_work();
+ if (ret) {
+ workqueue_strerror(global_wq, ret, errbuf, sizeof(errbuf));
+ pr_debug("flush_scheduled_work: %s\n", errbuf);
+ goto out;
+ }
+
+ perf_cpu_map__for_each_cpu(cpu, i, evlist->core.all_cpus) {
+ if (works[i].ret) {
+ ret = works[i].ret;
+ if (works[i].ret < 0) // error
+ goto out;
+ }
+ }
+out:
+ free(works);
+
+ return ret;
+}
+
+static int evlist__for_each_evsel_cpu_singlethreaded(struct evlist *evlist,
+ evsel__cpu_func func, void *args)
+{
+ int ret, err = 0, i, cpu, cpu_idx;
+ struct affinity affinity;
+ struct evsel *pos;
+
+ if (affinity__setup(&affinity) < 0)
+ return -1;
+
+ evlist__for_each_cpu(evlist, i, cpu) {
+ affinity__set(&affinity, cpu);
+
+ evlist__for_each_entry(evlist, pos) {
+ cpu_idx = evsel__find_cpu(pos, cpu);
+ if (cpu_idx < 0)
+ continue;
+ ret = func(evlist, pos, cpu_idx, args);
+ if (ret) {
+ err = ret;
+ if (err < 0) // error
+ goto out;
+ }
+ }
+ }
+
+out:
+ affinity__cleanup(&affinity);
+ return err;
+}
+
+int evlist__for_each_evsel_cpu(struct evlist *evlist, evsel__cpu_func func, void *args)
+{
+ if (perf_singlethreaded)
+ return evlist__for_each_evsel_cpu_singlethreaded(evlist, func, args);
+ else
+ return evlist__for_each_evsel_cpu_multithreaded(evlist, func, args);
+
+}
+
static void __evlist__disable(struct evlist *evlist, char *evsel_name)
{
struct evsel *pos;
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index fde893170c7ba6d2..5f24a45d4e3cf30a 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -363,4 +363,18 @@ struct evsel *evlist__find_evsel(struct evlist *evlist, int idx);

int evlist__scnprintf_evsels(struct evlist *evlist, size_t size, char *bf);
void evlist__check_mem_load_aux(struct evlist *evlist);
+
+/**
+ * evsel__cpu_func - function to run on each evsel for each cpu
+ * @evlist: the parent evlist
+ * @evsel: the processed evsel
+ * @cpu: index of the cpu in evsel->core.cpus
+ * @args: additional custom arguments
+ *
+ * Returns:
+ * A negative value is considered as an error.
+ * A positive value will be propagated to evlist__for_each_evsel_cpu.
+ */
+typedef int (*evsel__cpu_func)(struct evlist *evlist, struct evsel *evsel, int cpu, void *args);
+int evlist__for_each_evsel_cpu(struct evlist *evlist, evsel__cpu_func func, void *args);
#endif /* __PERF_EVLIST_H */
--
2.31.1

2021-08-21 09:22:57

by Riccardo Mancini

Subject: [RFC PATCH v1 19/37] perf evsel: separate missing feature detection from evsel__open_cpu

This is a preparatory patch for the following patches, with the goal of
separating, in evsel__open_cpu, the actual opening, which could be
performed in parallel, from the existing fallback mechanisms, which
should be handled sequentially.

This patch separates the missing feature detection in evsel__open_cpu
into a new evsel__detect_missing_features function. It returns true if
a missing feature was just detected and the open should be retried (the
old fallback_missing_features path), and false otherwise (the old
out_close path).

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/util/evsel.c | 174 +++++++++++++++++++++-------------------
tools/perf/util/evsel.h | 1 +
2 files changed, 92 insertions(+), 83 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 4e9a3e62075305f1..c393bd992322d925 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1841,6 +1841,96 @@ int evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
return err;
}

+bool evsel__detect_missing_features(struct evsel *evsel)
+{
+ /*
+ * Must probe features in the order they were added to the
+ * perf_event_attr interface.
+ */
+ if (!perf_missing_features.weight_struct &&
+ (evsel->core.attr.sample_type & PERF_SAMPLE_WEIGHT_STRUCT)) {
+ perf_missing_features.weight_struct = true;
+ pr_debug2("switching off weight struct support\n");
+ return true;
+ } else if (!perf_missing_features.code_page_size &&
+ (evsel->core.attr.sample_type & PERF_SAMPLE_CODE_PAGE_SIZE)) {
+ perf_missing_features.code_page_size = true;
+ pr_debug2_peo("Kernel has no PERF_SAMPLE_CODE_PAGE_SIZE support, bailing out\n");
+ return false;
+ } else if (!perf_missing_features.data_page_size &&
+ (evsel->core.attr.sample_type & PERF_SAMPLE_DATA_PAGE_SIZE)) {
+ perf_missing_features.data_page_size = true;
+ pr_debug2_peo("Kernel has no PERF_SAMPLE_DATA_PAGE_SIZE support, bailing out\n");
+ return false;
+ } else if (!perf_missing_features.cgroup && evsel->core.attr.cgroup) {
+ perf_missing_features.cgroup = true;
+ pr_debug2_peo("Kernel has no cgroup sampling support, bailing out\n");
+ return false;
+ } else if (!perf_missing_features.branch_hw_idx &&
+ (evsel->core.attr.branch_sample_type & PERF_SAMPLE_BRANCH_HW_INDEX)) {
+ perf_missing_features.branch_hw_idx = true;
+ pr_debug2("switching off branch HW index support\n");
+ return true;
+ } else if (!perf_missing_features.aux_output && evsel->core.attr.aux_output) {
+ perf_missing_features.aux_output = true;
+ pr_debug2_peo("Kernel has no attr.aux_output support, bailing out\n");
+ return false;
+ } else if (!perf_missing_features.bpf && evsel->core.attr.bpf_event) {
+ perf_missing_features.bpf = true;
+ pr_debug2_peo("switching off bpf_event\n");
+ return true;
+ } else if (!perf_missing_features.ksymbol && evsel->core.attr.ksymbol) {
+ perf_missing_features.ksymbol = true;
+ pr_debug2_peo("switching off ksymbol\n");
+ return true;
+ } else if (!perf_missing_features.write_backward && evsel->core.attr.write_backward) {
+ perf_missing_features.write_backward = true;
+ pr_debug2_peo("switching off write_backward\n");
+ return false;
+ } else if (!perf_missing_features.clockid_wrong && evsel->core.attr.use_clockid) {
+ perf_missing_features.clockid_wrong = true;
+ pr_debug2_peo("switching off clockid\n");
+ return true;
+ } else if (!perf_missing_features.clockid && evsel->core.attr.use_clockid) {
+ perf_missing_features.clockid = true;
+ pr_debug2_peo("switching off use_clockid\n");
+ return true;
+ } else if (!perf_missing_features.cloexec && (evsel->open_flags & PERF_FLAG_FD_CLOEXEC)) {
+ perf_missing_features.cloexec = true;
+ pr_debug2_peo("switching off cloexec flag\n");
+ return true;
+ } else if (!perf_missing_features.mmap2 && evsel->core.attr.mmap2) {
+ perf_missing_features.mmap2 = true;
+ pr_debug2_peo("switching off mmap2\n");
+ return true;
+ } else if (!perf_missing_features.exclude_guest &&
+ (evsel->core.attr.exclude_guest || evsel->core.attr.exclude_host)) {
+ perf_missing_features.exclude_guest = true;
+ pr_debug2_peo("switching off exclude_guest, exclude_host\n");
+ return true;
+ } else if (!perf_missing_features.sample_id_all) {
+ perf_missing_features.sample_id_all = true;
+ pr_debug2_peo("switching off sample_id_all\n");
+ return true;
+ } else if (!perf_missing_features.lbr_flags &&
+ (evsel->core.attr.branch_sample_type &
+ (PERF_SAMPLE_BRANCH_NO_CYCLES |
+ PERF_SAMPLE_BRANCH_NO_FLAGS))) {
+ perf_missing_features.lbr_flags = true;
+ pr_debug2_peo("switching off branch sample type no (cycles/flags)\n");
+ return true;
+ } else if (!perf_missing_features.group_read &&
+ evsel->core.attr.inherit &&
+ (evsel->core.attr.read_format & PERF_FORMAT_GROUP) &&
+ evsel__is_group_leader(evsel)) {
+ perf_missing_features.group_read = true;
+ pr_debug2_peo("switching off group read\n");
+ return true;
+ } else {
+ return false;
+ }
+}
+
static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
struct perf_thread_map *threads,
int start_cpu, int end_cpu)
@@ -1979,90 +2069,8 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
if (err != -EINVAL || cpu > 0 || thread > 0)
goto out_close;

- /*
- * Must probe features in the order they were added to the
- * perf_event_attr interface.
- */
- if (!perf_missing_features.weight_struct &&
- (evsel->core.attr.sample_type & PERF_SAMPLE_WEIGHT_STRUCT)) {
- perf_missing_features.weight_struct = true;
- pr_debug2("switching off weight struct support\n");
+ if (evsel__detect_missing_features(evsel))
goto fallback_missing_features;
- } else if (!perf_missing_features.code_page_size &&
- (evsel->core.attr.sample_type & PERF_SAMPLE_CODE_PAGE_SIZE)) {
- perf_missing_features.code_page_size = true;
- pr_debug2_peo("Kernel has no PERF_SAMPLE_CODE_PAGE_SIZE support, bailing out\n");
- goto out_close;
- } else if (!perf_missing_features.data_page_size &&
- (evsel->core.attr.sample_type & PERF_SAMPLE_DATA_PAGE_SIZE)) {
- perf_missing_features.data_page_size = true;
- pr_debug2_peo("Kernel has no PERF_SAMPLE_DATA_PAGE_SIZE support, bailing out\n");
- goto out_close;
- } else if (!perf_missing_features.cgroup && evsel->core.attr.cgroup) {
- perf_missing_features.cgroup = true;
- pr_debug2_peo("Kernel has no cgroup sampling support, bailing out\n");
- goto out_close;
- } else if (!perf_missing_features.branch_hw_idx &&
- (evsel->core.attr.branch_sample_type & PERF_SAMPLE_BRANCH_HW_INDEX)) {
- perf_missing_features.branch_hw_idx = true;
- pr_debug2("switching off branch HW index support\n");
- goto fallback_missing_features;
- } else if (!perf_missing_features.aux_output && evsel->core.attr.aux_output) {
- perf_missing_features.aux_output = true;
- pr_debug2_peo("Kernel has no attr.aux_output support, bailing out\n");
- goto out_close;
- } else if (!perf_missing_features.bpf && evsel->core.attr.bpf_event) {
- perf_missing_features.bpf = true;
- pr_debug2_peo("switching off bpf_event\n");
- goto fallback_missing_features;
- } else if (!perf_missing_features.ksymbol && evsel->core.attr.ksymbol) {
- perf_missing_features.ksymbol = true;
- pr_debug2_peo("switching off ksymbol\n");
- goto fallback_missing_features;
- } else if (!perf_missing_features.write_backward && evsel->core.attr.write_backward) {
- perf_missing_features.write_backward = true;
- pr_debug2_peo("switching off write_backward\n");
- goto out_close;
- } else if (!perf_missing_features.clockid_wrong && evsel->core.attr.use_clockid) {
- perf_missing_features.clockid_wrong = true;
- pr_debug2_peo("switching off clockid\n");
- goto fallback_missing_features;
- } else if (!perf_missing_features.clockid && evsel->core.attr.use_clockid) {
- perf_missing_features.clockid = true;
- pr_debug2_peo("switching off use_clockid\n");
- goto fallback_missing_features;
- } else if (!perf_missing_features.cloexec && (evsel->open_flags & PERF_FLAG_FD_CLOEXEC)) {
- perf_missing_features.cloexec = true;
- pr_debug2_peo("switching off cloexec flag\n");
- goto fallback_missing_features;
- } else if (!perf_missing_features.mmap2 && evsel->core.attr.mmap2) {
- perf_missing_features.mmap2 = true;
- pr_debug2_peo("switching off mmap2\n");
- goto fallback_missing_features;
- } else if (!perf_missing_features.exclude_guest &&
- (evsel->core.attr.exclude_guest || evsel->core.attr.exclude_host)) {
- perf_missing_features.exclude_guest = true;
- pr_debug2_peo("switching off exclude_guest, exclude_host\n");
- goto fallback_missing_features;
- } else if (!perf_missing_features.sample_id_all) {
- perf_missing_features.sample_id_all = true;
- pr_debug2_peo("switching off sample_id_all\n");
- goto fallback_missing_features;
- } else if (!perf_missing_features.lbr_flags &&
- (evsel->core.attr.branch_sample_type &
- (PERF_SAMPLE_BRANCH_NO_CYCLES |
- PERF_SAMPLE_BRANCH_NO_FLAGS))) {
- perf_missing_features.lbr_flags = true;
- pr_debug2_peo("switching off branch sample type no (cycles/flags)\n");
- goto fallback_missing_features;
- } else if (!perf_missing_features.group_read &&
- evsel->core.attr.inherit &&
- (evsel->core.attr.read_format & PERF_FORMAT_GROUP) &&
- evsel__is_group_leader(evsel)) {
- perf_missing_features.group_read = true;
- pr_debug2_peo("switching off group read\n");
- goto fallback_missing_features;
- }
out_close:
if (err)
threads->err_thread = thread;
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 58aa998e1814ac9e..a83fb7f69b1ead73 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -288,6 +288,7 @@ int evsel__open(struct evsel *evsel, struct perf_cpu_map *cpus,
void evsel__close(struct evsel *evsel);
int evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
struct perf_thread_map *threads);
+bool evsel__detect_missing_features(struct evsel *evsel);

struct perf_sample;

--
2.31.1

2021-08-21 09:22:58

by Riccardo Mancini

Subject: [RFC PATCH v1 22/37] perf evsel: move test_attr__open to success path in evsel__open_cpu

test_attr__open ignores the fd if it is -1, therefore it is safe to move
the call to the success path (fd >= 0).

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/util/evsel.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index a1a88607fd59efcb..d2b0391569286080 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2011,11 +2011,6 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,

bpf_counter__install_pe(evsel, cpu, fd);

- if (unlikely(test_attr__enabled)) {
- test_attr__open(&evsel->core.attr, pid, cpus->map[cpu],
- fd, group_fd, evsel->open_flags);
- }
-
if (fd < 0) {
err = -errno;

@@ -2024,6 +2019,11 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
goto try_fallback;
}

+ if (unlikely(test_attr__enabled)) {
+ test_attr__open(&evsel->core.attr, pid, cpus->map[cpu],
+ fd, group_fd, evsel->open_flags);
+ }
+
pr_debug2_peo(" = %d\n", fd);

if (evsel->bpf_fd >= 0) {
--
2.31.1

2021-08-21 09:23:04

by Riccardo Mancini

Subject: [RFC PATCH v1 23/37] perf evsel: move bpf_counter__install_pe to success path in evsel__open_cpu

I don't see why bpf_counter__install_pe should get called even if fd=-1,
so I'm moving it to the success path.

This will be useful in following patches to separate the actual open and
the related operations from the fallback mechanisms.

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/util/evsel.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index d2b0391569286080..3e556afed8dd396c 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2009,8 +2009,6 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,

FD(evsel, cpu, thread) = fd;

- bpf_counter__install_pe(evsel, cpu, fd);
-
if (fd < 0) {
err = -errno;

@@ -2019,6 +2017,8 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
goto try_fallback;
}

+ bpf_counter__install_pe(evsel, cpu, fd);
+
if (unlikely(test_attr__enabled)) {
test_attr__open(&evsel->core.attr, pid, cpus->map[cpu],
fd, group_fd, evsel->open_flags);
--
2.31.1

2021-08-21 09:23:11

by Riccardo Mancini

Subject: [RFC PATCH v1 27/37] perf evlist: add evlist__for_each_entry_from macro

This patch adds a new iteration macro for evlist that resumes iteration
from a given evsel in the evlist.

This macro will be used in the next patch.
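
For illustration (the saved evsel is hypothetical):

	struct evsel *evsel = resume_point;

	evlist__for_each_entry_from(evlist, evsel) {
		/* visits resume_point itself, then all following evsels */
	}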

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/util/evlist.h | 16 ++++++++++++++++
1 file changed, 16 insertions(+)

diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 5f24a45d4e3cf30a..b0c2da0f9755b2d1 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -275,6 +275,22 @@ void evlist__to_front(struct evlist *evlist, struct evsel *move_evsel);
#define evlist__for_each_entry_continue(evlist, evsel) \
__evlist__for_each_entry_continue(&(evlist)->core.entries, evsel)

+/**
+ * __evlist__for_each_entry_from - continue iteration from @evsel (included)
+ * @list: list_head instance to iterate
+ * @evsel: struct evsel iterator
+ */
+#define __evlist__for_each_entry_from(list, evsel) \
+ list_for_each_entry_from(evsel, list, core.node)
+
+/**
+ * evlist__for_each_entry_from - continue iteration from @evsel (included)
+ * @evlist: evlist instance to iterate
+ * @evsel: struct evsel iterator
+ */
+#define evlist__for_each_entry_from(evlist, evsel) \
+ __evlist__for_each_entry_from(&(evlist)->core.entries, evsel)
+
/**
* __evlist__for_each_entry_reverse - iterate thru all the evsels in reverse order
* @list: list_head instance to iterate
--
2.31.1

2021-08-21 09:23:19

by Riccardo Mancini

Subject: [RFC PATCH v1 18/37] perf evsel: add evsel__prepare_open function

This function will prepare the evsel and disable the missing features.
It will be used in one of the following patches.
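
A minimal usage sketch, mirroring the preparation steps that
evsel__open_cpu performs before the actual open:

	err = evsel__prepare_open(evsel, evsel->core.cpus,
				  evsel->core.threads);
	if (err)
		return err;
	/* fds are allocated and known-missing features are disabled;
	 * the actual perf_event_open can now happen, possibly in
	 * parallel */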

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/util/evsel.c | 14 ++++++++++++++
tools/perf/util/evsel.h | 2 ++
2 files changed, 16 insertions(+)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index f0bc89f743915800..4e9a3e62075305f1 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1827,6 +1827,20 @@ static void evsel__disable_missing_features(struct evsel *evsel)
evsel->core.attr.sample_id_all = 0;
}

+int evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
+ struct perf_thread_map *threads)
+{
+ int err;
+
+ err = __evsel__prepare_open(evsel, cpus, threads);
+ if (err)
+ return err;
+
+ evsel__disable_missing_features(evsel);
+
+ return err;
+}
+
static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
struct perf_thread_map *threads,
int start_cpu, int end_cpu)
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 1c0057e80d080f2f..58aa998e1814ac9e 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -286,6 +286,8 @@ int evsel__open_per_thread(struct evsel *evsel, struct perf_thread_map *threads)
int evsel__open(struct evsel *evsel, struct perf_cpu_map *cpus,
struct perf_thread_map *threads);
void evsel__close(struct evsel *evsel);
+int evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
+ struct perf_thread_map *threads);

struct perf_sample;

--
2.31.1

2021-08-21 09:23:21

by Riccardo Mancini

Subject: [RFC PATCH v1 30/37] perf record: use evlist__open_custom

This patch replaces the custom evlist-opening loop implemented in
record__open with the new standard function evlist__open_custom.
Differently from before this patch, in case of a weak group all fds are
closed and then reopened (instead of just closing the failing ones).

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/builtin-record.c | 63 ++++++++++++++++++++++++-------------
1 file changed, 42 insertions(+), 21 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index cccc2d0f9977d5b3..dc9a814b2e7906fc 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -882,6 +882,37 @@ static int record__mmap(struct record *rec)
return record__mmap_evlist(rec, rec->evlist);
}

+struct record_open_custom_fallback {
+ struct evlist_open_custom_fallback fallback;
+ struct record_opts *opts;
+ bool retry;
+};
+
+static bool record__open_fallback(struct evlist_open_custom_fallback *_fallback,
+ struct evlist *evlist, struct evsel *evsel, int err)
+{
+ char msg[BUFSIZ];
+ struct record_open_custom_fallback *fallback = container_of(_fallback,
+ struct record_open_custom_fallback, fallback);
+
+ if (evsel__fallback(evsel, -err, msg, sizeof(msg))) {
+ if (verbose > 0)
+ ui__warning("%s\n", msg);
+ return true;
+ }
+ if ((err == -EINVAL || err == -EBADF) &&
+ evsel->core.leader != &evsel->core &&
+ evsel->weak_group) {
+ evlist__reset_weak_group(evlist, evsel, true);
+ fallback->retry = true;
+ return false;
+ }
+
+ evsel__open_strerror(evsel, &fallback->opts->target, -err, msg, sizeof(msg));
+ ui__error("%s\n", msg);
+ return false;
+}
+
static int record__open(struct record *rec)
{
char msg[BUFSIZ];
@@ -890,6 +921,12 @@ static int record__open(struct record *rec)
struct perf_session *session = rec->session;
struct record_opts *opts = &rec->opts;
int rc = 0;
+ struct record_open_custom_fallback cust_fb = {
+ .fallback = {
+ .func = record__open_fallback
+ },
+ .opts = opts
+ };

/*
* For initial_delay, system wide or a hybrid system, we need to add a
@@ -919,28 +956,12 @@ static int record__open(struct record *rec)

evlist__config(evlist, opts, &callchain_param);

- evlist__for_each_entry(evlist, pos) {
-try_again:
- if (evsel__open(pos, pos->core.cpus, pos->core.threads) < 0) {
- if (evsel__fallback(pos, errno, msg, sizeof(msg))) {
- if (verbose > 0)
- ui__warning("%s\n", msg);
- goto try_again;
- }
- if ((errno == EINVAL || errno == EBADF) &&
- pos->core.leader != &pos->core &&
- pos->weak_group) {
- pos = evlist__reset_weak_group(evlist, pos, true);
- goto try_again;
- }
- rc = -errno;
- evsel__open_strerror(pos, &opts->target, errno, msg, sizeof(msg));
- ui__error("%s\n", msg);
- goto out;
- }

- pos->supported = true;
- }
+ do {
+ cust_fb.retry = false;
+ rc = evlist__open_custom(evlist, &cust_fb.fallback);
+ } while (cust_fb.retry);
+

if (symbol_conf.kptr_restrict && !evlist__exclude_kernel(evlist)) {
pr_warning(
--
2.31.1

2021-08-21 09:23:31

by Riccardo Mancini

[permalink] [raw]
Subject: [RFC PATCH v1 29/37] perf evlist: add custom fallback to evlist__open

This patch adds the function evlist__open_custom, which makes it
possible to provide a custom fallback mechanism for evsel__open
failures inside evlist__open.
This function will be used in record__open in the following patch.
It could also be used to adapt perf-stat's way of opening events to the
new multithreaded mechanism.
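
As an illustration, a user embeds struct evlist_open_custom_fallback in
its own context struct and recovers it with container_of, as
record__open does in the next patch. A hypothetical perf-stat-style
fallback could look like this (the struct name and the retry policy are
made up for the example):

	struct stat_open_fallback {
		struct evlist_open_custom_fallback fallback;
		int retries;
	};

	static bool stat__open_fallback(struct evlist_open_custom_fallback *_fb,
					struct evlist *evlist __maybe_unused,
					struct evsel *evsel, int err)
	{
		struct stat_open_fallback *fb =
			container_of(_fb, struct stat_open_fallback, fallback);

		/* returning true asks evlist__open_custom to retry the open */
		if (err == -EACCES && fb->retries++ < 1) {
			evsel->core.attr.exclude_kernel = 1;
			return true;
		}
		return false;
	}

The caller would then invoke evlist__open_custom(evlist, &fb.fallback).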

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/util/evlist.c | 27 +++++++++++++++++++++------
tools/perf/util/evlist.h | 9 +++++++++
2 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 3472038d719ec7d4..22b9607dcc637a2d 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1405,15 +1405,20 @@ static int evlist__create_syswide_maps(struct evlist *evlist)
return err;
}

-static int evlist__open_singlethreaded(struct evlist *evlist)
+static int evlist__open_singlethreaded(struct evlist *evlist,
+ struct evlist_open_custom_fallback *cust_fb)
{
struct evsel *evsel;
int err;

evlist__for_each_entry(evlist, evsel) {
+try_again:
err = evsel__open(evsel, evsel->core.cpus, evsel->core.threads);
- if (err < 0)
+ if (err < 0) {
+ if (cust_fb && cust_fb->func(cust_fb, evlist, evsel, err))
+ goto try_again;
return err;
+ }
}

return 0;
@@ -1469,7 +1474,8 @@ static void evlist__open_multithreaded_func(struct work_struct *_work)
}
}

-static int evlist__open_multithreaded(struct evlist *evlist)
+static int evlist__open_multithreaded(struct evlist *evlist,
+ struct evlist_open_custom_fallback *cust_fb)
{
int cpu, cpuid, cpuidx, thread, err;
struct evlist_open_work *works;
@@ -1567,6 +1573,9 @@ static int evlist__open_multithreaded(struct evlist *evlist)
if (evsel__detect_missing_features(evsel))
goto reprepare;

+ if (cust_fb && cust_fb->func(cust_fb, evlist, evsel, err))
+ goto reprepare;
+
// no fallback worked, return the error
goto out;
}
@@ -1579,7 +1588,8 @@ static int evlist__open_multithreaded(struct evlist *evlist)
return err;
}

-int evlist__open(struct evlist *evlist)
+int evlist__open_custom(struct evlist *evlist,
+ struct evlist_open_custom_fallback *cust_fb)
{
int err;

@@ -1596,9 +1606,9 @@ int evlist__open(struct evlist *evlist)
evlist__update_id_pos(evlist);

if (perf_singlethreaded)
- err = evlist__open_singlethreaded(evlist);
+ err = evlist__open_singlethreaded(evlist, cust_fb);
else
- err = evlist__open_multithreaded(evlist);
+ err = evlist__open_multithreaded(evlist, cust_fb);

if (err)
goto out_err;
@@ -1610,6 +1620,11 @@ int evlist__open(struct evlist *evlist)
return err;
}

+int evlist__open(struct evlist *evlist)
+{
+ return evlist__open_custom(evlist, NULL);
+}
+
int evlist__prepare_workload(struct evlist *evlist, struct target *target, const char *argv[],
bool pipe_output, void (*exec_error)(int signo, siginfo_t *info, void *ucontext))
{
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index b0c2da0f9755b2d1..cb753b4c4e70b4ba 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -161,6 +161,15 @@ void evlist__mmap_consume(struct evlist *evlist, int idx);
int evlist__open(struct evlist *evlist);
void evlist__close(struct evlist *evlist);

+struct evlist_open_custom_fallback;
+typedef bool (*evlist_open_custom_fallback_fn)(struct evlist_open_custom_fallback *fallback,
+ struct evlist *evlist, struct evsel *evsel, int err);
+struct evlist_open_custom_fallback {
+ evlist_open_custom_fallback_fn func;
+};
+int evlist__open_custom(struct evlist *evlist, struct evlist_open_custom_fallback *cust_fb);
+
+
struct callchain_param;

void evlist__set_id_pos(struct evlist *evlist);
--
2.31.1

2021-08-21 09:23:40

by Riccardo Mancini

[permalink] [raw]
Subject: [RFC PATCH v1 17/37] perf evsel: separate missing feature disabling from evsel__open_cpu

This is a preparatory patch for the following patches, with the goal of
separating, in evsel__open_cpu, the actual opening, which could be
performed in parallel, from the existing fallback mechanisms, which
should be handled sequentially.

This patch separates the disabling of missing features from
evsel__open_cpu into a new function, evsel__disable_missing_features.

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/util/evsel.c | 57 ++++++++++++++++++++++-------------------
1 file changed, 31 insertions(+), 26 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 509a2970a94b3142..f0bc89f743915800 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1794,33 +1794,8 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
return 0;
}

-static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
- struct perf_thread_map *threads,
- int start_cpu, int end_cpu)
+static void evsel__disable_missing_features(struct evsel *evsel)
{
- int cpu, thread, nthreads;
- int pid = -1, err, old_errno;
- enum { NO_CHANGE, SET_TO_MAX, INCREASED_MAX } set_rlimit = NO_CHANGE;
-
- err = __evsel__prepare_open(evsel, cpus, threads);
- if (err)
- return err;
-
- if (cpus == NULL)
- cpus = empty_cpu_map;
-
- if (threads == NULL)
- threads = empty_thread_map;
-
- if (evsel->core.system_wide)
- nthreads = 1;
- else
- nthreads = threads->nr;
-
- if (evsel->cgrp)
- pid = evsel->cgrp->fd;
-
-fallback_missing_features:
if (perf_missing_features.weight_struct) {
evsel__set_sample_bit(evsel, WEIGHT);
evsel__reset_sample_bit(evsel, WEIGHT_STRUCT);
@@ -1850,6 +1825,36 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
evsel->core.attr.branch_sample_type &= ~PERF_SAMPLE_BRANCH_HW_INDEX;
if (perf_missing_features.sample_id_all)
evsel->core.attr.sample_id_all = 0;
+}
+
+static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
+ struct perf_thread_map *threads,
+ int start_cpu, int end_cpu)
+{
+ int cpu, thread, nthreads;
+ int pid = -1, err, old_errno;
+ enum { NO_CHANGE, SET_TO_MAX, INCREASED_MAX } set_rlimit = NO_CHANGE;
+
+ err = __evsel__prepare_open(evsel, cpus, threads);
+ if (err)
+ return err;
+
+ if (cpus == NULL)
+ cpus = empty_cpu_map;
+
+ if (threads == NULL)
+ threads = empty_thread_map;
+
+ if (evsel->core.system_wide)
+ nthreads = 1;
+ else
+ nthreads = threads->nr;
+
+ if (evsel->cgrp)
+ pid = evsel->cgrp->fd;
+
+fallback_missing_features:
+ evsel__disable_missing_features(evsel);

display_attr(&evsel->core.attr);

--
2.31.1

2021-08-21 09:23:43

by Riccardo Mancini

[permalink] [raw]
Subject: [RFC PATCH v1 25/37] perf evsel: move event open in evsel__open_cpu to separate function

This is the final patch splitting evsel__open_cpu.
It moves the per-event open code from the inner loop into a separate
function, perf_event_open, so it can be reused by the multithreaded code.

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/util/evsel.c | 142 ++++++++++++++++++++++++----------------
tools/perf/util/evsel.h | 12 ++++
2 files changed, 99 insertions(+), 55 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 2e95416b8320c6b9..e41f55a7a70ea630 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1945,6 +1945,82 @@ bool evsel__increase_rlimit(enum rlimit_action *set_rlimit)
return false;
}

+static struct perf_event_open_result perf_event_open(struct evsel *evsel,
+ pid_t pid, int cpu, int thread,
+ struct perf_cpu_map *cpus,
+ struct perf_thread_map *threads)
+{
+ int fd, group_fd, rc;
+ struct perf_event_open_result res;
+
+ if (!evsel->cgrp && !evsel->core.system_wide)
+ pid = perf_thread_map__pid(threads, thread);
+
+ group_fd = get_group_fd(evsel, cpu, thread);
+
+ test_attr__ready();
+
+ pr_debug2_peo("sys_perf_event_open: pid %d cpu %d group_fd %d flags %#lx",
+ pid, cpus->map[cpu], group_fd, evsel->open_flags);
+
+ fd = sys_perf_event_open(&evsel->core.attr, pid, cpus->map[cpu],
+ group_fd, evsel->open_flags);
+
+ FD(evsel, cpu, thread) = fd;
+ res.fd = fd;
+
+ if (fd < 0) {
+ rc = -errno;
+
+ pr_debug2_peo("\nsys_perf_event_open failed, error %d\n",
+ rc);
+ res.rc = rc;
+ res.err = PEO_FALLBACK;
+ return res;
+ }
+
+ bpf_counter__install_pe(evsel, cpu, fd);
+
+ if (unlikely(test_attr__enabled)) {
+ test_attr__open(&evsel->core.attr, pid,
+ cpus->map[cpu], fd,
+ group_fd, evsel->open_flags);
+ }
+
+ pr_debug2_peo(" = %d\n", fd);
+
+ if (evsel->bpf_fd >= 0) {
+ int evt_fd = fd;
+ int bpf_fd = evsel->bpf_fd;
+
+ rc = ioctl(evt_fd,
+ PERF_EVENT_IOC_SET_BPF,
+ bpf_fd);
+ if (rc && errno != EEXIST) {
+ pr_err("failed to attach bpf fd %d: %s\n",
+ bpf_fd, strerror(errno));
+ res.rc = -EINVAL;
+ res.err = PEO_ERROR;
+ return res;
+ }
+ }
+
+ /*
+ * If we succeeded but had to kill clockid, fail and
+ * have evsel__open_strerror() print us a nice error.
+ */
+ if (perf_missing_features.clockid ||
+ perf_missing_features.clockid_wrong) {
+ res.rc = -EINVAL;
+ res.err = PEO_ERROR;
+ return res;
+ }
+
+ res.rc = 0;
+ res.err = PEO_SUCCESS;
+ return res;
+}
+
static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
struct perf_thread_map *threads,
int start_cpu, int end_cpu)
@@ -1952,6 +2028,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
int cpu, thread, nthreads;
int pid = -1, err, old_errno;
enum rlimit_action set_rlimit = NO_CHANGE;
+ struct perf_event_open_result peo_res;

err = __evsel__prepare_open(evsel, cpus, threads);
if (err)
@@ -1979,67 +2056,22 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
for (cpu = start_cpu; cpu < end_cpu; cpu++) {

for (thread = 0; thread < nthreads; thread++) {
- int fd, group_fd;
retry_open:
if (thread >= nthreads)
break;

- if (!evsel->cgrp && !evsel->core.system_wide)
- pid = perf_thread_map__pid(threads, thread);
-
- group_fd = get_group_fd(evsel, cpu, thread);
-
- test_attr__ready();
-
- pr_debug2_peo("sys_perf_event_open: pid %d cpu %d group_fd %d flags %#lx",
- pid, cpus->map[cpu], group_fd, evsel->open_flags);
+ peo_res = perf_event_open(evsel, pid, cpu, thread, cpus,
+ threads);

- fd = sys_perf_event_open(&evsel->core.attr, pid, cpus->map[cpu],
- group_fd, evsel->open_flags);
-
- FD(evsel, cpu, thread) = fd;
-
- if (fd < 0) {
- err = -errno;
-
- pr_debug2_peo("\nsys_perf_event_open failed, error %d\n",
- err);
+ err = peo_res.rc;
+ switch (peo_res.err) {
+ case PEO_SUCCESS:
+ set_rlimit = NO_CHANGE;
+ continue;
+ case PEO_FALLBACK:
goto try_fallback;
- }
-
- bpf_counter__install_pe(evsel, cpu, fd);
-
- if (unlikely(test_attr__enabled)) {
- test_attr__open(&evsel->core.attr, pid, cpus->map[cpu],
- fd, group_fd, evsel->open_flags);
- }
-
- pr_debug2_peo(" = %d\n", fd);
-
- if (evsel->bpf_fd >= 0) {
- int evt_fd = fd;
- int bpf_fd = evsel->bpf_fd;
-
- err = ioctl(evt_fd,
- PERF_EVENT_IOC_SET_BPF,
- bpf_fd);
- if (err && errno != EEXIST) {
- pr_err("failed to attach bpf fd %d: %s\n",
- bpf_fd, strerror(errno));
- err = -EINVAL;
- goto out_close;
- }
- }
-
- set_rlimit = NO_CHANGE;
-
- /*
- * If we succeeded but had to kill clockid, fail and
- * have evsel__open_strerror() print us a nice error.
- */
- if (perf_missing_features.clockid ||
- perf_missing_features.clockid_wrong) {
- err = -EINVAL;
+ default:
+ case PEO_ERROR:
goto out_close;
}
}
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 0a245afab2d87d74..8c9827a93ac001a7 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -282,6 +282,18 @@ int evsel__enable(struct evsel *evsel);
int evsel__disable(struct evsel *evsel);
int evsel__disable_cpu(struct evsel *evsel, int cpu);

+enum perf_event_open_err {
+ PEO_SUCCESS,
+ PEO_FALLBACK,
+ PEO_ERROR
+};
+
+struct perf_event_open_result {
+ enum perf_event_open_err err;
+ int rc;
+ int fd;
+};
+
int evsel__open_per_cpu(struct evsel *evsel, struct perf_cpu_map *cpus, int cpu);
int evsel__open_per_thread(struct evsel *evsel, struct perf_thread_map *threads);
int evsel__open(struct evsel *evsel, struct perf_cpu_map *cpus,
--
2.31.1

2021-08-21 09:23:53

by Riccardo Mancini

[permalink] [raw]
Subject: [RFC PATCH v1 32/37] perf record: add --threads option

This patch adds a new --threads option to perf-record, which sets the
number of threads to use for multithreaded operations (synthesis and, in
following patches, evlist).

The new option overrides the --num-thread-synthesize option if set.
By default, no additional threads are used. The option can also be
passed without any argument, setting the number of threads to the
number of online cpus.

Furthermore, two new perf config options are added to selectively
disable multithreading in either synthesis or evlist.

To keep the same behaviour for --num-thread-synthesize, setting only that
option will cause multithreading to be enabled only in synthesis (by
overriding the perf config options for multithreaded synthesis and
evlist).

Examples:
$ ./perf record --threads
uses one thread per cpu for synthesis (and evlist in following patches)

$ ./perf record --threads 2 --num-thread-synthesize 4
the two options shouldn't be mixed; the behaviour is to use 2 threads
for everything (4 is ignored)

$ ./perf record --num-thread-synthesize 4
same behaviour as before: 4 threads, but only for synthesis

$ ./perf config record.multithreaded_synthesis=no
$ ./perf record --threads
uses multithreading for everything but synthesis (i.e. evlist in
following patches)

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/Documentation/perf-record.txt | 9 ++++++
tools/perf/builtin-record.c | 35 +++++++++++++++++++-----
tools/perf/util/record.h | 3 ++
3 files changed, 40 insertions(+), 7 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index f1079ee7f2ecf4a8..f5525e3a36e0cf2a 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -629,6 +629,15 @@ appended unit character - B/K/M/G
The number of threads to run when synthesizing events for existing processes.
By default, the number of threads equals 1.

+--threads::
+ The number of threads to use for operations which have multithreaded
+ support (synthesize, evlist).
+ Setting this option overrides --num-thread-synthesize.
+ You can selectively disable any of the multithreaded operations through
+ perf-config record.multithreaded-{synthesis,evlist}.
+ By default, the number of threads equals 1.
+ Setting this option without any parameter sets it to the number of online cpus.
+
ifdef::HAVE_LIBPFM[]
--pfm-events events::
Select a PMU event using libpfm4 syntax (see http://perfmon2.sf.net)
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index dc9a814b2e7906fc..7802a0e25f631fac 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1496,7 +1496,7 @@ static int record__synthesize(struct record *rec, bool tail)
if (err < 0)
pr_warning("Couldn't synthesize cgroup events.\n");

- if (rec->opts.nr_threads_synthesize > 1) {
+ if (rec->opts.multithreaded_synthesis) {
perf_set_multithreaded();
f = process_locked_synthesized_event;
}
@@ -1504,7 +1504,7 @@ static int record__synthesize(struct record *rec, bool tail)
err = __machine__synthesize_threads(machine, tool, &opts->target, rec->evlist->core.threads,
f, opts->sample_address);

- if (rec->opts.nr_threads_synthesize > 1)
+ if (rec->opts.multithreaded_synthesis)
perf_set_singlethreaded();

out:
@@ -2188,6 +2188,12 @@ static int perf_record_config(const char *var, const char *value, void *cb)
rec->opts.nr_cblocks = nr_cblocks_default;
}
#endif
+ if (!strcmp(var, "record.multithreaded-synthesis"))
+ rec->opts.multithreaded_synthesis = perf_config_bool(var, value);
+
+ if (!strcmp(var, "record.multithreaded-evlist"))
+ rec->opts.multithreaded_evlist = perf_config_bool(var, value);
+

return 0;
}
@@ -2434,6 +2440,9 @@ static struct record record = {
},
.mmap_flush = MMAP_FLUSH_DEFAULT,
.nr_threads_synthesize = 1,
+ .nr_threads = 1,
+ .multithreaded_evlist = true,
+ .multithreaded_synthesis = true,
.ctl_fd = -1,
.ctl_fd_ack = -1,
},
@@ -2640,6 +2649,9 @@ static struct option __record_options[] = {
OPT_UINTEGER(0, "num-thread-synthesize",
&record.opts.nr_threads_synthesize,
"number of threads to run for event synthesis"),
+ OPT_UINTEGER_OPTARG(0, "threads",
+ &record.opts.nr_threads, UINT_MAX,
+ "number of threads to use"),
#ifdef HAVE_LIBPFM
OPT_CALLBACK(0, "pfm-events", &record.evlist, "event",
"libpfm4 event selector. use 'perf list' to list available events",
@@ -2915,10 +2927,19 @@ int cmd_record(int argc, const char **argv)
rec->opts.comp_level = comp_level_max;
pr_debug("comp level: %d\n", rec->opts.comp_level);

- if (rec->opts.nr_threads_synthesize == UINT_MAX)
- rec->opts.nr_threads_synthesize = sysconf(_SC_NPROCESSORS_ONLN);
- if (rec->opts.nr_threads_synthesize > 1) {
- err = setup_global_workqueue(rec->opts.nr_threads_synthesize);
+ if (rec->opts.nr_threads <= 1) {
+ rec->opts.multithreaded_evlist = false;
+ if (rec->opts.nr_threads_synthesize > 1) {
+ rec->opts.multithreaded_synthesis = true;
+ rec->opts.nr_threads = rec->opts.nr_threads_synthesize;
+ } else {
+ rec->opts.multithreaded_synthesis = false;
+ }
+ }
+ if (rec->opts.nr_threads == UINT_MAX)
+ rec->opts.nr_threads = sysconf(_SC_NPROCESSORS_ONLN);
+ if (rec->opts.nr_threads > 1) {
+ err = setup_global_workqueue(rec->opts.nr_threads);
if (err) {
create_workqueue_strerror(global_wq, errbuf, sizeof(errbuf));
pr_err("setup_global_workqueue: %s\n", errbuf);
@@ -2928,7 +2949,7 @@ int cmd_record(int argc, const char **argv)

err = __cmd_record(&record, argc, argv);

- if (rec->opts.nr_threads_synthesize > 1)
+ if (rec->opts.nr_threads > 1)
teardown_global_workqueue();
out:
bitmap_free(rec->affinity_mask.bits);
diff --git a/tools/perf/util/record.h b/tools/perf/util/record.h
index 68f471d9a88b2b36..9c47a7904a43ffc7 100644
--- a/tools/perf/util/record.h
+++ b/tools/perf/util/record.h
@@ -74,6 +74,9 @@ struct record_opts {
int mmap_flush;
unsigned int comp_level;
unsigned int nr_threads_synthesize;
+ unsigned int nr_threads;
+ bool multithreaded_synthesis;
+ bool multithreaded_evlist;
int ctl_fd;
int ctl_fd_ack;
bool ctl_fd_close;
--
2.31.1

2021-08-21 09:23:53

by Riccardo Mancini

[permalink] [raw]
Subject: [RFC PATCH v1 24/37] perf evsel: handle precise_ip fallback in evsel__open_cpu

This is another patch in the effort to separate the fallback mechanisms
from the open itself.

In case of precise_ip fallback, the original precise_ip is now stored
in the evsel (it was previously kept in a local variable) and the open is
retried. For example, with precise_max set and precise_ip = 3, the open
is retried with precise_ip 2, 1 and 0; if all fail, the original value is
restored and the error is left to the standard fallback. Since the
precise_ip fallback will be the first in the chain of fallbacks, there
should be no functional change with this patch.

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/util/evsel.c | 59 ++++++++++++++++++-----------------------
tools/perf/util/evsel.h | 2 ++
2 files changed, 28 insertions(+), 33 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 3e556afed8dd396c..2e95416b8320c6b9 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1709,42 +1709,29 @@ static void display_attr(struct perf_event_attr *attr)
}
}

-static int perf_event_open(struct evsel *evsel,
- pid_t pid, int cpu, int group_fd)
+bool evsel__precise_ip_fallback(struct evsel *evsel)
{
- int precise_ip = evsel->core.attr.precise_ip;
- int fd;
-
- while (1) {
- pr_debug2_peo("sys_perf_event_open: pid %d cpu %d group_fd %d flags %#lx",
- pid, cpu, group_fd, evsel->open_flags);
-
- fd = sys_perf_event_open(&evsel->core.attr, pid, cpu, group_fd, evsel->open_flags);
- if (fd >= 0)
- break;
-
- /* Do not try less precise if not requested. */
- if (!evsel->precise_max)
- break;
-
- /*
- * We tried all the precise_ip values, and it's
- * still failing, so leave it to standard fallback.
- */
- if (!evsel->core.attr.precise_ip) {
- evsel->core.attr.precise_ip = precise_ip;
- break;
- }
+ /* Do not try less precise if not requested. */
+ if (!evsel->precise_max)
+ return false;

- pr_debug2_peo("\nsys_perf_event_open failed, error %d\n", -ENOTSUP);
- evsel->core.attr.precise_ip--;
- pr_debug2_peo("decreasing precise_ip by one (%d)\n", evsel->core.attr.precise_ip);
- display_attr(&evsel->core.attr);
+ /*
+ * We tried all the precise_ip values, and it's
+ * still failing, so leave it to standard fallback.
+ */
+ if (!evsel->core.attr.precise_ip) {
+ evsel->core.attr.precise_ip = evsel->precise_ip_original;
+ return false;
}

- return fd;
-}
+ if (!evsel->precise_ip_original)
+ evsel->precise_ip_original = evsel->core.attr.precise_ip;

+ evsel->core.attr.precise_ip--;
+ pr_debug2_peo("decreasing precise_ip by one (%d)\n", evsel->core.attr.precise_ip);
+ display_attr(&evsel->core.attr);
+ return true;
+}

static struct perf_cpu_map *empty_cpu_map;
static struct perf_thread_map *empty_thread_map;
@@ -2004,8 +1991,11 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,

test_attr__ready();

- fd = perf_event_open(evsel, pid, cpus->map[cpu],
- group_fd);
+ pr_debug2_peo("sys_perf_event_open: pid %d cpu %d group_fd %d flags %#lx",
+ pid, cpus->map[cpu], group_fd, evsel->open_flags);
+
+ fd = sys_perf_event_open(&evsel->core.attr, pid, cpus->map[cpu],
+ group_fd, evsel->open_flags);

FD(evsel, cpu, thread) = fd;

@@ -2058,6 +2048,9 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
return 0;

try_fallback:
+ if (evsel__precise_ip_fallback(evsel))
+ goto retry_open;
+
if (evsel__ignore_missing_thread(evsel, cpus->nr, cpu, threads, thread, err)) {
/* We just removed 1 thread, so lower the upper nthreads limit. */
nthreads--;
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 661d2677a19b6248..0a245afab2d87d74 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -150,6 +150,7 @@ struct evsel {
struct bperf_follower_bpf *follower_skel;
};
unsigned long open_flags;
+ int precise_ip_original;
};

struct perf_missing_features {
@@ -297,6 +298,7 @@ bool evsel__ignore_missing_thread(struct evsel *evsel,
int nr_cpus, int cpu,
struct perf_thread_map *threads,
int thread, int err);
+bool evsel__precise_ip_fallback(struct evsel *evsel);

struct perf_sample;

--
2.31.1

2021-08-21 09:23:58

by Riccardo Mancini

[permalink] [raw]
Subject: [RFC PATCH v1 21/37] perf evsel: move ignore_missing_thread to fallback code

This patch moves ignore_missing_thread outside the perf_event_open loop.
Doing so requires moving the retry_open label a few lines higher, with
minimal impact. Furthermore, thread need not be decreased since it won't
be increased by the for loop (we jump back inside it), but we do need to
check that the nthreads decrease didn't put thread out of range.

The goal is to have the fallbacks handled in one place only, since in the
future parallel code they will be handled separately.

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/util/evsel.c | 29 +++++++++++++----------------
tools/perf/util/evsel.h | 5 +++++
2 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 916930ea31450265..a1a88607fd59efcb 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1656,7 +1656,7 @@ static int update_fds(struct evsel *evsel,
return 0;
}

-static bool ignore_missing_thread(struct evsel *evsel,
+bool evsel__ignore_missing_thread(struct evsel *evsel,
int nr_cpus, int cpu,
struct perf_thread_map *threads,
int thread, int err)
@@ -1993,12 +1993,15 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,

for (thread = 0; thread < nthreads; thread++) {
int fd, group_fd;
+retry_open:
+ if (thread >= nthreads)
+ break;

if (!evsel->cgrp && !evsel->core.system_wide)
pid = perf_thread_map__pid(threads, thread);

group_fd = get_group_fd(evsel, cpu, thread);
-retry_open:
+
test_attr__ready();

fd = perf_event_open(evsel, pid, cpus->map[cpu],
@@ -2016,20 +2019,6 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
if (fd < 0) {
err = -errno;

- if (ignore_missing_thread(evsel, cpus->nr, cpu, threads, thread, err)) {
- /*
- * We just removed 1 thread, so take a step
- * back on thread index and lower the upper
- * nthreads limit.
- */
- nthreads--;
- thread--;
-
- /* ... and pretend like nothing have happened. */
- err = 0;
- continue;
- }
-
pr_debug2_peo("\nsys_perf_event_open failed, error %d\n",
err);
goto try_fallback;
@@ -2069,6 +2058,14 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
return 0;

try_fallback:
+ if (evsel__ignore_missing_thread(evsel, cpus->nr, cpu, threads, thread, err)) {
+ /* We just removed 1 thread, so lower the upper nthreads limit. */
+ nthreads--;
+
+ /* ... and pretend like nothing have happened. */
+ err = 0;
+ goto retry_open;
+ }
/*
* perf stat needs between 5 and 22 fds per CPU. When we run out
* of them try to increase the limits.
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index bf9abd9a5cbf9852..661d2677a19b6248 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -293,6 +293,11 @@ bool evsel__detect_missing_features(struct evsel *evsel);
enum rlimit_action { NO_CHANGE, SET_TO_MAX, INCREASED_MAX };
bool evsel__increase_rlimit(enum rlimit_action *set_rlimit);

+bool evsel__ignore_missing_thread(struct evsel *evsel,
+ int nr_cpus, int cpu,
+ struct perf_thread_map *threads,
+ int thread, int err);
+
struct perf_sample;

void *evsel__rawptr(struct evsel *evsel, struct perf_sample *sample, const char *name);
--
2.31.1

2021-08-21 09:23:58

by Riccardo Mancini

[permalink] [raw]
Subject: [RFC PATCH v1 20/37] perf evsel: separate rlimit increase from evsel__open_cpu

This is a preparatory patch for the following patches, with the goal of
separating, in evsel__open_cpu, the actual opening (which could be
performed in parallel) from the existing fallback mechanisms, which
should be handled sequentially.

This patch separates the rlimit increase from evsel__open_cpu.

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/util/evsel.c | 50 ++++++++++++++++++++++++-----------------
tools/perf/util/evsel.h | 3 +++
2 files changed, 33 insertions(+), 20 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index c393bd992322d925..916930ea31450265 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1931,13 +1931,40 @@ bool evsel__detect_missing_features(struct evsel *evsel)
}
}

+bool evsel__increase_rlimit(enum rlimit_action *set_rlimit)
+{
+ int old_errno;
+ struct rlimit l;
+
+ if (*set_rlimit < INCREASED_MAX) {
+
+ old_errno = errno;
+ if (getrlimit(RLIMIT_NOFILE, &l) == 0) {
+ if (*set_rlimit == NO_CHANGE)
+ l.rlim_cur = l.rlim_max;
+ else {
+ l.rlim_cur = l.rlim_max + 1000;
+ l.rlim_max = l.rlim_cur;
+ }
+ if (setrlimit(RLIMIT_NOFILE, &l) == 0) {
+ (*set_rlimit) += 1;
+ errno = old_errno;
+ return true;
+ }
+ }
+ errno = old_errno;
+ }
+
+ return false;
+}
+
static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
struct perf_thread_map *threads,
int start_cpu, int end_cpu)
{
int cpu, thread, nthreads;
int pid = -1, err, old_errno;
- enum { NO_CHANGE, SET_TO_MAX, INCREASED_MAX } set_rlimit = NO_CHANGE;
+ enum rlimit_action set_rlimit = NO_CHANGE;

err = __evsel__prepare_open(evsel, cpus, threads);
if (err)
@@ -2046,25 +2073,8 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
* perf stat needs between 5 and 22 fds per CPU. When we run out
* of them try to increase the limits.
*/
- if (err == -EMFILE && set_rlimit < INCREASED_MAX) {
- struct rlimit l;
-
- old_errno = errno;
- if (getrlimit(RLIMIT_NOFILE, &l) == 0) {
- if (set_rlimit == NO_CHANGE)
- l.rlim_cur = l.rlim_max;
- else {
- l.rlim_cur = l.rlim_max + 1000;
- l.rlim_max = l.rlim_cur;
- }
- if (setrlimit(RLIMIT_NOFILE, &l) == 0) {
- set_rlimit++;
- errno = old_errno;
- goto retry_open;
- }
- }
- errno = old_errno;
- }
+ if (err == -EMFILE && evsel__increase_rlimit(&set_rlimit))
+ goto retry_open;

if (err != -EINVAL || cpu > 0 || thread > 0)
goto out_close;
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index a83fb7f69b1ead73..bf9abd9a5cbf9852 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -290,6 +290,9 @@ int evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
struct perf_thread_map *threads);
bool evsel__detect_missing_features(struct evsel *evsel);

+enum rlimit_action { NO_CHANGE, SET_TO_MAX, INCREASED_MAX };
+bool evsel__increase_rlimit(enum rlimit_action *set_rlimit);
+
struct perf_sample;

void *evsel__rawptr(struct evsel *evsel, struct perf_sample *sample, const char *name);
--
2.31.1

2021-08-21 09:24:04

by Riccardo Mancini

[permalink] [raw]
Subject: [RFC PATCH v1 28/37] perf evlist: add multithreading to evlist__open

This patch enables multithreading in evlist__open using the new
evsel__open_per_cpu_no_fallback function.

The multithreaded version tries to open everything in parallel. Once the
workers are done, it checks their results and, in case of error, it tries
the fallback mechanisms present in evsel__open_cpu and restarts the
workers from where they left off.

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/util/evlist.c | 189 +++++++++++++++++++++++++++++++++++++--
1 file changed, 183 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index f0a839107e8849bf..3472038d719ec7d4 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -19,6 +19,8 @@
#include "units.h"
#include "bpf_counter.h"
#include <internal/lib.h> // page_size
+#include <internal/cpumap.h>
+#include <internal/threadmap.h>
#include "affinity.h"
#include "../perf.h"
#include "asm/bug.h"
@@ -1403,11 +1405,184 @@ static int evlist__create_syswide_maps(struct evlist *evlist)
return err;
}

-int evlist__open(struct evlist *evlist)
+static int evlist__open_singlethreaded(struct evlist *evlist)
{
struct evsel *evsel;
int err;

+ evlist__for_each_entry(evlist, evsel) {
+ err = evsel__open(evsel, evsel->core.cpus, evsel->core.threads);
+ if (err < 0)
+ return err;
+ }
+
+ return 0;
+}
+
+struct evlist_open_work {
+ struct work_struct work;
+ struct evlist *evlist;
+ int cpu;
+ union {
+ int cpu_resume;
+ int cpu_err;
+ };
+ union {
+ struct evsel *evsel_resume;
+ struct evsel *evsel_err;
+ };
+ struct evsel_open_result res; // this is also used to resume work
+ bool progress; // did the worker do any progress?
+};
+
+static void evlist__open_multithreaded_func(struct work_struct *_work)
+{
+ struct evlist_open_work *work = container_of(_work, struct evlist_open_work, work);
+ struct evsel *evsel = work->evsel_resume;
+ int cpu_idx, thread_resume = work->res.thread;
+
+ work->res.peo_res.err = PEO_SUCCESS;
+ work->progress = false;
+
+ if (!evsel) // nothing to do
+ return;
+
+ work->evsel_err = NULL;
+
+ evlist__for_each_entry_from(work->evlist, evsel) {
+ cpu_idx = evsel__find_cpu(evsel, work->cpu);
+ if (cpu_idx < work->cpu_resume)
+ continue;
+
+ work->res = evsel__open_per_cpu_no_fallback(evsel,
+ evsel->core.cpus,
+ evsel->core.threads,
+ cpu_idx, thread_resume);
+ work->progress |= work->res.thread != thread_resume;
+ if (work->res.peo_res.err != PEO_SUCCESS) {
+ work->evsel_err = evsel;
+ work->cpu_err = cpu_idx;
+ break;
+ }
+
+ thread_resume = 0;
+ }
+}
+
+static int evlist__open_multithreaded(struct evlist *evlist)
+{
+ int cpu, cpuid, cpuidx, thread, err;
+ struct evlist_open_work *works;
+ char errbuf[WORKQUEUE_STRERR_BUFSIZE];
+ struct perf_event_open_result peo_res;
+ struct evsel *evsel;
+ struct perf_cpu_map *cpus;
+ struct perf_thread_map *threads;
+ enum rlimit_action set_rlimit = NO_CHANGE;
+ bool progress;
+
+ works = calloc(perf_cpu_map__nr(evlist->core.all_cpus), sizeof(*works));
+ if (!works)
+ return -ENOMEM;
+
+ perf_cpu_map__for_each_cpu(cpuid, cpuidx, evlist->core.all_cpus) {
+ init_work(&works[cpuidx].work);
+ works[cpuidx].work.func = evlist__open_multithreaded_func;
+ works[cpuidx].evlist = evlist;
+ works[cpuidx].cpu = cpuid;
+ works[cpuidx].evsel_resume = evlist__first(evlist);
+ }
+
+reprepare:
+ evlist__for_each_entry(evlist, evsel) {
+ err = evsel__prepare_open(evsel, evsel->core.cpus,
+ evsel->core.threads);
+ if (err)
+ goto out;
+ }
+retry:
+ perf_cpu_map__for_each_cpu(cpuid, cpuidx, evlist->core.all_cpus) {
+ err = schedule_work_on(cpuid, &works[cpuidx].work);
+ if (err) {
+ workqueue_strerror(global_wq, err, errbuf, sizeof(errbuf));
+ pr_debug("schedule_work: %s\n", errbuf);
+ goto out;
+ }
+ }
+
+ err = flush_scheduled_work();
+ if (err) {
+ workqueue_strerror(global_wq, err, errbuf, sizeof(errbuf));
+ pr_debug("flush_scheduled_work: %s\n", errbuf);
+ goto out;
+ }
+
+ // check if any event was opened (progress = true)
+ progress = false;
+ perf_cpu_map__for_each_cpu(cpuid, cpuidx, evlist->core.all_cpus) {
+ if (works[cpuidx].progress) {
+ progress = true;
+ break;
+ }
+ }
+
+ perf_cpu_map__for_each_cpu(cpuid, cpuidx, evlist->core.all_cpus) {
+ peo_res = works[cpuidx].res.peo_res;
+
+ switch (peo_res.err) {
+ case PEO_SUCCESS:
+ continue;
+ case PEO_FALLBACK:
+ err = peo_res.rc;
+ break;
+ default:
+ case PEO_ERROR:
+ err = peo_res.rc;
+ goto out;
+ }
+
+ // fallback
+ evsel = works[cpuidx].evsel_err;
+ cpus = evsel->core.cpus;
+ cpu = works[cpuidx].cpu_err;
+ threads = evsel->core.threads;
+ thread = works[cpuidx].res.thread;
+
+ if (evsel__precise_ip_fallback(evsel))
+ goto retry;
+
+ if (evsel__ignore_missing_thread(evsel, cpus->nr, cpu,
+ threads, thread, err))
+ goto retry;
+
+ // increase rlimit only if no progress was made
+ if (progress)
+ set_rlimit = NO_CHANGE;
+ if (err == -EMFILE && evsel__increase_rlimit(&set_rlimit))
+ goto retry;
+
+ if (err != -EINVAL || cpu > 0 || thread > 0)
+ goto out;
+
+ if (evsel__detect_missing_features(evsel))
+ goto reprepare;
+
+ // no fallback worked, return the error
+ goto out;
+ }
+
+ err = 0;
+
+out:
+ free(works);
+
+ return err;
+}
+
+int evlist__open(struct evlist *evlist)
+{
+ int err;
+
/*
* Default: one fd per CPU, all threads, aka systemwide
* as sys_perf_event_open(cpu = -1, thread = -1) is EINVAL
@@ -1420,11 +1595,13 @@ int evlist__open(struct evlist *evlist)

evlist__update_id_pos(evlist);

- evlist__for_each_entry(evlist, evsel) {
- err = evsel__open(evsel, evsel->core.cpus, evsel->core.threads);
- if (err < 0)
- goto out_err;
- }
+ if (perf_singlethreaded)
+ err = evlist__open_singlethreaded(evlist);
+ else
+ err = evlist__open_multithreaded(evlist);
+
+ if (err)
+ goto out_err;

return 0;
out_err:
--
2.31.1

2021-08-21 09:24:13

by Riccardo Mancini

[permalink] [raw]
Subject: [RFC PATCH v1 37/37] perf test/evlist-open-close: add detailed output mode

This patch adds a detailed output mode to the evlist-open-close
benchmark. In this output mode, the time taken by each individual
evlist operation is measured and reported.

Normal mode:
$ sudo ./perf bench internals evlist-open-close
Number of workers: 1
Number of cpus: 4
Number of threads: 1
Number of events: 1 (4 fds)
Number of iterations: 100
Average open-close took: 1199.300 usec (+- 289.699 usec)

Detailed mode:
$ sudo ./perf bench internals evlist-open-close -d
Number of workers: 1
Number of cpus: 4
Number of threads: 1
Number of events: 1 (4 fds)
Number of iterations: 100
Average open-close took: 1199.300 usec (+- 289.699 usec)
init took: 0.000 usec (+- 0.000 usec)
open took: 25.600 usec (+- 1.778 usec)
mmap took: 532.000 usec (+- 58.133 usec)
enable took: 337.300 usec (+- 194.160 usec)
disable took: 181.700 usec (+- 85.307 usec)
munmap took: 22.100 usec (+- 4.045 usec)
close took: 100.300 usec (+- 21.329 usec)
fini took: 0.200 usec (+- 0.133 usec)

* init and fini represent the time taken before and after the evlist
operations (in this case the workqueue setup and teardown operations)

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/bench/evlist-open-close.c | 63 ++++++++++++++++++++++++++--
1 file changed, 60 insertions(+), 3 deletions(-)

diff --git a/tools/perf/bench/evlist-open-close.c b/tools/perf/bench/evlist-open-close.c
index 00d0aef564f80d44..8ba2799c66cafb3e 100644
--- a/tools/perf/bench/evlist-open-close.c
+++ b/tools/perf/bench/evlist-open-close.c
@@ -25,12 +25,18 @@
static int iterations = 100;
static int nr_events = 1;
static const char *event_string = "dummy";
+static bool detail;

static inline u64 timeval2usec(struct timeval *tv)
{
return tv->tv_sec * USEC_PER_SEC + tv->tv_usec;
}

+struct timers {
+ struct timeval start, end, diff;
+ struct stats init, open, mmap, enable, disable, munmap, close, fini;
+};
+
static struct record_opts opts = {
.sample_time = true,
.mmap_pages = UINT_MAX,
@@ -60,6 +66,7 @@ static const struct option options[] = {
OPT_STRING('u', "uid", &opts.target.uid_str, "user", "user to profile"),
OPT_BOOLEAN(0, "per-thread", &opts.target.per_thread, "use per-thread mmaps"),
OPT_UINTEGER_OPTARG('j', "threads", &opts.nr_threads, UINT_MAX, "Number of threads to use"),
+ OPT_BOOLEAN('d', "detail", &detail, "compute time taken by single functions"),
OPT_END()
};

@@ -113,11 +120,28 @@ static struct evlist *bench__create_evlist(char *evstr)
return NULL;
}

-static int bench__do_evlist_open_close(struct evlist *evlist)
+#define START_TIMER(timers) do { \
+ if (detail) { \
+ gettimeofday(&(timers)->start, NULL); \
+ } \
+} while (0)
+
+#define RECORD_TIMER(timers, field) do { \
+ if (detail) { \
+ gettimeofday(&(timers)->end, NULL); \
+ timersub(&(timers)->end, &(timers)->start, &(timers)->diff); \
+ update_stats(&(timers)->field, timeval2usec(&(timers)->diff)); \
+ (timers)->start = (timers)->end; \
+ } \
+} while (0)
+
+static int bench__do_evlist_open_close(struct evlist *evlist, struct timers *timers)
{
char sbuf[WORKQUEUE_STRERR_BUFSIZE];
int err = -1, ret;

+ START_TIMER(timers);
+
if (opts.nr_threads > 1) {
err = setup_global_workqueue(opts.nr_threads);
if (err) {
@@ -130,23 +154,30 @@ static int bench__do_evlist_open_close(struct evlist *evlist)

perf_set_multithreaded();
}
+ RECORD_TIMER(timers, init);

err = evlist__open(evlist);
if (err < 0) {
pr_err("evlist__open: %s\n", str_error_r(errno, sbuf, sizeof(sbuf)));
goto out;
}
+ RECORD_TIMER(timers, open);

err = evlist__mmap(evlist, opts.mmap_pages);
if (err < 0) {
pr_err("evlist__mmap: %s\n", str_error_r(errno, sbuf, sizeof(sbuf)));
goto out;
}
+ RECORD_TIMER(timers, mmap);

evlist__enable(evlist);
+ RECORD_TIMER(timers, enable);
evlist__disable(evlist);
+ RECORD_TIMER(timers, disable);
evlist__munmap(evlist);
+ RECORD_TIMER(timers, munmap);
evlist__close(evlist);
+ RECORD_TIMER(timers, close);

out:
if (opts.nr_threads > 1) {
@@ -159,10 +190,15 @@ static int bench__do_evlist_open_close(struct evlist *evlist)

perf_set_singlethreaded();
}
+ RECORD_TIMER(timers, fini);

return err;
}

+#define PRINT_TIMER(timers, field) \
+ printf("%20s took: %12.3f usec (+- %12.3f usec)\n", #field, \
+ avg_stats(&(timers)->field), stddev_stats(&(timers)->field))
+
static int bench_evlist_open_close__run(char *evstr)
{
// used to print statistics only
@@ -172,10 +208,21 @@ static int bench_evlist_open_close__run(char *evstr)
struct stats time_stats;
u64 runtime_us;
int i, err;
+ struct timers timers;

if (!evlist)
return -ENOMEM;

+ init_stats(&time_stats);
+ init_stats(&timers.init);
+ init_stats(&timers.open);
+ init_stats(&timers.mmap);
+ init_stats(&timers.enable);
+ init_stats(&timers.disable);
+ init_stats(&timers.munmap);
+ init_stats(&timers.close);
+ init_stats(&timers.fini);
+
init_stats(&time_stats);

printf(" Number of workers:\t%u\n", opts.nr_threads);
@@ -194,7 +241,7 @@ static int bench_evlist_open_close__run(char *evstr)
return -ENOMEM;

gettimeofday(&start, NULL);
- err = bench__do_evlist_open_close(evlist);
+ err = bench__do_evlist_open_close(evlist, &timers);
if (err) {
evlist__delete(evlist);
return err;
@@ -211,7 +258,17 @@ static int bench_evlist_open_close__run(char *evstr)

time_average = avg_stats(&time_stats);
time_stddev = stddev_stats(&time_stats);
- printf(" Average open-close took: %.3f usec (+- %.3f usec)\n", time_average, time_stddev);
+ printf(" Average open-close took: %12.3f usec (+- %12.3f usec)\n", time_average, time_stddev);
+ if (detail) {
+ PRINT_TIMER(&timers, init);
+ PRINT_TIMER(&timers, open);
+ PRINT_TIMER(&timers, mmap);
+ PRINT_TIMER(&timers, enable);
+ PRINT_TIMER(&timers, disable);
+ PRINT_TIMER(&timers, munmap);
+ PRINT_TIMER(&timers, close);
+ PRINT_TIMER(&timers, fini);
+ }

return 0;
}
--
2.31.1

2021-08-21 09:24:18

by Riccardo Mancini

[permalink] [raw]
Subject: [RFC PATCH v1 31/37] tools lib/subcmd: add OPT_UINTEGER_OPTARG option type

This patch adds OPT_UINTEGER_OPTARG, which is the same as OPT_UINTEGER,
but also makes it possible to use the option without any value, in which
case the variable is set to the given default value d.
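
For example, the following patch uses it for perf-record as:

	OPT_UINTEGER_OPTARG(0, "threads",
			&record.opts.nr_threads, UINT_MAX,
			"number of threads to use"),

so that a bare '--threads' sets nr_threads to UINT_MAX (later replaced
by the number of online cpus), '--threads 2' sets it to 2, and omitting
the option leaves the initial value untouched.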

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/lib/subcmd/parse-options.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/lib/subcmd/parse-options.h b/tools/lib/subcmd/parse-options.h
index d2414144eb8c9927..41b9b942504d398e 100644
--- a/tools/lib/subcmd/parse-options.h
+++ b/tools/lib/subcmd/parse-options.h
@@ -133,6 +133,7 @@ struct option {
#define OPT_SET_PTR(s, l, v, h, p) { .type = OPTION_SET_PTR, .short_name = (s), .long_name = (l), .value = (v), .help = (h), .defval = (p) }
#define OPT_INTEGER(s, l, v, h) { .type = OPTION_INTEGER, .short_name = (s), .long_name = (l), .value = check_vtype(v, int *), .help = (h) }
#define OPT_UINTEGER(s, l, v, h) { .type = OPTION_UINTEGER, .short_name = (s), .long_name = (l), .value = check_vtype(v, unsigned int *), .help = (h) }
+#define OPT_UINTEGER_OPTARG(s, l, v, d, h) { .type = OPTION_UINTEGER, .short_name = (s), .long_name = (l), .value = check_vtype(v, unsigned int *), .help = (h), .flags = PARSE_OPT_OPTARG, .defval = (intptr_t)(d) }
#define OPT_LONG(s, l, v, h) { .type = OPTION_LONG, .short_name = (s), .long_name = (l), .value = check_vtype(v, long *), .help = (h) }
#define OPT_ULONG(s, l, v, h) { .type = OPTION_ULONG, .short_name = (s), .long_name = (l), .value = check_vtype(v, unsigned long *), .help = (h) }
#define OPT_U64(s, l, v, h) { .type = OPTION_U64, .short_name = (s), .long_name = (l), .value = check_vtype(v, u64 *), .help = (h) }
--
2.31.1

2021-08-21 09:24:24

by Riccardo Mancini

[permalink] [raw]
Subject: [RFC PATCH v1 33/37] perf record: pin threads to monitored cpus if enough threads available

This patch sets the affinity of the workqueue threads to pin them to
each monitored CPU, in case the --threads option is set with enough
threads and evlist multithreading is enabled.

This yields better performance for the evlist operations, since the
affinity need not be set by each thread every time.

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/builtin-record.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 7802a0e25f631fac..e2d2445e05d7c07a 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -2945,6 +2945,16 @@ int cmd_record(int argc, const char **argv)
pr_err("setup_global_workqueue: %s\n", errbuf);
goto out;
}
+
+ if ((int)rec->opts.nr_threads >= rec->evlist->core.all_cpus->nr
+ && rec->opts.multithreaded_evlist) {
+ err = workqueue_set_affinities_cpu(global_wq, rec->evlist->core.all_cpus);
+ if (err) {
+ workqueue_strerror(global_wq, err, errbuf, sizeof(errbuf));
+ pr_err("workqueue_set_affinities_cpu: %s\n", errbuf);
+ goto out;
+ }
+ }
}

err = __cmd_record(&record, argc, argv);
--
2.31.1

2021-08-21 09:24:29

by Riccardo Mancini

[permalink] [raw]
Subject: [RFC PATCH v1 26/37] perf evsel: add evsel__open_per_cpu_no_fallback function

This function is equivalent to evsel__open, but without any fallback
mechanism, which should be handled separately.
It also accepts a starting thread, making it possible to resume a
previously failed evsel__open_per_cpu_no_fallback call.
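
A sketch of the intended resume pattern, mirroring how the
multithreaded evlist__open uses res.thread in a later patch (the choice
of fallback applied between the two calls is illustrative):

	struct evsel_open_result res;

	res = evsel__open_per_cpu_no_fallback(evsel, evsel->core.cpus,
					      evsel->core.threads, cpu, 0);
	if (res.peo_res.err == PEO_FALLBACK) {
		/* apply a fallback to the evsel, then resume from the
		 * thread that failed instead of starting over */
		res = evsel__open_per_cpu_no_fallback(evsel, evsel->core.cpus,
						      evsel->core.threads,
						      cpu, res.thread);
	}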

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/util/evsel.c | 44 +++++++++++++++++++++++++++++++++++++++++
tools/perf/util/evsel.h | 9 +++++++++
2 files changed, 53 insertions(+)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index e41f55a7a70ea630..74aad07f3c1102cb 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2021,6 +2021,50 @@ static struct perf_event_open_result perf_event_open(struct evsel *evsel,
return res;
}

+struct evsel_open_result evsel__open_per_cpu_no_fallback(struct evsel *evsel,
+ struct perf_cpu_map *cpus,
+ struct perf_thread_map *threads,
+ int cpu, int start_thread)
+{
+ int thread, nthreads, pid = -1;
+ struct evsel_open_result res = { .thread = start_thread };
+
+ if (cpus == NULL)
+ cpus = empty_cpu_map;
+
+ if (threads == NULL)
+ threads = empty_thread_map;
+
+ if (evsel->core.system_wide)
+ nthreads = 1;
+ else
+ nthreads = threads->nr;
+
+ if (evsel->cgrp)
+ pid = evsel->cgrp->fd;
+
+ display_attr(&evsel->core.attr);
+
+
+ for (thread = start_thread; thread < nthreads; thread++) {
+ res.peo_res = perf_event_open(evsel, pid, cpu, thread, cpus,
+ threads);
+
+ switch (res.peo_res.err) {
+ case PEO_SUCCESS:
+ continue;
+ case PEO_FALLBACK:
+ goto out;
+ default:
+ case PEO_ERROR:
+ goto out;
+ }
+ }
+out:
+ res.thread = thread;
+ return res;
+}
+
static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
struct perf_thread_map *threads,
int start_cpu, int end_cpu)
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 8c9827a93ac001a7..fa0d732d0a088016 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -294,10 +294,19 @@ struct perf_event_open_result {
int fd;
};

+struct evsel_open_result {
+ int thread;
+ struct perf_event_open_result peo_res;
+};
+
int evsel__open_per_cpu(struct evsel *evsel, struct perf_cpu_map *cpus, int cpu);
int evsel__open_per_thread(struct evsel *evsel, struct perf_thread_map *threads);
int evsel__open(struct evsel *evsel, struct perf_cpu_map *cpus,
struct perf_thread_map *threads);
+struct evsel_open_result evsel__open_per_cpu_no_fallback(struct evsel *evsel,
+ struct perf_cpu_map *cpus,
+ struct perf_thread_map *threads,
+ int cpu, int start_thread);
void evsel__close(struct evsel *evsel);
int evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
struct perf_thread_map *threads);
--
2.31.1

2021-08-21 09:24:38

by Riccardo Mancini

[permalink] [raw]
Subject: [RFC PATCH v1 34/37] perf record: apply multithreading in init and fini phases

This patch grows the multithreaded portion of perf-record, marked by
the perf_set_multithreaded and perf_set_singlethreaded functions, to
cover the whole init and fini parts of __cmd_record.
By doing so, perf-record can take advantage of the parallelized evlist
operations (open, enable, disable, close).

This patch also needs to handle the case in which evlist and synthesis
multithreading are not enabled at the same time. Therefore, in
record__synthesize, multithreading is enabled/disabled and then
disabled/re-enabled as needed.

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/builtin-record.c | 22 +++++++++++++++++++++-
1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index e2d2445e05d7c07a..db9ec08db672f994 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1411,6 +1411,7 @@ static int record__synthesize(struct record *rec, bool tail)
struct perf_tool *tool = &rec->tool;
int err = 0;
event_op f = process_synthesized_event;
+ bool perf_was_singlethreaded = perf_singlethreaded;

if (rec->opts.tail_synthesize != tail)
return 0;
@@ -1499,12 +1500,16 @@ static int record__synthesize(struct record *rec, bool tail)
if (rec->opts.multithreaded_synthesis) {
perf_set_multithreaded();
f = process_locked_synthesized_event;
+ } else {
+ perf_set_singlethreaded();
}

err = __machine__synthesize_threads(machine, tool, &opts->target, rec->evlist->core.threads,
f, opts->sample_address);

- if (rec->opts.multithreaded_synthesis)
+ if (!perf_was_singlethreaded && perf_singlethreaded)
+ perf_set_multithreaded();
+ if (perf_was_singlethreaded && !perf_singlethreaded)
perf_set_singlethreaded();

out:
@@ -1735,6 +1740,9 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)

record__uniquify_name(rec);

+ if (rec->opts.multithreaded_evlist)
+ perf_set_multithreaded();
+
if (record__open(rec) != 0) {
err = -1;
goto out_child;
@@ -1877,6 +1885,10 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
}
}

+ // disable locks in the main thread since there is no multithreading
+ if (rec->opts.multithreaded_evlist)
+ perf_set_singlethreaded();
+
trigger_ready(&auxtrace_snapshot_trigger);
trigger_ready(&switch_output_trigger);
perf_hooks__invoke_record_start();
@@ -1998,6 +2010,10 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
}
}

+ // reenable multithreading for evlist
+ if (rec->opts.multithreaded_evlist)
+ perf_set_multithreaded();
+
trigger_off(&auxtrace_snapshot_trigger);
trigger_off(&switch_output_trigger);

@@ -2099,6 +2115,10 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)

if (!opts->no_bpf_event)
evlist__stop_sb_thread(rec->sb_evlist);
+
+ // disable multithreaded mode on exit
+ if (rec->opts.multithreaded_evlist)
+ perf_set_singlethreaded();
return status;
}

--
2.31.1

2021-08-21 09:24:55

by Riccardo Mancini

[permalink] [raw]
Subject: [RFC PATCH v1 35/37] perf test/evlist-open-close: add multithreading

This patch adds a new option, -j/--threads, to use multiple threads for
the evlist operations in the evlist-open-close benchmark.

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/bench/evlist-open-close.c | 46 ++++++++++++++++++++++++----
1 file changed, 40 insertions(+), 6 deletions(-)

diff --git a/tools/perf/bench/evlist-open-close.c b/tools/perf/bench/evlist-open-close.c
index 674cb14cbaa9ef2c..c18aa85725281795 100644
--- a/tools/perf/bench/evlist-open-close.c
+++ b/tools/perf/bench/evlist-open-close.c
@@ -12,6 +12,8 @@
#include "../util/parse-events.h"
#include "internal/threadmap.h"
#include "internal/cpumap.h"
+#include "../util/util.h"
+#include "../util/workqueue/workqueue.h"
#include <linux/perf_event.h>
#include <linux/kernel.h>
#include <linux/time64.h>
@@ -35,7 +37,8 @@ static struct record_opts opts = {
.default_per_cpu = true,
},
.mmap_flush = MMAP_FLUSH_DEFAULT,
- .nr_threads_synthesize = 1,
+ .nr_threads = 1,
+ .multithreaded_evlist = true,
.ctl_fd = -1,
.ctl_fd_ack = -1,
};
@@ -51,6 +54,7 @@ static const struct option options[] = {
OPT_STRING('t', "tid", &opts.target.tid, "tid", "record events on existing thread id"),
OPT_STRING('u', "uid", &opts.target.uid_str, "user", "user to profile"),
OPT_BOOLEAN(0, "per-thread", &opts.target.per_thread, "use per-thread mmaps"),
+ OPT_UINTEGER_OPTARG('j', "threads", &opts.nr_threads, UINT_MAX, "Number of threads to use"),
OPT_END()
};

@@ -106,18 +110,32 @@ static struct evlist *bench__create_evlist(char *evstr)

static int bench__do_evlist_open_close(struct evlist *evlist)
{
- char sbuf[STRERR_BUFSIZE];
- int err = evlist__open(evlist);
+ char sbuf[WORKQUEUE_STRERR_BUFSIZE];
+ int err = -1, ret;
+
+ if (opts.nr_threads > 1) {
+ err = setup_global_workqueue(opts.nr_threads);
+ if (err) {
+ create_workqueue_strerror(global_wq, sbuf, sizeof(sbuf));
+ pr_err("setup_global_workqueue: %s\n", sbuf);
+ return err;
+ }
+ if (evlist->core.all_cpus->nr <= workqueue_nr_threads(global_wq))
+ workqueue_set_affinities_cpu(global_wq, evlist->core.all_cpus);

+ perf_set_multithreaded();
+ }
+
+ err = evlist__open(evlist);
if (err < 0) {
pr_err("evlist__open: %s\n", str_error_r(errno, sbuf, sizeof(sbuf)));
- return err;
+ goto out;
}

err = evlist__mmap(evlist, opts.mmap_pages);
if (err < 0) {
pr_err("evlist__mmap: %s\n", str_error_r(errno, sbuf, sizeof(sbuf)));
- return err;
+ goto out;
}

evlist__enable(evlist);
@@ -125,7 +143,19 @@ static int bench__do_evlist_open_close(struct evlist *evlist)
evlist__munmap(evlist);
evlist__close(evlist);

- return 0;
+out:
+ if (opts.nr_threads > 1) {
+ ret = teardown_global_workqueue();
+ if (ret) {
+ destroy_workqueue_strerror(err, sbuf, sizeof(sbuf));
+ pr_err("teardown_global_workqueue: %s\n", sbuf);
+ err = ret;
+ }
+
+ perf_set_singlethreaded();
+ }
+
+ return err;
}

static int bench_evlist_open_close__run(char *evstr)
@@ -143,6 +173,7 @@ static int bench_evlist_open_close__run(char *evstr)

init_stats(&time_stats);

+ printf(" Number of workers:\t%u\n", opts.nr_threads);
printf(" Number of cpus:\t%d\n", evlist->core.cpus->nr);
printf(" Number of threads:\t%d\n", evlist->core.threads->nr);
printf(" Number of events:\t%d (%d fds)\n",
@@ -226,6 +257,9 @@ int bench_evlist_open_close(int argc, const char **argv)
exit(EXIT_FAILURE);
}

+ if (opts.nr_threads == UINT_MAX)
+ opts.nr_threads = sysconf(_SC_NPROCESSORS_ONLN);
+
err = target__validate(&opts.target);
if (err) {
target__strerror(&opts.target, err, errbuf, sizeof(errbuf));
--
2.31.1

2021-08-21 09:25:37

by Riccardo Mancini

[permalink] [raw]
Subject: [RFC PATCH v1 36/37] perf test/evlist-open-close: use inline func to convert timeval to usec

This patch introduces a new inline function to convert a timeval to
usec.
This function will also be used in the next patch.

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/bench/evlist-open-close.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/tools/perf/bench/evlist-open-close.c b/tools/perf/bench/evlist-open-close.c
index c18aa85725281795..00d0aef564f80d44 100644
--- a/tools/perf/bench/evlist-open-close.c
+++ b/tools/perf/bench/evlist-open-close.c
@@ -26,6 +26,11 @@ static int iterations = 100;
static int nr_events = 1;
static const char *event_string = "dummy";

+static inline u64 timeval2usec(struct timeval *tv)
+{
+ return tv->tv_sec * USEC_PER_SEC + tv->tv_usec;
+}
+
static struct record_opts opts = {
.sample_time = true,
.mmap_pages = UINT_MAX,
@@ -197,7 +202,7 @@ static int bench_evlist_open_close__run(char *evstr)

gettimeofday(&end, NULL);
timersub(&end, &start, &diff);
- runtime_us = diff.tv_sec * USEC_PER_SEC + diff.tv_usec;
+ runtime_us = timeval2usec(&diff);
update_stats(&time_stats, runtime_us);

evlist__delete(evlist);
--
2.31.1

2021-08-31 19:22:25

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [RFC PATCH v1 04/37] perf util: add mmap_cpu_mask__duplicate function

Em Sat, Aug 21, 2021 at 11:19:10AM +0200, Riccardo Mancini escreveu:
> This patch adds a new function in util/mmap.c to duplicate a mmap_cpu_mask.
>
> This new function will be used in the following patches.
>
> Signed-off-by: Riccardo Mancini <[email protected]>
> ---
> tools/perf/util/mmap.c | 12 ++++++++++++
> tools/perf/util/mmap.h | 3 +++
> 2 files changed, 15 insertions(+)
>
> diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
> index ab7108d22428b527..9e9a447682cc962c 100644
> --- a/tools/perf/util/mmap.c
> +++ b/tools/perf/util/mmap.c
> @@ -350,3 +350,15 @@ int perf_mmap__push(struct mmap *md, void *to,
> out:
> return rc;
> }
> +
> +int mmap_cpu_mask__duplicate(struct mmap_cpu_mask *original,
> + struct mmap_cpu_mask *clone)
> +{
> + clone->nbits = original->nbits;
> + clone->bits = bitmap_alloc(original->nbits);
> + if (!clone->bits)
> + return -ENOMEM;
> +
> + memcpy(clone->bits, original->bits, MMAP_CPU_MASK_BYTES(original));
> + return 0;
> +}
> diff --git a/tools/perf/util/mmap.h b/tools/perf/util/mmap.h
> index 9d5f589f02ae70e1..b4923e587fd7749c 100644
> --- a/tools/perf/util/mmap.h
> +++ b/tools/perf/util/mmap.h
> @@ -63,4 +63,7 @@ size_t mmap__mmap_len(struct mmap *map);
>
> void mmap_cpu_mask__scnprintf(struct mmap_cpu_mask *mask, const char *tag);
>
> +int mmap_cpu_mask__duplicate(struct mmap_cpu_mask *original,
> + struct mmap_cpu_mask *clone);
> +
> #endif /*__PERF_MMAP_H */

The jargon here is class__clone():

⬢[acme@toolbox perf]$ find tools/perf -name "*.[ch]" | xargs grep __clone
tools/perf/util/bpf-loader.c:bpf_map_op__clone(struct bpf_map_op *op)
tools/perf/util/bpf-loader.c:bpf_map_priv__clone(struct bpf_map_priv *priv)
tools/perf/util/bpf-loader.c: newop = bpf_map_op__clone(pos);
tools/perf/util/bpf-loader.c: priv = bpf_map_priv__clone(tmpl_priv);
tools/perf/util/cgroup.c: evsel = evsel__clone(pos);
tools/perf/util/evsel.c: * evsel__clone - create a new evsel copied from @orig
tools/perf/util/evsel.c:struct evsel *evsel__clone(struct evsel *orig)
tools/perf/util/evsel.h: * Please check evsel__clone() to copy them properly so that
tools/perf/util/evsel.h:struct evsel *evsel__clone(struct evsel *orig);
tools/perf/util/map.h:struct map *map__clone(struct map *map);
tools/perf/util/maps.h:int maps__clone(struct thread *thread, struct maps *parent);
tools/perf/util/parse-events.c:int parse_events_term__clone(struct parse_events_term **new,
tools/perf/util/parse-events.c: ret = parse_events_term__clone(&n, term);
tools/perf/util/parse-events.h:int parse_events_term__clone(struct parse_events_term **new,
tools/perf/util/symbol.c: struct map *m = map__clone(new_map);
tools/perf/util/thread.c:static int thread__clone_maps(struct thread *thread, struct thread *parent, bool do_maps_clone)
tools/perf/util/thread.c: return do_maps_clone ? maps__clone(thread, parent->maps) : 0;
tools/perf/util/thread.c: return thread__clone_maps(thread, parent, do_maps_clone);
tools/perf/util/map.c:struct map *map__clone(struct map *from)
tools/perf/util/map.c: struct map *before = map__clone(pos);
tools/perf/util/map.c: struct map *after = map__clone(pos);
tools/perf/util/map.c:int maps__clone(struct thread *thread, struct maps *parent)
tools/perf/util/map.c: struct map *new = map__clone(map);
tools/perf/util/pmu.c: ret = parse_events_term__clone(&cloned, term);
⬢[acme@toolbox perf]$ find tools/perf -name "*.[ch]" | xargs grep __duplicate
⬢[acme@toolbox perf]$
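
For illustration, a minimal sketch of the same helper renamed to follow that
convention (the body is unchanged from the patch quoted above):

    int mmap_cpu_mask__clone(struct mmap_cpu_mask *original,
                             struct mmap_cpu_mask *clone)
    {
            clone->nbits = original->nbits;
            clone->bits = bitmap_alloc(original->nbits);
            if (!clone->bits)
                    return -ENOMEM;

            memcpy(clone->bits, original->bits, MMAP_CPU_MASK_BYTES(original));
            return 0;
    }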

2021-08-31 19:33:34

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [RFC PATCH v1 16/37] perf evsel: save open flags in evsel

Em Sat, Aug 21, 2021 at 11:19:22AM +0200, Riccardo Mancini escreveu:
> This patch caches the flags used in perf_event_open inside evsel, so
> that they can be set in __evsel__prepare_open (this will be useful
> in following patches, when the fallback mechanisms will be handled
> outside the open itself).
> This also optimizes the code by not having to recompute them every time.

Nice, thanks, applied, I'll make available what I have in tmp.perf/core
so that you can take a look before I push it all to Linus, this week.

- Arnaldo

> Since flags are now saved in evsel, the flags argument in
> perf_event_open is removed.
>
> Signed-off-by: Riccardo Mancini <[email protected]>
> ---
> tools/perf/util/evsel.c | 24 ++++++++++++------------
> tools/perf/util/evsel.h | 1 +
> 2 files changed, 13 insertions(+), 12 deletions(-)
>
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index ddf324e2e17a0951..509a2970a94b3142 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1710,17 +1710,16 @@ static void display_attr(struct perf_event_attr *attr)
> }
>
> static int perf_event_open(struct evsel *evsel,
> - pid_t pid, int cpu, int group_fd,
> - unsigned long flags)
> + pid_t pid, int cpu, int group_fd)
> {
> int precise_ip = evsel->core.attr.precise_ip;
> int fd;
>
> while (1) {
> pr_debug2_peo("sys_perf_event_open: pid %d cpu %d group_fd %d flags %#lx",
> - pid, cpu, group_fd, flags);
> + pid, cpu, group_fd, evsel->open_flags);
>
> - fd = sys_perf_event_open(&evsel->core.attr, pid, cpu, group_fd, flags);
> + fd = sys_perf_event_open(&evsel->core.attr, pid, cpu, group_fd, evsel->open_flags);
> if (fd >= 0)
> break;
>
> @@ -1788,6 +1787,10 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
> perf_evsel__alloc_fd(&evsel->core, cpus->nr, nthreads) < 0)
> return -ENOMEM;
>
> + evsel->open_flags = PERF_FLAG_FD_CLOEXEC;
> + if (evsel->cgrp)
> + evsel->open_flags |= PERF_FLAG_PID_CGROUP;
> +
> return 0;
> }
>
> @@ -1796,7 +1799,6 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> int start_cpu, int end_cpu)
> {
> int cpu, thread, nthreads;
> - unsigned long flags = PERF_FLAG_FD_CLOEXEC;
> int pid = -1, err, old_errno;
> enum { NO_CHANGE, SET_TO_MAX, INCREASED_MAX } set_rlimit = NO_CHANGE;
>
> @@ -1815,10 +1817,8 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> else
> nthreads = threads->nr;
>
> - if (evsel->cgrp) {
> - flags |= PERF_FLAG_PID_CGROUP;
> + if (evsel->cgrp)
> pid = evsel->cgrp->fd;
> - }
>
> fallback_missing_features:
> if (perf_missing_features.weight_struct) {
> @@ -1832,7 +1832,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> evsel->core.attr.clockid = 0;
> }
> if (perf_missing_features.cloexec)
> - flags &= ~(unsigned long)PERF_FLAG_FD_CLOEXEC;
> + evsel->open_flags &= ~(unsigned long)PERF_FLAG_FD_CLOEXEC;
> if (perf_missing_features.mmap2)
> evsel->core.attr.mmap2 = 0;
> if (perf_missing_features.exclude_guest)
> @@ -1866,7 +1866,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> test_attr__ready();
>
> fd = perf_event_open(evsel, pid, cpus->map[cpu],
> - group_fd, flags);
> + group_fd);
>
> FD(evsel, cpu, thread) = fd;
>
> @@ -1874,7 +1874,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
>
> if (unlikely(test_attr__enabled)) {
> test_attr__open(&evsel->core.attr, pid, cpus->map[cpu],
> - fd, group_fd, flags);
> + fd, group_fd, evsel->open_flags);
> }
>
> if (fd < 0) {
> @@ -2012,7 +2012,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> perf_missing_features.clockid = true;
> pr_debug2_peo("switching off use_clockid\n");
> goto fallback_missing_features;
> - } else if (!perf_missing_features.cloexec && (flags & PERF_FLAG_FD_CLOEXEC)) {
> + } else if (!perf_missing_features.cloexec && (evsel->open_flags & PERF_FLAG_FD_CLOEXEC)) {
> perf_missing_features.cloexec = true;
> pr_debug2_peo("switching off cloexec flag\n");
> goto fallback_missing_features;
> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> index eabccce406886320..1c0057e80d080f2f 100644
> --- a/tools/perf/util/evsel.h
> +++ b/tools/perf/util/evsel.h
> @@ -149,6 +149,7 @@ struct evsel {
> struct bperf_leader_bpf *leader_skel;
> struct bperf_follower_bpf *follower_skel;
> };
> + unsigned long open_flags;
> };
>
> struct perf_missing_features {
> --
> 2.31.1

--

- Arnaldo

2021-08-31 19:38:44

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [RFC PATCH v1 18/37] perf evsel: add evsel__prepare_open function

Em Sat, Aug 21, 2021 at 11:19:24AM +0200, Riccardo Mancini escreveu:
> This function will prepare the evsel and disable the missing features.
> It will be used in one of the following patches.

Applied, fixed up this:

int evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
struct perf_thread_map *threads)

I.e., the second line is aligned with the parameters; please use this form
in your next patches.

- Arnaldo

> Signed-off-by: Riccardo Mancini <[email protected]>
> ---
> tools/perf/util/evsel.c | 14 ++++++++++++++
> tools/perf/util/evsel.h | 2 ++
> 2 files changed, 16 insertions(+)
>
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index f0bc89f743915800..4e9a3e62075305f1 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1827,6 +1827,20 @@ static void evsel__disable_missing_features(struct evsel *evsel)
> evsel->core.attr.sample_id_all = 0;
> }
>
> +int evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
> + struct perf_thread_map *threads)
> +{
> + int err;
> +
> + err = __evsel__prepare_open(evsel, cpus, threads);
> + if (err)
> + return err;
> +
> + evsel__disable_missing_features(evsel);
> +
> + return err;
> +}
> +
> static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> struct perf_thread_map *threads,
> int start_cpu, int end_cpu)
> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> index 1c0057e80d080f2f..58aa998e1814ac9e 100644
> --- a/tools/perf/util/evsel.h
> +++ b/tools/perf/util/evsel.h
> @@ -286,6 +286,8 @@ int evsel__open_per_thread(struct evsel *evsel, struct perf_thread_map *threads)
> int evsel__open(struct evsel *evsel, struct perf_cpu_map *cpus,
> struct perf_thread_map *threads);
> void evsel__close(struct evsel *evsel);
> +int evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
> + struct perf_thread_map *threads);
>
> struct perf_sample;
>
> --
> 2.31.1

--

- Arnaldo

2021-08-31 19:43:45

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [RFC PATCH v1 20/37] perf evsel: separate rlimit increase from evsel__open_cpu

Em Sat, Aug 21, 2021 at 11:19:26AM +0200, Riccardo Mancini escreveu:
> This is a preparatory patch for the following patches with the goal to
> separate from evlist__open_cpu the actual opening (which could be
> performed in parallel), from the existing fallback mechanisms, which
> should be handled sequentially.
>
> This patch separates the rlimit increase from evsel__open_cpu.
>
> Signed-off-by: Riccardo Mancini <[email protected]>
> ---
> tools/perf/util/evsel.c | 50 ++++++++++++++++++++++++-----------------
> tools/perf/util/evsel.h | 3 +++
> 2 files changed, 33 insertions(+), 20 deletions(-)
>
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index c393bd992322d925..916930ea31450265 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1931,13 +1931,40 @@ bool evsel__detect_missing_features(struct evsel *evsel)
> }
> }
>
> +bool evsel__increase_rlimit(enum rlimit_action *set_rlimit)
> +{
> + int old_errno;
> + struct rlimit l;
> +
> + if (*set_rlimit < INCREASED_MAX) {
> +

The above blank line is not needed
> + old_errno = errno;

And here it should be used, I'm recycling it for you :-)

> + if (getrlimit(RLIMIT_NOFILE, &l) == 0) {
> + if (*set_rlimit == NO_CHANGE)
> + l.rlim_cur = l.rlim_max;
> + else {

Also, when the else clause has {}, the if branch should have them too; I'm
fixing it.
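
I.e., presumably:

    if (*set_rlimit == NO_CHANGE) {
            l.rlim_cur = l.rlim_max;
    } else {
            l.rlim_cur = l.rlim_max + 1000;
            l.rlim_max = l.rlim_cur;
    }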

I see that you just moved things around, but this is a good time to make
these cosmetic changes.

Applying.

- Arnaldo

> + l.rlim_cur = l.rlim_max + 1000;
> + l.rlim_max = l.rlim_cur;
> + }
> + if (setrlimit(RLIMIT_NOFILE, &l) == 0) {
> + (*set_rlimit) += 1;
> + errno = old_errno;
> + return true;
> + }
> + }
> + errno = old_errno;
> + }
> +
> + return false;
> +}
> +
> static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> struct perf_thread_map *threads,
> int start_cpu, int end_cpu)
> {
> int cpu, thread, nthreads;
> int pid = -1, err, old_errno;
> - enum { NO_CHANGE, SET_TO_MAX, INCREASED_MAX } set_rlimit = NO_CHANGE;
> + enum rlimit_action set_rlimit = NO_CHANGE;
>
> err = __evsel__prepare_open(evsel, cpus, threads);
> if (err)
> @@ -2046,25 +2073,8 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> * perf stat needs between 5 and 22 fds per CPU. When we run out
> * of them try to increase the limits.
> */
> - if (err == -EMFILE && set_rlimit < INCREASED_MAX) {
> - struct rlimit l;
> -
> - old_errno = errno;
> - if (getrlimit(RLIMIT_NOFILE, &l) == 0) {
> - if (set_rlimit == NO_CHANGE)
> - l.rlim_cur = l.rlim_max;
> - else {
> - l.rlim_cur = l.rlim_max + 1000;
> - l.rlim_max = l.rlim_cur;
> - }
> - if (setrlimit(RLIMIT_NOFILE, &l) == 0) {
> - set_rlimit++;
> - errno = old_errno;
> - goto retry_open;
> - }
> - }
> - errno = old_errno;
> - }
> + if (err == -EMFILE && evsel__increase_rlimit(&set_rlimit))
> + goto retry_open;
>
> if (err != -EINVAL || cpu > 0 || thread > 0)
> goto out_close;
> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> index a83fb7f69b1ead73..bf9abd9a5cbf9852 100644
> --- a/tools/perf/util/evsel.h
> +++ b/tools/perf/util/evsel.h
> @@ -290,6 +290,9 @@ int evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
> struct perf_thread_map *threads);
> bool evsel__detect_missing_features(struct evsel *evsel);
>
> +enum rlimit_action { NO_CHANGE, SET_TO_MAX, INCREASED_MAX };
> +bool evsel__increase_rlimit(enum rlimit_action *set_rlimit);
> +
> struct perf_sample;
>
> void *evsel__rawptr(struct evsel *evsel, struct perf_sample *sample, const char *name);
> --
> 2.31.1

--

- Arnaldo

2021-08-31 19:51:02

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [RFC PATCH v1 22/37] perf evsel: move test_attr__open to success path in evsel__open_cpu

Em Sat, Aug 21, 2021 at 11:19:28AM +0200, Riccardo Mancini escreveu:
> test_attr__open ignores the fd if -1, therefore it is safe to move it to
> the success path (fd >= 0).

Nicely spotted, applied.

- Arnaldo

> Signed-off-by: Riccardo Mancini <[email protected]>
> ---
> tools/perf/util/evsel.c | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index a1a88607fd59efcb..d2b0391569286080 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -2011,11 +2011,6 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
>
> bpf_counter__install_pe(evsel, cpu, fd);
>
> - if (unlikely(test_attr__enabled)) {
> - test_attr__open(&evsel->core.attr, pid, cpus->map[cpu],
> - fd, group_fd, evsel->open_flags);
> - }
> -
> if (fd < 0) {
> err = -errno;
>
> @@ -2024,6 +2019,11 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> goto try_fallback;
> }
>
> + if (unlikely(test_attr__enabled)) {
> + test_attr__open(&evsel->core.attr, pid, cpus->map[cpu],
> + fd, group_fd, evsel->open_flags);
> + }
> +
> pr_debug2_peo(" = %d\n", fd);
>
> if (evsel->bpf_fd >= 0) {
> --
> 2.31.1

--

- Arnaldo

2021-08-31 19:56:26

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [RFC PATCH v1 25/37] perf evsel: move event open in evsel__open_cpu to separate function

Em Sat, Aug 21, 2021 at 11:19:31AM +0200, Riccardo Mancini escreveu:
> This is the final patch splitting evsel__open_cpu.
> This patch moves the entire loop code to a separate function, to be
> reused for the multithreaded code.

Are you going to use that 'enum perf_event_open_err' somewhere else?
I.e. is there a need to expose it in evsel.h?

I'm stopping at this patch to give the ones I merged so far some
testing, will now push it to tmp.perf/core.

- Arnaldo

> Signed-off-by: Riccardo Mancini <[email protected]>
> ---
> tools/perf/util/evsel.c | 142 ++++++++++++++++++++++++----------------
> tools/perf/util/evsel.h | 12 ++++
> 2 files changed, 99 insertions(+), 55 deletions(-)
>
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 2e95416b8320c6b9..e41f55a7a70ea630 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1945,6 +1945,82 @@ bool evsel__increase_rlimit(enum rlimit_action *set_rlimit)
> return false;
> }
>
> +static struct perf_event_open_result perf_event_open(struct evsel *evsel,
> + pid_t pid, int cpu, int thread,
> + struct perf_cpu_map *cpus,
> + struct perf_thread_map *threads)
> +{
> + int fd, group_fd, rc;
> + struct perf_event_open_result res;
> +
> + if (!evsel->cgrp && !evsel->core.system_wide)
> + pid = perf_thread_map__pid(threads, thread);
> +
> + group_fd = get_group_fd(evsel, cpu, thread);
> +
> + test_attr__ready();
> +
> + pr_debug2_peo("sys_perf_event_open: pid %d cpu %d group_fd %d flags %#lx",
> + pid, cpus->map[cpu], group_fd, evsel->open_flags);
> +
> + fd = sys_perf_event_open(&evsel->core.attr, pid, cpus->map[cpu],
> + group_fd, evsel->open_flags);
> +
> + FD(evsel, cpu, thread) = fd;
> + res.fd = fd;
> +
> + if (fd < 0) {
> + rc = -errno;
> +
> + pr_debug2_peo("\nsys_perf_event_open failed, error %d\n",
> + rc);
> + res.rc = rc;
> + res.err = PEO_FALLBACK;
> + return res;
> + }
> +
> + bpf_counter__install_pe(evsel, cpu, fd);
> +
> + if (unlikely(test_attr__enabled)) {
> + test_attr__open(&evsel->core.attr, pid,
> + cpus->map[cpu], fd,
> + group_fd, evsel->open_flags);
> + }
> +
> + pr_debug2_peo(" = %d\n", fd);
> +
> + if (evsel->bpf_fd >= 0) {
> + int evt_fd = fd;
> + int bpf_fd = evsel->bpf_fd;
> +
> + rc = ioctl(evt_fd,
> + PERF_EVENT_IOC_SET_BPF,
> + bpf_fd);
> + if (rc && errno != EEXIST) {
> + pr_err("failed to attach bpf fd %d: %s\n",
> + bpf_fd, strerror(errno));
> + res.rc = -EINVAL;
> + res.err = PEO_ERROR;
> + return res;
> + }
> + }
> +
> + /*
> + * If we succeeded but had to kill clockid, fail and
> + * have evsel__open_strerror() print us a nice error.
> + */
> + if (perf_missing_features.clockid ||
> + perf_missing_features.clockid_wrong) {
> + res.rc = -EINVAL;
> + res.err = PEO_ERROR;
> + return res;
> + }
> +
> + res.rc = 0;
> + res.err = PEO_SUCCESS;
> + return res;
> +}
> +
> static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> struct perf_thread_map *threads,
> int start_cpu, int end_cpu)
> @@ -1952,6 +2028,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> int cpu, thread, nthreads;
> int pid = -1, err, old_errno;
> enum rlimit_action set_rlimit = NO_CHANGE;
> + struct perf_event_open_result peo_res;
>
> err = __evsel__prepare_open(evsel, cpus, threads);
> if (err)
> @@ -1979,67 +2056,22 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> for (cpu = start_cpu; cpu < end_cpu; cpu++) {
>
> for (thread = 0; thread < nthreads; thread++) {
> - int fd, group_fd;
> retry_open:
> if (thread >= nthreads)
> break;
>
> - if (!evsel->cgrp && !evsel->core.system_wide)
> - pid = perf_thread_map__pid(threads, thread);
> -
> - group_fd = get_group_fd(evsel, cpu, thread);
> -
> - test_attr__ready();
> -
> - pr_debug2_peo("sys_perf_event_open: pid %d cpu %d group_fd %d flags %#lx",
> - pid, cpus->map[cpu], group_fd, evsel->open_flags);
> + peo_res = perf_event_open(evsel, pid, cpu, thread, cpus,
> + threads);
>
> - fd = sys_perf_event_open(&evsel->core.attr, pid, cpus->map[cpu],
> - group_fd, evsel->open_flags);
> -
> - FD(evsel, cpu, thread) = fd;
> -
> - if (fd < 0) {
> - err = -errno;
> -
> - pr_debug2_peo("\nsys_perf_event_open failed, error %d\n",
> - err);
> + err = peo_res.rc;
> + switch (peo_res.err) {
> + case PEO_SUCCESS:
> + set_rlimit = NO_CHANGE;
> + continue;
> + case PEO_FALLBACK:
> goto try_fallback;
> - }
> -
> - bpf_counter__install_pe(evsel, cpu, fd);
> -
> - if (unlikely(test_attr__enabled)) {
> - test_attr__open(&evsel->core.attr, pid, cpus->map[cpu],
> - fd, group_fd, evsel->open_flags);
> - }
> -
> - pr_debug2_peo(" = %d\n", fd);
> -
> - if (evsel->bpf_fd >= 0) {
> - int evt_fd = fd;
> - int bpf_fd = evsel->bpf_fd;
> -
> - err = ioctl(evt_fd,
> - PERF_EVENT_IOC_SET_BPF,
> - bpf_fd);
> - if (err && errno != EEXIST) {
> - pr_err("failed to attach bpf fd %d: %s\n",
> - bpf_fd, strerror(errno));
> - err = -EINVAL;
> - goto out_close;
> - }
> - }
> -
> - set_rlimit = NO_CHANGE;
> -
> - /*
> - * If we succeeded but had to kill clockid, fail and
> - * have evsel__open_strerror() print us a nice error.
> - */
> - if (perf_missing_features.clockid ||
> - perf_missing_features.clockid_wrong) {
> - err = -EINVAL;
> + default:
> + case PEO_ERROR:
> goto out_close;
> }
> }
> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> index 0a245afab2d87d74..8c9827a93ac001a7 100644
> --- a/tools/perf/util/evsel.h
> +++ b/tools/perf/util/evsel.h
> @@ -282,6 +282,18 @@ int evsel__enable(struct evsel *evsel);
> int evsel__disable(struct evsel *evsel);
> int evsel__disable_cpu(struct evsel *evsel, int cpu);
>
> +enum perf_event_open_err {
> + PEO_SUCCESS,
> + PEO_FALLBACK,
> + PEO_ERROR
> +};
> +
> +struct perf_event_open_result {
> + enum perf_event_open_err err;
> + int rc;
> + int fd;
> +};
> +
> int evsel__open_per_cpu(struct evsel *evsel, struct perf_cpu_map *cpus, int cpu);
> int evsel__open_per_thread(struct evsel *evsel, struct perf_thread_map *threads);
> int evsel__open(struct evsel *evsel, struct perf_cpu_map *cpus,
> --
> 2.31.1

--

- Arnaldo

2021-08-31 20:08:16

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [RFC PATCH v1 27/37] perf evlist: add evlist__for_each_entry_from macro

Em Sat, Aug 21, 2021 at 11:19:33AM +0200, Riccardo Mancini escreveu:
> This patch adds a new iteration macro for evlist that resumes iteration
> from a given evsel in the evlist.
>
> This macro will be used in the next patch.

Thanks, applied.

- Arnaldo


> Signed-off-by: Riccardo Mancini <[email protected]>
> ---
> tools/perf/util/evlist.h | 16 ++++++++++++++++
> 1 file changed, 16 insertions(+)
>
> diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
> index 5f24a45d4e3cf30a..b0c2da0f9755b2d1 100644
> --- a/tools/perf/util/evlist.h
> +++ b/tools/perf/util/evlist.h
> @@ -275,6 +275,22 @@ void evlist__to_front(struct evlist *evlist, struct evsel *move_evsel);
> #define evlist__for_each_entry_continue(evlist, evsel) \
> __evlist__for_each_entry_continue(&(evlist)->core.entries, evsel)
>
> +/**
> + * __evlist__for_each_entry_from - continue iteration from @evsel (included)
> + * @list: list_head instance to iterate
> + * @evsel: struct evsel iterator
> + */
> +#define __evlist__for_each_entry_from(list, evsel) \
> + list_for_each_entry_from(evsel, list, core.node)
> +
> +/**
> + * evlist__for_each_entry_from - continue iteration from @evsel (included)
> + * @evlist: evlist instance to iterate
> + * @evsel: struct evsel iterator
> + */
> +#define evlist__for_each_entry_from(evlist, evsel) \
> + __evlist__for_each_entry_from(&(evlist)->core.entries, evsel)
> +
> /**
> * __evlist__for_each_entry_reverse - iterate thru all the evsels in reverse order
> * @list: list_head instance to iterate
> --
> 2.31.1
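
As a usage sketch (hypothetical caller, not part of this series; the helper
name and the per-evsel cpu/thread maps are assumptions), resuming an open
from the evsel where a previous pass stopped could look like:

    static int evlist__open_from(struct evlist *evlist, struct evsel *first)
    {
            struct evsel *evsel = first;
            int err;

            /* iteration starts at @first itself, not at the next entry */
            evlist__for_each_entry_from(evlist, evsel) {
                    err = evsel__open(evsel, evsel->core.cpus,
                                      evsel->core.threads);
                    if (err)
                            return err;
            }
            return 0;
    }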

--

- Arnaldo

2021-08-31 21:17:09

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [RFC PATCH v1 31/37] tools lib/subcmd: add OPT_UINTEGER_OPTARG option type

Em Sat, Aug 21, 2021 at 11:19:37AM +0200, Riccardo Mancini escreveu:
> This patch adds OPT_UINTEGER_OPTARG, which is the same as OPT_UINTEGER,
> but also makes it possible to use the option without any value, setting
> the variable to a default value, d.

Thanks, applied, just to erode the patch kit a bit.

- Arnaldo


> Signed-off-by: Riccardo Mancini <[email protected]>
> ---
> tools/lib/subcmd/parse-options.h | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/tools/lib/subcmd/parse-options.h b/tools/lib/subcmd/parse-options.h
> index d2414144eb8c9927..41b9b942504d398e 100644
> --- a/tools/lib/subcmd/parse-options.h
> +++ b/tools/lib/subcmd/parse-options.h
> @@ -133,6 +133,7 @@ struct option {
> #define OPT_SET_PTR(s, l, v, h, p) { .type = OPTION_SET_PTR, .short_name = (s), .long_name = (l), .value = (v), .help = (h), .defval = (p) }
> #define OPT_INTEGER(s, l, v, h) { .type = OPTION_INTEGER, .short_name = (s), .long_name = (l), .value = check_vtype(v, int *), .help = (h) }
> #define OPT_UINTEGER(s, l, v, h) { .type = OPTION_UINTEGER, .short_name = (s), .long_name = (l), .value = check_vtype(v, unsigned int *), .help = (h) }
> +#define OPT_UINTEGER_OPTARG(s, l, v, d, h) { .type = OPTION_UINTEGER, .short_name = (s), .long_name = (l), .value = check_vtype(v, unsigned int *), .help = (h), .flags = PARSE_OPT_OPTARG, .defval = (intptr_t)(d) }
> #define OPT_LONG(s, l, v, h) { .type = OPTION_LONG, .short_name = (s), .long_name = (l), .value = check_vtype(v, long *), .help = (h) }
> #define OPT_ULONG(s, l, v, h) { .type = OPTION_ULONG, .short_name = (s), .long_name = (l), .value = check_vtype(v, unsigned long *), .help = (h) }
> #define OPT_U64(s, l, v, h) { .type = OPTION_U64, .short_name = (s), .long_name = (l), .value = check_vtype(v, u64 *), .help = (h) }
> --
> 2.31.1
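
A usage sketch (the option and variable names are illustrative only,
mirroring the '-j' handling elsewhere in this series):

    static unsigned int nr_threads;

    static const struct option options[] = {
            /* bare '-j' sets nr_threads to UINT_MAX, which the caller can
             * later translate to the number of online CPUs */
            OPT_UINTEGER_OPTARG('j', "threads", &nr_threads, UINT_MAX,
                                "number of worker threads"),
            OPT_END(),
    };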

--

- Arnaldo

2021-08-31 21:40:56

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [RFC PATCH v1 14/37] perf evsel: remove retry_sample_id goto label

Em Sat, Aug 21, 2021 at 11:19:20AM +0200, Riccardo Mancini escreveu:
> As far as I can tell, there is no good reason, apart from optimization,
> to have the retry_sample_id label separate from fallback_missing_features.
> Probably, this label was added to avoid reapplying the workarounds for
> missing features that had already been detected.
> However, missing features added later have not used this optimization,
> always jumping to fallback_missing_features and reapplying all the
> missing-feature workarounds.
>
> This patch removes that label, replacing it with
> fallback_missing_features.

Fair enough, applied,

- Arnaldo

> Signed-off-by: Riccardo Mancini <[email protected]>
> ---
> tools/perf/util/evsel.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index f61e5dd53f5d2859..7b4bb3229a16524e 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1825,7 +1825,6 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> evsel->core.attr.bpf_event = 0;
> if (perf_missing_features.branch_hw_idx)
> evsel->core.attr.branch_sample_type &= ~PERF_SAMPLE_BRANCH_HW_INDEX;
> -retry_sample_id:
> if (perf_missing_features.sample_id_all)
> evsel->core.attr.sample_id_all = 0;
>
> @@ -2006,7 +2005,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> } else if (!perf_missing_features.sample_id_all) {
> perf_missing_features.sample_id_all = true;
> pr_debug2_peo("switching off sample_id_all\n");
> - goto retry_sample_id;
> + goto fallback_missing_features;
> } else if (!perf_missing_features.lbr_flags &&
> (evsel->core.attr.branch_sample_type &
> (PERF_SAMPLE_BRANCH_NO_CYCLES |
> --
> 2.31.1

--

- Arnaldo

2021-08-31 21:42:54

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [RFC PATCH v1 15/37] perf evsel: separate open preparation from open itself

Em Sat, Aug 21, 2021 at 11:19:21AM +0200, Riccardo Mancini escreveu:
> This is a preparatory patch for the following patches with the goal to
> separate in evlist__open_cpu the actual perf_event_open, which could be
> performed in parallel, from the existing fallback mechanisms, which
> should be handled sequentially.

Thanks, applied, as the end result is equivalent and we erode this
patch kit a bit more.

- Arnaldo

> This patch separates the first lines of evsel__open_cpu into a new
> __evsel__prepare_open function.
>
> Signed-off-by: Riccardo Mancini <[email protected]>
> ---
> tools/perf/util/evsel.c | 45 +++++++++++++++++++++++++++++++----------
> 1 file changed, 34 insertions(+), 11 deletions(-)
>
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 7b4bb3229a16524e..ddf324e2e17a0951 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1746,22 +1746,20 @@ static int perf_event_open(struct evsel *evsel,
> return fd;
> }
>
> -static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> - struct perf_thread_map *threads,
> - int start_cpu, int end_cpu)
> +
> +static struct perf_cpu_map *empty_cpu_map;
> +static struct perf_thread_map *empty_thread_map;
> +
> +static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
> + struct perf_thread_map *threads)
> {
> - int cpu, thread, nthreads;
> - unsigned long flags = PERF_FLAG_FD_CLOEXEC;
> - int pid = -1, err, old_errno;
> - enum { NO_CHANGE, SET_TO_MAX, INCREASED_MAX } set_rlimit = NO_CHANGE;
> + int nthreads;
>
> if ((perf_missing_features.write_backward && evsel->core.attr.write_backward) ||
> (perf_missing_features.aux_output && evsel->core.attr.aux_output))
> return -EINVAL;
>
> if (cpus == NULL) {
> - static struct perf_cpu_map *empty_cpu_map;
> -
> if (empty_cpu_map == NULL) {
> empty_cpu_map = perf_cpu_map__dummy_new();
> if (empty_cpu_map == NULL)
> @@ -1772,8 +1770,6 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> }
>
> if (threads == NULL) {
> - static struct perf_thread_map *empty_thread_map;
> -
> if (empty_thread_map == NULL) {
> empty_thread_map = thread_map__new_by_tid(-1);
> if (empty_thread_map == NULL)
> @@ -1792,6 +1788,33 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> perf_evsel__alloc_fd(&evsel->core, cpus->nr, nthreads) < 0)
> return -ENOMEM;
>
> + return 0;
> +}
> +
> +static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> + struct perf_thread_map *threads,
> + int start_cpu, int end_cpu)
> +{
> + int cpu, thread, nthreads;
> + unsigned long flags = PERF_FLAG_FD_CLOEXEC;
> + int pid = -1, err, old_errno;
> + enum { NO_CHANGE, SET_TO_MAX, INCREASED_MAX } set_rlimit = NO_CHANGE;
> +
> + err = __evsel__prepare_open(evsel, cpus, threads);
> + if (err)
> + return err;
> +
> + if (cpus == NULL)
> + cpus = empty_cpu_map;
> +
> + if (threads == NULL)
> + threads = empty_thread_map;
> +
> + if (evsel->core.system_wide)
> + nthreads = 1;
> + else
> + nthreads = threads->nr;
> +
> if (evsel->cgrp) {
> flags |= PERF_FLAG_PID_CGROUP;
> pid = evsel->cgrp->fd;
> --
> 2.31.1

--

- Arnaldo

2021-08-31 21:47:35

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [RFC PATCH v1 19/37] perf evsel: separate missing feature detection from evsel__open_cpu

Em Sat, Aug 21, 2021 at 11:19:25AM +0200, Riccardo Mancini escreveu:
> This is a preparatory patch for the following patches with the goal to
> separate in evlist__open_cpu the actual opening, which could be
> performed in parallel, from the existing fallback mechanisms, which
> should be handled sequentially.
>
> This patch separates the missing feature detection in evsel__open_cpu
> into a new evsel__detect_missing_features function.

Thanks, applied.

- Arnaldo


> Signed-off-by: Riccardo Mancini <[email protected]>
> ---
> tools/perf/util/evsel.c | 174 +++++++++++++++++++++-------------------
> tools/perf/util/evsel.h | 1 +
> 2 files changed, 92 insertions(+), 83 deletions(-)
>
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 4e9a3e62075305f1..c393bd992322d925 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1841,6 +1841,96 @@ int evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
> return err;
> }
>
> +bool evsel__detect_missing_features(struct evsel *evsel)
> +{
> + /*
> + * Must probe features in the order they were added to the
> + * perf_event_attr interface.
> + */
> + if (!perf_missing_features.weight_struct &&
> + (evsel->core.attr.sample_type & PERF_SAMPLE_WEIGHT_STRUCT)) {
> + perf_missing_features.weight_struct = true;
> + pr_debug2("switching off weight struct support\n");
> + return true;
> + } else if (!perf_missing_features.code_page_size &&
> + (evsel->core.attr.sample_type & PERF_SAMPLE_CODE_PAGE_SIZE)) {
> + perf_missing_features.code_page_size = true;
> + pr_debug2_peo("Kernel has no PERF_SAMPLE_CODE_PAGE_SIZE support, bailing out\n");
> + return false;
> + } else if (!perf_missing_features.data_page_size &&
> + (evsel->core.attr.sample_type & PERF_SAMPLE_DATA_PAGE_SIZE)) {
> + perf_missing_features.data_page_size = true;
> + pr_debug2_peo("Kernel has no PERF_SAMPLE_DATA_PAGE_SIZE support, bailing out\n");
> + return false;
> + } else if (!perf_missing_features.cgroup && evsel->core.attr.cgroup) {
> + perf_missing_features.cgroup = true;
> + pr_debug2_peo("Kernel has no cgroup sampling support, bailing out\n");
> + return false;
> + } else if (!perf_missing_features.branch_hw_idx &&
> + (evsel->core.attr.branch_sample_type & PERF_SAMPLE_BRANCH_HW_INDEX)) {
> + perf_missing_features.branch_hw_idx = true;
> + pr_debug2("switching off branch HW index support\n");
> + return true;
> + } else if (!perf_missing_features.aux_output && evsel->core.attr.aux_output) {
> + perf_missing_features.aux_output = true;
> + pr_debug2_peo("Kernel has no attr.aux_output support, bailing out\n");
> + return false;
> + } else if (!perf_missing_features.bpf && evsel->core.attr.bpf_event) {
> + perf_missing_features.bpf = true;
> + pr_debug2_peo("switching off bpf_event\n");
> + return true;
> + } else if (!perf_missing_features.ksymbol && evsel->core.attr.ksymbol) {
> + perf_missing_features.ksymbol = true;
> + pr_debug2_peo("switching off ksymbol\n");
> + return true;
> + } else if (!perf_missing_features.write_backward && evsel->core.attr.write_backward) {
> + perf_missing_features.write_backward = true;
> + pr_debug2_peo("switching off write_backward\n");
> + return false;
> + } else if (!perf_missing_features.clockid_wrong && evsel->core.attr.use_clockid) {
> + perf_missing_features.clockid_wrong = true;
> + pr_debug2_peo("switching off clockid\n");
> + return true;
> + } else if (!perf_missing_features.clockid && evsel->core.attr.use_clockid) {
> + perf_missing_features.clockid = true;
> + pr_debug2_peo("switching off use_clockid\n");
> + return true;
> + } else if (!perf_missing_features.cloexec && (evsel->open_flags & PERF_FLAG_FD_CLOEXEC)) {
> + perf_missing_features.cloexec = true;
> + pr_debug2_peo("switching off cloexec flag\n");
> + return true;
> + } else if (!perf_missing_features.mmap2 && evsel->core.attr.mmap2) {
> + perf_missing_features.mmap2 = true;
> + pr_debug2_peo("switching off mmap2\n");
> + return true;
> + } else if (!perf_missing_features.exclude_guest &&
> + (evsel->core.attr.exclude_guest || evsel->core.attr.exclude_host)) {
> + perf_missing_features.exclude_guest = true;
> + pr_debug2_peo("switching off exclude_guest, exclude_host\n");
> + return true;
> + } else if (!perf_missing_features.sample_id_all) {
> + perf_missing_features.sample_id_all = true;
> + pr_debug2_peo("switching off sample_id_all\n");
> + return true;
> + } else if (!perf_missing_features.lbr_flags &&
> + (evsel->core.attr.branch_sample_type &
> + (PERF_SAMPLE_BRANCH_NO_CYCLES |
> + PERF_SAMPLE_BRANCH_NO_FLAGS))) {
> + perf_missing_features.lbr_flags = true;
> + pr_debug2_peo("switching off branch sample type no (cycles/flags)\n");
> + return true;
> + } else if (!perf_missing_features.group_read &&
> + evsel->core.attr.inherit &&
> + (evsel->core.attr.read_format & PERF_FORMAT_GROUP) &&
> + evsel__is_group_leader(evsel)) {
> + perf_missing_features.group_read = true;
> + pr_debug2_peo("switching off group read\n");
> + return true;
> + } else {
> + return false;
> + }
> +}
> +
> static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> struct perf_thread_map *threads,
> int start_cpu, int end_cpu)
> @@ -1979,90 +2069,8 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> if (err != -EINVAL || cpu > 0 || thread > 0)
> goto out_close;
>
> - /*
> - * Must probe features in the order they were added to the
> - * perf_event_attr interface.
> - */
> - if (!perf_missing_features.weight_struct &&
> - (evsel->core.attr.sample_type & PERF_SAMPLE_WEIGHT_STRUCT)) {
> - perf_missing_features.weight_struct = true;
> - pr_debug2("switching off weight struct support\n");
> + if (evsel__detect_missing_features(evsel))
> goto fallback_missing_features;
> - } else if (!perf_missing_features.code_page_size &&
> - (evsel->core.attr.sample_type & PERF_SAMPLE_CODE_PAGE_SIZE)) {
> - perf_missing_features.code_page_size = true;
> - pr_debug2_peo("Kernel has no PERF_SAMPLE_CODE_PAGE_SIZE support, bailing out\n");
> - goto out_close;
> - } else if (!perf_missing_features.data_page_size &&
> - (evsel->core.attr.sample_type & PERF_SAMPLE_DATA_PAGE_SIZE)) {
> - perf_missing_features.data_page_size = true;
> - pr_debug2_peo("Kernel has no PERF_SAMPLE_DATA_PAGE_SIZE support, bailing out\n");
> - goto out_close;
> - } else if (!perf_missing_features.cgroup && evsel->core.attr.cgroup) {
> - perf_missing_features.cgroup = true;
> - pr_debug2_peo("Kernel has no cgroup sampling support, bailing out\n");
> - goto out_close;
> - } else if (!perf_missing_features.branch_hw_idx &&
> - (evsel->core.attr.branch_sample_type & PERF_SAMPLE_BRANCH_HW_INDEX)) {
> - perf_missing_features.branch_hw_idx = true;
> - pr_debug2("switching off branch HW index support\n");
> - goto fallback_missing_features;
> - } else if (!perf_missing_features.aux_output && evsel->core.attr.aux_output) {
> - perf_missing_features.aux_output = true;
> - pr_debug2_peo("Kernel has no attr.aux_output support, bailing out\n");
> - goto out_close;
> - } else if (!perf_missing_features.bpf && evsel->core.attr.bpf_event) {
> - perf_missing_features.bpf = true;
> - pr_debug2_peo("switching off bpf_event\n");
> - goto fallback_missing_features;
> - } else if (!perf_missing_features.ksymbol && evsel->core.attr.ksymbol) {
> - perf_missing_features.ksymbol = true;
> - pr_debug2_peo("switching off ksymbol\n");
> - goto fallback_missing_features;
> - } else if (!perf_missing_features.write_backward && evsel->core.attr.write_backward) {
> - perf_missing_features.write_backward = true;
> - pr_debug2_peo("switching off write_backward\n");
> - goto out_close;
> - } else if (!perf_missing_features.clockid_wrong && evsel->core.attr.use_clockid) {
> - perf_missing_features.clockid_wrong = true;
> - pr_debug2_peo("switching off clockid\n");
> - goto fallback_missing_features;
> - } else if (!perf_missing_features.clockid && evsel->core.attr.use_clockid) {
> - perf_missing_features.clockid = true;
> - pr_debug2_peo("switching off use_clockid\n");
> - goto fallback_missing_features;
> - } else if (!perf_missing_features.cloexec && (evsel->open_flags & PERF_FLAG_FD_CLOEXEC)) {
> - perf_missing_features.cloexec = true;
> - pr_debug2_peo("switching off cloexec flag\n");
> - goto fallback_missing_features;
> - } else if (!perf_missing_features.mmap2 && evsel->core.attr.mmap2) {
> - perf_missing_features.mmap2 = true;
> - pr_debug2_peo("switching off mmap2\n");
> - goto fallback_missing_features;
> - } else if (!perf_missing_features.exclude_guest &&
> - (evsel->core.attr.exclude_guest || evsel->core.attr.exclude_host)) {
> - perf_missing_features.exclude_guest = true;
> - pr_debug2_peo("switching off exclude_guest, exclude_host\n");
> - goto fallback_missing_features;
> - } else if (!perf_missing_features.sample_id_all) {
> - perf_missing_features.sample_id_all = true;
> - pr_debug2_peo("switching off sample_id_all\n");
> - goto fallback_missing_features;
> - } else if (!perf_missing_features.lbr_flags &&
> - (evsel->core.attr.branch_sample_type &
> - (PERF_SAMPLE_BRANCH_NO_CYCLES |
> - PERF_SAMPLE_BRANCH_NO_FLAGS))) {
> - perf_missing_features.lbr_flags = true;
> - pr_debug2_peo("switching off branch sample type no (cycles/flags)\n");
> - goto fallback_missing_features;
> - } else if (!perf_missing_features.group_read &&
> - evsel->core.attr.inherit &&
> - (evsel->core.attr.read_format & PERF_FORMAT_GROUP) &&
> - evsel__is_group_leader(evsel)) {
> - perf_missing_features.group_read = true;
> - pr_debug2_peo("switching off group read\n");
> - goto fallback_missing_features;
> - }
> out_close:
> if (err)
> threads->err_thread = thread;
> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> index 58aa998e1814ac9e..a83fb7f69b1ead73 100644
> --- a/tools/perf/util/evsel.h
> +++ b/tools/perf/util/evsel.h
> @@ -288,6 +288,7 @@ int evsel__open(struct evsel *evsel, struct perf_cpu_map *cpus,
> void evsel__close(struct evsel *evsel);
> int evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
> struct perf_thread_map *threads);
> +bool evsel__detect_missing_features(struct evsel *evsel);
>
> struct perf_sample;
>
> --
> 2.31.1

--

- Arnaldo

2021-08-31 21:50:14

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [RFC PATCH v1 23/37] perf evsel: move bpf_counter__install_pe to success path in evsel__open_cpu

Em Sat, Aug 21, 2021 at 11:19:29AM +0200, Riccardo Mancini escreveu:
> I don't see why bpf_counter__install_pe should get called even if fd=-1,
> so I'm moving it to the success path.
>
> This will be useful in following patches to separate the actual open and
> the related operations from the fallback mechanisms.

Looks sane, applied.

Next time please use git blame to find the author and add him to the CC
list, like I just did, so that the mystery can be unveiled or a duh!
uttered :-)
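
For instance, something along these lines (a generic sketch, not the exact
invocation):

    $ git blame -L '/bpf_counter__install_pe/,+1' tools/perf/util/evsel.c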

- Arnaldo

> Signed-off-by: Riccardo Mancini <[email protected]>
> ---
> tools/perf/util/evsel.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index d2b0391569286080..3e556afed8dd396c 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -2009,8 +2009,6 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
>
> FD(evsel, cpu, thread) = fd;
>
> - bpf_counter__install_pe(evsel, cpu, fd);
> -
> if (fd < 0) {
> err = -errno;
>
> @@ -2019,6 +2017,8 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> goto try_fallback;
> }
>
> + bpf_counter__install_pe(evsel, cpu, fd);
> +
> if (unlikely(test_attr__enabled)) {
> test_attr__open(&evsel->core.attr, pid, cpus->map[cpu],
> fd, group_fd, evsel->open_flags);
> --
> 2.31.1

--

- Arnaldo

2021-08-31 21:50:17

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [RFC PATCH v1 24/37] perf evsel: handle precise_ip fallback in evsel__open_cpu

Em Sat, Aug 21, 2021 at 11:19:30AM +0200, Riccardo Mancini escreveu:
> This is another patch in the effort to separate the fallback mechanisms
> from the open itself.
>
> In case of precise_ip fallback, the original precise_ip will be stored
> in the evsel (it was stored in a local variable) and the open will be
> retried. Since the precise_ip fallback will be the first in the chain of
> fallbacks, there should be no functional change with this patch.

Thanks, looks ok, applied.

- Arnaldo


> Signed-off-by: Riccardo Mancini <[email protected]>
> ---
> tools/perf/util/evsel.c | 59 ++++++++++++++++++-----------------------
> tools/perf/util/evsel.h | 2 ++
> 2 files changed, 28 insertions(+), 33 deletions(-)
>
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 3e556afed8dd396c..2e95416b8320c6b9 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1709,42 +1709,29 @@ static void display_attr(struct perf_event_attr *attr)
> }
> }
>
> -static int perf_event_open(struct evsel *evsel,
> - pid_t pid, int cpu, int group_fd)
> +bool evsel__precise_ip_fallback(struct evsel *evsel)
> {
> - int precise_ip = evsel->core.attr.precise_ip;
> - int fd;
> -
> - while (1) {
> - pr_debug2_peo("sys_perf_event_open: pid %d cpu %d group_fd %d flags %#lx",
> - pid, cpu, group_fd, evsel->open_flags);
> -
> - fd = sys_perf_event_open(&evsel->core.attr, pid, cpu, group_fd, evsel->open_flags);
> - if (fd >= 0)
> - break;
> -
> - /* Do not try less precise if not requested. */
> - if (!evsel->precise_max)
> - break;
> -
> - /*
> - * We tried all the precise_ip values, and it's
> - * still failing, so leave it to standard fallback.
> - */
> - if (!evsel->core.attr.precise_ip) {
> - evsel->core.attr.precise_ip = precise_ip;
> - break;
> - }
> + /* Do not try less precise if not requested. */
> + if (!evsel->precise_max)
> + return false;
>
> - pr_debug2_peo("\nsys_perf_event_open failed, error %d\n", -ENOTSUP);
> - evsel->core.attr.precise_ip--;
> - pr_debug2_peo("decreasing precise_ip by one (%d)\n", evsel->core.attr.precise_ip);
> - display_attr(&evsel->core.attr);
> + /*
> + * We tried all the precise_ip values, and it's
> + * still failing, so leave it to standard fallback.
> + */
> + if (!evsel->core.attr.precise_ip) {
> + evsel->core.attr.precise_ip = evsel->precise_ip_original;
> + return false;
> }
>
> - return fd;
> -}
> + if (!evsel->precise_ip_original)
> + evsel->precise_ip_original = evsel->core.attr.precise_ip;
>
> + evsel->core.attr.precise_ip--;
> + pr_debug2_peo("decreasing precise_ip by one (%d)\n", evsel->core.attr.precise_ip);
> + display_attr(&evsel->core.attr);
> + return true;
> +}
>
> static struct perf_cpu_map *empty_cpu_map;
> static struct perf_thread_map *empty_thread_map;
> @@ -2004,8 +1991,11 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
>
> test_attr__ready();
>
> - fd = perf_event_open(evsel, pid, cpus->map[cpu],
> - group_fd);
> + pr_debug2_peo("sys_perf_event_open: pid %d cpu %d group_fd %d flags %#lx",
> + pid, cpus->map[cpu], group_fd, evsel->open_flags);
> +
> + fd = sys_perf_event_open(&evsel->core.attr, pid, cpus->map[cpu],
> + group_fd, evsel->open_flags);
>
> FD(evsel, cpu, thread) = fd;
>
> @@ -2058,6 +2048,9 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> return 0;
>
> try_fallback:
> + if (evsel__precise_ip_fallback(evsel))
> + goto retry_open;
> +
> if (evsel__ignore_missing_thread(evsel, cpus->nr, cpu, threads, thread, err)) {
> /* We just removed 1 thread, so lower the upper nthreads limit. */
> nthreads--;
> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> index 661d2677a19b6248..0a245afab2d87d74 100644
> --- a/tools/perf/util/evsel.h
> +++ b/tools/perf/util/evsel.h
> @@ -150,6 +150,7 @@ struct evsel {
> struct bperf_follower_bpf *follower_skel;
> };
> unsigned long open_flags;
> + int precise_ip_original;
> };
>
> struct perf_missing_features {
> @@ -297,6 +298,7 @@ bool evsel__ignore_missing_thread(struct evsel *evsel,
> int nr_cpus, int cpu,
> struct perf_thread_map *threads,
> int thread, int err);
> +bool evsel__precise_ip_fallback(struct evsel *evsel);
>
> struct perf_sample;
>
> --
> 2.31.1

--

- Arnaldo

2021-08-31 22:57:04

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [RFC PATCH v1 17/37] perf evsel: separate missing feature disabling from evsel__open_cpu

Em Sat, Aug 21, 2021 at 11:19:23AM +0200, Riccardo Mancini escreveu:
> This is a preparatory patch for the following patches with the goal to
> separate in evlist__open_cpu the actual opening, which could be
> performed in parallel, from the existing fallback mechanisms, which
> should be handled sequentially.
>
> This patch separates the disabling of missing features from
> evlist__open_cpu into a new function evsel__disable_missing_features.


Thanks, applied as the end result is the same.

- Arnaldo


> Signed-off-by: Riccardo Mancini <[email protected]>
> ---
> tools/perf/util/evsel.c | 57 ++++++++++++++++++++++-------------------
> 1 file changed, 31 insertions(+), 26 deletions(-)
>
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 509a2970a94b3142..f0bc89f743915800 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1794,33 +1794,8 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
> return 0;
> }
>
> -static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> - struct perf_thread_map *threads,
> - int start_cpu, int end_cpu)
> +static void evsel__disable_missing_features(struct evsel *evsel)
> {
> - int cpu, thread, nthreads;
> - int pid = -1, err, old_errno;
> - enum { NO_CHANGE, SET_TO_MAX, INCREASED_MAX } set_rlimit = NO_CHANGE;
> -
> - err = __evsel__prepare_open(evsel, cpus, threads);
> - if (err)
> - return err;
> -
> - if (cpus == NULL)
> - cpus = empty_cpu_map;
> -
> - if (threads == NULL)
> - threads = empty_thread_map;
> -
> - if (evsel->core.system_wide)
> - nthreads = 1;
> - else
> - nthreads = threads->nr;
> -
> - if (evsel->cgrp)
> - pid = evsel->cgrp->fd;
> -
> -fallback_missing_features:
> if (perf_missing_features.weight_struct) {
> evsel__set_sample_bit(evsel, WEIGHT);
> evsel__reset_sample_bit(evsel, WEIGHT_STRUCT);
> @@ -1850,6 +1825,36 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> evsel->core.attr.branch_sample_type &= ~PERF_SAMPLE_BRANCH_HW_INDEX;
> if (perf_missing_features.sample_id_all)
> evsel->core.attr.sample_id_all = 0;
> +}
> +
> +static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> + struct perf_thread_map *threads,
> + int start_cpu, int end_cpu)
> +{
> + int cpu, thread, nthreads;
> + int pid = -1, err, old_errno;
> + enum { NO_CHANGE, SET_TO_MAX, INCREASED_MAX } set_rlimit = NO_CHANGE;
> +
> + err = __evsel__prepare_open(evsel, cpus, threads);
> + if (err)
> + return err;
> +
> + if (cpus == NULL)
> + cpus = empty_cpu_map;
> +
> + if (threads == NULL)
> + threads = empty_thread_map;
> +
> + if (evsel->core.system_wide)
> + nthreads = 1;
> + else
> + nthreads = threads->nr;
> +
> + if (evsel->cgrp)
> + pid = evsel->cgrp->fd;
> +
> +fallback_missing_features:
> + evsel__disable_missing_features(evsel);
>
> display_attr(&evsel->core.attr);
>
> --
> 2.31.1

--

- Arnaldo

2021-08-31 22:59:08

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [RFC PATCH v1 21/37] perf evsel: move ignore_missing_thread to fallback code

Em Sat, Aug 21, 2021 at 11:19:27AM +0200, Riccardo Mancini escreveu:
> This patch moves ignore_missing_thread outside the perf_event_open loop.
> Doing so, we need to move the retry_open label a few places higher, with
> minimal impact. Furthermore, thread need not be decreased since it won't
> get increased by the for loop (as we're jumping back inside it), but we
> need to check that the nthreads decrease didn't put thread out of range.
>
> The goal is to have fallbacks handled in one place only, since in the
> future parallel code, these would be handled separately.

Thanks, looks ok, applied.

- Arnaldo


> Signed-off-by: Riccardo Mancini <[email protected]>
> ---
> tools/perf/util/evsel.c | 29 +++++++++++++----------------
> tools/perf/util/evsel.h | 5 +++++
> 2 files changed, 18 insertions(+), 16 deletions(-)
>
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 916930ea31450265..a1a88607fd59efcb 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1656,7 +1656,7 @@ static int update_fds(struct evsel *evsel,
> return 0;
> }
>
> -static bool ignore_missing_thread(struct evsel *evsel,
> +bool evsel__ignore_missing_thread(struct evsel *evsel,
> int nr_cpus, int cpu,
> struct perf_thread_map *threads,
> int thread, int err)
> @@ -1993,12 +1993,15 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
>
> for (thread = 0; thread < nthreads; thread++) {
> int fd, group_fd;
> +retry_open:
> + if (thread >= nthreads)
> + break;
>
> if (!evsel->cgrp && !evsel->core.system_wide)
> pid = perf_thread_map__pid(threads, thread);
>
> group_fd = get_group_fd(evsel, cpu, thread);
> -retry_open:
> +
> test_attr__ready();
>
> fd = perf_event_open(evsel, pid, cpus->map[cpu],
> @@ -2016,20 +2019,6 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> if (fd < 0) {
> err = -errno;
>
> - if (ignore_missing_thread(evsel, cpus->nr, cpu, threads, thread, err)) {
> - /*
> - * We just removed 1 thread, so take a step
> - * back on thread index and lower the upper
> - * nthreads limit.
> - */
> - nthreads--;
> - thread--;
> -
> - /* ... and pretend like nothing have happened. */
> - err = 0;
> - continue;
> - }
> -
> pr_debug2_peo("\nsys_perf_event_open failed, error %d\n",
> err);
> goto try_fallback;
> @@ -2069,6 +2058,14 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> return 0;
>
> try_fallback:
> + if (evsel__ignore_missing_thread(evsel, cpus->nr, cpu, threads, thread, err)) {
> + /* We just removed 1 thread, so lower the upper nthreads limit. */
> + nthreads--;
> +
> + /* ... and pretend like nothing have happened. */
> + err = 0;
> + goto retry_open;
> + }
> /*
> * perf stat needs between 5 and 22 fds per CPU. When we run out
> * of them try to increase the limits.
> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> index bf9abd9a5cbf9852..661d2677a19b6248 100644
> --- a/tools/perf/util/evsel.h
> +++ b/tools/perf/util/evsel.h
> @@ -293,6 +293,11 @@ bool evsel__detect_missing_features(struct evsel *evsel);
> enum rlimit_action { NO_CHANGE, SET_TO_MAX, INCREASED_MAX };
> bool evsel__increase_rlimit(enum rlimit_action *set_rlimit);
>
> +bool evsel__ignore_missing_thread(struct evsel *evsel,
> + int nr_cpus, int cpu,
> + struct perf_thread_map *threads,
> + int thread, int err);
> +
> struct perf_sample;
>
> void *evsel__rawptr(struct evsel *evsel, struct perf_sample *sample, const char *name);
> --
> 2.31.1

--

- Arnaldo

2021-09-03 21:54:36

by Riccardo Mancini

[permalink] [raw]
Subject: Re: [RFC PATCH v1 25/37] perf evsel: move event open in evsel__open_cpu to separate function

Hi Arnaldo,

thanks for your review and your suggestions, and also for the PRIu64 patch.

On Tue, 2021-08-31 at 16:54 -0300, Arnaldo Carvalho de Melo wrote:
> Em Sat, Aug 21, 2021 at 11:19:31AM +0200, Riccardo Mancini escreveu:
> > This is the final patch splitting evsel__open_cpu.
> > This patch moves the entire loop code to a separate function, to be
> > reused for the multithreaded code.
>
> Are you going to use that 'enum perf_event_open_err' somewhere else?
> I.e. is there a need to expose it in evsel.h?

Yes, in the next patch (26/37). It's being used to expose a function that just
does the perf_event_open calls for an evsel. It needs to return such a
structure to provide information about the error (which return code, and at
which thread).

>
> I'm stopping at this patch to give the ones I merged so far some
> testing, will now push it to tmp.perf/core.

I checked tmp.perf/core and it looks good to me.
I also did some additional tests to check that the fallback mechanisms were
working:

check that a missing pid is ignored (rerun until the warning is shown)
$ sudo ./perf bench internals evlist-open-close -i10 -u $UID

check that weak group fallback is working
$ sudo ./perf record -e '{cycles,cache-misses,cache-
references,cpu_clk_unhalted.thread,cycles,cycles,cycles}:W'

check that the precise_ip fallback is working:
edited perf-sys.h to make sys_perf_event_open fail if precise_ip > 2
$ sudo ./perf record -e '{cycles,cs}:P'
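
(For reference, a sketch of the kind of temporary hack meant above; the
actual edit is not shown in this thread, so this is an assumption:)

    static inline int sys_perf_event_open(struct perf_event_attr *attr,
                                          pid_t pid, int cpu, int group_fd,
                                          unsigned long flags)
    {
            /* temporary: refuse precise_ip > 2 so that the precise_ip
             * fallback in evsel__open_cpu() gets exercised */
            if (attr->precise_ip > 2) {
                    errno = ENOTSUP;
                    return -1;
            }
            return syscall(__NR_perf_event_open, attr, pid, cpu,
                           group_fd, flags);
    }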


I've also run perf test on my machine and it's passing too.
I'm encountering one failure in the "BPF filter" test (42), which is also
present in perf/core, so it should not be related to this patchset.

Thanks,
Riccardo

>
> - Arnaldo
>  
> > Signed-off-by: Riccardo Mancini <[email protected]>
> > ---
> >  tools/perf/util/evsel.c | 142 ++++++++++++++++++++++++----------------
> >  tools/perf/util/evsel.h |  12 ++++
> >  2 files changed, 99 insertions(+), 55 deletions(-)
> >
> > diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> > index 2e95416b8320c6b9..e41f55a7a70ea630 100644
> > --- a/tools/perf/util/evsel.c
> > +++ b/tools/perf/util/evsel.c
> > @@ -1945,6 +1945,82 @@ bool evsel__increase_rlimit(enum rlimit_action
> > *set_rlimit)
> >         return false;
> >  }
> >  
> > +static struct perf_event_open_result perf_event_open(struct evsel *evsel,
> > +                                       pid_t pid, int cpu, int thread,
> > +                                       struct perf_cpu_map *cpus,
> > +                                       struct perf_thread_map *threads)
> > +{
> > +       int fd, group_fd, rc;
> > +       struct perf_event_open_result res;
> > +
> > +       if (!evsel->cgrp && !evsel->core.system_wide)
> > +               pid = perf_thread_map__pid(threads, thread);
> > +
> > +       group_fd = get_group_fd(evsel, cpu, thread);
> > +
> > +       test_attr__ready();
> > +
> > +       pr_debug2_peo("sys_perf_event_open: pid %d  cpu %d  group_fd %d  flags
> > %#lx",
> > +                       pid, cpus->map[cpu], group_fd, evsel->open_flags);
> > +
> > +       fd = sys_perf_event_open(&evsel->core.attr, pid, cpus->map[cpu],
> > +                               group_fd, evsel->open_flags);
> > +
> > +       FD(evsel, cpu, thread) = fd;
> > +       res.fd = fd;
> > +
> > +       if (fd < 0) {
> > +               rc = -errno;
> > +
> > +               pr_debug2_peo("\nsys_perf_event_open failed, error %d\n",
> > +                               rc);
> > +               res.rc = rc;
> > +               res.err = PEO_FALLBACK;
> > +               return res;
> > +       }
> > +
> > +       bpf_counter__install_pe(evsel, cpu, fd);
> > +
> > +       if (unlikely(test_attr__enabled)) {
> > +               test_attr__open(&evsel->core.attr, pid,
> > +                       cpus->map[cpu], fd,
> > +                       group_fd, evsel->open_flags);
> > +       }
> > +
> > +       pr_debug2_peo(" = %d\n", fd);
> > +
> > +       if (evsel->bpf_fd >= 0) {
> > +               int evt_fd = fd;
> > +               int bpf_fd = evsel->bpf_fd;
> > +
> > +               rc = ioctl(evt_fd,
> > +                               PERF_EVENT_IOC_SET_BPF,
> > +                               bpf_fd);
> > +               if (rc && errno != EEXIST) {
> > +                       pr_err("failed to attach bpf fd %d: %s\n",
> > +                               bpf_fd, strerror(errno));
> > +                       res.rc = -EINVAL;
> > +                       res.err = PEO_ERROR;
> > +                       return res;
> > +               }
> > +       }
> > +
> > +       /*
> > +        * If we succeeded but had to kill clockid, fail and
> > +        * have evsel__open_strerror() print us a nice error.
> > +        */
> > +       if (perf_missing_features.clockid ||
> > +               perf_missing_features.clockid_wrong) {
> > +               res.rc = -EINVAL;
> > +               res.err = PEO_ERROR;
> > +               return res;
> > +       }
> > +
> > +       res.rc = 0;
> > +       res.err = PEO_SUCCESS;
> > +       return res;
> > +}
> > +
> >  static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> >                 struct perf_thread_map *threads,
> >                 int start_cpu, int end_cpu)
> > @@ -1952,6 +2028,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> >         int cpu, thread, nthreads;
> >         int pid = -1, err, old_errno;
> >         enum rlimit_action set_rlimit = NO_CHANGE;
> > +       struct perf_event_open_result peo_res;
> >  
> >         err = __evsel__prepare_open(evsel, cpus, threads);
> >         if (err)
> > @@ -1979,67 +2056,22 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> >         for (cpu = start_cpu; cpu < end_cpu; cpu++) {
> >  
> >                 for (thread = 0; thread < nthreads; thread++) {
> > -                       int fd, group_fd;
> >  retry_open:
> >                         if (thread >= nthreads)
> >                                 break;
> >  
> > -                       if (!evsel->cgrp && !evsel->core.system_wide)
> > -                               pid = perf_thread_map__pid(threads, thread);
> > -
> > -                       group_fd = get_group_fd(evsel, cpu, thread);
> > -
> > -                       test_attr__ready();
> > -
> > -                       pr_debug2_peo("sys_perf_event_open: pid %d  cpu %d  group_fd %d  flags %#lx",
> > -                               pid, cpus->map[cpu], group_fd, evsel->open_flags);
> > +                       peo_res = perf_event_open(evsel, pid, cpu, thread, cpus,
> > +                                               threads);
> >  
> > -                       fd = sys_perf_event_open(&evsel->core.attr, pid, cpus->map[cpu],
> > -                                               group_fd, evsel->open_flags);
> > -
> > -                       FD(evsel, cpu, thread) = fd;
> > -
> > -                       if (fd < 0) {
> > -                               err = -errno;
> > -
> > -                               pr_debug2_peo("\nsys_perf_event_open failed, error %d\n",
> > -                                         err);
> > +                       err = peo_res.rc;
> > +                       switch (peo_res.err) {
> > +                       case PEO_SUCCESS:
> > +                               set_rlimit = NO_CHANGE;
> > +                               continue;
> > +                       case PEO_FALLBACK:
> >                                 goto try_fallback;
> > -                       }
> > -
> > -                       bpf_counter__install_pe(evsel, cpu, fd);
> > -
> > -                       if (unlikely(test_attr__enabled)) {
> > -                               test_attr__open(&evsel->core.attr, pid, cpus->map[cpu],
> > -                                               fd, group_fd, evsel->open_flags);
> > -                       }
> > -
> > -                       pr_debug2_peo(" = %d\n", fd);
> > -
> > -                       if (evsel->bpf_fd >= 0) {
> > -                               int evt_fd = fd;
> > -                               int bpf_fd = evsel->bpf_fd;
> > -
> > -                               err = ioctl(evt_fd,
> > -                                           PERF_EVENT_IOC_SET_BPF,
> > -                                           bpf_fd);
> > -                               if (err && errno != EEXIST) {
> > -                                       pr_err("failed to attach bpf fd %d: %s\n",
> > -                                              bpf_fd, strerror(errno));
> > -                                       err = -EINVAL;
> > -                                       goto out_close;
> > -                               }
> > -                       }
> > -
> > -                       set_rlimit = NO_CHANGE;
> > -
> > -                       /*
> > -                        * If we succeeded but had to kill clockid, fail and
> > -                        * have evsel__open_strerror() print us a nice error.
> > -                        */
> > -                       if (perf_missing_features.clockid ||
> > -                           perf_missing_features.clockid_wrong) {
> > -                               err = -EINVAL;
> > +                       default:
> > +                       case PEO_ERROR:
> >                                 goto out_close;
> >                         }
> >                 }
> > diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> > index 0a245afab2d87d74..8c9827a93ac001a7 100644
> > --- a/tools/perf/util/evsel.h
> > +++ b/tools/perf/util/evsel.h
> > @@ -282,6 +282,18 @@ int evsel__enable(struct evsel *evsel);
> >  int evsel__disable(struct evsel *evsel);
> >  int evsel__disable_cpu(struct evsel *evsel, int cpu);
> >  
> > +enum perf_event_open_err {
> > +       PEO_SUCCESS,
> > +       PEO_FALLBACK,
> > +       PEO_ERROR
> > +};
> > +
> > +struct perf_event_open_result {
> > +       enum perf_event_open_err err;
> > +       int rc;
> > +       int fd;
> > +};
> > +
> >  int evsel__open_per_cpu(struct evsel *evsel, struct perf_cpu_map *cpus, int cpu);
> >  int evsel__open_per_thread(struct evsel *evsel, struct perf_thread_map *threads);
> >  int evsel__open(struct evsel *evsel, struct perf_cpu_map *cpus,
> > --
> > 2.31.1
>


2021-09-11 19:11:14

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [RFC PATCH v1 25/37] perf evsel: move event open in evsel__open_cpu to separate function

On Fri, Sep 03, 2021 at 11:52:18PM +0200, Riccardo Mancini wrote:
> Hi Arnaldo,
>
> thanks for your review and your suggestions, and also for the PRIu64 patch.
>
> On Tue, 2021-08-31 at 16:54 -0300, Arnaldo Carvalho de Melo wrote:
> > On Sat, Aug 21, 2021 at 11:19:31AM +0200, Riccardo Mancini wrote:
> > > This is the final patch splitting evsel__open_cpu.
> > > It moves the entire loop body into a separate function, to be
> > > reused by the multithreaded code.
> >
> > Are you going to use that 'enum perf_event_open_err' somewhere else?
> > I.e. is there a need to expose it in evsel.h?
>
> Yes, in the next patch (26/37). It's being used to expose a function that just
> does the perf_event_open() calls for an evsel. It needs to return such a
> structure to provide information about the error (the return code, and the
> thread at which it occurred).
>
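
(Illustrative aside: a multithreaded open needs to report not only the error
code but also where the open stopped. A hypothetical shape of such a result,
with made-up names rather than the ones actually used in patch 26/37:)

/* Field and type names here are assumptions for illustration. */
struct evsel_open_work_result {
        struct perf_event_open_result peo;  /* PEO_* outcome, rc and fd */
        int cpu;      /* cpu index at which the open stopped on error */
        int thread;   /* thread index at which the open stopped on error */
};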
> >
> > I'm stopping at this patch to give the ones I merged so far some
> > testing, and will now push it to tmp.perf/core.
>
> I checked tmp.perf/core and it looks good to me.
> I also did some additional tests to check that the fallback mechanisms were
> working:
>
> check that a missing pid is ignored (rerun until the warning is shown)
> $ sudo ./perf bench internals evlist-open-close -i10 -u $UID
>
> check that the weak group fallback is working
> $ sudo ./perf record -e '{cycles,cache-misses,cache-references,cpu_clk_unhalted.thread,cycles,cycles,cycles}:W'
>
> check that the precise_ip fallback is working:
> edited perf-sys.h to make sys_perf_event_open fail if precise_ip > 2
> $ sudo ./perf record -e '{cycles,cs}:P'
>
>
> I've also run perf test on my machine and it passes too.
> I'm encountering one failure on the "BPF filter" test (42), which is also present
> in perf/core, so it should not be related to this patch.

Thanks! I'll try to resume work on it as soon as I have the Plumbers
talk ready :-)

- Arnaldo


2021-10-08 14:48:35

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [RFC PATCH v1 36/37] perf test/evlist-open-close: use inline func to convert timeval to usec

On Sat, Aug 21, 2021 at 11:19:42AM +0200, Riccardo Mancini wrote:
> This patch introduces a new inline function to convert a timeval to
> usec.
> This function will also be used in the next patch.

Thanks, applied.

- Arnaldo


> Signed-off-by: Riccardo Mancini <[email protected]>
> ---
>  tools/perf/bench/evlist-open-close.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/bench/evlist-open-close.c b/tools/perf/bench/evlist-open-close.c
> index c18aa85725281795..00d0aef564f80d44 100644
> --- a/tools/perf/bench/evlist-open-close.c
> +++ b/tools/perf/bench/evlist-open-close.c
> @@ -26,6 +26,11 @@ static int iterations = 100;
>  static int nr_events = 1;
>  static const char *event_string = "dummy";
>
> +static inline u64 timeval2usec(struct timeval *tv)
> +{
> +       return tv->tv_sec * USEC_PER_SEC + tv->tv_usec;
> +}
> +
>  static struct record_opts opts = {
>         .sample_time = true,
>         .mmap_pages = UINT_MAX,
> @@ -197,7 +202,7 @@ static int bench_evlist_open_close__run(char *evstr)
>
>         gettimeofday(&end, NULL);
>         timersub(&end, &start, &diff);
> -       runtime_us = diff.tv_sec * USEC_PER_SEC + diff.tv_usec;
> +       runtime_us = timeval2usec(&diff);
>         update_stats(&time_stats, runtime_us);
>
>         evlist__delete(evlist);
> --
> 2.31.1
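
(A minimal usage sketch mirroring the hunk above; u64 and USEC_PER_SEC come
from perf's own headers, and run_workload() is a hypothetical function under
test:)

struct timeval start, end, diff;    /* needs <sys/time.h> */
u64 runtime_us;

gettimeofday(&start, NULL);
run_workload();                     /* hypothetical work being timed */
gettimeofday(&end, NULL);
timersub(&end, &start, &diff);      /* diff = end - start */
runtime_us = timeval2usec(&diff);   /* tv_sec * USEC_PER_SEC + tv_usec */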
