User space tasks can migrate between CPUs, so when tracing selected CPUs,
sideband events need to be tracked on all CPUs.
The specific scenarios are as follows:
         CPU0                                 CPU1
  perf record -C 0 start
                              taskA starts to be created and executed
                                -> PERF_RECORD_COMM and PERF_RECORD_MMAP
                                   events only deliver to CPU1
                              ......
                                |
                          migrate to CPU0
                                |
  Running on CPU0    <----------/
  ...

  perf record -C 0 stop
Now perf samples the PC of taskA. However, perf does not record the
PERF_RECORD_COMM and PERF_RECORD_MMAP events of taskA.
Therefore, the comm and symbols of taskA cannot be parsed.
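The scenario above can be modelled with a toy lookup. This is only a sketch: struct comm_event and resolve_comm() are hypothetical names invented for illustration and are not perf code. The point is that a COMM event delivered on CPU1 is invisible when only CPU0's sideband is recorded.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified COMM record: maps a tid to a command name. */
struct comm_event {
	int tid;
	int cpu;		/* CPU on which the event was delivered */
	const char *comm;
};

/*
 * Look up the comm of @tid, considering only events delivered on CPUs
 * whose bit is set in @recorded_cpus (bit n stands for CPU n).
 * Returns NULL when no matching sideband event was recorded.
 */
static const char *resolve_comm(const struct comm_event *events, size_t nr,
				unsigned int recorded_cpus, int tid)
{
	for (size_t i = 0; i < nr; i++) {
		bool recorded = recorded_cpus & (1u << events[i].cpu);

		if (recorded && events[i].tid == tid)
			return events[i].comm;
	}
	return NULL;
}
```

With a single event for taskA delivered on CPU1, a mask containing only CPU0 yields NULL, while a mask covering both CPUs resolves the name.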
The sys_perf_event_open invoked is as follows:
# perf --debug verbose=3 record -e cpu-clock -C 1 true
<SNIP>
Opening: cpu-clock
------------------------------------------------------------
perf_event_attr:
type 1 (PERF_TYPE_SOFTWARE)
size 136
config 0 (PERF_COUNT_SW_CPU_CLOCK)
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|CPU|PERIOD|IDENTIFIER
read_format ID|LOST
disabled 1
inherit 1
freq 1
sample_id_all 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 5
Opening: dummy:u
------------------------------------------------------------
perf_event_attr:
type 1 (PERF_TYPE_SOFTWARE)
size 136
config 0x9 (PERF_COUNT_SW_DUMMY)
{ sample_period, sample_freq } 1
sample_type IP|TID|TIME|CPU|IDENTIFIER
read_format ID|LOST
inherit 1
exclude_kernel 1
exclude_hv 1
mmap 1
comm 1
task 1
sample_id_all 1
exclude_guest 1
mmap2 1
comm_exec 1
ksymbol 1
bpf_event 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 6
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 7
sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 9
sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 10
sys_perf_event_open: pid -1 cpu 4 group_fd -1 flags 0x8 = 11
sys_perf_event_open: pid -1 cpu 5 group_fd -1 flags 0x8 = 12
sys_perf_event_open: pid -1 cpu 6 group_fd -1 flags 0x8 = 13
sys_perf_event_open: pid -1 cpu 7 group_fd -1 flags 0x8 = 14
<SNIP>
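For readers who want to map the "dummy:u" dump above onto the uapi structure, here is a rough sketch. init_sideband_attr() is a hypothetical helper, not a perf function; read_format is deliberately left out, and attr->size depends on the installed headers, so it may not be 136.

```c
#include <linux/perf_event.h>
#include <string.h>

/*
 * Fill @attr roughly the way the "dummy:u" event in the trace is
 * configured. Hypothetical helper for illustration only.
 */
static void init_sideband_attr(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);	/* header dependent */
	attr->type = PERF_TYPE_SOFTWARE;
	attr->config = PERF_COUNT_SW_DUMMY;
	attr->sample_period = 1;
	attr->sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_TID |
			    PERF_SAMPLE_TIME | PERF_SAMPLE_CPU |
			    PERF_SAMPLE_IDENTIFIER;
	attr->inherit = 1;
	attr->exclude_kernel = 1;
	attr->exclude_hv = 1;
	attr->exclude_guest = 1;
	attr->sample_id_all = 1;
	/* The sideband records this event exists to collect. */
	attr->mmap = 1;
	attr->mmap2 = 1;
	attr->comm = 1;
	attr->comm_exec = 1;
	attr->task = 1;
	attr->ksymbol = 1;
	attr->bpf_event = 1;
}
```

In the trace, an attr like this is opened once per online CPU (cpu 0 to 7), while cpu-clock is opened only on the selected CPU.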
Changes since_v7:
- The condition for requiring system_wide sideband is changed to
"as long as a non-dummy event exists" (patch4).
- Modify the corresponding test case to record only dummy event (patch6).
- Thanks to tested-by tag from Ravi, but because the solution is modified,
the tested-by tag of Ravi is not added to this version.
Changes since_v6:
- Patch1:
1. No change.
2. Keep Acked-by tag from Adrian.
- Patch2:
1. Update commit message as suggested by Ian.
2. Keep Acked-by tag from Adrian because code is not modified.
- Patch3:
1. Update comment as suggested by Ian.
2. Merge original patch5 ("perf test: Update base-record & system-wide-dummy attr") as suggested by Ian.
3. Only merge commit, keep Acked-by tag from Adrian.
- Patch4:
1. No change. Because Adrian recommends not changing the function name.
2. Keep Acked-by tag from Adrian.
- Patch5:
1. Add cleanup on trap function as suggested by Ian.
2. Remove Tested-by tag from Adrian because the script is modified.
- Patch6:
1. Add Reviewed-by tag from Ian.
Changes since_v5:
- No code changes.
- Detailed commit message of patch3.
- Add Acked-by and Tested-by tags from Adrian Hunter.
Changes since_v4:
- Simplify check code for record__tracking_system_wide().
- Add perf attr test result to commit message for patch 7.
Changes since_v3:
- Check all_kernel, all_user, and dummy or exclude_user when determining
whether system wide is required.
Changes since_v2:
- Rename record_tracking.sh to record_sideband.sh in tools/perf/tests/shell.
- Remove "perf evlist: Skip dummy event sample_type check for evlist_config" patch.
- Add opts->all_kernel check in record__config_tracking_events().
- Add perf_event_attr test for record selected CPUs exclude_user.
- Update base-record & system-wide-dummy sample_type attr expected values for test-record-C0.
Changes since v1:
- Add perf_evlist__go_system_wide() via internal/evlist.h instead of
exporting perf_evlist__propagate_maps().
- Use evlist__add_aux_dummy() instead of evlist__add_dummy() in
evlist__findnew_tracking_event().
- Add a parameter in evlist__findnew_tracking_event() to deal with
system_wide inside.
- Add sideband for all CPUs when tracing selected CPUs comments on
the perf record man page.
- Use "sideband events" instead of "tracking events".
- Adjust the patch sequence.
- Add patch5 to skip dummy event sample_type check for evlist_config.
- Add patch6 to update system-wide-dummy attr values for perf test.
Yang Jihong (6):
perf evlist: Add perf_evlist__go_system_wide() helper
perf evlist: Add evlist__findnew_tracking_event() helper
perf record: Move setting tracking events before
record__init_thread_masks()
perf record: Track sideband events for all CPUs when tracing selected
CPUs
perf test: Add test case for record sideband events
perf test: Add perf_event_attr test for record dummy event
tools/lib/perf/evlist.c | 9 +++
tools/lib/perf/include/internal/evlist.h | 2 +
tools/perf/Documentation/perf-record.txt | 3 +
tools/perf/builtin-record.c | 92 +++++++++++++++-------
tools/perf/tests/attr/system-wide-dummy | 14 ++--
tools/perf/tests/attr/test-record-C0 | 4 +-
tools/perf/tests/attr/test-record-dummy-C0 | 55 +++++++++++++
tools/perf/tests/shell/record_sideband.sh | 58 ++++++++++++++
tools/perf/util/evlist.c | 18 +++++
tools/perf/util/evlist.h | 1 +
10 files changed, 221 insertions(+), 35 deletions(-)
create mode 100644 tools/perf/tests/attr/test-record-dummy-C0
create mode 100755 tools/perf/tests/shell/record_sideband.sh
--
2.30.GIT
For dummy events that keep tracking, we may need to modify their cpu_maps.
For example, change the cpu_maps to record sideband events for all CPUs.
Add perf_evlist__go_system_wide() helper to support this scenario.
Signed-off-by: Yang Jihong <[email protected]>
Acked-by: Adrian Hunter <[email protected]>
---
tools/lib/perf/evlist.c | 9 +++++++++
tools/lib/perf/include/internal/evlist.h | 2 ++
2 files changed, 11 insertions(+)
diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index b8b066d0dc5e..3acbbccc1901 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -738,3 +738,12 @@ int perf_evlist__nr_groups(struct perf_evlist *evlist)
}
return nr_groups;
}
+
+void perf_evlist__go_system_wide(struct perf_evlist *evlist, struct perf_evsel *evsel)
+{
+ if (!evsel->system_wide) {
+ evsel->system_wide = true;
+ if (evlist->needs_map_propagation)
+ __perf_evlist__propagate_maps(evlist, evsel);
+ }
+}
diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h
index 3339bc2f1765..d86ffe8ed483 100644
--- a/tools/lib/perf/include/internal/evlist.h
+++ b/tools/lib/perf/include/internal/evlist.h
@@ -135,4 +135,6 @@ int perf_evlist__id_add_fd(struct perf_evlist *evlist,
void perf_evlist__reset_id_hash(struct perf_evlist *evlist);
void __perf_evlist__set_leader(struct list_head *list, struct perf_evsel *leader);
+
+void perf_evlist__go_system_wide(struct perf_evlist *evlist, struct perf_evsel *evsel);
#endif /* __LIBPERF_INTERNAL_EVLIST_H */
--
2.30.GIT
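The helper in the patch above is idempotent: the maps are propagated only on the first transition to system_wide. A toy model of that control flow follows. The toy_* types and functions are hypothetical stand-ins, not libperf structures.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for struct perf_evsel / struct perf_evlist. */
struct toy_evsel {
	bool system_wide;
	int nr_propagations;	/* how many times the maps were recomputed */
};

struct toy_evlist {
	bool needs_map_propagation;
};

/* Stand-in for __perf_evlist__propagate_maps(): only counts calls. */
static void toy_propagate_maps(struct toy_evlist *evlist,
			       struct toy_evsel *evsel)
{
	(void)evlist;
	evsel->nr_propagations++;
}

/* Same control flow as perf_evlist__go_system_wide() in the patch. */
static void toy_go_system_wide(struct toy_evlist *evlist,
			       struct toy_evsel *evsel)
{
	if (!evsel->system_wide) {
		evsel->system_wide = true;
		if (evlist->needs_map_propagation)
			toy_propagate_maps(evlist, evsel);
	}
}
```

Calling toy_go_system_wide() twice on the same evsel propagates once; the second call is a no-op.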
Currently, intel-bts, intel-pt, and arm-spe may add a tracking event to the
evlist. We may need to search for the tracking event for some settings.
Therefore, add the evlist__findnew_tracking_event() helper.
If system_wide is true, evlist__findnew_tracking_event() sets the cpu map
of the evsel to all online CPUs.
Signed-off-by: Yang Jihong <[email protected]>
Acked-by: Adrian Hunter <[email protected]>
---
tools/perf/builtin-record.c | 11 +++--------
tools/perf/util/evlist.c | 18 ++++++++++++++++++
tools/perf/util/evlist.h | 1 +
3 files changed, 22 insertions(+), 8 deletions(-)
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 34bb31f08bb5..12edad8392cc 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1293,14 +1293,9 @@ static int record__open(struct record *rec)
*/
if (opts->target.initial_delay || target__has_cpu(&opts->target) ||
perf_pmus__num_core_pmus() > 1) {
- pos = evlist__get_tracking_event(evlist);
- if (!evsel__is_dummy_event(pos)) {
- /* Set up dummy event. */
- if (evlist__add_dummy(evlist))
- return -ENOMEM;
- pos = evlist__last(evlist);
- evlist__set_tracking_event(evlist, pos);
- }
+ pos = evlist__findnew_tracking_event(evlist, false);
+ if (!pos)
+ return -ENOMEM;
/*
* Enable the dummy event when the process is forked for
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 7ef43f72098e..25c3ebe2c2f5 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1694,6 +1694,24 @@ void evlist__set_tracking_event(struct evlist *evlist, struct evsel *tracking_ev
tracking_evsel->tracking = true;
}
+struct evsel *evlist__findnew_tracking_event(struct evlist *evlist, bool system_wide)
+{
+ struct evsel *evsel;
+
+ evsel = evlist__get_tracking_event(evlist);
+ if (!evsel__is_dummy_event(evsel)) {
+ evsel = evlist__add_aux_dummy(evlist, system_wide);
+ if (!evsel)
+ return NULL;
+
+ evlist__set_tracking_event(evlist, evsel);
+ } else if (system_wide) {
+ perf_evlist__go_system_wide(&evlist->core, &evsel->core);
+ }
+
+ return evsel;
+}
+
struct evsel *evlist__find_evsel_by_str(struct evlist *evlist, const char *str)
{
struct evsel *evsel;
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 664c6bf7b3e0..98e7ddb2bd30 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -387,6 +387,7 @@ bool evlist_cpu_iterator__end(const struct evlist_cpu_iterator *evlist_cpu_itr);
struct evsel *evlist__get_tracking_event(struct evlist *evlist);
void evlist__set_tracking_event(struct evlist *evlist, struct evsel *tracking_evsel);
+struct evsel *evlist__findnew_tracking_event(struct evlist *evlist, bool system_wide);
struct evsel *evlist__find_evsel_by_str(struct evlist *evlist, const char *str);
--
2.30.GIT
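The find-or-create behaviour of evlist__findnew_tracking_event() can be sketched with a fixed-size array. The demo_* names are hypothetical and the array stands in for perf's linked list, so treat this as an illustration of the logic and not as the real implementation.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_EVSELS 8

/* Hypothetical, simplified evsel and evlist. */
struct demo_evsel {
	bool is_dummy;
	bool tracking;
	bool system_wide;
};

struct demo_evlist {
	struct demo_evsel evsels[DEMO_MAX_EVSELS];
	size_t nr;
};

/* Return the evsel marked as tracking, else the first one, else NULL. */
static struct demo_evsel *demo_get_tracking(struct demo_evlist *evlist)
{
	for (size_t i = 0; i < evlist->nr; i++)
		if (evlist->evsels[i].tracking)
			return &evlist->evsels[i];
	return evlist->nr ? &evlist->evsels[0] : NULL;
}

/* Reuse an existing dummy tracking event or append a new one. */
static struct demo_evsel *demo_findnew_tracking(struct demo_evlist *evlist,
						bool system_wide)
{
	struct demo_evsel *evsel = demo_get_tracking(evlist);

	if (!evsel || !evsel->is_dummy) {
		if (evlist->nr == DEMO_MAX_EVSELS)
			return NULL;	/* models the allocation failure path */
		if (evsel)
			evsel->tracking = false;
		evsel = &evlist->evsels[evlist->nr++];
		*evsel = (struct demo_evsel){
			.is_dummy = true,
			.tracking = true,
			.system_wide = system_wide,
		};
	} else if (system_wide) {
		/* Existing dummy: only widen it, never add a second one. */
		evsel->system_wide = true;
	}
	return evsel;
}
```

A first call on a list holding one non-dummy event appends a dummy and marks it as tracking; a second call with system_wide set returns the same dummy and only widens it.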
User space tasks can migrate between CPUs, so when tracing selected CPUs,
sideband for all CPUs is needed. In this case set the cpu map of the evsel
to all online CPUs. This may modify the original cpu map of the evlist.
Therefore, we need to check whether the preceding scenario exists before
record__init_thread_masks().
Dummy tracking is currently set up in record__open(); move it before
record__init_thread_masks() and add a helper for unified processing.
The sys_perf_event_open invoked is as follows:
# perf --debug verbose=3 record -e cpu-clock -D 100 true
<SNIP>
Opening: cpu-clock
------------------------------------------------------------
perf_event_attr:
type 1 (PERF_TYPE_SOFTWARE)
size 136
config 0 (PERF_COUNT_SW_CPU_CLOCK)
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|PERIOD|IDENTIFIER
read_format ID|LOST
disabled 1
inherit 1
freq 1
sample_id_all 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid 10318 cpu 0 group_fd -1 flags 0x8 = 5
sys_perf_event_open: pid 10318 cpu 1 group_fd -1 flags 0x8 = 6
sys_perf_event_open: pid 10318 cpu 2 group_fd -1 flags 0x8 = 7
sys_perf_event_open: pid 10318 cpu 3 group_fd -1 flags 0x8 = 9
sys_perf_event_open: pid 10318 cpu 4 group_fd -1 flags 0x8 = 10
sys_perf_event_open: pid 10318 cpu 5 group_fd -1 flags 0x8 = 11
sys_perf_event_open: pid 10318 cpu 6 group_fd -1 flags 0x8 = 12
sys_perf_event_open: pid 10318 cpu 7 group_fd -1 flags 0x8 = 13
Opening: dummy:u
------------------------------------------------------------
perf_event_attr:
type 1 (PERF_TYPE_SOFTWARE)
size 136
config 0x9 (PERF_COUNT_SW_DUMMY)
{ sample_period, sample_freq } 1
sample_type IP|TID|TIME|IDENTIFIER
read_format ID|LOST
disabled 1
inherit 1
exclude_kernel 1
exclude_hv 1
mmap 1
comm 1
enable_on_exec 1
task 1
sample_id_all 1
exclude_guest 1
mmap2 1
comm_exec 1
ksymbol 1
bpf_event 1
------------------------------------------------------------
sys_perf_event_open: pid 10318 cpu 0 group_fd -1 flags 0x8 = 14
sys_perf_event_open: pid 10318 cpu 1 group_fd -1 flags 0x8 = 15
sys_perf_event_open: pid 10318 cpu 2 group_fd -1 flags 0x8 = 16
sys_perf_event_open: pid 10318 cpu 3 group_fd -1 flags 0x8 = 17
sys_perf_event_open: pid 10318 cpu 4 group_fd -1 flags 0x8 = 18
sys_perf_event_open: pid 10318 cpu 5 group_fd -1 flags 0x8 = 19
sys_perf_event_open: pid 10318 cpu 6 group_fd -1 flags 0x8 = 20
sys_perf_event_open: pid 10318 cpu 7 group_fd -1 flags 0x8 = 21
<SNIP>
perf test needs to update the base-record and system-wide-dummy attr expected
values for test-record-C0:
1. A dummy sideband event is now added when sampling specified CPUs. When
the evlist contains evsels with different sample_type values,
evlist__config() changes the default PERF_SAMPLE_ID bit to the
PERF_SAMPLE_IDENTIFIER bit.
The expected sample_type value of base-record and system-wide-dummy
in test-record-C0 needs to be updated.
2. perf record now uses evlist__add_aux_dummy() instead of
evlist__add_dummy() to add a dummy event.
The expected values of the system-wide-dummy attr need to be updated.
The perf test result is as follows:
# ./perf test list 2>&1 | grep 'Setup struct perf_event_attr'
17: Setup struct perf_event_attr
# ./perf test 17
17: Setup struct perf_event_attr : Ok
Signed-off-by: Yang Jihong <[email protected]>
Acked-by: Adrian Hunter <[email protected]>
---
tools/perf/builtin-record.c | 59 ++++++++++++++++---------
tools/perf/tests/attr/system-wide-dummy | 14 +++---
tools/perf/tests/attr/test-record-C0 | 4 +-
3 files changed, 47 insertions(+), 30 deletions(-)
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 12edad8392cc..83bd1f117191 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -906,6 +906,37 @@ static int record__config_off_cpu(struct record *rec)
return off_cpu_prepare(rec->evlist, &rec->opts.target, &rec->opts);
}
+static int record__config_tracking_events(struct record *rec)
+{
+ struct record_opts *opts = &rec->opts;
+ struct evlist *evlist = rec->evlist;
+ struct evsel *evsel;
+
+ /*
+ * For initial_delay, system wide or a hybrid system, we need to add
+ * tracking event so that we can track PERF_RECORD_MMAP to cover the
+ * delay of waiting or event synthesis.
+ */
+ if (opts->target.initial_delay || target__has_cpu(&opts->target) ||
+ perf_pmus__num_core_pmus() > 1) {
+ evsel = evlist__findnew_tracking_event(evlist, false);
+ if (!evsel)
+ return -ENOMEM;
+
+ /*
+ * Enable the tracking event when the process is forked for
+ * initial_delay, immediately for system wide.
+ */
+ if (opts->target.initial_delay && !evsel->immediate &&
+ !target__has_cpu(&opts->target))
+ evsel->core.attr.enable_on_exec = 1;
+ else
+ evsel->immediate = 1;
+ }
+
+ return 0;
+}
+
static bool record__kcore_readable(struct machine *machine)
{
char kcore[PATH_MAX];
@@ -1286,28 +1317,6 @@ static int record__open(struct record *rec)
struct record_opts *opts = &rec->opts;
int rc = 0;
- /*
- * For initial_delay, system wide or a hybrid system, we need to add a
- * dummy event so that we can track PERF_RECORD_MMAP to cover the delay
- * of waiting or event synthesis.
- */
- if (opts->target.initial_delay || target__has_cpu(&opts->target) ||
- perf_pmus__num_core_pmus() > 1) {
- pos = evlist__findnew_tracking_event(evlist, false);
- if (!pos)
- return -ENOMEM;
-
- /*
- * Enable the dummy event when the process is forked for
- * initial_delay, immediately for system wide.
- */
- if (opts->target.initial_delay && !pos->immediate &&
- !target__has_cpu(&opts->target))
- pos->core.attr.enable_on_exec = 1;
- else
- pos->immediate = 1;
- }
-
evlist__config(evlist, opts, &callchain_param);
evlist__for_each_entry(evlist, pos) {
@@ -4190,6 +4199,12 @@ int cmd_record(int argc, const char **argv)
goto out;
}
+ err = record__config_tracking_events(rec);
+ if (err) {
+ pr_err("record__config_tracking_events failed, error %d\n", err);
+ goto out;
+ }
+
err = record__init_thread_masks(rec);
if (err) {
pr_err("Failed to initialize parallel data streaming masks\n");
diff --git a/tools/perf/tests/attr/system-wide-dummy b/tools/perf/tests/attr/system-wide-dummy
index 2f3e3eb728eb..a1e1d6a263bf 100644
--- a/tools/perf/tests/attr/system-wide-dummy
+++ b/tools/perf/tests/attr/system-wide-dummy
@@ -9,8 +9,10 @@ flags=8
type=1
size=136
config=9
-sample_period=4000
-sample_type=455
+sample_period=1
+# PERF_SAMPLE_IP | PERF_SAMPLE_TID | PERF_SAMPLE_TIME |
+# PERF_SAMPLE_CPU | PERF_SAMPLE_IDENTIFIER
+sample_type=65671
read_format=4|20
# Event will be enabled right away.
disabled=0
@@ -18,12 +20,12 @@ inherit=1
pinned=0
exclusive=0
exclude_user=0
-exclude_kernel=0
-exclude_hv=0
+exclude_kernel=1
+exclude_hv=1
exclude_idle=0
mmap=1
comm=1
-freq=1
+freq=0
inherit_stat=0
enable_on_exec=0
task=1
@@ -32,7 +34,7 @@ precise_ip=0
mmap_data=0
sample_id_all=1
exclude_host=0
-exclude_guest=0
+exclude_guest=1
exclude_callchain_kernel=0
exclude_callchain_user=0
mmap2=1
diff --git a/tools/perf/tests/attr/test-record-C0 b/tools/perf/tests/attr/test-record-C0
index 317730b906dd..198e8429a1bf 100644
--- a/tools/perf/tests/attr/test-record-C0
+++ b/tools/perf/tests/attr/test-record-C0
@@ -10,9 +10,9 @@ cpu=0
enable_on_exec=0
# PERF_SAMPLE_IP | PERF_SAMPLE_TID | PERF_SAMPLE_TIME |
-# PERF_SAMPLE_ID | PERF_SAMPLE_PERIOD
+# PERF_SAMPLE_PERIOD | PERF_SAMPLE_IDENTIFIER
# + PERF_SAMPLE_CPU added by -C 0
-sample_type=455
+sample_type=65927
# Dummy event handles mmaps, comm and task.
mmap=0
--
2.30.GIT
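The numeric sample_type values in the attr files changed by this patch can be cross-checked against the uapi bit definitions. The function names below are made up for this sketch; only the PERF_SAMPLE_* constants come from <linux/perf_event.h>.

```c
#include <linux/perf_event.h>
#include <stdint.h>

/* Old expectation in test-record-C0: uses PERF_SAMPLE_ID. */
static uint64_t sample_type_old(void)
{
	return PERF_SAMPLE_IP | PERF_SAMPLE_TID | PERF_SAMPLE_TIME |
	       PERF_SAMPLE_ID | PERF_SAMPLE_CPU | PERF_SAMPLE_PERIOD;
}

/* New expectation in test-record-C0: ID replaced by IDENTIFIER. */
static uint64_t sample_type_record_c0(void)
{
	return PERF_SAMPLE_IP | PERF_SAMPLE_TID | PERF_SAMPLE_TIME |
	       PERF_SAMPLE_CPU | PERF_SAMPLE_PERIOD | PERF_SAMPLE_IDENTIFIER;
}

/* New expectation in system-wide-dummy: same, but without PERIOD. */
static uint64_t sample_type_dummy(void)
{
	return PERF_SAMPLE_IP | PERF_SAMPLE_TID | PERF_SAMPLE_TIME |
	       PERF_SAMPLE_CPU | PERF_SAMPLE_IDENTIFIER;
}
```

These evaluate to 455, 65927 and 65671 respectively, matching the values in the diff.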
On 04-Sep-23 8:03 AM, Yang Jihong wrote:
> Changes since_v7:
> - The condition for requiring system_wide sideband is changed to
> "as long as a non-dummy event exists" (patch4).
> - Modify the corresponding test case to record only dummy event (patch6).
> - Thanks to tested-by tag from Ravi, but because the solution is modified,
> the tested-by tag of Ravi is not added to this version.
I've re-tested v8 with my simple test.
Tested-by: Ravi Bangoria <[email protected]>
On Tue, Sep 12, 2023 at 02:41:56PM +0530, Ravi Bangoria wrote:
> I've re-tested v8 with my simple test.
>
> Tested-by: Ravi Bangoria <[email protected]>
Thanks, applied to the csets that were still sitting in an unpublished
perf-tools-next local branch, soon public.
- Arnaldo
Hello,
On Tue, Sep 12, 2023 at 1:32 PM Arnaldo Carvalho de Melo
<[email protected]> wrote:
> Thanks, applied to the csets that were still sitting in an unpublished
> perf-tools-next local branch, soon public.
Now I'm seeing a perf test failure on perf-tools-next.
$ sudo ./perf test -v 17
17: Setup struct perf_event_attr :
--- start ---
test child forked, pid 1616372
Using CPUID GenuineIntel-6-8C-1
running './tests/attr/test-record-branch-filter-k'
running './tests/attr/test-record-period'
running './tests/attr/test-record-graph-default'
test limitation '!aarch64'
excluded architecture list ['aarch64']
running './tests/attr/test-record-branch-filter-any'
running './tests/attr/test-record-data'
running './tests/attr/test-stat-detailed-1'
running './tests/attr/test-record-branch-filter-hv'
running './tests/attr/test-record-graph-fp'
test limitation '!aarch64'
excluded architecture list ['aarch64']
running './tests/attr/test-record-basic'
running './tests/attr/test-record-group2'
running './tests/attr/test-stat-detailed-3'
running './tests/attr/test-record-branch-any'
running './tests/attr/test-record-branch-filter-ind_call'
running './tests/attr/test-stat-detailed-2'
running './tests/attr/test-record-group1'
running './tests/attr/test-record-count'
running './tests/attr/test-record-no-samples'
running './tests/attr/test-record-graph-dwarf'
running './tests/attr/test-record-spe-period'
test limitation 'aarch64'
skipped [x86_64] './tests/attr/test-record-spe-period'
running './tests/attr/test-record-graph-fp-aarch64'
test limitation 'aarch64'
skipped [x86_64] './tests/attr/test-record-graph-fp-aarch64'
running './tests/attr/test-record-freq'
running './tests/attr/test-record-pfm-period'
running './tests/attr/test-record-no-buffering'
running './tests/attr/test-record-no-inherit'
running './tests/attr/test-record-branch-filter-any_ret'
running './tests/attr/test-record-raw'
running './tests/attr/test-record-dummy-C0'
expected read_format=4, got 20
FAILED './tests/attr/test-record-dummy-C0' - match failure
test child finished with -1
---- end ----
Setup struct perf_event_attr: FAILED!
Hello,
On 2023/9/16 8:14, Namhyung Kim wrote:
> Now I'm seeing a perf test failure on perf-tools-next.
Uh.. the kernel I was using before didn't support PERF_FORMAT_LOST, so
I forgot about supporting PERF_FORMAT_LOST. I've updated the kernel and
retested it.
The link to the fixed patch is as follows:
https://lore.kernel.org/all/[email protected]/
Thanks,
Yang
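The "expected read_format=4, got 20" message fits the uapi bit values: PERF_FORMAT_ID is 4 and PERF_FORMAT_LOST is 16, so a kernel that supports LOST produces 4|16 = 20. The sketch below uses local copies of those two bits so that it builds against older headers. The function names are hypothetical, and the assumption that the expectation should accept either value is taken from the "read_format=4|20" line in system-wide-dummy, not from the fix itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the uapi read_format bits used in this thread. */
#define SKETCH_FORMAT_ID	(1u << 2)	/* PERF_FORMAT_ID   = 4  */
#define SKETCH_FORMAT_LOST	(1u << 4)	/* PERF_FORMAT_LOST = 16 */

/* read_format requested depending on kernel support for LOST. */
static uint64_t requested_read_format(bool kernel_has_format_lost)
{
	uint64_t fmt = SKETCH_FORMAT_ID;

	if (kernel_has_format_lost)
		fmt |= SKETCH_FORMAT_LOST;
	return fmt;
}

/* An expectation written as "4|20" accepts either value. */
static bool read_format_matches(uint64_t got)
{
	return got == 4 || got == 20;
}
```

An expectation of exactly 4 passes only on kernels without PERF_FORMAT_LOST, which would explain why the failure appeared after the kernel was updated.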
On Sat, Sep 16, 2023 at 2:24 AM Yang Jihong <[email protected]> wrote:
> Uh.. the kernel I was using before didn't support PERF_FORMAT_LOST, so
> I forgot about supporting PERF_FORMAT_LOST. I've updated the kernel and
> retested it.
>
> The link to the fixed patch is as follows:
> https://lore.kernel.org/all/[email protected]/
Thank you for the quick fix!
Namhyung