2021-07-08 16:06:29

by Liang, Kan

Subject: [PATCH] perf record: Add a dummy event for a hybrid system

From: Kan Liang <[email protected]>

Some symbols may not be resolved if a user only monitors one type of PMU.

$ sudo perf record -e cpu_atom/branch-instructions/ ./big_small_workload
$ sudo perf report --stdio
# Overhead  Command    Shared Object      Symbol
# ........  .........  .................  ......................................
#
    28.02%  perf-exec  [unknown]          [.] 0x0000000000401cf6
    11.32%  perf-exec  [unknown]          [.] 0x0000000000401d04
    10.90%  perf-exec  [unknown]          [.] 0x0000000000401d11
    10.61%  perf-exec  [unknown]          [.] 0x0000000000401cfc

To resolve symbols, perf needs the side-band events (e.g., COMM) that
the kernel generates. To decide whether to generate a side-band event,
the kernel relies on event_filter_match() to filter out unrelated
events. On a hybrid system, event_filter_match() additionally checks
the CPU mask of the currently enabled PMU. If an event is collected on
a CPU that doesn't have an enabled PMU, it's treated as an unrelated
event.
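
As a rough illustration of the check described above, here is a minimal
user-space sketch, not the kernel code. The names model_pmu, model_event
and model_filter_match() are hypothetical, and the PMU CPU mask is
assumed to fit in 64 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model types; these are NOT the kernel's structures. */
struct model_pmu {
	const char *name;
	uint64_t cpumask;	/* bit N set => this PMU covers CPU N */
};

struct model_event {
	const struct model_pmu *pmu;
	int cpu;		/* -1 => not bound to a specific CPU */
};

/*
 * Returns true if activity on 'cpu' is considered related to 'ev'.
 * Assumption: on a hybrid system the CPU must also be in the mask
 * of the PMU the event belongs to.
 */
static bool model_filter_match(const struct model_event *ev, int cpu)
{
	if (cpu < 0 || cpu >= 64)
		return false;

	bool cpu_ok = (ev->cpu == -1) || (ev->cpu == cpu);
	bool pmu_ok = ((ev->pmu->cpumask >> cpu) & 1u) != 0;

	return cpu_ok && pmu_ok;
}
```

With a cpu_atom mask that covers only CPUs 4-7, model_filter_match()
rejects anything that happens on CPU 0, which is the situation the
side-band events generated on a big core are in.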

The "big_small_workload" is created on a big core, but runs on a small
core. Its side-band events are filtered out, because the user only
monitors the PMU of the small core; the big core PMU is not enabled.

For a hybrid system, a dummy event is required to generate the complete
side-band events.
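
A minimal sketch of why the dummy event helps. It assumes that a
side-band record is emitted when at least one opened event passes the
filter for the CPU where the record originates, and that the dummy event
is not restricted to one core type. All names (sketch_event,
sketch_sideband_delivered(), SKETCH_ALL_CPUS) are hypothetical and are
not part of perf or the kernel.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: a mask with every bit set models "valid on any CPU". */
#define SKETCH_ALL_CPUS UINT64_MAX

/* Hypothetical model of an opened event and the CPUs it matches. */
struct sketch_event {
	const char *name;
	uint64_t cpumask;	/* bit N set => event matches on CPU N */
};

/*
 * Returns true if a side-band record generated on 'cpu' would be
 * delivered to at least one event in the list.
 */
static bool sketch_sideband_delivered(const struct sketch_event *evs,
				      size_t n, int cpu)
{
	if (cpu < 0 || cpu >= 64)
		return false;

	for (size_t i = 0; i < n; i++) {
		if ((evs[i].cpumask >> cpu) & 1u)
			return true;
	}
	return false;
}
```

In this model, a list holding only a cpu_atom event (CPUs 4-7) drops a
record generated on CPU 0, while the same list with an extra entry whose
mask is SKETCH_ALL_CPUS delivers it.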

Signed-off-by: Kan Liang <[email protected]>
---
tools/perf/builtin-record.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 3337b5f..99607b9 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -891,11 +891,12 @@ static int record__open(struct record *rec)
 	int rc = 0;
 
 	/*
-	 * For initial_delay or system wide, we need to add a dummy event so
-	 * that we can track PERF_RECORD_MMAP to cover the delay of waiting or
-	 * event synthesis.
+	 * For initial_delay or system wide or a hybrid system, we need to
+	 * add a dummy event so that we can track PERF_RECORD_MMAP to cover
+	 * the delay of waiting or event synthesis.
 	 */
-	if (opts->initial_delay || target__has_cpu(&opts->target)) {
+	if (opts->initial_delay || target__has_cpu(&opts->target) ||
+	    perf_pmu__has_hybrid()) {
 		pos = evlist__get_tracking_event(evlist);
 		if (!evsel__is_dummy_event(pos)) {
 			/* Set up dummy event. */
--
2.7.4


2021-07-09 05:34:44

by Namhyung Kim

Subject: Re: [PATCH] perf record: Add a dummy event for a hybrid system

On Thu, Jul 8, 2021 at 9:05 AM <[email protected]> wrote:
>
> From: Kan Liang <[email protected]>
>
> Some symbols may not be resolved if a user only monitor one type of PMU.
>
> $ sudo perf record -e cpu_atom/branch-instructions/ ./big_small_workload
> $ sudo perf report --stdio
> # Overhead  Command    Shared Object      Symbol
> # ........  .........  .................  ......................................
> #
>     28.02%  perf-exec  [unknown]          [.] 0x0000000000401cf6
>     11.32%  perf-exec  [unknown]          [.] 0x0000000000401d04
>     10.90%  perf-exec  [unknown]          [.] 0x0000000000401d11
>     10.61%  perf-exec  [unknown]          [.] 0x0000000000401cfc
>
> To parse symbols, the side-band events, e.g., COMM, which are generated
> by the kernel are required. To decide whether to generate the side-band
> event, the kernel relies on the event_filter_match() to filter the
> unrelated events. On a hybrid system, the event_filter_match() further
> checks the CPU mask of the current enabled PMU. If an event is collected
> on the CPU which doesn't have an enabled PMU, it's treated as an
> unrelated event.
>
> The "big_small_workload" is created in a big core, but runs on a small
> core. The side-band events are filtered, because the user only monitors
> the PMU of the small core. The big core PMU is not enabled.
>
> For a hybrid system, a dummy event is required to generate the complete
> side-band events.
>
> Signed-off-by: Kan Liang <[email protected]>

Acked-by: Namhyung Kim <[email protected]>

Thanks,
Namhyung


> ---
> tools/perf/builtin-record.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 3337b5f..99607b9 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -891,11 +891,12 @@ static int record__open(struct record *rec)
> int rc = 0;
>
> /*
> - * For initial_delay or system wide, we need to add a dummy event so
> - * that we can track PERF_RECORD_MMAP to cover the delay of waiting or
> - * event synthesis.
> + * For initial_delay or system wide or a hybrid system, we need to
> + * add a dummy event so that we can track PERF_RECORD_MMAP to cover
> + * the delay of waiting or event synthesis.
> */
> - if (opts->initial_delay || target__has_cpu(&opts->target)) {
> + if (opts->initial_delay || target__has_cpu(&opts->target) ||
> + perf_pmu__has_hybrid()) {
> pos = evlist__get_tracking_event(evlist);
> if (!evsel__is_dummy_event(pos)) {
> /* Set up dummy event. */
> --
> 2.7.4
>

2021-07-09 12:53:51

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH] perf record: Add a dummy event for a hybrid system

Em Thu, Jul 08, 2021 at 10:32:13PM -0700, Namhyung Kim escreveu:
> On Thu, Jul 8, 2021 at 9:05 AM <[email protected]> wrote:
> >
> > From: Kan Liang <[email protected]>
> >
> > Some symbols may not be resolved if a user only monitor one type of PMU.
> >
> > $ sudo perf record -e cpu_atom/branch-instructions/ ./big_small_workload
> > $ sudo perf report --stdio
> > # Overhead  Command    Shared Object      Symbol
> > # ........  .........  .................  ......................................
> > #
> >     28.02%  perf-exec  [unknown]          [.] 0x0000000000401cf6
> >     11.32%  perf-exec  [unknown]          [.] 0x0000000000401d04
> >     10.90%  perf-exec  [unknown]          [.] 0x0000000000401d11
> >     10.61%  perf-exec  [unknown]          [.] 0x0000000000401cfc
> >
> > To parse symbols, the side-band events, e.g., COMM, which are generated
> > by the kernel are required. To decide whether to generate the side-band
> > event, the kernel relies on the event_filter_match() to filter the
> > unrelated events. On a hybrid system, the event_filter_match() further
> > checks the CPU mask of the current enabled PMU. If an event is collected
> > on the CPU which doesn't have an enabled PMU, it's treated as an
> > unrelated event.
> >
> > The "big_small_workload" is created in a big core, but runs on a small
> > core. The side-band events are filtered, because the user only monitors
> > the PMU of the small core. The big core PMU is not enabled.
> >
> > For a hybrid system, a dummy event is required to generate the complete
> > side-band events.
> >
> > Signed-off-by: Kan Liang <[email protected]>
>
> Acked-by: Namhyung Kim <[email protected]>

Thanks, applied.

- Arnaldo