2021-07-07 05:58:51

by Jin Yao

Subject: [PATCH v2] perf stat: Merge uncore events by default for hybrid platform

On hybrid platforms, perf stat by default aggregates and reports event
counts per PMU. For example:

  # perf stat -e cycles -a true

   Performance counter stats for 'system wide':

           1,400,445      cpu_core/cycles/
             680,881      cpu_atom/cycles/

         0.001770773 seconds time elapsed

That method is not suitable for uncore events, because uncore has nothing
to do with hybrid. So for uncore events, we aggregate the event counts from
all PMUs and report them without the individual PMU names.
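
As a rough illustration of what "aggregate event counts from all PMUs" means, here is a minimal sketch that sums per-PMU readings sharing the same event string. The struct pmu_count type and the merged_count() helper are hypothetical names made up for this example. They are not perf data structures, and the real merging is done in collect_all_aliases() in stat-display.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-PMU reading, for illustration only (not a perf type). */
struct pmu_count {
	const char *pmu;		/* PMU instance, e.g. "uncore_arb_0" */
	const char *event;		/* event string, e.g. "event=0x81,umask=0x1" */
	unsigned long long count;	/* value read from that PMU instance */
};

/*
 * Sum the counts of every entry whose event string equals 'event',
 * regardless of which PMU instance produced it.
 */
unsigned long long merged_count(const struct pmu_count *c, size_t n,
				const char *event)
{
	unsigned long long sum = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(c[i].event, event) == 0)
			sum += c[i].count;

	return sum;
}
```

With one entry per uncore_arb_N instance, a single call per event string yields the one line that the merged output prints for that event.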

Before:

  # perf stat -e arb/event=0x81,umask=0x1/,arb/event=0x84,umask=0x1/ -a true

   Performance counter stats for 'system wide':

               2,058      uncore_arb_0/event=0x81,umask=0x1/
               2,028      uncore_arb_1/event=0x81,umask=0x1/
                   0      uncore_arb_0/event=0x84,umask=0x1/
                   0      uncore_arb_1/event=0x84,umask=0x1/

         0.000614498 seconds time elapsed

After:

  # perf stat -e arb/event=0x81,umask=0x1/,arb/event=0x84,umask=0x1/ -a true

   Performance counter stats for 'system wide':

               3,996      arb/event=0x81,umask=0x1/
                   0      arb/event=0x84,umask=0x1/

         0.000630046 seconds time elapsed

The '--no-merge' option keeps working for uncore events:

  # perf stat -e arb/event=0x81,umask=0x1/,arb/event=0x84,umask=0x1/ --no-merge true

   Performance counter stats for 'system wide':

               1,952      uncore_arb_0/event=0x81,umask=0x1/
               1,921      uncore_arb_1/event=0x81,umask=0x1/
                   0      uncore_arb_0/event=0x84,umask=0x1/
                   0      uncore_arb_1/event=0x84,umask=0x1/

         0.000575536 seconds time elapsed

Signed-off-by: Jin Yao <[email protected]>
---
v2:
- Use evsel__find_pmu() to find uncore pmu.
- Add hybrid_uniquify() to check whether to uniquify the event name on hybrid.

tools/perf/builtin-stat.c | 3 ---
tools/perf/util/stat-display.c | 14 +++++++++++++-
2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index f9f74a514315..b67a44982b61 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -2442,9 +2442,6 @@ int cmd_stat(int argc, const char **argv)

 	evlist__check_cpu_maps(evsel_list);

-	if (perf_pmu__has_hybrid())
-		stat_config.no_merge = true;
-
 	/*
 	 * Initialize thread_map with comm names,
 	 * so we could print it out on output.
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index c588a6b7a8db..87f77016b9cc 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -593,6 +593,18 @@ static void collect_all_aliases(struct perf_stat_config *config, struct evsel *c
 	}
 }

+static bool is_uncore(struct evsel *evsel)
+{
+	struct perf_pmu *pmu = evsel__find_pmu(evsel);
+
+	return pmu && pmu->is_uncore;
+}
+
+static bool hybrid_uniquify(struct evsel *evsel)
+{
+	return perf_pmu__has_hybrid() && !is_uncore(evsel);
+}
+
 static bool collect_data(struct perf_stat_config *config, struct evsel *counter,
 			    void (*cb)(struct perf_stat_config *config, struct evsel *counter, void *data,
 				       bool first),
@@ -601,7 +613,7 @@ static bool collect_data(struct perf_stat_config *config, struct evsel *counter,
 	if (counter->merged_stat)
 		return false;
 	cb(config, counter, data, true);
-	if (config->no_merge)
+	if (config->no_merge || hybrid_uniquify(counter))
 		uniquify_event_name(counter);
 	else if (counter->auto_merge_stats)
 		collect_all_aliases(config, counter, cb, data);
--
2.27.0
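
To make the effect of the changed condition in collect_data() easier to see, the branch can be modelled as a small pure function. This is only a sketch under stated assumptions: struct counter_model, enum display_action and choose_action() are hypothetical names invented here, and the boolean fields stand in for config->no_merge, perf_pmu__has_hybrid(), the result of is_uncore() (which in the patch is also false when no PMU is found) and counter->auto_merge_stats.

```c
#include <stdbool.h>

/* Hypothetical outcome of the branch in collect_data(). */
enum display_action {
	UNIQUIFY_NAME,	/* print per PMU, e.g. cpu_core/cycles/ */
	MERGE_ALIASES,	/* sum all aliases and print one line */
	PRINT_AS_IS,	/* neither branch is taken */
};

/* Hypothetical flattened view of the inputs to that branch. */
struct counter_model {
	bool no_merge;		/* user passed --no-merge */
	bool has_hybrid;	/* running on a hybrid platform */
	bool is_uncore;		/* event belongs to an uncore PMU */
	bool auto_merge_stats;	/* event is eligible for merging */
};

enum display_action choose_action(const struct counter_model *m)
{
	/* Mirrors hybrid_uniquify(): only non-uncore events on hybrid. */
	bool hybrid_uniquify = m->has_hybrid && !m->is_uncore;

	if (m->no_merge || hybrid_uniquify)
		return UNIQUIFY_NAME;
	if (m->auto_merge_stats)
		return MERGE_ALIASES;
	return PRINT_AS_IS;
}
```

Under this model, a core event on a hybrid platform is always uniquified, while an uncore event on the same platform is merged unless --no-merge is given, which matches the three outputs shown in the commit message.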


2021-07-11 16:07:15

by Jiri Olsa

Subject: Re: [PATCH v2] perf stat: Merge uncore events by default for hybrid platform

On Wed, Jul 07, 2021 at 01:56:52PM +0800, Jin Yao wrote:
> On hybrid platform, by default stat aggregates and reports the event counts
> per pmu. For example,
>
> # perf stat -e cycles -a true
>
> Performance counter stats for 'system wide':
>
> 1,400,445 cpu_core/cycles/
> 680,881 cpu_atom/cycles/
>
> 0.001770773 seconds time elapsed
>
> While for uncore events, that's not a suitable method. Uncore has nothing
> to do with hybrid. So for uncore events, we aggregate event counts from all
> PMUs and report the counts without PMUs.
>
> Before:
>
> # perf stat -e arb/event=0x81,umask=0x1/,arb/event=0x84,umask=0x1/ -a true
>
> Performance counter stats for 'system wide':
>
> 2,058 uncore_arb_0/event=0x81,umask=0x1/
> 2,028 uncore_arb_1/event=0x81,umask=0x1/
> 0 uncore_arb_0/event=0x84,umask=0x1/
> 0 uncore_arb_1/event=0x84,umask=0x1/
>
> 0.000614498 seconds time elapsed
>
> After:
>
> # perf stat -e arb/event=0x81,umask=0x1/,arb/event=0x84,umask=0x1/ -a true
>
> Performance counter stats for 'system wide':
>
> 3,996 arb/event=0x81,umask=0x1/
> 0 arb/event=0x84,umask=0x1/
>
> 0.000630046 seconds time elapsed
>
> Of course, we also keep the '--no-merge' working for uncore events.
>
> # perf stat -e arb/event=0x81,umask=0x1/,arb/event=0x84,umask=0x1/ --no-merge true
>
> Performance counter stats for 'system wide':
>
> 1,952 uncore_arb_0/event=0x81,umask=0x1/
> 1,921 uncore_arb_1/event=0x81,umask=0x1/
> 0 uncore_arb_0/event=0x84,umask=0x1/
> 0 uncore_arb_1/event=0x84,umask=0x1/
>
> 0.000575536 seconds time elapsed
>
> Signed-off-by: Jin Yao <[email protected]>
> ---
> v2:
> - Use evsel__find_pmu() to find uncore pmu.
> - Create hybrid_uniquify() to check if uniquify the event name for hybrid.

Acked-by: Jiri Olsa <[email protected]>

thanks,
jirka



2021-07-12 18:13:02

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH v2] perf stat: Merge uncore events by default for hybrid platform

Em Sun, Jul 11, 2021 at 06:02:13PM +0200, Jiri Olsa escreveu:
> On Wed, Jul 07, 2021 at 01:56:52PM +0800, Jin Yao wrote:
<SNIP>
> > While for uncore events, that's not a suitable method. Uncore has nothing
> > to do with hybrid. So for uncore events, we aggregate event counts from all
> > PMUs and report the counts without PMUs.
<SNIP>

> Acked-by: Jiri Olsa <[email protected]>

Thanks, applied to perf/urgent.

- Arnaldo