The metric mixed with uncore event is failed with workload.
root@lkp-csl-2sp5 ~# perf stat -M DRAM_BW_Use -- sleep 1
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (uncore_imc/cas_count_read/).
/bin/dmesg | grep -i perf may provide additional information.
But it works without workload.
root@lkp-csl-2sp5 ~# perf stat -M DRAM_BW_Use
^C
Performance counter stats for 'system wide':
5.38 MiB uncore_imc/cas_count_read/
3.98 MiB uncore_imc/cas_count_write/
1550285007 ns duration_time
1.550285007 seconds time elapsed
The uncore events (e.g. uncore_imc/cas_count_read/ and
uncore_imc/cas_count_write/) have been forced to system wide but duration_time
is not.
Currently, the target system wide is set according to:
1. there's no workload specified
2. there is workload specified but all requested events are system wide events.
For DRAM_BW_Use, the target is not set to system wide since 2 is not
met. Next, the counter creation on CPU will run into some issues.
It should make sense to set the target system wide if there is workload
specified and at least one required event is system wide event.
With this patch,
root@lkp-csl-2sp5 ~# perf stat -M DRAM_BW_Use -- sleep 1
Performance counter stats for 'system wide':
4.81 MiB uncore_imc/cas_count_read/
2.89 MiB uncore_imc/cas_count_write/
1002019036 ns duration_time
1.002019036 seconds time elapsed
Fixes: e3ba76deef23 ("perf tools: Force uncore events to system wide monitoring")
Signed-off-by: Jin Yao <[email protected]>
---
tools/perf/builtin-stat.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 9207b6c45475..b01ee06b1965 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1841,8 +1841,8 @@ static void setup_system_wide(int forks)
* conditions is met:
*
* - there's no workload specified
- * - there is workload specified but all requested
- * events are system wide events
+ * - there is workload specified but at least one requested
+ * event is system wide event
*/
if (!target__none(&target))
return;
@@ -1851,13 +1851,16 @@ static void setup_system_wide(int forks)
target.system_wide = true;
else {
struct evsel *counter;
+ bool system_wide = false;
evlist__for_each_entry(evsel_list, counter) {
- if (!counter->core.system_wide)
- return;
+ if (counter->core.system_wide) {
+ system_wide = true;
+ break;
+ }
}
- if (evsel_list->core.nr_entries)
+ if (evsel_list->core.nr_entries && system_wide)
target.system_wide = true;
}
}
--
2.17.1
On Mon, Apr 27, 2020 at 10:41:16PM +0800, Jin Yao wrote:
SNIP
> index 9207b6c45475..b01ee06b1965 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -1841,8 +1841,8 @@ static void setup_system_wide(int forks)
> * conditions is met:
> *
> * - there's no workload specified
> - * - there is workload specified but all requested
> - * events are system wide events
> + * - there is workload specified but at least one requested
> + * event is system wide event
> */
> if (!target__none(&target))
> return;
> @@ -1851,13 +1851,16 @@ static void setup_system_wide(int forks)
> target.system_wide = true;
> else {
> struct evsel *counter;
> + bool system_wide = false;
>
> evlist__for_each_entry(evsel_list, counter) {
> - if (!counter->core.system_wide)
> - return;
> + if (counter->core.system_wide) {
> + system_wide = true;
> + break;
> + }
I wonder this would break some expectations.. would it be
more safe to detect duration event and bypass it from the
decission? but maybe the case I'm worried about is not a
problem at all.. Andi?
jirka
> }
>
> - if (evsel_list->core.nr_entries)
> + if (evsel_list->core.nr_entries && system_wide)
> target.system_wide = true;
> }
> }
> --
> 2.17.1
>
> I wonder this would break some expectations.. would it be
> more safe to detect duration event and bypass it from the
> decission? but maybe the case I'm worried about is not a
> problem at all.. Andi?
Don't see what it would break.
Yes maybe we need to special case duration_time more, but that would
be a much bigger patch.
-Andi
On Tue, Apr 28, 2020 at 02:19:22PM -0700, Andi Kleen wrote:
> > I wonder this would break some expectations.. would it be
> > more safe to detect duration event and bypass it from the
> > decission? but maybe the case I'm worried about is not a
> > problem at all.. Andi?
>
> Don't see what it would break.
>
> Yes maybe we need to special case duration_time more, but that would
> be a much bigger patch.
would below change work for you? if duration_time is the only
case, I'd rather go with the special case for it
jirka
---
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 9207b6c45475..2518204cffd1 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1853,6 +1853,8 @@ static void setup_system_wide(int forks)
struct evsel *counter;
evlist__for_each_entry(evsel_list, counter) {
+ if (counter->tool_event == PERF_TOOL_DURATION_TIME)
+ continue;
if (!counter->core.system_wide)
return;
}
Hi Jiri,
On 4/29/2020 4:16 PM, Jiri Olsa wrote:
> On Tue, Apr 28, 2020 at 02:19:22PM -0700, Andi Kleen wrote:
>>> I wonder this would break some expectations.. would it be
>>> more safe to detect duration event and bypass it from the
>>> decission? but maybe the case I'm worried about is not a
>>> problem at all.. Andi?
>>
>> Don't see what it would break.
>>
>> Yes maybe we need to special case duration_time more, but that would
>> be a much bigger patch.
>
> would below change work for you? if duration_time is the only
> case, I'd rather go with the special case for it
>
> jirka
>
>
Just tested, for the case of DRAM_BW_Use, it can work.
root@lkp-csl-2sp5 ~# perf stat -M DRAM_BW_Use -- sleep 1
Performance counter stats for 'system wide':
9.54 MiB uncore_imc/cas_count_read/
3.26 MiB uncore_imc/cas_count_write/
1002109793 ns duration_time
1.002109793 seconds time elapsed
Thanks
Jin Yao
> ---
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 9207b6c45475..2518204cffd1 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -1853,6 +1853,8 @@ static void setup_system_wide(int forks)
> struct evsel *counter;
>
> evlist__for_each_entry(evsel_list, counter) {
> + if (counter->tool_event == PERF_TOOL_DURATION_TIME)
> + continue;
> if (!counter->core.system_wide)
> return;
> }
>