2020-09-22 03:08:38

by Jin Yao

Subject: [PATCH] perf stat: Skip duration_time in setup_system_wide

Some metrics (such as DRAM_BW_Use) consist of uncore events and
duration_time. For uncore events, counter->core.system_wide is
true. But for duration_time, counter->core.system_wide is false,
so target.system_wide is left set to false.

Then 'enable_on_exec' is set in the perf_event_attr of the uncore
event, and the kernel returns an error when trying to open it.

This patch skips duration_time in setup_system_wide so that
target.system_wide is set to true for an evlist of uncore
events + duration_time.

Before (tested on skylake desktop):

# perf stat -M DRAM_BW_Use -- sleep 1
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (arb/event=0x84,umask=0x1/).
/bin/dmesg | grep -i perf may provide additional information.

After:

# perf stat -M DRAM_BW_Use -- sleep 1

Performance counter stats for 'system wide':

169 arb/event=0x84,umask=0x1/ # 0.00 DRAM_BW_Use
40,427 arb/event=0x81,umask=0x1/
1,000,902,197 ns duration_time

1.000902197 seconds time elapsed

Fixes: 648b5af3f3ae ("libperf: Move 'system_wide' from 'struct evsel' to 'struct perf_evsel'")
Signed-off-by: Jin Yao <[email protected]>
---
tools/perf/builtin-stat.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 7f8d756d9408..9bcc93bc0973 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -2047,8 +2047,10 @@ static void setup_system_wide(int forks)
 		struct evsel *counter;

 		evlist__for_each_entry(evsel_list, counter) {
-			if (!counter->core.system_wide)
+			if (!counter->core.system_wide &&
+			    strcmp(counter->name, "duration_time")) {
 				return;
+			}
 		}

 		if (evsel_list->core.nr_entries)
--
2.17.1
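
The effect of the changed condition can be modelled outside of perf.
The sketch below is only an illustration: struct model_evsel, the two
check functions and the dram_bw_use_events array are made-up names
that stand in for the evlist walk in setup_system_wide(); they are
not perf code. It assumes, as described above, that the uncore events
have system_wide set and duration_time does not.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct evsel: only the fields the check reads. */
struct model_evsel {
	const char *name;
	bool system_wide;
};

/* Check before the patch: any event that is not system wide vetoes. */
bool all_system_wide_old(const struct model_evsel *evs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!evs[i].system_wide)
			return false;
	}
	return n > 0;
}

/* Check after the patch: duration_time no longer vetoes. */
bool all_system_wide_new(const struct model_evsel *evs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!evs[i].system_wide &&
		    strcmp(evs[i].name, "duration_time"))
			return false;
	}
	return n > 0;
}

/* The event list perf builds for DRAM_BW_Use in the report above. */
const struct model_evsel dram_bw_use_events[] = {
	{ "arb/event=0x84,umask=0x1/", true },
	{ "arb/event=0x81,umask=0x1/", true },
	{ "duration_time", false },
};
```

With this list, all_system_wide_old(dram_bw_use_events, 3) is false
and all_system_wide_new(dram_bw_use_events, 3) is true, which matches
the "Before" failure and the "After" system wide run.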


2020-09-22 18:00:06

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH] perf stat: Skip duration_time in setup_system_wide

Em Tue, Sep 22, 2020 at 09:50:04AM +0800, Jin Yao escreveu:
> Some metrics (such as DRAM_BW_Use) consist of uncore events and
> duration_time. For uncore events, counter->core.system_wide is
> true. But for duration_time, counter->core.system_wide is false
> so target.system_wide is set to false.
>
> Then 'enable_on_exec' is set in perf_event_attr of uncore event.
> Kernel will return error when trying to open the uncore event.
>
> This patch skips the duration_time in setup_system_wide then
> target.system_wide will be set to true for the evlist of uncore
> events + duration_time.
>
> Before (tested on skylake desktop):
>
> # perf stat -M DRAM_BW_Use -- sleep 1
> Error:
> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (arb/event=0x84,umask=0x1/).
> /bin/dmesg | grep -i perf may provide additional information.
>
> After:
>
> # perf stat -M DRAM_BW_Use -- sleep 1
>
> Performance counter stats for 'system wide':
>
> 169 arb/event=0x84,umask=0x1/ # 0.00 DRAM_BW_Use
> 40,427 arb/event=0x81,umask=0x1/
> 1,000,902,197 ns duration_time
>
> 1.000902197 seconds time elapsed
>
> Fixes: 648b5af3f3ae ("libperf: Move 'system_wide' from 'struct evsel' to 'struct perf_evsel'")

Humm, what makes you think that this cset was the one introducing this
problem? It just moves evsel->system_wide to evsel->core.system_wide.

- Arnaldo


> Signed-off-by: Jin Yao <[email protected]>
> ---
> tools/perf/builtin-stat.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 7f8d756d9408..9bcc93bc0973 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -2047,8 +2047,10 @@ static void setup_system_wide(int forks)
>  		struct evsel *counter;
>
>  		evlist__for_each_entry(evsel_list, counter) {
> -			if (!counter->core.system_wide)
> +			if (!counter->core.system_wide &&
> +			    strcmp(counter->name, "duration_time")) {
>  				return;
> +			}
>  		}
>
>  		if (evsel_list->core.nr_entries)
> --
> 2.17.1
>

--

- Arnaldo

2020-09-22 18:07:38

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH] perf stat: Skip duration_time in setup_system_wide

Em Tue, Sep 22, 2020 at 02:56:30PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Tue, Sep 22, 2020 at 09:50:04AM +0800, Jin Yao escreveu:
> > Some metrics (such as DRAM_BW_Use) consist of uncore events and
> > duration_time. For uncore events, counter->core.system_wide is
> > true. But for duration_time, counter->core.system_wide is false
> > so target.system_wide is set to false.
> >
> > Then 'enable_on_exec' is set in perf_event_attr of uncore event.
> > Kernel will return error when trying to open the uncore event.
> >
> > This patch skips the duration_time in setup_system_wide then
> > target.system_wide will be set to true for the evlist of uncore
> > events + duration_time.
> >
> > Before (tested on skylake desktop):
> >
> > # perf stat -M DRAM_BW_Use -- sleep 1
> > Error:
> > The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (arb/event=0x84,umask=0x1/).
> > /bin/dmesg | grep -i perf may provide additional information.
> >
> > After:
> >
> > # perf stat -M DRAM_BW_Use -- sleep 1
> >
> > Performance counter stats for 'system wide':
> >
> > 169 arb/event=0x84,umask=0x1/ # 0.00 DRAM_BW_Use
> > 40,427 arb/event=0x81,umask=0x1/
> > 1,000,902,197 ns duration_time
> >
> > 1.000902197 seconds time elapsed
> >
> > Fixes: 648b5af3f3ae ("libperf: Move 'system_wide' from 'struct evsel' to 'struct perf_evsel'")
>
> Humm, what makes you think that this cset was the one introducing this
> problem? It just moves evsel->system_wide to evsel->core.system_wide.

Apart from that I reproduced the problem and after applying your patch
it seems cured:

[acme@quaco perf]$ grep 'model name' -m1 /proc/cpuinfo
model name : Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz

Before (with -v to see details):

[root@quaco ~]# perf stat -v -M DRAM_BW_Use -- sleep 1
Using CPUID GenuineIntel-6-8E-A
metric expr 64 * ( arb@event\=0x81\,umask\=0x1@ + arb@event\=0x84\,umask\=0x1@ ) / 1000000 / duration_time / 1000 for DRAM_BW_Use
found event duration_time
found event arb/event=0x84,umask=0x1/
found event arb/event=0x81,umask=0x1/
adding {arb/event=0x84,umask=0x1/,arb/event=0x81,umask=0x1/}:W,duration_time
Control descriptor is not initialized
Warning:
arb/event=0x84,umask=0x1/ event is not supported by the kernel.
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (arb/event=0x84,umask=0x1/).
/bin/dmesg | grep -i perf may provide additional information.

[root@quaco ~]#

After:

[root@quaco ~]# perf stat -M DRAM_BW_Use -- sleep 1

Performance counter stats for 'system wide':

2,806 arb/event=0x84,umask=0x1/ # 0.63 DRAM_BW_Use
10,001,820 arb/event=0x81,umask=0x1/
1,016,875,686 ns duration_time

1.016875686 seconds time elapsed

[root@quaco ~]#
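
As a sanity check on the 0.63 printed above, the metric expression
from the -v output can be evaluated by hand. The sketch below is a
plain C transcription of that expression; the function name is made
up, and it assumes duration_time enters the expression in seconds
(the counter itself is printed in ns), which is what the printed
value suggests.

```c
/*
 * DRAM_BW_Use as shown by 'perf stat -v':
 *   64 * (arb 0x81 + arb 0x84) / 1000000 / duration_time / 1000
 * Assumption: duration_time is used in seconds, so the ns count
 * reported by perf is converted first.
 */
double dram_bw_use_metric(double arb_0x81, double arb_0x84,
			  double duration_ns)
{
	double seconds = duration_ns / 1e9;

	return 64.0 * (arb_0x81 + arb_0x84) / 1000000.0 / seconds / 1000.0;
}
```

Calling dram_bw_use_metric(10001820, 2806, 1016875686) gives roughly
0.63, and the counts from the original report (40427, 169,
1000902197) give a value that rounds to 0.00.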

So I'm removing that Fixes: tag and adding this one, which I think is
where "duration_time" was being considered...

Fixes: e3ba76deef23064f ("perf tools: Force uncore events to system wide monitoring")

Also, wouldn't it be better to have the duration_time event with its
evsel->core.system_wide set to true?

- Arnaldo
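
The alternative raised above, flagging duration_time itself as system
wide, can be sketched in the same simplified way. Everything here is
hypothetical (struct model_evsel and both function names are
illustrative, not perf code); it only shows that the unmodified check
would then pass without a strcmp() in setup_system_wide().

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct evsel: only the fields used here. */
struct model_evsel {
	const char *name;
	bool system_wide;
};

/* Alternative fix: flag duration_time as system wide up front. */
void mark_duration_time_system_wide(struct model_evsel *evs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!strcmp(evs[i].name, "duration_time"))
			evs[i].system_wide = true;
	}
}

/* The unpatched check: every event must be system wide. */
bool all_system_wide(const struct model_evsel *evs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!evs[i].system_wide)
			return false;
	}
	return n > 0;
}
```

In this model a list of two uncore events plus duration_time fails
all_system_wide() until mark_duration_time_system_wide() has been
called on it. Where such a flag would actually be set in perf is not
settled in this thread.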

2020-09-23 02:57:09

by Jin Yao

Subject: Re: [PATCH] perf stat: Skip duration_time in setup_system_wide

Hi Arnaldo,

On 9/23/2020 2:02 AM, Arnaldo Carvalho de Melo wrote:
> Em Tue, Sep 22, 2020 at 02:56:30PM -0300, Arnaldo Carvalho de Melo escreveu:
>> Em Tue, Sep 22, 2020 at 09:50:04AM +0800, Jin Yao escreveu:
>>> Some metrics (such as DRAM_BW_Use) consist of uncore events and
>>> duration_time. For uncore events, counter->core.system_wide is
>>> true. But for duration_time, counter->core.system_wide is false
>>> so target.system_wide is set to false.
>>>
>>> Then 'enable_on_exec' is set in perf_event_attr of uncore event.
>>> Kernel will return error when trying to open the uncore event.
>>>
>>> This patch skips the duration_time in setup_system_wide then
>>> target.system_wide will be set to true for the evlist of uncore
>>> events + duration_time.
>>>
>>> Before (tested on skylake desktop):
>>>
>>> # perf stat -M DRAM_BW_Use -- sleep 1
>>> Error:
>>> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (arb/event=0x84,umask=0x1/).
>>> /bin/dmesg | grep -i perf may provide additional information.
>>>
>>> After:
>>>
>>> # perf stat -M DRAM_BW_Use -- sleep 1
>>>
>>> Performance counter stats for 'system wide':
>>>
>>> 169 arb/event=0x84,umask=0x1/ # 0.00 DRAM_BW_Use
>>> 40,427 arb/event=0x81,umask=0x1/
>>> 1,000,902,197 ns duration_time
>>>
>>> 1.000902197 seconds time elapsed
>>>
>>> Fixes: 648b5af3f3ae ("libperf: Move 'system_wide' from 'struct evsel' to 'struct perf_evsel'")
>>
>> Humm, what makes you think that this cset was the one introducing this
>> problem? It just moves evsel->system_wide to evsel->core.system_wide.
>
> Apart from that I reproduced the problem and after applying your patch
> it seems cured:
>
> [acme@quaco perf]$ grep 'model name' -m1 /proc/cpuinfo
> model name : Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
>
> Before (with -v to see details):
>
> [root@quaco ~]# perf stat -v -M DRAM_BW_Use -- sleep 1
> Using CPUID GenuineIntel-6-8E-A
> metric expr 64 * ( arb@event\=0x81\,umask\=0x1@ + arb@event\=0x84\,umask\=0x1@ ) / 1000000 / duration_time / 1000 for DRAM_BW_Use
> found event duration_time
> found event arb/event=0x84,umask=0x1/
> found event arb/event=0x81,umask=0x1/
> adding {arb/event=0x84,umask=0x1/,arb/event=0x81,umask=0x1/}:W,duration_time
> Control descriptor is not initialized
> Warning:
> arb/event=0x84,umask=0x1/ event is not supported by the kernel.
> Error:
> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (arb/event=0x84,umask=0x1/).
> /bin/dmesg | grep -i perf may provide additional information.
>
> [root@quaco ~]#
>
> After:
>
> [root@quaco ~]# perf stat -M DRAM_BW_Use -- sleep 1
>
> Performance counter stats for 'system wide':
>
> 2,806 arb/event=0x84,umask=0x1/ # 0.63 DRAM_BW_Use
> 10,001,820 arb/event=0x81,umask=0x1/
> 1,016,875,686 ns duration_time
>
> 1.016875686 seconds time elapsed
>
> [root@quaco ~]#
>
> So I'm removing that fixes and adding this one, that I think is where
> "duration_time" was being considered...
>
> Fixes: e3ba76deef23064f ("perf tools: Force uncore events to system wide monitoring")
>

Yes, this Fixes: tag is much better, thanks.

> Also, wouldn't it be better to have the duration_time event with its
> evsel->core.system_wide set to true?
>

That looks to be another solution, should be OK too I think. :)

But anyway we need a test.

Thanks
Jin Yao

> - Arnaldo
>