Commit f01642e4912b ("perf metricgroup: Support multiple
events for metricgroup") introduced support for multiple events
in a metric group. But with the current upstream, metric event
names are not printed properly when we run multiple metric
groups with overlapping events.
With the current upstream version, in case of overlapping metric
events, the issue is that we always start the comparison logic
from the beginning of the event list. So events which already
matched some metric group also take part in the comparison.
Because of that, when we have overlapping events, we end up
matching the current metric group's event with an already
matched one.
For example, on a Skylake machine we have the metric events
CoreIPC and Instructions. Both of them need the 'inst_retired.any'
event value. As the events in Instructions are a subset of the
events in CoreIPC, they end up pointing to the same
'inst_retired.any' value.
On a Skylake platform:
command:# ./perf stat -M CoreIPC,Instructions -C 0 sleep 1
Performance counter stats for 'CPU(s) 0':
1,254,992,790 inst_retired.any # 1254992790.0
Instructions
# 1.3 CoreIPC
977,172,805 cycles
1,254,992,756 inst_retired.any
1.000802596 seconds time elapsed
command:# sudo ./perf stat -M UPI,IPC sleep 1
Performance counter stats for 'sleep 1':
948,650 uops_retired.retire_slots
866,182 inst_retired.any # 0.7 IPC
866,182 inst_retired.any
1,175,671 cpu_clk_unhalted.thread
The patch fixes the issue by adding a new bool array 'evlist_used'
to keep track of events which already matched some group, by
setting the corresponding entry to true. So we skip all used
events in the list when we start the comparison logic.
The patch also changes the comparison logic: in case of a match
miss, we discard the whole match and start again from the first
event id in the metric event.
With this patch:
On a Skylake platform:
command:# ./perf stat -M CoreIPC,Instructions -C 0 sleep 1
Performance counter stats for 'CPU(s) 0':
3,348,415 inst_retired.any # 0.3 CoreIPC
11,779,026 cycles
3,348,381 inst_retired.any # 3348381.0
Instructions
1.001649056 seconds time elapsed
command:# ./perf stat -M UPI,IPC sleep 1
Performance counter stats for 'sleep 1':
1,023,148 uops_retired.retire_slots # 1.1 UPI
924,976 inst_retired.any
924,976 inst_retired.any # 0.6 IPC
1,489,414 cpu_clk_unhalted.thread
1.003064672 seconds time elapsed
Signed-off-by: Kajol Jain <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Anju T Sudhakar <[email protected]>
Cc: Ravi Bangoria <[email protected]>
---
tools/perf/util/metricgroup.c | 50 ++++++++++++++++++++++-------------
1 file changed, 31 insertions(+), 19 deletions(-)
Changelog:
v3 -> v4
- Make 'evlist_used' a bool pointer.
v2 -> v3
- Add an array in place of a variable to keep track of matched
  events, because in case of a match miss with the previous
  approach, all events would be rolled over in the next condition.
  So instead, add an array and set an entry once the corresponding
  event has matched some group.
- Suggested by Jiri Olsa
v1 -> v2
- Rather than adding a static variable in metricgroup.c,
  add a new variable in evlist itself with the name 'evlist_iter'
---
diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 02aee946b6c1..6fcd839128cd 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -93,13 +93,16 @@ struct egroup {
static struct evsel *find_evsel_group(struct evlist *perf_evlist,
const char **ids,
int idnum,
- struct evsel **metric_events)
+ struct evsel **metric_events,
+ bool *evlist_used)
{
struct evsel *ev;
- int i = 0;
+ int i = 0, j = 0;
bool leader_found;
evlist__for_each_entry (perf_evlist, ev) {
+ if (evlist_used[j++])
+ continue;
if (!strcmp(ev->name, ids[i])) {
if (!metric_events[i])
metric_events[i] = ev;
@@ -107,22 +110,17 @@ static struct evsel *find_evsel_group(struct evlist *perf_evlist,
if (i == idnum)
break;
} else {
- if (i + 1 == idnum) {
- /* Discard the whole match and start again */
- i = 0;
- memset(metric_events, 0,
- sizeof(struct evsel *) * idnum);
- continue;
- }
-
- if (!strcmp(ev->name, ids[i]))
- metric_events[i] = ev;
- else {
- /* Discard the whole match and start again */
- i = 0;
- memset(metric_events, 0,
- sizeof(struct evsel *) * idnum);
- continue;
+ /* Discard the whole match and start again */
+ i = 0;
+ memset(metric_events, 0,
+ sizeof(struct evsel *) * idnum);
+
+ if (!strcmp(ev->name, ids[i])) {
+ if (!metric_events[i])
+ metric_events[i] = ev;
+ i++;
+ if (i == idnum)
+ break;
}
}
}
@@ -144,7 +142,10 @@ static struct evsel *find_evsel_group(struct evlist *perf_evlist,
!strcmp(ev->name, metric_events[i]->name)) {
ev->metric_leader = metric_events[i];
}
+ j++;
}
+ ev = metric_events[i];
+ evlist_used[ev->idx] = true;
}
return metric_events[0];
@@ -160,6 +161,14 @@ static int metricgroup__setup_events(struct list_head *groups,
int ret = 0;
struct egroup *eg;
struct evsel *evsel;
+ bool *evlist_used;
+
+ evlist_used = (bool *)calloc(perf_evlist->core.nr_entries,
+ sizeof(bool));
+ if (!evlist_used) {
+ ret = -ENOMEM;
+ break;
+ }
list_for_each_entry (eg, groups, nd) {
struct evsel **metric_events;
@@ -170,7 +179,7 @@ static int metricgroup__setup_events(struct list_head *groups,
break;
}
evsel = find_evsel_group(perf_evlist, eg->ids, eg->idnum,
- metric_events);
+ metric_events, evlist_used);
if (!evsel) {
pr_debug("Cannot resolve %s: %s\n",
eg->metric_name, eg->metric_expr);
@@ -194,6 +203,9 @@ static int metricgroup__setup_events(struct list_head *groups,
expr->metric_events = metric_events;
list_add(&expr->nd, &me->head);
}
+
+ free(evlist_used);
+
return ret;
}
--
2.21.0
On Wed, Feb 12, 2020 at 11:11:02AM +0530, Kajol Jain wrote:
SNIP
>
> return metric_events[0];
> @@ -160,6 +161,14 @@ static int metricgroup__setup_events(struct list_head *groups,
> int ret = 0;
> struct egroup *eg;
> struct evsel *evsel;
> + bool *evlist_used;
> +
> + evlist_used = (bool *)calloc(perf_evlist->core.nr_entries,
> + sizeof(bool));
> + if (!evlist_used) {
> + ret = -ENOMEM;
> + break;
hum, how did this compile for you? ;-)
util/metricgroup.c: In function ‘metricgroup__setup_events’:
util/metricgroup.c:170:3: error: break statement not within loop or switch
170 | break;
jirka
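For reference, a minimal compilable sketch of what the error path
presumably needs to look like: the allocation happens before the
list_for_each_entry loop, so a failure must return -ENOMEM rather
than 'break'. Here demo_evlist and setup_events_sketch() are
hypothetical stand-ins for the real perf types, kept only so the
fragment compiles on its own.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the evlist used in the patch context. */
struct demo_evlist {
	struct { int nr_entries; } core;
};

static int setup_events_sketch(struct demo_evlist *perf_evlist)
{
	bool *evlist_used;

	evlist_used = calloc(perf_evlist->core.nr_entries, sizeof(bool));
	if (!evlist_used)
		return -ENOMEM;	/* not inside a loop: return, don't break */

	/* ... match each metric group against the evlist here ... */

	free(evlist_used);
	return 0;
}
```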
> -----Original Message-----
> From: [email protected]
> <[email protected]> On Behalf Of Kajol Jain
> Sent: February 12, 2020 13:41
> To: [email protected]
> Cc: [email protected]; [email protected];
> [email protected]; Jiri Olsa <[email protected]>; Alexander Shishkin
> <[email protected]>; Andi Kleen <[email protected]>; Kan
> Liang <[email protected]>; Peter Zijlstra <[email protected]>; Jin
> Yao <[email protected]>; Madhavan Srinivasan
> <[email protected]>; Anju T Sudhakar <[email protected]>;
> Ravi Bangoria <[email protected]>
> Subject: [PATCH v4] tools/perf/metricgroup: Fix printing event names of metric
> group with multiple events incase of overlapping events
>
> SNIP
Hi Kajol,
I am not sure if it is good to ask a question here :-)
I encountered a perf metricgroup issue: the result is incorrect when the metric includes two or more events.
git log --oneline tools/perf/util/metricgroup.c
3635b27cc058 perf metricgroup: Fix printing event names of metric group with multiple events
f01642e4912b perf metricgroup: Support multiple events for metricgroup
287f2649f791 perf metricgroup: Scale the metric result
I did a simple test, below is the JSON file and result.
[
{
"PublicDescription": "Calculate DDR0 bus actual utilization which vary from DDR0 controller clock frequency",
"BriefDescription": "imx8qm: ddr0 bus actual utilization",
"MetricName": "imx8qm-ddr0-bus-util",
"MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ + imx8_ddr0\\/write\\-cycles\\/ )",
"MetricGroup": "i.MX8QM_DDR0_BUS_UTIL"
}
]
./perf stat -I 1000 -M imx8qm-ddr0-bus-util
# time counts unit events
1.000104250 16720 imx8_ddr0/read-cycles/ # 22921.0 imx8qm-ddr0-bus-util
1.000104250 6201 imx8_ddr0/write-cycles/
2.000525625 8316 imx8_ddr0/read-cycles/ # 12785.5 imx8qm-ddr0-bus-util
2.000525625 2738 imx8_ddr0/write-cycles/
3.000819125 1056 imx8_ddr0/read-cycles/ # 4136.7 imx8qm-ddr0-bus-util
3.000819125 303 imx8_ddr0/write-cycles/
4.001103750 6260 imx8_ddr0/read-cycles/ # 9149.8 imx8qm-ddr0-bus-util
4.001103750 2317 imx8_ddr0/write-cycles/
5.001392750 2084 imx8_ddr0/read-cycles/ # 4516.0 imx8qm-ddr0-bus-util
5.001392750 601 imx8_ddr0/write-cycles/
You can see that only the first result is correct. Could this be reproduced at your side?
Thanks a lot!
Best Regards,
Joakim Zhang
On 2/17/20 1:51 AM, Jiri Olsa wrote:
> On Wed, Feb 12, 2020 at 11:11:02AM +0530, Kajol Jain wrote:
>
> SNIP
>
>>
>> return metric_events[0];
>> @@ -160,6 +161,14 @@ static int metricgroup__setup_events(struct list_head *groups,
>> int ret = 0;
>> struct egroup *eg;
>> struct evsel *evsel;
>> + bool *evlist_used;
>> +
>> + evlist_used = (bool *)calloc(perf_evlist->core.nr_entries,
>> + sizeof(bool));
>> + if (!evlist_used) {
>> + ret = -ENOMEM;
>> + break;
>
> hum, how did this compile for you? ;-)
>
> util/metricgroup.c: In function ‘metricgroup__setup_events’:
> util/metricgroup.c:170:3: error: break statement not within loop or switch
> 170 | break;
>
>
Hi Jiri,
Yes, you are right. My bad, I sent the patch from another branch. Will send the correct one.
Thanks,
Kajol
> jirka
>
On 2/17/20 8:41 AM, Joakim Zhang wrote:
>
>> -----Original Message-----
>> From: [email protected]
>> <[email protected]> On Behalf Of Kajol Jain
>> Sent: February 12, 2020 13:41
>> To: [email protected]
>> Cc: [email protected]; [email protected];
>> [email protected]; Jiri Olsa <[email protected]>; Alexander Shishkin
>> <[email protected]>; Andi Kleen <[email protected]>; Kan
>> Liang <[email protected]>; Peter Zijlstra <[email protected]>; Jin
>> Yao <[email protected]>; Madhavan Srinivasan
>> <[email protected]>; Anju T Sudhakar <[email protected]>;
>> Ravi Bangoria <[email protected]>
>> Subject: [PATCH v4] tools/perf/metricgroup: Fix printing event names of metric
>> group with multiple events incase of overlapping events
>>
>> SNIP
>
> Hi Kajol,
>
> I am not sure if it is good to ask a question here :-)
>
> I encountered a perf metricgroup issue, the result is incorrect when the metric includes more than 2 events.
>
> git log --oneline tools/perf/util/metricgroup.c
> 3635b27cc058 perf metricgroup: Fix printing event names of metric group with multiple events
> f01642e4912b perf metricgroup: Support multiple events for metricgroup
> 287f2649f791 perf metricgroup: Scale the metric result
>
> I did a simple test, below is the JSON file and result.
> [
> {
> "PublicDescription": "Calculate DDR0 bus actual utilization which vary from DDR0 controller clock frequency",
> "BriefDescription": "imx8qm: ddr0 bus actual utilization",
> "MetricName": "imx8qm-ddr0-bus-util",
> "MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ + imx8_ddr0\\/write\\-cycles\\/ )",
> "MetricGroup": "i.MX8QM_DDR0_BUS_UTIL"
> }
> ]
> ./perf stat -I 1000 -M imx8qm-ddr0-bus-util
> # time counts unit events
> 1.000104250 16720 imx8_ddr0/read-cycles/ # 22921.0 imx8qm-ddr0-bus-util
> 1.000104250 6201 imx8_ddr0/write-cycles/
> 2.000525625 8316 imx8_ddr0/read-cycles/ # 12785.5 imx8qm-ddr0-bus-util
> 2.000525625 2738 imx8_ddr0/write-cycles/
> 3.000819125 1056 imx8_ddr0/read-cycles/ # 4136.7 imx8qm-ddr0-bus-util
> 3.000819125 303 imx8_ddr0/write-cycles/
> 4.001103750 6260 imx8_ddr0/read-cycles/ # 9149.8 imx8qm-ddr0-bus-util
> 4.001103750 2317 imx8_ddr0/write-cycles/
> 5.001392750 2084 imx8_ddr0/read-cycles/ # 4516.0 imx8qm-ddr0-bus-util
> 5.001392750 601 imx8_ddr0/write-cycles/
>
> You can see that only the first result is correct, could this be reproduced at you side?
Hi Joakim,
Will try to look into it from my side.
Thanks,
Kajol
>
> Thanks a lot!
>
> Best Regards,
> Joakim Zhang
>
> -----Original Message-----
> From: kajoljain <[email protected]>
> Sent: February 20, 2020 17:54
> To: Joakim Zhang <[email protected]>; [email protected]
> Cc: [email protected]; [email protected]; Jiri Olsa
> <[email protected]>; Alexander Shishkin <[email protected]>;
> Andi Kleen <[email protected]>; Kan Liang <[email protected]>; Peter
> Zijlstra <[email protected]>; Jin Yao <[email protected]>; Madhavan
> Srinivasan <[email protected]>; Anju T Sudhakar
> <[email protected]>; Ravi Bangoria <[email protected]>
> Subject: Re: [PATCH v4] tools/perf/metricgroup: Fix printing event names of
> metric group with multiple events incase of overlapping events
>
>
>
> On 2/17/20 8:41 AM, Joakim Zhang wrote:
> > SNIP
>
> Hi Joakim,
> Will try to look into it from my side.
Thanks Kajol for your help. I looked into this issue, but don't know how to fix it.
The results are always correct if a single event is used in "MetricExpr" with the "-I" parameter, but the results are incorrect when more than one event is used in "MetricExpr".
Hope you can find the root cause :-)
Best Regards,
Joakim Zhang
> Thanks,
> Kajol
> >
> > Thanks a lot!
> >
> > Best Regards,
> > Joakim Zhang
> >
FYI, we noticed the following commit (built with gcc-7):
commit: 5dd4f4ab07f8257f1d113cc2b9ee95f43d576b9d ("[PATCH v4] tools/perf/metricgroup: Fix printing event names of metric group with multiple events incase of overlapping events")
url: https://github.com/0day-ci/linux/commits/Kajol-Jain/tools-perf-metricgroup-Fix-printing-event-names-of-metric-group-with-multiple-events-incase-of-overlapping-events/20200215-093220
in testcase: perf-sanity-tests
with following parameters:
perf_compiler: gcc
ucode: 0x27
on test machine: 8 threads Intel(R) Core(TM) i7-4790 v3 @ 3.60GHz with 6G memory
caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
If you fix the issue, kindly add following tag
Reported-by: kernel test robot <[email protected]>
user :notice: [ 58.688905] LD trace/beauty/perf-in.o
user :err : [ 58.694839] util/metricgroup.c: In function 'metricgroup__setup_events':
user :notice: [ 58.696757] CC util/evlist.o
user :err : [ 58.705739] util/metricgroup.c:170:3: error: break statement not within loop or switch
user :notice: [ 58.710154] LD arch/x86/tests/perf-in.o
user :err : [ 58.720491] break;
user :notice: [ 58.722367] LD arch/x86/perf-in.o
user :err : [ 58.727486] ^~~~~
user :notice: [ 58.729854] LD arch/perf-in.o
user :err : [ 58.734440] mv: cannot stat 'util/.metricgroup.o.tmp': No such file or directory
user :notice: [ 58.738730] CC ui/tui/util.o
user :err : [ 58.741991] make[4]: *** [util/metricgroup.o] Error 1
user :notice: [ 58.743544] CC tests/hists_link.o
user :err : [ 58.746796] make[4]: *** Waiting for unfinished jobs....
user :notice: [ 58.749865] CC ui/browsers/map.o
user :notice: [ 58.796537] CC util/evsel.o
user :notice: [ 58.803941] CC ui/tui/helpline.o
user :notice: [ 58.811723] CC ui/tui/progress.o
user :notice: [ 58.819597] CC ui/browsers/scripts.o
user :notice: [ 58.827810] CC tests/hists_filter.o
user :notice: [ 58.835934] CC ui/browsers/header.o
user :notice: [ 58.843797] LD ui/tui/perf-in.o
user :notice: [ 58.851633] CC tests/hists_output.o
user :notice: [ 58.859662] CC ui/browsers/res_sample.o
user :notice: [ 58.868059] CC tests/hists_cumulate.o
user :notice: [ 58.876298] CC util/evsel_fprintf.o
user :notice: [ 58.884794] CC util/perf_event_attr_fprintf.o
user :notice: [ 58.893616] CC util/evswitch.o
user :notice: [ 58.901051] CC util/find_bit.o
user :notice: [ 58.908790] CC util/get_current_dir_name.o
user :notice: [ 58.917030] CC util/kallsyms.o
user :notice: [ 58.924272] CC tests/python-use.o
user :notice: [ 58.931786] CC util/levenshtein.o
user :notice: [ 58.939306] CC util/llvm-utils.o
user :notice: [ 58.946793] CC util/mmap.o
user :notice: [ 58.953880] CC util/memswap.o
user :notice: [ 58.961527] BISON util/parse-events-bison.c
user :notice: [ 58.969835] CC util/perf_regs.o
user :notice: [ 58.977405] CC tests/bp_signal.o
user :notice: [ 58.984747] CC util/path.o
user :notice: [ 58.991911] CC tests/bp_signal_overflow.o
user :notice: [ 59.000134] CC util/print_binary.o
user :notice: [ 59.007613] CC util/rlimit.o
user :notice: [ 59.014591] CC util/argv_split.o
user :notice: [ 59.021979] CC util/rbtree.o
user :notice: [ 59.028985] CC tests/bp_account.o
user :notice: [ 59.036173] CC tests/wp.o
user :notice: [ 59.042871] CC util/libstring.o
user :notice: [ 59.050011] CC util/bitmap.o
user :notice: [ 59.056858] CC util/hweight.o
user :notice: [ 59.063966] CC tests/task-exit.o
user :notice: [ 59.071298] CC tests/sw-clock.o
user :notice: [ 59.078342] CC util/smt.o
user :notice: [ 59.084981] CC util/strbuf.o
user :notice: [ 59.092311] CC tests/mmap-thread-lookup.o
user :notice: [ 59.100343] CC util/string.o
user :notice: [ 59.107456] CC tests/thread-maps-share.o
user :notice: [ 59.115581] CC tests/switch-tracking.o
user :notice: [ 59.123304] CC util/strlist.o
user :notice: [ 59.130429] CC tests/keep-tracking.o
user :notice: [ 59.138249] CC tests/code-reading.o
user :notice: [ 59.145689] CC util/strfilter.o
user :notice: [ 59.152629] CC util/top.o
user :notice: [ 59.159112] CC util/usage.o
user :notice: [ 59.165621] CC tests/sample-parsing.o
user :notice: [ 59.172446] CC util/dso.o
user :notice: [ 59.178526] CC tests/parse-no-sample-id-all.o
user :notice: [ 59.186126] CC tests/kmod-path.o
user :notice: [ 59.192493] CC util/dsos.o
user :notice: [ 59.198456] CC tests/thread-map.o
user :notice: [ 59.204913] CC util/symbol.o
user :notice: [ 59.210930] CC tests/llvm.o
user :notice: [ 59.217000] CC util/symbol_fprintf.o
user :notice: [ 59.223692] CC tests/bpf.o
user :notice: [ 59.229547] CC util/color.o
user :notice: [ 59.235563] CC tests/topology.o
user :notice: [ 59.241827] CC tests/mem.o
user :notice: [ 59.247704] CC tests/cpumap.o
user :notice: [ 59.253922] CC util/color_config.o
user :notice: [ 59.260604] LD ui/browsers/perf-in.o
user :notice: [ 59.267346] CC tests/stat.o
user :notice: [ 59.273288] LD ui/perf-in.o
user :notice: [ 59.279415] CC tests/event_update.o
user :notice: [ 59.286157] CC tests/event-times.o
user :notice: [ 59.292673] CC tests/expr.o
user :notice: [ 59.298749] CC util/metricgroup.o
user :notice: [ 59.305223] CC util/header.o
user :notice: [ 59.311512] CC tests/backward-ring-buffer.o
user :notice: [ 59.318814] CC tests/sdt.o
user :notice: [ 59.327108] /usr/src/perf_selftests-x86_64-rhel-7.6-5dd4f4ab07f8257f1d113cc2b9ee95f43d576b9d/tools/build/Makefile.build:96: recipe for target 'util/metricgroup.o' failed
user :notice: [ 59.344991] CC tests/is_printable_array.o
user :notice: [ 59.352277] CC tests/bitmap.o
user :notice: [ 59.358573] CC tests/perf-hooks.o
user :notice: [ 59.365109] CC tests/clang.o
user :notice: [ 59.371510] CC tests/unit_number__scnprintf.o
user :notice: [ 59.379158] CC tests/mem2node.o
user :notice: [ 59.385517] CC tests/maps.o
user :notice: [ 59.391769] CC tests/time-utils-test.o
user :notice: [ 59.398779] CC tests/genelf.o
user :notice: [ 59.405116] CC tests/dwarf-unwind.o
user :notice: [ 59.411975] CC tests/llvm-src-base.o
user :notice: [ 59.418975] CC tests/llvm-src-kbuild.o
user :notice: [ 59.426189] CC tests/llvm-src-prologue.o
user :notice: [ 59.433635] CC tests/llvm-src-relocation.o
user :notice: [ 59.441006] LD tests/perf-in.o
user :err : [ 60.191397] make[3]: *** [util] Error 2
user :notice: [ 60.193667] /usr/src/perf_selftests-x86_64-rhel-7.6-5dd4f4ab07f8257f1d113cc2b9ee95f43d576b9d/tools/build/Makefile.build:139: recipe for target 'util' failed
user :err : [ 60.196176] make[2]: *** [perf-in.o] Error 2
user :notice: [ 60.198386] Makefile.perf:616: recipe for target 'perf-in.o' failed
user :err : [ 60.212393] make[1]: *** [sub-make] Error 2
user :notice: [ 60.214587] Makefile.perf:224: recipe for target 'sub-make' failed
user :err : [ 60.218764] make: *** [all] Error 2
user :notice: [ 60.221026] Makefile:69: recipe for target 'all' failed
user :err : [ 60.227253] make perf failed
user :notice: [ 60.230837] make: Leaving directory '/usr/src/perf_selftests-x86_64-rhel-7.6-5dd4f4ab07f8257f1d113cc2b9ee95f43d576b9d/tools/perf'
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
Thanks,
Rong Chen
On 3/2/20 6:47 AM, kernel test robot wrote:
> FYI, we noticed the following commit (built with gcc-7):
>
> commit: 5dd4f4ab07f8257f1d113cc2b9ee95f43d576b9d ("[PATCH v4] tools/perf/metricgroup: Fix printing event names of metric group with multiple events incase of overlapping events")
>
> SNIP
>
> user :notice: [ 58.876298] CC util/evsel_fprintf.o
>
> user :notice: [ 58.884794] CC util/perf_event_attr_fprintf.o
>
> user :notice: [ 58.893616] CC util/evswitch.o
>
> user :notice: [ 58.901051] CC util/find_bit.o
>
> user :notice: [ 58.908790] CC util/get_current_dir_name.o
>
> user :notice: [ 58.917030] CC util/kallsyms.o
>
> user :notice: [ 58.924272] CC tests/python-use.o
>
> user :notice: [ 58.931786] CC util/levenshtein.o
>
> user :notice: [ 58.939306] CC util/llvm-utils.o
>
> user :notice: [ 58.946793] CC util/mmap.o
>
> user :notice: [ 58.953880] CC util/memswap.o
>
> user :notice: [ 58.961527] BISON util/parse-events-bison.c
>
> user :notice: [ 58.969835] CC util/perf_regs.o
>
> user :notice: [ 58.977405] CC tests/bp_signal.o
>
> user :notice: [ 58.984747] CC util/path.o
>
> user :notice: [ 58.991911] CC tests/bp_signal_overflow.o
>
> user :notice: [ 59.000134] CC util/print_binary.o
>
> user :notice: [ 59.007613] CC util/rlimit.o
>
> user :notice: [ 59.014591] CC util/argv_split.o
>
> user :notice: [ 59.021979] CC util/rbtree.o
>
> user :notice: [ 59.028985] CC tests/bp_account.o
>
> user :notice: [ 59.036173] CC tests/wp.o
>
> user :notice: [ 59.042871] CC util/libstring.o
>
> user :notice: [ 59.050011] CC util/bitmap.o
>
> user :notice: [ 59.056858] CC util/hweight.o
>
> user :notice: [ 59.063966] CC tests/task-exit.o
>
> user :notice: [ 59.071298] CC tests/sw-clock.o
>
> user :notice: [ 59.078342] CC util/smt.o
>
> user :notice: [ 59.084981] CC util/strbuf.o
>
> user :notice: [ 59.092311] CC tests/mmap-thread-lookup.o
>
> user :notice: [ 59.100343] CC util/string.o
>
> user :notice: [ 59.107456] CC tests/thread-maps-share.o
>
> user :notice: [ 59.115581] CC tests/switch-tracking.o
>
> user :notice: [ 59.123304] CC util/strlist.o
>
> user :notice: [ 59.130429] CC tests/keep-tracking.o
>
> user :notice: [ 59.138249] CC tests/code-reading.o
>
> user :notice: [ 59.145689] CC util/strfilter.o
>
> user :notice: [ 59.152629] CC util/top.o
>
> user :notice: [ 59.159112] CC util/usage.o
>
> user :notice: [ 59.165621] CC tests/sample-parsing.o
>
> user :notice: [ 59.172446] CC util/dso.o
>
> user :notice: [ 59.178526] CC tests/parse-no-sample-id-all.o
>
> user :notice: [ 59.186126] CC tests/kmod-path.o
>
> user :notice: [ 59.192493] CC util/dsos.o
>
> user :notice: [ 59.198456] CC tests/thread-map.o
>
> user :notice: [ 59.204913] CC util/symbol.o
>
> user :notice: [ 59.210930] CC tests/llvm.o
>
> user :notice: [ 59.217000] CC util/symbol_fprintf.o
>
> user :notice: [ 59.223692] CC tests/bpf.o
>
> user :notice: [ 59.229547] CC util/color.o
>
> user :notice: [ 59.235563] CC tests/topology.o
>
> user :notice: [ 59.241827] CC tests/mem.o
>
> user :notice: [ 59.247704] CC tests/cpumap.o
>
> user :notice: [ 59.253922] CC util/color_config.o
>
> user :notice: [ 59.260604] LD ui/browsers/perf-in.o
>
> user :notice: [ 59.267346] CC tests/stat.o
>
> user :notice: [ 59.273288] LD ui/perf-in.o
>
> user :notice: [ 59.279415] CC tests/event_update.o
>
> user :notice: [ 59.286157] CC tests/event-times.o
>
> user :notice: [ 59.292673] CC tests/expr.o
>
> user :notice: [ 59.298749] CC util/metricgroup.o
>
> user :notice: [ 59.305223] CC util/header.o
>
> user :notice: [ 59.311512] CC tests/backward-ring-buffer.o
>
> user :notice: [ 59.318814] CC tests/sdt.o
>
> user :notice: [ 59.327108] /usr/src/perf_selftests-x86_64-rhel-7.6-5dd4f4ab07f8257f1d113cc2b9ee95f43d576b9d/tools/build/Makefile.build:96: recipe for target 'util/metricgroup.o' failed
>
> user :notice: [ 59.344991] CC tests/is_printable_array.o
>
> user :notice: [ 59.352277] CC tests/bitmap.o
>
> user :notice: [ 59.358573] CC tests/perf-hooks.o
>
> user :notice: [ 59.365109] CC tests/clang.o
>
> user :notice: [ 59.371510] CC tests/unit_number__scnprintf.o
>
> user :notice: [ 59.379158] CC tests/mem2node.o
>
> user :notice: [ 59.385517] CC tests/maps.o
>
> user :notice: [ 59.391769] CC tests/time-utils-test.o
>
> user :notice: [ 59.398779] CC tests/genelf.o
>
> user :notice: [ 59.405116] CC tests/dwarf-unwind.o
>
> user :notice: [ 59.411975] CC tests/llvm-src-base.o
>
> user :notice: [ 59.418975] CC tests/llvm-src-kbuild.o
>
> user :notice: [ 59.426189] CC tests/llvm-src-prologue.o
>
> user :notice: [ 59.433635] CC tests/llvm-src-relocation.o
>
> user :notice: [ 59.441006] LD tests/perf-in.o
>
> user :err : [ 60.191397] make[3]: *** [util] Error 2
>
> user :notice: [ 60.193667] /usr/src/perf_selftests-x86_64-rhel-7.6-5dd4f4ab07f8257f1d113cc2b9ee95f43d576b9d/tools/build/Makefile.build:139: recipe for target 'util' failed
>
> user :err : [ 60.196176] make[2]: *** [perf-in.o] Error 2
>
> user :notice: [ 60.198386] Makefile.perf:616: recipe for target 'perf-in.o' failed
>
> user :err : [ 60.212393] make[1]: *** [sub-make] Error 2
>
> user :notice: [ 60.214587] Makefile.perf:224: recipe for target 'sub-make' failed
>
> user :err : [ 60.218764] make: *** [all] Error 2
>
> user :notice: [ 60.221026] Makefile:69: recipe for target 'all' failed
>
> user :err : [ 60.227253] make perf failed
>
> user :notice: [ 60.230837] make: Leaving directory '/usr/src/perf_selftests-x86_64-rhel-7.6-5dd4f4ab07f8257f1d113cc2b9ee95f43d576b9d/tools/perf'
>
>
> To reproduce:
>
> git clone https://github.com/intel/lkp-tests.git
> cd lkp-tests
> bin/lkp install job.yaml # job file is attached in this email
> bin/lkp run job.yaml
>
>
Hi Rong Chen,
You are right, I hit that issue in this patch. I have already sent a fix for it; you can check v6 of this patch.
Please let me know if you still see any issue.
Link to the patch: https://lkml.org/lkml/2020/2/21/1166
Thanks,
Kajol
>
> Thanks,
> Rong Chen
>
On 2/20/20 4:06 PM, Joakim Zhang wrote:
>
>> -----Original Message-----
>> From: kajoljain <[email protected]>
>> Sent: 2020年2月20日 17:54
>> To: Joakim Zhang <[email protected]>; [email protected]
>> Cc: [email protected]; [email protected]; Jiri Olsa
>> <[email protected]>; Alexander Shishkin <[email protected]>;
>> Andi Kleen <[email protected]>; Kan Liang <[email protected]>; Peter
>> Zijlstra <[email protected]>; Jin Yao <[email protected]>; Madhavan
>> Srinivasan <[email protected]>; Anju T Sudhakar
>> <[email protected]>; Ravi Bangoria <[email protected]>
>> Subject: Re: [PATCH v4] tools/perf/metricgroup: Fix printing event names of
>> metric group with multiple events incase of overlapping events
>>
>>
>>
>> On 2/17/20 8:41 AM, Joakim Zhang wrote:
>>>
>>>> -----Original Message-----
>>>> From: [email protected]
>>>> <[email protected]> On Behalf Of Kajol Jain
>>>> Sent: 2020年2月12日 13:41
>>>> To: [email protected]
>>>> Cc: [email protected]; [email protected];
>>>> [email protected]; Jiri Olsa <[email protected]>; Alexander Shishkin
>>>> <[email protected]>; Andi Kleen
>>>> <[email protected]>; Kan Liang <[email protected]>; Peter
>>>> Zijlstra <[email protected]>; Jin Yao <[email protected]>;
>>>> Madhavan Srinivasan <[email protected]>; Anju T Sudhakar
>>>> <[email protected]>; Ravi Bangoria
>>>> <[email protected]>
>>>> Subject: [PATCH v4] tools/perf/metricgroup: Fix printing event names
>>>> of metric group with multiple events incase of overlapping events
>>>>
>>>> Commit f01642e4912b ("perf metricgroup: Support multiple events for
>>>> metricgroup") introduced support for multiple events in a metric
>>>> group. But with the current upstream, metric events names are not
>>>> printed properly incase we try to run multiple metric groups with
>> overlapping event.
>>>>
>>>> With current upstream version, incase of overlapping metric events
>>>> issue is, we always start our comparision logic from start.
>>>> So, the events which already matched with some metric group also take
>>>> part in comparision logic. Because of that when we have overlapping
>>>> events, we end up matching current metric group event with already
>> matched one.
>>>>
>>>> For example, in skylake machine we have metric event CoreIPC and
>>>> Instructions. Both of them need 'inst_retired.any' event value.
>>>> As events in Instructions is subset of events in CoreIPC, they endup
>>>> in pointing to same 'inst_retired.any' value.
>>>>
>>>> In skylake platform:
>>>>
>>>> command:# ./perf stat -M CoreIPC,Instructions -C 0 sleep 1
>>>>
>>>> Performance counter stats for 'CPU(s) 0':
>>>>
>>>> 1,254,992,790 inst_retired.any # 1254992790.0
>>>>
>> Instructions
>>>> # 1.3
>>>> CoreIPC
>>>> 977,172,805 cycles
>>>> 1,254,992,756 inst_retired.any
>>>>
>>>> 1.000802596 seconds time elapsed
>>>>
>>>> command:# sudo ./perf stat -M UPI,IPC sleep 1
>>>>
>>>> Performance counter stats for 'sleep 1':
>>>>
>>>> 948,650 uops_retired.retire_slots
>>>> 866,182 inst_retired.any # 0.7 IPC
>>>> 866,182 inst_retired.any
>>>> 1,175,671 cpu_clk_unhalted.thread
>>>>
>>>> Patch fixes the issue by adding a new bool pointer 'evlist_used' to
>>>> keep track of events which already matched with some group by setting it
>> true.
>>>> So, we skip all used events in list when we start comparision logic.
>>>> Patch also make some changes in comparision logic, incase we get a
>>>> match miss, we discard the whole match and start again with first
>>>> event id in metric event.
>>>>
>>>> With this patch:
>>>> In skylake platform:
>>>>
>>>> command:# ./perf stat -M CoreIPC,Instructions -C 0 sleep 1
>>>>
>>>> Performance counter stats for 'CPU(s) 0':
>>>>
>>>> 3,348,415 inst_retired.any # 0.3
>> CoreIPC
>>>> 11,779,026 cycles
>>>> 3,348,381 inst_retired.any # 3348381.0
>>>>
>> Instructions
>>>>
>>>> 1.001649056 seconds time elapsed
>>>>
>>>> command:# ./perf stat -M UPI,IPC sleep 1
>>>>
>>>> Performance counter stats for 'sleep 1':
>>>>
>>>> 1,023,148 uops_retired.retire_slots # 1.1 UPI
>>>> 924,976 inst_retired.any
>>>> 924,976 inst_retired.any # 0.6 IPC
>>>> 1,489,414 cpu_clk_unhalted.thread
>>>>
>>>> 1.003064672 seconds time elapsed
>>>>
>>>> Signed-off-by: Kajol Jain <[email protected]>
>>>> Cc: Jiri Olsa <[email protected]>
>>>> Cc: Alexander Shishkin <[email protected]>
>>>> Cc: Andi Kleen <[email protected]>
>>>> Cc: Kan Liang <[email protected]>
>>>> Cc: Peter Zijlstra <[email protected]>
>>>> Cc: Jin Yao <[email protected]>
>>>> Cc: Madhavan Srinivasan <[email protected]>
>>>> Cc: Anju T Sudhakar <[email protected]>
>>>> Cc: Ravi Bangoria <[email protected]>
>>>> ---
>>>> tools/perf/util/metricgroup.c | 50
>>>> ++++++++++++++++++++++-------------
>>>> 1 file changed, 31 insertions(+), 19 deletions(-)
>>>
>>> Hi Kajol,
>>>
>>> I am not sure if it is good to ask a question here :-)
>>>
>>> I encountered a perf metricgroup issue, the result is incorrect when the
>> metric includes more than 2 events.
>>>
>>> git log --oneline tools/perf/util/metricgroup.c
>>> 3635b27cc058 perf metricgroup: Fix printing event names of metric
>>> group with multiple events f01642e4912b perf metricgroup: Support
>>> multiple events for metricgroup
>>> 287f2649f791 perf metricgroup: Scale the metric result
>>>
>>> I did a simple test, below is the JSON file and result.
>>> [
>>> {
>>> "PublicDescription": "Calculate DDR0 bus actual utilization
>> which vary from DDR0 controller clock frequency",
>>> "BriefDescription": "imx8qm: ddr0 bus actual utilization",
>>> "MetricName": "imx8qm-ddr0-bus-util",
>>> "MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ +
>> imx8_ddr0\\/write\\-cycles\\/ )",
>>> "MetricGroup": "i.MX8QM_DDR0_BUS_UTIL"
>>> }
>>> ]
>>> ./perf stat -I 1000 -M imx8qm-ddr0-bus-util
>>> # time counts unit events
>>> 1.000104250 16720 imx8_ddr0/read-cycles/
>> # 22921.0 imx8qm-ddr0-bus-util
>>> 1.000104250 6201 imx8_ddr0/write-cycles/
>>> 2.000525625 8316 imx8_ddr0/read-cycles/
>> # 12785.5 imx8qm-ddr0-bus-util
>>> 2.000525625 2738 imx8_ddr0/write-cycles/
>>> 3.000819125 1056 imx8_ddr0/read-cycles/
>> # 4136.7 imx8qm-ddr0-bus-util
>>> 3.000819125 303 imx8_ddr0/write-cycles/
>>> 4.001103750 6260 imx8_ddr0/read-cycles/
>> # 9149.8 imx8qm-ddr0-bus-util
>>> 4.001103750 2317 imx8_ddr0/write-cycles/
>>> 5.001392750 2084 imx8_ddr0/read-cycles/
>> # 4516.0 imx8qm-ddr0-bus-util
>>> 5.001392750 601 imx8_ddr0/write-cycles/
>>>
>>> You can see that only the first result is correct, could this be reproduced at
>> you side?
>>
>> Hi Joakim,
>> Will try to look into it from my side.
>
> Thanks Kajol for your help, I look into this issue, but don't know how to fix it.
>
> The results are always correct if signal event used in "MetricExpr" with "-I" parameters, but the results are incorrect when more than one events used in "MetricExpr".
>
Hi Joakim,
I looked into this issue to understand the flow. From my understanding, when we
evaluate a metric expression we do not use the exact counts we collect;
basically, we use the mean value of each event in the metric expression calculation.
So, I am taking the same example you referred to:
Metric Event: imx8qm-ddr0-bus-util
MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ + imx8_ddr0\\/write\\-cycles\\/ )"
command#: ./perf stat -I 1000 -M imx8qm-ddr0-bus-util
# time counts unit events
1.000104250 16720 imx8_ddr0/read-cycles/ # 22921.0 imx8qm-ddr0-bus-util
1.000104250 6201 imx8_ddr0/write-cycles/
2.000525625 8316 imx8_ddr0/read-cycles/ # 12785.5 imx8qm-ddr0-bus-util
2.000525625 2738 imx8_ddr0/write-cycles/
3.000819125 1056 imx8_ddr0/read-cycles/ # 4136.7 imx8qm-ddr0-bus-util
3.000819125 303 imx8_ddr0/write-cycles/
4.001103750 6260 imx8_ddr0/read-cycles/ # 9149.8 imx8qm-ddr0-bus-util
4.001103750 2317 imx8_ddr0/write-cycles/
5.001392750 2084 imx8_ddr0/read-cycles/ # 4516.0 imx8qm-ddr0-bus-util
5.001392750 601 imx8_ddr0/write-cycles/
There is a function called 'update_stats' in file util/stat.c where we do this calculation
and update the stats->mean value. This mean value is what we actually use in the
metric expression calculation.
We call this function in each iteration, updating stats->mean and stats->n for each event.
But one odd detail is that for the very first event, stats->n is always 1, which is why its
mean is the same as its raw count.
This is the reason why, with a single event, you get the exact aggregate of the metric expression.
So no matter how many events you have in your metric expression, every time
you take the exact count for the first one and a normalized value for the rest, which is inconsistent.
In the update_stats function, the mean is updated as:
stats->mean += delta / stats->n where, delta = val - stats->mean.
If we take write-cycles here. Initially mean = 0 and n = 1.
1st iteration: n=1, write cycles: 6201 and mean = 6201 (Final aggr value: 16720 + 6201 = 22921)
2nd iteration: n=2, write cycles: 6201 + (2738 - 6201)/2 = 4469.5 (Final aggr value: 8316 + 4469.5 = 12785.5)
3rd iteration: n=3, write cycles: 4469.5 + (303 - 4469.5)/3 = 3080.6667 (Final aggr value: 1056 + 3080.6667 = 4136.7)
Andi and Jiri, I am not sure whether this is the expected behavior. Shouldn't we either take the mean value
of each event, or take n as 1 for each event? One more question: should we add an option to let the user
choose between the exact aggregate and this normalized aggregate, to remove the confusion? I tried to find
out whether we already have such an option, but could not.
Please let me know if my understanding is correct.
Thanks,
Kajol
> Hope you can find the root cause :-)
>
> Best Regards,
> Joakim Zhang
>> Thanks,
>> Kajol
>>>
>>> Thanks a lot!
>>>
>>> Best Regards,
>>> Joakim Zhang
>>>
[...]
> >>> Hi Kajol,
> >>>
> >>> I am not sure if it is good to ask a question here :-)
> >>>
> >>> I encountered a perf metricgroup issue, the result is incorrect when
> >>> the
> >> metric includes more than 2 events.
> >>>
> >>> git log --oneline tools/perf/util/metricgroup.c
> >>> 3635b27cc058 perf metricgroup: Fix printing event names of metric
> >>> group with multiple events f01642e4912b perf metricgroup: Support
> >>> multiple events for metricgroup
> >>> 287f2649f791 perf metricgroup: Scale the metric result
> >>>
> >>> I did a simple test, below is the JSON file and result.
> >>> [
> >>> {
> >>> "PublicDescription": "Calculate DDR0 bus actual
> >>> utilization
> >> which vary from DDR0 controller clock frequency",
> >>> "BriefDescription": "imx8qm: ddr0 bus actual utilization",
> >>> "MetricName": "imx8qm-ddr0-bus-util",
> >>> "MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ +
> >> imx8_ddr0\\/write\\-cycles\\/ )",
> >>> "MetricGroup": "i.MX8QM_DDR0_BUS_UTIL"
> >>> }
> >>> ]
> >>> ./perf stat -I 1000 -M imx8qm-ddr0-bus-util
> >>> # time counts unit events
> >>> 1.000104250 16720 imx8_ddr0/read-cycles/
> >> # 22921.0 imx8qm-ddr0-bus-util
> >>> 1.000104250 6201
> imx8_ddr0/write-cycles/
> >>> 2.000525625 8316 imx8_ddr0/read-cycles/
> >> # 12785.5 imx8qm-ddr0-bus-util
> >>> 2.000525625 2738
> imx8_ddr0/write-cycles/
> >>> 3.000819125 1056 imx8_ddr0/read-cycles/
> >> # 4136.7 imx8qm-ddr0-bus-util
> >>> 3.000819125 303
> imx8_ddr0/write-cycles/
> >>> 4.001103750 6260 imx8_ddr0/read-cycles/
> >> # 9149.8 imx8qm-ddr0-bus-util
> >>> 4.001103750 2317
> imx8_ddr0/write-cycles/
> >>> 5.001392750 2084 imx8_ddr0/read-cycles/
> >> # 4516.0 imx8qm-ddr0-bus-util
> >>> 5.001392750 601
> imx8_ddr0/write-cycles/
> >>>
> >>> You can see that only the first result is correct, could this be
> >>> reproduced at
> >> you side?
> >>
> >> Hi Joakim,
> >> Will try to look into it from my side.
> >
>
> > Thanks Kajol for your help, I look into this issue, but don't know how to fix it.
> >
> > The results are always correct if signal event used in "MetricExpr" with "-I"
> parameters, but the results are incorrect when more than one events used in
> "MetricExpr".
> >
>
> Hi Joakim,
> So, I try to look into this issue and understand the flow. From my
> understanding, whenever we do
> calculation of metric expression we don't use exact count we are getting.
> Basically we use mean value of each event in the calculation of metric
> expression.
>
> So, I am taking same example you refer.
>
> Metric Event: imx8qm-ddr0-bus-util
> MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ + imx8_ddr0\\/write\\-cycles\\/ )"
>
> command#: ./perf stat -I 1000 -M imx8qm-ddr0-bus-util
>
> # time counts unit events
> 1.000104250 16720 imx8_ddr0/read-cycles/
> # 22921.0 imx8qm-ddr0-bus-util
> 1.000104250 6201 imx8_ddr0/write-cycles/
> 2.000525625 8316 imx8_ddr0/read-cycles/
> # 12785.5 imx8qm-ddr0-bus-util
> 2.000525625 2738 imx8_ddr0/write-cycles/
> 3.000819125 1056 imx8_ddr0/read-cycles/
> # 4136.7 imx8qm-ddr0-bus-util
> 3.000819125 303 imx8_ddr0/write-cycles/
> 4.001103750 6260 imx8_ddr0/read-cycles/
> # 9149.8 imx8qm-ddr0-bus-util
> 4.001103750 2317 imx8_ddr0/write-cycles/
> 5.001392750 2084 imx8_ddr0/read-cycles/
> # 4516.0 imx8qm-ddr0-bus-util
> 5.001392750 601 imx8_ddr0/write-cycles/
>
> If you see we have a function called 'update_stats' in file util/stat.c where we
> do this calculation and updating stats->mean value. And this mean value is
> what we are using actually in our metric expression calculation.
>
> We call this function in each iteration where we update stats->mean and
> stats->n for each event.
> But one weird issue is, for very first event, stat->n is always 1 that is why we
> are getting mean same as count.
> So this is the reason for single event you get exact aggregate of metric
> expression.
> So doesn't matter how many events you have in your metric expression, every
> time you take exact count for first one and normalized value for rest which is
> weird.
>
> According to update_stats function: We are updating mean as:
>
> stats->mean += delta / stats->n where, delta = val - stats->mean.
>
> If we take write-cycles here. Initially mean = 0 and n = 1.
>
> 1st iteration: n=1, write cycle : 6201 and mean = 6201 (Final agg value: 16720
> + 6201 = 22921) 2nd iteration: n=2, write cycles: 6201 + (2738 - 6201)/2 =
> 4469.5 (Final aggr value: 8316 + 4469.5 = 12785.5) 3rd iteration: n=3, write
> cycles: 4469.5 + (303 - 4469.5)/3 = 3080.6667 (Final aggr value: 1056 +
> 3080.6667 = 4136.7)
>
> Andi and Jiri, I am not sure if its expected behavior. I mean shouldn't we either
> take mean value of each event or take n as 1 for each event. And one more
> question, Should we add an option to say whether user want exact aggregate
> or this normalize aggregate to remove the confusion? I try to find it out if we
> already have one but didn't get.
> Please let me know if my understanding is fine.
Hi Kajol,
Sorry, your reply was buried in a sea of emails; it only caught my eye when I searched for feedback from you. Many thanks for the detailed explanation!
I can follow what you explained. As a user, I think we always want the exact result according to the metric expression.
Could you take this case as an example and send a formal email to the mailing list describing this issue, so that more people can participate and discuss it? Or do you need me to write it up and send the email?
That could attract the maintainers' attention.
Best Regards,
Joakim Zhang
On 3/20/20 12:17 PM, Joakim Zhang wrote:
>
> [...]
>>>>> Hi Kajol,
>>>>>
>>>>> I am not sure if it is good to ask a question here :-)
>>>>>
>>>>> I encountered a perf metricgroup issue, the result is incorrect when
>>>>> the
>>>> metric includes more than 2 events.
>>>>>
>>>>> git log --oneline tools/perf/util/metricgroup.c
>>>>> 3635b27cc058 perf metricgroup: Fix printing event names of metric
>>>>> group with multiple events f01642e4912b perf metricgroup: Support
>>>>> multiple events for metricgroup
>>>>> 287f2649f791 perf metricgroup: Scale the metric result
>>>>>
>>>>> I did a simple test, below is the JSON file and result.
>>>>> [
>>>>> {
>>>>> "PublicDescription": "Calculate DDR0 bus actual
>>>>> utilization
>>>> which vary from DDR0 controller clock frequency",
>>>>> "BriefDescription": "imx8qm: ddr0 bus actual utilization",
>>>>> "MetricName": "imx8qm-ddr0-bus-util",
>>>>> "MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ +
>>>> imx8_ddr0\\/write\\-cycles\\/ )",
>>>>> "MetricGroup": "i.MX8QM_DDR0_BUS_UTIL"
>>>>> }
>>>>> ]
>>>>> ./perf stat -I 1000 -M imx8qm-ddr0-bus-util
>>>>> # time counts unit events
>>>>> 1.000104250 16720 imx8_ddr0/read-cycles/
>>>> # 22921.0 imx8qm-ddr0-bus-util
>>>>> 1.000104250 6201
>> imx8_ddr0/write-cycles/
>>>>> 2.000525625 8316 imx8_ddr0/read-cycles/
>>>> # 12785.5 imx8qm-ddr0-bus-util
>>>>> 2.000525625 2738
>> imx8_ddr0/write-cycles/
>>>>> 3.000819125 1056 imx8_ddr0/read-cycles/
>>>> # 4136.7 imx8qm-ddr0-bus-util
>>>>> 3.000819125 303
>> imx8_ddr0/write-cycles/
>>>>> 4.001103750 6260 imx8_ddr0/read-cycles/
>>>> # 9149.8 imx8qm-ddr0-bus-util
>>>>> 4.001103750 2317
>> imx8_ddr0/write-cycles/
>>>>> 5.001392750 2084 imx8_ddr0/read-cycles/
>>>> # 4516.0 imx8qm-ddr0-bus-util
>>>>> 5.001392750 601
>> imx8_ddr0/write-cycles/
>>>>>
>>>>> You can see that only the first result is correct, could this be
>>>>> reproduced at
>>>> you side?
>>>>
>>>> Hi Joakim,
>>>> Will try to look into it from my side.
>>>
>>
>>> Thanks Kajol for your help, I look into this issue, but don't know how to fix it.
>>>
>>> The results are always correct if signal event used in "MetricExpr" with "-I"
>> parameters, but the results are incorrect when more than one events used in
>> "MetricExpr".
>>>
>>
>> Hi Joakim,
>> So, I try to look into this issue and understand the flow. From my
>> understanding, whenever we do
>> calculation of metric expression we don't use exact count we are getting.
>> Basically we use mean value of each event in the calculation of metric
>> expression.
>>
>> So, I am taking same example you refer.
>>
>> Metric Event: imx8qm-ddr0-bus-util
>> MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ + imx8_ddr0\\/write\\-cycles\\/ )"
>>
>> command#: ./perf stat -I 1000 -M imx8qm-ddr0-bus-util
>>
>> # time counts unit events
>> 1.000104250 16720 imx8_ddr0/read-cycles/
>> # 22921.0 imx8qm-ddr0-bus-util
>> 1.000104250 6201 imx8_ddr0/write-cycles/
>> 2.000525625 8316 imx8_ddr0/read-cycles/
>> # 12785.5 imx8qm-ddr0-bus-util
>> 2.000525625 2738 imx8_ddr0/write-cycles/
>> 3.000819125 1056 imx8_ddr0/read-cycles/
>> # 4136.7 imx8qm-ddr0-bus-util
>> 3.000819125 303 imx8_ddr0/write-cycles/
>> 4.001103750 6260 imx8_ddr0/read-cycles/
>> # 9149.8 imx8qm-ddr0-bus-util
>> 4.001103750 2317 imx8_ddr0/write-cycles/
>> 5.001392750 2084 imx8_ddr0/read-cycles/
>> # 4516.0 imx8qm-ddr0-bus-util
>> 5.001392750 601 imx8_ddr0/write-cycles/
>>
>> If you see we have a function called 'update_stats' in file util/stat.c where we
>> do this calculation and updating stats->mean value. And this mean value is
>> what we are using actually in our metric expression calculation.
>>
>> We call this function in each iteration where we update stats->mean and
>> stats->n for each event.
>> But one weird issue is, for very first event, stat->n is always 1 that is why we
>> are getting mean same as count.
>> So this is the reason for single event you get exact aggregate of metric
>> expression.
>> So doesn't matter how many events you have in your metric expression, every
>> time you take exact count for first one and normalized value for rest which is
>> weird.
>>
>> According to update_stats function: We are updating mean as:
>>
>> stats->mean += delta / stats->n where, delta = val - stats->mean.
>>
>> If we take write-cycles here. Initially mean = 0 and n = 1.
>>
>> 1st iteration: n=1, write cycle : 6201 and mean = 6201 (Final agg value: 16720
>> + 6201 = 22921) 2nd iteration: n=2, write cycles: 6201 + (2738 - 6201)/2 =
>> 4469.5 (Final aggr value: 8316 + 4469.5 = 12785.5) 3rd iteration: n=3, write
>> cycles: 4469.5 + (303 - 4469.5)/3 = 3080.6667 (Final aggr value: 1056 +
>> 3080.6667 = 4136.7)
>>
>> Andi and Jiri, I am not sure if its expected behavior. I mean shouldn't we either
>> take mean value of each event or take n as 1 for each event. And one more
>> question, Should we add an option to say whether user want exact aggregate
>> or this normalize aggregate to remove the confusion? I try to find it out if we
>> already have one but didn't get.
>> Please let me know if my understanding is fine.
>
> Hi Kajol,
>
> Sorry, your reply was buried in a sea of emails, it comes into my eyes when I searched any feedback from you. Much thanks for your great details!!!!!
>
> I can quite understand what you explained. As a user, I think we always want to get the exact result according to the metric expression.
>
> Can you take this case as an example then send out a formal email into mailing list to reflect this weird issue, more people can participate and discuss about it. Or you need me clear up and sent out the email?
> This could attract maintainers' attention.
>
Hi Joakim,
Yes, you are right; I will send a separate mail on lkml. That way it will be better.
Thanks,
Kajol
> Best Regards,
> Joakim Zhang
>
Hello All,
I want to discuss an issue raised by Joakim Zhang, who reported that we do not get
the correct result when multiple events are present in a metric
expression.
This is the example he pointed out:
below is the JSON file and result.
[
{
"PublicDescription": "Calculate DDR0 bus actual utilization which vary from DDR0 controller clock frequency",
"BriefDescription": "imx8qm: ddr0 bus actual utilization",
"MetricName": "imx8qm-ddr0-bus-util",
"MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ + imx8_ddr0\\/write\\-cycles\\/ )",
"MetricGroup": "i.MX8QM_DDR0_BUS_UTIL"
}
]
./perf stat -I 1000 -M imx8qm-ddr0-bus-util
# time counts unit events
1.000104250 16720 imx8_ddr0/read-cycles/ # 22921.0 imx8qm-ddr0-bus-util
1.000104250 6201 imx8_ddr0/write-cycles/
2.000525625 8316 imx8_ddr0/read-cycles/ # 12785.5 imx8qm-ddr0-bus-util
2.000525625 2738 imx8_ddr0/write-cycles/
3.000819125 1056 imx8_ddr0/read-cycles/ # 4136.7 imx8qm-ddr0-bus-util
3.000819125 303 imx8_ddr0/write-cycles/
4.001103750 6260 imx8_ddr0/read-cycles/ # 9149.8 imx8qm-ddr0-bus-util
4.001103750 2317 imx8_ddr0/write-cycles/
5.001392750 2084 imx8_ddr0/read-cycles/ # 4516.0 imx8qm-ddr0-bus-util
5.001392750 601 imx8_ddr0/write-cycles/
Based on the given metric expression, the sum is correct for the first interval, while for the
rest we do not see the expected sum. But when we have a single event in the metric
expression, we get the correct result as expected.
So, I looked into this issue to understand the flow. From my understanding, when we
evaluate a metric expression we do not use the exact counts we collect;
basically, we use the mean value of each metric event in the metric expression calculation.
So, I take the same example:
Metric Event: imx8qm-ddr0-bus-util
MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ + imx8_ddr0\\/write\\-cycles\\/ )"
command#: ./perf stat -I 1000 -M imx8qm-ddr0-bus-util
# time counts unit events
1.000104250 16720 imx8_ddr0/read-cycles/ # 22921.0 imx8qm-ddr0-bus-util
1.000104250 6201 imx8_ddr0/write-cycles/
2.000525625 8316 imx8_ddr0/read-cycles/ # 12785.5 imx8qm-ddr0-bus-util
2.000525625 2738 imx8_ddr0/write-cycles/
3.000819125 1056 imx8_ddr0/read-cycles/ # 4136.7 imx8qm-ddr0-bus-util
3.000819125 303 imx8_ddr0/write-cycles/
4.001103750 6260 imx8_ddr0/read-cycles/ # 9149.8 imx8qm-ddr0-bus-util
4.001103750 2317 imx8_ddr0/write-cycles/
5.001392750 2084 imx8_ddr0/read-cycles/ # 4516.0 imx8qm-ddr0-bus-util
5.001392750 601 imx8_ddr0/write-cycles/
There is a function called 'update_stats' in file util/stat.c where we do this calculation
and update the stats->mean value. This mean value is what we actually use in the
metric expression calculation.
We call this function in each iteration, updating stats->mean and stats->n for each event.
But one odd detail is that for the very first event, stats->n is always 1, which is why its
mean is the same as its raw count.
This is the reason why, with a single event, we get the exact aggregate of the metric expression.
So no matter how many events you have in your metric expression, every time
you take the exact count for the first one and a normalized value for the rest, which is inconsistent.
In the update_stats function, the mean is updated as:
stats->mean += delta / stats->n where, delta = val - stats->mean.
If we take write-cycles here. Initially mean = 0 and n = 1.
1st iteration: n=1, write cycles: 6201 and mean = 6201 (Final aggr value: 16720 + 6201 = 22921)
2nd iteration: n=2, write cycles: 6201 + (2738 - 6201)/2 = 4469.5 (Final aggr value: 8316 + 4469.5 = 12785.5)
3rd iteration: n=3, write cycles: 4469.5 + (303 - 4469.5)/3 = 3080.6667 (Final aggr value: 1056 + 3080.6667 = 4136.7)
I am not sure whether this is expected behavior. Shouldn't we either take the mean value for every event, or keep n as 1 for every event?
I am wondering whether we should add an option letting the user choose between the exact aggregate and this normalized aggregate, to remove the confusion. I tried to find out whether such an option already exists, but could not.
Please let me know if my understanding is correct, or whether there is something I can add to resolve this issue.
Thanks,
Kajol
Hi Arnaldo,
Kajol reported this issue almost two months ago and has received no feedback. Do you have any comments? It would be appreciated if you could look into it. Thanks a lot!
Please refer to below link:
https://www.spinics.net/lists/linux-perf-users/msg11011.html
Best Regards,
Joakim Zhang
> -----Original Message-----
> From: kajoljain <[email protected]>
> Sent: 24 March 2020 16:01
> To: Joakim Zhang <[email protected]>; [email protected]; Jiri Olsa
> <[email protected]>; Andi Kleen <[email protected]>
> Cc: [email protected]; [email protected]; Kan Liang
> <[email protected]>; Madhavan Srinivasan
> <[email protected]>; Anju T Sudhakar <[email protected]>;
> Ravi Bangoria <[email protected]>
> Subject: [RFC] Issue in final aggregate value, in case of multiple events present
> in metric expression
>
> Hello All,
> I want to discuss one issue raised by Joakim Zhang where he mentioned
> that, we are not getting correct result in-case of multiple events present in
> metric expression.
>
> This is one example pointed by him :
>
> below is the JSON file and result.
> [
> {
> "PublicDescription": "Calculate DDR0 bus actual utilization
> which vary from DDR0 controller clock frequency",
> "BriefDescription": "imx8qm: ddr0 bus actual utilization",
> "MetricName": "imx8qm-ddr0-bus-util",
> "MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ +
> imx8_ddr0\\/write\\-cycles\\/ )",
> "MetricGroup": "i.MX8QM_DDR0_BUS_UTIL"
> }
> ]
> ./perf stat -I 1000 -M imx8qm-ddr0-bus-util
> # time counts unit events
> 1.000104250 16720 imx8_ddr0/read-cycles/
> # 22921.0 imx8qm-ddr0-bus-util
> 1.000104250 6201 imx8_ddr0/write-cycles/
> 2.000525625 8316 imx8_ddr0/read-cycles/
> # 12785.5 imx8qm-ddr0-bus-util
> 2.000525625 2738 imx8_ddr0/write-cycles/
> 3.000819125 1056 imx8_ddr0/read-cycles/
> # 4136.7 imx8qm-ddr0-bus-util
> 3.000819125 303 imx8_ddr0/write-cycles/
> 4.001103750 6260 imx8_ddr0/read-cycles/
> # 9149.8 imx8qm-ddr0-bus-util
> 4.001103750 2317 imx8_ddr0/write-cycles/
> 5.001392750 2084 imx8_ddr0/read-cycles/
> # 4516.0 imx8qm-ddr0-bus-util
> 5.001392750 601 imx8_ddr0/write-cycles/
>
> Based on given metric expression, the sum coming correct for first iteration
> while for rest, we won't see same addition result. But in-case we have single
> event in metric expression, we are getting correct result as expected.
>
>
> So, I try to look into this issue and understand the flow. From my understanding,
> whenever we do calculation of metric expression we don't use exact count we
> are getting.
> Basically we use mean value of each metric event in the calculation of metric
> expression.
>
> So, I take same example:
>
> Metric Event: imx8qm-ddr0-bus-util
> MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ + imx8_ddr0\\/write\\-cycles\\/ )"
>
> command#: ./perf stat -I 1000 -M imx8qm-ddr0-bus-util
>
> # time counts unit events
> 1.000104250 16720 imx8_ddr0/read-cycles/
> # 22921.0 imx8qm-ddr0-bus-util
> 1.000104250 6201 imx8_ddr0/write-cycles/
> 2.000525625 8316 imx8_ddr0/read-cycles/
> # 12785.5 imx8qm-ddr0-bus-util
> 2.000525625 2738 imx8_ddr0/write-cycles/
> 3.000819125 1056 imx8_ddr0/read-cycles/
> # 4136.7 imx8qm-ddr0-bus-util
> 3.000819125 303 imx8_ddr0/write-cycles/
> 4.001103750 6260 imx8_ddr0/read-cycles/
> # 9149.8 imx8qm-ddr0-bus-util
> 4.001103750 2317 imx8_ddr0/write-cycles/
> 5.001392750 2084 imx8_ddr0/read-cycles/
> # 4516.0 imx8qm-ddr0-bus-util
> 5.001392750 601 imx8_ddr0/write-cycles/
>
> So, there is one function called 'update_stats' in file util/stat.c where we do this
> calculation and updating stats->mean value. And this mean value is what we
> are actually using in our metric expression calculation.
>
> We call this function in each iteration where we update stats->mean and
> stats->n for each event.
> But one weird issue is, for very first event, stat->n is always 1 that is why we
> are getting mean same as count.
>
> So this the reason why for single event we get exact aggregate of metric
> expression.
> So doesn't matter how many events you have in your metric expression, every
> time you take exact count for first one and normalized value for rest which is
> weird.
>
> According to update_stats function: We are updating mean as:
>
> stats->mean += delta / stats->n where, delta = val - stats->mean.
>
> If we take write-cycles here. Initially mean = 0 and n = 1.
>
> 1st iteration: n=1, write cycle : 6201 and mean = 6201 (Final agg value: 16720
> + 6201 = 22921) 2nd iteration: n=2, write cycles: 6201 + (2738 - 6201)/2 =
> 4469.5 (Final aggr value: 8316 + 4469.5 = 12785.5) 3rd iteration: n=3, write
> cycles: 4469.5 + (303 - 4469.5)/3 = 3080.6667 (Final aggr value: 1056 +
> 3080.6667 = 4136.7)
>
> I am not sure if its expected behavior. I mean shouldn't we either take mean
> value of each event or take n as 1 for each event.
>
>
> I am thinking, Should we add an option to say whether user want exact
> aggregate or
> this normalize aggregate to remove the confusion? I try to find it out if we
> already have one but didn't get.
> Please let me know if my understanding is fine. Or something I can add to
> resolve this issue.
>
> Thanks,
> Kajol
On Sat, May 09, 2020 at 05:52:53AM +0000, Joakim Zhang wrote:
>
> Hi Arnaldo,
>
> Kajol reflects this issue for almost two months, got no feedbacks, do you have any comments? That could be appreciated if you can look into it. Thanks a lot!
Jin Yao did a lot of fixes to metrics recently. Does it still happen with
the latest tip tree?
-Andi
Hi Kajol,
On 3/24/2020 4:00 PM, kajoljain wrote:
> Hello All,
> I want to discuss one issue raised by Joakim Zhang where he mentioned
> that, we are not getting correct result in-case of multiple events present in metric
> expression.
>
> This is one example pointed by him :
>
> below is the JSON file and result.
> [
> {
> "PublicDescription": "Calculate DDR0 bus actual utilization which vary from DDR0 controller clock frequency",
> "BriefDescription": "imx8qm: ddr0 bus actual utilization",
> "MetricName": "imx8qm-ddr0-bus-util",
> "MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ + imx8_ddr0\\/write\\-cycles\\/ )",
> "MetricGroup": "i.MX8QM_DDR0_BUS_UTIL"
> }
> ]
> ./perf stat -I 1000 -M imx8qm-ddr0-bus-util
> # time counts unit events
> 1.000104250 16720 imx8_ddr0/read-cycles/ # 22921.0 imx8qm-ddr0-bus-util
> 1.000104250 6201 imx8_ddr0/write-cycles/
> 2.000525625 8316 imx8_ddr0/read-cycles/ # 12785.5 imx8qm-ddr0-bus-util
> 2.000525625 2738 imx8_ddr0/write-cycles/
> 3.000819125 1056 imx8_ddr0/read-cycles/ # 4136.7 imx8qm-ddr0-bus-util
> 3.000819125 303 imx8_ddr0/write-cycles/
> 4.001103750 6260 imx8_ddr0/read-cycles/ # 9149.8 imx8qm-ddr0-bus-util
> 4.001103750 2317 imx8_ddr0/write-cycles/
> 5.001392750 2084 imx8_ddr0/read-cycles/ # 4516.0 imx8qm-ddr0-bus-util
> 5.001392750 601 imx8_ddr0/write-cycles/
>
> Based on given metric expression, the sum coming correct for first iteration while for
> rest, we won't see same addition result. But in-case we have single event in metric
> expression, we are getting correct result as expected.
>
>
> So, I try to look into this issue and understand the flow. From my understanding, whenever we do
> calculation of metric expression we don't use exact count we are getting.
> Basically we use mean value of each metric event in the calculation of metric expression.
>
> So, I take same example:
>
> Metric Event: imx8qm-ddr0-bus-util
> MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ + imx8_ddr0\\/write\\-cycles\\/ )"
>
> command#: ./perf stat -I 1000 -M imx8qm-ddr0-bus-util
>
> # time counts unit events
> 1.000104250 16720 imx8_ddr0/read-cycles/ # 22921.0 imx8qm-ddr0-bus-util
> 1.000104250 6201 imx8_ddr0/write-cycles/
> 2.000525625 8316 imx8_ddr0/read-cycles/ # 12785.5 imx8qm-ddr0-bus-util
> 2.000525625 2738 imx8_ddr0/write-cycles/
> 3.000819125 1056 imx8_ddr0/read-cycles/ # 4136.7 imx8qm-ddr0-bus-util
> 3.000819125 303 imx8_ddr0/write-cycles/
> 4.001103750 6260 imx8_ddr0/read-cycles/ # 9149.8 imx8qm-ddr0-bus-util
> 4.001103750 2317 imx8_ddr0/write-cycles/
> 5.001392750 2084 imx8_ddr0/read-cycles/ # 4516.0 imx8qm-ddr0-bus-util
> 5.001392750 601 imx8_ddr0/write-cycles/
>
> So, there is one function called 'update_stats' in file util/stat.c where we do this calculation
> and updating stats->mean value. And this mean value is what we are actually using in our
> metric expression calculation.
>
> We call this function in each iteration where we update stats->mean and stats->n for each event.
> But one weird issue is, for very first event, stat->n is always 1 that is why we are getting
> mean same as count.
>
> So this the reason why for single event we get exact aggregate of metric expression.
> So doesn't matter how many events you have in your metric expression, every time
> you take exact count for first one and normalized value for rest which is weird.
>
> According to update_stats function: We are updating mean as:
>
> stats->mean += delta / stats->n where, delta = val - stats->mean.
>
> If we take write-cycles here. Initially mean = 0 and n = 1.
>
> 1st iteration: n=1, write cycle : 6201 and mean = 6201 (Final agg value: 16720 + 6201 = 22921)
> 2nd iteration: n=2, write cycles: 6201 + (2738 - 6201)/2 = 4469.5 (Final aggr value: 8316 + 4469.5 = 12785.5)
> 3rd iteration: n=3, write cycles: 4469.5 + (303 - 4469.5)/3 = 3080.6667 (Final aggr value: 1056 + 3080.6667 = 4136.7)
>
> I am not sure if its expected behavior. I mean shouldn't we either take mean value of each event
> or take n as 1 for each event.
>
>
> I am thinking, Should we add an option to say whether user want exact aggregate or
> this normalize aggregate to remove the confusion? I try to find it out if we already have one but didn't get.
> Please let me know if my understanding is fine. Or something I can add to resolve this issue.
>
> Thanks,
> Kajol
>
Since you use the interval mode, can this commit fix the issue?
http://lore.kernel.org/lkml/[email protected]
Thanks
Jin Yao
> -----Original Message-----
> From: Jin, Yao <[email protected]>
> Sent: 11 May 2020 9:12
> To: kajoljain <[email protected]>; Joakim Zhang
> <[email protected]>; [email protected]; Jiri Olsa <[email protected]>;
> Andi Kleen <[email protected]>
> Cc: [email protected]; [email protected]; Kan Liang
> <[email protected]>; Madhavan Srinivasan
> <[email protected]>; Anju T Sudhakar <[email protected]>;
> Ravi Bangoria <[email protected]>
> Subject: Re: [RFC] Issue in final aggregate value, in case of multiple events
> present in metric expression
>
> Hi Kajol,
>
> On 3/24/2020 4:00 PM, kajoljain wrote:
> > Hello All,
> > I want to discuss one issue raised by Joakim Zhang where he mentioned
> > that, we are not getting correct result in-case of multiple events
> > present in metric expression.
> >
> > This is one example pointed by him :
> >
> > below is the JSON file and result.
> > [
> > {
> > "PublicDescription": "Calculate DDR0 bus actual utilization
> which vary from DDR0 controller clock frequency",
> > "BriefDescription": "imx8qm: ddr0 bus actual utilization",
> > "MetricName": "imx8qm-ddr0-bus-util",
> > "MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ +
> imx8_ddr0\\/write\\-cycles\\/ )",
> > "MetricGroup": "i.MX8QM_DDR0_BUS_UTIL"
> > }
> > ]
> > ./perf stat -I 1000 -M imx8qm-ddr0-bus-util
> > # time counts unit events
> > 1.000104250 16720 imx8_ddr0/read-cycles/
> # 22921.0 imx8qm-ddr0-bus-util
> > 1.000104250 6201 imx8_ddr0/write-cycles/
> > 2.000525625 8316 imx8_ddr0/read-cycles/
> # 12785.5 imx8qm-ddr0-bus-util
> > 2.000525625 2738 imx8_ddr0/write-cycles/
> > 3.000819125 1056 imx8_ddr0/read-cycles/
> # 4136.7 imx8qm-ddr0-bus-util
> > 3.000819125 303 imx8_ddr0/write-cycles/
> > 4.001103750 6260 imx8_ddr0/read-cycles/
> # 9149.8 imx8qm-ddr0-bus-util
> > 4.001103750 2317 imx8_ddr0/write-cycles/
> > 5.001392750 2084 imx8_ddr0/read-cycles/
> # 4516.0 imx8qm-ddr0-bus-util
> > 5.001392750 601 imx8_ddr0/write-cycles/
> >
> > Based on given metric expression, the sum coming correct for first
> > iteration while for rest, we won't see same addition result. But
> > in-case we have single event in metric expression, we are getting correct
> result as expected.
> >
> >
> > So, I try to look into this issue and understand the flow. From my
> > understanding, whenever we do calculation of metric expression we don't use
> exact count we are getting.
> > Basically we use mean value of each metric event in the calculation of metric
> expression.
> >
> > So, I take same example:
> >
> > Metric Event: imx8qm-ddr0-bus-util
> > MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ +
> imx8_ddr0\\/write\\-cycles\\/ )"
> >
> > command#: ./perf stat -I 1000 -M imx8qm-ddr0-bus-util
> >
> > # time counts unit events
> > 1.000104250 16720 imx8_ddr0/read-cycles/
> # 22921.0 imx8qm-ddr0-bus-util
> > 1.000104250 6201 imx8_ddr0/write-cycles/
> > 2.000525625 8316 imx8_ddr0/read-cycles/
> # 12785.5 imx8qm-ddr0-bus-util
> > 2.000525625 2738 imx8_ddr0/write-cycles/
> > 3.000819125 1056 imx8_ddr0/read-cycles/
> # 4136.7 imx8qm-ddr0-bus-util
> > 3.000819125 303 imx8_ddr0/write-cycles/
> > 4.001103750 6260 imx8_ddr0/read-cycles/
> # 9149.8 imx8qm-ddr0-bus-util
> > 4.001103750 2317 imx8_ddr0/write-cycles/
> > 5.001392750 2084 imx8_ddr0/read-cycles/
> # 4516.0 imx8qm-ddr0-bus-util
> > 5.001392750 601 imx8_ddr0/write-cycles/
> >
> > So, there is one function called 'update_stats' in file util/stat.c
> > where we do this calculation and updating stats->mean value. And this
> > mean value is what we are actually using in our metric expression calculation.
> >
> > We call this function in each iteration where we update stats->mean and
> stats->n for each event.
> > But one weird issue is, for very first event, stat->n is always 1 that
> > is why we are getting mean same as count.
> >
> > So this the reason why for single event we get exact aggregate of metric
> expression.
> > So doesn't matter how many events you have in your metric expression,
> > every time you take exact count for first one and normalized value for rest
> which is weird.
> >
> > According to update_stats function: We are updating mean as:
> >
> > stats->mean += delta / stats->n where, delta = val - stats->mean.
> >
> > If we take write-cycles here. Initially mean = 0 and n = 1.
> >
> > 1st iteration: n=1, write cycle : 6201 and mean = 6201 (Final agg
> > value: 16720 + 6201 = 22921) 2nd iteration: n=2, write cycles: 6201 +
> > (2738 - 6201)/2 = 4469.5 (Final aggr value: 8316 + 4469.5 = 12785.5)
> > 3rd iteration: n=3, write cycles: 4469.5 + (303 - 4469.5)/3 =
> > 3080.6667 (Final aggr value: 1056 + 3080.6667 = 4136.7)
> >
> > I am not sure if its expected behavior. I mean shouldn't we either
> > take mean value of each event or take n as 1 for each event.
> >
> >
> > I am thinking, Should we add an option to say whether user want exact
> > aggregate or this normalize aggregate to remove the confusion? I try to find
> it out if we already have one but didn't get.
> > Please let me know if my understanding is fine. Or something I can add to
> resolve this issue.
> >
> > Thanks,
> > Kajol
> >
>
> Since you use the interval mode, can this commit fix the issue?
>
> http://lore.kernel.org/lkml/20200420145417.6864-1-yao.jin@linux.intel.com
Hi Andi, Jin Yao
The patch fixes this issue, thanks.
Best Regards,
Joakim Zhang
> Thanks
> Jin Yao
Hi Jin Yao, Andi,
I applied this patch on the L5.4 kernel, and it does fix this issue; we get the correct result in interval mode.
1) "MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ + imx8_ddr0\\/write\\-cycles\\/ )",
We always get the correct result for expression 1), regardless of the interval value.
2) "MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ + imx8_ddr0\\/write\\-cycles\\/ ) / duration_time",
However, when I use the duration_time event in expression 2), we cannot always get the correct result.
When the interval is <= 4s (-I 4000), it calculates correctly. The result is (107016 + 15910) / 4 = 30731.5, correct!
root@imx8mpevk:~# ./perf stat -a -I 4000 -M imx8mp-lpddr4-bandwidth-usage
# time counts unit events
4.000704750 107016 imx8_ddr0/read-cycles/ # 30731.5 imx8mp-lpddr4-bandwidth-usage
4.000704750 15910 imx8_ddr0/write-cycles/
4.000704750 4000704750 ns duration_time
When the interval is >= 5s (-I 5000), the result is incorrect. It should be (235948 + 42878) / 5 = 55765.2, but perf reports 395479.5.
root@imx8mpevk:~# ./perf stat -a -I 5000 -M imx8mp-lpddr4-bandwidth-usage
# time counts unit events
5.000733250 235948 imx8_ddr0/read-cycles/ # 395479.5 imx8mp-lpddr4-bandwidth-usage
5.000733250 42878 imx8_ddr0/write-cycles/
5.000733250 5000733250 ns duration_time
Do you know whether there are any limitations of the duration_time event? Thanks a lot!
Best Regards,
Joakim Zhang
> Thanks
> Jin Yao