Commit f01642e4912b ("perf metricgroup: Support multiple
events for metricgroup") introduced support for multiple events
in a metric group. But with the current upstream, metric event
names are not printed properly when we run multiple metric
groups with overlapping events.
With the current upstream version, in case of overlapping metric
events, the issue is that we always start the comparison logic
from the beginning of the event list. So events which already
matched some metric group also take part in the comparison.
Because of that, when we have overlapping events, we end up
matching the current metric group's event with an already
matched one.
For example, on a Skylake machine we have the metric events
CoreIPC and Instructions. Both of them need the 'inst_retired.any'
event value. As the events in Instructions are a subset of the
events in CoreIPC, they end up pointing to the same
'inst_retired.any' value.
On a Skylake platform:
command:# ./perf stat -M CoreIPC,Instructions -C 0 sleep 1
Performance counter stats for 'CPU(s) 0':
1,254,992,790 inst_retired.any # 1254992790.0
Instructions
# 1.3 CoreIPC
977,172,805 cycles
1,254,992,756 inst_retired.any
1.000802596 seconds time elapsed
command:# sudo ./perf stat -M UPI,IPC sleep 1
Performance counter stats for 'sleep 1':
948,650 uops_retired.retire_slots
866,182 inst_retired.any # 0.7 IPC
866,182 inst_retired.any
1,175,671 cpu_clk_unhalted.thread
The patch fixes the issue by adding a new bool array 'evlist_used'
to keep track of events which already matched some group, by
setting the corresponding entry to true. So we skip all used
events in the list when we start the comparison logic.
The patch also changes the comparison logic: in case of a match
miss, we discard the whole match and start again from the first
event id in the metric event.
With this patch:
On a Skylake platform:
command:# ./perf stat -M CoreIPC,Instructions -C 0 sleep 1
Performance counter stats for 'CPU(s) 0':
3,348,415 inst_retired.any # 0.3 CoreIPC
11,779,026 cycles
3,348,381 inst_retired.any # 3348381.0
Instructions
1.001649056 seconds time elapsed
command:# ./perf stat -M UPI,IPC sleep 1
Performance counter stats for 'sleep 1':
1,023,148 uops_retired.retire_slots # 1.1 UPI
924,976 inst_retired.any
924,976 inst_retired.any # 0.6 IPC
1,489,414 cpu_clk_unhalted.thread
1.003064672 seconds time elapsed
Signed-off-by: Kajol Jain <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Anju T Sudhakar <[email protected]>
Cc: Ravi Bangoria <[email protected]>
---
tools/perf/util/metricgroup.c | 50 ++++++++++++++++++++++-------------
1 file changed, 31 insertions(+), 19 deletions(-)
Changelog:
v3 -> v4
- Make 'evlist_used' a bool pointer.
v2 -> v3
- Add an array in place of a variable to keep track of matched
  events, because in case of a match miss with the previous
  approach, all events would be rolled over in the next condition.
  So instead, add an array and set an entry once the corresponding
  event has matched some group.
- Suggested by Jiri Olsa
v1 -> v2
- Rather than adding a static variable in metricgroup.c,
  add a new variable in evlist itself with the name 'evlist_iter'
---
diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 02aee946b6c1..6fcd839128cd 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -93,13 +93,16 @@ struct egroup {
static struct evsel *find_evsel_group(struct evlist *perf_evlist,
const char **ids,
int idnum,
- struct evsel **metric_events)
+ struct evsel **metric_events,
+ bool *evlist_used)
{
struct evsel *ev;
- int i = 0;
+ int i = 0, j = 0;
bool leader_found;
evlist__for_each_entry (perf_evlist, ev) {
+ if (evlist_used[j++])
+ continue;
if (!strcmp(ev->name, ids[i])) {
if (!metric_events[i])
metric_events[i] = ev;
@@ -107,22 +110,17 @@ static struct evsel *find_evsel_group(struct evlist *perf_evlist,
if (i == idnum)
break;
} else {
- if (i + 1 == idnum) {
- /* Discard the whole match and start again */
- i = 0;
- memset(metric_events, 0,
- sizeof(struct evsel *) * idnum);
- continue;
- }
-
- if (!strcmp(ev->name, ids[i]))
- metric_events[i] = ev;
- else {
- /* Discard the whole match and start again */
- i = 0;
- memset(metric_events, 0,
- sizeof(struct evsel *) * idnum);
- continue;
+ /* Discard the whole match and start again */
+ i = 0;
+ memset(metric_events, 0,
+ sizeof(struct evsel *) * idnum);
+
+ if (!strcmp(ev->name, ids[i])) {
+ if (!metric_events[i])
+ metric_events[i] = ev;
+ i++;
+ if (i == idnum)
+ break;
}
}
}
@@ -144,7 +142,10 @@ static struct evsel *find_evsel_group(struct evlist *perf_evlist,
!strcmp(ev->name, metric_events[i]->name)) {
ev->metric_leader = metric_events[i];
}
+ j++;
}
+ ev = metric_events[i];
+ evlist_used[ev->idx] = true;
}
return metric_events[0];
@@ -160,6 +161,14 @@ static int metricgroup__setup_events(struct list_head *groups,
int ret = 0;
struct egroup *eg;
struct evsel *evsel;
+ bool *evlist_used;
+
+ evlist_used = (bool *)calloc(perf_evlist->core.nr_entries,
+ sizeof(bool));
+ if (!evlist_used) {
+ ret = -ENOMEM;
+ break;
+ }
list_for_each_entry (eg, groups, nd) {
struct evsel **metric_events;
@@ -170,7 +179,7 @@ static int metricgroup__setup_events(struct list_head *groups,
break;
}
evsel = find_evsel_group(perf_evlist, eg->ids, eg->idnum,
- metric_events);
+ metric_events, evlist_used);
if (!evsel) {
pr_debug("Cannot resolve %s: %s\n",
eg->metric_name, eg->metric_expr);
@@ -194,6 +203,9 @@ static int metricgroup__setup_events(struct list_head *groups,
expr->metric_events = metric_events;
list_add(&expr->nd, &me->head);
}
+
+ free(evlist_used);
+
return ret;
}
--
2.21.0
On Wed, Feb 12, 2020 at 11:11:02AM +0530, Kajol Jain wrote:
SNIP
>
> return metric_events[0];
> @@ -160,6 +161,14 @@ static int metricgroup__setup_events(struct list_head *groups,
> int ret = 0;
> struct egroup *eg;
> struct evsel *evsel;
> + bool *evlist_used;
> +
> + evlist_used = (bool *)calloc(perf_evlist->core.nr_entries,
> + sizeof(bool));
> + if (!evlist_used) {
> + ret = -ENOMEM;
> + break;
hum, how did this compile for you? ;-)
util/metricgroup.c: In function ‘metricgroup__setup_events’:
util/metricgroup.c:170:3: error: break statement not within loop or switch
170 | break;
jirka
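For reference, a minimal compilable sketch of what the error path
presumably needs to look like: the allocation happens before the
list_for_each_entry loop, so a failure must return -ENOMEM rather
than 'break'. Here demo_evlist and setup_events_sketch() are
hypothetical stand-ins for the real perf types, kept only so the
fragment compiles on its own.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the evlist used in the patch context. */
struct demo_evlist {
	struct { int nr_entries; } core;
};

static int setup_events_sketch(struct demo_evlist *perf_evlist)
{
	bool *evlist_used;

	evlist_used = calloc(perf_evlist->core.nr_entries, sizeof(bool));
	if (!evlist_used)
		return -ENOMEM;	/* not inside a loop: return, don't break */

	/* ... match each metric group against the evlist here ... */

	free(evlist_used);
	return 0;
}
```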
> -----Original Message-----
> From: [email protected]
> <[email protected]> On Behalf Of Kajol Jain
> Sent: February 12, 2020 13:41
> To: [email protected]
> Cc: [email protected]; [email protected];
> [email protected]; Jiri Olsa <[email protected]>; Alexander Shishkin
> <[email protected]>; Andi Kleen <[email protected]>; Kan
> Liang <[email protected]>; Peter Zijlstra <[email protected]>; Jin
> Yao <[email protected]>; Madhavan Srinivasan
> <[email protected]>; Anju T Sudhakar <[email protected]>;
> Ravi Bangoria <[email protected]>
> Subject: [PATCH v4] tools/perf/metricgroup: Fix printing event names of metric
> group with multiple events incase of overlapping events
>
> SNIP
Hi Kajol,
I am not sure if it is good to ask a question here :-)
I encountered a perf metricgroup issue: the result is incorrect when the metric includes two or more events.
git log --oneline tools/perf/util/metricgroup.c
3635b27cc058 perf metricgroup: Fix printing event names of metric group with multiple events
f01642e4912b perf metricgroup: Support multiple events for metricgroup
287f2649f791 perf metricgroup: Scale the metric result
I did a simple test, below is the JSON file and result.
[
{
"PublicDescription": "Calculate DDR0 bus actual utilization which vary from DDR0 controller clock frequency",
"BriefDescription": "imx8qm: ddr0 bus actual utilization",
"MetricName": "imx8qm-ddr0-bus-util",
"MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ + imx8_ddr0\\/write\\-cycles\\/ )",
"MetricGroup": "i.MX8QM_DDR0_BUS_UTIL"
}
]
./perf stat -I 1000 -M imx8qm-ddr0-bus-util
# time counts unit events
1.000104250 16720 imx8_ddr0/read-cycles/ # 22921.0 imx8qm-ddr0-bus-util
1.000104250 6201 imx8_ddr0/write-cycles/
2.000525625 8316 imx8_ddr0/read-cycles/ # 12785.5 imx8qm-ddr0-bus-util
2.000525625 2738 imx8_ddr0/write-cycles/
3.000819125 1056 imx8_ddr0/read-cycles/ # 4136.7 imx8qm-ddr0-bus-util
3.000819125 303 imx8_ddr0/write-cycles/
4.001103750 6260 imx8_ddr0/read-cycles/ # 9149.8 imx8qm-ddr0-bus-util
4.001103750 2317 imx8_ddr0/write-cycles/
5.001392750 2084 imx8_ddr0/read-cycles/ # 4516.0 imx8qm-ddr0-bus-util
5.001392750 601 imx8_ddr0/write-cycles/
You can see that only the first result is correct. Could this be reproduced at your side?
Thanks a lot!
Best Regards,
Joakim Zhang
On 2/17/20 1:51 AM, Jiri Olsa wrote:
> On Wed, Feb 12, 2020 at 11:11:02AM +0530, Kajol Jain wrote:
>
> SNIP
>
>>
>> return metric_events[0];
>> @@ -160,6 +161,14 @@ static int metricgroup__setup_events(struct list_head *groups,
>> int ret = 0;
>> struct egroup *eg;
>> struct evsel *evsel;
>> + bool *evlist_used;
>> +
>> + evlist_used = (bool *)calloc(perf_evlist->core.nr_entries,
>> + sizeof(bool));
>> + if (!evlist_used) {
>> + ret = -ENOMEM;
>> + break;
>
> hum, how did this compile for you? ;-)
>
> util/metricgroup.c: In function ‘metricgroup__setup_events’:
> util/metricgroup.c:170:3: error: break statement not within loop or switch
> 170 | break;
>
>
Hi Jiri,
Yes, you are right. My bad, I sent the patch from another branch. Will send the correct one.
Thanks,
Kajol
> jirka
>
On 2/17/20 8:41 AM, Joakim Zhang wrote:
>
>> -----Original Message-----
>> From: [email protected]
>> <[email protected]> On Behalf Of Kajol Jain
>> Sent: February 12, 2020 13:41
>> To: [email protected]
>> Cc: [email protected]; [email protected];
>> [email protected]; Jiri Olsa <[email protected]>; Alexander Shishkin
>> <[email protected]>; Andi Kleen <[email protected]>; Kan
>> Liang <[email protected]>; Peter Zijlstra <[email protected]>; Jin
>> Yao <[email protected]>; Madhavan Srinivasan
>> <[email protected]>; Anju T Sudhakar <[email protected]>;
>> Ravi Bangoria <[email protected]>
>> Subject: [PATCH v4] tools/perf/metricgroup: Fix printing event names of metric
>> group with multiple events incase of overlapping events
>>
>> SNIP
>
> Hi Kajol,
>
> I am not sure if it is good to ask a question here :-)
>
> I encountered a perf metricgroup issue, the result is incorrect when the metric includes more than 2 events.
>
> git log --oneline tools/perf/util/metricgroup.c
> 3635b27cc058 perf metricgroup: Fix printing event names of metric group with multiple events
> f01642e4912b perf metricgroup: Support multiple events for metricgroup
> 287f2649f791 perf metricgroup: Scale the metric result
>
> I did a simple test, below is the JSON file and result.
> [
> {
> "PublicDescription": "Calculate DDR0 bus actual utilization which vary from DDR0 controller clock frequency",
> "BriefDescription": "imx8qm: ddr0 bus actual utilization",
> "MetricName": "imx8qm-ddr0-bus-util",
> "MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ + imx8_ddr0\\/write\\-cycles\\/ )",
> "MetricGroup": "i.MX8QM_DDR0_BUS_UTIL"
> }
> ]
> ./perf stat -I 1000 -M imx8qm-ddr0-bus-util
> # time counts unit events
> 1.000104250 16720 imx8_ddr0/read-cycles/ # 22921.0 imx8qm-ddr0-bus-util
> 1.000104250 6201 imx8_ddr0/write-cycles/
> 2.000525625 8316 imx8_ddr0/read-cycles/ # 12785.5 imx8qm-ddr0-bus-util
> 2.000525625 2738 imx8_ddr0/write-cycles/
> 3.000819125 1056 imx8_ddr0/read-cycles/ # 4136.7 imx8qm-ddr0-bus-util
> 3.000819125 303 imx8_ddr0/write-cycles/
> 4.001103750 6260 imx8_ddr0/read-cycles/ # 9149.8 imx8qm-ddr0-bus-util
> 4.001103750 2317 imx8_ddr0/write-cycles/
> 5.001392750 2084 imx8_ddr0/read-cycles/ # 4516.0 imx8qm-ddr0-bus-util
> 5.001392750 601 imx8_ddr0/write-cycles/
>
> You can see that only the first result is correct, could this be reproduced at you side?
Hi Joakim,
Will try to look into it from my side.
Thanks,
Kajol
>
> Thanks a lot!
>
> Best Regards,
> Joakim Zhang
>
> -----Original Message-----
> From: kajoljain <[email protected]>
> Sent: February 20, 2020 17:54
> To: Joakim Zhang <[email protected]>; [email protected]
> Cc: [email protected]; [email protected]; Jiri Olsa
> <[email protected]>; Alexander Shishkin <[email protected]>;
> Andi Kleen <[email protected]>; Kan Liang <[email protected]>; Peter
> Zijlstra <[email protected]>; Jin Yao <[email protected]>; Madhavan
> Srinivasan <[email protected]>; Anju T Sudhakar
> <[email protected]>; Ravi Bangoria <[email protected]>
> Subject: Re: [PATCH v4] tools/perf/metricgroup: Fix printing event names of
> metric group with multiple events incase of overlapping events
>
>
>
> On 2/17/20 8:41 AM, Joakim Zhang wrote:
> > SNIP
>
> Hi Joakim,
> Will try to look into it from my side.
Thanks Kajol for your help. I looked into this issue, but don't know how to fix it.
The results are always correct if a single event is used in "MetricExpr" with the "-I" parameter, but the results are incorrect when more than one event is used in "MetricExpr".
Hope you can find the root cause :-)
Best Regards,
Joakim Zhang
> Thanks,
> Kajol
> >
> > Thanks a lot!
> >
> > Best Regards,
> > Joakim Zhang
> >
FYI, we noticed the following commit (built with gcc-7):
commit: 5dd4f4ab07f8257f1d113cc2b9ee95f43d576b9d ("[PATCH v4] tools/perf/metricgroup: Fix printing event names of metric group with multiple events incase of overlapping events")
url: https://github.com/0day-ci/linux/commits/Kajol-Jain/tools-perf-metricgroup-Fix-printing-event-names-of-metric-group-with-multiple-events-incase-of-overlapping-events/20200215-093220
in testcase: perf-sanity-tests
with following parameters:
perf_compiler: gcc
ucode: 0x27
on test machine: 8 threads Intel(R) Core(TM) i7-4790 v3 @ 3.60GHz with 6G memory
caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
If you fix the issue, kindly add following tag
Reported-by: kernel test robot <[email protected]>
user :notice: [ 58.688905] LD trace/beauty/perf-in.o
user :err : [ 58.694839] util/metricgroup.c: In function 'metricgroup__setup_events':
user :notice: [ 58.696757] CC util/evlist.o
user :err : [ 58.705739] util/metricgroup.c:170:3: error: break statement not within loop or switch
user :notice: [ 58.710154] LD arch/x86/tests/perf-in.o
user :err : [ 58.720491] break;
user :notice: [ 58.722367] LD arch/x86/perf-in.o
user :err : [ 58.727486] ^~~~~
user :notice: [ 58.729854] LD arch/perf-in.o
user :err : [ 58.734440] mv: cannot stat 'util/.metricgroup.o.tmp': No such file or directory
user :notice: [ 58.738730] CC ui/tui/util.o
user :err : [ 58.741991] make[4]: *** [util/metricgroup.o] Error 1
user :notice: [ 58.743544] CC tests/hists_link.o
user :err : [ 58.746796] make[4]: *** Waiting for unfinished jobs....
user :notice: [ 58.749865] CC ui/browsers/map.o
user :notice: [ 58.796537] CC util/evsel.o
user :notice: [ 58.803941] CC ui/tui/helpline.o
user :notice: [ 58.811723] CC ui/tui/progress.o
user :notice: [ 58.819597] CC ui/browsers/scripts.o
user :notice: [ 58.827810] CC tests/hists_filter.o
user :notice: [ 58.835934] CC ui/browsers/header.o
user :notice: [ 58.843797] LD ui/tui/perf-in.o
user :notice: [ 58.851633] CC tests/hists_output.o
user :notice: [ 58.859662] CC ui/browsers/res_sample.o
user :notice: [ 58.868059] CC tests/hists_cumulate.o
user :notice: [ 58.876298] CC util/evsel_fprintf.o
user :notice: [ 58.884794] CC util/perf_event_attr_fprintf.o
user :notice: [ 58.893616] CC util/evswitch.o
user :notice: [ 58.901051] CC util/find_bit.o
user :notice: [ 58.908790] CC util/get_current_dir_name.o
user :notice: [ 58.917030] CC util/kallsyms.o
user :notice: [ 58.924272] CC tests/python-use.o
user :notice: [ 58.931786] CC util/levenshtein.o
user :notice: [ 58.939306] CC util/llvm-utils.o
user :notice: [ 58.946793] CC util/mmap.o
user :notice: [ 58.953880] CC util/memswap.o
user :notice: [ 58.961527] BISON util/parse-events-bison.c
user :notice: [ 58.969835] CC util/perf_regs.o
user :notice: [ 58.977405] CC tests/bp_signal.o
user :notice: [ 58.984747] CC util/path.o
user :notice: [ 58.991911] CC tests/bp_signal_overflow.o
user :notice: [ 59.000134] CC util/print_binary.o
user :notice: [ 59.007613] CC util/rlimit.o
user :notice: [ 59.014591] CC util/argv_split.o
user :notice: [ 59.021979] CC util/rbtree.o
user :notice: [ 59.028985] CC tests/bp_account.o
user :notice: [ 59.036173] CC tests/wp.o
user :notice: [ 59.042871] CC util/libstring.o
user :notice: [ 59.050011] CC util/bitmap.o
user :notice: [ 59.056858] CC util/hweight.o
user :notice: [ 59.063966] CC tests/task-exit.o
user :notice: [ 59.071298] CC tests/sw-clock.o
user :notice: [ 59.078342] CC util/smt.o
user :notice: [ 59.084981] CC util/strbuf.o
user :notice: [ 59.092311] CC tests/mmap-thread-lookup.o
user :notice: [ 59.100343] CC util/string.o
user :notice: [ 59.107456] CC tests/thread-maps-share.o
user :notice: [ 59.115581] CC tests/switch-tracking.o
user :notice: [ 59.123304] CC util/strlist.o
user :notice: [ 59.130429] CC tests/keep-tracking.o
user :notice: [ 59.138249] CC tests/code-reading.o
user :notice: [ 59.145689] CC util/strfilter.o
user :notice: [ 59.152629] CC util/top.o
user :notice: [ 59.159112] CC util/usage.o
user :notice: [ 59.165621] CC tests/sample-parsing.o
user :notice: [ 59.172446] CC util/dso.o
user :notice: [ 59.178526] CC tests/parse-no-sample-id-all.o
user :notice: [ 59.186126] CC tests/kmod-path.o
user :notice: [ 59.192493] CC util/dsos.o
user :notice: [ 59.198456] CC tests/thread-map.o
user :notice: [ 59.204913] CC util/symbol.o
user :notice: [ 59.210930] CC tests/llvm.o
user :notice: [ 59.217000] CC util/symbol_fprintf.o
user :notice: [ 59.223692] CC tests/bpf.o
user :notice: [ 59.229547] CC util/color.o
user :notice: [ 59.235563] CC tests/topology.o
user :notice: [ 59.241827] CC tests/mem.o
user :notice: [ 59.247704] CC tests/cpumap.o
user :notice: [ 59.253922] CC util/color_config.o
user :notice: [ 59.260604] LD ui/browsers/perf-in.o
user :notice: [ 59.267346] CC tests/stat.o
user :notice: [ 59.273288] LD ui/perf-in.o
user :notice: [ 59.279415] CC tests/event_update.o
user :notice: [ 59.286157] CC tests/event-times.o
user :notice: [ 59.292673] CC tests/expr.o
user :notice: [ 59.298749] CC util/metricgroup.o
user :notice: [ 59.305223] CC util/header.o
user :notice: [ 59.311512] CC tests/backward-ring-buffer.o
user :notice: [ 59.318814] CC tests/sdt.o
user :notice: [ 59.327108] /usr/src/perf_selftests-x86_64-rhel-7.6-5dd4f4ab07f8257f1d113cc2b9ee95f43d576b9d/tools/build/Makefile.build:96: recipe for target 'util/metricgroup.o' failed
user :notice: [ 59.344991] CC tests/is_printable_array.o
user :notice: [ 59.352277] CC tests/bitmap.o
user :notice: [ 59.358573] CC tests/perf-hooks.o
user :notice: [ 59.365109] CC tests/clang.o
user :notice: [ 59.371510] CC tests/unit_number__scnprintf.o
user :notice: [ 59.379158] CC tests/mem2node.o
user :notice: [ 59.385517] CC tests/maps.o
user :notice: [ 59.391769] CC tests/time-utils-test.o
user :notice: [ 59.398779] CC tests/genelf.o
user :notice: [ 59.405116] CC tests/dwarf-unwind.o
user :notice: [ 59.411975] CC tests/llvm-src-base.o
user :notice: [ 59.418975] CC tests/llvm-src-kbuild.o
user :notice: [ 59.426189] CC tests/llvm-src-prologue.o
user :notice: [ 59.433635] CC tests/llvm-src-relocation.o
user :notice: [ 59.441006] LD tests/perf-in.o
user :err : [ 60.191397] make[3]: *** [util] Error 2
user :notice: [ 60.193667] /usr/src/perf_selftests-x86_64-rhel-7.6-5dd4f4ab07f8257f1d113cc2b9ee95f43d576b9d/tools/build/Makefile.build:139: recipe for target 'util' failed
user :err : [ 60.196176] make[2]: *** [perf-in.o] Error 2
user :notice: [ 60.198386] Makefile.perf:616: recipe for target 'perf-in.o' failed
user :err : [ 60.212393] make[1]: *** [sub-make] Error 2
user :notice: [ 60.214587] Makefile.perf:224: recipe for target 'sub-make' failed
user :err : [ 60.218764] make: *** [all] Error 2
user :notice: [ 60.221026] Makefile:69: recipe for target 'all' failed
user :err : [ 60.227253] make perf failed
user :notice: [ 60.230837] make: Leaving directory '/usr/src/perf_selftests-x86_64-rhel-7.6-5dd4f4ab07f8257f1d113cc2b9ee95f43d576b9d/tools/perf'
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
Thanks,
Rong Chen
On 3/2/20 6:47 AM, kernel test robot wrote:
> FYI, we noticed the following commit (built with gcc-7):
>
> commit: 5dd4f4ab07f8257f1d113cc2b9ee95f43d576b9d ("[PATCH v4] tools/perf/metricgroup: Fix printing event names of metric group with multiple events incase of overlapping events")
>
> SNIP
>
> user :notice: [ 58.876298] CC util/evsel_fprintf.o
>
> user :notice: [ 58.884794] CC util/perf_event_attr_fprintf.o
>
> user :notice: [ 58.893616] CC util/evswitch.o
>
> user :notice: [ 58.901051] CC util/find_bit.o
>
> user :notice: [ 58.908790] CC util/get_current_dir_name.o
>
> user :notice: [ 58.917030] CC util/kallsyms.o
>
> user :notice: [ 58.924272] CC tests/python-use.o
>
> user :notice: [ 58.931786] CC util/levenshtein.o
>
> user :notice: [ 58.939306] CC util/llvm-utils.o
>
> user :notice: [ 58.946793] CC util/mmap.o
>
> user :notice: [ 58.953880] CC util/memswap.o
>
> user :notice: [ 58.961527] BISON util/parse-events-bison.c
>
> user :notice: [ 58.969835] CC util/perf_regs.o
>
> user :notice: [ 58.977405] CC tests/bp_signal.o
>
> user :notice: [ 58.984747] CC util/path.o
>
> user :notice: [ 58.991911] CC tests/bp_signal_overflow.o
>
> user :notice: [ 59.000134] CC util/print_binary.o
>
> user :notice: [ 59.007613] CC util/rlimit.o
>
> user :notice: [ 59.014591] CC util/argv_split.o
>
> user :notice: [ 59.021979] CC util/rbtree.o
>
> user :notice: [ 59.028985] CC tests/bp_account.o
>
> user :notice: [ 59.036173] CC tests/wp.o
>
> user :notice: [ 59.042871] CC util/libstring.o
>
> user :notice: [ 59.050011] CC util/bitmap.o
>
> user :notice: [ 59.056858] CC util/hweight.o
>
> user :notice: [ 59.063966] CC tests/task-exit.o
>
> user :notice: [ 59.071298] CC tests/sw-clock.o
>
> user :notice: [ 59.078342] CC util/smt.o
>
> user :notice: [ 59.084981] CC util/strbuf.o
>
> user :notice: [ 59.092311] CC tests/mmap-thread-lookup.o
>
> user :notice: [ 59.100343] CC util/string.o
>
> user :notice: [ 59.107456] CC tests/thread-maps-share.o
>
> user :notice: [ 59.115581] CC tests/switch-tracking.o
>
> user :notice: [ 59.123304] CC util/strlist.o
>
> user :notice: [ 59.130429] CC tests/keep-tracking.o
>
> user :notice: [ 59.138249] CC tests/code-reading.o
>
> user :notice: [ 59.145689] CC util/strfilter.o
>
> user :notice: [ 59.152629] CC util/top.o
>
> user :notice: [ 59.159112] CC util/usage.o
>
> user :notice: [ 59.165621] CC tests/sample-parsing.o
>
> user :notice: [ 59.172446] CC util/dso.o
>
> user :notice: [ 59.178526] CC tests/parse-no-sample-id-all.o
>
> user :notice: [ 59.186126] CC tests/kmod-path.o
>
> user :notice: [ 59.192493] CC util/dsos.o
>
> user :notice: [ 59.198456] CC tests/thread-map.o
>
> user :notice: [ 59.204913] CC util/symbol.o
>
> user :notice: [ 59.210930] CC tests/llvm.o
>
> user :notice: [ 59.217000] CC util/symbol_fprintf.o
>
> user :notice: [ 59.223692] CC tests/bpf.o
>
> user :notice: [ 59.229547] CC util/color.o
>
> user :notice: [ 59.235563] CC tests/topology.o
>
> user :notice: [ 59.241827] CC tests/mem.o
>
> user :notice: [ 59.247704] CC tests/cpumap.o
>
> user :notice: [ 59.253922] CC util/color_config.o
>
> user :notice: [ 59.260604] LD ui/browsers/perf-in.o
>
> user :notice: [ 59.267346] CC tests/stat.o
>
> user :notice: [ 59.273288] LD ui/perf-in.o
>
> user :notice: [ 59.279415] CC tests/event_update.o
>
> user :notice: [ 59.286157] CC tests/event-times.o
>
> user :notice: [ 59.292673] CC tests/expr.o
>
> user :notice: [ 59.298749] CC util/metricgroup.o
>
> user :notice: [ 59.305223] CC util/header.o
>
> user :notice: [ 59.311512] CC tests/backward-ring-buffer.o
>
> user :notice: [ 59.318814] CC tests/sdt.o
>
> user :notice: [ 59.327108] /usr/src/perf_selftests-x86_64-rhel-7.6-5dd4f4ab07f8257f1d113cc2b9ee95f43d576b9d/tools/build/Makefile.build:96: recipe for target 'util/metricgroup.o' failed
>
> user :notice: [ 59.344991] CC tests/is_printable_array.o
>
> user :notice: [ 59.352277] CC tests/bitmap.o
>
> user :notice: [ 59.358573] CC tests/perf-hooks.o
>
> user :notice: [ 59.365109] CC tests/clang.o
>
> user :notice: [ 59.371510] CC tests/unit_number__scnprintf.o
>
> user :notice: [ 59.379158] CC tests/mem2node.o
>
> user :notice: [ 59.385517] CC tests/maps.o
>
> user :notice: [ 59.391769] CC tests/time-utils-test.o
>
> user :notice: [ 59.398779] CC tests/genelf.o
>
> user :notice: [ 59.405116] CC tests/dwarf-unwind.o
>
> user :notice: [ 59.411975] CC tests/llvm-src-base.o
>
> user :notice: [ 59.418975] CC tests/llvm-src-kbuild.o
>
> user :notice: [ 59.426189] CC tests/llvm-src-prologue.o
>
> user :notice: [ 59.433635] CC tests/llvm-src-relocation.o
>
> user :notice: [ 59.441006] LD tests/perf-in.o
>
> user :err : [ 60.191397] make[3]: *** [util] Error 2
>
> user :notice: [ 60.193667] /usr/src/perf_selftests-x86_64-rhel-7.6-5dd4f4ab07f8257f1d113cc2b9ee95f43d576b9d/tools/build/Makefile.build:139: recipe for target 'util' failed
>
> user :err : [ 60.196176] make[2]: *** [perf-in.o] Error 2
>
> user :notice: [ 60.198386] Makefile.perf:616: recipe for target 'perf-in.o' failed
>
> user :err : [ 60.212393] make[1]: *** [sub-make] Error 2
>
> user :notice: [ 60.214587] Makefile.perf:224: recipe for target 'sub-make' failed
>
> user :err : [ 60.218764] make: *** [all] Error 2
>
> user :notice: [ 60.221026] Makefile:69: recipe for target 'all' failed
>
> user :err : [ 60.227253] make perf failed
>
> user :notice: [ 60.230837] make: Leaving directory '/usr/src/perf_selftests-x86_64-rhel-7.6-5dd4f4ab07f8257f1d113cc2b9ee95f43d576b9d/tools/perf'
>
>
> To reproduce:
>
> git clone https://github.com/intel/lkp-tests.git
> cd lkp-tests
> bin/lkp install job.yaml # job file is attached in this email
> bin/lkp run job.yaml
>
>
Hi Rong Chen,
You are right, I hit that issue in this patch. I have already sent a fix for it; you can check v6 of this patch.
Please let me know if you still see any issue.
Link to the patch: https://lkml.org/lkml/2020/2/21/1166
Thanks,
Kajol
>
> Thanks,
> Rong Chen
>
On 2/20/20 4:06 PM, Joakim Zhang wrote:
>
>> -----Original Message-----
>> From: kajoljain <[email protected]>
>> Sent: 2020年2月20日 17:54
>> To: Joakim Zhang <[email protected]>; [email protected]
>> Cc: [email protected]; [email protected]; Jiri Olsa
>> <[email protected]>; Alexander Shishkin <[email protected]>;
>> Andi Kleen <[email protected]>; Kan Liang <[email protected]>; Peter
>> Zijlstra <[email protected]>; Jin Yao <[email protected]>; Madhavan
>> Srinivasan <[email protected]>; Anju T Sudhakar
>> <[email protected]>; Ravi Bangoria <[email protected]>
>> Subject: Re: [PATCH v4] tools/perf/metricgroup: Fix printing event names of
>> metric group with multiple events incase of overlapping events
>>
>>
>>
>> On 2/17/20 8:41 AM, Joakim Zhang wrote:
>>>
>>>> -----Original Message-----
>>>> From: [email protected]
>>>> <[email protected]> On Behalf Of Kajol Jain
>>>> Sent: 2020年2月12日 13:41
>>>> To: [email protected]
>>>> Cc: [email protected]; [email protected];
>>>> [email protected]; Jiri Olsa <[email protected]>; Alexander Shishkin
>>>> <[email protected]>; Andi Kleen
>>>> <[email protected]>; Kan Liang <[email protected]>; Peter
>>>> Zijlstra <[email protected]>; Jin Yao <[email protected]>;
>>>> Madhavan Srinivasan <[email protected]>; Anju T Sudhakar
>>>> <[email protected]>; Ravi Bangoria
>>>> <[email protected]>
>>>> Subject: [PATCH v4] tools/perf/metricgroup: Fix printing event names
>>>> of metric group with multiple events incase of overlapping events
>>>>
>>>> Commit f01642e4912b ("perf metricgroup: Support multiple events for
>>>> metricgroup") introduced support for multiple events in a metric
>>>> group. But with the current upstream, metric events names are not
>>>> printed properly incase we try to run multiple metric groups with
>> overlapping event.
>>>>
>>>> With current upstream version, incase of overlapping metric events
>>>> issue is, we always start our comparision logic from start.
>>>> So, the events which already matched with some metric group also take
>>>> part in comparision logic. Because of that when we have overlapping
>>>> events, we end up matching current metric group event with already
>> matched one.
>>>>
>>>> For example, in skylake machine we have metric event CoreIPC and
>>>> Instructions. Both of them need 'inst_retired.any' event value.
>>>> As events in Instructions is subset of events in CoreIPC, they endup
>>>> in pointing to same 'inst_retired.any' value.
>>>>
>>>> In skylake platform:
>>>>
>>>> command:# ./perf stat -M CoreIPC,Instructions -C 0 sleep 1
>>>>
>>>> Performance counter stats for 'CPU(s) 0':
>>>>
>>>> 1,254,992,790 inst_retired.any # 1254992790.0
>>>>
>> Instructions
>>>> # 1.3
>>>> CoreIPC
>>>> 977,172,805 cycles
>>>> 1,254,992,756 inst_retired.any
>>>>
>>>> 1.000802596 seconds time elapsed
>>>>
>>>> command:# sudo ./perf stat -M UPI,IPC sleep 1
>>>>
>>>> Performance counter stats for 'sleep 1':
>>>>
>>>> 948,650 uops_retired.retire_slots
>>>> 866,182 inst_retired.any # 0.7 IPC
>>>> 866,182 inst_retired.any
>>>> 1,175,671 cpu_clk_unhalted.thread
>>>>
>>>> Patch fixes the issue by adding a new bool pointer 'evlist_used' to
>>>> keep track of events which already matched with some group by setting it
>> true.
>>>> So, we skip all used events in list when we start comparision logic.
>>>> Patch also make some changes in comparision logic, incase we get a
>>>> match miss, we discard the whole match and start again with first
>>>> event id in metric event.
>>>>
>>>> With this patch:
>>>> In skylake platform:
>>>>
>>>> command:# ./perf stat -M CoreIPC,Instructions -C 0 sleep 1
>>>>
>>>> Performance counter stats for 'CPU(s) 0':
>>>>
>>>> 3,348,415 inst_retired.any # 0.3
>> CoreIPC
>>>> 11,779,026 cycles
>>>> 3,348,381 inst_retired.any # 3348381.0
>>>>
>> Instructions
>>>>
>>>> 1.001649056 seconds time elapsed
>>>>
>>>> command:# ./perf stat -M UPI,IPC sleep 1
>>>>
>>>> Performance counter stats for 'sleep 1':
>>>>
>>>> 1,023,148 uops_retired.retire_slots # 1.1 UPI
>>>> 924,976 inst_retired.any
>>>> 924,976 inst_retired.any # 0.6 IPC
>>>> 1,489,414 cpu_clk_unhalted.thread
>>>>
>>>> 1.003064672 seconds time elapsed
>>>>
>>>> Signed-off-by: Kajol Jain <[email protected]>
>>>> Cc: Jiri Olsa <[email protected]>
>>>> Cc: Alexander Shishkin <[email protected]>
>>>> Cc: Andi Kleen <[email protected]>
>>>> Cc: Kan Liang <[email protected]>
>>>> Cc: Peter Zijlstra <[email protected]>
>>>> Cc: Jin Yao <[email protected]>
>>>> Cc: Madhavan Srinivasan <[email protected]>
>>>> Cc: Anju T Sudhakar <[email protected]>
>>>> Cc: Ravi Bangoria <[email protected]>
>>>> ---
>>>> tools/perf/util/metricgroup.c | 50
>>>> ++++++++++++++++++++++-------------
>>>> 1 file changed, 31 insertions(+), 19 deletions(-)
>>>
>>> Hi Kajol,
>>>
>>> I am not sure if it is good to ask a question here :-)
>>>
>>> I encountered a perf metricgroup issue, the result is incorrect when the
>> metric includes more than 2 events.
>>>
>>> git log --oneline tools/perf/util/metricgroup.c
>>> 3635b27cc058 perf metricgroup: Fix printing event names of metric
>>> group with multiple events f01642e4912b perf metricgroup: Support
>>> multiple events for metricgroup
>>> 287f2649f791 perf metricgroup: Scale the metric result
>>>
>>> I did a simple test, below is the JSON file and result.
>>> [
>>> {
>>> "PublicDescription": "Calculate DDR0 bus actual utilization
>> which vary from DDR0 controller clock frequency",
>>> "BriefDescription": "imx8qm: ddr0 bus actual utilization",
>>> "MetricName": "imx8qm-ddr0-bus-util",
>>> "MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ +
>> imx8_ddr0\\/write\\-cycles\\/ )",
>>> "MetricGroup": "i.MX8QM_DDR0_BUS_UTIL"
>>> }
>>> ]
>>> ./perf stat -I 1000 -M imx8qm-ddr0-bus-util
>>> # time counts unit events
>>> 1.000104250 16720 imx8_ddr0/read-cycles/
>> # 22921.0 imx8qm-ddr0-bus-util
>>> 1.000104250 6201 imx8_ddr0/write-cycles/
>>> 2.000525625 8316 imx8_ddr0/read-cycles/
>> # 12785.5 imx8qm-ddr0-bus-util
>>> 2.000525625 2738 imx8_ddr0/write-cycles/
>>> 3.000819125 1056 imx8_ddr0/read-cycles/
>> # 4136.7 imx8qm-ddr0-bus-util
>>> 3.000819125 303 imx8_ddr0/write-cycles/
>>> 4.001103750 6260 imx8_ddr0/read-cycles/
>> # 9149.8 imx8qm-ddr0-bus-util
>>> 4.001103750 2317 imx8_ddr0/write-cycles/
>>> 5.001392750 2084 imx8_ddr0/read-cycles/
>> # 4516.0 imx8qm-ddr0-bus-util
>>> 5.001392750 601 imx8_ddr0/write-cycles/
>>>
>>> You can see that only the first result is correct, could this be reproduced at
>> you side?
>>
>> Hi Joakim,
>> Will try to look into it from my side.
>
> Thanks Kajol for your help, I look into this issue, but don't know how to fix it.
>
> The results are always correct if signal event used in "MetricExpr" with "-I" parameters, but the results are incorrect when more than one events used in "MetricExpr".
>
Hi Joakim,
I looked into this issue to understand the flow. From my understanding, when we
evaluate a metric expression we do not use the exact counts we collect;
basically, we use the mean value of each event in the metric expression calculation.
So, I am taking the same example you referred to:
Metric Event: imx8qm-ddr0-bus-util
MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ + imx8_ddr0\\/write\\-cycles\\/ )"
command#: ./perf stat -I 1000 -M imx8qm-ddr0-bus-util
# time counts unit events
1.000104250 16720 imx8_ddr0/read-cycles/ # 22921.0 imx8qm-ddr0-bus-util
1.000104250 6201 imx8_ddr0/write-cycles/
2.000525625 8316 imx8_ddr0/read-cycles/ # 12785.5 imx8qm-ddr0-bus-util
2.000525625 2738 imx8_ddr0/write-cycles/
3.000819125 1056 imx8_ddr0/read-cycles/ # 4136.7 imx8qm-ddr0-bus-util
3.000819125 303 imx8_ddr0/write-cycles/
4.001103750 6260 imx8_ddr0/read-cycles/ # 9149.8 imx8qm-ddr0-bus-util
4.001103750 2317 imx8_ddr0/write-cycles/
5.001392750 2084 imx8_ddr0/read-cycles/ # 4516.0 imx8qm-ddr0-bus-util
5.001392750 601 imx8_ddr0/write-cycles/
There is a function called 'update_stats' in file util/stat.c where we do this calculation
and update the stats->mean value. This mean value is what we actually use in the
metric expression calculation.
We call this function in each iteration, updating stats->mean and stats->n for each event.
But one odd detail is that for the very first event, stats->n is always 1, which is why its
mean is the same as its raw count.
This is the reason why, with a single event, you get the exact aggregate of the metric expression.
So no matter how many events you have in your metric expression, every time
you take the exact count for the first one and a normalized value for the rest, which is inconsistent.
In the update_stats function, the mean is updated as:
stats->mean += delta / stats->n where, delta = val - stats->mean.
If we take write-cycles here. Initially mean = 0 and n = 1.
1st iteration: n=1, write cycles: 6201 and mean = 6201 (Final aggr value: 16720 + 6201 = 22921)
2nd iteration: n=2, write cycles: 6201 + (2738 - 6201)/2 = 4469.5 (Final aggr value: 8316 + 4469.5 = 12785.5)
3rd iteration: n=3, write cycles: 4469.5 + (303 - 4469.5)/3 = 3080.6667 (Final aggr value: 1056 + 3080.6667 = 4136.7)
Andi and Jiri, I am not sure whether this is the expected behavior. Shouldn't we either take the mean value
of each event, or take n as 1 for each event? One more question: should we add an option to let the user
choose between the exact aggregate and this normalized aggregate, to remove the confusion? I tried to find
out whether we already have such an option, but could not.
Please let me know if my understanding is correct.
Thanks,
Kajol
> Hope you can find the root cause :-)
>
> Best Regards,
> Joakim Zhang
>> Thanks,
>> Kajol
>>>
>>> Thanks a lot!
>>>
>>> Best Regards,
>>> Joakim Zhang
>>>
[...]
> >>> Hi Kajol,
> >>>
> >>> I am not sure if it is good to ask a question here :-)
> >>>
> >>> I encountered a perf metricgroup issue, the result is incorrect when
> >>> the
> >> metric includes more than 2 events.
> >>>
> >>> git log --oneline tools/perf/util/metricgroup.c
> >>> 3635b27cc058 perf metricgroup: Fix printing event names of metric
> >>> group with multiple events f01642e4912b perf metricgroup: Support
> >>> multiple events for metricgroup
> >>> 287f2649f791 perf metricgroup: Scale the metric result
> >>>
> >>> I did a simple test, below is the JSON file and result.
> >>> [
> >>> {
> >>> "PublicDescription": "Calculate DDR0 bus actual
> >>> utilization
> >> which vary from DDR0 controller clock frequency",
> >>> "BriefDescription": "imx8qm: ddr0 bus actual utilization",
> >>> "MetricName": "imx8qm-ddr0-bus-util",
> >>> "MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ +
> >> imx8_ddr0\\/write\\-cycles\\/ )",
> >>> "MetricGroup": "i.MX8QM_DDR0_BUS_UTIL"
> >>> }
> >>> ]
> >>> ./perf stat -I 1000 -M imx8qm-ddr0-bus-util
> >>> # time counts unit events
> >>> 1.000104250 16720 imx8_ddr0/read-cycles/
> >> # 22921.0 imx8qm-ddr0-bus-util
> >>> 1.000104250 6201
> imx8_ddr0/write-cycles/
> >>> 2.000525625 8316 imx8_ddr0/read-cycles/
> >> # 12785.5 imx8qm-ddr0-bus-util
> >>> 2.000525625 2738
> imx8_ddr0/write-cycles/
> >>> 3.000819125 1056 imx8_ddr0/read-cycles/
> >> # 4136.7 imx8qm-ddr0-bus-util
> >>> 3.000819125 303
> imx8_ddr0/write-cycles/
> >>> 4.001103750 6260 imx8_ddr0/read-cycles/
> >> # 9149.8 imx8qm-ddr0-bus-util
> >>> 4.001103750 2317
> imx8_ddr0/write-cycles/
> >>> 5.001392750 2084 imx8_ddr0/read-cycles/
> >> # 4516.0 imx8qm-ddr0-bus-util
> >>> 5.001392750 601
> imx8_ddr0/write-cycles/
> >>>
> >>> You can see that only the first result is correct, could this be
> >>> reproduced at
> >> you side?
> >>
> >> Hi Joakim,
> >> Will try to look into it from my side.
> >
>
> > Thanks Kajol for your help, I look into this issue, but don't know how to fix it.
> >
> > The results are always correct if signal event used in "MetricExpr" with "-I"
> parameters, but the results are incorrect when more than one events used in
> "MetricExpr".
> >
>
> Hi Joakim,
> So, I try to look into this issue and understand the flow. From my
> understanding, whenever we do
> calculation of metric expression we don't use exact count we are getting.
> Basically we use mean value of each event in the calculation of metric
> expression.
>
> So, I am taking same example you refer.
>
> Metric Event: imx8qm-ddr0-bus-util
> MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ + imx8_ddr0\\/write\\-cycles\\/ )"
>
> command#: ./perf stat -I 1000 -M imx8qm-ddr0-bus-util
>
> # time counts unit events
> 1.000104250 16720 imx8_ddr0/read-cycles/
> # 22921.0 imx8qm-ddr0-bus-util
> 1.000104250 6201 imx8_ddr0/write-cycles/
> 2.000525625 8316 imx8_ddr0/read-cycles/
> # 12785.5 imx8qm-ddr0-bus-util
> 2.000525625 2738 imx8_ddr0/write-cycles/
> 3.000819125 1056 imx8_ddr0/read-cycles/
> # 4136.7 imx8qm-ddr0-bus-util
> 3.000819125 303 imx8_ddr0/write-cycles/
> 4.001103750 6260 imx8_ddr0/read-cycles/
> # 9149.8 imx8qm-ddr0-bus-util
> 4.001103750 2317 imx8_ddr0/write-cycles/
> 5.001392750 2084 imx8_ddr0/read-cycles/
> # 4516.0 imx8qm-ddr0-bus-util
> 5.001392750 601 imx8_ddr0/write-cycles/
>
> If you see we have a function called 'update_stats' in file util/stat.c where we
> do this calculation and updating stats->mean value. And this mean value is
> what we are using actually in our metric expression calculation.
>
> We call this function in each iteration where we update stats->mean and
> stats->n for each event.
> But one weird issue is, for very first event, stat->n is always 1 that is why we
> are getting mean same as count.
> So this is the reason for single event you get exact aggregate of metric
> expression.
> So doesn't matter how many events you have in your metric expression, every
> time you take exact count for first one and normalized value for rest which is
> weird.
>
> According to update_stats function: We are updating mean as:
>
> stats->mean += delta / stats->n where, delta = val - stats->mean.
>
> If we take write-cycles here. Initially mean = 0 and n = 1.
>
> 1st iteration: n=1, write cycle : 6201 and mean = 6201 (Final agg value: 16720
> + 6201 = 22921) 2nd iteration: n=2, write cycles: 6201 + (2738 - 6201)/2 =
> 4469.5 (Final aggr value: 8316 + 4469.5 = 12785.5) 3rd iteration: n=3, write
> cycles: 4469.5 + (303 - 4469.5)/3 = 3080.6667 (Final aggr value: 1056 +
> 3080.6667 = 4136.7)
>
> Andi and Jiri, I am not sure if its expected behavior. I mean shouldn't we either
> take mean value of each event or take n as 1 for each event. And one more
> question, Should we add an option to say whether user want exact aggregate
> or this normalize aggregate to remove the confusion? I try to find it out if we
> already have one but didn't get.
> Please let me know if my understanding is fine.
Hi Kajol,
Sorry, your reply was buried in a sea of emails; it only caught my eye when I searched for feedback from you. Many thanks for the detailed explanation!
I can follow what you explained. As a user, I think we always want the exact result according to the metric expression.
Could you take this case as an example and send a formal email to the mailing list describing this issue, so that more people can participate and discuss it? Or do you need me to write it up and send the email?
That could attract the maintainers' attention.
Best Regards,
Joakim Zhang
On 3/20/20 12:17 PM, Joakim Zhang wrote:
>
> [...]
>>>>> Hi Kajol,
>>>>>
>>>>> I am not sure if it is good to ask a question here :-)
>>>>>
>>>>> I encountered a perf metricgroup issue, the result is incorrect when
>>>>> the
>>>> metric includes more than 2 events.
>>>>>
>>>>> git log --oneline tools/perf/util/metricgroup.c
>>>>> 3635b27cc058 perf metricgroup: Fix printing event names of metric
>>>>> group with multiple events f01642e4912b perf metricgroup: Support
>>>>> multiple events for metricgroup
>>>>> 287f2649f791 perf metricgroup: Scale the metric result
>>>>>
>>>>> I did a simple test, below is the JSON file and result.
>>>>> [
>>>>> {
>>>>> "PublicDescription": "Calculate DDR0 bus actual
>>>>> utilization
>>>> which vary from DDR0 controller clock frequency",
>>>>> "BriefDescription": "imx8qm: ddr0 bus actual utilization",
>>>>> "MetricName": "imx8qm-ddr0-bus-util",
>>>>> "MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ +
>>>> imx8_ddr0\\/write\\-cycles\\/ )",
>>>>> "MetricGroup": "i.MX8QM_DDR0_BUS_UTIL"
>>>>> }
>>>>> ]
>>>>> ./perf stat -I 1000 -M imx8qm-ddr0-bus-util
>>>>> # time counts unit events
>>>>> 1.000104250 16720 imx8_ddr0/read-cycles/
>>>> # 22921.0 imx8qm-ddr0-bus-util
>>>>> 1.000104250 6201
>> imx8_ddr0/write-cycles/
>>>>> 2.000525625 8316 imx8_ddr0/read-cycles/
>>>> # 12785.5 imx8qm-ddr0-bus-util
>>>>> 2.000525625 2738
>> imx8_ddr0/write-cycles/
>>>>> 3.000819125 1056 imx8_ddr0/read-cycles/
>>>> # 4136.7 imx8qm-ddr0-bus-util
>>>>> 3.000819125 303
>> imx8_ddr0/write-cycles/
>>>>> 4.001103750 6260 imx8_ddr0/read-cycles/
>>>> # 9149.8 imx8qm-ddr0-bus-util
>>>>> 4.001103750 2317
>> imx8_ddr0/write-cycles/
>>>>> 5.001392750 2084 imx8_ddr0/read-cycles/
>>>> # 4516.0 imx8qm-ddr0-bus-util
>>>>> 5.001392750 601
>> imx8_ddr0/write-cycles/
>>>>>
>>>>> You can see that only the first result is correct, could this be
>>>>> reproduced at
>>>> you side?
>>>>
>>>> Hi Joakim,
>>>> Will try to look into it from my side.
>>>
>>
>>> Thanks Kajol for your help, I look into this issue, but don't know how to fix it.
>>>
>>> The results are always correct if signal event used in "MetricExpr" with "-I"
>> parameters, but the results are incorrect when more than one events used in
>> "MetricExpr".
>>>
>>
>> Hi Joakim,
>> So, I try to look into this issue and understand the flow. From my
>> understanding, whenever we do
>> calculation of metric expression we don't use exact count we are getting.
>> Basically we use mean value of each event in the calculation of metric
>> expression.
>>
>> So, I am taking same example you refer.
>>
>> Metric Event: imx8qm-ddr0-bus-util
>> MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ + imx8_ddr0\\/write\\-cycles\\/ )"
>>
>> command#: ./perf stat -I 1000 -M imx8qm-ddr0-bus-util
>>
>> # time counts unit events
>> 1.000104250 16720 imx8_ddr0/read-cycles/
>> # 22921.0 imx8qm-ddr0-bus-util
>> 1.000104250 6201 imx8_ddr0/write-cycles/
>> 2.000525625 8316 imx8_ddr0/read-cycles/
>> # 12785.5 imx8qm-ddr0-bus-util
>> 2.000525625 2738 imx8_ddr0/write-cycles/
>> 3.000819125 1056 imx8_ddr0/read-cycles/
>> # 4136.7 imx8qm-ddr0-bus-util
>> 3.000819125 303 imx8_ddr0/write-cycles/
>> 4.001103750 6260 imx8_ddr0/read-cycles/
>> # 9149.8 imx8qm-ddr0-bus-util
>> 4.001103750 2317 imx8_ddr0/write-cycles/
>> 5.001392750 2084 imx8_ddr0/read-cycles/
>> # 4516.0 imx8qm-ddr0-bus-util
>> 5.001392750 601 imx8_ddr0/write-cycles/
>>
>> If you see we have a function called 'update_stats' in file util/stat.c where we
>> do this calculation and updating stats->mean value. And this mean value is
>> what we are using actually in our metric expression calculation.
>>
>> We call this function in each iteration where we update stats->mean and
>> stats->n for each event.
>> But one weird issue is, for very first event, stat->n is always 1 that is why we
>> are getting mean same as count.
>> So this is the reason for single event you get exact aggregate of metric
>> expression.
>> So doesn't matter how many events you have in your metric expression, every
>> time you take exact count for first one and normalized value for rest which is
>> weird.
>>
>> According to update_stats function: We are updating mean as:
>>
>> stats->mean += delta / stats->n where, delta = val - stats->mean.
>>
>> If we take write-cycles here. Initially mean = 0 and n = 1.
>>
>> 1st iteration: n=1, write cycle : 6201 and mean = 6201 (Final agg value: 16720
>> + 6201 = 22921) 2nd iteration: n=2, write cycles: 6201 + (2738 - 6201)/2 =
>> 4469.5 (Final aggr value: 8316 + 4469.5 = 12785.5) 3rd iteration: n=3, write
>> cycles: 4469.5 + (303 - 4469.5)/3 = 3080.6667 (Final aggr value: 1056 +
>> 3080.6667 = 4136.7)
>>
>> Andi and Jiri, I am not sure if its expected behavior. I mean shouldn't we either
>> take mean value of each event or take n as 1 for each event. And one more
>> question, Should we add an option to say whether user want exact aggregate
>> or this normalize aggregate to remove the confusion? I try to find it out if we
>> already have one but didn't get.
>> Please let me know if my understanding is fine.
>
> Hi Kajol,
>
> Sorry, your reply was buried in a sea of emails, it comes into my eyes when I searched any feedback from you. Much thanks for your great details!!!!!
>
> I can quite understand what you explained. As a user, I think we always want to get the exact result according to the metric expression.
>
> Can you take this case as an example then send out a formal email into mailing list to reflect this weird issue, more people can participate and discuss about it. Or you need me clear up and sent out the email?
> This could attract maintainers' attention.
>
Hi Joakim,
Yes, you are right; I will send a separate mail on lkml. That way it will be better.
Thanks,
Kajol
> Best Regards,
> Joakim Zhang
>
Hello All,
I want to discuss an issue raised by Joakim Zhang, who reported that we do not get
the correct result when multiple events are present in a metric
expression.
This is the example he pointed out:
below is the JSON file and result.
[
{
"PublicDescription": "Calculate DDR0 bus actual utilization which vary from DDR0 controller clock frequency",
"BriefDescription": "imx8qm: ddr0 bus actual utilization",
"MetricName": "imx8qm-ddr0-bus-util",
"MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ + imx8_ddr0\\/write\\-cycles\\/ )",
"MetricGroup": "i.MX8QM_DDR0_BUS_UTIL"
}
]
./perf stat -I 1000 -M imx8qm-ddr0-bus-util
# time counts unit events
1.000104250 16720 imx8_ddr0/read-cycles/ # 22921.0 imx8qm-ddr0-bus-util
1.000104250 6201 imx8_ddr0/write-cycles/
2.000525625 8316 imx8_ddr0/read-cycles/ # 12785.5 imx8qm-ddr0-bus-util
2.000525625 2738 imx8_ddr0/write-cycles/
3.000819125 1056 imx8_ddr0/read-cycles/ # 4136.7 imx8qm-ddr0-bus-util
3.000819125 303 imx8_ddr0/write-cycles/
4.001103750 6260 imx8_ddr0/read-cycles/ # 9149.8 imx8qm-ddr0-bus-util
4.001103750 2317 imx8_ddr0/write-cycles/
5.001392750 2084 imx8_ddr0/read-cycles/ # 4516.0 imx8qm-ddr0-bus-util
5.001392750 601 imx8_ddr0/write-cycles/
Based on the given metric expression, the sum is correct for the first interval, while for the
rest we do not see the expected sum. But when we have a single event in the metric
expression, we get the correct result as expected.
So, I looked into this issue to understand the flow. From my understanding, when we
evaluate a metric expression we do not use the exact counts we collect;
basically, we use the mean value of each metric event in the metric expression calculation.
So, I take the same example:
Metric Event: imx8qm-ddr0-bus-util
MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ + imx8_ddr0\\/write\\-cycles\\/ )"
command#: ./perf stat -I 1000 -M imx8qm-ddr0-bus-util
# time counts unit events
1.000104250 16720 imx8_ddr0/read-cycles/ # 22921.0 imx8qm-ddr0-bus-util
1.000104250 6201 imx8_ddr0/write-cycles/
2.000525625 8316 imx8_ddr0/read-cycles/ # 12785.5 imx8qm-ddr0-bus-util
2.000525625 2738 imx8_ddr0/write-cycles/
3.000819125 1056 imx8_ddr0/read-cycles/ # 4136.7 imx8qm-ddr0-bus-util
3.000819125 303 imx8_ddr0/write-cycles/
4.001103750 6260 imx8_ddr0/read-cycles/ # 9149.8 imx8qm-ddr0-bus-util
4.001103750 2317 imx8_ddr0/write-cycles/
5.001392750 2084 imx8_ddr0/read-cycles/ # 4516.0 imx8qm-ddr0-bus-util
5.001392750 601 imx8_ddr0/write-cycles/
There is a function called 'update_stats' in file util/stat.c where we do this calculation
and update the stats->mean value. This mean value is what we actually use in the
metric expression calculation.
We call this function in each iteration, updating stats->mean and stats->n for each event.
But one odd detail is that for the very first event, stats->n is always 1, which is why its
mean is the same as its raw count.
This is the reason why, with a single event, we get the exact aggregate of the metric expression.
So no matter how many events you have in your metric expression, every time
you take the exact count for the first one and a normalized value for the rest, which is inconsistent.
In the update_stats function, the mean is updated as:
stats->mean += delta / stats->n where, delta = val - stats->mean.
If we take write-cycles here. Initially mean = 0 and n = 1.
1st iteration: n=1, write cycles: 6201 and mean = 6201 (Final aggr value: 16720 + 6201 = 22921)
2nd iteration: n=2, write cycles: 6201 + (2738 - 6201)/2 = 4469.5 (Final aggr value: 8316 + 4469.5 = 12785.5)
3rd iteration: n=3, write cycles: 4469.5 + (303 - 4469.5)/3 = 3080.6667 (Final aggr value: 1056 + 3080.6667 = 4136.7)
I am not sure whether this is expected behavior. Shouldn't we either take the mean value for every event, or keep n as 1 for every event?
I am wondering whether we should add an option letting the user choose between the exact aggregate and this normalized aggregate, to remove the confusion. I tried to find out whether such an option already exists, but could not.
Please let me know if my understanding is correct, or whether there is something I can add to resolve this issue.
Thanks,
Kajol
Hi Arnaldo,
Kajol reported this issue almost two months ago and has received no feedback. Do you have any comments? It would be appreciated if you could look into it. Thanks a lot!
Please refer to below link:
https://www.spinics.net/lists/linux-perf-users/msg11011.html
Best Regards,
Joakim Zhang
> -----Original Message-----
> From: kajoljain <[email protected]>
> Sent: 24 March 2020 16:01
> To: Joakim Zhang <[email protected]>; [email protected]; Jiri Olsa
> <[email protected]>; Andi Kleen <[email protected]>
> Cc: [email protected]; [email protected]; Kan Liang
> <[email protected]>; Madhavan Srinivasan
> <[email protected]>; Anju T Sudhakar <[email protected]>;
> Ravi Bangoria <[email protected]>
> Subject: [RFC] Issue in final aggregate value, in case of multiple events present
> in metric expression
>
> Hello All,
> I want to discuss one issue raised by Joakim Zhang where he mentioned
> that, we are not getting correct result in-case of multiple events present in
> metric expression.
>
> This is one example pointed by him :
>
> below is the JSON file and result.
> [
> {
> "PublicDescription": "Calculate DDR0 bus actual utilization
> which vary from DDR0 controller clock frequency",
> "BriefDescription": "imx8qm: ddr0 bus actual utilization",
> "MetricName": "imx8qm-ddr0-bus-util",
> "MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ +
> imx8_ddr0\\/write\\-cycles\\/ )",
> "MetricGroup": "i.MX8QM_DDR0_BUS_UTIL"
> }
> ]
> ./perf stat -I 1000 -M imx8qm-ddr0-bus-util
> # time counts unit events
> 1.000104250 16720 imx8_ddr0/read-cycles/
> # 22921.0 imx8qm-ddr0-bus-util
> 1.000104250 6201 imx8_ddr0/write-cycles/
> 2.000525625 8316 imx8_ddr0/read-cycles/
> # 12785.5 imx8qm-ddr0-bus-util
> 2.000525625 2738 imx8_ddr0/write-cycles/
> 3.000819125 1056 imx8_ddr0/read-cycles/
> # 4136.7 imx8qm-ddr0-bus-util
> 3.000819125 303 imx8_ddr0/write-cycles/
> 4.001103750 6260 imx8_ddr0/read-cycles/
> # 9149.8 imx8qm-ddr0-bus-util
> 4.001103750 2317 imx8_ddr0/write-cycles/
> 5.001392750 2084 imx8_ddr0/read-cycles/
> # 4516.0 imx8qm-ddr0-bus-util
> 5.001392750 601 imx8_ddr0/write-cycles/
>
> Based on given metric expression, the sum coming correct for first iteration
> while for rest, we won't see same addition result. But in-case we have single
> event in metric expression, we are getting correct result as expected.
>
>
> So, I try to look into this issue and understand the flow. From my understanding,
> whenever we do calculation of metric expression we don't use exact count we
> are getting.
> Basically we use mean value of each metric event in the calculation of metric
> expression.
>
> So, I take same example:
>
> Metric Event: imx8qm-ddr0-bus-util
> MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ + imx8_ddr0\\/write\\-cycles\\/ )"
>
> command#: ./perf stat -I 1000 -M imx8qm-ddr0-bus-util
>
> # time counts unit events
> 1.000104250 16720 imx8_ddr0/read-cycles/
> # 22921.0 imx8qm-ddr0-bus-util
> 1.000104250 6201 imx8_ddr0/write-cycles/
> 2.000525625 8316 imx8_ddr0/read-cycles/
> # 12785.5 imx8qm-ddr0-bus-util
> 2.000525625 2738 imx8_ddr0/write-cycles/
> 3.000819125 1056 imx8_ddr0/read-cycles/
> # 4136.7 imx8qm-ddr0-bus-util
> 3.000819125 303 imx8_ddr0/write-cycles/
> 4.001103750 6260 imx8_ddr0/read-cycles/
> # 9149.8 imx8qm-ddr0-bus-util
> 4.001103750 2317 imx8_ddr0/write-cycles/
> 5.001392750 2084 imx8_ddr0/read-cycles/
> # 4516.0 imx8qm-ddr0-bus-util
> 5.001392750 601 imx8_ddr0/write-cycles/
>
> So, there is one function called 'update_stats' in file util/stat.c where we do this
> calculation and updating stats->mean value. And this mean value is what we
> are actually using in our metric expression calculation.
>
> We call this function in each iteration where we update stats->mean and
> stats->n for each event.
> But one weird issue is, for very first event, stat->n is always 1 that is why we
> are getting mean same as count.
>
> So this the reason why for single event we get exact aggregate of metric
> expression.
> So doesn't matter how many events you have in your metric expression, every
> time you take exact count for first one and normalized value for rest which is
> weird.
>
> According to update_stats function: We are updating mean as:
>
> stats->mean += delta / stats->n where, delta = val - stats->mean.
>
> If we take write-cycles here. Initially mean = 0 and n = 1.
>
> 1st iteration: n=1, write cycle : 6201 and mean = 6201 (Final agg value: 16720
> + 6201 = 22921) 2nd iteration: n=2, write cycles: 6201 + (2738 - 6201)/2 =
> 4469.5 (Final aggr value: 8316 + 4469.5 = 12785.5) 3rd iteration: n=3, write
> cycles: 4469.5 + (303 - 4469.5)/3 = 3080.6667 (Final aggr value: 1056 +
> 3080.6667 = 4136.7)
>
> I am not sure if its expected behavior. I mean shouldn't we either take mean
> value of each event or take n as 1 for each event.
>
>
> I am thinking, Should we add an option to say whether user want exact
> aggregate or
> this normalize aggregate to remove the confusion? I try to find it out if we
> already have one but didn't get.
> Please let me know if my understanding is fine. Or something I can add to
> resolve this issue.
>
> Thanks,
> Kajol
On Sat, May 09, 2020 at 05:52:53AM +0000, Joakim Zhang wrote:
>
> Hi Arnaldo,
>
> Kajol reflects this issue for almost two months, got no feedbacks, do you have any comments? That could be appreciated if you can look into it. Thanks a lot!
Jin Yao did a lot of fixes to metrics recently. Does it still happen with
the latest tip tree?
-Andi
Hi Kajol,
On 3/24/2020 4:00 PM, kajoljain wrote:
> Hello All,
> I want to discuss one issue raised by Joakim Zhang where he mentioned
> that, we are not getting correct result in-case of multiple events present in metric
> expression.
>
> This is one example pointed by him :
>
> below is the JSON file and result.
> [
> {
> "PublicDescription": "Calculate DDR0 bus actual utilization which vary from DDR0 controller clock frequency",
> "BriefDescription": "imx8qm: ddr0 bus actual utilization",
> "MetricName": "imx8qm-ddr0-bus-util",
> "MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ + imx8_ddr0\\/write\\-cycles\\/ )",
> "MetricGroup": "i.MX8QM_DDR0_BUS_UTIL"
> }
> ]
> ./perf stat -I 1000 -M imx8qm-ddr0-bus-util
> # time counts unit events
> 1.000104250 16720 imx8_ddr0/read-cycles/ # 22921.0 imx8qm-ddr0-bus-util
> 1.000104250 6201 imx8_ddr0/write-cycles/
> 2.000525625 8316 imx8_ddr0/read-cycles/ # 12785.5 imx8qm-ddr0-bus-util
> 2.000525625 2738 imx8_ddr0/write-cycles/
> 3.000819125 1056 imx8_ddr0/read-cycles/ # 4136.7 imx8qm-ddr0-bus-util
> 3.000819125 303 imx8_ddr0/write-cycles/
> 4.001103750 6260 imx8_ddr0/read-cycles/ # 9149.8 imx8qm-ddr0-bus-util
> 4.001103750 2317 imx8_ddr0/write-cycles/
> 5.001392750 2084 imx8_ddr0/read-cycles/ # 4516.0 imx8qm-ddr0-bus-util
> 5.001392750 601 imx8_ddr0/write-cycles/
>
> Based on given metric expression, the sum coming correct for first iteration while for
> rest, we won't see same addition result. But in-case we have single event in metric
> expression, we are getting correct result as expected.
>
>
> So, I try to look into this issue and understand the flow. From my understanding, whenever we do
> calculation of metric expression we don't use exact count we are getting.
> Basically we use mean value of each metric event in the calculation of metric expression.
>
> So, I take same example:
>
> Metric Event: imx8qm-ddr0-bus-util
> MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ + imx8_ddr0\\/write\\-cycles\\/ )"
>
> command#: ./perf stat -I 1000 -M imx8qm-ddr0-bus-util
>
> # time counts unit events
> 1.000104250 16720 imx8_ddr0/read-cycles/ # 22921.0 imx8qm-ddr0-bus-util
> 1.000104250 6201 imx8_ddr0/write-cycles/
> 2.000525625 8316 imx8_ddr0/read-cycles/ # 12785.5 imx8qm-ddr0-bus-util
> 2.000525625 2738 imx8_ddr0/write-cycles/
> 3.000819125 1056 imx8_ddr0/read-cycles/ # 4136.7 imx8qm-ddr0-bus-util
> 3.000819125 303 imx8_ddr0/write-cycles/
> 4.001103750 6260 imx8_ddr0/read-cycles/ # 9149.8 imx8qm-ddr0-bus-util
> 4.001103750 2317 imx8_ddr0/write-cycles/
> 5.001392750 2084 imx8_ddr0/read-cycles/ # 4516.0 imx8qm-ddr0-bus-util
> 5.001392750 601 imx8_ddr0/write-cycles/
>
> So, there is one function called 'update_stats' in file util/stat.c where we do this calculation
> and updating stats->mean value. And this mean value is what we are actually using in our
> metric expression calculation.
>
> We call this function in each iteration where we update stats->mean and stats->n for each event.
> But one weird issue is, for very first event, stat->n is always 1 that is why we are getting
> mean same as count.
>
> So this the reason why for single event we get exact aggregate of metric expression.
> So doesn't matter how many events you have in your metric expression, every time
> you take exact count for first one and normalized value for rest which is weird.
>
> According to update_stats function: We are updating mean as:
>
> stats->mean += delta / stats->n where, delta = val - stats->mean.
>
> If we take write-cycles here. Initially mean = 0 and n = 1.
>
> 1st iteration: n=1, write cycle : 6201 and mean = 6201 (Final agg value: 16720 + 6201 = 22921)
> 2nd iteration: n=2, write cycles: 6201 + (2738 - 6201)/2 = 4469.5 (Final aggr value: 8316 + 4469.5 = 12785.5)
> 3rd iteration: n=3, write cycles: 4469.5 + (303 - 4469.5)/3 = 3080.6667 (Final aggr value: 1056 + 3080.6667 = 4136.7)
>
> I am not sure if its expected behavior. I mean shouldn't we either take mean value of each event
> or take n as 1 for each event.
>
>
> I am thinking, Should we add an option to say whether user want exact aggregate or
> this normalize aggregate to remove the confusion? I try to find it out if we already have one but didn't get.
> Please let me know if my understanding is fine. Or something I can add to resolve this issue.
>
> Thanks,
> Kajol
>
Since you use the interval mode, can this commit fix the issue?
http://lore.kernel.org/lkml/[email protected]
Thanks
Jin Yao
> -----Original Message-----
> From: Jin, Yao <[email protected]>
> Sent: 11 May 2020 9:12
> To: kajoljain <[email protected]>; Joakim Zhang
> <[email protected]>; [email protected]; Jiri Olsa <[email protected]>;
> Andi Kleen <[email protected]>
> Cc: [email protected]; [email protected]; Kan Liang
> <[email protected]>; Madhavan Srinivasan
> <[email protected]>; Anju T Sudhakar <[email protected]>;
> Ravi Bangoria <[email protected]>
> Subject: Re: [RFC] Issue in final aggregate value, in case of multiple events
> present in metric expression
>
> Hi Kajol,
>
> On 3/24/2020 4:00 PM, kajoljain wrote:
> > Hello All,
> > I want to discuss one issue raised by Joakim Zhang where he mentioned
> > that, we are not getting correct result in-case of multiple events
> > present in metric expression.
> >
> > This is one example pointed by him :
> >
> > below is the JSON file and result.
> > [
> > {
> > "PublicDescription": "Calculate DDR0 bus actual utilization
> which vary from DDR0 controller clock frequency",
> > "BriefDescription": "imx8qm: ddr0 bus actual utilization",
> > "MetricName": "imx8qm-ddr0-bus-util",
> > "MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ +
> imx8_ddr0\\/write\\-cycles\\/ )",
> > "MetricGroup": "i.MX8QM_DDR0_BUS_UTIL"
> > }
> > ]
> > ./perf stat -I 1000 -M imx8qm-ddr0-bus-util
> > # time counts unit events
> > 1.000104250 16720 imx8_ddr0/read-cycles/
> # 22921.0 imx8qm-ddr0-bus-util
> > 1.000104250 6201 imx8_ddr0/write-cycles/
> > 2.000525625 8316 imx8_ddr0/read-cycles/
> # 12785.5 imx8qm-ddr0-bus-util
> > 2.000525625 2738 imx8_ddr0/write-cycles/
> > 3.000819125 1056 imx8_ddr0/read-cycles/
> # 4136.7 imx8qm-ddr0-bus-util
> > 3.000819125 303 imx8_ddr0/write-cycles/
> > 4.001103750 6260 imx8_ddr0/read-cycles/
> # 9149.8 imx8qm-ddr0-bus-util
> > 4.001103750 2317 imx8_ddr0/write-cycles/
> > 5.001392750 2084 imx8_ddr0/read-cycles/
> # 4516.0 imx8qm-ddr0-bus-util
> > 5.001392750 601 imx8_ddr0/write-cycles/
> >
> > Based on given metric expression, the sum coming correct for first
> > iteration while for rest, we won't see same addition result. But
> > in-case we have single event in metric expression, we are getting correct
> result as expected.
> >
> >
> > So, I try to look into this issue and understand the flow. From my
> > understanding, whenever we do calculation of metric expression we don't use
> exact count we are getting.
> > Basically we use mean value of each metric event in the calculation of metric
> expression.
> >
> > So, I take same example:
> >
> > Metric Event: imx8qm-ddr0-bus-util
> > MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ +
> imx8_ddr0\\/write\\-cycles\\/ )"
> >
> > command#: ./perf stat -I 1000 -M imx8qm-ddr0-bus-util
> >
> > # time counts unit events
> > 1.000104250 16720 imx8_ddr0/read-cycles/
> # 22921.0 imx8qm-ddr0-bus-util
> > 1.000104250 6201 imx8_ddr0/write-cycles/
> > 2.000525625 8316 imx8_ddr0/read-cycles/
> # 12785.5 imx8qm-ddr0-bus-util
> > 2.000525625 2738 imx8_ddr0/write-cycles/
> > 3.000819125 1056 imx8_ddr0/read-cycles/
> # 4136.7 imx8qm-ddr0-bus-util
> > 3.000819125 303 imx8_ddr0/write-cycles/
> > 4.001103750 6260 imx8_ddr0/read-cycles/
> # 9149.8 imx8qm-ddr0-bus-util
> > 4.001103750 2317 imx8_ddr0/write-cycles/
> > 5.001392750 2084 imx8_ddr0/read-cycles/
> # 4516.0 imx8qm-ddr0-bus-util
> > 5.001392750 601 imx8_ddr0/write-cycles/
> >
> > So, there is one function called 'update_stats' in file util/stat.c
> > where we do this calculation and updating stats->mean value. And this
> > mean value is what we are actually using in our metric expression calculation.
> >
> > We call this function in each iteration where we update stats->mean and
> stats->n for each event.
> > But one weird issue is, for very first event, stat->n is always 1 that
> > is why we are getting mean same as count.
> >
> > So this the reason why for single event we get exact aggregate of metric
> expression.
> > So doesn't matter how many events you have in your metric expression,
> > every time you take exact count for first one and normalized value for rest
> which is weird.
> >
> > According to update_stats function: We are updating mean as:
> >
> > stats->mean += delta / stats->n where, delta = val - stats->mean.
> >
> > If we take write-cycles here. Initially mean = 0 and n = 1.
> >
> > 1st iteration: n=1, write cycle : 6201 and mean = 6201 (Final agg
> > value: 16720 + 6201 = 22921) 2nd iteration: n=2, write cycles: 6201 +
> > (2738 - 6201)/2 = 4469.5 (Final aggr value: 8316 + 4469.5 = 12785.5)
> > 3rd iteration: n=3, write cycles: 4469.5 + (303 - 4469.5)/3 =
> > 3080.6667 (Final aggr value: 1056 + 3080.6667 = 4136.7)
> >
> > I am not sure if its expected behavior. I mean shouldn't we either
> > take mean value of each event or take n as 1 for each event.
> >
> >
> > I am thinking, Should we add an option to say whether user want exact
> > aggregate or this normalize aggregate to remove the confusion? I try to find
> it out if we already have one but didn't get.
> > Please let me know if my understanding is fine. Or something I can add to
> resolve this issue.
> >
> > Thanks,
> > Kajol
> >
>
> Since you use the interval mode, can this commit fix the issue?
>
> http://lore.kernel.org/lkml/20200420145417.6864-1-yao.jin@linux.intel.com
Hi Andi, Jin Yao
The patch fixes this issue, thanks.
Best Regards,
Joakim Zhang
> Thanks
> Jin Yao
Hi Jin Yao, Andi,
I applied this patch on the L5.4 kernel, and it does fix this issue; we get the correct result in interval mode.
1) "MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ + imx8_ddr0\\/write\\-cycles\\/ )",
We always get the correct result for expression 1), regardless of the interval value.
2) "MetricExpr": "( imx8_ddr0\\/read\\-cycles\\/ + imx8_ddr0\\/write\\-cycles\\/ ) / duration_time",
However, when I use the duration_time event in expression 2), we cannot always get the correct result.
When the interval is <= 4s (-I 4000), it calculates correctly. The result is (107016 + 15910) / 4 = 30731.5, correct!
root@imx8mpevk:~# ./perf stat -a -I 4000 -M imx8mp-lpddr4-bandwidth-usage
# time counts unit events
4.000704750 107016 imx8_ddr0/read-cycles/ # 30731.5 imx8mp-lpddr4-bandwidth-usage
4.000704750 15910 imx8_ddr0/write-cycles/
4.000704750 4000704750 ns duration_time
When the interval is >= 5s (-I 5000), the result is incorrect. It should be (235948 + 42878) / 5 = 55765.2, but perf reports 395479.5.
root@imx8mpevk:~# ./perf stat -a -I 5000 -M imx8mp-lpddr4-bandwidth-usage
# time counts unit events
5.000733250 235948 imx8_ddr0/read-cycles/ # 395479.5 imx8mp-lpddr4-bandwidth-usage
5.000733250 42878 imx8_ddr0/write-cycles/
5.000733250 5000733250 ns duration_time
Do you know whether there are any limitations of the duration_time event? Thanks a lot!
Best Regards,
Joakim Zhang
> Thanks
> Jin Yao