2021-09-22 10:19:09

by [email protected]

Subject: [PATCH v2 0/2] libperf: Add support for scaling counters obtained from the read() system call during multiplexing

This patch series supports counter scaling when perf_evsel__read() obtains a counter
using the read() system call during multiplexing.

The first patch adds scaling of counters obtained from the read() system call
during multiplexing.

The second patch adds a test for the first patch.
This patch is based on Vince's rdpmc_multiplexing.c [1]

---
Changes in v2:
- Avoid dividing by zero when scaling counters
- Add test to verify that no division by zero occurs


[1] https://github.com/deater/perf_event_tests/blob/master/tests/rdpmc/rdpmc_multiplexing.c


nakamura shunsuke (2):
libperf: Add processing to scale the counters obtained during the
read() system call when multiplexing
libperf tests: Add test_stat_multiplexing test

tools/lib/perf/evsel.c | 6 +
tools/lib/perf/tests/test-evlist.c | 183 +++++++++++++++++++++++++++++
2 files changed, 189 insertions(+)

--
2.27.0


2021-09-22 10:19:34

by [email protected]

Subject: [PATCH v2 1/2] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing

From: nakamura shunsuke <[email protected]>

perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but
does not scale counters obtained by read() system call.

Add processing to perf_evsel__read() to scale the counters obtained during the
read() system call when multiplexing.


Signed-off-by: Shunsuke Nakamura <[email protected]>
---
tools/lib/perf/evsel.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
index 8441e3e1aaac..0ebd1d34436f 100644
--- a/tools/lib/perf/evsel.c
+++ b/tools/lib/perf/evsel.c
@@ -18,6 +18,7 @@
 #include <sys/ioctl.h>
 #include <sys/mman.h>
 #include <asm/bug.h>
+#include <linux/math64.h>

 void perf_evsel__init(struct perf_evsel *evsel, struct perf_event_attr *attr,
                       int idx)
@@ -321,6 +322,11 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
         if (readn(*fd, count->values, size) <= 0)
                 return -errno;

+        if (count->ena != count->run) {
+                if (count->run != 0)
+                        count->val = mul_u64_u64_div64(count->val, count->ena, count->run);
+        }
+
         return 0;
 }

--
2.27.0
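
For readers following the arithmetic, the check added by the patch above can be sketched in standalone C. This is only an illustration under stated assumptions: scale_count() and scale_count_examples() are hypothetical names, not libperf functions, and the 128-bit intermediate (a GCC/Clang extension) stands in for the overflow-safe multiply-then-divide that mul_u64_u64_div64() is used for in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of libperf): estimate the full count of a
 * multiplexed event from the value seen while it was actually scheduled.
 *
 *   val: raw counter value
 *   ena: time the event was enabled
 *   run: time the event was actually counting on the PMU
 */
static uint64_t scale_count(uint64_t val, uint64_t ena, uint64_t run)
{
	/*
	 * No multiplexing (ena == run) or never scheduled (run == 0):
	 * return the value unchanged, mirroring the two checks in the patch.
	 */
	if (ena == run || run == 0)
		return val;

	/*
	 * val * ena can exceed 64 bits, so widen before dividing.
	 * unsigned __int128 is a GCC/Clang extension (assumption).
	 */
	return (uint64_t)(((unsigned __int128)val * ena) / run);
}

/* A few spot checks that double as usage examples. */
static void scale_count_examples(void)
{
	/* Counted for half of the enabled time: the estimate doubles. */
	assert(scale_count(100, 200, 100) == 200);
	/* Never scheduled: no division is attempted. */
	assert(scale_count(100, 200, 0) == 100);
	/* Not multiplexed: value is returned as read. */
	assert(scale_count(100, 50, 50) == 100);
	/* The product overflows 64 bits but the quotient still fits. */
	assert(scale_count(1ULL << 62, 6, 3) == (1ULL << 63));
}
```

Note that with run == 0 the value is simply passed through, which is what v2 does; whether that or some explicit "not counted" indication is preferable is a design question the sketch does not settle.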

2021-09-22 21:37:08

by Jiri Olsa

Subject: Re: [PATCH v2 1/2] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing

On Wed, Sep 22, 2021 at 07:16:26PM +0900, Shunsuke Nakamura wrote:
> From: nakamura shunsuke <[email protected]>
>
> perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but
> does not scale counters obtained by read() system call.
>
> Add processing to perf_evsel__read() to scale the counters obtained during the
> read() system call when multiplexing.
>
>
> Signed-off-by: Shunsuke Nakamura <[email protected]>
> ---
> tools/lib/perf/evsel.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
> index 8441e3e1aaac..0ebd1d34436f 100644
> --- a/tools/lib/perf/evsel.c
> +++ b/tools/lib/perf/evsel.c
> @@ -18,6 +18,7 @@
> #include <sys/ioctl.h>
> #include <sys/mman.h>
> #include <asm/bug.h>
> +#include <linux/math64.h>
>
> void perf_evsel__init(struct perf_evsel *evsel, struct perf_event_attr *attr,
> int idx)
> @@ -321,6 +322,11 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
> if (readn(*fd, count->values, size) <= 0)
> return -errno;
>
> + if (count->ena != count->run) {
> + if (count->run != 0)
> + count->val = mul_u64_u64_div64(count->val, count->ena, count->run);
> + }

so I think perf stat expects raw values in there and does the
scaling by itself, please check the following code:

read_counters
  read_affinity_counters
    read_counter_cpu
      read_single_counter
        evsel__read_counter

  perf_stat_process_counter
    process_counter_maps
      process_counter_values
        perf_counts_values__scale


perhaps we could export perf_counts_values__scale if it'd be any help

jirka


> +
> return 0;
> }
>
> --
> 2.27.0
>
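
The split described in this reply (the read returns raw values, and the caller scales afterwards) can be sketched roughly as follows. This is a minimal illustration, not the perf implementation: struct counts, counts_scale() and the -1/0/1 status convention are assumptions made for the example, and unsigned __int128 is a GCC/Clang extension.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for a (value, enabled, running) triple. */
struct counts {
	uint64_t val;
	uint64_t ena;
	uint64_t run;
};

/*
 * Hypothetical scaling step applied by the caller after a raw read.
 * The return value tells the caller what happened:
 *   -1: the event never ran, so the value is forced to 0
 *    0: nothing was changed (not multiplexed, or scaling not requested)
 *    1: the value was scaled by ena / run
 */
static int counts_scale(struct counts *c, bool scale)
{
	assert(c != NULL);

	if (!scale)
		return 0;

	if (c->run == 0) {
		c->val = 0;
		return -1;
	}

	if (c->run < c->ena) {
		/* Widen to 128 bits so val * ena cannot overflow. */
		c->val = (uint64_t)(((unsigned __int128)c->val * c->ena) / c->run);
		return 1;
	}

	return 0;
}
```

Keeping the raw triple intact until this step lets a caller that wants unscaled counts pass scale = false, while a caller that wants an estimate can also learn whether the number it got was extrapolated.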

2021-09-28 12:49:44

by [email protected]

Subject: Re: [PATCH v2 1/2] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing

Hi Jirka

> On Wed, Sep 22, 2021 at 07:16:26PM +0900, Shunsuke Nakamura wrote:
> > From: nakamura shunsuke <[email protected]>
> >
> > perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but
> > does not scale counters obtained by read() system call.
> >
> > Add processing to perf_evsel__read() to scale the counters obtained during the
> > read() system call when multiplexing.
> >
> >
> > Signed-off-by: Shunsuke Nakamura <[email protected]>
> > ---
> >  tools/lib/perf/evsel.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
> > index 8441e3e1aaac..0ebd1d34436f 100644
> > --- a/tools/lib/perf/evsel.c
> > +++ b/tools/lib/perf/evsel.c
> > @@ -18,6 +18,7 @@
> >  #include <sys/ioctl.h>
> >  #include <sys/mman.h>
> >  #include <asm/bug.h>
> > +#include <linux/math64.h>
> > 
> >  void perf_evsel__init(struct perf_evsel *evsel, struct perf_event_attr *attr,
> >                      int idx)
> > @@ -321,6 +322,11 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
> >        if (readn(*fd, count->values, size) <= 0)
> >                return -errno;
> > 
> > +     if (count->ena != count->run) {
> > +             if (count->run != 0)
> > +                     count->val = mul_u64_u64_div64(count->val, count->ena, count->run);
> > +     }
>
> so I think perf stat expect raw values in there and does the
> scaling by itself, please check following code:
>
> read_counters
>   read_affinity_counters
>     read_counter_cpu
>       read_single_counter
>         evsel__read_counter
>
>   perf_stat_process_counter
>     process_counter_maps
>       process_counter_values
>         perf_counts_values__scale
>
>
> perhaps we could export perf_counts_values__scale if it'd be any help

Thank you for your comment.

The purpose of this patch is to unify the counters obtained with
perf_evsel__read() to scaled or unscaled values.

perf_evsel__read() gets the counter via perf_mmap__read_self() if RDPMC is
available, and otherwise via readn(). In the current implementation, the
caller gets a scaled counter on the RDPMC path but an unscaled counter on
the readn() path.

However, the caller cannot know which path was taken.

If caller expects a raw value, I think the RDPMC path should also
return an unscaled counter.

diff --git a/tools/lib/perf/mmap.c b/tools/lib/perf/mmap.c
index c89dfa5..aaa4579 100644
--- a/tools/lib/perf/mmap.c
+++ b/tools/lib/perf/mmap.c
@@ -353,8 +353,6 @@ int perf_mmap__read_self(struct perf_mmap *map, struct perf_counts_values *count
                 count->ena += delta;
                 if (idx)
                         count->run += delta;
-
-                cnt = mul_u64_u64_div64(cnt, count->ena, count->run);
         }

         count->val = cnt;

Rob, do you have any comments?

Best Regards
Shunsuke

2021-10-05 16:39:09

by Rob Herring (Arm)

Subject: Re: [PATCH v2 1/2] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing

On Tue, Sep 28, 2021 at 7:41 AM [email protected]
<[email protected]> wrote:
>
> Hi Jirka
>
> > On Wed, Sep 22, 2021 at 07:16:26PM +0900, Shunsuke Nakamura wrote:
> > > From: nakamura shunsuke <[email protected]>
> > >
> > > perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but
> > > does not scale counters obtained by read() system call.
> > >
> > > Add processing to perf_evsel__read() to scale the counters obtained during the
> > > read() system call when multiplexing.
> > >
> > >
> > > Signed-off-by: Shunsuke Nakamura <[email protected]>
> > > ---
> > > tools/lib/perf/evsel.c | 6 ++++++
> > > 1 file changed, 6 insertions(+)
> > >
> > > diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
> > > index 8441e3e1aaac..0ebd1d34436f 100644
> > > --- a/tools/lib/perf/evsel.c
> > > +++ b/tools/lib/perf/evsel.c
> > > @@ -18,6 +18,7 @@
> > > #include <sys/ioctl.h>
> > > #include <sys/mman.h>
> > > #include <asm/bug.h>
> > > +#include <linux/math64.h>
> > >
> > > void perf_evsel__init(struct perf_evsel *evsel, struct perf_event_attr *attr,
> > > int idx)
> > > @@ -321,6 +322,11 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
> > > if (readn(*fd, count->values, size) <= 0)
> > > return -errno;
> > >
> > > + if (count->ena != count->run) {
> > > + if (count->run != 0)
> > > + count->val = mul_u64_u64_div64(count->val, count->ena, count->run);
> > > + }
> >
> > so I think perf stat expect raw values in there and does the
> > scaling by itself, please check following code:
> >
> > read_counters
> >   read_affinity_counters
> >     read_counter_cpu
> >       read_single_counter
> >         evsel__read_counter
> >
> >   perf_stat_process_counter
> >     process_counter_maps
> >       process_counter_values
> >         perf_counts_values__scale
> >
> >
> > perhaps we could export perf_counts_values__scale if it'd be any help
>
> Thank you for your comment.
>
> The purpose of this patch is to unify the counters obtained with
> perf_evsel__read() to scaled or unscaled values.
>
> perf_evsel__read() gets counter by perf_mmap__read_self() if RDPMC is
> available, else gets by readn(). In current implementation, caller
> gets scaled counter if goes through RDPMC path, otherwise gets unscaled
> counter via readn() path.
>
> However caller cannot know which path was taken.
>
> If caller expects a raw value, I think the RDPMC path should also
> return an unscaled counter.
>
> diff --git a/tools/lib/perf/mmap.c b/tools/lib/perf/mmap.c
> index c89dfa5..aaa4579 100644
> --- a/tools/lib/perf/mmap.c
> +++ b/tools/lib/perf/mmap.c
> @@ -353,8 +353,6 @@ int perf_mmap__read_self(struct perf_mmap *map, struct perf_counts_values *count
> count->ena += delta;
> if (idx)
> count->run += delta;
> -
> - cnt = mul_u64_u64_div64(cnt, count->ena, count->run);
> }
>
> count->val = cnt;
>
> Rob, do you have any comments?

Submit a proper patch with the above.

Rob

2021-10-07 17:19:21

by Jiri Olsa

Subject: Re: [PATCH v2 1/2] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing

On Tue, Sep 28, 2021 at 09:53:24AM +0000, [email protected] wrote:
> Hi Jirka
>
> > On Wed, Sep 22, 2021 at 07:16:26PM +0900, Shunsuke Nakamura wrote:
> > > From: nakamura shunsuke <[email protected]>
> > >
> > > perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but
> > > does not scale counters obtained by read() system call.
> > >
> > > Add processing to perf_evsel__read() to scale the counters obtained during the
> > > read() system call when multiplexing.
> > >
> > >
> > > Signed-off-by: Shunsuke Nakamura <[email protected]>
> > > ---
> > >  tools/lib/perf/evsel.c | 6 ++++++
> > >  1 file changed, 6 insertions(+)
> > >
> > > diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
> > > index 8441e3e1aaac..0ebd1d34436f 100644
> > > --- a/tools/lib/perf/evsel.c
> > > +++ b/tools/lib/perf/evsel.c
> > > @@ -18,6 +18,7 @@
> > >  #include <sys/ioctl.h>
> > >  #include <sys/mman.h>
> > >  #include <asm/bug.h>
> > > +#include <linux/math64.h>
> > >
> > >  void perf_evsel__init(struct perf_evsel *evsel, struct perf_event_attr *attr,
> > >                      int idx)
> > > @@ -321,6 +322,11 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
> > >        if (readn(*fd, count->values, size) <= 0)
> > >                return -errno;
> > >
> > > +     if (count->ena != count->run) {
> > > +             if (count->run != 0)
> > > +                     count->val = mul_u64_u64_div64(count->val, count->ena, count->run);
> > > +     }
> >
> > so I think perf stat expect raw values in there and does the
> > scaling by itself, please check following code:
> >
> > read_counters
> >   read_affinity_counters
> >     read_counter_cpu
> >       read_single_counter
> >         evsel__read_counter
> >
> >   perf_stat_process_counter
> >     process_counter_maps
> >       process_counter_values
> >         perf_counts_values__scale
> >
> >
> > perhaps we could export perf_counts_values__scale if it'd be any help
>
> Thank you for your comment.
>
> The purpose of this patch is to unify the counters obtained with
> perf_evsel__read() to scaled or unscaled values.
>
> perf_evsel__read() gets counter by perf_mmap__read_self() if RDPMC is
> available, else gets by readn(). In current implementation, caller
> gets scaled counter if goes through RDPMC path, otherwise gets unscaled
> counter via readn() path.
>
> However caller cannot know which path was taken.
>
> If caller expects a raw value, I think the RDPMC path should also
> return an unscaled counter.
>
> diff --git a/tools/lib/perf/mmap.c b/tools/lib/perf/mmap.c
> index c89dfa5..aaa4579 100644
> --- a/tools/lib/perf/mmap.c
> +++ b/tools/lib/perf/mmap.c
> @@ -353,8 +353,6 @@ int perf_mmap__read_self(struct perf_mmap *map, struct perf_counts_values *count
> count->ena += delta;
> if (idx)
> count->run += delta;
> -
> - cnt = mul_u64_u64_div64(cnt, count->ena, count->run);

perf stat does not mmap counters so this should not be invoked
within perf stat.. but we should be consistent and scale after
calling perf_evsel__read.. and give user the possibility to get
un-scaled counts

that perhaps brings a new feature.. mmap perf stat counters to invoke
the fast reading path for counters.. IIRC it should just be a matter
of mmapping the first 'user' page

thanks,
jirka

> }
>
> count->val = cnt;
>
> Rob, do you have any comments?
>
> Best Regards
> Shunsuke
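
To make the "fast reading path" concrete, here is a simplified sketch of a self-monitoring read from the mmapped user page that returns raw, unscaled values and leaves scaling to the caller, as suggested above. It is an illustration under several assumptions: struct user_page is a cut-down stand-in for the handful of struct perf_event_mmap_page fields used here, the counter read goes through a caller-supplied function pointer instead of the rdpmc instruction, and stub_read_pmc() exists only so the example can run anywhere. The real path also checks cap_user_rdpmc, sign-extends by pmc_width and accounts for time elapsed since the page was last updated; all of that is omitted. __sync_synchronize() is a GCC/Clang builtin used here as a conservative barrier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cut-down stand-in for the fields of the perf user page used below. */
struct user_page {
	volatile uint32_t lock;	/* sequence count, changes while the kernel updates the page */
	uint32_t index;		/* hardware counter index + 1, or 0 if not on the PMU */
	int64_t offset;		/* value to add to the hardware counter */
	uint64_t time_enabled;
	uint64_t time_running;
};

/* Raw result: nothing here has been scaled. */
struct raw_counts {
	uint64_t val;
	uint64_t ena;
	uint64_t run;
};

/* Stand-in for reading a hardware counter (rdpmc on x86). */
typedef uint64_t (*read_pmc_fn)(uint32_t counter);

/* Placeholder counter source so the sketch runs without PMU access. */
static uint64_t stub_read_pmc(uint32_t counter)
{
	(void)counter;
	return 7;
}

/*
 * Seqlock-style read: retry until the sequence count is the same before
 * and after the fields were sampled, so a concurrent update is never
 * observed half-way through.
 */
static void read_user_page(const struct user_page *pc, read_pmc_fn read_pmc,
			   struct raw_counts *out)
{
	uint32_t seq;

	assert(pc != NULL && read_pmc != NULL && out != NULL);

	do {
		seq = pc->lock;
		__sync_synchronize();

		out->ena = pc->time_enabled;
		out->run = pc->time_running;
		out->val = (uint64_t)pc->offset;
		if (pc->index)
			out->val += read_pmc(pc->index - 1);

		__sync_synchronize();
	} while (pc->lock != seq);
}
```

Because the function hands back val, ena and run untouched, the same scaling step can be applied afterwards regardless of whether the values came from this path or from read().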

2021-10-19 05:09:46

by [email protected]

Subject: Re: [PATCH v2 1/2] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing

Hi Jirka
Sorry for the late reply.

> > > On Wed, Sep 22, 2021 at 07:16:26PM +0900, Shunsuke Nakamura wrote:
> > > > From: nakamura shunsuke <[email protected]>
> > > >
> > > > perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but
> > > > does not scale counters obtained by read() system call.
> > > >
> > > > Add processing to perf_evsel__read() to scale the counters obtained during the
> > > > read() system call when multiplexing.
> > > >
> > > >
> > > > Signed-off-by: Shunsuke Nakamura <[email protected]>
> > > > ---
> > > >  tools/lib/perf/evsel.c | 6 ++++++
> > > >  1 file changed, 6 insertions(+)
> > > >
> > > > diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
> > > > index 8441e3e1aaac..0ebd1d34436f 100644
> > > > --- a/tools/lib/perf/evsel.c
> > > > +++ b/tools/lib/perf/evsel.c
> > > > @@ -18,6 +18,7 @@
> > > >  #include <sys/ioctl.h>
> > > >  #include <sys/mman.h>
> > > >  #include <asm/bug.h>
> > > > +#include <linux/math64.h>
> > > > 
> > > >  void perf_evsel__init(struct perf_evsel *evsel, struct perf_event_attr *attr,
> > > >                      int idx)
> > > > @@ -321,6 +322,11 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
> > > >        if (readn(*fd, count->values, size) <= 0)
> > > >                return -errno;
> > > > 
> > > > +     if (count->ena != count->run) {
> > > > +             if (count->run != 0)
> > > > +                     count->val = mul_u64_u64_div64(count->val, count->ena, count->run);
> > > > +     }
> > >
> > > so I think perf stat expect raw values in there and does the
> > > scaling by itself, please check following code:
> > >
> > > read_counters
> > >   read_affinity_counters
> > >     read_counter_cpu
> > >       read_single_counter
> > >         evsel__read_counter
> > >
> > >   perf_stat_process_counter
> > >     process_counter_maps
> > >       process_counter_values
> > >         perf_counts_values__scale
> > >
> > >
> > > perhaps we could export perf_counts_values__scale if it'd be any help
> >
> > Thank you for your comment.
> >
> > The purpose of this patch is to unify the counters obtained with
> > perf_evsel__read() to scaled or unscaled values.
> >
> > perf_evsel__read() gets counter by perf_mmap__read_self() if RDPMC is
> > available, else gets by readn(). In current implementation, caller
> > gets scaled counter if goes through RDPMC path, otherwise gets unscaled
> > counter via readn() path.
> >
> > However caller cannot know which path was taken.
> >
> > If caller expects a raw value, I think the RDPMC path should also
> > return an unscaled counter.
> >
> > diff --git a/tools/lib/perf/mmap.c b/tools/lib/perf/mmap.c
> > index c89dfa5..aaa4579 100644
> > --- a/tools/lib/perf/mmap.c
> > +++ b/tools/lib/perf/mmap.c
> > @@ -353,8 +353,6 @@ int perf_mmap__read_self(struct perf_mmap *map, struct perf_counts_values *count
> >                 count->ena += delta;
> >                 if (idx)
> >                         count->run += delta;
> > -
> > -               cnt = mul_u64_u64_div64(cnt, count->ena, count->run);
>
> perf stat does not mmap counters so this should not be invoked
> within perf stat.. but we should be consistent and scale after
> calling perf_evsel__read.. and give user the possibility to get
> un-scaled counts
>
> that perhaps brings new feature.. mmap perf stat counters to invoke
> the fast reading path for counters.. IIRC it should be matter just
> to mmap the first 'user' page

Thank you for your comment.
I think it would be good for perf stat to support rdpmc.

I will modify the patch.

Best Regards
Shunsuke

2021-10-19 05:09:47

by [email protected]

Subject: Re: [PATCH v2 1/2] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing

Hi Rob
Sorry for the late reply.

> > > On Wed, Sep 22, 2021 at 07:16:26PM +0900, Shunsuke Nakamura wrote:
> > > > From: nakamura shunsuke <[email protected]>
> > > >
> > > > perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but
> > > > does not scale counters obtained by read() system call.
> > > >
> > > > Add processing to perf_evsel__read() to scale the counters obtained during the
> > > > read() system call when multiplexing.
> > > >
> > > >
> > > > Signed-off-by: Shunsuke Nakamura <[email protected]>
> > > > ---
> > > >  tools/lib/perf/evsel.c | 6 ++++++
> > > >  1 file changed, 6 insertions(+)
> > > >
> > > > diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
> > > > index 8441e3e1aaac..0ebd1d34436f 100644
> > > > --- a/tools/lib/perf/evsel.c
> > > > +++ b/tools/lib/perf/evsel.c
> > > > @@ -18,6 +18,7 @@
> > > >  #include <sys/ioctl.h>
> > > >  #include <sys/mman.h>
> > > >  #include <asm/bug.h>
> > > > +#include <linux/math64.h>
> > > >
> > > >  void perf_evsel__init(struct perf_evsel *evsel, struct perf_event_attr *attr,
> > > >                      int idx)
> > > > @@ -321,6 +322,11 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
> > > >        if (readn(*fd, count->values, size) <= 0)
> > > >                return -errno;
> > > >
> > > > +     if (count->ena != count->run) {
> > > > +             if (count->run != 0)
> > > > +                     count->val = mul_u64_u64_div64(count->val, count->ena, count->run);
> > > > +     }
> > >
> > > so I think perf stat expect raw values in there and does the
> > > scaling by itself, please check following code:
> > >
> > > read_counters
> > >   read_affinity_counters
> > >     read_counter_cpu
> > >       read_single_counter
> > >         evsel__read_counter
> > >
> > >   perf_stat_process_counter
> > >     process_counter_maps
> > >       process_counter_values
> > >         perf_counts_values__scale
> > >
> > >
> > > perhaps we could export perf_counts_values__scale if it'd be any help
> >
> > Thank you for your comment.
> >
> > The purpose of this patch is to unify the counters obtained with
> > perf_evsel__read() to scaled or unscaled values.
> >
> > perf_evsel__read() gets counter by perf_mmap__read_self() if RDPMC is
> > available, else gets by readn(). In current implementation, caller
> > gets scaled counter if goes through RDPMC path, otherwise gets unscaled
> > counter via readn() path.
> >
> > However caller cannot know which path was taken.
> >
> > If caller expects a raw value, I think the RDPMC path should also
> > return an unscaled counter.
> >
> > diff --git a/tools/lib/perf/mmap.c b/tools/lib/perf/mmap.c
> > index c89dfa5..aaa4579 100644
> > --- a/tools/lib/perf/mmap.c
> > +++ b/tools/lib/perf/mmap.c
> > @@ -353,8 +353,6 @@ int perf_mmap__read_self(struct perf_mmap *map, struct perf_counts_values *count
> >                 count->ena += delta;
> >                 if (idx)
> >                         count->run += delta;
> > -
> > -               cnt = mul_u64_u64_div64(cnt, count->ena, count->run);
> >         }
> >
> >         count->val = cnt;
> >
> > Rob, do you have any comments?
>
> Submit a proper patch with the above.

Thank you for your comment.
I will send the v3 patch.

Best Regards
Shunsuke

2021-11-08 06:43:33

by [email protected]

Subject: Re: [PATCH v2 1/2] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing

Hi Jirka

> > > > On Wed, Sep 22, 2021 at 07:16:26PM +0900, Shunsuke Nakamura wrote:
> > > > > From: nakamura shunsuke <[email protected]>
> > > > >
> > > > > perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but
> > > > > does not scale counters obtained by read() system call.
> > > > >
> > > > > Add processing to perf_evsel__read() to scale the counters obtained during the
> > > > > read() system call when multiplexing.
> > > > >
> > > > >
> > > > > Signed-off-by: Shunsuke Nakamura <[email protected]>
> > > > > ---
> > > > >  tools/lib/perf/evsel.c | 6 ++++++
> > > > >  1 file changed, 6 insertions(+)
> > > > >
> > > > > diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
> > > > > index 8441e3e1aaac..0ebd1d34436f 100644
> > > > > --- a/tools/lib/perf/evsel.c
> > > > > +++ b/tools/lib/perf/evsel.c
> > > > > @@ -18,6 +18,7 @@
> > > > >  #include <sys/ioctl.h>
> > > > >  #include <sys/mman.h>
> > > > >  #include <asm/bug.h>
> > > > > +#include <linux/math64.h>
> > > > > 
> > > > >  void perf_evsel__init(struct perf_evsel *evsel, struct perf_event_attr *attr,
> > > > >                      int idx)
> > > > > @@ -321,6 +322,11 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
> > > > >        if (readn(*fd, count->values, size) <= 0)
> > > > >                return -errno;
> > > > > 
> > > > > +     if (count->ena != count->run) {
> > > > > +             if (count->run != 0)
> > > > > +                     count->val = mul_u64_u64_div64(count->val, count->ena, count->run);
> > > > > +     }
> > > >
> > > > so I think perf stat expect raw values in there and does the
> > > > scaling by itself, please check following code:
> > > >
> > > > read_counters
> > > >   read_affinity_counters
> > > >     read_counter_cpu
> > > >       read_single_counter
> > > >         evsel__read_counter
> > > >
> > > >   perf_stat_process_counter
> > > >     process_counter_maps
> > > >       process_counter_values
> > > >         perf_counts_values__scale
> > > >
> > > >
> > > > perhaps we could export perf_counts_values__scale if it'd be any help
> > >
> > > Thank you for your comment.
> > >
> > > The purpose of this patch is to unify the counters obtained with
> > > perf_evsel__read() to scaled or unscaled values.
> > >
> > > perf_evsel__read() gets counter by perf_mmap__read_self() if RDPMC is
> > > available, else gets by readn(). In current implementation, caller
> > > gets scaled counter if goes through RDPMC path, otherwise gets unscaled
> > > counter via readn() path.
> > >
> > > However caller cannot know which path was taken.
> > >
> > > If caller expects a raw value, I think the RDPMC path should also
> > > return an unscaled counter.
> > >
> > > diff --git a/tools/lib/perf/mmap.c b/tools/lib/perf/mmap.c
> > > index c89dfa5..aaa4579 100644
> > > --- a/tools/lib/perf/mmap.c
> > > +++ b/tools/lib/perf/mmap.c
> > > @@ -353,8 +353,6 @@ int perf_mmap__read_self(struct perf_mmap *map, struct perf_counts_values *count
> > >                 count->ena += delta;
> > >                 if (idx)
> > >                         count->run += delta;
> > > -
> > > -               cnt = mul_u64_u64_div64(cnt, count->ena, count->run);
> >
> > perf stat does not mmap counters so this should not be invoked
> > within perf stat.. but we should be consistent and scale after
> > calling perf_evsel__read.. and give user the possibility to get
> > un-scaled counts
> >
> > that perhaps brings new feature.. mmap perf stat counters to invoke
> > the fast reading path for counters.. IIRC it should be matter just
> > to mmap the first 'user' page
>
> Thank you for your comment.
> I think it will be good that perf stat supports rdpmc.
>
> I will modify the patch.

I think rdpmc cannot measure the command/program specified in perf stat
because it measures the calling thread of perf_event_open.
If my understanding is wrong, please point it out to me.

Best Regards
Shunsuke

2021-11-14 16:16:57

by Jiri Olsa

Subject: Re: [PATCH v2 1/2] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing

On Mon, Nov 08, 2021 at 12:49:24AM +0000, [email protected] wrote:
> Hi Jirka
>
> > > > > On Wed, Sep 22, 2021 at 07:16:26PM +0900, Shunsuke Nakamura wrote:
> > > > > > From: nakamura shunsuke <[email protected]>
> > > > > >
> > > > > > perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but
> > > > > > does not scale counters obtained by read() system call.
> > > > > >
> > > > > > Add processing to perf_evsel__read() to scale the counters obtained during the
> > > > > > read() system call when multiplexing.
> > > > > >
> > > > > >
> > > > > > Signed-off-by: Shunsuke Nakamura <[email protected]>
> > > > > > ---
> > > > > >  tools/lib/perf/evsel.c | 6 ++++++
> > > > > >  1 file changed, 6 insertions(+)
> > > > > >
> > > > > > diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
> > > > > > index 8441e3e1aaac..0ebd1d34436f 100644
> > > > > > --- a/tools/lib/perf/evsel.c
> > > > > > +++ b/tools/lib/perf/evsel.c
> > > > > > @@ -18,6 +18,7 @@
> > > > > >  #include <sys/ioctl.h>
> > > > > >  #include <sys/mman.h>
> > > > > >  #include <asm/bug.h>
> > > > > > +#include <linux/math64.h>
> > > > > >
> > > > > >  void perf_evsel__init(struct perf_evsel *evsel, struct perf_event_attr *attr,
> > > > > >                      int idx)
> > > > > > @@ -321,6 +322,11 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
> > > > > >        if (readn(*fd, count->values, size) <= 0)
> > > > > >                return -errno;
> > > > > >
> > > > > > +     if (count->ena != count->run) {
> > > > > > +             if (count->run != 0)
> > > > > > +                     count->val = mul_u64_u64_div64(count->val, count->ena, count->run);
> > > > > > +     }
> > > > >
> > > > > so I think perf stat expect raw values in there and does the
> > > > > scaling by itself, please check following code:
> > > > >
> > > > > read_counters
> > > > >   read_affinity_counters
> > > > >     read_counter_cpu
> > > > >       read_single_counter
> > > > >         evsel__read_counter
> > > > >
> > > > >   perf_stat_process_counter
> > > > >     process_counter_maps
> > > > >       process_counter_values
> > > > >         perf_counts_values__scale
> > > > >
> > > > >
> > > > > perhaps we could export perf_counts_values__scale if it'd be any help
> > > >
> > > > Thank you for your comment.
> > > >
> > > > The purpose of this patch is to unify the counters obtained with
> > > > perf_evsel__read() to scaled or unscaled values.
> > > >
> > > > perf_evsel__read() gets counter by perf_mmap__read_self() if RDPMC is
> > > > available, else gets by readn(). In current implementation, caller
> > > > gets scaled counter if goes through RDPMC path, otherwise gets unscaled
> > > > counter via readn() path.
> > > >
> > > > However caller cannot know which path was taken.
> > > >
> > > > If caller expects a raw value, I think the RDPMC path should also
> > > > return an unscaled counter.
> > > >
> > > > diff --git a/tools/lib/perf/mmap.c b/tools/lib/perf/mmap.c
> > > > index c89dfa5..aaa4579 100644
> > > > --- a/tools/lib/perf/mmap.c
> > > > +++ b/tools/lib/perf/mmap.c
> > > > @@ -353,8 +353,6 @@ int perf_mmap__read_self(struct perf_mmap *map, struct perf_counts_values *count
> > > >                 count->ena += delta;
> > > >                 if (idx)
> > > >                         count->run += delta;
> > > > -
> > > > -               cnt = mul_u64_u64_div64(cnt, count->ena, count->run);
> > >
> > > perf stat does not mmap counters so this should not be invoked
> > > within perf stat.. but we should be consistent and scale after
> > > calling perf_evsel__read.. and give user the possibility to get
> > > un-scaled counts
> > >
> > > that perhaps brings new feature.. mmap perf stat counters to invoke
> > > the fast reading path for counters.. IIRC it should be matter just
> > > to mmap the first 'user' page
> >
> > Thank you for your comment.
> > I think it will be good that perf stat supports rdpmc.
> >
> > I will modify the patch.
>
> I think rdpmc cannot measure the command/program specified in perf stat
> because it measures the calling thread of perf_event_open.
> If my understanding is wrong, please point it out to me.

right, I guess we could use that just for system wide monitoring,
where we open counter for each cpu

jirka