2021-08-20 09:47:31

by Shunsuke Nakamura

Subject: [PATCH 0/3] libperf: Add support for scaling counters obtained from the read() system call during multiplexing

This patch series makes perf_evsel__read() scale the counters that it obtains
with the read() system call during multiplexing.

The first patch adds scaling of counters obtained from the read() system call
during multiplexing.

The second patch fixes verbose printing.

The third patch adds a test for the first patch.
The test is based on Vince's rdpmc_multiplexing.c [1].


[1] https://github.com/deater/perf_event_tests/blob/master/tests/rdpmc/rdpmc_multiplexing.c


Shunsuke Nakamura (3):
  libperf: Add processing to scale the counters obtained during the read() system call when multiplexing
  libperf tests: Fix verbose printing
  libperf tests: Add test_stat_multiplexing test

tools/lib/perf/evsel.c | 4 +
tools/lib/perf/include/internal/tests.h | 2 +
tools/lib/perf/tests/test-evlist.c | 138 ++++++++++++++++++++++++
3 files changed, 144 insertions(+)

--
2.25.1


2021-08-20 09:47:37

by Shunsuke Nakamura

Subject: [PATCH 1/3] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing

perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but
does not scale counters obtained by the read() system call.

Add processing to perf_evsel__read() to scale the counters obtained by the
read() system call during multiplexing.

Signed-off-by: Shunsuke Nakamura <[email protected]>
---
tools/lib/perf/evsel.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
index d8886720e83d..005cf64a1ad7 100644
--- a/tools/lib/perf/evsel.c
+++ b/tools/lib/perf/evsel.c
@@ -18,6 +18,7 @@
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <asm/bug.h>
+#include <linux/math64.h>

void perf_evsel__init(struct perf_evsel *evsel, struct perf_event_attr *attr,
int idx)
@@ -308,6 +309,9 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
if (readn(FD(evsel, cpu, thread), count->values, size) <= 0)
return -errno;

+ if (count->ena != count->run)
+ count->val = mul_u64_u64_div64(count->val, count->ena, count->run);
+
return 0;
}

--
2.25.1
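
A note on the arithmetic in the hunk above: the scaled value is val * ena / run, and the product val * ena can exceed 64 bits on long runs, which is presumably why the patch uses mul_u64_u64_div64() instead of a plain multiply. The following is a minimal userspace sketch of the same calculation, not the libperf implementation. The name scale_count() is hypothetical, and the code assumes a compiler with the GCC/Clang unsigned __int128 extension.

```c
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for mul_u64_u64_div64(): computes
 * val * ena / run with a 128-bit intermediate so the product cannot
 * wrap. Relies on the GCC/Clang unsigned __int128 extension.
 * The caller must ensure run != 0.
 */
static uint64_t scale_count(uint64_t val, uint64_t ena, uint64_t run)
{
	return (uint64_t)(((unsigned __int128)val * ena) / run);
}
```

For example, scale_count(1000, 200, 100) yields 2000: the event ran for half of the time it was enabled, so the raw count is doubled.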

2021-08-20 09:48:33

by Shunsuke Nakamura

Subject: [PATCH 2/3] libperf tests: Fix verbose printing

libperf's verbose printing checks the -v option every time the macro __T_START
is called.

Since there are currently four libperf tests registered, the macro __T_START is
called four times, but verbose output is not printed from the second call onward.

Reset the index of the element processed by getopt() (optind) so that verbose
printing works in all tests.

Signed-off-by: Shunsuke Nakamura <[email protected]>
---
tools/lib/perf/include/internal/tests.h | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/lib/perf/include/internal/tests.h b/tools/lib/perf/include/internal/tests.h
index 61052099225b..b130a6663ff8 100644
--- a/tools/lib/perf/include/internal/tests.h
+++ b/tools/lib/perf/include/internal/tests.h
@@ -23,6 +23,8 @@ static inline int get_verbose(char **argv, int argc)
break;
}
}
+ optind = 1;
+
return verbose;
}

--
2.25.1

2021-08-20 09:49:50

by Shunsuke Nakamura

Subject: [PATCH 3/3] libperf tests: Add test_stat_multiplexing test

Add a test for counters obtained using the read() system call during multiplexing.

Committer testing:

$ sudo make tests V=1 -C tools/lib/perf/
make: Entering directory '/home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/lib/perf'
make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=. obj=libperf
make -C /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/lib/api/ O= libapi.a
make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=./fd obj=libapi
make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=./fs obj=libapi
make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=. obj=tests
make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=./tests obj=tests
running static:
- running tests/test-cpumap.c...OK
- running tests/test-threadmap.c...OK
- running tests/test-evlist.c...
count = 503074578, run = 261416566, enable = 435415718
count = 503295562, run = 261411675, enable = 435412316
count = 501603369, run = 261421312, enable = 435408854
count = 501245546, run = 261419457, enable = 435405029
count = 501849603, run = 261415041, enable = 435400600
count = 500801298, run = 261410125, enable = 435395597
count = 502116997, run = 261401960, enable = 435389539
count = 501797294, run = 261397791, enable = 435382777
count = 501249740, run = 261377402, enable = 435374884
count = 501754502, run = 260985031, enable = 435366109
count = 501659788, run = 260985466, enable = 435354970
count = 503670502, run = 260985172, enable = 435345076
count = 503209138, run = 260987272, enable = 435335863
count = 502772845, run = 260985720, enable = 435314163
count = 504045922, run = 260985324, enable = 435303814
Expected: 502005425
High: 504045922 Low: 500801298 Average: 502276445
Average Error = 0.05%
OK
- running tests/test-evsel.c...
loop = 65536, count = 333260
loop = 131072, count = 655861
loop = 262144, count = 1315513
loop = 524288, count = 2633730
loop = 1048576, count = 5270513
loop = 65536, count = 407118
loop = 131072, count = 786947
loop = 262144, count = 1859532
loop = 524288, count = 3849286
loop = 1048576, count = 7310477
OK
running dynamic:
- running tests/test-cpumap.c...OK
- running tests/test-threadmap.c...OK
- running tests/test-evlist.c...
count = 502071859, run = 275961923, enable = 461897932
count = 501339557, run = 275959274, enable = 461895615
count = 501706307, run = 275912910, enable = 461892367
count = 501877502, run = 275906600, enable = 461888804
count = 501773043, run = 275905934, enable = 461884843
count = 502724983, run = 275884848, enable = 461880027
count = 502436857, run = 276864452, enable = 461874272
count = 502598147, run = 277873765, enable = 461867901
count = 502601952, run = 278872100, enable = 461860271
count = 502356640, run = 278933007, enable = 461851421
count = 503474674, run = 278930678, enable = 461840342
count = 503365623, run = 278976959, enable = 461830886
count = 503307062, run = 278004451, enable = 461821700
count = 502845553, run = 276988055, enable = 461812741
count = 501627390, run = 275995291, enable = 461803691
Expected: 501940978
High: 503474674 Low: 501339557 Average: 502407143
Average Error = 0.09%
OK
- running tests/test-evsel.c...
loop = 65536, count = 328182
loop = 131072, count = 661219
loop = 262144, count = 1316712
loop = 524288, count = 2641030
loop = 1048576, count = 5267395
loop = 65536, count = 393675
loop = 131072, count = 842152
loop = 262144, count = 1664160
loop = 524288, count = 3421570
loop = 1048576, count = 6856783
OK
make: Leaving directory '/home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/lib/perf'

Signed-off-by: Shunsuke Nakamura <[email protected]>
---
tools/lib/perf/tests/test-evlist.c | 138 +++++++++++++++++++++++++++++
1 file changed, 138 insertions(+)

diff --git a/tools/lib/perf/tests/test-evlist.c b/tools/lib/perf/tests/test-evlist.c
index c67c83399170..c7184d8b6ce9 100644
--- a/tools/lib/perf/tests/test-evlist.c
+++ b/tools/lib/perf/tests/test-evlist.c
@@ -21,6 +21,9 @@
#include "tests.h"
#include <internal/evsel.h>

+#define EVENT_NUM 15
+#define WAIT_COUNT 100000000UL
+
static int libperf_print(enum libperf_print_level level,
const char *fmt, va_list ap)
{
@@ -413,6 +416,140 @@ static int test_mmap_cpus(void)
return 0;
}

+static double display_error(long long average,
+ long long high,
+ long long low,
+ long long expected) {
+
+ double error;
+
+ error = (((double)average - expected) / expected) * 100.0;
+
+ __T_VERBOSE(" Expected: %lld\n", expected);
+ __T_VERBOSE(" High: %lld Low: %lld Average: %lld\n",
+ high, low, average);
+
+ __T_VERBOSE(" Average Error = %.2f%%\n", error);
+
+ return error;
+
+}
+
+static int test_stat_multiplexing(void)
+{
+ struct perf_counts_values expected_counts = { .val = 0 };
+ struct perf_counts_values multi_counts[EVENT_NUM] = {{ .val = 0 },};
+ struct perf_thread_map *threads;
+ struct perf_evlist *evlist;
+ struct perf_evsel *evsel;
+ struct perf_event_attr attr = {
+ .type = PERF_TYPE_HARDWARE,
+ .config = PERF_COUNT_HW_INSTRUCTIONS,
+ .read_format = PERF_FORMAT_TOTAL_TIME_ENABLED |
+ PERF_FORMAT_TOTAL_TIME_RUNNING,
+ .disabled = 1,
+ };
+ int err, i, nonzero = 0;
+ unsigned long count;
+ long long max = 0, min = 0, avg = 0;
+ double error = 0.0;
+
+ /* read for non-multiplexing event count */
+ threads = perf_thread_map__new_dummy();
+ __T("failed to create threads", threads);
+
+ perf_thread_map__set_pid(threads, 0, 0);
+
+ evsel = perf_evsel__new(&attr);
+ __T("failed to create evsel", evsel);
+
+ err = perf_evsel__open(evsel, NULL, threads);
+ __T("failed to open evsel", err == 0);
+
+ err = perf_evsel__enable(evsel);
+ __T("failed to enable evsel", err == 0);
+
+ count = WAIT_COUNT;
+ while (count--);
+
+ perf_evsel__read(evsel, 0, 0, &expected_counts);
+ __T("failed to read value for evsel", expected_counts.val != 0);
+ __T("failed to read non-multiplexing event count",
+ expected_counts.ena == expected_counts.run);
+
+ err = perf_evsel__disable(evsel);
+ __T("failed to disable evsel", err == 0);
+
+ perf_evsel__close(evsel);
+ perf_evsel__delete(evsel);
+
+ perf_thread_map__put(threads);
+
+
+ /* read for multiplexing event count */
+ threads = perf_thread_map__new_dummy();
+ __T("failed to create threads", threads);
+
+ perf_thread_map__set_pid(threads, 0, 0);
+
+ evlist = perf_evlist__new();
+ __T("failed to create evlist", evlist);
+
+ for (i = 0; i < EVENT_NUM; i++) {
+ evsel = perf_evsel__new(&attr);
+ __T("failed to create evsel", evsel);
+
+ perf_evlist__add(evlist, evsel);
+ }
+ perf_evlist__set_maps(evlist, NULL, threads);
+
+ err = perf_evlist__open(evlist);
+ __T("failed to open evlist", err == 0);
+
+ perf_evlist__enable(evlist);
+
+ count = WAIT_COUNT;
+ while (count--);
+
+ i = 0;
+ perf_evlist__for_each_evsel(evlist, evsel) {
+ perf_evsel__read(evsel, 0, 0, &multi_counts[i]);
+ __T("failed to read value for evsel", multi_counts[i].val != 0);
+ i++;
+ }
+
+ perf_evlist__disable(evlist);
+
+
+ min = multi_counts[0].val;
+ for (i = 0; i < EVENT_NUM; i++) {
+ __T_VERBOSE("\tcount = %lu, run = %lu, enable = %lu\n",
+ multi_counts[i].val, multi_counts[i].run, multi_counts[i].ena);
+
+ if (multi_counts[i].val > max)
+ max = multi_counts[i].val;
+
+ if (multi_counts[i].val < min)
+ min = multi_counts[i].val;
+
+ avg += multi_counts[i].val;
+
+ if (multi_counts[i].val != 0)
+ nonzero++;
+ }
+ avg = avg / nonzero;
+
+ error = display_error(avg, max, min, expected_counts.val);
+
+ __T("Error out of range!", ((error <= 1.0) && (error >= -1.0)));
+
+ perf_evlist__close(evlist);
+ perf_evlist__delete(evlist);
+
+ perf_thread_map__put(threads);
+ return 0;
+}
+
int test_evlist(int argc, char **argv)
{
__T_START;
@@ -424,6 +561,7 @@ int test_evlist(int argc, char **argv)
test_stat_thread_enable();
test_mmap_thread();
test_mmap_cpus();
+ test_stat_multiplexing();

__T_END;
return tests_failed == 0 ? 0 : -1;
--
2.25.1
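
As a worked check of the pass criterion in this test: display_error() reports the relative error of the average against the expected count, and the test passes if it lies within +/-1%. The sketch below reproduces only that formula. The function name percent_error() is hypothetical; the input values are taken from the test output quoted in the commit message above.

```c
/* Relative error of an average against an expected value, in percent. */
static double percent_error(long long average, long long expected)
{
	return (((double)average - expected) / expected) * 100.0;
}
```

For the static run, percent_error(502276445, 502005425) is roughly 0.054, which prints as 0.05% with the %.2f format. The dynamic run, with average 502407143 and expected 501940978, gives roughly 0.093, printed as 0.09%.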

2021-08-23 20:14:28

by Rob Herring (Arm)

Subject: Re: [PATCH 1/3] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing

On Fri, Aug 20, 2021 at 06:39:06PM +0900, Shunsuke Nakamura wrote:
> perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but
> does not scale counters obtained by read() system call.
>
> Add processing to perf_evsel__read() to scale the counters obtained during the
> read() system call when multiplexing.

Which one is right though? Changing what read() returns could break
users, right? Or are you implying that the RDPMC path is correct and
read() was not? More likely the former case since I wrote the latter.

>
> Signed-off-by: Shunsuke Nakamura <[email protected]>
> ---
> tools/lib/perf/evsel.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
> index d8886720e83d..005cf64a1ad7 100644
> --- a/tools/lib/perf/evsel.c
> +++ b/tools/lib/perf/evsel.c
> @@ -18,6 +18,7 @@
> #include <sys/ioctl.h>
> #include <sys/mman.h>
> #include <asm/bug.h>
> +#include <linux/math64.h>
>
> void perf_evsel__init(struct perf_evsel *evsel, struct perf_event_attr *attr,
> int idx)
> @@ -308,6 +309,9 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
> if (readn(FD(evsel, cpu, thread), count->values, size) <= 0)
> return -errno;
>
> + if (count->ena != count->run)
> + count->val = mul_u64_u64_div64(count->val, count->ena, count->run);
> +
> return 0;
> }
>
> --
> 2.25.1
>
>

2021-08-23 20:28:21

by Rob Herring (Arm)

Subject: Re: [PATCH 2/3] libperf tests: Fix verbose printing

On Fri, Aug 20, 2021 at 06:39:07PM +0900, Shunsuke Nakamura wrote:
> libperf's verbose printing checks the -v option every time the macro _T_ START

__T_START

> is called.
>
> Since there are currently four libperf tests registered, the macro _T_ START is
> called four times, but verbose printing after the second time is not output.
>
> Resets the index of the element processed by getopt() and fix verbose printing
> so that it prints in all tests.
>
> Signed-off-by: Shunsuke Nakamura <[email protected]>
> ---
> tools/lib/perf/include/internal/tests.h | 2 ++
> 1 file changed, 2 insertions(+)

Acked-by: Rob Herring <[email protected]>

>
> diff --git a/tools/lib/perf/include/internal/tests.h b/tools/lib/perf/include/internal/tests.h
> index 61052099225b..b130a6663ff8 100644
> --- a/tools/lib/perf/include/internal/tests.h
> +++ b/tools/lib/perf/include/internal/tests.h
> @@ -23,6 +23,8 @@ static inline int get_verbose(char **argv, int argc)
> break;
> }
> }
> + optind = 1;
> +
> return verbose;
> }
>
> --
> 2.25.1
>
>

2021-08-24 10:20:42

by Shunsuke Nakamura

Subject: Re: [PATCH 1/3] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing

Hi, Rob

> On Fri, Aug 20, 2021 at 06:39:06PM +0900, Shunsuke Nakamura wrote:
> > perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but
> > does not scale counters obtained by read() system call.
> >
> > Add processing to perf_evsel__read() to scale the counters obtained during the
> > read() system call when multiplexing.
>
> Which one is right though? Changing what read() returns could break
> users, right? Or are you implying that the RDPMC path is correct and
> read() was not. More likely the former case since I wrote the latter.

perf_evsel__read() returns both the count obtained by RDPMC and the count obtained
by the read() system call when multiplexed with RDPMC enabled.

That is, there is a mix of scaled and unscaled values.

As Rob says, when this patch is applied, rescaling the count obtained from
perf_evsel__read() during multiplexing will break the count.

I think the easiest solution is to change the value you get from RDPMC to not scale
and let the user scale it, but I thought it would be a little inconvenient.

Best Regards
Shunsuke
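
To make the breakage described above concrete: if the library already returns a scaled value and a caller then applies its own enabled/running scaling, the factor is applied twice. This is a minimal sketch with made-up numbers. The function names scale_once() and scale_twice() are hypothetical, and the code assumes the GCC/Clang unsigned __int128 extension.

```c
#include <stdint.h>

/* Scale a raw count by enabled/running time; the caller guarantees run != 0. */
static uint64_t scale_once(uint64_t val, uint64_t ena, uint64_t run)
{
	return (uint64_t)(((unsigned __int128)val * ena) / run);
}

/* What a caller unaware of library-side scaling would end up computing. */
static uint64_t scale_twice(uint64_t val, uint64_t ena, uint64_t run)
{
	return scale_once(scale_once(val, ena, run), ena, run);
}
```

With a raw count of 1000, ena = 200 and run = 100, scaling once gives the intended estimate of 2000, while scaling twice gives 4000.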

2021-08-24 18:07:41

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH 2/3] libperf tests: Fix verbose printing

Em Mon, Aug 23, 2021 at 03:26:42PM -0500, Rob Herring escreveu:
> On Fri, Aug 20, 2021 at 06:39:07PM +0900, Shunsuke Nakamura wrote:
> > libperf's verbose printing checks the -v option every time the macro _T_ START
>
> __T_START
>
> > is called.
> >
> > Since there are currently four libperf tests registered, the macro _T_ START is
> > called four times, but verbose printing after the second time is not output.
> >
> > Resets the index of the element processed by getopt() and fix verbose printing
> > so that it prints in all tests.
> >
> > Signed-off-by: Shunsuke Nakamura <[email protected]>
> > ---
> > tools/lib/perf/include/internal/tests.h | 2 ++
> > 1 file changed, 2 insertions(+)
>
> Acked-by: Rob Herring <[email protected]>

Thanks, applied.

Waiting for the conclusion on the discussion for the other two patches.

- Arnaldo

2021-08-31 08:59:45

by Shunsuke Nakamura

Subject: Re: [PATCH 1/3] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing

Hi, Rob

> > On Fri, Aug 20, 2021 at 06:39:06PM +0900, Shunsuke Nakamura wrote:
> > > perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but
> > > does not scale counters obtained by read() system call.
> > >
> > > Add processing to perf_evsel__read() to scale the counters obtained during the
> > > read() system call when multiplexing.
> >
> > Which one is right though? Changing what read() returns could break
> > users, right? Or are you implying that the RDPMC path is correct and
> > read() was not. More likely the former case since I wrote the latter.
>
> perf_evsel__read() returns both the count obtained by RDPMC and the count obtained
> by the read() system call when multiplexed with RDPMC enabled.
>
> That is, there is a mix of scaled and unscaled values.
>
> As Rob says, when this patch is applied, rescaling the count obtained from
> perf_evsel__read() during multiplexing will break the count.
>
> I think the easiest solution is to change the value you get from RDPMC to not scale
> and let the user scale it, but I thought it would be a little inconvenient.

Any comments?

Best Regards
Shunsuke

2021-08-31 12:27:28

by Rob Herring (Arm)

Subject: Re: [PATCH 1/3] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing

On Tue, Aug 24, 2021 at 5:12 AM [email protected]
<[email protected]> wrote:
>
> Hi, Rob
>
> > On Fri, Aug 20, 2021 at 06:39:06PM +0900, Shunsuke Nakamura wrote:
> > > perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but
> > > does not scale counters obtained by read() system call.
> > >
> > > Add processing to perf_evsel__read() to scale the counters obtained during the
> > > read() system call when multiplexing.
> >
> > Which one is right though? Changing what read() returns could break
> > users, right? Or are you implying that the RDPMC path is correct and
> > read() was not. More likely the former case since I wrote the latter.
>
> perf_evsel__read() returns both the count obtained by RDPMC and the count obtained
> by the read() system call when multiplexed with RDPMC enabled.
>
> That is, there is a mix of scaled and unscaled values.
>
> As Rob says, when this patch is applied, rescaling the count obtained from
> perf_evsel__read() during multiplexing will break the count.
>
> I think the easiest solution is to change the value you get from RDPMC to not scale
> and let the user scale it, but I thought it would be a little inconvenient.

Agreed, unless someone else has an opinion. It would be good to do the
scaling in libperf with the optimized math op, but I assume there's
some reason the user may need unscaled values?

Rob

2021-09-08 00:01:15

by Ian Rogers

Subject: Re: [PATCH 1/3] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing

On Tue, Aug 31, 2021 at 5:26 AM Rob Herring <[email protected]> wrote:
>
> On Tue, Aug 24, 2021 at 5:12 AM [email protected]
> <[email protected]> wrote:
> >
> > Hi, Rob
> >
> > > On Fri, Aug 20, 2021 at 06:39:06PM +0900, Shunsuke Nakamura wrote:
> > > > perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but
> > > > does not scale counters obtained by read() system call.
> > > >
> > > > Add processing to perf_evsel__read() to scale the counters obtained during the
> > > > read() system call when multiplexing.
> > >
> > > Which one is right though? Changing what read() returns could break
> > > users, right? Or are you implying that the RDPMC path is correct and
> > > read() was not. More likely the former case since I wrote the latter.
> >
> > perf_evsel__read() returns both the count obtained by RDPMC and the count obtained
> > by the read() system call when multiplexed with RDPMC enabled.
> >
> > That is, there is a mix of scaled and unscaled values.
> >
> > As Rob says, when this patch is applied, rescaling the count obtained from
> > perf_evsel__read() during multiplexing will break the count.
> >
> > I think the easiest solution is to change the value you get from RDPMC to not scale
> > and let the user scale it, but I thought it would be a little inconvenient.
>
> Agreed, unless someone else has an opinion. It would be good to do the
> scaling in libperf with the optimized math op, but I assume there's
> some reason the user may need unscaled values?

Hi, something I've mentioned on other threads [1] is that running may
be zero due to multiplexing but enabled be greater. This can lead to a
divide by zero when scaling. Giving the ratio to the caller gives more
information - I may be misunderstanding this thread, apologies if so.

Thanks,
Ian

[1] https://lore.kernel.org/lkml/CAL_JsqKc_qFA59L9e-xXOhE4yBTdVg-Ea9EddimWwqj3XXxhGw@mail.gmail.com/

> Rob
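
One possible way to handle the zero-running case raised above is to refuse to scale when run is zero and report that to the caller. The sketch below is an illustration only, not what was eventually merged. struct counts and scale_counts() are hypothetical names that mirror the val/ena/run fields used in this thread, and the code assumes the GCC/Clang unsigned __int128 extension.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the val/ena/run fields discussed in the thread. */
struct counts {
	uint64_t val;	/* raw event count */
	uint64_t ena;	/* time enabled */
	uint64_t run;	/* time running */
};

/*
 * Scale c->val in place. Returns false and leaves val untouched when
 * the event never ran (run == 0), so the caller can tell "no data"
 * apart from a real zero count and no division by zero occurs.
 */
static bool scale_counts(struct counts *c)
{
	if (c->run == 0)
		return false;
	if (c->ena != c->run)
		c->val = (uint64_t)(((unsigned __int128)c->val * c->ena) / c->run);
	return true;
}
```

Returning a status rather than silently producing zero keeps the information Ian asks for: the caller still sees ena and run and can decide how to report an event that was never scheduled.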

2021-09-17 13:12:59

by Shunsuke Nakamura

Subject: Re: [PATCH 1/3] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing

Hi, Ian

> > On Tue, Aug 24, 2021 at 5:12 AM [email protected]
> > <[email protected]> wrote:
> > >
> > > Hi, Rob
> > >
> > > > On Fri, Aug 20, 2021 at 06:39:06PM +0900, Shunsuke Nakamura wrote:
> > > > > perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but
> > > > > does not scale counters obtained by read() system call.
> > > > >
> > > > > Add processing to perf_evsel__read() to scale the counters obtained during the
> > > > > read() system call when multiplexing.
> > > >
> > > > Which one is right though? Changing what read() returns could break
> > > > users, right? Or are you implying that the RDPMC path is correct and
> > > > read() was not. More likely the former case since I wrote the latter.
> > >
> > > perf_evsel__read() returns both the count obtained by RDPMC and the count obtained
> > > by the read() system call when multiplexed with RDPMC enabled.
> > >
> > > That is, there is a mix of scaled and unscaled values.
> > >
> > > As Rob says, when this patch is applied, rescaling the count obtained from
> > > perf_evsel__read() during multiplexing will break the count.
> > >
> > > I think the easiest solution is to change the value you get from RDPMC to not scale
> > > and let the user scale it, but I thought it would be a little inconvenient.
> >
> > Agreed, unless someone else has an opinion. It would be good to do the
> > scaling in libperf with the optimized math op, but I assume there's
> > some reason the user may need unscaled values?
>
> Hi, something I've mentioned on other threads [1] is that running may
> be zero due to multiplexing but enabled be greater.

Thanks for your comment.
I'll fix it.

> This can lead to a divide by zero when scaling. Giving the ratio to the caller
> gives more information - I may be misunderstanding this thread, apologies if so.

struct perf_counts_values contains both the enabled and the running time,
so the caller can calculate the ratio.

Best Regards
Shunsuke
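
As a sketch of what "the caller can calculate the ratio" could look like: given the enabled and running times read back with a count, the fraction of time the event was actually scheduled is run / ena. The helper name running_ratio() is hypothetical and not part of libperf.

```c
#include <stdint.h>

/*
 * Fraction of the enabled time during which the event was actually
 * counting: 1.0 means no multiplexing, 0.0 means it never ran (or was
 * never enabled).
 */
static double running_ratio(uint64_t ena, uint64_t run)
{
	if (ena == 0)
		return 0.0;
	return (double)run / (double)ena;
}
```

Using the first line of the static test-evlist output from patch 3 (run = 261416566, enable = 435415718), the ratio is about 0.60, meaning the event was on the PMU for roughly 60% of the time it was enabled.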