2023-01-27 12:27:52

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 2/2] perf test: Fix JSON format linter test checks

Em Fri, Jan 20, 2023 at 02:40:39PM +0100, Michael Petlan escreveu:
> The test fails on CPUs with topdown metrics, where it is common to print
> two metrics per line. Since these are included in default event set for
> `perf stat -- something`, it causes the "no args" subtest to fail due to
> unexpected member count. We need to accept 7 or 9 members in that case.
>
> Coming to that, counting commas in the JSON line and consider it a number
> of elements is incorrect and misleading. There should be +1 element than
> the count of commas, while also, commas can potentially appear in the
> event names. Count " : " substrings rather, since these better fit to
> what should be actually counted.

Before I apply this first patch I can run, as root, the 'perf test' JSON
lint 100 times without problems:

[root@quaco ~]# for a in $(seq 100) ; do echo -n $a ; perf test 98 ; done
1 98: perf stat JSON output linter : Ok
2 98: perf stat JSON output linter : Ok
3 98: perf stat JSON output linter : Ok
4 98: perf stat JSON output linter : Ok
<SNIP>
96 98: perf stat JSON output linter : Ok
97 98: perf stat JSON output linter : Ok
98 98: perf stat JSON output linter : Ok
99 98: perf stat JSON output linter : Ok
100 98: perf stat JSON output linter : Ok
[root@quaco ~]#

After applying it it fails seemingly randomly, I'll remove both patches
from my tmp.perf/core branch and investigate.

[acme@quaco perf]$ git log --oneline -1
fdafd42aac3629db (HEAD) perf stat: Fix JSON metric printout for multiple metrics per line
[acme@quaco perf]$ perf -v
perf version 6.2.rc5.gfdafd42aac36
[acme@quaco perf]$
[root@quaco ~]# perf -v
perf version 6.2.rc5.gfdafd42aac36
[root@quaco ~]# for a in $(seq 100) ; do echo -n $a ; perf test 98 ; done
[root@quaco ~]# perf -v
perf version 6.2.rc5.gfdafd42aac36
[root@quaco ~]# for a in $(seq 100) ; do echo -n $a ; perf test 98 ; done
1 98: perf stat JSON output linter : Ok
2 98: perf stat JSON output linter : Ok
3 98: perf stat JSON output linter : FAILED!
4 98: perf stat JSON output linter : Ok
5 98: perf stat JSON output linter : Ok
6 98: perf stat JSON output linter : Ok
7 98: perf stat JSON output linter : FAILED!
8 98: perf stat JSON output linter : Ok
9 98: perf stat JSON output linter : Ok
10 98: perf stat JSON output linter : Ok
11 98: perf stat JSON output linter : Ok
12 98: perf stat JSON output linter : Ok
13 98: perf stat JSON output linter : Ok
14 98: perf stat JSON output linter : Ok
15 98: perf stat JSON output linter : Ok
16 98: perf stat JSON output linter : FAILED!
17 98: perf stat JSON output linter : Ok
18 98: perf stat JSON output linter : Ok
19 98: perf stat JSON output linter : Ok
20 98: perf stat JSON output linter : FAILED!
21 98: perf stat JSON output linter : Ok
22 98: perf stat JSON output linter : Ok
23 98: perf stat JSON output linter : FAILED!
24 98: perf stat JSON output linter : Ok
25 98: perf stat JSON output linter : Ok
26 98: perf stat JSON output linter : Ok
27 98: perf stat JSON output linter : Ok
28 98: perf stat JSON output linter : FAILED!
29 98: perf stat JSON output linter : Ok
30 98: perf stat JSON output linter : Ok
31 98: perf stat JSON output linter : FAILED!
32 98: perf stat JSON output linter : FAILED!
33 98: perf stat JSON output linter : FAILED!
34 98: perf stat JSON output linter : Ok
35 98: perf stat JSON output linter : Ok
36 98: perf stat JSON output linter : Ok
37 98: perf stat JSON output linter : Ok
38 98: perf stat JSON output linter : FAILED!
39 98: perf stat JSON output linter : FAILED!
40 98: perf stat JSON output linter : FAILED!
41 98: perf stat JSON output linter : Ok
42 98: perf stat JSON output linter : Ok
43 98: perf stat JSON output linter : Ok
44 98: perf stat JSON output linter : Ok
45 98: perf stat JSON output linter : Ok
46 98: perf stat JSON output linter : FAILED!
47 98: perf stat JSON output linter : Ok
48 98: perf stat JSON output linter : FAILED!
49 98: perf stat JSON output linter : FAILED!
50 98: perf stat JSON output linter : FAILED!
51 98: perf stat JSON output linter : FAILED!
52 98: perf stat JSON output linter : Ok
53 98: perf stat JSON output linter : FAILED!
54 98: perf stat JSON output linter : Ok
55 98: perf stat JSON output linter : Ok
56 98: perf stat JSON output linter : FAILED!
57 98: perf stat JSON output linter : Ok
58 98: perf stat JSON output linter : Ok
59 98: perf stat JSON output linter : FAILED!
60 98: perf stat JSON output linter : Ok
61 98: perf stat JSON output linter : FAILED!
62 98: perf stat JSON output linter : Ok
63 98: perf stat JSON output linter : Ok
64 98: perf stat JSON output linter : Ok
65 98: perf stat JSON output linter : Ok
66 98: perf stat JSON output linter : Ok
67 98: perf stat JSON output linter : Ok
68 98: perf stat JSON output linter : Ok
69 98: perf stat JSON output linter : Ok
70 98: perf stat JSON output linter : Ok
71 98: perf stat JSON output linter : FAILED!
72 98: perf stat JSON output linter : FAILED!
73 98: perf stat JSON output linter : Ok
74 98: perf stat JSON output linter : FAILED!
75 98: perf stat JSON output linter : Ok
76 98: perf stat JSON output linter : FAILED!
77 98: perf stat JSON output linter : Ok
78 98: perf stat JSON output linter : FAILED!
79 98: perf stat JSON output linter : FAILED!
80 98: perf stat JSON output linter : FAILED!
81 98: perf stat JSON output linter : Ok
82 98: perf stat JSON output linter : FAILED!
83 98: perf stat JSON output linter : FAILED!
84 98: perf stat JSON output linter : Ok
85 98: perf stat JSON output linter : Ok
86 98: perf stat JSON output linter : Ok
87 98: perf stat JSON output linter : FAILED!
88 98: perf stat JSON output linter : FAILED!
89 98: perf stat JSON output linter : Ok
90 98: perf stat JSON output linter : FAILED!
91 98: perf stat JSON output linter : FAILED!
92 98: perf stat JSON output linter : Ok
93 98: perf stat JSON output linter : FAILED!
94 98: perf stat JSON output linter : FAILED!
95 98: perf stat JSON output linter : FAILED!
96 98: perf stat JSON output linter : FAILED!
97 98: perf stat JSON output linter : FAILED!
98 98: perf stat JSON output linter : Ok
99 98: perf stat JSON output linter : Ok
100 98: perf stat JSON output linter : FAILED!
[root@quaco ~]#


> Fixes: 0c343af2a2f8 ("perf test: JSON format checking")
>
> Signed-off-by: Michael Petlan <[email protected]>
> ---
> .../tests/shell/lib/perf_json_output_lint.py | 16 ++++++++--------
> 1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/tools/perf/tests/shell/lib/perf_json_output_lint.py b/tools/perf/tests/shell/lib/perf_json_output_lint.py
> index d90f8d102eb9..4f1bbb3f07ec 100644
> --- a/tools/perf/tests/shell/lib/perf_json_output_lint.py
> +++ b/tools/perf/tests/shell/lib/perf_json_output_lint.py
> @@ -40,17 +40,17 @@ def is_counter_value(num):
> return isfloat(num) or num == '<not counted>' or num == '<not supported>'
>
> def check_json_output(expected_items):
> - if expected_items != -1:
> + if expected_items:
> for line in Lines:
> if 'failed' not in line:
> count = 0
> - count = line.count(',')
> - if count != expected_items and count >= 1 and count <= 3 and 'metric-value' in line:
> + count = line.count(' : ')
> + if count not in expected_items and count >= 2 and count <= 4 and 'metric-value' in line:
> # Events that generate >1 metric may have isolated metric
> # values and possibly other prefixes like interval, core and
> # aggregate-number.
> continue
> - if count != expected_items:
> + if count not in expected_items:
> raise RuntimeError(f'wrong number of fields. counted {count} expected {expected_items}'
> f' in \'{line}\'')
> checks = {
> @@ -82,14 +82,14 @@ def check_json_output(expected_items):
>
> try:
> if args.no_args or args.system_wide or args.event:
> - expected_items = 6
> + expected_items = [7, 9]
> elif args.interval or args.per_thread or args.system_wide_no_aggr:
> - expected_items = 7
> + expected_items = [8, 10]
> elif args.per_core or args.per_socket or args.per_node or args.per_die:
> - expected_items = 8
> + expected_items = [9, 11]
> else:
> # If no option is specified, don't check the number of items.
> - expected_items = -1
> + expected_items = []
> check_json_output(expected_items)
> except:
> print('Test failed for input:\n' + '\n'.join(Lines))
> --
> 2.18.4
>

--

- Arnaldo


2023-01-27 12:30:14

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 2/2] perf test: Fix JSON format linter test checks

Em Fri, Jan 27, 2023 at 09:26:03AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Fri, Jan 20, 2023 at 02:40:39PM +0100, Michael Petlan escreveu:
> > The test fails on CPUs with topdown metrics, where it is common to print
> > two metrics per line. Since these are included in default event set for
> > `perf stat -- something`, it causes the "no args" subtest to fail due to
> > unexpected member count. We need to accept 7 or 9 members in that case.
> >
> > Coming to that, counting commas in the JSON line and consider it a number
> > of elements is incorrect and misleading. There should be +1 element than
> > the count of commas, while also, commas can potentially appear in the
> > event names. Count " : " substrings rather, since these better fit to
> > what should be actually counted.
>
> Before I apply this first patch I can run, as root, the 'perf test' JSON
> lint 100 times without problems:
>
> [root@quaco ~]# for a in $(seq 100) ; do echo -n $a ; perf test 98 ; done
> 1 98: perf stat JSON output linter : Ok
> 2 98: perf stat JSON output linter : Ok
> 3 98: perf stat JSON output linter : Ok
> 4 98: perf stat JSON output linter : Ok
> <SNIP>
> 96 98: perf stat JSON output linter : Ok
> 97 98: perf stat JSON output linter : Ok
> 98 98: perf stat JSON output linter : Ok
> 99 98: perf stat JSON output linter : Ok
> 100 98: perf stat JSON output linter : Ok
> [root@quaco ~]#
>
> After applying it it fails seemingly randomly, I'll remove both patches
> from my tmp.perf/core branch and investigate.
>
> [acme@quaco perf]$ git log --oneline -1
> fdafd42aac3629db (HEAD) perf stat: Fix JSON metric printout for multiple metrics per line
> [acme@quaco perf]$ perf -v
> perf version 6.2.rc5.gfdafd42aac36
> [acme@quaco perf]$
> [root@quaco ~]# perf -v
> perf version 6.2.rc5.gfdafd42aac36
> [root@quaco ~]# for a in $(seq 100) ; do echo -n $a ; perf test 98 ; done
> [root@quaco ~]# perf -v
> perf version 6.2.rc5.gfdafd42aac36
> [root@quaco ~]# for a in $(seq 100) ; do echo -n $a ; perf test 98 ; done
> 1 98: perf stat JSON output linter : Ok
> 2 98: perf stat JSON output linter : Ok
> 3 98: perf stat JSON output linter : FAILED!
> 4 98: perf stat JSON output linter : Ok

BTW, after applying the second patch, the one for 'perf test', the
problem persists:

[acme@quaco perf]$ git log --oneline -1
320cd37176508ec2 (HEAD, acme/tmp.perf/core, acme.korg/tmp.perf/core) perf test: Fix JSON format linter test checks
[acme@quaco perf]$ perf -v
perf version 6.2.rc5.g320cd3717650
[acme@quaco perf]$
[root@quaco ~]# perf -v
perf version 6.2.rc5.g320cd3717650
[root@quaco ~]# for a in $(seq 100) ; do echo -n $a ; perf test 98 ; done
1 98: perf stat JSON output linter : Ok
2 98: perf stat JSON output linter : FAILED!
3 98: perf stat JSON output linter : Ok
4 98: perf stat JSON output linter : Ok
5 98: perf stat JSON output linter : FAILED!
6 98: perf stat JSON output linter : Ok
7 98: perf stat JSON output linter : FAILED!
8 98: perf stat JSON output linter : Ok
9 98: perf stat JSON output linter : Ok
10 98: perf stat JSON output linter : FAILED!
11 98: perf stat JSON output linter : FAILED!
12 98: perf stat JSON output linter : FAILED!
13 98: perf stat JSON output linter : Ok
14 98: perf stat JSON output linter : Ok
15 98: perf stat JSON output linter : FAILED!
16 98: perf stat JSON output linter : FAILED!
17 98: perf stat JSON output linter : FAILED!
18 98: perf stat JSON output linter :^C
[root@quaco ~]#

When it works:

[root@quaco ~]# perf test -v 98
98: perf stat JSON output linter :
--- start ---
test child forked, pid 62202
Checking json output: no args [Success]
Checking json output: system wide [Success]
Checking json output: interval [Success]
Checking json output: event [Success]
Checking json output: per thread [Success]
Checking json output: per node [Success]
Checking json output: system wide Checking json output: system wide no aggregation [Success]
Checking json output: per core [Success]
Checking json output: per die [Success]
Checking json output: per socket [Success]
test child finished with 0
---- end ----
perf stat JSON output linter: Ok
[root@quaco ~]#

When it fails:

[root@quaco ~]# perf test -v 98
98: perf stat JSON output linter :
--- start ---
test child forked, pid 62270
Checking json output: no args [Success]
Checking json output: system wide [Success]
Checking json output: interval [Success]
Checking json output: event [Success]
Checking json output: per thread Test failed for input:
{"thread" : "rcu_preempt-16", "counter-value" : "0.018340", "unit" : "msec", "event" : "cpu-clock", "event-runtime" : 19071, "pcnt-running" : 100.00, "metric-value" : 0.001758, "metric-unit" : "CPUs utilized"}

{"thread" : "gnome-terminal--2977", "counter-value" : "0.061868", "unit" : "msec", "event" : "cpu-clock", "event-runtime" : 62075, "pcnt-running" : 100.00, "metric-value" : 0.005930, "metric-unit" : "CPUs utilized"}

{"thread" : "perf-62294", "counter-value" : "9.398635", "unit" : "msec", "event" : "cpu-clock", "event-runtime" : 9398904, "pcnt-running" : 100.00, "metric-value" : 0.900916, "metric-unit" : "CPUs utilized"}

{"thread" : "rcu_preempt-16", "counter-value" : "4.000000", "unit" : "", "event" : "context-switches", "event-runtime" : 15203, "pcnt-running" : 100.00, "metric-value" : 218.102508, "metric-unit" : "K/sec"}

{"thread" : "gnome-terminal--2977", "counter-value" : "1.000000", "unit" : "", "event" : "context-switches", "event-runtime" : 62075, "pcnt-running" : 100.00, "metric-value" : 16.163445, "metric-unit" : "K/sec"}

{"thread" : "perf-62294", "counter-value" : "1.000000", "unit" : "", "event" : "context-switches", "event-runtime" : 9388658, "pcnt-running" : 100.00, "metric-value" : 106.398429, "metric-unit" : "/sec"}

{"thread" : "rcu_preempt-16", "counter-value" : "1.000000", "unit" : "", "event" : "cpu-migrations", "event-runtime" : 12511, "pcnt-running" : 100.00, "metric-value" : 54.525627, "metric-unit" : "K/sec"}

{"thread" : "perf-62294", "counter-value" : "2.000000", "unit" : "", "event" : "page-faults", "event-runtime" : 9427495, "pcnt-running" : 100.00, "metric-value" : 212.796858, "metric-unit" : "/sec"}

{"thread" : "rcu_preempt-16", "counter-value" : "41498.000000", "unit" : "", "event" : "cycles", "event-runtime" : 20944, "pcnt-running" : 100.00, "metric-value" : 2.262704, "metric-unit" : "GHz"}

{"thread" : "ksoftirqd/1-22", "counter-value" : "23883.000000", "unit" : "", "event" : "cycles", "event-runtime" : 9949, "pcnt-running" : 100.00}

{"thread" : "perf-62294", "counter-value" : "36686750.000000", "unit" : "", "event" : "cycles", "event-runtime" : 9439269, "pcnt-running" : 100.00, "metric-value" : 3.903413, "metric-unit" : "GHz"}

{"thread" : "rcu_preempt-16", "counter-value" : "25086.000000", "unit" : "", "event" : "instructions", "event-runtime" : 20944, "pcnt-running" : 100.00, "metric-value" : 0.604511, "metric-unit" : "insn per cycle"}

{"thread" : "ksoftirqd/1-22", "counter-value" : "13360.000000", "unit" : "", "event" : "instructions", "event-runtime" : 9949, "pcnt-running" : 100.00, "metric-value" : 0.559394, "metric-unit" : "insn per cycle"}

{"thread" : "perf-62294", "counter-value" : "7905940.000000", "unit" : "", "event" : "instructions", "event-runtime" : 9438686, "pcnt-running" : 100.00, "metric-value" : 0.215499, "metric-unit" : "insn per cycle"}

{"thread" : "rcu_preempt-16", "counter-value" : "3951.000000", "unit" : "", "event" : "branches", "event-runtime" : 16533, "pcnt-running" : 100.00, "metric-value" : 215.430752, "metric-unit" : "M/sec"}

{"thread" : "ksoftirqd/1-22", "counter-value" : "2822.000000", "unit" : "", "event" : "branches", "event-runtime" : 9949, "pcnt-running" : 100.00, "metric-value" : 0.000000, "metric-unit" : "/sec"}

{"thread" : "perf-62294", "counter-value" : "1691804.000000", "unit" : "", "event" : "branches", "event-runtime" : 9474118, "pcnt-running" : 100.00, "metric-value" : 180.005288, "metric-unit" : "M/sec"}

{"thread" : "rcu_preempt-16", "counter-value" : "279.000000", "unit" : "", "event" : "branch-misses", "event-runtime" : 16533, "pcnt-running" : 100.00, "metric-value" : 7.061503, "metric-unit" : "of all branches"}

{"thread" : "ksoftirqd/1-22", "counter-value" : "153.000000", "unit" : "", "event" : "branch-misses", "event-runtime" : 9949, "pcnt-running" : 100.00, "metric-value" : 5.421687, "metric-unit" : "of all branches"}

{"thread" : "kworker/1:2-events-752", "counter-value" : "121.000000", "unit" : "", "event" : "branch-misses", "event-runtime" : 9850, "pcnt-running" : 100.00, "metric-value" : 0.000000, "metric-unit" : "of all branches"}

{"thread" : "perf-62294", "counter-value" : "52693.000000", "unit" : "", "event" : "branch-misses", "event-runtime" : 9451948, "pcnt-running" : 100.00, "metric-value" : 3.114604, "metric-unit" : "of all branches"}

Traceback (most recent call last):
File "/home/acme/libexec/perf-core/tests/shell/lib/perf_json_output_lint.py", line 93, in <module>
check_json_output(expected_items)
File "/home/acme/libexec/perf-core/tests/shell/lib/perf_json_output_lint.py", line 54, in check_json_output
raise RuntimeError(f'wrong number of fields. counted {count} expected {expected_items}'
RuntimeError: wrong number of fields. counted 6 expected [8, 10] in '{"thread" : "ksoftirqd/1-22", "counter-value" : "23883.000000", "unit" : "", "event" : "cycles", "event-runtime" : 9949, "pcnt-running" : 100.00}
'
test child finished with -1
---- end ----
perf stat JSON output linter: FAILED!
[root@quaco ~]#

2023-01-31 17:20:15

by Michael Petlan

[permalink] [raw]
Subject: Re: [PATCH 2/2] perf test: Fix JSON format linter test checks

On Fri, 27 Jan 2023, Arnaldo Carvalho de Melo wrote:
> Em Fri, Jan 27, 2023 at 09:26:03AM -0300, Arnaldo Carvalho de Melo escreveu:
[...]

> > Before I apply this first patch I can run, as root, the 'perf test' JSON
> > lint 100 times without problems:
> >
> > [root@quaco ~]# for a in $(seq 100) ; do echo -n $a ; perf test 98 ; done
> > 1 98: perf stat JSON output linter : Ok
> > 2 98: perf stat JSON output linter : Ok
> > 3 98: perf stat JSON output linter : Ok
> > 4 98: perf stat JSON output linter : Ok
> > <SNIP>
> > 96 98: perf stat JSON output linter : Ok
> > 97 98: perf stat JSON output linter : Ok
> > 98 98: perf stat JSON output linter : Ok
> > 99 98: perf stat JSON output linter : Ok
> > 100 98: perf stat JSON output linter : Ok
> > [root@quaco ~]#
> >
> > After applying it it fails seemingly randomly, I'll remove both patches
> > from my tmp.perf/core branch and investigate.

Hello Arnaldo. Have you found anything?

Which patch of the two causes the failures? The one that changes the JSON
printing or the second that changes the test?

I hope that the JSON printing is now correct, it seems to be, at least on my
machine + the SPR box I was debugging this on.


> >
> > [acme@quaco perf]$ git log --oneline -1
> > fdafd42aac3629db (HEAD) perf stat: Fix JSON metric printout for multiple metrics per line
> > [acme@quaco perf]$ perf -v
> > perf version 6.2.rc5.gfdafd42aac36
> > [acme@quaco perf]$
> > [root@quaco ~]# perf -v
> > perf version 6.2.rc5.gfdafd42aac36
> > [root@quaco ~]# for a in $(seq 100) ; do echo -n $a ; perf test 98 ; done
> > [root@quaco ~]# perf -v
> > perf version 6.2.rc5.gfdafd42aac36
> > [root@quaco ~]# for a in $(seq 100) ; do echo -n $a ; perf test 98 ; done
> > 1 98: perf stat JSON output linter : Ok
> > 2 98: perf stat JSON output linter : Ok
> > 3 98: perf stat JSON output linter : FAILED!
> > 4 98: perf stat JSON output linter : Ok
>
> BTW, after applying the second patch, the one for 'perf test', the
> problem persists:
>
> [acme@quaco perf]$ git log --oneline -1
> 320cd37176508ec2 (HEAD, acme/tmp.perf/core, acme.korg/tmp.perf/core) perf test: Fix JSON format linter test checks
> [acme@quaco perf]$ perf -v
> perf version 6.2.rc5.g320cd3717650
> [acme@quaco perf]$
> [root@quaco ~]# perf -v
> perf version 6.2.rc5.g320cd3717650
> [root@quaco ~]# for a in $(seq 100) ; do echo -n $a ; perf test 98 ; done
> 1 98: perf stat JSON output linter : Ok
> 2 98: perf stat JSON output linter : FAILED!
> 3 98: perf stat JSON output linter : Ok
> 4 98: perf stat JSON output linter : Ok
> 5 98: perf stat JSON output linter : FAILED!
> 6 98: perf stat JSON output linter : Ok
> 7 98: perf stat JSON output linter : FAILED!
> 8 98: perf stat JSON output linter : Ok
> 9 98: perf stat JSON output linter : Ok
> 10 98: perf stat JSON output linter : FAILED!
> 11 98: perf stat JSON output linter : FAILED!
> 12 98: perf stat JSON output linter : FAILED!
> 13 98: perf stat JSON output linter : Ok
> 14 98: perf stat JSON output linter : Ok
> 15 98: perf stat JSON output linter : FAILED!
> 16 98: perf stat JSON output linter : FAILED!
> 17 98: perf stat JSON output linter : FAILED!
> 18 98: perf stat JSON output linter :^C
> [root@quaco ~]#
>
> When it works:
>
> [root@quaco ~]# perf test -v 98
> 98: perf stat JSON output linter :
> --- start ---
> test child forked, pid 62202
> Checking json output: no args [Success]
> Checking json output: system wide [Success]
> Checking json output: interval [Success]
> Checking json output: event [Success]
> Checking json output: per thread [Success]
> Checking json output: per node [Success]
> Checking json output: system wide Checking json output: system wide no aggregation [Success]
> Checking json output: per core [Success]
> Checking json output: per die [Success]
> Checking json output: per socket [Success]
> test child finished with 0
> ---- end ----
> perf stat JSON output linter: Ok
> [root@quaco ~]#
>
> When it fails:
>
> [root@quaco ~]# perf test -v 98
> 98: perf stat JSON output linter :
> --- start ---
> test child forked, pid 62270
> Checking json output: no args [Success]
> Checking json output: system wide [Success]
> Checking json output: interval [Success]
> Checking json output: event [Success]
> Checking json output: per thread Test failed for input:
> {"thread" : "rcu_preempt-16", "counter-value" : "0.018340", "unit" : "msec", "event" : "cpu-clock", "event-runtime" : 19071, "pcnt-running" : 100.00, "metric-value" : 0.001758, "metric-unit" : "CPUs utilized"}
>
> {"thread" : "gnome-terminal--2977", "counter-value" : "0.061868", "unit" : "msec", "event" : "cpu-clock", "event-runtime" : 62075, "pcnt-running" : 100.00, "metric-value" : 0.005930, "metric-unit" : "CPUs utilized"}
>
> {"thread" : "perf-62294", "counter-value" : "9.398635", "unit" : "msec", "event" : "cpu-clock", "event-runtime" : 9398904, "pcnt-running" : 100.00, "metric-value" : 0.900916, "metric-unit" : "CPUs utilized"}
>
> {"thread" : "rcu_preempt-16", "counter-value" : "4.000000", "unit" : "", "event" : "context-switches", "event-runtime" : 15203, "pcnt-running" : 100.00, "metric-value" : 218.102508, "metric-unit" : "K/sec"}
>
> {"thread" : "gnome-terminal--2977", "counter-value" : "1.000000", "unit" : "", "event" : "context-switches", "event-runtime" : 62075, "pcnt-running" : 100.00, "metric-value" : 16.163445, "metric-unit" : "K/sec"}
>
> {"thread" : "perf-62294", "counter-value" : "1.000000", "unit" : "", "event" : "context-switches", "event-runtime" : 9388658, "pcnt-running" : 100.00, "metric-value" : 106.398429, "metric-unit" : "/sec"}
>
> {"thread" : "rcu_preempt-16", "counter-value" : "1.000000", "unit" : "", "event" : "cpu-migrations", "event-runtime" : 12511, "pcnt-running" : 100.00, "metric-value" : 54.525627, "metric-unit" : "K/sec"}
>
> {"thread" : "perf-62294", "counter-value" : "2.000000", "unit" : "", "event" : "page-faults", "event-runtime" : 9427495, "pcnt-running" : 100.00, "metric-value" : 212.796858, "metric-unit" : "/sec"}
>
> {"thread" : "rcu_preempt-16", "counter-value" : "41498.000000", "unit" : "", "event" : "cycles", "event-runtime" : 20944, "pcnt-running" : 100.00, "metric-value" : 2.262704, "metric-unit" : "GHz"}
>
> {"thread" : "ksoftirqd/1-22", "counter-value" : "23883.000000", "unit" : "", "event" : "cycles", "event-runtime" : 9949, "pcnt-running" : 100.00}
>
> {"thread" : "perf-62294", "counter-value" : "36686750.000000", "unit" : "", "event" : "cycles", "event-runtime" : 9439269, "pcnt-running" : 100.00, "metric-value" : 3.903413, "metric-unit" : "GHz"}
>
> {"thread" : "rcu_preempt-16", "counter-value" : "25086.000000", "unit" : "", "event" : "instructions", "event-runtime" : 20944, "pcnt-running" : 100.00, "metric-value" : 0.604511, "metric-unit" : "insn per cycle"}
>
> {"thread" : "ksoftirqd/1-22", "counter-value" : "13360.000000", "unit" : "", "event" : "instructions", "event-runtime" : 9949, "pcnt-running" : 100.00, "metric-value" : 0.559394, "metric-unit" : "insn per cycle"}
>
> {"thread" : "perf-62294", "counter-value" : "7905940.000000", "unit" : "", "event" : "instructions", "event-runtime" : 9438686, "pcnt-running" : 100.00, "metric-value" : 0.215499, "metric-unit" : "insn per cycle"}
>
> {"thread" : "rcu_preempt-16", "counter-value" : "3951.000000", "unit" : "", "event" : "branches", "event-runtime" : 16533, "pcnt-running" : 100.00, "metric-value" : 215.430752, "metric-unit" : "M/sec"}
>
> {"thread" : "ksoftirqd/1-22", "counter-value" : "2822.000000", "unit" : "", "event" : "branches", "event-runtime" : 9949, "pcnt-running" : 100.00, "metric-value" : 0.000000, "metric-unit" : "/sec"}
>
> {"thread" : "perf-62294", "counter-value" : "1691804.000000", "unit" : "", "event" : "branches", "event-runtime" : 9474118, "pcnt-running" : 100.00, "metric-value" : 180.005288, "metric-unit" : "M/sec"}
>
> {"thread" : "rcu_preempt-16", "counter-value" : "279.000000", "unit" : "", "event" : "branch-misses", "event-runtime" : 16533, "pcnt-running" : 100.00, "metric-value" : 7.061503, "metric-unit" : "of all branches"}
>
> {"thread" : "ksoftirqd/1-22", "counter-value" : "153.000000", "unit" : "", "event" : "branch-misses", "event-runtime" : 9949, "pcnt-running" : 100.00, "metric-value" : 5.421687, "metric-unit" : "of all branches"}
>
> {"thread" : "kworker/1:2-events-752", "counter-value" : "121.000000", "unit" : "", "event" : "branch-misses", "event-runtime" : 9850, "pcnt-running" : 100.00, "metric-value" : 0.000000, "metric-unit" : "of all branches"}
>
> {"thread" : "perf-62294", "counter-value" : "52693.000000", "unit" : "", "event" : "branch-misses", "event-runtime" : 9451948, "pcnt-running" : 100.00, "metric-value" : 3.114604, "metric-unit" : "of all branches"}
>
> Traceback (most recent call last):
> File "/home/acme/libexec/perf-core/tests/shell/lib/perf_json_output_lint.py", line 93, in <module>
> check_json_output(expected_items)
> File "/home/acme/libexec/perf-core/tests/shell/lib/perf_json_output_lint.py", line 54, in check_json_output
> raise RuntimeError(f'wrong number of fields. counted {count} expected {expected_items}'
> RuntimeError: wrong number of fields. counted 6 expected [8, 10] in '{"thread" : "ksoftirqd/1-22", "counter-value" : "23883.000000", "unit" : "", "event" : "cycles", "event-runtime" : 9949, "pcnt-running" : 100.00}

Here it seems that the test finds unexpected number of entities in the
line. It can be, 6 is not in [8, 10]. However, since the line looks OK,
we should expand the set of accepted results to [6, 8, 10] then.

If this fails even without the test patch, so only with my JSON printing
patch, AND it does NOT fail without it, it seems very weird. The patch
fixes the JSON format breakage that happens when two metrics are per line.

The JSON output above looks sane, especially looking at the line that
failed:

{"thread" : "ksoftirqd/1-22", "counter-value" : "23883.000000", "unit" : "", "event" : "cycles", "event-runtime" : 9949, "pcnt-running" : 100.00}

It rather seems that depending on the values counted, something gets
computed or not, and that changes the output, so sometimes a line with
unexpected count of elements appears there. Couldn't anything else
cause the change of the output?

Michael
> '
> test child finished with -1
> ---- end ----
> perf stat JSON output linter: FAILED!
> [root@quaco ~]#
>
>


2023-02-02 01:19:00

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 2/2] perf test: Fix JSON format linter test checks

Em Tue, Jan 31, 2023 at 06:14:17PM +0100, Michael Petlan escreveu:
> On Fri, 27 Jan 2023, Arnaldo Carvalho de Melo wrote:
> > Em Fri, Jan 27, 2023 at 09:26:03AM -0300, Arnaldo Carvalho de Melo escreveu:
> [...]
>
> > > Before I apply this first patch I can run, as root, the 'perf test' JSON
> > > lint 100 times without problems:
> > >
> > > [root@quaco ~]# for a in $(seq 100) ; do echo -n $a ; perf test 98 ; done
> > > 1 98: perf stat JSON output linter : Ok
> > > 2 98: perf stat JSON output linter : Ok
> > > 3 98: perf stat JSON output linter : Ok
> > > 4 98: perf stat JSON output linter : Ok
> > > <SNIP>
> > > 96 98: perf stat JSON output linter : Ok
> > > 97 98: perf stat JSON output linter : Ok
> > > 98 98: perf stat JSON output linter : Ok
> > > 99 98: perf stat JSON output linter : Ok
> > > 100 98: perf stat JSON output linter : Ok
> > > [root@quaco ~]#
> > >
> > > After applying it it fails seemingly randomly, I'll remove both patches
> > > from my tmp.perf/core branch and investigate.
>
> Hello Arnaldo. Have you found anything?
>
> Which patch of the two causes the failures? The one that changes the JSON

Its described below, the first patch introduces the problem.

IT may well be due to the Intel model of my test machine, from what I
could read on the other threads about the "all metrics test", but I
haven't had time to fully investigate.

[acme@quaco pahole]$ grep -m1 ^model\ name /proc/cpuinfo
model name : Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
[acme@quaco pahole]$

- Arnaldo

> printing or the second that changes the test?
>
> I hope that the JSON printing is now correct, it seems to be, at least on my
> machine + the SPR box I was debugging this on.
>
>
> > >
> > > [acme@quaco perf]$ git log --oneline -1
> > > fdafd42aac3629db (HEAD) perf stat: Fix JSON metric printout for multiple metrics per line
> > > [acme@quaco perf]$ perf -v
> > > perf version 6.2.rc5.gfdafd42aac36
> > > [acme@quaco perf]$
> > > [root@quaco ~]# perf -v
> > > perf version 6.2.rc5.gfdafd42aac36
> > > [root@quaco ~]# for a in $(seq 100) ; do echo -n $a ; perf test 98 ; done
> > > [root@quaco ~]# perf -v
> > > perf version 6.2.rc5.gfdafd42aac36
> > > [root@quaco ~]# for a in $(seq 100) ; do echo -n $a ; perf test 98 ; done
> > > 1 98: perf stat JSON output linter : Ok
> > > 2 98: perf stat JSON output linter : Ok
> > > 3 98: perf stat JSON output linter : FAILED!
> > > 4 98: perf stat JSON output linter : Ok
> >
> > BTW, after applying the second patch, the one for 'perf test', the
> > problem persists:
> >
> > [acme@quaco perf]$ git log --oneline -1
> > 320cd37176508ec2 (HEAD, acme/tmp.perf/core, acme.korg/tmp.perf/core) perf test: Fix JSON format linter test checks
> > [acme@quaco perf]$ perf -v
> > perf version 6.2.rc5.g320cd3717650
> > [acme@quaco perf]$
> > [root@quaco ~]# perf -v
> > perf version 6.2.rc5.g320cd3717650
> > [root@quaco ~]# for a in $(seq 100) ; do echo -n $a ; perf test 98 ; done
> > 1 98: perf stat JSON output linter : Ok
> > 2 98: perf stat JSON output linter : FAILED!
> > 3 98: perf stat JSON output linter : Ok
> > 4 98: perf stat JSON output linter : Ok
> > 5 98: perf stat JSON output linter : FAILED!
> > 6 98: perf stat JSON output linter : Ok
> > 7 98: perf stat JSON output linter : FAILED!
> > 8 98: perf stat JSON output linter : Ok
> > 9 98: perf stat JSON output linter : Ok
> > 10 98: perf stat JSON output linter : FAILED!
> > 11 98: perf stat JSON output linter : FAILED!
> > 12 98: perf stat JSON output linter : FAILED!
> > 13 98: perf stat JSON output linter : Ok
> > 14 98: perf stat JSON output linter : Ok
> > 15 98: perf stat JSON output linter : FAILED!
> > 16 98: perf stat JSON output linter : FAILED!
> > 17 98: perf stat JSON output linter : FAILED!
> > 18 98: perf stat JSON output linter :^C
> > [root@quaco ~]#
> >
> > When it works:
> >
> > [root@quaco ~]# perf test -v 98
> > 98: perf stat JSON output linter :
> > --- start ---
> > test child forked, pid 62202
> > Checking json output: no args [Success]
> > Checking json output: system wide [Success]
> > Checking json output: interval [Success]
> > Checking json output: event [Success]
> > Checking json output: per thread [Success]
> > Checking json output: per node [Success]
> > Checking json output: system wide Checking json output: system wide no aggregation [Success]
> > Checking json output: per core [Success]
> > Checking json output: per die [Success]
> > Checking json output: per socket [Success]
> > test child finished with 0
> > ---- end ----
> > perf stat JSON output linter: Ok
> > [root@quaco ~]#
> >
> > When it fails:
> >
> > [root@quaco ~]# perf test -v 98
> > 98: perf stat JSON output linter :
> > --- start ---
> > test child forked, pid 62270
> > Checking json output: no args [Success]
> > Checking json output: system wide [Success]
> > Checking json output: interval [Success]
> > Checking json output: event [Success]
> > Checking json output: per thread Test failed for input:
> > {"thread" : "rcu_preempt-16", "counter-value" : "0.018340", "unit" : "msec", "event" : "cpu-clock", "event-runtime" : 19071, "pcnt-running" : 100.00, "metric-value" : 0.001758, "metric-unit" : "CPUs utilized"}
> >
> > {"thread" : "gnome-terminal--2977", "counter-value" : "0.061868", "unit" : "msec", "event" : "cpu-clock", "event-runtime" : 62075, "pcnt-running" : 100.00, "metric-value" : 0.005930, "metric-unit" : "CPUs utilized"}
> >
> > {"thread" : "perf-62294", "counter-value" : "9.398635", "unit" : "msec", "event" : "cpu-clock", "event-runtime" : 9398904, "pcnt-running" : 100.00, "metric-value" : 0.900916, "metric-unit" : "CPUs utilized"}
> >
> > {"thread" : "rcu_preempt-16", "counter-value" : "4.000000", "unit" : "", "event" : "context-switches", "event-runtime" : 15203, "pcnt-running" : 100.00, "metric-value" : 218.102508, "metric-unit" : "K/sec"}
> >
> > {"thread" : "gnome-terminal--2977", "counter-value" : "1.000000", "unit" : "", "event" : "context-switches", "event-runtime" : 62075, "pcnt-running" : 100.00, "metric-value" : 16.163445, "metric-unit" : "K/sec"}
> >
> > {"thread" : "perf-62294", "counter-value" : "1.000000", "unit" : "", "event" : "context-switches", "event-runtime" : 9388658, "pcnt-running" : 100.00, "metric-value" : 106.398429, "metric-unit" : "/sec"}
> >
> > {"thread" : "rcu_preempt-16", "counter-value" : "1.000000", "unit" : "", "event" : "cpu-migrations", "event-runtime" : 12511, "pcnt-running" : 100.00, "metric-value" : 54.525627, "metric-unit" : "K/sec"}
> >
> > {"thread" : "perf-62294", "counter-value" : "2.000000", "unit" : "", "event" : "page-faults", "event-runtime" : 9427495, "pcnt-running" : 100.00, "metric-value" : 212.796858, "metric-unit" : "/sec"}
> >
> > {"thread" : "rcu_preempt-16", "counter-value" : "41498.000000", "unit" : "", "event" : "cycles", "event-runtime" : 20944, "pcnt-running" : 100.00, "metric-value" : 2.262704, "metric-unit" : "GHz"}
> >
> > {"thread" : "ksoftirqd/1-22", "counter-value" : "23883.000000", "unit" : "", "event" : "cycles", "event-runtime" : 9949, "pcnt-running" : 100.00}
> >
> > {"thread" : "perf-62294", "counter-value" : "36686750.000000", "unit" : "", "event" : "cycles", "event-runtime" : 9439269, "pcnt-running" : 100.00, "metric-value" : 3.903413, "metric-unit" : "GHz"}
> >
> > {"thread" : "rcu_preempt-16", "counter-value" : "25086.000000", "unit" : "", "event" : "instructions", "event-runtime" : 20944, "pcnt-running" : 100.00, "metric-value" : 0.604511, "metric-unit" : "insn per cycle"}
> >
> > {"thread" : "ksoftirqd/1-22", "counter-value" : "13360.000000", "unit" : "", "event" : "instructions", "event-runtime" : 9949, "pcnt-running" : 100.00, "metric-value" : 0.559394, "metric-unit" : "insn per cycle"}
> >
> > {"thread" : "perf-62294", "counter-value" : "7905940.000000", "unit" : "", "event" : "instructions", "event-runtime" : 9438686, "pcnt-running" : 100.00, "metric-value" : 0.215499, "metric-unit" : "insn per cycle"}
> >
> > {"thread" : "rcu_preempt-16", "counter-value" : "3951.000000", "unit" : "", "event" : "branches", "event-runtime" : 16533, "pcnt-running" : 100.00, "metric-value" : 215.430752, "metric-unit" : "M/sec"}
> >
> > {"thread" : "ksoftirqd/1-22", "counter-value" : "2822.000000", "unit" : "", "event" : "branches", "event-runtime" : 9949, "pcnt-running" : 100.00, "metric-value" : 0.000000, "metric-unit" : "/sec"}
> >
> > {"thread" : "perf-62294", "counter-value" : "1691804.000000", "unit" : "", "event" : "branches", "event-runtime" : 9474118, "pcnt-running" : 100.00, "metric-value" : 180.005288, "metric-unit" : "M/sec"}
> >
> > {"thread" : "rcu_preempt-16", "counter-value" : "279.000000", "unit" : "", "event" : "branch-misses", "event-runtime" : 16533, "pcnt-running" : 100.00, "metric-value" : 7.061503, "metric-unit" : "of all branches"}
> >
> > {"thread" : "ksoftirqd/1-22", "counter-value" : "153.000000", "unit" : "", "event" : "branch-misses", "event-runtime" : 9949, "pcnt-running" : 100.00, "metric-value" : 5.421687, "metric-unit" : "of all branches"}
> >
> > {"thread" : "kworker/1:2-events-752", "counter-value" : "121.000000", "unit" : "", "event" : "branch-misses", "event-runtime" : 9850, "pcnt-running" : 100.00, "metric-value" : 0.000000, "metric-unit" : "of all branches"}
> >
> > {"thread" : "perf-62294", "counter-value" : "52693.000000", "unit" : "", "event" : "branch-misses", "event-runtime" : 9451948, "pcnt-running" : 100.00, "metric-value" : 3.114604, "metric-unit" : "of all branches"}
> >
> > Traceback (most recent call last):
> > File "/home/acme/libexec/perf-core/tests/shell/lib/perf_json_output_lint.py", line 93, in <module>
> > check_json_output(expected_items)
> > File "/home/acme/libexec/perf-core/tests/shell/lib/perf_json_output_lint.py", line 54, in check_json_output
> > raise RuntimeError(f'wrong number of fields. counted {count} expected {expected_items}'
> > RuntimeError: wrong number of fields. counted 6 expected [8, 10] in '{"thread" : "ksoftirqd/1-22", "counter-value" : "23883.000000", "unit" : "", "event" : "cycles", "event-runtime" : 9949, "pcnt-running" : 100.00}
>
> Here it seems that the test finds unexpected number of entities in the
> line. It can be, 6 is not in [8, 10]. However, since the line looks OK,
> we should expand the set of accepted results to [6, 8, 10] then.
>
> If this fails even without the test patch, so only with my JSON printing
> patch, AND it does NOT fail without it, it seems very weird. The patch
> fixes the JSON format breakage that happens when two metrics are per line.
>
> The JSON output above looks sane, especially looking at the line that
> failed:
>
> {"thread" : "ksoftirqd/1-22", "counter-value" : "23883.000000", "unit" : "", "event" : "cycles", "event-runtime" : 9949, "pcnt-running" : 100.00}
>
> It rather seems that depending on the values counted, something gets
> computed or not, and that changes the output, so sometimes a line with
> unexpected count of elements appears there. Couldn't anything else
> cause the change of the output?
>
> Michael
> > '
> > test child finished with -1
> > ---- end ----
> > perf stat JSON output linter: FAILED!
> > [root@quaco ~]#
> >
> >
>

--

- Arnaldo