2023-12-07 12:57:52

by Thomas Richter

Subject: [PATCH] perf test: Fix fails of perf stat --bpf-counters --for-each-cgroup on s390

On s390 this test fails very often, as can be observed in the output
below. The failure comes from the second test function,
check_cpu_list_counted(), which runs perf stat on CPUs 0 and 1 only.
s390 systems usually have many more CPUs, so most of the time no
counter increments on CPUs 0 and 1.

Fix this by explicitly triggering a systemd workload on CPUs 0 and 1.
This is a better approach than calculating a long list of CPUs (which
is basically the same as option -a) or waiting for a longer period of
time.

Output before:
# for i in $(seq 10)
> do ./perf test 100
> done
100: perf stat --bpf-counters --for-each-cgroup test : FAILED!
100: perf stat --bpf-counters --for-each-cgroup test : Ok
100: perf stat --bpf-counters --for-each-cgroup test : FAILED!
100: perf stat --bpf-counters --for-each-cgroup test : Ok
100: perf stat --bpf-counters --for-each-cgroup test : Ok
100: perf stat --bpf-counters --for-each-cgroup test : Ok
100: perf stat --bpf-counters --for-each-cgroup test : FAILED!
100: perf stat --bpf-counters --for-each-cgroup test : Ok
100: perf stat --bpf-counters --for-each-cgroup test : FAILED!
100: perf stat --bpf-counters --for-each-cgroup test : Ok
#

Output after:
# for i in $(seq 10);
do ./perf test 100;
done
100: perf stat --bpf-counters --for-each-cgroup test : Ok
100: perf stat --bpf-counters --for-each-cgroup test : Ok
100: perf stat --bpf-counters --for-each-cgroup test : Ok
100: perf stat --bpf-counters --for-each-cgroup test : Ok
100: perf stat --bpf-counters --for-each-cgroup test : Ok
100: perf stat --bpf-counters --for-each-cgroup test : Ok
100: perf stat --bpf-counters --for-each-cgroup test : Ok
100: perf stat --bpf-counters --for-each-cgroup test : Ok
100: perf stat --bpf-counters --for-each-cgroup test : Ok
100: perf stat --bpf-counters --for-each-cgroup test : Ok
#

Signed-off-by: Thomas Richter <[email protected]>
---
tools/perf/tests/shell/stat_bpf_counters_cgrp.sh | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/perf/tests/shell/stat_bpf_counters_cgrp.sh b/tools/perf/tests/shell/stat_bpf_counters_cgrp.sh
index e75d0780dc78..f67602321403 100755
--- a/tools/perf/tests/shell/stat_bpf_counters_cgrp.sh
+++ b/tools/perf/tests/shell/stat_bpf_counters_cgrp.sh
@@ -60,6 +60,7 @@ check_system_wide_counted()

check_cpu_list_counted()
{
+ taskset -c 0,1 systemctl daemon-reexec &
check_cpu_list_counted_output=$(perf stat -C 0,1 --bpf-counters --for-each-cgroup ${test_cgroups} -e cpu-clock -x, taskset -c 1 sleep 1 2>&1)
if echo ${check_cpu_list_counted_output} | grep -q -F "<not "; then
echo "Some CPU events are not counted"
--
2.43.0


2023-12-07 23:28:20

by Namhyung Kim

Subject: Re: [PATCH] perf test: Fix fails of perf stat --bpf-counters --for-each-cgroup on s390

Hello,

On Thu, Dec 7, 2023 at 4:57 AM Thomas Richter <[email protected]> wrote:
>
> On s390 this test fails very often, as can be observed in the output
> below. This is caused by the second test function
> check_cpu_list_counted(). The perf stat is triggered for 2 CPUs
> 0 and 1. On s390, which usually has a lot more CPUs, most often
> this ends up in no counter increments on these 2 CPUs 0 and 1.
>
> Fix this and trigger explicit workload on CPU 0 and 1 for
> systemd. This is a better approach than calculating a long
> list of CPUs (which is basically the same as option -a), or
> wait a longer period of time.
>
> Output before:
> # for i in $(seq 10)
> > do ./perf test 100
> > done
> 100: perf stat --bpf-counters --for-each-cgroup test : FAILED!
> 100: perf stat --bpf-counters --for-each-cgroup test : Ok
> 100: perf stat --bpf-counters --for-each-cgroup test : FAILED!
> 100: perf stat --bpf-counters --for-each-cgroup test : Ok
> 100: perf stat --bpf-counters --for-each-cgroup test : Ok
> 100: perf stat --bpf-counters --for-each-cgroup test : Ok
> 100: perf stat --bpf-counters --for-each-cgroup test : FAILED!
> 100: perf stat --bpf-counters --for-each-cgroup test : Ok
> 100: perf stat --bpf-counters --for-each-cgroup test : FAILED!
> 100: perf stat --bpf-counters --for-each-cgroup test : Ok
> #
>
> Output after:
> # for i in $(seq 10);
> do ./perf test 100;
> done
> 100: perf stat --bpf-counters --for-each-cgroup test : Ok
> 100: perf stat --bpf-counters --for-each-cgroup test : Ok
> 100: perf stat --bpf-counters --for-each-cgroup test : Ok
> 100: perf stat --bpf-counters --for-each-cgroup test : Ok
> 100: perf stat --bpf-counters --for-each-cgroup test : Ok
> 100: perf stat --bpf-counters --for-each-cgroup test : Ok
> 100: perf stat --bpf-counters --for-each-cgroup test : Ok
> 100: perf stat --bpf-counters --for-each-cgroup test : Ok
> 100: perf stat --bpf-counters --for-each-cgroup test : Ok
> 100: perf stat --bpf-counters --for-each-cgroup test : Ok
> #
>
> Signed-off-by: Thomas Richter <[email protected]>
> ---
> tools/perf/tests/shell/stat_bpf_counters_cgrp.sh | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/tools/perf/tests/shell/stat_bpf_counters_cgrp.sh b/tools/perf/tests/shell/stat_bpf_counters_cgrp.sh
> index e75d0780dc78..f67602321403 100755
> --- a/tools/perf/tests/shell/stat_bpf_counters_cgrp.sh
> +++ b/tools/perf/tests/shell/stat_bpf_counters_cgrp.sh
> @@ -60,6 +60,7 @@ check_system_wide_counted()
>
> check_cpu_list_counted()
> {
> + taskset -c 0,1 systemctl daemon-reexec &

Thanks for the patch. But I think it should support
machines without systemd (or maybe with old versions).

Also probably you want to reset the behavior after
the test. I think we can just run some built-in test
workload like `perf test -w thloop`.

Thanks,
Namhyung


> check_cpu_list_counted_output=$(perf stat -C 0,1 --bpf-counters --for-each-cgroup ${test_cgroups} -e cpu-clock -x, taskset -c 1 sleep 1 2>&1)
> if echo ${check_cpu_list_counted_output} | grep -q -F "<not "; then
> echo "Some CPU events are not counted"
> --
> 2.43.0
>
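
The first point above, supporting machines without systemd, could be
handled by probing for systemd before starting the pinned workload. The
following is only a minimal sketch under assumptions: the helper name
pick_pinned_workload is hypothetical and not part of the posted patch,
and the probe relies on the usual convention that /run/systemd/system
exists when the machine was booted with systemd.

```shell
# Hypothetical helper, not part of the posted patch.
# Prints the command line of a workload that runs inside system.slice
# and is pinned to CPUs 0 and 1. Returns non-zero when systemd is not
# available, so the caller can skip the CPU list check.
pick_pinned_workload()
{
	# systemctl must be installed and systemd must be the running init.
	if command -v systemctl > /dev/null 2>&1 && [ -d /run/systemd/system ]; then
		echo "taskset -c 0,1 systemctl daemon-reexec"
		return 0
	fi
	return 1
}
```

A caller in the test script might use it as
"workload=$(pick_pinned_workload) || return" and only then start
${workload} in the background, so that nothing changes on machines
where the probe fails.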

2023-12-08 11:08:31

by Thomas Richter

Subject: Re: [PATCH] perf test: Fix fails of perf stat --bpf-counters --for-each-cgroup on s390

On 12/8/23 00:26, Namhyung Kim wrote:

> Thanks for the patch. But I think it should support
> machines without systemd (or maybe with old versions).
>
> Also probably you want to reset the behavior after
> the test. I think we can just run some built-in test
> workload like `perf test -w thloop`.
>
> Thanks,
> Namhyung

Thanks for your feedback.
Regarding the use of systemctl daemon-reexec, the manual says that
this command re-executes the systemd manager.
There is nothing to reset. All ports stay active while the command
is processed.

I tried your `perf test -w thloop`, but that did not trigger
anything on system.slice.

I do not understand enough about cgroups and system.slice, but I am
under the impression that the system.slice counters are only
incremented by processes under systemd control. Maybe I am wrong.

The only other workload which always incremented system.slice counters
was 'ssh localhost ls -l', which involves local login and a running sshd.

Thanks for your advice on how to continue on this.


--
Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
--
Chairman of the Supervisory Board: Gregor Pillen
Management: David Faller
Registered office: Böblingen / Commercial register: Amtsgericht Stuttgart, HRB 243294

2023-12-08 11:31:37

by Thomas Richter

Subject: Re: [PATCH] perf test: Fix fails of perf stat --bpf-counters --for-each-cgroup on s390

On 12/8/23 12:07, Thomas Richter wrote:
> On 12/8/23 00:26, Namhyung Kim wrote:
>
>> Thanks for the patch. But I think it should support
>> machines without systemd (or maybe with old versions).
>>
>> Also probably you want to reset the behavior after
>> the test. I think we can just run some built-in test
>> workload like `perf test -w thloop`.
>>
>> Thanks,
>> Namhyung
>
> Thanks for your feedback.
> Well regarding the use of systemd daemon-reexec the manual says
> this command restarts the systemd triggered processes.
> There is nothing to reset. All ports stay active while the command
> is processed.
>
> I tried your `perf test -w thloop`, but that did not trigger
> anything on system.slice.
>
> I do not understand enough about cgroups and system.slice, but I am
> under the impression, that the system.slice just increment counters
> when executed by processes under systemd control. Maybe I am wrong.
>
> The only other workload which always incremented system.slice counters
> was 'ssh localhost ls -l', which involves local login and a running sshd.
>
> Thanks for your advice on how to continue on this.
>
>

I have done some reading and found this:

  Special Slice Units
    There are four ".slice" units which form the basis of the
    hierarchy for assignment of resources for services, users, and
    virtual machines or containers.
    See systemd.slice(7) for details about slice units.

    -.slice
      The root slice is the root of the slice hierarchy. It usually
      does not contain units directly, but may be used to set
      defaults for the whole tree.
      Added in version 206.

    system.slice
      By default, all system services started by systemd are found
      in this slice.
      Added in version 206.

So it looks like counters attached to system.slice are only
incremented when systemd-controlled processes do some work.
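
Whether a given workload actually runs inside system.slice can be
checked from /proc/<pid>/cgroup, where the last colon-separated field
of each line is the cgroup path of the process. The following is a
minimal sketch; the function names path_in_system_slice and
pid_in_system_slice are hypothetical and do not come from the perf
test script.

```shell
# Hypothetical helper: succeed if a cgroup path is system.slice itself
# or lies below it, for example /system.slice/sshd.service.
path_in_system_slice()
{
	case "$1" in
	/system.slice|/system.slice/*) return 0 ;;
	*) return 1 ;;
	esac
}

# Hypothetical helper: succeed if the process with the given PID
# (default: the current shell) belongs to a cgroup in system.slice.
pid_in_system_slice()
{
	pid=${1:-$$}
	[ -r "/proc/${pid}/cgroup" ] || return 1
	# Each line has the form "hierarchy-ID:controllers:path".
	while IFS=: read -r _ _ path; do
		path_in_system_slice "$path" && return 0
	done < "/proc/${pid}/cgroup"
	return 1
}
```

A shell started from a login session normally lives under user.slice,
which would be consistent with the observation that a workload started
directly from the test script does not increment the system.slice
counters.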


2023-12-11 23:13:36

by Namhyung Kim

Subject: Re: [PATCH] perf test: Fix fails of perf stat --bpf-counters --for-each-cgroup on s390

On Fri, Dec 8, 2023 at 3:30 AM Thomas Richter <[email protected]> wrote:
>
> On 12/8/23 12:07, Thomas Richter wrote:
> > On 12/8/23 00:26, Namhyung Kim wrote:
> >
> >> Thanks for the patch. But I think it should support
> >> machines without systemd (or maybe with old versions).
> >>
> >> Also probably you want to reset the behavior after
> >> the test. I think we can just run some built-in test
> >> workload like `perf test -w thloop`.
> >>
> >> Thanks,
> >> Namhyung
> >
> > Thanks for your feedback.
> > Well regarding the use of systemd daemon-reexec the manual says
> > this command restarts the systemd triggered processes.
> > There is nothing to reset. All ports stay active while the command
> > is processed.
> >
> > I tried your `perf test -w thloop`, but that did not trigger
> > anything on system.slice.
> >
> > I do not understand enough about cgroups and system.slice, but I am
> > under the impression, that the system.slice just increment counters
> > when executed by processes under systemd control. Maybe I am wrong.

Ah, you're right. It needs to run the task somewhere in the system.slice.
Then it'd be hard to get a proper cgroup name generally. Hmm..

My concern was that it would bind system daemons to CPUs 0 and 1 after
the test. You could probably run it again at the end of the test
without taskset.

> >
> > The only other workload which always incremented system.slice counters
> > was 'ssh localhost ls -l', which involves local login and a running sshd.

But it won't work if the system doesn't have sshd.

Thanks,
Namhyung
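
One way to check whether the affinity concern raised above applies is
to compare the CPUs that PID 1 is allowed to run on before and after
the pinned command. The following is a minimal sketch, assuming a Linux
/proc file system; the helper name cpus_allowed_list is hypothetical
and not part of the patch.

```shell
# Hypothetical helper, not part of the patch: print the list of CPUs a
# process may run on, as reported by the kernel in /proc/<pid>/status.
# Defaults to PID 1, which is the systemd manager on systemd machines.
cpus_allowed_list()
{
	awk '/^Cpus_allowed_list:/ { print $2 }' "/proc/${1:-1}/status"
}

# Usage sketch (needs root and re-executes the systemd manager):
#   before=$(cpus_allowed_list 1)
#   taskset -c 0,1 systemctl daemon-reexec
#   after=$(cpus_allowed_list 1)
#   [ "$before" = "$after" ] || echo "PID 1 affinity changed: $after"
```

If the two values differ, the test would have to restore the previous
affinity before it exits; if they are equal, no reset is needed for
PID 1 itself.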