2022-10-27 20:34:04

by Andrei Vagin

Subject: [PATCH] sched: consider WF_SYNC to find idle siblings

From: Andrei Vagin <[email protected]>

WF_SYNC means that the waker goes to sleep after wakeup, so the current
cpu can be considered idle if the waker is the only process that is
running on it.
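
For reference, the sync flag tested below is derived in
select_task_rq_fair() from the wake flags roughly as follows
(paraphrased from the same kernel/sched/fair.c, shown here only for
context):

	/* A sync wakeup from an exiting waker is not trusted. */
	int sync = (wake_flags & WF_SYNC) && !(current->flags & PF_EXITING);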

The perf pipe benchmark shows that this change reduces the average time
per operation from 8.8 usecs/op to 3.7 usecs/op.

Before:
$ ./tools/perf/perf bench sched pipe
# Running 'sched/pipe' benchmark:
# Executed 1000000 pipe operations between two processes

Total time: 8.813 [sec]

8.813985 usecs/op
113456 ops/sec

After:
$ ./tools/perf/perf bench sched pipe
# Running 'sched/pipe' benchmark:
# Executed 1000000 pipe operations between two processes

Total time: 3.743 [sec]

3.743971 usecs/op
267096 ops/sec

Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Vincent Guittot <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ben Segall <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Daniel Bristot de Oliveira <[email protected]>
Cc: Valentin Schneider <[email protected]>
Signed-off-by: Andrei Vagin <[email protected]>
---
kernel/sched/fair.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e4a0b8bd941c..40ac3cc68f5b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7245,7 +7245,8 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int wake_flags)
 		new_cpu = find_idlest_cpu(sd, p, cpu, prev_cpu, sd_flag);
 	} else if (wake_flags & WF_TTWU) { /* XXX always ? */
 		/* Fast path */
-		new_cpu = select_idle_sibling(p, prev_cpu, new_cpu);
+		if (!sync || cpu != new_cpu || this_rq()->nr_running != 1)
+			new_cpu = select_idle_sibling(p, prev_cpu, new_cpu);
 	}
 	rcu_read_unlock();

--
2.34.1



2022-10-31 13:19:09

by Peter Zijlstra

Subject: Re: [PATCH] sched: consider WF_SYNC to find idle siblings

On Thu, Oct 27, 2022 at 01:26:03PM -0700, Andrei Vagin wrote:
> From: Andrei Vagin <[email protected]>
>
> WF_SYNC means that the waker goes to sleep after wakeup, so the current
> cpu can be considered idle if the waker is the only process that is
> running on it.
>
> The perf pipe benchmark shows that this change reduces the average time
> per operation from 8.8 usecs/op to 3.7 usecs/op.
>
> Before:
> $ ./tools/perf/perf bench sched pipe
> # Running 'sched/pipe' benchmark:
> # Executed 1000000 pipe operations between two processes
>
> Total time: 8.813 [sec]
>
> 8.813985 usecs/op
> 113456 ops/sec
>
> After:
> $ ./tools/perf/perf bench sched pipe
> # Running 'sched/pipe' benchmark:
> # Executed 1000000 pipe operations between two processes
>
> Total time: 3.743 [sec]
>
> 3.743971 usecs/op
> 267096 ops/sec

But what, if anything, does it do for the myriad of other benchmarks we
run?

2022-10-31 22:54:41

by Andrei Vagin

Subject: Re: [PATCH] sched: consider WF_SYNC to find idle siblings

On Mon, Oct 31, 2022 at 5:57 AM Peter Zijlstra <[email protected]> wrote:
>
> On Thu, Oct 27, 2022 at 01:26:03PM -0700, Andrei Vagin wrote:
> > From: Andrei Vagin <[email protected]>
> >
> > WF_SYNC means that the waker goes to sleep after wakeup, so the current
> > cpu can be considered idle if the waker is the only process that is
> > running on it.
> >
> > The perf pipe benchmark shows that this change reduces the average time
> > per operation from 8.8 usecs/op to 3.7 usecs/op.
> >
> > Before:
> > $ ./tools/perf/perf bench sched pipe
> > # Running 'sched/pipe' benchmark:
> > # Executed 1000000 pipe operations between two processes
> >
> > Total time: 8.813 [sec]
> >
> > 8.813985 usecs/op
> > 113456 ops/sec
> >
> > After:
> > $ ./tools/perf/perf bench sched pipe
> > # Running 'sched/pipe' benchmark:
> > # Executed 1000000 pipe operations between two processes
> >
> > Total time: 3.743 [sec]
> >
> > 3.743971 usecs/op
> > 267096 ops/sec
>
> But what, if anything, does it do for the myriad of other benchmarks we
> run?

I've run this set of benchmarks:
* perf bench sched messaging
* perf bench epoll all
* perf bench futex all
* schbench
* tbench
* kernel compilation

Results look the same with and without this change for all benchmarks except
tbench. tbench shows improvements when the number of processes is less than
the number of CPUs.

Here are results from my test host with 8 CPUs.

$ tbench_srv & tbench -t 15 1 127.0.0.1
Before: Throughput 260.498 MB/sec 1 clients 1 procs max_latency=1.301 ms
After: Throughput 462.047 MB/sec 1 clients 1 procs max_latency=1.066 ms

$ tbench_srv & tbench -t 15 4 127.0.0.1
Before: Throughput 733.44 MB/sec 4 clients 4 procs max_latency=0.935 ms
After: Throughput 1778.94 MB/sec 4 clients 4 procs max_latency=0.882 ms

$ tbench_srv & tbench -t 15 8 127.0.0.1
Before: Throughput 1965.41 MB/sec 8 clients 8 procs max_latency=2.145 ms
After: Throughput 2002.96 MB/sec 8 clients 8 procs max_latency=1.881 ms

$ tbench_srv & tbench -t 15 32 127.0.0.1
Before: Throughput 1881.79 MB/sec 32 clients 32 procs max_latency=16.365 ms
After: Throughput 1891.87 MB/sec 32 clients 32 procs max_latency=4.050 ms

Let me know if you want to see results for any other specific benchmark.

Thanks,
Andrei

2022-11-01 10:24:32

by Mel Gorman

Subject: Re: [PATCH] sched: consider WF_SYNC to find idle siblings

On Thu, Oct 27, 2022 at 01:26:03PM -0700, Andrei Vagin wrote:
> From: Andrei Vagin <[email protected]>
>
> WF_SYNC means that the waker goes to sleep after wakeup, so the current
> cpu can be considered idle if the waker is the only process that is
> running on it.
>
> The perf pipe benchmark shows that this change reduces the average time
> per operation from 8.8 usecs/op to 3.7 usecs/op.
>
> Before:
> $ ./tools/perf/perf bench sched pipe
> # Running 'sched/pipe' benchmark:
> # Executed 1000000 pipe operations between two processes
>
> Total time: 8.813 [sec]
>
> 8.813985 usecs/op
> 113456 ops/sec
>
> After:
> $ ./tools/perf/perf bench sched pipe
> # Running 'sched/pipe' benchmark:
> # Executed 1000000 pipe operations between two processes
>
> Total time: 3.743 [sec]
>
> 3.743971 usecs/op
> 267096 ops/sec
>

The WF_SYNC hint is unreliable, as the waking process does not always
go to sleep immediately. While it's great for something like the pipe
benchmark, where the relationship is strictly synchronous, it does not
work out as well for networking, which can use WF_SYNC for wakeups
where either multiple tasks are being woken up or the waker does not
go to sleep because there is sufficient inbound traffic to keep it
awake. There used to be an attempt to track how accurate WF_SYNC was,
using avg_overlap I think, but it was ultimately removed.
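
To make that concrete, here is roughly what the default socket
data-ready callback does (paraphrased and trimmed from
net/core/sock.c, not a verbatim copy):

	/* Paraphrased sketch of sock_def_readable(): the sync-poll
	 * wakeup is requested unconditionally, but in softirq context
	 * the "waker" may keep processing inbound packets instead of
	 * sleeping, so the WF_SYNC hint can be wrong. */
	static void sock_def_readable(struct sock *sk)
	{
		struct socket_wq *wq;

		rcu_read_lock();
		wq = rcu_dereference(sk->sk_wq);
		if (skwq_has_sleeper(wq))
			wake_up_interruptible_sync_poll(&wq->wait,
					EPOLLIN | EPOLLPRI |
					EPOLLRDNORM | EPOLLRDBAND);
		rcu_read_unlock();
	}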

--
Mel Gorman
SUSE Labs

2022-11-01 13:22:27

by Chen Yu

Subject: Re: [PATCH] sched: consider WF_SYNC to find idle siblings

Hi Mel,
On 2022-11-01 at 09:41:57 +0000, Mel Gorman wrote:
> On Thu, Oct 27, 2022 at 01:26:03PM -0700, Andrei Vagin wrote:
> > From: Andrei Vagin <[email protected]>
> >
> > WF_SYNC means that the waker goes to sleep after wakeup, so the current
> > cpu can be considered idle if the waker is the only process that is
> > running on it.
> >
> > The perf pipe benchmark shows that this change reduces the average time
> > per operation from 8.8 usecs/op to 3.7 usecs/op.
> >
> > Before:
> > $ ./tools/perf/perf bench sched pipe
> > # Running 'sched/pipe' benchmark:
> > # Executed 1000000 pipe operations between two processes
> >
> > Total time: 8.813 [sec]
> >
> > 8.813985 usecs/op
> > 113456 ops/sec
> >
> > After:
> > $ ./tools/perf/perf bench sched pipe
> > # Running 'sched/pipe' benchmark:
> > # Executed 1000000 pipe operations between two processes
> >
> > Total time: 3.743 [sec]
> >
> > 3.743971 usecs/op
> > 267096 ops/sec
> >
>
> The WF_SYNC hint is unreliable, as the waking process does not always
> go to sleep immediately. While it's great for something like the pipe
> benchmark, where the relationship is strictly synchronous, it does not
> work out as well for networking, which can use WF_SYNC for wakeups
> where either multiple tasks are being woken up or the waker does not
> go to sleep because there is sufficient inbound traffic to keep it
> awake. There used to be an attempt to track how accurate WF_SYNC was,
> using avg_overlap I think, but it was ultimately removed.
avg_overlap was removed 10 years ago because of an accuracy problem: "we are
missing the necessary call to update_curr()", according to commit
e12f31d3e5d3 ("sched: Remove avg_overlap"). But I think the issue described
in that commit no longer exists, because in the current code put_prev_task()
invokes update_curr() for each entity, so avg_overlap would always be
calculated with up-to-date runtime. If that is true, would it make sense to
bring avg_overlap back?
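
For readers who do not remember it, a rough reconstruction of the
removed machinery (from memory of the pre-e12f31d3e5d3 code, so the
details may be off): avg_overlap was an EWMA of how long a task kept
running after its last wakeup, folded in when the task went to sleep:

	/* Rough reconstruction, not verbatim: */
	static void update_avg(u64 *avg, u64 sample)
	{
		s64 diff = sample - *avg;

		*avg += diff >> 3;	/* EWMA with 1/8 weight */
	}

	/* On a voluntary sleep (dequeue with sleep == 1): */
	if (sleep && p->se.last_wakeup) {
		update_avg(&p->se.avg_overlap,
			   p->se.sum_exec_runtime - p->se.last_wakeup);
		p->se.last_wakeup = 0;
	}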

Some benchmarks suffer from cross-CPU wakeups, which introduce rq lock
contention. Similar to this patch, I tracked the average duration of the
task and placed the wakee on a CPU where only one short-running task is
running, which is another direction for mitigating cross-CPU wakeups [1].
Not sure if we could deal with this more accurately?

[1] https://lore.kernel.org/lkml/6b81eea9a8cafb7634f36586f1744b8d4ac49da5.1666531576.git.yu.c.chen@intel.com/

thanks,
Chenyu

2022-11-02 00:33:20

by Andrei Vagin

Subject: Re: [PATCH] sched: consider WF_SYNC to find idle siblings

On Tue, Nov 1, 2022 at 2:42 AM Mel Gorman <[email protected]> wrote:
>
> On Thu, Oct 27, 2022 at 01:26:03PM -0700, Andrei Vagin wrote:
> > From: Andrei Vagin <[email protected]>
> >
> > WF_SYNC means that the waker goes to sleep after wakeup, so the current
> > cpu can be considered idle if the waker is the only process that is
> > running on it.
> >
> > The perf pipe benchmark shows that this change reduces the average time
> > per operation from 8.8 usecs/op to 3.7 usecs/op.
> >
> > Before:
> > $ ./tools/perf/perf bench sched pipe
> > # Running 'sched/pipe' benchmark:
> > # Executed 1000000 pipe operations between two processes
> >
> > Total time: 8.813 [sec]
> >
> > 8.813985 usecs/op
> > 113456 ops/sec
> >
> > After:
> > $ ./tools/perf/perf bench sched pipe
> > # Running 'sched/pipe' benchmark:
> > # Executed 1000000 pipe operations between two processes
> >
> > Total time: 3.743 [sec]
> >
> > 3.743971 usecs/op
> > 267096 ops/sec
> >
>
> The WF_SYNC hint is unreliable, as the waking process does not always
> go to sleep immediately. While it's great for something like the pipe
> benchmark, where the relationship is strictly synchronous, it does not
> work out as well for networking, which can use WF_SYNC for wakeups
> where either multiple tasks are being woken up or the waker does not
> go to sleep because there is sufficient inbound traffic to keep it
> awake.

This change should work fine when we wake up multiple tasks. If the waker
doesn't go to sleep, that sounds like a misuse of WF_SYNC. For example,
wake_affine_idle contains the same check as the one introduced in this
patch. At first glance, wake_affine_weight handles WF_SYNC incorrectly in
this case too.
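
For reference, the check I mean looks roughly like this (paraphrased
from wake_affine_idle() in kernel/sched/fair.c of the kernels under
discussion, comments mine):

	static int
	wake_affine_idle(int this_cpu, int prev_cpu, int sync)
	{
		/* An idle this_cpu implies an interrupt-context
		 * wakeup; only allow the move if cache is shared. */
		if (available_idle_cpu(this_cpu) &&
		    cpus_share_cache(this_cpu, prev_cpu))
			return available_idle_cpu(prev_cpu) ? prev_cpu : this_cpu;

		/* Same idea as in this patch: a sync waker that is the
		 * only running task makes this_cpu effectively idle. */
		if (sync && cpu_rq(this_cpu)->nr_running == 1)
			return this_cpu;

		if (available_idle_cpu(prev_cpu))
			return prev_cpu;

		return nr_cpumask_bits;
	}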

As for benchmarks, tbench shows much better numbers with this change:

$ tbench_srv & tbench -t 15 4 127.0.0.1
Before: Throughput 733.44 MB/sec 4 clients 4 procs max_latency=0.935 ms
After: Throughput 1778.94 MB/sec 4 clients 4 procs max_latency=0.882 ms

I know it is just another synchronous benchmark...

I am working on the synchronous mode of seccomp user notifications [1]. In
the first two versions, I used the WF_CURRENT_CPU [2] flag that was borrowed
from the umcg patchset [3]. But when I was preparing the third version of
the patchset, I wondered why WF_SYNC didn't work in this case and ended up
with this patch. For the seccomp patchset, fast synchronous context switches
are the most critical part, so any advice on how to do that properly is
welcome.

[1] https://lore.kernel.org/lkml/[email protected]/T/
[2] https://lore.kernel.org/lkml/[email protected]/T/#m8a597d43764aa8ded2788ea7ce4276f9045668d1
[3] https://lkml.iu.edu/hypermail/linux/kernel/2111.0/04473.html
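
For context, the WF_CURRENT_CPU approach from [2] boils down to an
early exit in the wakeup path, roughly (paraphrased from that
patchset, details may differ):

	/* In select_task_rq_fair(): a WF_CURRENT_CPU waker asks for
	 * the wakee to be placed on the waking CPU outright, provided
	 * the wakee's affinity mask allows it. */
	if ((wake_flags & WF_CURRENT_CPU) &&
	    cpumask_test_cpu(cpu, p->cpus_ptr))
		return cpu;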

Thanks,
Andrei
> There used to be
> an attempt to track how accurate WF_SYNC was, using avg_overlap I think,
> but it was ultimately removed.
>
> --
> Mel Gorman
> SUSE Labs

2022-11-14 10:14:57

by K Prateek Nayak

Subject: Re: [PATCH] sched: consider WF_SYNC to find idle siblings

Hello Andrei,

I've tested this patch on a dual socket Zen3 system
(2 x 64C/128T).

tl;dr

o I observe a consistent regression for hackbench running a
  smaller number of groups.
o tbench shows improvements for smaller client counts but
  regresses for larger client counts.

I'll leave the detailed results below:

On 10/28/2022 1:56 AM, Andrei Vagin wrote:
> From: Andrei Vagin <[email protected]>
>
> WF_SYNC means that the waker goes to sleep after wakeup, so the current
> cpu can be considered idle if the waker is the only process that is
> running on it.
>
> The perf pipe benchmark shows that this change reduces the average time
> per operation from 8.8 usecs/op to 3.7 usecs/op.
>
> Before:
> $ ./tools/perf/perf bench sched pipe
> # Running 'sched/pipe' benchmark:
> # Executed 1000000 pipe operations between two processes
>
> Total time: 8.813 [sec]
>
> 8.813985 usecs/op
> 113456 ops/sec
>
> After:
> $ ./tools/perf/perf bench sched pipe
> # Running 'sched/pipe' benchmark:
> # Executed 1000000 pipe operations between two processes
>
> Total time: 3.743 [sec]
>
> 3.743971 usecs/op
> 267096 ops/sec
>

Following are the results from running standard benchmarks on a
dual socket Zen3 (2 x 64C/128T) machine configured in different
NPS modes.

NPS Modes are used to logically divide a single socket into
multiple NUMA regions.
Following is the NUMA configuration for each NPS mode on the system:

NPS1: Each socket is a NUMA node.
Total 2 NUMA nodes in the dual socket machine.

Node 0: 0-63, 128-191
Node 1: 64-127, 192-255

NPS2: Each socket is further logically divided into 2 NUMA regions.
Total 4 NUMA nodes exist over 2 sockets.

Node 0: 0-31, 128-159
Node 1: 32-63, 160-191
Node 2: 64-95, 192-223
Node 3: 96-127, 224-255

NPS4: Each socket is logically divided into 4 NUMA regions.
Total 8 NUMA nodes exist over 2 sockets.

Node 0: 0-15, 128-143
Node 1: 16-31, 144-159
Node 2: 32-47, 160-175
Node 3: 48-63, 176-191
Node 4: 64-79, 192-207
Node 5: 80-95, 208-223
Node 6: 96-111, 224-239
Node 7: 112-127, 240-255

Benchmark Results:

Kernel versions:
- tip: 5.19.0 tip sched/core
- sync: 5.19.0 tip sched/core + this patch

When we started testing, the tip was at:
commit fdf756f71271 ("sched: Fix more TASK_state comparisons")

~~~~~~~~~~~~~
~ hackbench ~
~~~~~~~~~~~~~

o NPS1

Test: tip sync
1-groups: 4.06 (0.00 pct) 4.38 (-7.88 pct) *
2-groups: 4.76 (0.00 pct) 4.91 (-3.15 pct)
4-groups: 5.22 (0.00 pct) 5.03 (3.63 pct)
8-groups: 5.35 (0.00 pct) 5.23 (2.24 pct)
16-groups: 7.21 (0.00 pct) 6.86 (4.85 pct)

o NPS2

Test: tip sync
1-groups: 4.09 (0.00 pct) 4.39 (-7.33 pct) *
2-groups: 4.70 (0.00 pct) 4.82 (-2.55 pct)
4-groups: 5.05 (0.00 pct) 4.94 (2.17 pct)
8-groups: 5.35 (0.00 pct) 5.15 (3.73 pct)
16-groups: 6.37 (0.00 pct) 6.55 (-2.82 pct)

o NPS4

Test: tip sync
1-groups: 4.07 (0.00 pct) 4.31 (-5.89 pct) *
2-groups: 4.65 (0.00 pct) 4.79 (-3.01 pct)
4-groups: 5.13 (0.00 pct) 4.99 (2.72 pct)
8-groups: 5.47 (0.00 pct) 5.51 (-0.73 pct)
16-groups: 6.82 (0.00 pct) 7.07 (-3.66 pct)

~~~~~~~~~~~~
~ schbench ~
~~~~~~~~~~~~

o NPS1

#workers: tip sync
1: 33.00 (0.00 pct) 32.00 (3.03 pct)
2: 35.00 (0.00 pct) 36.00 (-2.85 pct)
4: 39.00 (0.00 pct) 36.00 (7.69 pct)
8: 49.00 (0.00 pct) 48.00 (2.04 pct)
16: 63.00 (0.00 pct) 67.00 (-6.34 pct)
32: 109.00 (0.00 pct) 107.00 (1.83 pct)
64: 208.00 (0.00 pct) 220.00 (-5.76 pct)
128: 559.00 (0.00 pct) 551.00 (1.43 pct)
256: 45888.00 (0.00 pct) 40512.00 (11.71 pct)
512: 80000.00 (0.00 pct) 79744.00 (0.32 pct)

o NPS2

#workers: tip sync
1: 30.00 (0.00 pct) 31.00 (-3.33 pct)
2: 37.00 (0.00 pct) 36.00 (2.70 pct)
4: 39.00 (0.00 pct) 42.00 (-7.69 pct)
8: 51.00 (0.00 pct) 47.00 (7.84 pct)
16: 67.00 (0.00 pct) 67.00 (0.00 pct)
32: 117.00 (0.00 pct) 113.00 (3.41 pct)
64: 216.00 (0.00 pct) 228.00 (-5.55 pct)
128: 529.00 (0.00 pct) 531.00 (-0.37 pct)
256: 47040.00 (0.00 pct) 42688.00 (9.25 pct)
512: 84864.00 (0.00 pct) 81280.00 (4.22 pct)

o NPS4

#workers: tip sync
1: 23.00 (0.00 pct) 34.00 (-47.82 pct)
2: 28.00 (0.00 pct) 35.00 (-25.00 pct)
4: 41.00 (0.00 pct) 42.00 (-2.43 pct)
8: 60.00 (0.00 pct) 55.00 (8.33 pct)
16: 71.00 (0.00 pct) 67.00 (5.63 pct)
32: 117.00 (0.00 pct) 116.00 (0.85 pct)
64: 227.00 (0.00 pct) 221.00 (2.64 pct)
128: 545.00 (0.00 pct) 599.00 (-9.90 pct)
256: 45632.00 (0.00 pct) 45760.00 (-0.28 pct)
512: 81024.00 (0.00 pct) 79744.00 (1.57 pct)

Note: schbench at low worker counts can show large run-to-run
variation. Unless the regressions are unusually large, these
data points can be ignored.

~~~~~~~~~~
~ tbench ~
~~~~~~~~~~

o NPS1

Clients: tip sync
1 578.37 (0.00 pct) 652.14 (12.75 pct)
2 1062.09 (0.00 pct) 1179.10 (11.01 pct)
4 1800.62 (0.00 pct) 2160.13 (19.96 pct)
8 3211.02 (0.00 pct) 3705.97 (15.41 pct)
16 4848.92 (0.00 pct) 5906.04 (21.80 pct)
32 9091.36 (0.00 pct) 10622.56 (16.84 pct)
64 15454.01 (0.00 pct) 20319.16 (31.48 pct)
128 3511.33 (0.00 pct) 31631.81 (800.84 pct) *
128 19910.99 (0.00pct) 31631.81 (58.86 pct) [Verification Run]
256 50019.32 (0.00 pct) 39234.55 (-21.56 pct) *
512 44317.68 (0.00 pct) 38788.24 (-12.47 pct) *
1024 41200.85 (0.00 pct) 37231.35 (-9.63 pct) *

o NPS2

Clients: tip sync
1 576.05 (0.00 pct) 648.53 (12.58 pct)
2 1037.68 (0.00 pct) 1231.59 (18.68 pct)
4 1818.13 (0.00 pct) 2173.43 (19.54 pct)
8 3004.16 (0.00 pct) 3636.79 (21.05 pct)
16 4520.11 (0.00 pct) 5786.93 (28.02 pct)
32 8624.23 (0.00 pct) 10927.48 (26.70 pct)
64 14886.75 (0.00 pct) 18573.28 (24.76 pct)
128 20602.00 (0.00 pct) 28635.03 (38.99 pct)
256 45566.83 (0.00 pct) 36262.90 (-20.41 pct) *
512 42717.49 (0.00 pct) 35884.09 (-15.99 pct) *
1024 40936.61 (0.00 pct) 37045.24 (-9.50 pct) *

o NPS4

Clients: tip sync
1 576.36 (0.00 pct) 658.78 (14.30 pct)
2 1044.26 (0.00 pct) 1220.65 (16.89 pct)
4 1839.77 (0.00 pct) 2190.02 (19.03 pct)
8 3043.53 (0.00 pct) 3582.88 (17.72 pct)
16 5207.54 (0.00 pct) 5349.74 (2.73 pct)
32 9263.86 (0.00 pct) 10608.17 (14.51 pct)
64 14959.66 (0.00 pct) 18186.46 (21.57 pct)
128 20698.65 (0.00 pct) 31209.19 (50.77 pct)
256 46666.21 (0.00 pct) 38551.07 (-17.38 pct) *
512 41532.80 (0.00 pct) 37525.65 (-9.64 pct) *
1024 39459.49 (0.00 pct) 36075.96 (-8.57 pct) *

Note: On the tested kernel, with 128 clients, tbench can
run into a bottleneck during C-state exit. More details
can be found at
https://lore.kernel.org/lkml/[email protected]/
This issue has been fixed in v6.0 but was not a part of
the tip kernel when we began testing. This data point has
been rerun with C2 disabled to get representative results.

~~~~~~~~~~
~ stream ~
~~~~~~~~~~

o NPS1

-> 10 Runs:

Test: tip sync
Copy: 328419.14 (0.00 pct) 331174.37 (0.83 pct)
Scale: 206071.21 (0.00 pct) 211655.02 (2.70 pct)
Add: 235271.48 (0.00 pct) 240925.76 (2.40 pct)
Triad: 253175.80 (0.00 pct) 250029.15 (-1.24 pct)

-> 100 Runs:

Test: tip sync
Copy: 328209.61 (0.00 pct) 316634.10 (-3.52 pct)
Scale: 216310.13 (0.00 pct) 211496.10 (-2.22 pct)
Add: 244417.83 (0.00 pct) 237258.24 (-2.92 pct)
Triad: 237508.83 (0.00 pct) 247541.91 (4.22 pct)

o NPS2

-> 10 Runs:

Test: tip sync
Copy: 336503.88 (0.00 pct) 333502.90 (-0.89 pct)
Scale: 218035.23 (0.00 pct) 217009.06 (-0.47 pct)
Add: 257677.42 (0.00 pct) 253882.69 (-1.47 pct)
Triad: 268872.37 (0.00 pct) 263099.47 (-2.14 pct)

-> 100 Runs:

Test: tip sync
Copy: 332304.34 (0.00 pct) 336798.10 (1.35 pct)
Scale: 223421.60 (0.00 pct) 217501.94 (-2.64 pct)
Add: 252363.56 (0.00 pct) 255571.69 (1.27 pct)
Triad: 266687.56 (0.00 pct) 262833.28 (-1.44 pct)

o NPS4

-> 10 Runs:

Test: tip sync
Copy: 353515.62 (0.00 pct) 335743.68 (-5.02 pct)
Scale: 228854.37 (0.00 pct) 237557.44 (3.80 pct)
Add: 254942.12 (0.00 pct) 259415.35 (1.75 pct)
Triad: 270521.87 (0.00 pct) 273002.56 (0.91 pct)

-> 100 Runs:

Test: tip sync
Copy: 374520.81 (0.00 pct) 374736.48 (0.05 pct)
Scale: 246280.23 (0.00 pct) 237696.80 (-3.48 pct)
Add: 262772.72 (0.00 pct) 259964.95 (-1.06 pct)
Triad: 283740.92 (0.00 pct) 279790.28 (-1.39 pct)

~~~~~~~~~~~~~~~~~~
~ Schedstat Data ~
~~~~~~~~~~~~~~~~~~

-> Following are the schedstat data from hackbench with 1 group
and from tbench with 64 clients and with 256 clients

-> Legend for per CPU stats:

rq->yld_count: sched_yield count
rq->sched_count: schedule called
rq->sched_goidle: schedule left the processor idle
rq->ttwu_count: try_to_wake_up was called
rq->ttwu_local: try_to_wake_up was called to wake up the local cpu
rq->rq_cpu_time: total runtime by tasks on this processor (in jiffies)
rq->rq_sched_info.run_delay: total waittime by tasks on this processor (in jiffies)
rq->rq_sched_info.pcount: total timeslices run on this cpu
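
The fields above map, in that order, onto the per-cpu lines of
/proc/schedstat. A minimal userspace sketch to pull a couple of them
out, e.g. the local-wakeup ratio (my own illustration, assuming the
schedstat version 15 field order):

	#include <stdio.h>

	int main(void)
	{
		FILE *f = fopen("/proc/schedstat", "r");
		char line[512];
		unsigned int cpu;
		unsigned long long yld, legacy, sched, goidle, ttwu,
				   ttwu_local, cpu_time, run_delay, pcount;

		if (!f)
			return 1;
		/* "version", "timestamp" and "domain*" lines fail the
		 * match below and are skipped. */
		while (fgets(line, sizeof(line), f)) {
			if (sscanf(line, "cpu%u %llu %llu %llu %llu %llu %llu %llu %llu %llu",
				   &cpu, &yld, &legacy, &sched, &goidle,
				   &ttwu, &ttwu_local, &cpu_time,
				   &run_delay, &pcount) == 10)
				printf("cpu%u: ttwu=%llu local=%llu (%.2f%%)\n",
				       cpu, ttwu, ttwu_local,
				       ttwu ? 100.0 * ttwu_local / ttwu : 0.0);
		}
		fclose(f);
		return 0;
	}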

o Hackbench - NPS1

tip: 4.069s
sync: 4.525s

------------------------------------------------------------------------------------------------------------------------------------------------------
cpu: all_cpus (avg) vs cpu: all_cpus (avg)
------------------------------------------------------------------------------------------------------------------------------------------------------
kernel : tip, sync
sched_yield count : 0, 0
Legacy counter can be ignored : 0, 0
schedule called : 27633, 25474 | -7.81|
schedule left the processor idle : 11609, 10587 | -8.80| ( 42.01, 41.56 )
try_to_wake_up was called : 15991, 14807 | -7.40|
try_to_wake_up was called to wake up the local cpu : 473, 1630 | 244.61| ( 2.96% of total, 11.01% of total ) <--- More wakeups on the local CPU
total runtime by tasks on this processor (in jiffies) : 252079468, 316798504 | 25.67|
total waittime by tasks on this processor (in jiffies) : 204693750, 207418931 ( 81.20, 65.47 )
total timeslices run on this cpu : 16020, 14884 | -7.09| <------------------------ The increase in runtime has a
strong correlation with
rq->rq_sched_info.pcount
------------------------------------------------------------------------------------------------------------------------------------------------------

< ----------------------------------------------------------------- Wakeup info: ----------------------------------------------------------------- >
kernel : tip, sync
Wakeups on same SMT cpus = all_cpus (avg) : 854, 556 | -34.89|
Wakeups on same MC cpus = all_cpus (avg) : 12855, 8624 | -32.91|
Wakeups on same DIE cpus = all_cpus (avg) : 1270, 2496 | 96.54|
Wakeups on same NUMA cpus = all_cpus (avg) : 538, 1500 | 178.81|
Affine wakeups on same SMT cpus = all_cpus (avg) : 590, 512 | -13.22|
Affine wakeups on same MC cpus = all_cpus (avg) : 8048, 6244 | -22.42|
Affine wakeups on same DIE cpus = all_cpus (avg) : 641, 1712 | 167.08|
Affine wakeups on same NUMA cpus = all_cpus (avg) : 256, 800 | 212.50|
------------------------------------------------------------------------------------------------------------------------------------------------------

o tbench - NPS1 (64 Clients)

tip: 15674.9 MB/sec
sync: 19510.4 MB/sec (+24.46%)

------------------------------------------------------------------------------------------------------------------------------------------------------
cpu: all_cpus (avg) vs cpu: all_cpus (avg)
------------------------------------------------------------------------------------------------------------------------------------------------------
kernel : tip, sync
sched_yield count : 0, 0
Legacy counter can be ignored : 0, 0
schedule called : 3245409, 2088248 | -35.66|
schedule left the processor idle : 1621656, 5675 | -99.65| ( 49.97% of total, 0.27% of total)
try_to_wake_up was called : 1622705, 1373295 | -15.37|
try_to_wake_up was called to wake up the local cpu : 1075, 1369101 |127258.23| ( 0.07% of total, 99.69% of total ) <---- In case of the modified kernel,
total runtime by tasks on this processor (in jiffies) : 18612280720, 17991066483 all wakeups are on the same CPU
total waittime by tasks on this processor (in jiffies) : 7698505, 7046293108 |91428.07| ( 0.04% of total, 39.17% of total )
total timeslices run on this cpu : 1623752, 2082438 | 28.25| <----------------------------------------------- Total rq->rq_sched_info.pcount is
larger on the modified kernel. Strong
correlation with improvements in BW
------------------------------------------------------------------------------------------------------------------------------------------------------

< ----------------------------------------------------------------- Wakeup info: ----------------------------------------------------------------- >
kernel : tip, sync
Wakeups on same SMT cpus = all_cpus (avg) : 64021, 3757 | -94.13|
Wakeups on same MC cpus = all_cpus (avg) : 1557597, 392 | -99.97| <-- In most cases, the affine wakeup
Wakeups on same DIE cpus = all_cpus (avg) : 4, 18 | 350.00| is on another CPU in the same MC domain
Wakeups on same NUMA cpus = all_cpus (avg) : 5, 25 | 400.00| in the case of the tip kernel
Affine wakeups on same SMT cpus = all_cpus (avg) : 64018, 1374 | -97.85| |
Affine wakeups on same MC cpus = all_cpus (avg) : 1557431, 129 | -99.99| <-------
Affine wakeups on same DIE cpus = all_cpus (avg) : 3, 10 | 233.33|
Affine wakeups on same NUMA cpus = all_cpus (avg) : 2, 14 | 600.00|
------------------------------------------------------------------------------------------------------------------------------------------------------

o tbench - NPS1 (256 Clients)

tip: 44792.6 MB/sec
sync: 36050.4 MB/sec (-19.51%)

------------------------------------------------------------------------------------------------------------------------------------------------------
cpu: all_cpus (avg) vs cpu: all_cpus (avg)
------------------------------------------------------------------------------------------------------------------------------------------------------
kernel : tip, sync
sched_yield count : 3, 0 |-100.00|
Legacy counter can be ignored : 0, 0
schedule called : 4795945, 3839616 | -19.94|
schedule left the processor idle : 21549, 63 | -99.71| ( 0.45, 0 )
try_to_wake_up was called : 3077285, 2526474 | -17.90|
try_to_wake_up was called to wake up the local cpu : 3055451, 2526380 | -17.32| ( 99.29, 100 ) <------ More wakeups for tip, almost all of which
total runtime by tasks on this processor (in jiffies) : 71776758037, 71864382378 are on the local CPU
total waittime by tasks on this processor (in jiffies) : 29064423457, 27994939439 ( 40.49, 38.96 )
total timeslices run on this cpu : 4774388, 3839547 | -19.58| <---------------------------- rq->rq_sched_info.pcount is lower on the patched
kernel, which correlates strongly with the B/W drop
------------------------------------------------------------------------------------------------------------------------------------------------------

< ----------------------------------------------------------------- Wakeup info: ----------------------------------------------------------------- >
kernel : tip, sync
Wakeups on same SMT cpus = all_cpus (avg) : 19979, 78 | -99.61|
Wakeups on same MC cpus = all_cpus (avg) : 1848, 9 | -99.51|
Wakeups on same DIE cpus = all_cpus (avg) : 3, 2 | -33.33|
Wakeups on same NUMA cpus = all_cpus (avg) : 3, 3
Affine wakeups on same SMT cpus = all_cpus (avg) : 19860, 36 | -99.82|
Affine wakeups on same MC cpus = all_cpus (avg) : 1758, 4 | -99.77|
Affine wakeups on same DIE cpus = all_cpus (avg) : 2, 1 | -50.00|
Affine wakeups on same NUMA cpus = all_cpus (avg) : 2, 2
------------------------------------------------------------------------------------------------------------------------------------------------------


> Cc: Ingo Molnar <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Juri Lelli <[email protected]>
> Cc: Vincent Guittot <[email protected]>
> Cc: Dietmar Eggemann <[email protected]>
> Cc: Steven Rostedt <[email protected]>
> Cc: Ben Segall <[email protected]>
> Cc: Mel Gorman <[email protected]>
> Cc: Daniel Bristot de Oliveira <[email protected]>
> Cc: Valentin Schneider <[email protected]>
> Signed-off-by: Andrei Vagin <[email protected]>
> ---
> kernel/sched/fair.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index e4a0b8bd941c..40ac3cc68f5b 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7245,7 +7245,8 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int wake_flags)
>  		new_cpu = find_idlest_cpu(sd, p, cpu, prev_cpu, sd_flag);
>  	} else if (wake_flags & WF_TTWU) { /* XXX always ? */
>  		/* Fast path */
> -		new_cpu = select_idle_sibling(p, prev_cpu, new_cpu);
> +		if (!sync || cpu != new_cpu || this_rq()->nr_running != 1)
> +			new_cpu = select_idle_sibling(p, prev_cpu, new_cpu);

Adding perf stat data below which shows a larger dip in IPC for the patched
kernel as the system gets busy.

~~~~~~~~~~~~~
~ perf stat ~
~~~~~~~~~~~~~

Command: perf stat -a -e cycles -e instructions -- ./tbench_runner.sh

- tbench (NPS1)

-> 64 clients

o tip (15182 MB/sec):

18,054,464,226,798 cycles
14,634,257,521,310 instructions # 0.81 insn per cycle

o sync (19597.7 MB/sec [+29.08%]):

14,355,896,738,265 cycles
13,331,402,605,112 instructions # 0.93 (+14.81%) insn per cycle <-- Patched kernel has higher IPC
probably due to fewer stalls
with data warm in the L1 and L2 caches.

-> 256 clients

o tip (51581 MB/sec):

51,719,263,738,848 cycles
34,387,747,050,053 instructions # 0.66 insn per cycle

o sync (42409 MB/sec [-17.78%]):

55,236,537,108,392 cycles
28,406,928,952,272 instructions # 0.51 (-22.72%) insn per cycle <-- Patched kernel has lower IPC when
the system is busy.

>  	}
>  	rcu_read_unlock();
>
If you would like me to run any specific workload on the
test system or gather any specific data, please let me
know.
--
Thanks and Regards,
Prateek

2022-11-16 11:28:36

by Mel Gorman

Subject: Re: [PATCH] sched: consider WF_SYNC to find idle siblings

On Mon, Oct 31, 2022 at 01:57:31PM +0100, Peter Zijlstra wrote:
> On Thu, Oct 27, 2022 at 01:26:03PM -0700, Andrei Vagin wrote:
> > From: Andrei Vagin <[email protected]>
> >
> > WF_SYNC means that the waker goes to sleep after wakeup, so the current
> > cpu can be considered idle if the waker is the only process that is
> > running on it.
> >
> > The perf pipe benchmark shows that this change reduces the average time
> > per operation from 8.8 usecs/op to 3.7 usecs/op.
> >
> > Before:
> > $ ./tools/perf/perf bench sched pipe
> > # Running 'sched/pipe' benchmark:
> > # Executed 1000000 pipe operations between two processes
> >
> > Total time: 8.813 [sec]
> >
> > 8.813985 usecs/op
> > 113456 ops/sec
> >
> > After:
> > $ ./tools/perf/perf bench sched pipe
> > # Running 'sched/pipe' benchmark:
> > # Executed 1000000 pipe operations between two processes
> >
> > Total time: 3.743 [sec]
> >
> > 3.743971 usecs/op
> > 267096 ops/sec
>
> But what, if anything, does it do for the myriad of other benchmarks we
> run?

Varies as expected.

For a basic IO benchmark like dbench, the headline difference is small
although on occasion bad for high thread counts. Variability tends to be
higher.

Tbench4 tended to look great for lower thread counts, as it's quite
synchronous, but regressed for larger thread counts.

perf pipe tends to look great as it's strictly synchronous. On one machine
(1 socket Skylake), it showed a regression of 27% but sometimes it was
the opposite with 70-80% gains depending on the machine.

Then something like netperf gets punished severely across all machines.
TCP_STREAM is variable, sometimes showing gains but mostly losses of around
5%, except for 1 machine. UDP_STREAM gets punished severely on all machines,
consistently showing losses in the 40-60% range even for the simple case of
running on a UMA machine:

netperf-udp
6.1.0-rc3 6.1.0-rc3
vanilla sched-wfsync-v1r1
Hmean send-64 235.21 ( 0.00%) 112.75 * -52.06%*
Hmean send-128 475.87 ( 0.00%) 227.27 * -52.24%*
Hmean send-256 968.82 ( 0.00%) 452.43 * -53.30%*
Hmean send-1024 3859.30 ( 0.00%) 1792.63 * -53.55%*
Hmean send-2048 7720.07 ( 0.00%) 3525.27 * -54.34%*
Hmean send-3312 12095.78 ( 0.00%) 5587.11 * -53.81%*
Hmean send-4096 14498.47 ( 0.00%) 6824.96 * -52.93%*
Hmean send-8192 25713.27 ( 0.00%) 12474.87 * -51.48%*
Hmean send-16384 43537.08 ( 0.00%) 23080.04 * -46.99%*

This is not too surprising, as UDP_STREAM is blasting packets, so there are
wakeups but the waker is not going to sleep immediately. So yeah, there are
cases where the patch helps, but when it hurts, it can hurt a lot. The patch
certainly demonstrates that there is room for improvement in how WF_SYNC is
treated, but as it stands, it would start a game of apply/revert ping-pong,
with different bisections showing that the patch caused one set of problems
and the revert caused another.

--
Mel Gorman
SUSE Labs

2022-11-16 19:14:11

by Andrei Vagin

Subject: Re: [PATCH] sched: consider WF_SYNC to find idle siblings

Hello Mel and Prateek,

Thank you both for running tests and publishing results here.

On Wed, Nov 16, 2022 at 2:57 AM Mel Gorman <[email protected]> wrote:
<snip>
>
> This is not too surprising as UDP_STREAM is blasting packets so there are
> wakeups but the waker is not going to sleep immediately. So yeah, there are
> cases where the patch helps but when it hurts, it can hurt a lot. The patch
> certainly demonstrates that there is room for improvement on how WF_SYNC is
> treated but as it stands, it would start a game of apply/revert ping-pong
> as different bisections showed the patch caused one set of problems and
> the revert caused another.

I agree with these conclusions. The situation is more complex than I
initially thought.

Thanks,
Andrei

2022-11-21 15:47:27

by kernel test robot

Subject: Re: [PATCH] sched: consider WF_SYNC to find idle siblings


Greetings,

FYI, we noticed a 53.3% improvement of netperf.Throughput_tps due to commit:


commit: b6aabb01e3004b0285d953761e1592974b4b50fa ("[PATCH] sched: consider WF_SYNC to find idle siblings")
url: https://github.com/intel-lab-lkp/linux/commits/Andrei-Vagin/sched-consider-WF_SYNC-to-find-idle-siblings/20221028-042823
base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git c36eae5a86d15a694968be35d7ff056854997a72
patch link: https://lore.kernel.org/all/[email protected]/
patch subject: [PATCH] sched: consider WF_SYNC to find idle siblings

in testcase: netperf
on test machine: 144 threads 4 sockets Intel(R) Xeon(R) Gold 5318H CPU @ 2.50GHz (Cooper Lake) with 128G memory
with following parameters:

ip: ipv4
runtime: 300s
nr_threads: 25%
cluster: cs-localhost
test: TCP_RR
cpufreq_governor: performance

test-description: Netperf is a benchmark that can be use to measure various aspect of networking performance.
test-url: http://www.netperf.org/netperf/

In addition to that, the commit also has a significant impact on the following tests:

+------------------+-------------------------------------------------------------------------------------------+
| testcase: change | unixbench: unixbench.score 36.2% improvement |
| test machine | 16 threads 1 sockets Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz (Coffee Lake) with 32G memory |
| test parameters | cpufreq_governor=performance |
| | nr_task=30% |
| | runtime=300s |
| | test=context1 |
+------------------+-------------------------------------------------------------------------------------------+




Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file

# if come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.

=========================================================================================
cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase:
cs-localhost/gcc-11/performance/ipv4/x86_64-rhel-8.3/25%/debian-11.1-x86_64-20220510.cgz/300s/lkp-cpl-4sp1/TCP_RR/netperf

commit:
c36eae5a86 ("sched/psi: Fix possible missing or delayed pending event")
b6aabb01e3 ("sched: consider WF_SYNC to find idle siblings")

c36eae5a86d15a69 b6aabb01e3004b0285d953761e1
---------------- ---------------------------
%stddev %change %stddev
\ | \
3611595 +53.3% 5537882 netperf.Throughput_total_tps
100322 +53.3% 153830 netperf.Throughput_tps
130329 +8.7e+05% 1.128e+09 ? 19% netperf.time.involuntary_context_switches
16743 ? 2% -22.6% 12962 netperf.time.minor_page_faults
1202 +16.4% 1399 ? 2% netperf.time.percent_of_cpu_this_job_got
3432 +12.6% 3865 ? 2% netperf.time.system_time
188.48 +85.8% 350.16 ? 4% netperf.time.user_time
1.083e+09 -50.8% 5.334e+08 ? 39% netperf.time.voluntary_context_switches
1.083e+09 +53.3% 1.661e+09 netperf.workload
2.212e+09 -97.0% 65981036 cpuidle..usage
84551 ? 5% -18.5% 68943 ? 3% meminfo.Active
84433 ? 5% -18.5% 68833 ? 3% meminfo.Active(anon)
79312 ? 6% -21.0% 62677 ? 5% numa-meminfo.node3.Active
79280 ? 6% -21.0% 62645 ? 5% numa-meminfo.node3.Active(anon)
19806 ? 6% -20.8% 15687 ? 5% numa-vmstat.node3.nr_active_anon
19806 ? 6% -20.8% 15687 ? 5% numa-vmstat.node3.nr_zone_active_anon
8.60 -2.4 6.16 mpstat.cpu.all.soft%
18.97 -1.9 17.04 mpstat.cpu.all.sys%
1.30 +0.5 1.85 ? 3% mpstat.cpu.all.usr%
37.00 +47.7% 54.67 vmstat.procs.r
14204693 -23.0% 10930950 vmstat.system.cs
295062 +6.5% 314162 vmstat.system.in
21070 ? 5% -18.5% 17172 ? 3% proc-vmstat.nr_active_anon
106997 -2.5% 104321 proc-vmstat.nr_anon_pages
409349 ? 4% +10.0% 450452 ? 4% proc-vmstat.nr_mapped
83608 -1.5% 82352 proc-vmstat.nr_slab_unreclaimable
21070 ? 5% -18.5% 17172 ? 3% proc-vmstat.nr_zone_active_anon
531984 ? 9% -15.9% 447458 ? 6% proc-vmstat.numa_hint_faults_local
24368 ? 7% -29.6% 17161 ? 4% proc-vmstat.pgactivate
7844373 -3.4% 7576615 proc-vmstat.pgalloc_normal
1450970 ? 2% -18.0% 1189301 proc-vmstat.pgfree
1358 -36.3% 864.67 turbostat.Avg_MHz
41.28 -15.0 26.27 turbostat.Busy%
1.281e+09 -100.0% 15898 ? 95% turbostat.C1
12.60 -12.6 0.00 turbostat.C1%
58.66 +25.5% 73.65 turbostat.CPU%c1
0.21 +77.0% 0.37 turbostat.IPC
14.47 ? 21% -14.5 0.00 turbostat.PKG_%
8.861e+08 -100.0% 4493 ? 14% turbostat.POLL
6.47 -6.5 0.00 turbostat.POLL%
571.65 -8.5% 522.90 turbostat.PkgWatt
8.27 -4.0% 7.94 turbostat.RAMWatt
27233 ? 32% -95.4% 1255 ?136% sched_debug.cfs_rq:/.MIN_vruntime.avg
750764 ? 9% -76.0% 180343 ?136% sched_debug.cfs_rq:/.MIN_vruntime.max
133981 ? 19% -88.8% 14993 ?136% sched_debug.cfs_rq:/.MIN_vruntime.stddev
0.26 +23.0% 0.32 ? 6% sched_debug.cfs_rq:/.h_nr_running.avg
1.03 ? 6% +88.1% 1.93 ? 4% sched_debug.cfs_rq:/.h_nr_running.max
0.42 +46.9% 0.62 ? 5% sched_debug.cfs_rq:/.h_nr_running.stddev
9751 ? 10% -32.8% 6552 ? 13% sched_debug.cfs_rq:/.load.avg
27233 ? 32% -95.4% 1255 ?136% sched_debug.cfs_rq:/.max_vruntime.avg
750764 ? 9% -76.0% 180343 ?136% sched_debug.cfs_rq:/.max_vruntime.max
133981 ? 19% -88.8% 14993 ?136% sched_debug.cfs_rq:/.max_vruntime.stddev
724533 ? 8% +35.8% 983881 ? 8% sched_debug.cfs_rq:/.min_vruntime.avg
845404 ? 10% +54.1% 1302855 ? 8% sched_debug.cfs_rq:/.min_vruntime.max
27934 ? 13% +297.8% 111130 ? 11% sched_debug.cfs_rq:/.min_vruntime.stddev
0.26 -18.8% 0.21 ? 4% sched_debug.cfs_rq:/.nr_running.avg
197.05 +90.2% 374.73 ? 4% sched_debug.cfs_rq:/.runnable_avg.avg
849.80 ? 4% +77.3% 1506 ? 2% sched_debug.cfs_rq:/.runnable_avg.max
209.48 +161.4% 547.51 sched_debug.cfs_rq:/.runnable_avg.stddev
114133 ? 29% +172.2% 310722 ? 39% sched_debug.cfs_rq:/.spread0.max
-65385 +469.6% -372412 sched_debug.cfs_rq:/.spread0.min
27949 ? 13% +297.7% 111154 ? 11% sched_debug.cfs_rq:/.spread0.stddev
196.98 +35.1% 266.15 ? 3% sched_debug.cfs_rq:/.util_avg.avg
849.66 ? 4% +35.1% 1147 ? 3% sched_debug.cfs_rq:/.util_avg.max
209.42 +86.8% 391.19 sched_debug.cfs_rq:/.util_avg.stddev
93.07 ? 4% +76.1% 163.93 ? 4% sched_debug.cfs_rq:/.util_est_enqueued.avg
664.98 ? 8% +48.2% 985.29 ? 3% sched_debug.cfs_rq:/.util_est_enqueued.max
162.79 +91.6% 311.99 sched_debug.cfs_rq:/.util_est_enqueued.stddev
250834 ? 9% +84.0% 461429 ? 5% sched_debug.cpu.avg_idle.avg
3157 ? 6% +939.5% 32825 ? 45% sched_debug.cpu.avg_idle.min
282325 ? 5% +16.5% 328926 sched_debug.cpu.avg_idle.stddev
7.58 ? 3% +34.3% 10.18 ? 2% sched_debug.cpu.clock.stddev
10236 ? 4% -13.5% 8854 ? 5% sched_debug.cpu.curr->pid.max
0.00 ? 13% +55.3% 0.00 ? 4% sched_debug.cpu.next_balance.stddev
0.23 ? 3% +41.6% 0.32 ? 5% sched_debug.cpu.nr_running.avg
1.03 ? 6% +81.6% 1.87 ? 5% sched_debug.cpu.nr_running.max
0.40 ? 2% +53.6% 0.62 ? 4% sched_debug.cpu.nr_running.stddev
13964454 ? 10% -31.5% 9562330 ? 9% sched_debug.cpu.nr_switches.avg
15141338 ? 10% -19.0% 12257576 ? 7% sched_debug.cpu.nr_switches.max
12639092 ? 9% -54.5% 5746871 ? 11% sched_debug.cpu.nr_switches.min
444327 ? 13% +150.9% 1114942 ? 10% sched_debug.cpu.nr_switches.stddev
950.00 -100.0% 0.00 sched_debug.rt_rq:/.rt_runtime.avg
950.00 -100.0% 0.00 sched_debug.rt_rq:/.rt_runtime.max
950.00 -100.0% 0.00 sched_debug.rt_rq:/.rt_runtime.min
12.33 -96.1% 0.48 ? 7% perf-stat.i.MPKI
2.876e+10 +11.9% 3.217e+10 perf-stat.i.branch-instructions
1.21 +0.1 1.26 perf-stat.i.branch-miss-rate%
3.406e+08 +17.8% 4.012e+08 perf-stat.i.branch-misses
1.90 ? 15% +34.8 36.70 ? 16% perf-stat.i.cache-miss-rate%
1.739e+09 -95.6% 76617839 ? 7% perf-stat.i.cache-references
14360622 -23.1% 11042127 perf-stat.i.context-switches
1.43 -46.7% 0.76 perf-stat.i.cpi
2.023e+11 -38.7% 1.239e+11 perf-stat.i.cpu-cycles
1322 +1109.9% 15995 ? 2% perf-stat.i.cpu-migrations
7928 ? 19% -35.8% 5089 ? 19% perf-stat.i.cycles-between-cache-misses
0.02 ? 6% -0.0 0.01 ? 4% perf-stat.i.dTLB-load-miss-rate%
6843508 ? 6% -32.1% 4644002 ? 3% perf-stat.i.dTLB-load-misses
4.213e+10 +15.6% 4.87e+10 perf-stat.i.dTLB-loads
0.00 ? 5% -0.0 0.00 ? 4% perf-stat.i.dTLB-store-miss-rate%
2.439e+10 +14.5% 2.793e+10 perf-stat.i.dTLB-stores
62.06 +2.8 64.90 perf-stat.i.iTLB-load-miss-rate%
1.596e+08 ? 2% +44.2% 2.302e+08 perf-stat.i.iTLB-load-misses
97319700 +27.8% 1.244e+08 perf-stat.i.iTLB-loads
1.429e+11 +14.0% 1.629e+11 perf-stat.i.instructions
918.99 ? 2% -19.1% 743.46 perf-stat.i.instructions-per-iTLB-miss
0.71 +85.9% 1.31 perf-stat.i.ipc
1.40 -38.7% 0.86 perf-stat.i.metric.GHz
873.16 +76.2% 1538 ? 3% perf-stat.i.metric.K/sec
673.72 +12.2% 755.60 perf-stat.i.metric.M/sec
82.06 ? 3% +9.3 91.32 perf-stat.i.node-store-miss-rate%
1049204 ? 16% +53.8% 1613542 ? 18% perf-stat.i.node-store-misses
443904 ? 6% -30.0% 310869 ? 6% perf-stat.i.node-stores
12.17 -96.1% 0.47 ? 7% perf-stat.overall.MPKI
1.18 +0.1 1.25 perf-stat.overall.branch-miss-rate%
1.75 ? 17% +36.3 38.07 ? 15% perf-stat.overall.cache-miss-rate%
1.42 -46.3% 0.76 perf-stat.overall.cpi
6820 ? 15% -35.7% 4383 ? 17% perf-stat.overall.cycles-between-cache-misses
0.02 ? 6% -0.0 0.01 ? 4% perf-stat.overall.dTLB-load-miss-rate%
0.00 ? 5% -0.0 0.00 ? 3% perf-stat.overall.dTLB-store-miss-rate%
62.11 +2.8 64.92 perf-stat.overall.iTLB-load-miss-rate%
895.64 ? 2% -21.0% 707.76 perf-stat.overall.instructions-per-iTLB-miss
0.71 +86.1% 1.31 perf-stat.overall.ipc
69.85 ? 6% +13.7 83.54 ? 2% perf-stat.overall.node-store-miss-rate%
39789 -25.6% 29606 perf-stat.overall.path-length
2.866e+10 +11.9% 3.206e+10 perf-stat.ps.branch-instructions
3.395e+08 +17.8% 3.999e+08 perf-stat.ps.branch-misses
1.733e+09 -95.6% 76361868 ? 7% perf-stat.ps.cache-references
14312085 -23.1% 11004648 perf-stat.ps.context-switches
2.016e+11 -38.7% 1.235e+11 perf-stat.ps.cpu-cycles
1317 +1109.8% 15942 ? 2% perf-stat.ps.cpu-migrations
6808921 ? 6% -32.1% 4621256 ? 3% perf-stat.ps.dTLB-load-misses
4.199e+10 +15.6% 4.854e+10 perf-stat.ps.dTLB-loads
2.431e+10 +14.5% 2.784e+10 perf-stat.ps.dTLB-stores
1.591e+08 ? 2% +44.2% 2.294e+08 perf-stat.ps.iTLB-load-misses
96991959 +27.8% 1.239e+08 perf-stat.ps.iTLB-loads
1.424e+11 +14.0% 1.624e+11 perf-stat.ps.instructions
0.01 ? 50% -72.0% 0.00 ?115% perf-stat.ps.major-faults
1046642 ? 16% +53.7% 1608395 ? 18% perf-stat.ps.node-store-misses
442450 ? 6% -29.9% 309961 ? 7% perf-stat.ps.node-stores
4.311e+13 +14.1% 4.919e+13 perf-stat.total.instructions
55.82 ? 5% -21.0 34.84 ? 14% perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
55.50 ? 5% -20.9 34.61 ? 14% perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
55.49 ? 5% -20.9 34.61 ? 14% perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
55.40 ? 5% -20.8 34.61 ? 14% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
48.23 ? 7% -13.7 34.56 ? 14% perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
46.09 ? 7% -11.7 34.40 ? 14% perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
46.05 ? 7% -11.7 34.36 ? 14% perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
10.29 ? 7% -10.3 0.00 perf-profile.calltrace.cycles-pp.poll_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
10.00 ? 6% -10.0 0.00 perf-profile.calltrace.cycles-pp.intel_idle_irq.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
9.36 ? 6% -9.4 0.00 perf-profile.calltrace.cycles-pp.mwait_idle_with_hints.intel_idle_irq.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
1.58 ? 7% -1.2 0.40 ? 72% perf-profile.calltrace.cycles-pp.dequeue_entity.dequeue_task_fair.__schedule.schedule.schedule_timeout
6.53 ? 6% -1.1 5.42 ? 13% perf-profile.calltrace.cycles-pp.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe.recv.send_omni_inner
0.00 +0.8 0.82 ? 8% perf-profile.calltrace.cycles-pp.switch_mm_irqs_off.__schedule.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare
0.00 +0.9 0.88 ? 42% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__send.recv_omni
0.00 +0.9 0.93 ? 9% perf-profile.calltrace.cycles-pp.switch_mm_irqs_off.__schedule.schedule.schedule_timeout.wait_woken
0.00 +1.0 0.98 ? 7% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__send
0.00 +1.0 0.98 ? 7% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__send
0.00 +1.0 0.98 ? 7% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__send
0.00 +1.0 1.03 ? 22% perf-profile.calltrace.cycles-pp.skb_release_data.__kfree_skb.tcp_clean_rtx_queue.tcp_ack.tcp_rcv_established
0.00 +1.1 1.13 ? 9% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.recv
0.00 +1.1 1.13 ? 9% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.recv
0.00 +1.1 1.13 ? 9% perf-profile.calltrace.cycles-pp.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe.recv
0.00 +1.2 1.15 ? 8% perf-profile.calltrace.cycles-pp.__kfree_skb.tcp_clean_rtx_queue.tcp_ack.tcp_rcv_established.tcp_v4_do_rcv
1.54 ? 6% +1.2 2.76 ? 8% perf-profile.calltrace.cycles-pp.tcp_clean_rtx_queue.tcp_ack.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv
0.00 +1.3 1.32 ? 7% perf-profile.calltrace.cycles-pp.skb_do_copy_data_nocache.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg.__sys_sendto
5.02 ? 7% +1.3 6.35 ? 8% perf-profile.calltrace.cycles-pp.wait_woken.sk_wait_data.tcp_recvmsg_locked.tcp_recvmsg.inet_recvmsg
1.33 ? 7% +1.4 2.74 ? 7% perf-profile.calltrace.cycles-pp.__dev_queue_xmit.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit
10.51 ? 5% +1.4 11.96 ? 8% perf-profile.calltrace.cycles-pp.tcp_v4_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog
8.66 ? 7% +1.5 10.14 ? 8% perf-profile.calltrace.cycles-pp.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish
14.38 ? 6% +1.5 15.87 ? 7% perf-profile.calltrace.cycles-pp.__softirqentry_text_start.do_softirq.__local_bh_enable_ip.ip_finish_output2.__ip_queue_xmit
8.75 ? 7% +1.5 10.27 ? 8% perf-profile.calltrace.cycles-pp.tcp_v4_do_rcv.tcp_v4_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core
1.06 ? 22% +1.5 2.60 ? 8% perf-profile.calltrace.cycles-pp.tcp_stream_alloc_skb.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg.__sys_sendto
14.55 ? 6% +1.5 16.09 ? 7% perf-profile.calltrace.cycles-pp.do_softirq.__local_bh_enable_ip.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb
14.67 ? 6% +1.6 16.26 ? 7% perf-profile.calltrace.cycles-pp.__local_bh_enable_ip.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit
13.28 ? 6% +1.7 14.93 ? 8% perf-profile.calltrace.cycles-pp.__sys_recvfrom.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe.recv
0.00 +1.7 1.67 ? 8% perf-profile.calltrace.cycles-pp.loopback_xmit.dev_hard_start_xmit.__dev_queue_xmit.ip_finish_output2.__ip_queue_xmit
10.67 ? 5% +1.7 12.38 ? 7% perf-profile.calltrace.cycles-pp.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog.__napi_poll
0.00 +1.7 1.75 ? 14% perf-profile.calltrace.cycles-pp.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.__wake_up_common.__wake_up_common_lock
2.39 ? 6% +1.8 4.16 ? 8% perf-profile.calltrace.cycles-pp.tcp_ack.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv.ip_protocol_deliver_rcu
10.70 ? 5% +1.8 12.47 ? 7% perf-profile.calltrace.cycles-pp.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog.__napi_poll.net_rx_action
0.00 +1.8 1.78 ? 8% perf-profile.calltrace.cycles-pp.dev_hard_start_xmit.__dev_queue_xmit.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb
0.00 +1.8 1.79 ? 13% perf-profile.calltrace.cycles-pp.ttwu_do_activate.try_to_wake_up.__wake_up_common.__wake_up_common_lock.sock_def_readable
0.00 +1.8 1.84 ? 8% perf-profile.calltrace.cycles-pp.__alloc_skb.tcp_stream_alloc_skb.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg
3.91 ? 7% +1.9 5.78 ? 8% perf-profile.calltrace.cycles-pp.schedule.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
3.68 ? 7% +1.9 5.58 ? 8% perf-profile.calltrace.cycles-pp.__schedule.schedule.schedule_timeout.wait_woken.sk_wait_data
4.03 ? 7% +1.9 5.97 ? 8% perf-profile.calltrace.cycles-pp.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked.tcp_recvmsg
0.00 +2.0 1.99 ? 8% perf-profile.calltrace.cycles-pp.__send
11.19 ? 6% +2.2 13.36 ? 7% perf-profile.calltrace.cycles-pp.__netif_receive_skb_one_core.process_backlog.__napi_poll.net_rx_action.__softirqentry_text_start
0.00 +2.3 2.26 ? 21% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__send.send_omni_inner
0.00 +2.3 2.29 ? 9% perf-profile.calltrace.cycles-pp.recv
11.56 ? 6% +2.5 14.09 ? 7% perf-profile.calltrace.cycles-pp.process_backlog.__napi_poll.net_rx_action.__softirqentry_text_start.do_softirq
0.00 +2.6 2.59 ? 8% perf-profile.calltrace.cycles-pp.__schedule.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode
11.58 ? 6% +2.6 14.18 ? 7% perf-profile.calltrace.cycles-pp.__napi_poll.net_rx_action.__softirqentry_text_start.do_softirq.__local_bh_enable_ip
0.00 +2.7 2.72 ? 8% perf-profile.calltrace.cycles-pp.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
0.00 +2.9 2.90 ? 8% perf-profile.calltrace.cycles-pp.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
16.38 ? 6% +3.4 19.76 ? 7% perf-profile.calltrace.cycles-pp.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames
0.00 +3.9 3.86 ? 3% perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__send
17.18 ? 6% +4.3 21.44 ? 6% perf-profile.calltrace.cycles-pp.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames.tcp_sendmsg_locked
12.91 ? 6% +4.6 17.50 ? 7% perf-profile.calltrace.cycles-pp.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe.__send.recv_omni
18.67 ? 6% +5.3 24.00 ? 6% perf-profile.calltrace.cycles-pp.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg
13.02 ? 6% +5.5 18.56 ? 7% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__send.recv_omni.process_requests
13.10 ? 6% +5.7 18.77 ? 7% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__send.recv_omni.process_requests.spawn_child
13.24 ? 6% +5.8 19.06 ? 7% perf-profile.calltrace.cycles-pp.__send.recv_omni.process_requests.spawn_child.accept_connection
20.18 ? 6% +6.3 26.50 ? 6% perf-profile.calltrace.cycles-pp.tcp_write_xmit.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg
20.27 ? 6% +6.5 26.78 ? 6% perf-profile.calltrace.cycles-pp.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg.__sys_sendto
12.70 ? 6% +6.5 19.22 ? 9% perf-profile.calltrace.cycles-pp.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe.__send.send_omni_inner
21.45 ? 6% +8.3 29.75 ? 7% perf-profile.calltrace.cycles-pp.recv_omni.process_requests.spawn_child.accept_connection.accept_connections
12.80 ? 6% +8.8 21.60 ? 10% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__send.send_omni_inner.send_tcp_rr
19.60 ? 7% +8.9 28.49 ? 7% perf-profile.calltrace.cycles-pp.send_omni_inner.send_tcp_rr.main.__libc_start_main
24.83 ? 19% +8.9 33.77 ? 15% perf-profile.calltrace.cycles-pp.mwait_idle_with_hints.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
24.83 ? 19% +8.9 33.77 ? 15% perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
19.76 ? 7% +9.0 28.80 ? 7% perf-profile.calltrace.cycles-pp.send_tcp_rr.main.__libc_start_main
11.96 ? 7% +9.4 21.31 ? 9% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__send.send_omni_inner.send_tcp_rr.main
24.07 ? 6% +9.4 33.44 ? 7% perf-profile.calltrace.cycles-pp.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg.__sys_sendto.__x64_sys_sendto
12.09 ? 7% +9.4 21.47 ? 9% perf-profile.calltrace.cycles-pp.__send.send_omni_inner.send_tcp_rr.main.__libc_start_main
24.63 ? 6% +9.8 34.41 ? 7% perf-profile.calltrace.cycles-pp.tcp_sendmsg.sock_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64
25.07 ? 6% +10.3 35.41 ? 7% perf-profile.calltrace.cycles-pp.sock_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe
25.49 ? 6% +11.0 36.46 ? 7% perf-profile.calltrace.cycles-pp.__sys_sendto.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe.__send
55.82 ? 5% -21.0 34.84 ? 14% perf-profile.children.cycles-pp.secondary_startup_64_no_verify
55.82 ? 5% -21.0 34.84 ? 14% perf-profile.children.cycles-pp.cpu_startup_entry
55.50 ? 5% -20.9 34.61 ? 14% perf-profile.children.cycles-pp.start_secondary
55.74 ? 5% -20.9 34.84 ? 14% perf-profile.children.cycles-pp.do_idle
48.52 ? 7% -13.7 34.79 ? 14% perf-profile.children.cycles-pp.cpuidle_idle_call
46.36 ? 7% -11.7 34.63 ? 14% perf-profile.children.cycles-pp.cpuidle_enter
46.34 ? 7% -11.7 34.63 ? 14% perf-profile.children.cycles-pp.cpuidle_enter_state
10.35 ? 7% -10.4 0.00 perf-profile.children.cycles-pp.poll_idle
10.06 ? 6% -10.1 0.00 perf-profile.children.cycles-pp.intel_idle_irq
3.04 ? 7% -2.3 0.72 ? 10% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
1.70 ? 5% -1.6 0.15 ? 16% perf-profile.children.cycles-pp.menu_select
1.33 ? 6% -1.3 0.03 ? 70% perf-profile.children.cycles-pp.ttwu_queue_wakelist
1.68 ? 7% -0.9 0.75 ? 8% perf-profile.children.cycles-pp.dequeue_entity
0.92 ? 7% -0.6 0.29 ? 18% perf-profile.children.cycles-pp.select_task_rq
0.73 ? 4% -0.6 0.10 ? 18% perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
0.90 ? 8% -0.6 0.28 ? 7% perf-profile.children.cycles-pp.skb_attempt_defer_free
0.79 ? 7% -0.6 0.24 ? 19% perf-profile.children.cycles-pp.select_task_rq_fair
0.66 ? 7% -0.5 0.19 ? 10% perf-profile.children.cycles-pp.sock_rfree
0.48 ? 3% -0.4 0.09 ? 20% perf-profile.children.cycles-pp.tick_nohz_next_event
1.11 ? 10% -0.4 0.74 ? 18% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
0.64 ? 5% -0.4 0.28 ? 11% perf-profile.children.cycles-pp.remove_wait_queue
0.96 ? 7% -0.3 0.62 ? 11% perf-profile.children.cycles-pp.__inet_lookup_established
0.81 ? 5% -0.3 0.47 ? 13% perf-profile.children.cycles-pp.prepare_task_switch
0.82 ? 7% -0.3 0.52 ? 7% perf-profile.children.cycles-pp._raw_spin_lock
0.57 ? 9% -0.2 0.33 ? 11% perf-profile.children.cycles-pp.tcp_check_space
1.10 ? 7% -0.2 0.88 ? 8% perf-profile.children.cycles-pp.enqueue_entity
0.56 ? 8% -0.2 0.35 ? 9% perf-profile.children.cycles-pp.copyout
0.55 ? 9% -0.2 0.34 ? 7% perf-profile.children.cycles-pp.perf_trace_sched_wakeup_template
0.35 ? 8% -0.2 0.16 ? 8% perf-profile.children.cycles-pp.kfree_skbmem
0.58 ? 7% -0.2 0.41 ? 7% perf-profile.children.cycles-pp.copy_user_short_string
0.31 ? 8% -0.2 0.15 ? 11% perf-profile.children.cycles-pp.perf_tp_event
0.20 ? 8% -0.1 0.05 ? 7% perf-profile.children.cycles-pp.skb_release_head_state
0.58 ? 7% -0.1 0.47 ? 8% perf-profile.children.cycles-pp.tcp_rcv_space_adjust
0.09 ? 15% -0.0 0.04 ? 45% perf-profile.children.cycles-pp.rebalance_domains
0.07 ? 11% -0.0 0.03 ? 70% perf-profile.children.cycles-pp.tracing_gen_ctx_irq_test
0.07 ? 7% +0.0 0.09 ? 7% perf-profile.children.cycles-pp.tcp_rate_skb_delivered
0.15 ? 7% +0.0 0.18 ? 9% perf-profile.children.cycles-pp.add_wait_queue
0.07 ? 7% +0.0 0.10 ? 10% perf-profile.children.cycles-pp.tcp_event_data_recv
0.04 ? 44% +0.0 0.08 ? 10% perf-profile.children.cycles-pp.place_entity
0.11 ? 11% +0.0 0.15 ? 15% perf-profile.children.cycles-pp.tcp_rearm_rto
0.06 ? 14% +0.0 0.11 ? 12% perf-profile.children.cycles-pp.ip_skb_dst_mtu
0.02 ?141% +0.0 0.06 ? 21% perf-profile.children.cycles-pp.timekeeping_max_deferment
0.05 ? 47% +0.0 0.09 ? 7% perf-profile.children.cycles-pp.nf_hook_slow
0.02 ?141% +0.0 0.06 ? 11% perf-profile.children.cycles-pp.__cgroup_account_cputime
0.22 ? 10% +0.1 0.27 ? 9% perf-profile.children.cycles-pp.__mod_timer
0.09 ? 14% +0.1 0.14 ? 19% perf-profile.children.cycles-pp.scheduler_tick
0.00 +0.1 0.05 ? 7% perf-profile.children.cycles-pp.minmax_running_min
0.00 +0.1 0.05 ? 8% perf-profile.children.cycles-pp.ip_local_deliver
0.00 +0.1 0.06 ? 13% perf-profile.children.cycles-pp.tcp_options_write
0.00 +0.1 0.06 ? 13% perf-profile.children.cycles-pp.inet_sendmsg
0.01 ?223% +0.1 0.06 ? 11% perf-profile.children.cycles-pp.skb_clone
0.00 +0.1 0.06 ? 13% perf-profile.children.cycles-pp.tcp_small_queue_check
0.00 +0.1 0.06 ? 13% perf-profile.children.cycles-pp.cubictcp_cong_avoid
0.00 +0.1 0.06 ? 18% perf-profile.children.cycles-pp.security_sock_rcv_skb
0.00 +0.1 0.06 ? 11% perf-profile.children.cycles-pp.tcp_chrono_stop
0.00 +0.1 0.06 ? 15% perf-profile.children.cycles-pp.eth_type_trans
0.00 +0.1 0.06 ? 14% perf-profile.children.cycles-pp.kmalloc_slab
0.00 +0.1 0.06 ? 17% perf-profile.children.cycles-pp.__raise_softirq_irqoff
0.06 ? 6% +0.1 0.12 ? 7% perf-profile.children.cycles-pp.sk_filter_trim_cap
0.00 +0.1 0.06 ? 11% perf-profile.children.cycles-pp.switch_ldt
0.00 +0.1 0.06 ? 14% perf-profile.children.cycles-pp.rb_insert_color
0.27 ? 9% +0.1 0.33 ? 7% perf-profile.children.cycles-pp.sk_reset_timer
0.10 ? 15% +0.1 0.17 ? 11% perf-profile.children.cycles-pp.perf_trace_sched_switch
0.10 ? 8% +0.1 0.16 ? 9% perf-profile.children.cycles-pp.__ip_finish_output
0.10 ? 14% +0.1 0.17 ? 11% perf-profile.children.cycles-pp.__tcp_cleanup_rbuf
0.08 ? 8% +0.1 0.15 ? 11% perf-profile.children.cycles-pp.update_min_vruntime
0.06 ? 9% +0.1 0.13 ? 9% perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
0.09 ? 6% +0.1 0.16 ? 9% perf-profile.children.cycles-pp.try_charge_memcg
0.00 +0.1 0.07 ? 10% perf-profile.children.cycles-pp.check_cfs_rq_runtime
0.00 +0.1 0.07 ? 10% perf-profile.children.cycles-pp.clear_buddies
0.04 ? 44% +0.1 0.11 ? 6% perf-profile.children.cycles-pp.tcp_established_options
0.00 +0.1 0.07 ? 14% perf-profile.children.cycles-pp.apparmor_socket_recvmsg
0.61 ? 7% +0.1 0.68 ? 6% perf-profile.children.cycles-pp.__switch_to
0.08 ? 6% +0.1 0.15 ? 9% perf-profile.children.cycles-pp.__copy_skb_header
0.00 +0.1 0.07 ? 20% perf-profile.children.cycles-pp.cubictcp_acked
0.06 ? 9% +0.1 0.13 ? 5% perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
0.00 +0.1 0.07 ? 12% perf-profile.children.cycles-pp.memcg_slab_post_alloc_hook
0.06 ? 11% +0.1 0.14 ? 9% perf-profile.children.cycles-pp.tcp_rate_skb_sent
0.08 ? 13% +0.1 0.16 ? 8% perf-profile.children.cycles-pp.rcu_all_qs
0.12 ? 12% +0.1 0.20 ? 8% perf-profile.children.cycles-pp.ip_send_check
0.06 ? 9% +0.1 0.13 ? 7% perf-profile.children.cycles-pp.inet_ehashfn
0.03 ? 70% +0.1 0.11 ? 8% perf-profile.children.cycles-pp.sk_free
0.06 ? 11% +0.1 0.14 ? 8% perf-profile.children.cycles-pp.raw_v4_input
0.00 +0.1 0.08 ? 12% perf-profile.children.cycles-pp.tcp_inbound_md5_hash
0.00 +0.1 0.08 ? 10% perf-profile.children.cycles-pp.neigh_hh_output
0.19 ? 17% +0.1 0.28 ? 18% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.04 ? 45% +0.1 0.13 ? 8% perf-profile.children.cycles-pp.tcp_rate_gen
0.00 +0.1 0.08 ? 5% perf-profile.children.cycles-pp.syscall_enter_from_user_mode
0.00 +0.1 0.09 ? 10% perf-profile.children.cycles-pp.apparmor_socket_sendmsg
0.58 ? 8% +0.1 0.67 ? 8% perf-profile.children.cycles-pp.tcp_event_new_data_sent
0.00 +0.1 0.09 ? 6% perf-profile.children.cycles-pp.check_stack_object
0.00 +0.1 0.09 ? 6% perf-profile.children.cycles-pp.tcp_v4_fill_cb
0.00 +0.1 0.09 ? 11% perf-profile.children.cycles-pp.tcp_newly_delivered
0.07 ? 13% +0.1 0.16 ? 5% perf-profile.children.cycles-pp.sock_put
0.10 ? 8% +0.1 0.19 ? 7% perf-profile.children.cycles-pp.__ksize
0.05 ? 45% +0.1 0.14 ? 13% perf-profile.children.cycles-pp.netif_skb_features
0.10 ? 9% +0.1 0.20 ? 14% perf-profile.children.cycles-pp.sk_forced_mem_schedule
0.04 ? 71% +0.1 0.13 ? 5% perf-profile.children.cycles-pp.__usecs_to_jiffies
0.09 ? 14% +0.1 0.19 ? 14% perf-profile.children.cycles-pp.tcp_rtt_estimator
0.00 +0.1 0.10 ? 4% perf-profile.children.cycles-pp.tcp_push
0.10 ? 10% +0.1 0.21 ? 7% perf-profile.children.cycles-pp.__tcp_select_window
0.63 ? 6% +0.1 0.74 ? 7% perf-profile.children.cycles-pp.simple_copy_to_iter
0.00 +0.1 0.11 ? 11% perf-profile.children.cycles-pp.resched_curr
0.07 ? 12% +0.1 0.19 ? 22% perf-profile.children.cycles-pp.cgroup_rstat_updated
0.43 ? 6% +0.1 0.54 ? 9% perf-profile.children.cycles-pp.update_rq_clock
0.12 ? 9% +0.1 0.24 ? 7% perf-profile.children.cycles-pp.ip_output
0.18 ? 9% +0.1 0.29 ? 14% perf-profile.children.cycles-pp.__calc_delta
0.50 ? 6% +0.1 0.63 ? 9% perf-profile.children.cycles-pp.set_next_entity
0.09 ? 6% +0.1 0.21 ? 7% perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
0.28 ? 7% +0.1 0.41 ? 8% perf-profile.children.cycles-pp.recv_data
0.00 +0.1 0.13 ? 8% perf-profile.children.cycles-pp.cubictcp_cwnd_event
0.04 ? 71% +0.1 0.17 ? 13% perf-profile.children.cycles-pp.skb_page_frag_refill
0.11 ? 8% +0.1 0.24 ? 10% perf-profile.children.cycles-pp.tcp_tso_segs
0.07 ? 14% +0.1 0.20 ? 10% perf-profile.children.cycles-pp.sk_page_frag_refill
0.11 ? 6% +0.1 0.25 ? 6% perf-profile.children.cycles-pp.__netif_receive_skb_core
0.30 ? 6% +0.1 0.44 ? 8% perf-profile.children.cycles-pp.__update_load_avg_cfs_rq
0.08 ? 18% +0.1 0.23 ? 19% perf-profile.children.cycles-pp.cpuacct_charge
0.29 ? 9% +0.1 0.44 ? 8% perf-profile.children.cycles-pp._raw_spin_lock_irq
0.08 ? 8% +0.1 0.23 ? 9% perf-profile.children.cycles-pp.ip_rcv_core
0.16 ? 9% +0.2 0.32 ? 8% perf-profile.children.cycles-pp.__ip_local_out
0.29 ? 6% +0.2 0.46 ? 6% perf-profile.children.cycles-pp.__update_load_avg_se
0.12 ? 6% +0.2 0.28 ? 10% perf-profile.children.cycles-pp.__list_del_entry_valid
0.35 ? 4% +0.2 0.52 ? 8% perf-profile.children.cycles-pp.update_cfs_group
0.14 ? 5% +0.2 0.31 ? 7% perf-profile.children.cycles-pp.copyin
0.11 ? 6% +0.2 0.28 ? 8% perf-profile.children.cycles-pp.refill_stock
0.18 ? 7% +0.2 0.35 ? 8% perf-profile.children.cycles-pp.__cond_resched
0.14 ? 7% +0.2 0.32 ? 8% perf-profile.children.cycles-pp.tcp_update_pacing_rate
0.11 ? 9% +0.2 0.28 ? 8% perf-profile.children.cycles-pp.tcp_update_skb_after_send
0.15 ? 10% +0.2 0.32 ? 15% perf-profile.children.cycles-pp.mem_cgroup_uncharge_skmem
0.17 ? 6% +0.2 0.34 ? 7% perf-profile.children.cycles-pp.syscall_return_via_sysret
0.12 ? 9% +0.2 0.30 ? 14% perf-profile.children.cycles-pp.validate_xmit_skb
0.15 ? 20% +0.2 0.33 ? 17% perf-profile.children.cycles-pp.find_busiest_group
0.07 ? 9% +0.2 0.26 ? 8% perf-profile.children.cycles-pp.__list_add_valid
0.18 ? 8% +0.2 0.36 ? 10% perf-profile.children.cycles-pp.ip_local_out
0.17 ? 6% +0.2 0.36 ? 17% perf-profile.children.cycles-pp.__mod_memcg_state
0.16 ? 4% +0.2 0.35 ? 9% perf-profile.children.cycles-pp.tcp_wfree
0.46 ? 6% +0.2 0.66 ? 7% perf-profile.children.cycles-pp.read_tsc
0.11 ? 9% +0.2 0.31 ? 8% perf-profile.children.cycles-pp.__might_fault
0.00 +0.2 0.20 ? 10% perf-profile.children.cycles-pp.set_next_buddy
0.12 ? 21% +0.2 0.33 ? 17% perf-profile.children.cycles-pp.update_sd_lb_stats
0.19 ? 17% +0.2 0.40 ? 17% perf-profile.children.cycles-pp.load_balance
0.06 ? 7% +0.2 0.27 ? 8% perf-profile.children.cycles-pp.__kmem_cache_free
0.10 ? 23% +0.2 0.32 ? 17% perf-profile.children.cycles-pp.update_sg_lb_stats
0.18 ? 8% +0.2 0.39 ? 6% perf-profile.children.cycles-pp.__might_sleep
0.23 ? 9% +0.2 0.45 ? 7% perf-profile.children.cycles-pp.send_data
0.35 ? 8% +0.2 0.58 ? 8% perf-profile.children.cycles-pp.__skb_clone
0.67 ? 6% +0.2 0.90 ? 9% perf-profile.children.cycles-pp.__switch_to_asm
0.07 ? 10% +0.2 0.31 ? 8% perf-profile.children.cycles-pp.kmem_cache_free
0.07 ? 12% +0.2 0.32 ? 6% perf-profile.children.cycles-pp.put_prev_entity
0.12 ? 10% +0.2 0.37 ? 8% perf-profile.children.cycles-pp.tcp_ack_update_rtt
0.24 ? 7% +0.3 0.50 ? 9% perf-profile.children.cycles-pp.mem_cgroup_charge_skmem
0.27 ? 12% +0.3 0.54 ? 12% perf-profile.children.cycles-pp.ip_rcv
0.48 ? 9% +0.3 0.75 ? 7% perf-profile.children.cycles-pp.release_sock
0.22 ? 8% +0.3 0.52 ? 6% perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
0.25 ? 7% +0.3 0.56 ? 8% perf-profile.children.cycles-pp.os_xsave
1.01 ? 6% +0.3 1.33 ? 8% perf-profile.children.cycles-pp.skb_release_data
0.56 ? 8% +0.3 0.87 ? 7% perf-profile.children.cycles-pp._raw_spin_lock_bh
0.24 ? 8% +0.3 0.56 ? 9% perf-profile.children.cycles-pp.__kmem_cache_alloc_node
0.14 ? 3% +0.3 0.47 ? 7% perf-profile.children.cycles-pp.__virt_addr_valid
0.20 ? 8% +0.3 0.54 ? 8% perf-profile.children.cycles-pp.tcp_schedule_loss_probe
0.83 ? 5% +0.3 1.17 ? 7% perf-profile.children.cycles-pp.__check_object_size
0.06 ? 9% +0.4 0.41 ? 9% perf-profile.children.cycles-pp.import_single_range
0.35 ? 7% +0.4 0.71 ? 7% perf-profile.children.cycles-pp.lock_sock_nested
0.13 ? 10% +0.4 0.50 ? 11% perf-profile.children.cycles-pp.ttwu_do_wakeup
0.10 ? 13% +0.4 0.47 ? 11% perf-profile.children.cycles-pp.check_preempt_curr
0.27 ? 6% +0.4 0.66 ? 9% perf-profile.children.cycles-pp.irqtime_account_irq
0.20 ? 9% +0.4 0.59 ? 7% perf-profile.children.cycles-pp.kmem_cache_alloc_node
0.84 ? 2% +0.4 1.23 ? 11% perf-profile.children.cycles-pp.ktime_get
0.30 ? 5% +0.4 0.70 ? 8% perf-profile.children.cycles-pp.security_socket_sendmsg
0.28 ? 7% +0.4 0.68 ? 9% perf-profile.children.cycles-pp.__kmalloc_node_track_caller
0.19 ? 5% +0.4 0.60 ? 9% perf-profile.children.cycles-pp.enqueue_to_backlog
0.94 ? 6% +0.4 1.34 ? 7% perf-profile.children.cycles-pp.update_load_avg
0.00 +0.4 0.41 ? 12% perf-profile.children.cycles-pp.check_preempt_wakeup
0.21 ? 6% +0.4 0.63 ? 8% perf-profile.children.cycles-pp.netif_rx_internal
0.21 ? 5% +0.4 0.64 ? 9% perf-profile.children.cycles-pp.__netif_rx
0.30 ? 8% +0.4 0.74 ? 9% perf-profile.children.cycles-pp.kmalloc_reserve
0.34 ? 5% +0.4 0.78 ? 7% perf-profile.children.cycles-pp.tcp_mstamp_refresh
0.26 ? 8% +0.4 0.71 ? 10% perf-profile.children.cycles-pp.reweight_entity
0.27 ? 11% +0.5 0.75 ? 9% perf-profile.children.cycles-pp.__fdget
0.41 ? 5% +0.5 0.89 ? 8% perf-profile.children.cycles-pp.__sk_mem_reduce_allocated
0.00 +0.5 0.50 ? 10% perf-profile.children.cycles-pp.kfree
0.22 ? 7% +0.5 0.76 ? 8% perf-profile.children.cycles-pp.security_socket_recvmsg
0.29 ? 6% +0.5 0.84 ? 7% perf-profile.children.cycles-pp._copy_from_iter
0.77 ? 8% +0.5 1.31 ? 9% perf-profile.children.cycles-pp.update_curr
0.35 ? 6% +0.6 0.91 ? 7% perf-profile.children.cycles-pp.aa_sk_perm
0.24 ? 8% +0.6 0.80 ? 8% perf-profile.children.cycles-pp.sock_recvmsg
0.43 ? 7% +0.6 0.99 ? 7% perf-profile.children.cycles-pp.__entry_text_start
0.33 ? 10% +0.6 0.91 ? 9% perf-profile.children.cycles-pp.sockfd_lookup_light
1.23 ? 8% +0.6 1.82 ? 7% perf-profile.children.cycles-pp.pick_next_task_fair
0.54 ? 7% +0.6 1.18 ? 8% perf-profile.children.cycles-pp.__kfree_skb
1.28 ? 7% +0.8 2.04 ? 8% perf-profile.children.cycles-pp.enqueue_task_fair
1.32 ? 7% +0.8 2.09 ? 8% perf-profile.children.cycles-pp.ttwu_do_activate
0.54 ? 6% +0.8 1.36 ? 7% perf-profile.children.cycles-pp.skb_do_copy_data_nocache
0.00 +1.0 1.01 ? 8% perf-profile.children.cycles-pp.restore_fpregs_from_fpstate
0.63 ? 6% +1.1 1.70 ? 8% perf-profile.children.cycles-pp.loopback_xmit
0.82 ? 7% +1.1 1.91 ? 8% perf-profile.children.cycles-pp.__alloc_skb
5.25 ? 7% +1.1 6.39 ? 8% perf-profile.children.cycles-pp.wait_woken
0.68 ? 5% +1.1 1.82 ? 8% perf-profile.children.cycles-pp.dev_hard_start_xmit
0.15 ? 8% +1.2 1.33 ? 8% perf-profile.children.cycles-pp.switch_fpu_return
1.65 ? 6% +1.2 2.87 ? 8% perf-profile.children.cycles-pp.tcp_clean_rtx_queue
9.07 ? 7% +1.4 10.45 ? 8% perf-profile.children.cycles-pp.tcp_rcv_established
10.89 ? 6% +1.4 12.30 ? 8% perf-profile.children.cycles-pp.tcp_v4_rcv
1.40 ? 6% +1.4 2.82 ? 8% perf-profile.children.cycles-pp.__dev_queue_xmit
9.14 ? 6% +1.4 10.56 ? 8% perf-profile.children.cycles-pp.tcp_v4_do_rcv
1.20 ? 7% +1.5 2.68 ? 8% perf-profile.children.cycles-pp.tcp_stream_alloc_skb
13.56 ? 7% +1.6 15.16 ? 8% perf-profile.children.cycles-pp.__sys_recvfrom
11.05 ? 6% +1.7 12.71 ? 8% perf-profile.children.cycles-pp.ip_protocol_deliver_rcu
13.66 ? 7% +1.7 15.36 ? 8% perf-profile.children.cycles-pp.__x64_sys_recvfrom
11.08 ? 6% +1.7 12.80 ? 8% perf-profile.children.cycles-pp.ip_local_deliver_finish
2.51 ? 6% +1.8 4.28 ? 8% perf-profile.children.cycles-pp.tcp_ack
4.22 ? 7% +1.8 6.02 ? 7% perf-profile.children.cycles-pp.schedule_timeout
0.10 ? 9% +2.1 2.19 ? 8% perf-profile.children.cycles-pp.switch_mm_irqs_off
11.50 ? 6% +2.2 13.70 ? 7% perf-profile.children.cycles-pp.__netif_receive_skb_one_core
11.89 ? 6% +2.5 14.38 ? 7% perf-profile.children.cycles-pp.process_backlog
11.90 ? 6% +2.6 14.46 ? 7% perf-profile.children.cycles-pp.__napi_poll
6.04 ? 7% +2.6 8.61 ? 7% perf-profile.children.cycles-pp.__schedule
0.00 +3.2 3.24 ? 7% perf-profile.children.cycles-pp.exit_to_user_mode_loop
16.72 ? 6% +3.3 20.05 ? 7% perf-profile.children.cycles-pp.ip_finish_output2
15.81 ? 7% +3.7 19.46 ? 8% perf-profile.children.cycles-pp.recv
17.54 ? 6% +4.2 21.74 ? 7% perf-profile.children.cycles-pp.__ip_queue_xmit
0.20 ? 8% +4.7 4.87 ? 7% perf-profile.children.cycles-pp.exit_to_user_mode_prepare
4.10 ? 7% +4.8 8.90 ? 7% perf-profile.children.cycles-pp.schedule
0.32 ? 6% +4.8 5.12 ? 7% perf-profile.children.cycles-pp.syscall_exit_to_user_mode
18.90 ? 6% +5.5 24.36 ? 7% perf-profile.children.cycles-pp.__tcp_transmit_skb
20.44 ? 6% +6.5 26.93 ? 7% perf-profile.children.cycles-pp.tcp_write_xmit
20.51 ? 6% +6.6 27.14 ? 7% perf-profile.children.cycles-pp.__tcp_push_pending_frames
21.50 ? 6% +8.3 29.79 ? 6% perf-profile.children.cycles-pp.send_tcp_rr
21.57 ? 6% +8.5 30.03 ? 7% perf-profile.children.cycles-pp.send_omni_inner
21.58 ? 6% +8.5 30.06 ? 7% perf-profile.children.cycles-pp.recv_omni
21.59 ? 6% +8.5 30.08 ? 7% perf-profile.children.cycles-pp.accept_connections
21.59 ? 6% +8.5 30.08 ? 7% perf-profile.children.cycles-pp.accept_connection
21.59 ? 6% +8.5 30.08 ? 7% perf-profile.children.cycles-pp.spawn_child
21.59 ? 6% +8.5 30.08 ? 7% perf-profile.children.cycles-pp.process_requests
24.97 ? 19% +9.0 34.00 ? 15% perf-profile.children.cycles-pp.intel_idle
24.26 ? 6% +9.3 33.55 ? 7% perf-profile.children.cycles-pp.tcp_sendmsg_locked
24.84 ? 6% +9.7 34.54 ? 7% perf-profile.children.cycles-pp.tcp_sendmsg
25.26 ? 6% +10.2 35.49 ? 7% perf-profile.children.cycles-pp.sock_sendmsg
25.68 ? 6% +10.9 36.56 ? 7% perf-profile.children.cycles-pp.__sys_sendto
25.80 ? 6% +11.0 36.82 ? 7% perf-profile.children.cycles-pp.__x64_sys_sendto
26.86 ? 6% +17.5 44.40 ? 7% perf-profile.children.cycles-pp.__send
40.08 ? 6% +17.8 57.92 ? 7% perf-profile.children.cycles-pp.do_syscall_64
40.47 ? 6% +18.4 58.88 ? 7% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
10.22 ? 7% -10.2 0.00 perf-profile.self.cycles-pp.poll_idle
3.03 ? 7% -2.3 0.70 ? 10% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
1.03 ? 7% -0.7 0.37 ? 9% perf-profile.self.cycles-pp.wait_woken
0.66 ? 7% -0.5 0.19 ? 10% perf-profile.self.cycles-pp.sock_rfree
0.91 ? 7% -0.4 0.49 ? 12% perf-profile.self.cycles-pp.__inet_lookup_established
0.64 ? 8% -0.4 0.26 ? 8% perf-profile.self.cycles-pp.tcp_rcv_established
0.92 ? 6% -0.4 0.55 ? 6% perf-profile.self.cycles-pp.skb_release_data
0.40 ? 9% -0.3 0.06 ? 6% perf-profile.self.cycles-pp.skb_attempt_defer_free
0.40 ? 8% -0.3 0.08 ? 9% perf-profile.self.cycles-pp.tcp_rcv_space_adjust
0.82 ? 7% -0.3 0.50 ? 8% perf-profile.self.cycles-pp._raw_spin_lock
0.54 ? 5% -0.3 0.24 ? 9% perf-profile.self.cycles-pp.check_heap_object
0.43 ? 6% -0.3 0.16 ? 14% perf-profile.self.cycles-pp.prepare_task_switch
0.57 ? 6% -0.3 0.32 ? 11% perf-profile.self.cycles-pp.__skb_datagram_iter
0.56 ? 9% -0.2 0.32 ? 10% perf-profile.self.cycles-pp.tcp_check_space
0.39 ? 7% -0.2 0.19 ? 11% perf-profile.self.cycles-pp.enqueue_entity
0.35 ? 8% -0.2 0.16 ? 8% perf-profile.self.cycles-pp.kfree_skbmem
0.57 ? 7% -0.2 0.40 ? 8% perf-profile.self.cycles-pp.copy_user_short_string
0.26 ? 9% -0.2 0.09 ? 7% perf-profile.self.cycles-pp.__wake_up_common
0.19 ? 7% -0.2 0.03 ? 70% perf-profile.self.cycles-pp.skb_release_head_state
0.22 ? 8% -0.2 0.07 ? 20% perf-profile.self.cycles-pp.perf_tp_event
0.21 ? 4% -0.2 0.06 ? 19% perf-profile.self.cycles-pp.cpuidle_enter_state
0.67 ? 7% -0.2 0.52 ? 8% perf-profile.self.cycles-pp.tcp_recvmsg_locked
0.25 ? 9% -0.1 0.14 ? 12% perf-profile.self.cycles-pp.dequeue_entity
0.14 ? 9% -0.1 0.05 ? 46% perf-profile.self.cycles-pp.select_task_rq
0.34 ? 3% -0.1 0.27 ? 14% perf-profile.self.cycles-pp.___perf_sw_event
0.16 ? 12% -0.1 0.10 ? 16% perf-profile.self.cycles-pp.perf_trace_sched_wakeup_template
0.20 ? 7% -0.0 0.16 ? 7% perf-profile.self.cycles-pp.finish_task_switch
0.09 ? 11% -0.0 0.06 ? 9% perf-profile.self.cycles-pp.check_preempt_curr
0.06 ? 7% -0.0 0.03 ? 70% perf-profile.self.cycles-pp.tracing_gen_ctx_irq_test
0.06 ? 11% +0.0 0.09 ? 9% perf-profile.self.cycles-pp.tcp_event_data_recv
0.05 ? 45% +0.0 0.08 ? 7% perf-profile.self.cycles-pp.put_prev_entity
0.22 ? 7% +0.0 0.25 ? 7% perf-profile.self.cycles-pp.__wrgsbase_inactive
0.10 ? 10% +0.0 0.14 ? 15% perf-profile.self.cycles-pp.tcp_rearm_rto
0.22 ? 6% +0.0 0.26 ? 8% perf-profile.self.cycles-pp.dequeue_task_fair
0.13 ? 6% +0.0 0.17 ? 11% perf-profile.self.cycles-pp.update_rq_clock
0.04 ? 47% +0.0 0.09 ? 18% perf-profile.self.cycles-pp.tcp_v4_do_rcv
0.06 ? 11% +0.0 0.10 ? 13% perf-profile.self.cycles-pp.ip_skb_dst_mtu
0.02 ?142% +0.0 0.06 ? 21% perf-profile.self.cycles-pp.timekeeping_max_deferment
0.02 ?141% +0.0 0.06 ? 11% perf-profile.self.cycles-pp.place_entity
0.01 ?223% +0.0 0.06 ? 13% perf-profile.self.cycles-pp.skb_clone
0.00 +0.1 0.05 perf-profile.self.cycles-pp.minmax_running_min
0.06 ? 13% +0.1 0.11 ? 9% perf-profile.self.cycles-pp.rcu_all_qs
0.00 +0.1 0.05 ? 7% perf-profile.self.cycles-pp.__netif_receive_skb_one_core
0.00 +0.1 0.05 ? 7% perf-profile.self.cycles-pp.clear_buddies
0.00 +0.1 0.05 ? 7% perf-profile.self.cycles-pp.tcp_small_queue_check
0.18 ? 7% +0.1 0.23 ? 10% perf-profile.self.cycles-pp.do_softirq
0.00 +0.1 0.05 ? 8% perf-profile.self.cycles-pp.ip_local_deliver
0.00 +0.1 0.06 ? 9% perf-profile.self.cycles-pp.__kmalloc_node_track_caller
0.00 +0.1 0.06 ? 13% perf-profile.self.cycles-pp.skb_copy_datagram_iter
0.00 +0.1 0.06 ? 13% perf-profile.self.cycles-pp.cubictcp_cong_avoid
0.00 +0.1 0.06 ? 11% perf-profile.self.cycles-pp.sk_reset_timer
0.00 +0.1 0.06 ? 11% perf-profile.self.cycles-pp.switch_ldt
0.00 +0.1 0.06 ? 11% perf-profile.self.cycles-pp.rb_insert_color
0.00 +0.1 0.06 ? 11% perf-profile.self.cycles-pp.remove_wait_queue
0.00 +0.1 0.06 ? 6% perf-profile.self.cycles-pp.__ip_finish_output
0.00 +0.1 0.06 ? 11% perf-profile.self.cycles-pp.nf_hook_slow
0.02 ?141% +0.1 0.08 ? 10% perf-profile.self.cycles-pp.tcp_send_mss
0.00 +0.1 0.06 ? 15% perf-profile.self.cycles-pp.eth_type_trans
0.13 ? 8% +0.1 0.19 ? 10% perf-profile.self.cycles-pp.schedule_timeout
0.00 +0.1 0.06 ? 13% perf-profile.self.cycles-pp.memcg_slab_post_alloc_hook
0.00 +0.1 0.06 ? 16% perf-profile.self.cycles-pp.kmalloc_reserve
0.00 +0.1 0.06 ? 9% perf-profile.self.cycles-pp.sk_filter_trim_cap
0.00 +0.1 0.06 ? 16% perf-profile.self.cycles-pp.kmalloc_slab
0.06 ? 13% +0.1 0.12 ? 8% perf-profile.self.cycles-pp.tcp_recvmsg
0.00 +0.1 0.06 ? 6% perf-profile.self.cycles-pp.tcp_inbound_md5_hash
0.00 +0.1 0.06 ? 17% perf-profile.self.cycles-pp.apparmor_socket_recvmsg
0.00 +0.1 0.06 ? 11% perf-profile.self.cycles-pp.__wake_up_common_lock
0.07 ? 12% +0.1 0.14 ? 14% perf-profile.self.cycles-pp.update_min_vruntime
0.06 ? 11% +0.1 0.12 ? 10% perf-profile.self.cycles-pp.tcp_rate_skb_sent
0.04 ? 45% +0.1 0.11 ? 9% perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
0.08 ? 6% +0.1 0.14 ? 6% perf-profile.self.cycles-pp.ip_output
0.04 ? 44% +0.1 0.11 ? 5% perf-profile.self.cycles-pp.tcp_established_options
0.01 ?223% +0.1 0.08 ? 9% perf-profile.self.cycles-pp.lock_sock_nested
0.10 ? 16% +0.1 0.17 ? 11% perf-profile.self.cycles-pp.perf_trace_sched_switch
0.08 ? 8% +0.1 0.16 ? 10% perf-profile.self.cycles-pp.try_charge_memcg
0.00 +0.1 0.07 ? 14% perf-profile.self.cycles-pp.tcp_stream_alloc_skb
0.00 +0.1 0.07 ? 20% perf-profile.self.cycles-pp.cubictcp_acked
0.59 ? 6% +0.1 0.66 ? 6% perf-profile.self.cycles-pp.__switch_to
0.07 ? 7% +0.1 0.14 ? 11% perf-profile.self.cycles-pp.irqtime_account_irq
0.07 ? 8% +0.1 0.14 ? 11% perf-profile.self.cycles-pp.__copy_skb_header
0.06 ? 9% +0.1 0.13 ? 5% perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
0.00 +0.1 0.07 ? 10% perf-profile.self.cycles-pp.skb_do_copy_data_nocache
0.04 ? 44% +0.1 0.12 ? 10% perf-profile.self.cycles-pp.dev_hard_start_xmit
0.11 ? 6% +0.1 0.19 ? 11% perf-profile.self.cycles-pp._copy_to_iter
0.04 ? 71% +0.1 0.11 ? 8% perf-profile.self.cycles-pp.mem_cgroup_uncharge_skmem
0.03 ? 70% +0.1 0.11 ? 9% perf-profile.self.cycles-pp.syscall_exit_to_user_mode
0.06 ? 8% +0.1 0.14 ? 7% perf-profile.self.cycles-pp.raw_v4_input
0.03 ? 70% +0.1 0.11 ? 8% perf-profile.self.cycles-pp.sk_free
0.00 +0.1 0.08 ? 8% perf-profile.self.cycles-pp.neigh_hh_output
0.26 ? 7% +0.1 0.34 ? 9% perf-profile.self.cycles-pp.recv_data
0.12 ? 11% +0.1 0.20 ? 8% perf-profile.self.cycles-pp.ip_send_check
0.00 +0.1 0.08 ? 7% perf-profile.self.cycles-pp.syscall_enter_from_user_mode
0.06 ? 11% +0.1 0.15 ? 18% perf-profile.self.cycles-pp.validate_xmit_skb
0.00 +0.1 0.08 ? 5% perf-profile.self.cycles-pp.__napi_poll
0.00 +0.1 0.08 ? 5% perf-profile.self.cycles-pp.tcp_v4_fill_cb
0.00 +0.1 0.08 ? 11% perf-profile.self.cycles-pp.apparmor_socket_sendmsg
0.04 ? 45% +0.1 0.13 ? 8% perf-profile.self.cycles-pp.inet_ehashfn
0.04 ? 45% +0.1 0.13 ? 8% perf-profile.self.cycles-pp.tcp_rate_gen
0.02 ?141% +0.1 0.10 ? 8% perf-profile.self.cycles-pp.tcp_mstamp_refresh
0.00 +0.1 0.08 ? 14% perf-profile.self.cycles-pp.netif_skb_features
0.00 +0.1 0.09 ? 15% perf-profile.self.cycles-pp.__might_fault
0.00 +0.1 0.09 ? 7% perf-profile.self.cycles-pp.tcp_newly_delivered
0.10 ? 6% +0.1 0.19 ? 14% perf-profile.self.cycles-pp.sk_forced_mem_schedule
0.09 ? 7% +0.1 0.18 ? 9% perf-profile.self.cycles-pp.__ksize
0.00 +0.1 0.09 ? 6% perf-profile.self.cycles-pp.ip_local_deliver_finish
0.00 +0.1 0.09 ? 6% perf-profile.self.cycles-pp.check_stack_object
0.08 ? 14% +0.1 0.18 ? 16% perf-profile.self.cycles-pp.tcp_rtt_estimator
0.06 ? 11% +0.1 0.16 ? 5% perf-profile.self.cycles-pp.sock_put
0.10 ? 10% +0.1 0.20 ? 10% perf-profile.self.cycles-pp.__cond_resched
0.22 ? 8% +0.1 0.32 ? 9% perf-profile.self.cycles-pp.schedule
0.06 ? 9% +0.1 0.16 ? 9% perf-profile.self.cycles-pp.sockfd_lookup_light
0.00 +0.1 0.10 ? 12% perf-profile.self.cycles-pp.resched_curr
0.11 ? 6% +0.1 0.21 ? 20% perf-profile.self.cycles-pp.__mod_memcg_state
0.13 ? 9% +0.1 0.23 ? 19% perf-profile.self.cycles-pp.select_task_rq_fair
0.10 ? 10% +0.1 0.21 ? 7% perf-profile.self.cycles-pp.__tcp_select_window
0.07 ? 5% +0.1 0.17 ? 8% perf-profile.self.cycles-pp.enqueue_to_backlog
0.09 ? 5% +0.1 0.20 ? 12% perf-profile.self.cycles-pp.__x64_sys_recvfrom
0.06 ? 14% +0.1 0.17 ? 25% perf-profile.self.cycles-pp.cgroup_rstat_updated
0.00 +0.1 0.10 ? 4% perf-profile.self.cycles-pp.tcp_ack_update_rtt
0.00 +0.1 0.10 ? 4% perf-profile.self.cycles-pp.tcp_push
0.01 ?223% +0.1 0.11 ? 6% perf-profile.self.cycles-pp.__usecs_to_jiffies
0.35 ? 7% +0.1 0.45 ? 7% perf-profile.self.cycles-pp.update_load_avg
0.09 ? 7% +0.1 0.20 ? 8% perf-profile.self.cycles-pp.sock_sendmsg
0.00 +0.1 0.11 ? 9% perf-profile.self.cycles-pp.ip_rcv
0.12 ? 6% +0.1 0.23 ? 9% perf-profile.self.cycles-pp.security_socket_sendmsg
0.16 ? 10% +0.1 0.27 ? 8% perf-profile.self.cycles-pp.sk_wait_data
0.00 +0.1 0.11 ? 8% perf-profile.self.cycles-pp.__ip_local_out
0.09 ? 14% +0.1 0.20 ? 8% perf-profile.self.cycles-pp.mem_cgroup_charge_skmem
0.18 ? 10% +0.1 0.29 ? 14% perf-profile.self.cycles-pp.__calc_delta
0.09 ? 6% +0.1 0.21 ? 8% perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
0.24 ? 6% +0.1 0.36 ? 11% perf-profile.self.cycles-pp.__local_bh_enable_ip
0.00 +0.1 0.13 ? 8% perf-profile.self.cycles-pp.cubictcp_cwnd_event
0.14 ? 7% +0.1 0.27 ? 5% perf-profile.self.cycles-pp.pick_next_task_fair
0.04 ? 71% +0.1 0.17 ? 11% perf-profile.self.cycles-pp.skb_page_frag_refill
0.15 ? 7% +0.1 0.28 ? 8% perf-profile.self.cycles-pp.switch_fpu_return
0.11 ? 6% +0.1 0.24 ? 9% perf-profile.self.cycles-pp.tcp_sendmsg
0.07 ? 14% +0.1 0.20 ? 8% perf-profile.self.cycles-pp.tcp_update_skb_after_send
0.15 ? 4% +0.1 0.29 ? 9% perf-profile.self.cycles-pp.__sk_mem_reduce_allocated
0.29 ? 6% +0.1 0.42 ? 8% perf-profile.self.cycles-pp.__update_load_avg_cfs_rq
0.29 ? 9% +0.1 0.42 ? 8% perf-profile.self.cycles-pp._raw_spin_lock_irq
0.10 ? 7% +0.1 0.24 ? 9% perf-profile.self.cycles-pp.tcp_tso_segs
0.11 ? 6% +0.1 0.25 ? 7% perf-profile.self.cycles-pp.__netif_receive_skb_core
0.07 ? 12% +0.1 0.21 ? 5% perf-profile.self.cycles-pp.__tcp_push_pending_frames
0.34 ? 9% +0.1 0.48 ? 8% perf-profile.self.cycles-pp.update_curr
0.18 ? 9% +0.1 0.32 ? 7% perf-profile.self.cycles-pp.enqueue_task_fair
0.08 ? 19% +0.1 0.22 ? 18% perf-profile.self.cycles-pp.cpuacct_charge
0.08 ? 8% +0.1 0.23 ? 9% perf-profile.self.cycles-pp.ip_rcv_core
0.01 ?223% +0.1 0.16 ? 12% perf-profile.self.cycles-pp.inet_recvmsg
0.29 ? 5% +0.2 0.44 ? 6% perf-profile.self.cycles-pp.__update_load_avg_se
0.10 ? 5% +0.2 0.25 ? 10% perf-profile.self.cycles-pp.__list_del_entry_valid
0.10 ? 6% +0.2 0.26 ? 8% perf-profile.self.cycles-pp.process_backlog
0.12 ? 8% +0.2 0.27 ? 7% perf-profile.self.cycles-pp.__x64_sys_sendto
0.00 +0.2 0.16 ? 14% perf-profile.self.cycles-pp.check_preempt_wakeup
0.28 ? 8% +0.2 0.44 ? 6% perf-profile.self.cycles-pp.__skb_clone
0.12 ? 8% +0.2 0.28 ? 7% perf-profile.self.cycles-pp.tcp_schedule_loss_probe
0.34 ? 4% +0.2 0.51 ? 8% perf-profile.self.cycles-pp.update_cfs_group
0.16 ? 6% +0.2 0.33 ? 7% perf-profile.self.cycles-pp.syscall_return_via_sysret
0.14 ? 7% +0.2 0.31 ? 8% perf-profile.self.cycles-pp.tcp_update_pacing_rate
0.11 ? 6% +0.2 0.28 ? 8% perf-profile.self.cycles-pp.refill_stock
0.09 ? 13% +0.2 0.26 ? 8% perf-profile.self.cycles-pp.ip_protocol_deliver_rcu
0.00 +0.2 0.17 ? 11% perf-profile.self.cycles-pp.security_socket_recvmsg
0.17 ? 9% +0.2 0.34 ? 8% perf-profile.self.cycles-pp.__alloc_skb
0.08 ? 25% +0.2 0.26 ? 17% perf-profile.self.cycles-pp.update_sg_lb_stats
0.27 ? 5% +0.2 0.46 ? 7% perf-profile.self.cycles-pp.send_omni_inner
0.06 ? 7% +0.2 0.25 ? 8% perf-profile.self.cycles-pp.__list_add_valid
0.22 ? 8% +0.2 0.40 ? 11% perf-profile.self.cycles-pp.recv
0.00 +0.2 0.19 ? 8% perf-profile.self.cycles-pp.exit_to_user_mode_loop
0.20 ? 8% +0.2 0.39 ? 8% perf-profile.self.cycles-pp.send_data
0.38 ? 8% +0.2 0.57 ? 23% perf-profile.self.cycles-pp.ktime_get
0.45 ? 7% +0.2 0.65 ? 7% perf-profile.self.cycles-pp.read_tsc
0.16 ? 4% +0.2 0.35 ? 9% perf-profile.self.cycles-pp.tcp_wfree
0.00 +0.2 0.20 ? 9% perf-profile.self.cycles-pp.set_next_buddy
0.16 ? 9% +0.2 0.36 ? 6% perf-profile.self.cycles-pp.__might_sleep
0.06 ? 7% +0.2 0.27 ? 8% perf-profile.self.cycles-pp.__kmem_cache_free
0.32 ? 5% +0.2 0.53 ? 9% perf-profile.self.cycles-pp.__sys_recvfrom
0.25 ? 14% +0.2 0.46 ? 9% perf-profile.self.cycles-pp.recv_omni
0.62 ? 7% +0.2 0.84 ? 8% perf-profile.self.cycles-pp.__schedule
0.16 ? 7% +0.2 0.38 ? 9% perf-profile.self.cycles-pp.__kmem_cache_alloc_node
0.19 ? 8% +0.2 0.42 ? 7% perf-profile.self.cycles-pp.__softirqentry_text_start
0.67 ? 6% +0.2 0.89 ? 8% perf-profile.self.cycles-pp.__switch_to_asm
0.09 ? 7% +0.2 0.32 ? 8% perf-profile.self.cycles-pp._copy_from_iter
0.18 ? 6% +0.2 0.42 ? 5% perf-profile.self.cycles-pp.__send
0.10 ? 11% +0.2 0.34 ? 10% perf-profile.self.cycles-pp.__check_object_size
0.07 ? 10% +0.2 0.31 ? 8% perf-profile.self.cycles-pp.kmem_cache_free
0.12 ? 7% +0.3 0.38 ? 8% perf-profile.self.cycles-pp.kmem_cache_alloc_node
0.26 ? 6% +0.3 0.52 ? 7% perf-profile.self.cycles-pp.do_syscall_64
0.21 ? 7% +0.3 0.48 ? 7% perf-profile.self.cycles-pp.__entry_text_start
0.22 ? 7% +0.3 0.50 ? 8% perf-profile.self.cycles-pp.__sys_sendto
0.22 ? 8% +0.3 0.50 ? 6% perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
0.53 ? 7% +0.3 0.84 ? 7% perf-profile.self.cycles-pp._raw_spin_lock_bh
0.25 ? 7% +0.3 0.56 ? 8% perf-profile.self.cycles-pp.os_xsave
0.13 ? 4% +0.3 0.46 ? 7% perf-profile.self.cycles-pp.__virt_addr_valid
0.16 ? 7% +0.3 0.49 ? 7% perf-profile.self.cycles-pp.loopback_xmit
0.05 ? 7% +0.3 0.40 ? 9% perf-profile.self.cycles-pp.import_single_range
0.32 ? 6% +0.4 0.67 ? 9% perf-profile.self.cycles-pp.ip_finish_output2
0.01 ?223% +0.4 0.36 ? 9% perf-profile.self.cycles-pp.exit_to_user_mode_prepare
0.21 ? 8% +0.4 0.59 ? 8% perf-profile.self.cycles-pp.aa_sk_perm
0.27 ? 6% +0.4 0.69 ? 6% perf-profile.self.cycles-pp.__ip_queue_xmit
0.29 ? 6% +0.4 0.73 ? 8% perf-profile.self.cycles-pp.tcp_write_xmit
0.27 ? 10% +0.5 0.74 ? 10% perf-profile.self.cycles-pp.__fdget
0.00 +0.5 0.49 ? 10% perf-profile.self.cycles-pp.kfree
0.66 ? 6% +0.5 1.16 ? 8% perf-profile.self.cycles-pp.__tcp_transmit_skb
0.38 ? 6% +0.6 0.96 ? 8% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.00 +1.0 1.01 ? 8% perf-profile.self.cycles-pp.restore_fpregs_from_fpstate
0.10 ? 9% +2.1 2.17 ? 7% perf-profile.self.cycles-pp.switch_mm_irqs_off


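Both workloads in these reports are synchronous ping-pong loops: the netperf TCP_RR run profiled above exchanges one request and one response per transaction (the send_tcp_rr/recv_omni frames), and unixbench's context1 test below bounces a counter between two processes through a pair of pipes. In every iteration the waker immediately blocks waiting for the reply, which is exactly the WF_SYNC case the patch keys on. A minimal sketch of that pattern, for illustration only (not the benchmarks' actual source; the payload size and iteration count are arbitrary):

	#include <stdio.h>
	#include <unistd.h>
	#include <sys/wait.h>

	int main(void)
	{
		int p2c[2], c2p[2];	/* parent->child, child->parent */
		unsigned long n = 0;

		if (pipe(p2c) || pipe(c2p)) {
			perror("pipe");
			return 1;
		}

		if (fork() == 0) {
			/* Child: echo the counter back, then sleep in read(). */
			while (read(p2c[0], &n, sizeof(n)) == sizeof(n))
				write(c2p[1], &n, sizeof(n));
			_exit(0);
		}

		for (int i = 0; i < 1000000; i++) {
			write(p2c[1], &n, sizeof(n));	/* wakes the child ... */
			read(c2p[0], &n, sizeof(n));	/* ... then blocks for the reply */
			n++;
		}
		close(p2c[1]);	/* EOF terminates the child */
		wait(NULL);
		return 0;
	}

With the patch, the wakee can be placed on the waker's own CPU rather than an idle sibling, which is at least consistent with the profile shifts above (poll_idle dropping to zero while switch_mm_irqs_off grows).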
***************************************************************************************************
lkp-cfl-e1: 16 threads 1 socket Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz (Coffee Lake) with 32G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase:
gcc-11/performance/x86_64-rhel-8.3/30%/debian-11.1-x86_64-20220510.cgz/300s/lkp-cfl-e1/context1/unixbench

commit:
c36eae5a86 ("sched/psi: Fix possible missing or delayed pending event")
b6aabb01e3 ("sched: consider WF_SYNC to find idle siblings")

c36eae5a86d15a69 b6aabb01e3004b0285d953761e1
---------------- ---------------------------
%stddev %change %stddev
\ | \
3547 +36.2% 4834 unixbench.score
1202 ± 4% +3.2e+05% 3838646 ± 17% unixbench.time.involuntary_context_switches
146.17 +4.7% 153.00 unixbench.time.percent_of_cpu_this_job_got
533.06 +1.7% 542.03 unixbench.time.system_time
40.02 +40.3% 56.14 unixbench.time.user_time
4.156e+08 +34.9% 5.604e+08 unixbench.time.voluntary_context_switches
5.537e+08 +36.3% 7.548e+08 unixbench.workload
1.53 +0.5 2.01 mpstat.cpu.all.usr%
4.434e+09 +13.2% 5.022e+09 cpuidle..time
8.352e+08 -98.7% 11131648 ± 12% cpuidle..usage
3.00 +33.3% 4.00 vmstat.procs.r
4212747 -32.1% 2861762 vmstat.system.cs
40960 -29.9% 28703 vmstat.system.in
161422 -32.1% 109641 meminfo.Active
161345 -32.1% 109569 meminfo.Active(anon)
530075 -10.9% 472307 meminfo.Committed_AS
80181 -27.2% 58384 meminfo.Mapped
162280 -32.2% 110057 meminfo.Shmem
40427 -31.8% 27576 proc-vmstat.nr_active_anon
978441 -1.3% 965308 proc-vmstat.nr_file_pages
20137 -26.7% 14769 proc-vmstat.nr_mapped
40661 -31.9% 27701 proc-vmstat.nr_shmem
13329 -0.9% 13206 proc-vmstat.nr_slab_unreclaimable
40427 -31.8% 27576 proc-vmstat.nr_zone_active_anon
1751058 -14.2% 1502883 proc-vmstat.numa_hit
1751425 -14.1% 1504002 proc-vmstat.numa_local
92726 -17.3% 76685 proc-vmstat.pgactivate
1819543 -13.7% 1570329 proc-vmstat.pgalloc_normal
684646 -2.4% 668006 proc-vmstat.pgfault
1807696 -13.7% 1559418 proc-vmstat.pgfree
40680 -1.8% 39968 proc-vmstat.pgreuse
1740 -47.5% 912.83 turbostat.Avg_MHz
38.54 -18.6 19.90 turbostat.Busy%
4515 +1.6% 4586 turbostat.Bzy_MHz
3885351 ± 9% -82.8% 667034 ± 73% turbostat.C1
26.41 ± 6% +50.7% 39.80 ± 19% turbostat.CPU%c6
74.05 -31.1% 50.99 ± 2% turbostat.CorWatt
60.17 -18.3% 49.17 ± 2% turbostat.CoreTmp
0.22 ± 2% +83.6% 0.41 turbostat.IPC
16161117 -30.0% 11319624 turbostat.IRQ
8.241e+08 -100.0% 3881 ± 26% turbostat.POLL
9.50 -9.5 0.00 turbostat.POLL%
58.67 ± 2% -17.0% 48.67 ± 2% turbostat.PkgTmp
74.52 -30.9% 51.46 ± 2% turbostat.PkgWatt
489.94 -35.7% 315.24 turbostat.Totl%C0
4851 ± 56% -100.0% 0.00 sched_debug.cfs_rq:/.MIN_vruntime.avg
56902 ± 44% -100.0% 0.00 sched_debug.cfs_rq:/.MIN_vruntime.max
15406 ± 47% -100.0% 0.00 sched_debug.cfs_rq:/.MIN_vruntime.stddev
0.44 ± 9% -14.4% 0.38 ± 2% sched_debug.cfs_rq:/.h_nr_running.avg
127353 +10.5% 140708 ± 3% sched_debug.cfs_rq:/.load.stddev
120.76 ± 8% +25.4% 151.41 ± 17% sched_debug.cfs_rq:/.load_avg.avg
4851 ± 56% -100.0% 0.00 sched_debug.cfs_rq:/.max_vruntime.avg
56902 ± 44% -100.0% 0.00 sched_debug.cfs_rq:/.max_vruntime.max
15406 ± 47% -100.0% 0.00 sched_debug.cfs_rq:/.max_vruntime.stddev
197189 -9.1% 179171 ± 2% sched_debug.cfs_rq:/.min_vruntime.max
116754 ± 6% +12.9% 131851 ± 3% sched_debug.cfs_rq:/.min_vruntime.min
21217 ± 10% -39.9% 12759 ± 18% sched_debug.cfs_rq:/.min_vruntime.stddev
0.44 ± 7% -18.5% 0.36 ± 3% sched_debug.cfs_rq:/.nr_running.avg
366.23 +21.1% 443.60 ± 3% sched_debug.cfs_rq:/.runnable_avg.avg
714.74 ± 4% +81.8% 1299 ± 3% sched_debug.cfs_rq:/.runnable_avg.max
246.75 ± 5% +101.3% 496.81 ± 3% sched_debug.cfs_rq:/.runnable_avg.stddev
51496 ± 44% -66.9% 17048 ± 49% sched_debug.cfs_rq:/.spread0.max
21217 ± 10% -39.9% 12760 ± 18% sched_debug.cfs_rq:/.spread0.stddev
708.50 ± 4% +49.8% 1061 ± 4% sched_debug.cfs_rq:/.util_avg.max
245.43 ± 5% +63.7% 401.71 ± 3% sched_debug.cfs_rq:/.util_avg.stddev
107.96 ± 11% +42.0% 153.34 ± 4% sched_debug.cfs_rq:/.util_est_enqueued.avg
499.12 ± 2% +65.3% 825.00 ± 4% sched_debug.cfs_rq:/.util_est_enqueued.max
181.05 ± 4% +58.9% 287.67 ± 4% sched_debug.cfs_rq:/.util_est_enqueued.stddev
407511 ± 4% +110.5% 857930 sched_debug.cpu.avg_idle.avg
104621 +462.1% 588066 ± 11% sched_debug.cpu.avg_idle.min
350206 ± 2% -68.5% 110282 ± 21% sched_debug.cpu.avg_idle.stddev
0.35 ± 18% +89.2% 0.67 ± 3% sched_debug.cpu.clock.stddev
0.00 ± 37% +68.1% 0.00 ± 9% sched_debug.cpu.next_balance.stddev
1.07 ± 6% +55.6% 1.67 ± 9% sched_debug.cpu.nr_running.max
0.46 ± 3% +24.0% 0.57 ± 4% sched_debug.cpu.nr_running.stddev
47178388 -32.4% 31875847 sched_debug.cpu.nr_switches.avg
56953632 ± 2% -36.6% 36093819 ± 2% sched_debug.cpu.nr_switches.max
31935712 ± 9% -13.3% 27680720 ± 3% sched_debug.cpu.nr_switches.min
6579393 ± 14% -64.6% 2330670 ± 21% sched_debug.cpu.nr_switches.stddev
950.00 -100.0% 0.00 sched_debug.rt_rq:/.rt_runtime.avg
950.00 -100.0% 0.00 sched_debug.rt_rq:/.rt_runtime.max
950.00 -100.0% 0.00 sched_debug.rt_rq:/.rt_runtime.min
65.64 -30.8% 45.42 perf-stat.i.MPKI
4.603e+09 -7.6% 4.252e+09 perf-stat.i.branch-instructions
2.81 +0.1 2.89 perf-stat.i.branch-miss-rate%
49172917 -1.4% 48506007 perf-stat.i.branch-misses
0.79 +2.6 3.39 ± 7% perf-stat.i.cache-miss-rate%
1887466 -4.2% 1808731 ± 2% perf-stat.i.cache-misses
5.404e+08 -94.1% 32037290 ± 3% perf-stat.i.cache-references
4247507 -32.1% 2884302 perf-stat.i.context-switches
1.99 -24.4% 1.50 perf-stat.i.cpi
2.784e+10 -47.7% 1.457e+10 perf-stat.i.cpu-cycles
52.17 ± 2% +634.1% 383.01 ± 4% perf-stat.i.cpu-migrations
56513 ± 2% -59.7% 22750 ± 9% perf-stat.i.cycles-between-cache-misses
410439 ± 2% +23.7% 507559 ± 7% perf-stat.i.dTLB-load-misses
6.418e+09 -3.1% 6.22e+09 perf-stat.i.dTLB-loads
3.826e+09 -4.0% 3.674e+09 perf-stat.i.dTLB-stores
95.43 -38.9 56.58 perf-stat.i.iTLB-load-miss-rate%
16498563 -5.9% 15523524 perf-stat.i.iTLB-load-misses
239550 ± 3% +5782.0% 14090457 perf-stat.i.iTLB-loads
2.197e+10 -4.8% 2.092e+10 perf-stat.i.instructions
0.70 +78.1% 1.25 perf-stat.i.ipc
1.74 -47.7% 0.91 perf-stat.i.metric.GHz
404.88 ± 2% -4.8% 385.49 ± 2% perf-stat.i.metric.K/sec
961.54 -7.8% 886.77 perf-stat.i.metric.M/sec
1575 -2.7% 1533 perf-stat.i.minor-faults
0.40 ± 4% +0.1 0.47 ± 7% perf-stat.i.node-load-miss-rate%
261859 -14.5% 223787 perf-stat.i.node-stores
1575 -2.7% 1533 perf-stat.i.page-faults
24.60 -93.8% 1.53 ± 3% perf-stat.overall.MPKI
1.07 +0.1 1.14 perf-stat.overall.branch-miss-rate%
0.35 +5.3 5.67 ± 5% perf-stat.overall.cache-miss-rate%
1.27 -45.0% 0.70 perf-stat.overall.cpi
14726 -45.3% 8048 ± 2% perf-stat.overall.cycles-between-cache-misses
0.01 ± 2% +0.0 0.01 ± 7% perf-stat.overall.dTLB-load-miss-rate%
98.57 -46.1 52.42 perf-stat.overall.iTLB-load-miss-rate%
1331 +1.2% 1347 perf-stat.overall.instructions-per-iTLB-miss
0.79 +81.9% 1.44 perf-stat.overall.ipc
0.00 ± 11% +0.0 0.01 ± 10% perf-stat.overall.node-store-miss-rate%
15528 -30.1% 10847 perf-stat.overall.path-length
4.591e+09 -7.6% 4.24e+09 perf-stat.ps.branch-instructions
49052787 -1.4% 48388408 perf-stat.ps.branch-misses
1885579 -4.2% 1807271 ± 2% perf-stat.ps.cache-misses
5.389e+08 -94.1% 31961586 ± 3% perf-stat.ps.cache-references
4235546 -32.1% 2876157 perf-stat.ps.context-switches
2.777e+10 -47.6% 1.454e+10 perf-stat.ps.cpu-cycles
52.05 ± 2% +633.9% 381.95 ± 4% perf-stat.ps.cpu-migrations
409410 ± 2% +23.7% 506263 ± 7% perf-stat.ps.dTLB-load-misses
6.4e+09 -3.1% 6.203e+09 perf-stat.ps.dTLB-loads
3.815e+09 -4.0% 3.664e+09 perf-stat.ps.dTLB-stores
16452159 -5.9% 15479733 perf-stat.ps.iTLB-load-misses
238902 ± 3% +5781.4% 14050699 perf-stat.ps.iTLB-loads
2.19e+10 -4.8% 2.086e+10 perf-stat.ps.instructions
1571 -2.7% 1529 perf-stat.ps.minor-faults
261165 -14.5% 223199 perf-stat.ps.node-stores
1571 -2.7% 1529 perf-stat.ps.page-faults
8.598e+12 -4.8% 8.188e+12 perf-stat.total.instructions
57.34 -35.5 21.87 perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
53.70 ? 2% -32.8 20.91 perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
53.66 ? 2% -32.8 20.91 perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
53.51 ? 2% -32.6 20.91 perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
21.60 -21.6 0.00 perf-profile.calltrace.cycles-pp.poll_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
38.36 ? 2% -17.5 20.86 perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
35.56 ? 2% -14.1 21.44 perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
33.37 ? 2% -12.8 20.57 perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
7.88 ? 2% -7.9 0.00 perf-profile.calltrace.cycles-pp.flush_smp_call_function_queue.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
5.64 -5.6 0.00 perf-profile.calltrace.cycles-pp.sched_ttwu_pending.flush_smp_call_function_queue.do_idle.cpu_startup_entry.start_secondary
3.63 ? 24% -2.7 0.96 ? 31% perf-profile.calltrace.cycles-pp.start_kernel.secondary_startup_64_no_verify
3.63 ? 24% -2.7 0.96 ? 31% perf-profile.calltrace.cycles-pp.arch_call_rest_init.start_kernel.secondary_startup_64_no_verify
3.63 ? 24% -2.7 0.96 ? 31% perf-profile.calltrace.cycles-pp.rest_init.arch_call_rest_init.start_kernel.secondary_startup_64_no_verify
3.63 ? 24% -2.7 0.96 ? 31% perf-profile.calltrace.cycles-pp.cpu_startup_entry.rest_init.arch_call_rest_init.start_kernel.secondary_startup_64_no_verify
3.62 ? 24% -2.7 0.96 ? 31% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.rest_init.arch_call_rest_init.start_kernel
2.64 ? 18% -1.7 0.96 ? 31% perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.rest_init.arch_call_rest_init
2.31 ? 15% -1.4 0.94 ? 31% perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.rest_init
4.08 ? 2% -1.4 2.73 perf-profile.calltrace.cycles-pp.dequeue_entity.dequeue_task_fair.__schedule.schedule.pipe_read
1.90 ? 3% -0.8 1.11 ? 5% perf-profile.calltrace.cycles-pp.select_task_rq_fair.select_task_rq.try_to_wake_up.autoremove_wake_function.__wake_up_common
2.12 ? 2% -0.8 1.32 ? 3% perf-profile.calltrace.cycles-pp.select_task_rq.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
0.86 ? 4% -0.6 0.26 ?100% perf-profile.calltrace.cycles-pp.copy_user_short_string.copyout._copy_to_iter.copy_page_to_iter.pipe_read
1.60 ? 2% -0.4 1.16 ? 3% perf-profile.calltrace.cycles-pp.update_curr.dequeue_entity.dequeue_task_fair.__schedule.schedule
1.12 ? 2% -0.4 0.70 ? 3% perf-profile.calltrace.cycles-pp.mutex_lock.pipe_write.vfs_write.ksys_write.do_syscall_64
0.97 ? 4% -0.2 0.75 ? 5% perf-profile.calltrace.cycles-pp.update_load_avg.dequeue_entity.dequeue_task_fair.__schedule.schedule
1.20 ? 4% -0.2 1.00 ? 5% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.prepare_to_wait_event.pipe_read.vfs_read.ksys_read
1.04 ? 4% -0.1 0.90 ? 5% perf-profile.calltrace.cycles-pp.copyout._copy_to_iter.copy_page_to_iter.pipe_read.vfs_read
1.31 ? 3% +0.1 1.40 ? 2% perf-profile.calltrace.cycles-pp._copy_to_iter.copy_page_to_iter.pipe_read.vfs_read.ksys_read
1.41 +0.1 1.53 perf-profile.calltrace.cycles-pp.mutex_lock.pipe_read.vfs_read.ksys_read.do_syscall_64
1.04 +0.2 1.21 ? 4% perf-profile.calltrace.cycles-pp.perf_trace_sched_wakeup_template.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
2.68 ? 2% +0.2 2.85 perf-profile.calltrace.cycles-pp.prepare_to_wait_event.pipe_read.vfs_read.ksys_read.do_syscall_64
0.56 ? 4% +0.2 0.80 ? 7% perf-profile.calltrace.cycles-pp.security_file_permission.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.42 ? 2% +0.3 1.70 perf-profile.calltrace.cycles-pp.copy_page_to_iter.pipe_read.vfs_read.ksys_read.do_syscall_64
0.26 ?100% +0.4 0.69 ? 6% perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.vfs_write.ksys_write.do_syscall_64
0.56 ? 3% +0.5 1.07 ? 8% perf-profile.calltrace.cycles-pp.prepare_task_switch.__schedule.schedule.pipe_read.vfs_read
0.00 +0.6 0.56 ? 11% perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
0.00 +0.6 0.56 ? 7% perf-profile.calltrace.cycles-pp.__might_resched.__might_fault._copy_from_iter.copy_page_from_iter.pipe_write
0.00 +0.6 0.58 ? 3% perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.vfs_read.ksys_read.do_syscall_64
0.00 +0.6 0.58 ? 5% perf-profile.calltrace.cycles-pp.native_sched_clock.sched_clock_cpu.update_rq_clock.__schedule.schedule
0.00 +0.6 0.59 ? 3% perf-profile.calltrace.cycles-pp.native_sched_clock.sched_clock_cpu.update_rq_clock.try_to_wake_up.autoremove_wake_function
0.00 +0.6 0.62 ? 6% perf-profile.calltrace.cycles-pp.sched_clock_cpu.update_rq_clock.__schedule.schedule.pipe_read
0.00 +0.6 0.63 ? 2% perf-profile.calltrace.cycles-pp.sched_clock_cpu.update_rq_clock.try_to_wake_up.autoremove_wake_function.__wake_up_common
0.00 +0.6 0.63 ? 4% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.finish_wait.pipe_read.vfs_read.ksys_read
0.00 +0.7 0.66 ? 10% perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
0.00 +0.7 0.66 ? 4% perf-profile.calltrace.cycles-pp.__wrgsbase_inactive.read
0.00 +0.7 0.68 ? 8% perf-profile.calltrace.cycles-pp.___perf_sw_event.prepare_task_switch.__schedule.schedule.pipe_read
0.00 +0.7 0.69 ? 4% perf-profile.calltrace.cycles-pp.security_file_permission.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.00 +0.7 0.70 ? 6% perf-profile.calltrace.cycles-pp.update_load_avg.dequeue_task_fair.__schedule.schedule.pipe_read
0.00 +0.7 0.70 ? 8% perf-profile.calltrace.cycles-pp.mutex_unlock.pipe_read.vfs_read.ksys_read.do_syscall_64
0.65 ? 5% +0.7 1.37 ? 3% perf-profile.calltrace.cycles-pp.os_xsave.read
1.26 ? 4% +0.7 1.98 ? 2% perf-profile.calltrace.cycles-pp._copy_from_iter.copy_page_from_iter.pipe_write.vfs_write.ksys_write
0.00 +0.7 0.72 ? 10% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
0.00 +0.7 0.74 ? 5% perf-profile.calltrace.cycles-pp.copyin._copy_from_iter.copy_page_from_iter.pipe_write.vfs_write
0.00 +0.8 0.76 ? 3% perf-profile.calltrace.cycles-pp.update_load_avg.set_next_entity.pick_next_task_fair.__schedule.schedule
0.00 +0.8 0.76 ? 6% perf-profile.calltrace.cycles-pp.__calc_delta.update_curr.reweight_entity.dequeue_task_fair.__schedule
0.00 +0.8 0.80 ? 4% perf-profile.calltrace.cycles-pp.__calc_delta.update_curr.reweight_entity.enqueue_task_fair.ttwu_do_activate
0.00 +0.8 0.81 ? 4% perf-profile.calltrace.cycles-pp.update_load_avg.enqueue_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up
0.00 +0.8 0.82 ? 9% perf-profile.calltrace.cycles-pp.__switch_to.read
0.00 +0.8 0.82 ? 4% perf-profile.calltrace.cycles-pp.finish_wait.pipe_read.vfs_read.ksys_read.do_syscall_64
0.00 +0.9 0.87 ? 3% perf-profile.calltrace.cycles-pp.update_rq_clock.__schedule.schedule.pipe_read.vfs_read
0.00 +0.9 0.88 ? 4% perf-profile.calltrace.cycles-pp.update_load_avg.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function
0.00 +0.9 0.89 ? 3% perf-profile.calltrace.cycles-pp.main
0.00 +0.9 0.91 ? 5% perf-profile.calltrace.cycles-pp.update_cfs_group.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function
0.00 +0.9 0.91 ? 5% perf-profile.calltrace.cycles-pp.update_rq_clock.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
0.00 +0.9 0.94 ? 2% perf-profile.calltrace.cycles-pp.__switch_to.__schedule.schedule.pipe_read.vfs_read
1.43 ? 3% +1.0 2.38 ? 2% perf-profile.calltrace.cycles-pp.copy_page_from_iter.pipe_write.vfs_write.ksys_write.do_syscall_64
0.00 +1.0 0.95 ? 2% perf-profile.calltrace.cycles-pp.update_cfs_group.dequeue_task_fair.__schedule.schedule.pipe_read
0.00 +1.0 1.01 ? 5% perf-profile.calltrace.cycles-pp.update_curr.reweight_entity.dequeue_task_fair.__schedule.schedule
0.00 +1.0 1.01 ? 4% perf-profile.calltrace.cycles-pp.check_preempt_wakeup.check_preempt_curr.ttwu_do_wakeup.try_to_wake_up.autoremove_wake_function
0.91 ? 2% +1.0 1.93 ? 3% perf-profile.calltrace.cycles-pp.__switch_to_asm.read
0.00 +1.0 1.04 ? 4% perf-profile.calltrace.cycles-pp.update_curr.reweight_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up
0.00 +1.2 1.18 ? 3% perf-profile.calltrace.cycles-pp.set_next_entity.pick_next_task_fair.__schedule.schedule.pipe_read
0.00 +1.3 1.27 ? 4% perf-profile.calltrace.cycles-pp.check_preempt_curr.ttwu_do_wakeup.try_to_wake_up.autoremove_wake_function.__wake_up_common
0.00 +1.3 1.30 ? 4% perf-profile.calltrace.cycles-pp.update_curr.enqueue_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up
0.00 +1.4 1.38 ? 3% perf-profile.calltrace.cycles-pp.ttwu_do_wakeup.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
0.54 ? 3% +1.4 1.94 ? 3% perf-profile.calltrace.cycles-pp.__entry_text_start.read
0.00 +1.6 1.58 ? 3% perf-profile.calltrace.cycles-pp.restore_fpregs_from_fpstate.switch_fpu_return.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
0.17 ?141% +1.7 1.84 ? 3% perf-profile.calltrace.cycles-pp.__entry_text_start.write
0.00 +1.7 1.67 ? 3% perf-profile.calltrace.cycles-pp.reweight_entity.dequeue_task_fair.__schedule.schedule.pipe_read
0.00 +1.7 1.68 ? 2% perf-profile.calltrace.cycles-pp.reweight_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function
0.60 ? 5% +2.1 2.66 ? 2% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
0.00 +2.2 2.21 ? 2% perf-profile.calltrace.cycles-pp.switch_fpu_return.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.58 ? 6% +2.2 2.82 perf-profile.calltrace.cycles-pp.pick_next_task_fair.__schedule.schedule.pipe_read.vfs_read
0.00 +2.3 2.30 ? 3% perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
4.70 +2.4 7.11 perf-profile.calltrace.cycles-pp.dequeue_task_fair.__schedule.schedule.pipe_read.vfs_read
0.00 +3.1 3.15 ? 3% perf-profile.calltrace.cycles-pp.enqueue_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function
0.00 +5.1 5.05 ? 2% perf-profile.calltrace.cycles-pp.switch_mm_irqs_off.__schedule.schedule.pipe_read.vfs_read
0.00 +7.3 7.32 perf-profile.calltrace.cycles-pp.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common
7.83 +7.3 15.18 perf-profile.calltrace.cycles-pp.__wake_up_common.__wake_up_common_lock.pipe_write.vfs_write.ksys_write
0.00 +7.5 7.52 perf-profile.calltrace.cycles-pp.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
6.88 ? 2% +7.6 14.49 perf-profile.calltrace.cycles-pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_write
8.24 +7.8 16.01 perf-profile.calltrace.cycles-pp.__wake_up_common_lock.pipe_write.vfs_write.ksys_write.do_syscall_64
7.11 +7.8 14.94 perf-profile.calltrace.cycles-pp.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_write.vfs_write
11.86 ? 6% +8.6 20.50 ? 2% perf-profile.calltrace.cycles-pp.mwait_idle_with_hints.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
11.86 ? 6% +8.6 20.51 ? 2% perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
12.29 +9.5 21.75 perf-profile.calltrace.cycles-pp.pipe_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
13.40 +10.0 23.36 perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
13.83 +10.4 24.21 perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
14.17 +10.9 25.10 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
14.33 +11.1 25.46 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write
9.16 +11.3 20.49 perf-profile.calltrace.cycles-pp.__schedule.schedule.pipe_read.vfs_read.ksys_read
9.44 +11.9 21.39 perf-profile.calltrace.cycles-pp.schedule.pipe_read.vfs_read.ksys_read.do_syscall_64
15.53 +13.3 28.86 perf-profile.calltrace.cycles-pp.write
18.03 +14.7 32.74 perf-profile.calltrace.cycles-pp.pipe_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
19.37 +15.7 35.10 perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
19.90 +16.3 36.21 perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
20.88 +18.6 39.51 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
21.30 +19.5 40.84 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.read
24.96 +24.5 49.46 perf-profile.calltrace.cycles-pp.read
57.34 -35.5 21.87 perf-profile.children.cycles-pp.secondary_startup_64_no_verify
57.34 -35.5 21.87 perf-profile.children.cycles-pp.cpu_startup_entry
57.17 -35.3 21.87 perf-profile.children.cycles-pp.do_idle
53.70 ? 2% -32.8 20.91 perf-profile.children.cycles-pp.start_secondary
21.62 -21.6 0.00 perf-profile.children.cycles-pp.poll_idle
41.05 -19.2 21.83 perf-profile.children.cycles-pp.cpuidle_idle_call
35.70 ? 2% -14.2 21.51 perf-profile.children.cycles-pp.cpuidle_enter
35.64 ? 2% -14.1 21.51 perf-profile.children.cycles-pp.cpuidle_enter_state
8.40 -8.4 0.00 perf-profile.children.cycles-pp.flush_smp_call_function_queue
6.02 -6.0 0.00 perf-profile.children.cycles-pp.sched_ttwu_pending
5.20 ? 2% -5.2 0.00 perf-profile.children.cycles-pp.schedule_idle
4.01 -3.8 0.25 ? 11% perf-profile.children.cycles-pp.menu_select
3.63 ? 24% -2.7 0.96 ? 31% perf-profile.children.cycles-pp.start_kernel
3.63 ? 24% -2.7 0.96 ? 31% perf-profile.children.cycles-pp.arch_call_rest_init
3.63 ? 24% -2.7 0.96 ? 31% perf-profile.children.cycles-pp.rest_init
2.59 ? 3% -2.3 0.32 ? 10% perf-profile.children.cycles-pp.ttwu_queue_wakelist
1.83 ? 2% -1.7 0.09 ? 20% perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
4.18 ? 2% -1.4 2.81 perf-profile.children.cycles-pp.dequeue_entity
1.13 ? 2% -1.1 0.04 ? 72% perf-profile.children.cycles-pp.tick_nohz_next_event
1.92 ? 3% -0.8 1.11 ? 5% perf-profile.children.cycles-pp.select_task_rq_fair
2.12 ? 2% -0.8 1.34 ? 4% perf-profile.children.cycles-pp.select_task_rq
0.72 ? 3% -0.7 0.06 ? 47% perf-profile.children.cycles-pp.ktime_get
1.53 ? 2% -0.6 0.90 ? 4% perf-profile.children.cycles-pp._raw_spin_lock
1.86 -0.5 1.33 perf-profile.children.cycles-pp.sched_clock_cpu
1.63 ? 2% -0.4 1.20 perf-profile.children.cycles-pp.native_sched_clock
3.03 ? 3% -0.4 2.65 ? 3% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
1.45 ± 3% -0.4 1.09 ± 8% perf-profile.children.cycles-pp.prepare_task_switch
2.58 -0.3 2.30 perf-profile.children.cycles-pp.mutex_lock
0.53 ± 4% -0.2 0.34 ± 6% perf-profile.children.cycles-pp.finish_task_switch
1.05 ± 4% -0.1 0.90 ± 5% perf-profile.children.cycles-pp.copyout
1.38 ± 5% -0.1 1.26 ± 3% perf-profile.children.cycles-pp.set_next_entity
0.17 ± 10% -0.1 0.09 ± 4% perf-profile.children.cycles-pp.rcu_note_context_switch
0.18 ± 9% -0.0 0.13 ± 16% perf-profile.children.cycles-pp.place_entity
0.10 ± 6% -0.0 0.05 ± 48% perf-profile.children.cycles-pp.rb_next
0.08 ± 11% -0.0 0.06 ± 11% perf-profile.children.cycles-pp.sched_clock
0.06 ± 11% +0.0 0.10 ± 13% perf-profile.children.cycles-pp.rw_verify_area
0.08 ± 11% +0.0 0.12 ± 11% perf-profile.children.cycles-pp.__x64_sys_read
0.30 ± 8% +0.0 0.34 ± 8% perf-profile.children.cycles-pp.perf_trace_buf_update
0.02 ±141% +0.0 0.06 ± 11% perf-profile.children.cycles-pp.clockevents_program_event
0.06 ± 45% +0.0 0.10 ± 13% perf-profile.children.cycles-pp.__softirqentry_text_start
0.04 ± 71% +0.0 0.08 ± 8% perf-profile.children.cycles-pp.perf_swevent_get_recursion_context
0.04 ± 71% +0.1 0.09 ± 14% perf-profile.children.cycles-pp.inode_needs_update_time
0.03 ±102% +0.1 0.08 ± 19% perf-profile.children.cycles-pp.save_fpregs_to_fpstate
0.22 ± 6% +0.1 0.28 ± 11% perf-profile.children.cycles-pp.put_prev_entity
0.01 ±223% +0.1 0.06 ± 11% perf-profile.children.cycles-pp.perf_mux_hrtimer_handler
0.08 ± 14% +0.1 0.13 ± 16% perf-profile.children.cycles-pp.__irq_exit_rcu
0.12 ± 6% +0.1 0.17 ± 4% perf-profile.children.cycles-pp.perf_trace_buf_alloc
0.00 +0.1 0.06 ± 13% perf-profile.children.cycles-pp.default_wake_function
0.31 ± 5% +0.1 0.38 ± 6% perf-profile.children.cycles-pp.__rdgsbase_inactive
0.09 ± 9% +0.1 0.16 ± 10% perf-profile.children.cycles-pp.__x64_sys_write
0.09 ± 10% +0.1 0.16 ± 8% perf-profile.children.cycles-pp.rcu_all_qs
0.04 ± 44% +0.1 0.12 ± 16% perf-profile.children.cycles-pp.page_copy_sane
0.04 ± 71% +0.1 0.11 ± 18% perf-profile.children.cycles-pp.scheduler_tick
0.30 ± 8% +0.1 0.38 ± 4% perf-profile.children.cycles-pp.pick_next_entity
0.11 ± 15% +0.1 0.20 ± 7% perf-profile.children.cycles-pp.syscall_enter_from_user_mode
0.09 ± 15% +0.1 0.18 ± 17% perf-profile.children.cycles-pp.update_process_times
0.05 ± 46% +0.1 0.14 ± 14% perf-profile.children.cycles-pp.ktime_get_coarse_real_ts64
0.07 ± 11% +0.1 0.15 ± 15% perf-profile.children.cycles-pp.perf_trace_run_bpf_submit
0.06 ± 11% +0.1 0.15 ± 12% perf-profile.children.cycles-pp.kill_fasync
0.10 ± 16% +0.1 0.20 ± 17% perf-profile.children.cycles-pp.tick_sched_handle
0.04 ± 72% +0.1 0.14 ± 34% perf-profile.children.cycles-pp.write@plt
0.12 ± 19% +0.1 0.23 ± 18% perf-profile.children.cycles-pp.tick_sched_timer
0.00 +0.1 0.12 ± 36% perf-profile.children.cycles-pp.exit_to_user_mode_loop
0.19 ± 9% +0.1 0.31 ± 9% perf-profile.children.cycles-pp.__get_task_ioprio
0.57 ± 6% +0.1 0.68 ± 4% perf-profile.children.cycles-pp.__wrgsbase_inactive
0.13 ± 15% +0.1 0.25 ± 7% perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
1.32 ± 3% +0.1 1.44 ± 2% perf-profile.children.cycles-pp._copy_to_iter
1.54 ± 2% +0.1 1.67 perf-profile.children.cycles-pp.__might_resched
0.00 +0.1 0.14 ± 11% perf-profile.children.cycles-pp.cgroup_rstat_updated
0.14 ± 9% +0.2 0.30 ± 6% perf-profile.children.cycles-pp.aa_file_perm
0.09 ± 20% +0.2 0.26 ± 5% perf-profile.children.cycles-pp.__cgroup_account_cputime
0.14 ± 13% +0.2 0.31 ± 12% perf-profile.children.cycles-pp.__list_del_entry_valid
0.14 ± 11% +0.2 0.31 ± 5% perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
0.20 ± 14% +0.2 0.37 ± 12% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.21 ± 9% +0.2 0.38 ± 5% perf-profile.children.cycles-pp.__cond_resched
1.06 +0.2 1.23 ± 4% perf-profile.children.cycles-pp.perf_trace_sched_wakeup_template
0.06 ± 19% +0.2 0.24 ± 8% perf-profile.children.cycles-pp.check_cfs_rq_runtime
0.22 ± 7% +0.2 0.41 ± 6% perf-profile.children.cycles-pp.atime_needs_update
0.00 +0.2 0.22 ± 6% perf-profile.children.cycles-pp.set_next_buddy
2.70 ± 2% +0.2 2.92 perf-profile.children.cycles-pp.prepare_to_wait_event
0.28 ± 11% +0.2 0.51 ± 9% perf-profile.children.cycles-pp.hrtimer_interrupt
0.88 ± 5% +0.2 1.12 ± 4% perf-profile.children.cycles-pp.__might_fault
0.22 ± 8% +0.2 0.46 ± 6% perf-profile.children.cycles-pp.current_time
0.28 ± 7% +0.2 0.52 ± 5% perf-profile.children.cycles-pp.touch_atime
0.30 ± 11% +0.2 0.55 ± 8% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
0.21 ± 9% +0.2 0.46 ± 8% perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
0.22 ± 8% +0.2 0.47 ± 6% perf-profile.children.cycles-pp.file_update_time
0.62 ± 2% +0.3 0.88 ± 3% perf-profile.children.cycles-pp._raw_spin_lock_irq
2.96 ± 2% +0.3 3.23 ± 3% perf-profile.children.cycles-pp.enqueue_entity
1.44 ± 2% +0.3 1.72 perf-profile.children.cycles-pp.copy_page_to_iter
1.53 ± 3% +0.3 1.81 ± 3% perf-profile.children.cycles-pp.__switch_to
0.26 ± 6% +0.3 0.57 ± 5% perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
0.21 ± 7% +0.3 0.53 ± 7% perf-profile.children.cycles-pp.update_min_vruntime
2.68 ± 5% +0.3 3.02 perf-profile.children.cycles-pp.pick_next_task_fair
0.12 ± 8% +0.3 0.46 ± 4% perf-profile.children.cycles-pp.perf_trace_sched_stat_runtime
0.42 ± 8% +0.4 0.78 ± 9% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
0.73 ± 6% +0.4 1.11 ± 5% perf-profile.children.cycles-pp.mutex_unlock
0.48 ± 7% +0.4 0.86 ± 8% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
0.87 ± 2% +0.4 1.27 ± 4% perf-profile.children.cycles-pp.__update_load_avg_se
0.90 ± 3% +0.4 1.30 ± 4% perf-profile.children.cycles-pp.apparmor_file_permission
0.14 ± 15% +0.4 0.54 ± 2% perf-profile.children.cycles-pp.__list_add_valid
0.89 ± 4% +0.4 1.30 ± 3% perf-profile.children.cycles-pp.__update_load_avg_cfs_rq
0.37 ± 7% +0.4 0.78 ± 8% perf-profile.children.cycles-pp.syscall_return_via_sysret
0.33 ± 4% +0.4 0.75 ± 5% perf-profile.children.cycles-pp.copyin
0.36 ± 4% +0.5 0.82 ± 4% perf-profile.children.cycles-pp.finish_wait
1.03 ± 3% +0.5 1.50 ± 4% perf-profile.children.cycles-pp.security_file_permission
0.18 ± 16% +0.5 0.68 ± 7% perf-profile.children.cycles-pp.cpuacct_charge
0.40 ± 6% +0.5 0.93 ± 3% perf-profile.children.cycles-pp.main
0.53 ± 2% +0.6 1.12 ± 7% perf-profile.children.cycles-pp.__fdget_pos
0.65 ± 5% +0.7 1.38 ± 3% perf-profile.children.cycles-pp.os_xsave
1.30 ± 4% +0.7 2.04 ± 2% perf-profile.children.cycles-pp._copy_from_iter
0.65 ± 3% +0.8 1.41 ± 2% perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
1.02 ± 2% +0.8 1.82 ± 2% perf-profile.children.cycles-pp.update_rq_clock
1.44 ± 3% +1.0 2.42 ± 2% perf-profile.children.cycles-pp.copy_page_from_iter
0.35 ± 5% +1.0 1.38 ± 3% perf-profile.children.cycles-pp.ttwu_do_wakeup
0.00 +1.0 1.04 ± 5% perf-profile.children.cycles-pp.check_preempt_wakeup
0.21 ± 7% +1.1 1.28 ± 4% perf-profile.children.cycles-pp.check_preempt_curr
0.46 ± 5% +1.1 1.56 ± 2% perf-profile.children.cycles-pp.__calc_delta
0.83 ± 5% +1.1 1.96 ± 2% perf-profile.children.cycles-pp.update_cfs_group
2.68 ± 4% +1.3 3.98 ± 3% perf-profile.children.cycles-pp.update_load_avg
1.02 +1.4 2.47 ± 2% perf-profile.children.cycles-pp.__entry_text_start
0.00 +1.6 1.59 ± 3% perf-profile.children.cycles-pp.restore_fpregs_from_fpstate
0.36 ± 6% +1.9 2.23 ± 3% perf-profile.children.cycles-pp.switch_fpu_return
0.49 ± 4% +2.1 2.61 ± 4% perf-profile.children.cycles-pp.exit_to_user_mode_prepare
4.72 +2.4 7.12 perf-profile.children.cycles-pp.dequeue_task_fair
0.74 ± 6% +2.4 3.15 ± 3% perf-profile.children.cycles-pp.syscall_exit_to_user_mode
0.63 ± 4% +2.7 3.35 perf-profile.children.cycles-pp.reweight_entity
1.74 ± 3% +3.2 4.91 ± 2% perf-profile.children.cycles-pp.update_curr
3.34 ± 2% +4.0 7.34 perf-profile.children.cycles-pp.enqueue_task_fair
3.44 ± 2% +4.1 7.54 perf-profile.children.cycles-pp.ttwu_do_activate
0.26 ± 6% +4.9 5.12 ± 3% perf-profile.children.cycles-pp.switch_mm_irqs_off
14.17 ± 2% +6.6 20.77 perf-profile.children.cycles-pp.__schedule
7.84 +7.4 15.19 perf-profile.children.cycles-pp.__wake_up_common
6.90 +7.7 14.56 perf-profile.children.cycles-pp.try_to_wake_up
8.26 +7.8 16.03 perf-profile.children.cycles-pp.__wake_up_common_lock
7.12 +7.9 14.98 perf-profile.children.cycles-pp.autoremove_wake_function
11.94 ± 6% +8.6 20.59 ± 2% perf-profile.children.cycles-pp.mwait_idle_with_hints
11.95 ± 6% +8.6 20.59 ± 2% perf-profile.children.cycles-pp.intel_idle
12.32 +9.5 21.81 perf-profile.children.cycles-pp.pipe_write
13.43 +10.0 23.43 perf-profile.children.cycles-pp.vfs_write
13.85 +10.4 24.25 perf-profile.children.cycles-pp.ksys_write
9.45 +12.1 21.52 perf-profile.children.cycles-pp.schedule
15.82 +13.1 28.92 perf-profile.children.cycles-pp.write
18.16 +14.9 33.02 perf-profile.children.cycles-pp.pipe_read
19.41 +15.7 35.11 perf-profile.children.cycles-pp.vfs_read
19.92 +16.3 36.24 perf-profile.children.cycles-pp.ksys_read
25.25 +24.2 49.48 perf-profile.children.cycles-pp.read
35.17 +29.7 64.91 perf-profile.children.cycles-pp.do_syscall_64
35.72 +30.8 66.50 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
21.21 -21.2 0.00 perf-profile.self.cycles-pp.poll_idle
1.76 ± 2% -1.6 0.14 ± 11% perf-profile.self.cycles-pp.menu_select
1.52 ± 2% -0.6 0.90 ± 4% perf-profile.self.cycles-pp._raw_spin_lock
0.72 ± 5% -0.5 0.23 ± 11% perf-profile.self.cycles-pp.__wake_up_common
2.98 ± 3% -0.4 2.55 ± 2% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
1.58 ± 2% -0.4 1.16 perf-profile.self.cycles-pp.native_sched_clock
0.78 ± 2% -0.4 0.40 ± 14% perf-profile.self.cycles-pp.prepare_task_switch
0.32 ± 8% -0.2 0.11 ± 13% perf-profile.self.cycles-pp.cpuidle_enter_state
0.47 ± 5% -0.2 0.28 ± 8% perf-profile.self.cycles-pp.finish_task_switch
1.24 -0.2 1.06 ± 3% perf-profile.self.cycles-pp.mutex_lock
1.03 ± 3% -0.2 0.86 ± 2% perf-profile.self.cycles-pp.enqueue_entity
0.46 ± 13% -0.2 0.31 ± 12% perf-profile.self.cycles-pp.ttwu_queue_wakelist
0.20 ± 5% -0.1 0.10 ± 11% perf-profile.self.cycles-pp.sched_clock_cpu
0.16 ± 8% -0.1 0.08 ± 6% perf-profile.self.cycles-pp.rcu_note_context_switch
0.38 ± 9% -0.1 0.31 ± 10% perf-profile.self.cycles-pp.set_next_entity
0.35 ± 6% -0.1 0.29 ± 7% perf-profile.self.cycles-pp.perf_tp_event
0.16 ± 6% -0.0 0.11 ± 13% perf-profile.self.cycles-pp.put_prev_entity
0.16 ± 7% -0.0 0.12 ± 14% perf-profile.self.cycles-pp.place_entity
0.09 ± 7% -0.0 0.05 ± 47% perf-profile.self.cycles-pp.rb_next
0.07 ± 13% +0.0 0.09 ± 11% perf-profile.self.cycles-pp.perf_trace_buf_alloc
0.08 ± 12% +0.0 0.11 ± 13% perf-profile.self.cycles-pp.__x64_sys_read
0.14 ± 3% +0.0 0.17 ± 10% perf-profile.self.cycles-pp.__might_fault
0.20 ± 9% +0.0 0.24 ± 8% perf-profile.self.cycles-pp.select_task_rq
0.02 ±141% +0.0 0.06 ± 8% perf-profile.self.cycles-pp.perf_swevent_event
0.03 ± 70% +0.0 0.08 ± 14% perf-profile.self.cycles-pp.inode_needs_update_time
0.07 ± 11% +0.0 0.12 ± 14% perf-profile.self.cycles-pp.perf_trace_buf_update
0.06 ± 17% +0.1 0.12 ± 9% perf-profile.self.cycles-pp.rcu_all_qs
0.23 ± 7% +0.1 0.28 ± 3% perf-profile.self.cycles-pp.perf_trace_sched_wakeup_template
0.06 ± 7% +0.1 0.12 ± 12% perf-profile.self.cycles-pp.finish_wait
0.08 ± 11% +0.1 0.14 ± 14% perf-profile.self.cycles-pp.__x64_sys_write
0.11 ± 11% +0.1 0.18 ± 5% perf-profile.self.cycles-pp.atime_needs_update
0.31 ± 4% +0.1 0.38 ± 6% perf-profile.self.cycles-pp.__rdgsbase_inactive
0.05 ± 51% +0.1 0.12 ± 14% perf-profile.self.cycles-pp.__cgroup_account_cputime
0.01 ±223% +0.1 0.08 ± 8% perf-profile.self.cycles-pp.perf_swevent_get_recursion_context
0.05 ± 46% +0.1 0.12 ± 14% perf-profile.self.cycles-pp.touch_atime
0.01 ±223% +0.1 0.08 ± 17% perf-profile.self.cycles-pp.rw_verify_area
0.15 ± 8% +0.1 0.23 ± 6% perf-profile.self.cycles-pp.check_preempt_curr
0.13 ± 11% +0.1 0.20 ± 3% perf-profile.self.cycles-pp.security_file_permission
0.06 ± 11% +0.1 0.14 ± 13% perf-profile.self.cycles-pp.perf_trace_run_bpf_submit
0.04 ± 71% +0.1 0.12 ± 34% perf-profile.self.cycles-pp.write@plt
0.10 ± 6% +0.1 0.19 ± 13% perf-profile.self.cycles-pp.ttwu_do_activate
0.05 ± 46% +0.1 0.14 ± 18% perf-profile.self.cycles-pp.ktime_get_coarse_real_ts64
0.42 ± 7% +0.1 0.50 ± 7% perf-profile.self.cycles-pp.update_rq_clock
0.10 ± 11% +0.1 0.20 ± 7% perf-profile.self.cycles-pp.syscall_enter_from_user_mode
0.02 ± 99% +0.1 0.12 ± 15% perf-profile.self.cycles-pp.kill_fasync
0.11 ± 12% +0.1 0.20 ± 9% perf-profile.self.cycles-pp.__cond_resched
0.10 ± 3% +0.1 0.20 ± 9% perf-profile.self.cycles-pp.syscall_exit_to_user_mode
0.06 ± 11% +0.1 0.16 ± 14% perf-profile.self.cycles-pp.file_update_time
0.00 +0.1 0.10 ± 15% perf-profile.self.cycles-pp.page_copy_sane
0.18 ± 7% +0.1 0.28 ± 8% perf-profile.self.cycles-pp.__get_task_ioprio
0.11 ± 8% +0.1 0.22 ± 20% perf-profile.self.cycles-pp.__wake_up_common_lock
0.12 ± 17% +0.1 0.23 ± 5% perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
0.57 ± 6% +0.1 0.68 ± 4% perf-profile.self.cycles-pp.__wrgsbase_inactive
0.12 ± 6% +0.1 0.24 ± 9% perf-profile.self.cycles-pp.exit_to_user_mode_prepare
0.10 ± 8% +0.1 0.22 ± 4% perf-profile.self.cycles-pp.copy_page_to_iter
0.00 +0.1 0.13 ± 10% perf-profile.self.cycles-pp.cgroup_rstat_updated
0.12 ± 10% +0.1 0.25 ± 8% perf-profile.self.cycles-pp.ksys_write
0.17 ± 9% +0.1 0.30 ± 7% perf-profile.self.cycles-pp.autoremove_wake_function
1.54 ± 2% +0.1 1.67 perf-profile.self.cycles-pp.__might_resched
0.14 ± 14% +0.1 0.28 ± 13% perf-profile.self.cycles-pp.__list_del_entry_valid
0.13 ± 12% +0.1 0.27 ± 6% perf-profile.self.cycles-pp.aa_file_perm
0.16 ± 13% +0.1 0.30 ± 4% perf-profile.self.cycles-pp.current_time
0.14 ± 3% +0.2 0.30 ± 8% perf-profile.self.cycles-pp._copy_to_iter
0.14 ± 11% +0.2 0.31 ± 5% perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
0.03 ±101% +0.2 0.21 ± 10% perf-profile.self.cycles-pp.check_cfs_rq_runtime
0.12 ± 10% +0.2 0.30 ± 7% perf-profile.self.cycles-pp.copy_page_from_iter
0.41 ± 4% +0.2 0.60 ± 5% perf-profile.self.cycles-pp.vfs_write
0.54 ± 4% +0.2 0.76 perf-profile.self.cycles-pp.dequeue_task_fair
0.00 +0.2 0.22 ± 6% perf-profile.self.cycles-pp.set_next_buddy
0.38 ± 9% +0.2 0.62 ± 5% perf-profile.self.cycles-pp.enqueue_task_fair
0.25 ± 5% +0.2 0.50 perf-profile.self.cycles-pp._copy_from_iter
0.21 ± 7% +0.3 0.46 ± 8% perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
0.61 +0.3 0.87 ± 3% perf-profile.self.cycles-pp._raw_spin_lock_irq
0.76 ± 5% +0.3 1.02 ± 4% perf-profile.self.cycles-pp.apparmor_file_permission
0.35 ± 6% +0.3 0.62 ± 4% perf-profile.self.cycles-pp.switch_fpu_return
1.48 ± 3% +0.3 1.76 ± 2% perf-profile.self.cycles-pp.__switch_to
0.18 ± 7% +0.3 0.46 ± 6% perf-profile.self.cycles-pp.update_min_vruntime
0.44 ± 7% +0.3 0.73 ± 6% perf-profile.self.cycles-pp.pick_next_task_fair
0.22 ± 7% +0.3 0.54 ± 5% perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
0.28 ± 7% +0.3 0.61 ± 7% perf-profile.self.cycles-pp.ksys_read
0.12 ± 11% +0.3 0.46 ± 4% perf-profile.self.cycles-pp.perf_trace_sched_stat_runtime
0.73 ± 6% +0.4 1.09 ± 6% perf-profile.self.cycles-pp.mutex_unlock
0.36 ± 8% +0.4 0.74 ± 8% perf-profile.self.cycles-pp.syscall_return_via_sysret
0.85 +0.4 1.24 ± 4% perf-profile.self.cycles-pp.__update_load_avg_se
0.35 ± 6% +0.4 0.74 ± 4% perf-profile.self.cycles-pp.try_to_wake_up
0.14 ± 16% +0.4 0.54 ± 2% perf-profile.self.cycles-pp.__list_add_valid
0.84 ± 4% +0.4 1.27 ± 3% perf-profile.self.cycles-pp.__update_load_avg_cfs_rq
0.38 ± 3% +0.5 0.84 ± 4% perf-profile.self.cycles-pp.do_syscall_64
0.39 ± 8% +0.5 0.86 ± 5% perf-profile.self.cycles-pp.read
0.40 ± 6% +0.5 0.88 ± 4% perf-profile.self.cycles-pp.write
0.94 ± 5% +0.5 1.42 ± 5% perf-profile.self.cycles-pp.update_load_avg
0.36 ± 6% +0.5 0.86 ± 4% perf-profile.self.cycles-pp.main
0.17 ± 16% +0.5 0.67 ± 7% perf-profile.self.cycles-pp.cpuacct_charge
0.22 ± 4% +0.5 0.76 ± 3% perf-profile.self.cycles-pp.schedule
0.52 ± 3% +0.6 1.08 ± 7% perf-profile.self.cycles-pp.__fdget_pos
0.58 ± 5% +0.6 1.16 ± 6% perf-profile.self.cycles-pp.vfs_read
0.63 ± 4% +0.6 1.26 ± 3% perf-profile.self.cycles-pp.reweight_entity
0.64 ± 6% +0.7 1.31 perf-profile.self.cycles-pp.pipe_write
0.46 ± 4% +0.7 1.15 ± 4% perf-profile.self.cycles-pp.__entry_text_start
0.65 ± 5% +0.7 1.37 ± 2% perf-profile.self.cycles-pp.os_xsave
0.62 ± 3% +0.7 1.35 ± 3% perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
0.35 ± 7% +0.7 1.09 ± 4% perf-profile.self.cycles-pp.select_task_rq_fair
0.77 ± 6% +0.8 1.52 ± 3% perf-profile.self.cycles-pp.update_curr
0.00 +0.9 0.90 ± 6% perf-profile.self.cycles-pp.check_preempt_wakeup
0.94 ± 5% +1.0 1.94 perf-profile.self.cycles-pp.pipe_read
0.55 ± 5% +1.1 1.60 ± 4% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.45 ± 5% +1.1 1.56 ± 2% perf-profile.self.cycles-pp.__calc_delta
0.81 ± 5% +1.1 1.93 ± 2% perf-profile.self.cycles-pp.update_cfs_group
0.00 +1.6 1.59 ± 3% perf-profile.self.cycles-pp.restore_fpregs_from_fpstate
0.25 ± 7% +4.8 5.07 ± 2% perf-profile.self.cycles-pp.switch_mm_irqs_off
11.94 ± 6% +8.6 20.59 ± 2% perf-profile.self.cycles-pp.mwait_idle_with_hints
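
A note on reading the table: perf-profile.children.* counts cycles spent in a
function plus everything it calls (inclusive), perf-profile.self.* counts only
the function's own instructions (exclusive), and ±N% is the run-to-run stddev.
For anyone who wants a comparable profile by hand with stock perf, a rough
sketch follows; the robot's exact event selection and invocation live in the
attached job-script and reproduce files, so the commands below are an
assumption, not the lkp setup:

$ perf record -g -o perf.base.data -- perf bench sched pipe      # base kernel
$ perf record -g -o perf.patched.data -- perf bench sched pipe   # patched kernel
$ perf diff perf.base.data perf.patched.data       # per-symbol deltas, as above
$ perf report -i perf.patched.data --children      # inclusive ("children") view
$ perf report -i perf.patched.data --no-children   # exclusive ("self") view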





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://01.org/lkp



Attachments:
config-6.1.0-rc2-00018-gb6aabb01e300 (168.50 kB)
job-script (8.49 kB)
job.yaml (5.72 kB)
reproduce (2.07 kB)