2022-04-11 14:16:55

by Abel Wu

Subject: [RFC v2 0/2] introduce sched-idle balance

Overloaded runqueues are those that have more than one pullable non-idle task
on them (given that the sched-idle cpus are treated as idle cpus). The idle
tasks, which either have the SCHED_IDLE policy or belong to an idle cpu cgroup,
are tracked through rq->cfs.idle_h_nr_running.
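
As a rough illustration of this definition, the check can be modelled in plain
userspace C as below. The struct toy_cfs and the function toy_rq_overloaded()
are hypothetical stand-ins made up for this sketch; only the two counter names
mirror the fields used by the series, and the "pullable" condition is ignored.

```c
#include <stdbool.h>

/* Toy stand-in for the two CFS counters; not the kernel's struct cfs_rq. */
struct toy_cfs {
	unsigned int h_nr_running;      /* all runnable CFS tasks in the hierarchy */
	unsigned int idle_h_nr_running; /* the subset of those that are idle tasks */
};

/* Overloaded: more than one non-idle task is queued on the runqueue. */
static bool toy_rq_overloaded(const struct toy_cfs *cfs)
{
	return cfs->h_nr_running - cfs->idle_h_nr_running > 1;
}
```

With three runnable tasks of which two are idle tasks, the runqueue is not
overloaded; with only one of them being an idle task, it is.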

It is beneficial if the unoccupied cpus (sched-idle/idle cpus) can start
serving the non-idle tasks as soon as they become available. Lots of effort has
already been put into this:

- Task wakeup: the scheduler tries to find such cpus to make full
use of cpu capacity. But due to scalability issues, the search
depth is bounded to a reasonable limit. IOW it's possible that
a task is woken up on a busy cpu while unoccupied cpus are still
out there. Fortunately, this imbalance can be fixed by the load
balancers.

- Load balancing: periodic (normal/idle) and newly-idle balancing.
The former is regulated by intervals on each sched-domain, and
the intervals can prevent the sched-idle cpus from pulling the
non-idle tasks. The latter is triggered only when a cpu becomes
really idle, which is not the case for the sched-idle cpus.
The balancing can also be stopped by other constraints.

So the unoccupied cpus can still co-exist with overloaded ones, and in this
case the sched-idle balancing tries to quickly fix the imbalance between them
to some extent. That is:

- Record the overloaded cpus so we can know where to pull from.
This is done in tick to regulate manipulation on shared data.

- Filter out the overloaded cpus in SIS to improve the idle cpu
searching efficiency. The more overloaded the system is, the
fewer cpus we will search.

- Quit early in periodic load balancing if the cpu becomes busy.
This is similar to what we do in the newly-idle case, in which we
stop balancing once we have some work to do.

- The newly-idle balancing will try harder to pull the non-idle
tasks if overloaded cpus exist.

So the whole thing can be treated as an extension to the existing load balance
mechanisms on sched-idle cpus.
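
A minimal userspace sketch of the first two points (recording overloaded cpus
and filtering them out of the wakeup search) could look like the following.
All names here (toy_llc, toy_cpu, toy_update_overload(), toy_scan_candidates())
are hypothetical, the cpu mask is a single 64-bit word, and nothing is atomic,
so this only models the idea and not the kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical LLC-wide shared state; one bit per cpu, at most 64 cpus. */
struct toy_llc {
	uint64_t overloaded;	/* bit i set: cpu i is overloaded */
	int nr_overloaded;	/* number of bits set in 'overloaded' */
};

/* Hypothetical per-cpu state holding a locally cached copy of the status. */
struct toy_cpu {
	int id;
	bool overloaded;	/* last status published to the shared mask */
};

/* Publish a status change; the shared data is written only on transitions. */
static void toy_update_overload(struct toy_llc *llc, struct toy_cpu *cpu, bool now)
{
	if (cpu->overloaded == now)
		return;

	if (now) {
		llc->overloaded |= UINT64_C(1) << cpu->id;
		llc->nr_overloaded++;
	} else {
		llc->overloaded &= ~(UINT64_C(1) << cpu->id);
		llc->nr_overloaded--;
	}

	cpu->overloaded = now;
}

/* Cpus worth scanning at wakeup: in the domain, allowed, and not overloaded. */
static uint64_t toy_scan_candidates(const struct toy_llc *llc,
				    uint64_t span, uint64_t allowed)
{
	return span & allowed & ~llc->overloaded;
}
```

Caching the last published status per cpu keeps repeated updates with an
unchanged status from touching the shared mask at all, and the more bits are
set in the mask, the smaller the candidate set returned to the search becomes.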

Benchmark
=========

Tests are run on an Intel(R) Xeon(R) Platinum 8260 CPU @ 2.40GHz machine with
2 NUMA nodes, each of which has 24 cores with SMT2 enabled, so 96 CPUs in total.
Tests are separated into two parts:

- quiet: benchmarks running inside a normal cpu cgroup in a clean
environment

- noisy: benchmarks running inside a normal cpu cgroup, with noise
coming from an idle cpu cgroup; the two cgroups are at the same
level. The noise is produced by the perf messaging benchmark, which
occupies ~20% of the cpu capacity on my server.

perf bench sched messaging -g 1 -l 2000000000

All of the benchmarks are done by mmtests with "--no-monitor --performance"
parameters, and with cpu turbo disabled.

As Mel requested, the SIS filter part is also benchmarked separately, and the
additional SIS statistics come from his patch [1].

Results
=======

vanilla: tip sched/core 6255b48aebfd (v5.17-rc5)
filter: vanilla + patch1
balancer: filter + patch2
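
For reading the tables below: the percentage next to each value appears to be
the gain relative to the vanilla column, positive meaning better. This reading
is inferred from the numbers themselves rather than taken from the mmtests
documentation, and the helper names in this sketch are made up for it.

```c
/* Gain in percent when lower is better (Amean and Lat rows). */
static double gain_lower_is_better(double base, double value)
{
	return (base - value) / base * 100.0;
}

/* Gain in percent when higher is better (Hmean rows). */
static double gain_higher_is_better(double base, double value)
{
	return (value - base) / base * 100.0;
}
```

For example, the quiet hackbench-process-pipes row for 1 group goes from 0.3077
to 0.2523, which is a gain of about 18% (reported as 17.98%, presumably computed
from unrounded values), and the quiet tbench4 row for 1 client goes from 287.90
to 290.20, which is a gain of about 0.80%.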

a) hackbench-process-pipes

[quiet]
vanilla filter balancer
Amean 1 0.3077 ( 0.00%) 0.3340 ( -8.56%) 0.2523 ( 17.98%)
Amean 4 0.7703 ( 0.00%) 0.7360 ( 4.46%) 0.7220 * 6.27%*
Amean 7 0.9253 ( 0.00%) 0.9320 ( -0.72%) 0.9153 ( 1.08%)
Amean 12 1.2397 ( 0.00%) 1.1197 * 9.68%* 1.0867 * 12.34%*
Amean 21 2.8003 ( 0.00%) 2.4663 * 11.93%* 2.4490 * 12.55%*
Amean 30 5.2430 ( 0.00%) 4.1620 * 20.62%* 4.2220 * 19.47%*
Amean 48 7.9023 ( 0.00%) 6.7040 * 15.16%* 7.0897 * 10.28%*
Amean 79 9.6197 ( 0.00%) 8.6310 * 10.28%* 8.6590 * 9.99%*
Amean 110 9.8170 ( 0.00%) 9.3533 ( 4.72%) 9.2813 ( 5.46%)
Amean 141 11.8070 ( 0.00%) 11.3003 * 4.29%* 11.4297 * 3.20%*
Amean 172 14.1017 ( 0.00%) 13.3063 * 5.64%* 13.2740 * 5.87%*
Amean 203 15.9723 ( 0.00%) 16.0813 ( -0.68%) 15.2627 * 4.44%*
Amean 234 18.6590 ( 0.00%) 18.2387 ( 2.25%) 17.4267 * 6.60%*
Amean 265 20.8473 ( 0.00%) 20.4460 ( 1.93%) 19.8227 * 4.92%*
Amean 296 22.5817 ( 0.00%) 22.5307 ( 0.23%) 21.6657 * 4.06%*

Ops SIS Search Efficiency 16.51 27.35 27.37
Ops SIS Domain Search Eff 16.35 27.08 27.05
Ops SIS Fast Success Rate 1.19 1.32 1.58
Ops SIS Success Rate 1.96 2.60 3.06

[noisy]
vanilla filter balancer
Amean 1 0.3627 ( 0.00%) 0.2850 ( 21.42%) 0.2830 ( 21.97%)
Amean 4 0.7290 ( 0.00%) 0.7313 ( -0.32%) 0.7467 ( -2.42%)
Amean 7 0.9353 ( 0.00%) 0.9443 ( -0.96%) 0.9107 * 2.64%*
Amean 12 1.1973 ( 0.00%) 1.2283 ( -2.59%) 1.1013 * 8.02%*
Amean 21 2.5100 ( 0.00%) 2.4190 ( 3.63%) 2.3003 * 8.35%*
Amean 30 4.7437 ( 0.00%) 3.9367 * 17.01%* 3.7473 * 21.00%*
Amean 48 7.4943 ( 0.00%) 6.9470 * 7.30%* 7.0430 ( 6.02%)
Amean 79 9.4737 ( 0.00%) 8.6923 * 8.25%* 8.8930 * 6.13%*
Amean 110 10.7420 ( 0.00%) 9.5363 * 11.22%* 9.2847 * 13.57%*
Amean 141 12.2293 ( 0.00%) 11.0513 * 9.63%* 10.9750 * 10.26%*
Amean 172 14.0277 ( 0.00%) 13.7407 ( 2.05%) 12.9350 * 7.79%*
Amean 203 16.6930 ( 0.00%) 15.6677 * 6.14%* 15.1910 * 9.00%*
Amean 234 18.3360 ( 0.00%) 17.6750 ( 3.60%) 17.1403 * 6.52%*
Amean 265 20.8383 ( 0.00%) 19.9793 ( 4.12%) 19.4780 * 6.53%*
Amean 296 23.3080 ( 0.00%) 21.7693 * 6.60%* 21.3567 * 8.37%*

Ops SIS Search Efficiency 16.53 27.23 27.53
Ops SIS Domain Search Eff 16.27 26.81 27.07
Ops SIS Fast Success Rate 1.87 2.10 2.29
Ops SIS Success Rate 2.93 3.81 4.01

b) hackbench-process-sockets

[quiet]
vanilla filter balancer
Amean 1 0.5213 ( 0.00%) 0.5283 ( -1.34%) 0.5210 ( 0.06%)
Amean 4 1.4733 ( 0.00%) 1.4757 ( -0.16%) 1.4577 * 1.06%*
Amean 7 2.4620 ( 0.00%) 2.5083 * -1.88%* 2.5113 * -2.00%*
Amean 12 4.1283 ( 0.00%) 4.2143 * -2.08%* 4.1967 * -1.66%*
Amean 21 7.0153 ( 0.00%) 7.1620 * -2.09%* 7.0977 * -1.17%*
Amean 30 9.8900 ( 0.00%) 10.0380 * -1.50%* 9.9303 * -0.41%*
Amean 48 15.6753 ( 0.00%) 16.0283 * -2.25%* 15.8213 * -0.93%*
Amean 79 26.3443 ( 0.00%) 26.7147 * -1.41%* 26.3397 ( 0.02%)
Amean 110 36.5437 ( 0.00%) 37.3277 * -2.15%* 36.6197 * -0.21%*
Amean 141 46.5327 ( 0.00%) 47.4803 * -2.04%* 46.8620 * -0.71%*
Amean 172 56.5907 ( 0.00%) 58.0840 * -2.64%* 56.9503 * -0.64%*
Amean 203 66.8573 ( 0.00%) 68.3780 * -2.27%* 67.2330 * -0.56%*
Amean 234 77.2470 ( 0.00%) 78.8317 * -2.05%* 77.6773 * -0.56%*
Amean 265 87.5577 ( 0.00%) 89.3343 * -2.03%* 87.3617 * 0.22%*
Amean 296 97.6160 ( 0.00%) 99.6320 * -2.07%* 97.6450 ( -0.03%)

Ops SIS Search Efficiency 16.50 27.17 27.19
Ops SIS Domain Search Eff 16.32 26.88 26.84
Ops SIS Fast Success Rate 1.32 1.44 1.75
Ops SIS Success Rate 2.06 2.74 3.34

[noisy]
vanilla filter balancer
Amean 1 0.6120 ( 0.00%) 0.6760 * -10.46%* 0.6037 ( 1.36%)
Amean 4 1.5867 ( 0.00%) 1.6540 * -4.24%* 1.5120 * 4.71%*
Amean 7 2.5940 ( 0.00%) 2.6820 * -3.39%* 2.5047 * 3.44%*
Amean 12 4.3407 ( 0.00%) 4.4680 * -2.93%* 4.1513 * 4.36%*
Amean 21 7.3083 ( 0.00%) 7.5073 * -2.72%* 6.8467 * 6.32%*
Amean 30 9.9750 ( 0.00%) 10.4920 * -5.18%* 9.7220 * 2.54%*
Amean 48 15.9123 ( 0.00%) 16.5143 * -3.78%* 15.2683 * 4.05%*
Amean 79 26.2180 ( 0.00%) 27.2497 * -3.93%* 25.1087 * 4.23%*
Amean 110 36.8237 ( 0.00%) 38.8303 * -5.45%* 35.8823 * 2.56%*
Amean 141 47.3357 ( 0.00%) 49.6817 * -4.96%* 45.5723 * 3.73%*
Amean 172 57.4477 ( 0.00%) 60.8553 * -5.93%* 55.5380 * 3.32%*
Amean 203 67.6290 ( 0.00%) 71.8117 * -6.18%* 65.6033 * 3.00%*
Amean 234 77.8347 ( 0.00%) 82.9577 * -6.58%* 75.8713 * 2.52%*
Amean 265 88.4680 ( 0.00%) 94.2737 * -6.56%* 85.8547 * 2.95%*
Amean 296 99.2210 ( 0.00%) 105.9357 * -6.77%* 95.8777 * 3.37%*

Ops SIS Search Efficiency 16.51 27.21 27.62
Ops SIS Domain Search Eff 16.22 26.74 27.07
Ops SIS Fast Success Rate 2.13 2.38 2.73
Ops SIS Success Rate 3.21 4.20 4.66

c) hackbench-thread-pipes

[quiet]
vanilla filter balancer
Amean 1 0.2770 ( 0.00%) 0.2783 ( -0.48%) 0.2777 ( -0.24%)
Amean 4 0.7707 ( 0.00%) 0.7770 ( -0.82%) 0.7687 ( 0.26%)
Amean 7 0.9400 ( 0.00%) 0.9500 ( -1.06%) 0.9230 ( 1.81%)
Amean 12 1.4740 ( 0.00%) 1.4447 ( 1.99%) 1.4213 ( 3.57%)
Amean 21 3.8517 ( 0.00%) 3.5223 * 8.55%* 3.3837 * 12.15%*
Amean 30 6.7057 ( 0.00%) 5.9243 * 11.65%* 5.8200 * 13.21%*
Amean 48 8.9877 ( 0.00%) 8.3357 * 7.25%* 8.0573 * 10.35%*
Amean 79 10.3807 ( 0.00%) 9.6767 * 6.78%* 9.6947 * 6.61%*
Amean 110 11.1830 ( 0.00%) 10.5263 * 5.87%* 10.5247 ( 5.89%)
Amean 141 12.9987 ( 0.00%) 12.6463 ( 2.71%) 12.6697 ( 2.53%)
Amean 172 15.2327 ( 0.00%) 15.6350 ( -2.64%) 14.6007 ( 4.15%)
Amean 203 17.7090 ( 0.00%) 17.4287 ( 1.58%) 16.9330 ( 4.38%)
Amean 234 19.4380 ( 0.00%) 19.6747 ( -1.22%) 19.5393 ( -0.52%)
Amean 265 24.2407 ( 0.00%) 22.7170 ( 6.29%) 21.4700 * 11.43%*
Amean 296 26.5937 ( 0.00%) 26.4057 ( 0.71%) 24.2627 * 8.77%*

Ops SIS Search Efficiency 16.54 27.49 27.60
Ops SIS Domain Search Eff 16.34 27.18 27.23
Ops SIS Fast Success Rate 1.41 1.52 1.84
Ops SIS Success Rate 2.21 2.87 3.46

[noisy]
vanilla filter balancer
Amean 1 0.3097 ( 0.00%) 0.3373 * -8.93%* 0.3140 ( -1.40%)
Amean 4 0.7730 ( 0.00%) 0.7870 ( -1.81%) 0.7500 * 2.98%*
Amean 7 0.9580 ( 0.00%) 0.9520 ( 0.63%) 0.9270 ( 3.24%)
Amean 12 1.4840 ( 0.00%) 1.4103 ( 4.96%) 1.3970 * 5.86%*
Amean 21 3.4623 ( 0.00%) 3.1507 * 9.00%* 3.1517 * 8.97%*
Amean 30 6.1033 ( 0.00%) 5.6037 ( 8.19%) 5.7150 ( 6.36%)
Amean 48 8.9833 ( 0.00%) 8.6097 * 4.16%* 8.5367 * 4.97%*
Amean 79 11.0237 ( 0.00%) 9.5840 * 13.06%* 9.7860 * 11.23%*
Amean 110 12.4213 ( 0.00%) 10.9570 * 11.79%* 10.5110 * 15.38%*
Amean 141 13.4703 ( 0.00%) 12.5320 * 6.97%* 12.4137 ( 7.84%)
Amean 172 17.0973 ( 0.00%) 15.6843 * 8.26%* 14.6183 * 14.50%*
Amean 203 18.8867 ( 0.00%) 17.3487 * 8.14%* 17.8260 * 5.62%*
Amean 234 22.0430 ( 0.00%) 19.8977 * 9.73%* 19.6240 * 10.97%*
Amean 265 23.9877 ( 0.00%) 21.9163 * 8.63%* 22.5933 ( 5.81%)
Amean 296 27.1667 ( 0.00%) 25.2857 ( 6.92%) 23.8423 * 12.24%*

Ops SIS Search Efficiency 16.57 27.57 28.04
Ops SIS Domain Search Eff 16.29 27.11 27.50
Ops SIS Fast Success Rate 2.06 2.29 2.67
Ops SIS Success Rate 3.11 4.04 4.47

d) hackbench-thread-sockets

[quiet]
vanilla filter balancer
Amean 1 0.5773 ( 0.00%) 0.5767 ( 0.12%) 0.5723 ( 0.87%)
Amean 4 1.5083 ( 0.00%) 1.5117 ( -0.22%) 1.5027 ( 0.38%)
Amean 7 2.5453 ( 0.00%) 2.5890 * -1.72%* 2.5823 * -1.45%*
Amean 12 4.2763 ( 0.00%) 4.3357 * -1.39%* 4.3203 * -1.03%*
Amean 21 7.2050 ( 0.00%) 7.3777 * -2.40%* 7.2923 * -1.21%*
Amean 30 10.1203 ( 0.00%) 10.3367 * -2.14%* 10.2107 * -0.89%*
Amean 48 16.0403 ( 0.00%) 16.3427 * -1.88%* 16.1080 ( -0.42%)
Amean 79 27.0260 ( 0.00%) 27.2193 ( -0.72%) 26.7280 * 1.10%*
Amean 110 37.4073 ( 0.00%) 38.1427 * -1.97%* 37.7580 * -0.94%*
Amean 141 47.7927 ( 0.00%) 48.7607 * -2.03%* 48.5797 * -1.65%*
Amean 172 58.1860 ( 0.00%) 59.5697 * -2.38%* 58.7377 * -0.95%*
Amean 203 68.6033 ( 0.00%) 70.6163 * -2.93%* 69.0957 * -0.72%*
Amean 234 79.2923 ( 0.00%) 81.1143 * -2.30%* 79.9310 * -0.81%*
Amean 265 89.6240 ( 0.00%) 91.8750 * -2.51%* 90.4663 * -0.94%*
Amean 296 100.2680 ( 0.00%) 102.9560 * -2.68%* 101.0817 * -0.81%*

Ops SIS Search Efficiency 16.58 25.34 24.62
Ops SIS Domain Search Eff 16.12 24.59 23.81
Ops SIS Fast Success Rate 3.30 3.91 4.33
Ops SIS Success Rate 3.99 5.77 7.09

[noisy]
vanilla filter balancer
Amean 1 0.6607 ( 0.00%) 0.7033 * -6.46%* 0.6727 ( -1.82%)
Amean 4 1.6270 ( 0.00%) 1.6507 * -1.45%* 1.5457 * 5.00%*
Amean 7 2.6850 ( 0.00%) 2.7483 * -2.36%* 2.5850 * 3.72%*
Amean 12 4.5273 ( 0.00%) 4.6250 * -2.16%* 4.2457 * 6.22%*
Amean 21 7.5403 ( 0.00%) 7.6453 ( -1.39%) 7.1340 * 5.39%*
Amean 30 10.4227 ( 0.00%) 10.7350 * -3.00%* 9.9497 * 4.54%*
Amean 48 16.2257 ( 0.00%) 16.9840 * -4.67%* 15.7340 * 3.03%*
Amean 79 27.2820 ( 0.00%) 27.9947 * -2.61%* 25.9023 * 5.06%*
Amean 110 37.9413 ( 0.00%) 40.0053 * -5.44%* 36.9113 * 2.71%*
Amean 141 48.3913 ( 0.00%) 51.3303 * -6.07%* 47.0660 * 2.74%*
Amean 172 58.9597 ( 0.00%) 62.8973 * -6.68%* 57.1193 * 3.12%*
Amean 203 70.1857 ( 0.00%) 74.3620 * -5.95%* 68.0957 * 2.98%*
Amean 234 80.2250 ( 0.00%) 86.1143 * -7.34%* 78.4873 * 2.17%*
Amean 265 91.2950 ( 0.00%) 97.7753 * -7.10%* 88.9163 * 2.61%*
Amean 296 102.1407 ( 0.00%) 109.6700 * -7.37%* 100.2663 * 1.84%*

Ops SIS Search Efficiency 16.57 24.79 25.86
Ops SIS Domain Search Eff 15.80 23.54 24.40
Ops SIS Fast Success Rate 5.55 6.59 7.48
Ops SIS Success Rate 7.20 10.39 11.38

e) schbench

[quiet]
vanilla filter balancer
Lat 50.0th-qrtle-1 5.00 ( 0.00%) 5.00 ( 0.00%) 5.00 ( 0.00%)
Lat 75.0th-qrtle-1 5.00 ( 0.00%) 5.00 ( 0.00%) 5.00 ( 0.00%)
Lat 90.0th-qrtle-1 5.00 ( 0.00%) 5.00 ( 0.00%) 5.00 ( 0.00%)
Lat 95.0th-qrtle-1 6.00 ( 0.00%) 6.00 ( 0.00%) 5.00 ( 16.67%)
Lat 99.0th-qrtle-1 6.00 ( 0.00%) 7.00 ( -16.67%) 6.00 ( 0.00%)
Lat 99.5th-qrtle-1 7.00 ( 0.00%) 8.00 ( -14.29%) 6.00 ( 14.29%)
Lat 99.9th-qrtle-1 8.00 ( 0.00%) 12.00 ( -50.00%) 6.00 ( 25.00%)
Lat 50.0th-qrtle-2 6.00 ( 0.00%) 6.00 ( 0.00%) 6.00 ( 0.00%)
Lat 75.0th-qrtle-2 6.00 ( 0.00%) 6.00 ( 0.00%) 7.00 ( -16.67%)
Lat 90.0th-qrtle-2 7.00 ( 0.00%) 7.00 ( 0.00%) 7.00 ( 0.00%)
Lat 95.0th-qrtle-2 7.00 ( 0.00%) 7.00 ( 0.00%) 7.00 ( 0.00%)
Lat 99.0th-qrtle-2 8.00 ( 0.00%) 8.00 ( 0.00%) 8.00 ( 0.00%)
Lat 99.5th-qrtle-2 9.00 ( 0.00%) 8.00 ( 11.11%) 9.00 ( 0.00%)
Lat 99.9th-qrtle-2 9.00 ( 0.00%) 9.00 ( 0.00%) 9.00 ( 0.00%)
Lat 50.0th-qrtle-4 9.00 ( 0.00%) 8.00 ( 11.11%) 8.00 ( 11.11%)
Lat 75.0th-qrtle-4 10.00 ( 0.00%) 10.00 ( 0.00%) 10.00 ( 0.00%)
Lat 90.0th-qrtle-4 11.00 ( 0.00%) 11.00 ( 0.00%) 11.00 ( 0.00%)
Lat 95.0th-qrtle-4 12.00 ( 0.00%) 12.00 ( 0.00%) 11.00 ( 8.33%)
Lat 99.0th-qrtle-4 13.00 ( 0.00%) 13.00 ( 0.00%) 13.00 ( 0.00%)
Lat 99.5th-qrtle-4 14.00 ( 0.00%) 14.00 ( 0.00%) 14.00 ( 0.00%)
Lat 99.9th-qrtle-4 16.00 ( 0.00%) 16.00 ( 0.00%) 16.00 ( 0.00%)
Lat 50.0th-qrtle-8 13.00 ( 0.00%) 12.00 ( 7.69%) 12.00 ( 7.69%)
Lat 75.0th-qrtle-8 16.00 ( 0.00%) 15.00 ( 6.25%) 16.00 ( 0.00%)
Lat 90.0th-qrtle-8 18.00 ( 0.00%) 17.00 ( 5.56%) 18.00 ( 0.00%)
Lat 95.0th-qrtle-8 19.00 ( 0.00%) 18.00 ( 5.26%) 18.00 ( 5.26%)
Lat 99.0th-qrtle-8 23.00 ( 0.00%) 21.00 ( 8.70%) 20.00 ( 13.04%)
Lat 99.5th-qrtle-8 24.00 ( 0.00%) 23.00 ( 4.17%) 22.00 ( 8.33%)
Lat 99.9th-qrtle-8 29.00 ( 0.00%) 25.00 ( 13.79%) 26.00 ( 10.34%)
Lat 50.0th-qrtle-16 20.00 ( 0.00%) 21.00 ( -5.00%) 20.00 ( 0.00%)
Lat 75.0th-qrtle-16 27.00 ( 0.00%) 28.00 ( -3.70%) 27.00 ( 0.00%)
Lat 90.0th-qrtle-16 32.00 ( 0.00%) 33.00 ( -3.12%) 31.00 ( 3.12%)
Lat 95.0th-qrtle-16 33.00 ( 0.00%) 35.00 ( -6.06%) 33.00 ( 0.00%)
Lat 99.0th-qrtle-16 38.00 ( 0.00%) 40.00 ( -5.26%) 38.00 ( 0.00%)
Lat 99.5th-qrtle-16 40.00 ( 0.00%) 42.00 ( -5.00%) 41.00 ( -2.50%)
Lat 99.9th-qrtle-16 43.00 ( 0.00%) 49.00 ( -13.95%) 50.00 ( -16.28%)
Lat 50.0th-qrtle-32 38.00 ( 0.00%) 37.00 ( 2.63%) 36.00 ( 5.26%)
Lat 75.0th-qrtle-32 55.00 ( 0.00%) 54.00 ( 1.82%) 53.00 ( 3.64%)
Lat 90.0th-qrtle-32 65.00 ( 0.00%) 64.00 ( 1.54%) 62.00 ( 4.62%)
Lat 95.0th-qrtle-32 69.00 ( 0.00%) 68.00 ( 1.45%) 67.00 ( 2.90%)
Lat 99.0th-qrtle-32 80.00 ( 0.00%) 80.00 ( 0.00%) 76.00 ( 5.00%)
Lat 99.5th-qrtle-32 85.00 ( 0.00%) 90.00 ( -5.88%) 82.00 ( 3.53%)
Lat 99.9th-qrtle-32 93.00 ( 0.00%) 135.00 ( -45.16%) 90.00 ( 3.23%)
Lat 50.0th-qrtle-47 55.00 ( 0.00%) 55.00 ( 0.00%) 53.00 ( 3.64%)
Lat 75.0th-qrtle-47 81.00 ( 0.00%) 81.00 ( 0.00%) 77.00 ( 4.94%)
Lat 90.0th-qrtle-47 97.00 ( 0.00%) 97.00 ( 0.00%) 92.00 ( 5.15%)
Lat 95.0th-qrtle-47 104.00 ( 0.00%) 103.00 ( 0.96%) 99.00 ( 4.81%)
Lat 99.0th-qrtle-47 120.00 ( 0.00%) 120.00 ( 0.00%) 119.00 ( 0.83%)
Lat 99.5th-qrtle-47 131.00 ( 0.00%) 133.00 ( -1.53%) 127.00 ( 3.05%)
Lat 99.9th-qrtle-47 161.00 ( 0.00%) 163.00 ( -1.24%) 165.00 ( -2.48%)

Ops SIS Search Efficiency 83.44 77.01 84.31
Ops SIS Domain Search Eff 4.56 4.11 4.47
Ops SIS Fast Success Rate 99.05 98.72 99.13
Ops SIS Success Rate 99.65 99.54 99.77

[noisy]
vanilla filter balancer
Lat 50.0th-qrtle-1 7.00 ( 0.00%) 8.00 ( -14.29%) 9.00 ( -28.57%)
Lat 75.0th-qrtle-1 8.00 ( 0.00%) 9.00 ( -12.50%) 10.00 ( -25.00%)
Lat 90.0th-qrtle-1 9.00 ( 0.00%) 10.00 ( -11.11%) 11.00 ( -22.22%)
Lat 95.0th-qrtle-1 9.00 ( 0.00%) 11.00 ( -22.22%) 12.00 ( -33.33%)
Lat 99.0th-qrtle-1 11.00 ( 0.00%) 13.00 ( -18.18%) 15.00 ( -36.36%)
Lat 99.5th-qrtle-1 13.00 ( 0.00%) 14.00 ( -7.69%) 18.00 ( -38.46%)
Lat 99.9th-qrtle-1 13.00 ( 0.00%) 14.00 ( -7.69%) 18.00 ( -38.46%)
Lat 50.0th-qrtle-2 9.00 ( 0.00%) 9.00 ( 0.00%) 9.00 ( 0.00%)
Lat 75.0th-qrtle-2 11.00 ( 0.00%) 10.00 ( 9.09%) 11.00 ( 0.00%)
Lat 90.0th-qrtle-2 12.00 ( 0.00%) 11.00 ( 8.33%) 12.00 ( 0.00%)
Lat 95.0th-qrtle-2 13.00 ( 0.00%) 12.00 ( 7.69%) 14.00 ( -7.69%)
Lat 99.0th-qrtle-2 15.00 ( 0.00%) 15.00 ( 0.00%) 15.00 ( 0.00%)
Lat 99.5th-qrtle-2 15.00 ( 0.00%) 17.00 ( -13.33%) 16.00 ( -6.67%)
Lat 99.9th-qrtle-2 17.00 ( 0.00%) 19.00 ( -11.76%) 21.00 ( -23.53%)
Lat 50.0th-qrtle-4 12.00 ( 0.00%) 12.00 ( 0.00%) 12.00 ( 0.00%)
Lat 75.0th-qrtle-4 14.00 ( 0.00%) 15.00 ( -7.14%) 14.00 ( 0.00%)
Lat 90.0th-qrtle-4 16.00 ( 0.00%) 17.00 ( -6.25%) 16.00 ( 0.00%)
Lat 95.0th-qrtle-4 17.00 ( 0.00%) 18.00 ( -5.88%) 16.00 ( 5.88%)
Lat 99.0th-qrtle-4 20.00 ( 0.00%) 20.00 ( 0.00%) 19.00 ( 5.00%)
Lat 99.5th-qrtle-4 21.00 ( 0.00%) 20.00 ( 4.76%) 21.00 ( 0.00%)
Lat 99.9th-qrtle-4 26.00 ( 0.00%) 21.00 ( 19.23%) 22.00 ( 15.38%)
Lat 50.0th-qrtle-8 17.00 ( 0.00%) 16.00 ( 5.88%) 17.00 ( 0.00%)
Lat 75.0th-qrtle-8 22.00 ( 0.00%) 21.00 ( 4.55%) 21.00 ( 4.55%)
Lat 90.0th-qrtle-8 26.00 ( 0.00%) 24.00 ( 7.69%) 25.00 ( 3.85%)
Lat 95.0th-qrtle-8 28.00 ( 0.00%) 25.00 ( 10.71%) 26.00 ( 7.14%)
Lat 99.0th-qrtle-8 32.00 ( 0.00%) 29.00 ( 9.38%) 29.00 ( 9.38%)
Lat 99.5th-qrtle-8 34.00 ( 0.00%) 31.00 ( 8.82%) 31.00 ( 8.82%)
Lat 99.9th-qrtle-8 42.00 ( 0.00%) 34.00 ( 19.05%) 35.00 ( 16.67%)
Lat 50.0th-qrtle-16 29.00 ( 0.00%) 30.00 ( -3.45%) 27.00 ( 6.90%)
Lat 75.0th-qrtle-16 40.00 ( 0.00%) 41.00 ( -2.50%) 37.00 ( 7.50%)
Lat 90.0th-qrtle-16 46.00 ( 0.00%) 49.00 ( -6.52%) 43.00 ( 6.52%)
Lat 95.0th-qrtle-16 49.00 ( 0.00%) 53.00 ( -8.16%) 46.00 ( 6.12%)
Lat 99.0th-qrtle-16 55.00 ( 0.00%) 59.00 ( -7.27%) 52.00 ( 5.45%)
Lat 99.5th-qrtle-16 57.00 ( 0.00%) 62.00 ( -8.77%) 55.00 ( 3.51%)
Lat 99.9th-qrtle-16 63.00 ( 0.00%) 84.00 ( -33.33%) 62.00 ( 1.59%)
Lat 50.0th-qrtle-32 48.00 ( 0.00%) 49.00 ( -2.08%) 49.00 ( -2.08%)
Lat 75.0th-qrtle-32 69.00 ( 0.00%) 70.00 ( -1.45%) 70.00 ( -1.45%)
Lat 90.0th-qrtle-32 83.00 ( 0.00%) 87.00 ( -4.82%) 85.00 ( -2.41%)
Lat 95.0th-qrtle-32 90.00 ( 0.00%) 96.00 ( -6.67%) 91.00 ( -1.11%)
Lat 99.0th-qrtle-32 102.00 ( 0.00%) 110.00 ( -7.84%) 104.00 ( -1.96%)
Lat 99.5th-qrtle-32 107.00 ( 0.00%) 115.00 ( -7.48%) 109.00 ( -1.87%)
Lat 99.9th-qrtle-32 112.00 ( 0.00%) 151.00 ( -34.82%) 118.00 ( -5.36%)
Lat 50.0th-qrtle-47 64.00 ( 0.00%) 64.00 ( 0.00%) 66.00 ( -3.12%)
Lat 75.0th-qrtle-47 93.00 ( 0.00%) 92.00 ( 1.08%) 96.00 ( -3.23%)
Lat 90.0th-qrtle-47 113.00 ( 0.00%) 110.00 ( 2.65%) 116.00 ( -2.65%)
Lat 95.0th-qrtle-47 126.00 ( 0.00%) 122.00 ( 3.17%) 126.00 ( 0.00%)
Lat 99.0th-qrtle-47 159.00 ( 0.00%) 137.00 ( 13.84%) 143.00 ( 10.06%)
Lat 99.5th-qrtle-47 231.00 ( 0.00%) 144.00 ( 37.66%) 152.00 ( 34.20%)
Lat 99.9th-qrtle-47 9136.00 ( 0.00%) 181.00 ( 98.02%) 1190.00 ( 86.97%)

Ops SIS Search Efficiency 94.62 94.47 94.79
Ops SIS Domain Search Eff 15.30 14.91 15.31
Ops SIS Fast Success Rate 98.97 98.97 99.01
Ops SIS Success Rate 99.98 99.98 99.99

f) tbench4 Throughput

[quiet]
vanilla filter balancer
Hmean 1 287.90 ( 0.00%) 290.20 * 0.80%* 296.20 * 2.88%*
Hmean 2 582.58 ( 0.00%) 586.96 * 0.75%* 599.12 * 2.84%*
Hmean 4 1151.83 ( 0.00%) 1165.32 * 1.17%* 1181.72 * 2.60%*
Hmean 8 2317.84 ( 0.00%) 2311.67 * -0.27%* 2344.42 * 1.15%*
Hmean 16 4530.15 ( 0.00%) 4555.04 * 0.55%* 4561.38 * 0.69%*
Hmean 32 7643.04 ( 0.00%) 7644.20 ( 0.02%) 7707.31 * 0.84%*
Hmean 64 9310.48 ( 0.00%) 9664.41 * 3.80%* 9615.94 * 3.28%*
Hmean 128 21837.26 ( 0.00%) 13628.90 * -37.59%* 15996.13 * -26.75%*
Hmean 256 20789.62 ( 0.00%) 22550.77 * 8.47%* 22776.78 * 9.56%*
Hmean 384 19329.85 ( 0.00%) 19786.13 * 2.36%* 17499.33 * -9.47%*

Ops SIS Search Efficiency 22.09 23.39 23.54
Ops SIS Domain Search Eff 14.64 14.78 14.96
Ops SIS Fast Success Rate 39.51 43.21 42.87
Ops SIS Success Rate 45.19 57.62 54.69

[noisy]
vanilla filter balancer
Hmean 1 275.05 ( 0.00%) 275.66 * 0.22%* 276.99 * 0.70%*
Hmean 2 543.31 ( 0.00%) 548.55 * 0.97%* 549.71 * 1.18%*
Hmean 4 1077.41 ( 0.00%) 1082.60 * 0.48%* 1090.07 * 1.18%*
Hmean 8 2119.68 ( 0.00%) 2140.60 * 0.99%* 2133.47 * 0.65%*
Hmean 16 3914.25 ( 0.00%) 3938.84 * 0.63%* 3914.42 ( 0.00%)
Hmean 32 6574.06 ( 0.00%) 6650.66 * 1.17%* 6622.36 * 0.73%*
Hmean 64 8757.89 ( 0.00%) 9047.06 * 3.30%* 8987.00 * 2.62%*
Hmean 128 20533.22 ( 0.00%) 15573.19 * -24.16%* 20746.83 * 1.04%*
Hmean 256 20194.51 ( 0.00%) 18961.24 * -6.11%* 20115.34 * -0.39%*
Hmean 384 17552.64 ( 0.00%) 17949.46 * 2.26%* 19796.18 * 12.78%*

Ops SIS Search Efficiency 22.30 25.21 28.33
Ops SIS Domain Search Eff 14.87 16.69 19.60
Ops SIS Fast Success Rate 39.15 40.58 38.35
Ops SIS Success Rate 43.88 49.88 43.08

g) netperf-udp

[quiet]
vanilla filter balancer
Hmean send-64 184.86 ( 0.00%) 182.76 * -1.14%* 185.78 ( 0.50%)
Hmean send-128 368.11 ( 0.00%) 362.97 * -1.40%* 371.56 ( 0.94%)
Hmean send-256 730.95 ( 0.00%) 717.07 * -1.90%* 728.71 ( -0.31%)
Hmean send-1024 2804.94 ( 0.00%) 2782.30 * -0.81%* 2825.97 ( 0.75%)
Hmean send-2048 5355.49 ( 0.00%) 5228.70 * -2.37%* 5370.87 ( 0.29%)
Hmean send-3312 8235.83 ( 0.00%) 8247.11 ( 0.14%) 8298.53 ( 0.76%)
Hmean send-4096 9916.04 ( 0.00%) 10012.30 * 0.97%* 10086.82 * 1.72%*
Hmean send-8192 16743.15 ( 0.00%) 16847.61 ( 0.62%) 16856.65 ( 0.68%)
Hmean send-16384 26512.04 ( 0.00%) 26512.69 ( 0.00%) 26537.69 ( 0.10%)
Hmean recv-64 184.86 ( 0.00%) 182.76 * -1.14%* 185.78 ( 0.50%)
Hmean recv-128 368.11 ( 0.00%) 362.97 * -1.40%* 371.56 ( 0.94%)
Hmean recv-256 730.95 ( 0.00%) 717.07 * -1.90%* 728.71 ( -0.31%)
Hmean recv-1024 2804.94 ( 0.00%) 2782.30 * -0.81%* 2825.97 ( 0.75%)
Hmean recv-2048 5355.49 ( 0.00%) 5228.70 * -2.37%* 5370.87 ( 0.29%)
Hmean recv-3312 8235.83 ( 0.00%) 8247.11 ( 0.14%) 8298.53 ( 0.76%)
Hmean recv-4096 9916.04 ( 0.00%) 10012.30 * 0.97%* 10086.78 * 1.72%*
Hmean recv-8192 16743.10 ( 0.00%) 16847.59 ( 0.62%) 16856.55 ( 0.68%)
Hmean recv-16384 26512.04 ( 0.00%) 26512.68 ( 0.00%) 26537.69 ( 0.10%)

Ops SIS Search Efficiency 100.00 100.00 100.00
Ops SIS Domain Search Eff 20.00 23.09 24.32
Ops SIS Fast Success Rate 100.00 100.00 100.00
Ops SIS Success Rate 100.00 100.00 100.00

[noisy]
vanilla filter balancer
Hmean send-64 180.48 ( 0.00%) 181.99 ( 0.84%) 182.52 ( 1.13%)
Hmean send-128 350.18 ( 0.00%) 360.14 * 2.85%* 367.83 * 5.04%*
Hmean send-256 708.12 ( 0.00%) 707.57 ( -0.08%) 723.46 * 2.17%*
Hmean send-1024 2752.72 ( 0.00%) 2757.50 ( 0.17%) 2781.04 ( 1.03%)
Hmean send-2048 5218.99 ( 0.00%) 5127.64 ( -1.75%) 5332.42 * 2.17%*
Hmean send-3312 8037.54 ( 0.00%) 8054.75 ( 0.21%) 8179.42 ( 1.77%)
Hmean send-4096 9834.51 ( 0.00%) 9782.38 ( -0.53%) 9901.92 ( 0.69%)
Hmean send-8192 15947.03 ( 0.00%) 16072.71 ( 0.79%) 16700.60 * 4.73%*
Hmean send-16384 25479.72 ( 0.00%) 24922.78 ( -2.19%) 25751.27 ( 1.07%)
Hmean recv-64 180.45 ( 0.00%) 181.97 ( 0.84%) 182.49 ( 1.13%)
Hmean recv-128 350.10 ( 0.00%) 360.06 * 2.84%* 367.78 * 5.05%*
Hmean recv-256 707.73 ( 0.00%) 707.27 ( -0.06%) 723.00 * 2.16%*
Hmean recv-1024 2750.10 ( 0.00%) 2755.09 ( 0.18%) 2778.95 ( 1.05%)
Hmean recv-2048 5213.28 ( 0.00%) 5121.07 ( -1.77%) 5326.50 * 2.17%*
Hmean recv-3312 8026.50 ( 0.00%) 8045.41 ( 0.24%) 8172.19 ( 1.82%)
Hmean recv-4096 9821.89 ( 0.00%) 9769.07 ( -0.54%) 9889.91 ( 0.69%)
Hmean recv-8192 15922.95 ( 0.00%) 16052.20 ( 0.81%) 16679.50 * 4.75%*
Hmean recv-16384 25441.27 ( 0.00%) 24876.79 ( -2.22%) 25706.80 ( 1.04%)

Ops SIS Search Efficiency 98.46 98.50 98.41
Ops SIS Domain Search Eff 24.87 24.95 24.91
Ops SIS Fast Success Rate 99.48 99.50 99.46
Ops SIS Success Rate 100.00 100.00 100.00

h) netperf-tcp

[quiet]
vanilla filter balancer
Hmean 64 839.51 ( 0.00%) 831.51 ( -0.95%) 869.34 * 3.55%*
Hmean 128 1629.70 ( 0.00%) 1635.25 ( 0.34%) 1692.19 * 3.83%*
Hmean 256 3046.92 ( 0.00%) 3019.34 * -0.91%* 3106.67 * 1.96%*
Hmean 1024 10164.80 ( 0.00%) 10023.82 ( -1.39%) 10317.10 * 1.50%*
Hmean 2048 16942.50 ( 0.00%) 16519.73 * -2.50%* 17435.42 * 2.91%*
Hmean 3312 21379.01 ( 0.00%) 21065.52 * -1.47%* 21642.12 * 1.23%*
Hmean 4096 23452.47 ( 0.00%) 23539.17 ( 0.37%) 23987.73 * 2.28%*
Hmean 8192 29084.04 ( 0.00%) 28763.16 * -1.10%* 29710.51 * 2.15%*
Hmean 16384 33512.05 ( 0.00%) 33124.21 * -1.16%* 34055.83 * 1.62%*

Ops SIS Search Efficiency 100.00 100.00 100.00
Ops SIS Domain Search Eff 18.08 21.82 25.44
Ops SIS Fast Success Rate 100.00 100.00 100.00
Ops SIS Success Rate 100.00 100.00 100.00

[noisy]
vanilla filter balancer
Hmean 64 868.21 ( 0.00%) 833.12 * -4.04%* 805.06 * -7.27%*
Hmean 128 1657.75 ( 0.00%) 1608.11 * -2.99%* 1554.89 * -6.20%*
Hmean 256 3093.27 ( 0.00%) 2953.87 * -4.51%* 2950.83 * -4.60%*
Hmean 1024 10168.94 ( 0.00%) 9589.55 * -5.70%* 9712.72 * -4.49%*
Hmean 2048 16539.60 ( 0.00%) 16031.40 * -3.07%* 15902.98 * -3.85%*
Hmean 3312 20710.95 ( 0.00%) 20174.85 * -2.59%* 20144.92 * -2.73%*
Hmean 4096 22728.08 ( 0.00%) 22633.01 ( -0.42%) 22419.51 ( -1.36%)
Hmean 8192 28017.31 ( 0.00%) 27761.16 ( -0.91%) 27667.10 ( -1.25%)
Hmean 16384 32314.91 ( 0.00%) 32084.40 ( -0.71%) 32090.03 ( -0.70%)

Ops SIS Search Efficiency 97.95 98.04 98.01
Ops SIS Domain Search Eff 24.72 24.73 24.71
Ops SIS Fast Success Rate 99.31 99.34 99.33
Ops SIS Success Rate 100.00 100.00 100.00


Conclusion
==========

The results don't show a global win, but the balancer did outperform vanilla in
many of the benchmarks in both the quiet and noisy environments. The SIS filter
makes SIS more efficient as expected, and the balancer does even better by
making the overloaded cpu mask more accurate.

The only obvious regression is netperf-tcp in the noisy environment, and I
haven't figured out why yet. More interestingly, the netperf/tbench results on
another machine (used in my patch v1) showed a suspicious 50%~90% improvement
with this patch series. It might be worth digging deeper.

Comments and tests are appreciated!

---

v2:
- several optimizations on sched-idle balancing
- ignore asym topos in can_migrate_task
- add more benchmarks including SIS efficiency
- re-organize patch as suggested by Mel

v1 can be found at [2].

[1] https://lore.kernel.org/lkml/[email protected]/
[2] https://lore.kernel.org/lkml/[email protected]/

Abel Wu (2):
sched/fair: filter out overloaded cpus in SIS
sched/fair: introduce sched-idle balance

include/linux/sched/idle.h | 1 +
include/linux/sched/topology.h | 12 +++
kernel/sched/core.c | 2 +
kernel/sched/fair.c | 210 +++++++++++++++++++++++++++++++++++++++--
kernel/sched/sched.h | 8 ++
kernel/sched/topology.c | 4 +-
6 files changed, 228 insertions(+), 9 deletions(-)

--
2.11.0


2022-04-12 00:48:59

by Abel Wu

Subject: [RFC v2 1/2] sched/fair: filter out overloaded cpus in SIS

It is beneficial if the unoccupied cpus (sched-idle/idle cpus) can start
serving the non-idle tasks as soon as they become available. Lots of effort
has already been made, and the task wakeup path is one part of it.

When a task is woken up, the scheduler tends to put it on an unoccupied
cpu to make full use of cpu capacity. But due to scalability issues, the
search depth is bounded to a reasonable limit. IOW it's possible that a
task is woken up on a busy cpu while unoccupied cpus are still out there.

This patch focuses on improving the SIS searching efficiency by filtering
out the overloaded cpus; as a result, the more overloaded the system is,
the fewer cpus we will search.
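
Besides masking out the overloaded cpus, the diff below also gives up the
search entirely once nearly the whole domain is overloaded. The arithmetic of
that cut-off can be tried standalone as follows; toy_domain_overloaded() is a
hypothetical name, and plain ints stand in for sd->span_weight and the
overloaded counter.

```c
#include <stdbool.h>

/* Same comparison as sched_domain_overloaded() in the diff, on plain ints. */
static bool toy_domain_overloaded(int span_weight, int nr_overloaded)
{
	return nr_overloaded > span_weight - (span_weight >> 4);
}
```

For a domain of 48 cpus (an assumption matching one node of the test machine
in the cover letter), 48 >> 4 is 3, so the search is skipped once more than 45
cpus are overloaded. For domains with fewer than 16 cpus the shift yields 0, so
the cut-off can never trigger, since the counter cannot exceed the span weight.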

Signed-off-by: Abel Wu <[email protected]>
---
include/linux/sched/topology.h | 12 ++++++++
kernel/sched/core.c | 1 +
kernel/sched/fair.c | 65 +++++++++++++++++++++++++++++++++++++++++-
kernel/sched/sched.h | 6 ++++
kernel/sched/topology.c | 4 ++-
5 files changed, 86 insertions(+), 2 deletions(-)

diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
index 56cffe42abbc..fb35a1983568 100644
--- a/include/linux/sched/topology.h
+++ b/include/linux/sched/topology.h
@@ -81,6 +81,18 @@ struct sched_domain_shared {
atomic_t ref;
atomic_t nr_busy_cpus;
int has_idle_cores;
+
+ /*
+ * The state of overloaded cpus is for different use against
+ * the above elements and they are all hot, so start a new
+ * cacheline to avoid false sharing.
+ */
+ atomic_t nr_overloaded ____cacheline_aligned;
+
+ /*
+ * Must be last
+ */
+ unsigned long overloaded[];
};

struct sched_domain {
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ef946123e9af..a372881f8eaf 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -9495,6 +9495,7 @@ void __init sched_init(void)
rq->wake_stamp = jiffies;
rq->wake_avg_idle = rq->avg_idle;
rq->max_idle_balance_cost = sysctl_sched_migration_cost;
+ rq->overloaded = 0;

INIT_LIST_HEAD(&rq->cfs_tasks);

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 16874e112fe6..fbeb05321615 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6284,6 +6284,15 @@ static inline int select_idle_smt(struct task_struct *p, struct sched_domain *sd
#endif /* CONFIG_SCHED_SMT */

/*
+ * It would be very unlikely to find an unoccupied cpu when system is heavily
+ * overloaded. Even if we could, the cost might bury the benefit.
+ */
+static inline bool sched_domain_overloaded(struct sched_domain *sd, int nr_overloaded)
+{
+ return nr_overloaded > sd->span_weight - (sd->span_weight >> 4);
+}
+
+/*
* Scan the LLC domain for idle CPUs; this is dynamically regulated by
* comparing the average scan cost (tracked in sd->avg_scan_cost) against the
* average idle time for this rq (as found in rq->avg_idle).
@@ -6291,7 +6300,7 @@ static inline int select_idle_smt(struct task_struct *p, struct sched_domain *sd
static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool has_idle_core, int target)
{
struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
- int i, cpu, idle_cpu = -1, nr = INT_MAX;
+ int i, cpu, idle_cpu = -1, nr = INT_MAX, nro;
struct rq *this_rq = this_rq();
int this = smp_processor_id();
struct sched_domain *this_sd;
@@ -6301,7 +6310,13 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
if (!this_sd)
return -1;

+ nro = atomic_read(&sd->shared->nr_overloaded);
+ if (sched_domain_overloaded(sd, nro))
+ return -1;
+
cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
+ if (nro)
+ cpumask_andnot(cpus, cpus, sdo_mask(sd->shared));

if (sched_feat(SIS_PROP) && !has_idle_core) {
u64 avg_cost, avg_idle, span_avg;
@@ -7018,6 +7033,51 @@ balance_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)

return newidle_balance(rq, rf) != 0;
}
+
+static inline bool cfs_rq_overloaded(struct rq *rq)
+{
+ return rq->cfs.h_nr_running - rq->cfs.idle_h_nr_running > 1;
+}
+
+/*
+ * Use locality-friendly rq->overloaded to cache the status of the rq
+ * to minimize the heavy cost on LLC shared data.
+ *
+ * Must be called with rq locked
+ */
+static void update_overload_status(struct rq *rq)
+{
+ struct sched_domain_shared *sds;
+ bool overloaded = cfs_rq_overloaded(rq);
+ int cpu = cpu_of(rq);
+
+ lockdep_assert_rq_held(rq);
+
+ if (rq->overloaded == overloaded)
+ return;
+
+ rcu_read_lock();
+ sds = rcu_dereference(per_cpu(sd_llc_shared, cpu));
+ if (unlikely(!sds))
+ goto unlock;
+
+ if (overloaded) {
+ cpumask_set_cpu(cpu, sdo_mask(sds));
+ atomic_inc(&sds->nr_overloaded);
+ } else {
+ cpumask_clear_cpu(cpu, sdo_mask(sds));
+ atomic_dec(&sds->nr_overloaded);
+ }
+
+ rq->overloaded = overloaded;
+unlock:
+ rcu_read_unlock();
+}
+
+#else
+
+static inline void update_overload_status(struct rq *rq) { }
+
#endif /* CONFIG_SMP */

static unsigned long wakeup_gran(struct sched_entity *se)
@@ -7365,6 +7425,8 @@ done: __maybe_unused;
if (new_tasks > 0)
goto again;

+ update_overload_status(rq);
+
/*
* rq is about to be idle, check if we need to update the
* lost_idle_time of clock_pelt
@@ -11183,6 +11245,7 @@ static void task_tick_fair(struct rq *rq, struct task_struct *curr, int queued)
if (static_branch_unlikely(&sched_numa_balancing))
task_tick_numa(rq, curr);

+ update_overload_status(rq);
update_misfit_status(curr, rq);
update_overutilized_status(task_rq(curr));

diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 3da5718cd641..afa1bb68c3ec 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1012,6 +1012,7 @@ struct rq {

 	unsigned char nohz_idle_balance;
 	unsigned char idle_balance;
+	unsigned char overloaded;

 	unsigned long misfit_task_load;

@@ -1764,6 +1765,11 @@ static inline struct sched_domain *lowest_flag_domain(int cpu, int flag)
 	return sd;
 }

+static inline struct cpumask *sdo_mask(struct sched_domain_shared *sds)
+{
+	return to_cpumask(sds->overloaded);
+}
+
 DECLARE_PER_CPU(struct sched_domain __rcu *, sd_llc);
 DECLARE_PER_CPU(int, sd_llc_size);
 DECLARE_PER_CPU(int, sd_llc_id);
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 32841c6741d1..fea1294ebd16 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1621,6 +1621,8 @@ sd_init(struct sched_domain_topology_level *tl,
 		sd->shared = *per_cpu_ptr(sdd->sds, sd_id);
 		atomic_inc(&sd->shared->ref);
 		atomic_set(&sd->shared->nr_busy_cpus, sd_weight);
+		atomic_set(&sd->shared->nr_overloaded, 0);
+		cpumask_clear(sdo_mask(sd->shared));
 	}

 	sd->private = sdd;
@@ -2086,7 +2088,7 @@ static int __sdt_alloc(const struct cpumask *cpu_map)

 			*per_cpu_ptr(sdd->sd, j) = sd;

-			sds = kzalloc_node(sizeof(struct sched_domain_shared),
+			sds = kzalloc_node(sizeof(struct sched_domain_shared) + cpumask_size(),
 					GFP_KERNEL, cpu_to_node(j));
 			if (!sds)
 				return -ENOMEM;
--
2.11.0

2022-04-12 20:11:31

by Abel Wu

Subject: Re: [RFC v2 1/2] sched/fair: filter out overloaded cpus in SIS

Hi Josh, thanks very much for the review.

On 4/12/22 9:23 AM, Josh Don Wrote:
> Hi Abel,
>
> Thanks for the patch, a few comments:
>
>> /*
>> + * It would be very unlikely to find an unoccupied cpu when system is heavily
>> + * overloaded. Even if we could, the cost might bury the benefit.
>> + */
>> +static inline bool sched_domain_overloaded(struct sched_domain *sd, int nr_overloaded)
>> +{
>> + return nr_overloaded > sd->span_weight - (sd->span_weight >> 4);
>> +}
>> +
>> +/*
>> * Scan the LLC domain for idle CPUs; this is dynamically regulated by
>> * comparing the average scan cost (tracked in sd->avg_scan_cost) against the
>> * average idle time for this rq (as found in rq->avg_idle).
>> @@ -6291,7 +6300,7 @@ static inline int select_idle_smt(struct task_struct *p, struct sched_domain *sd
>> static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool has_idle_core, int target)
>> {
>> struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
>> - int i, cpu, idle_cpu = -1, nr = INT_MAX;
>> + int i, cpu, idle_cpu = -1, nr = INT_MAX, nro;
>> struct rq *this_rq = this_rq();
>> int this = smp_processor_id();
>> struct sched_domain *this_sd;
>> @@ -6301,7 +6310,13 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
>> if (!this_sd)
>> return -1;
>>
>> + nro = atomic_read(&sd->shared->nr_overloaded);
>> + if (sched_domain_overloaded(sd, nro))
>> + return -1;
>
> This early bail out doesn't seem to be related to the main idea of
> your patch. Apart from deciding the exact heuristic value for what is

I agree that this early check isn't strongly tied to the idea of
"filter out the overloaded cpus", but it is aligned with the goal of
"search less when becoming more overloaded".

As to the heuristic value, which is about 95%, would it help if I
showed more test results? I have also tested sd->imbalance_pct and
100% (nro == sd->span_weight), and 95% seems to be the better choice.
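
For reference, the arithmetic behind that figure can be sketched in
userspace as below. This is only an illustration: domain_overloaded()
and bailout_threshold() are hypothetical names that mirror the check in
the patch, and they are not kernel functions.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace mirror of the patch's sched_domain_overloaded():
 * true once more than 15/16 of the domain's cpus are overloaded.
 */
static bool domain_overloaded(unsigned int span_weight,
			      unsigned int nr_overloaded)
{
	return nr_overloaded > span_weight - (span_weight >> 4);
}

/* Smallest nr_overloaded for which domain_overloaded() returns true. */
static unsigned int bailout_threshold(unsigned int span_weight)
{
	return span_weight - (span_weight >> 4) + 1;
}
```

With span_weight = 128 the search is skipped once 121 or more cpus are
overloaded (120/128 is 93.75%). For domains with fewer than 16 cpus the
shift yields 0, so the check can never fire.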

> considered too unlikely to find an idle cpu, this doesn't work well
> with tasks constrained by affinity; a task may have a small affinity
> mask containing idle cpus it may wake onto, even if most of the node
> is overloaded.

Yes, indeed. And I haven't come up with a solution other than removing
this check entirely. Ideas?

>
>> cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
>> + if (nro)
>> + cpumask_andnot(cpus, cpus, sdo_mask(sd->shared));
>
> To prevent us from exhausting our search attempts too quickly, this
> only needs to go under the sched_feat(SIS_PROP) && !has_idle_core case
> below. But by doing this unconditionally here, I guess your secondary
> goal is to reduce total search cost in both cases. Just wondering, did

Yes, it's unnecessary to try the overloaded cpus. But this makes sense
only if the overloaded cpumask is relatively accurate, as you pointed
out below.

> you observe significant time spent here that you are trying to
> optimize? By reducing our search space by the overload mask, it is
> important that the mask is relatively up to date, or else we could
> miss an opportunity to find an idle cpu.

I think that's why Mel asked for the SIS statistics. The result in the
cover letter shows improvement in the search efficiency, and that is
what the overhead of the cpumask calculation is traded for. Would it be
better to skip the update when nro is small?

>
>> if (sched_feat(SIS_PROP) && !has_idle_core) {
>> u64 avg_cost, avg_idle, span_avg;
>> @@ -7018,6 +7033,51 @@ balance_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
>>
>> return newidle_balance(rq, rf) != 0;
>> }
>> +
>> +static inline bool cfs_rq_overloaded(struct rq *rq)
>> +{
>> + return rq->cfs.h_nr_running - rq->cfs.idle_h_nr_running > 1;
>> +}
>
> Why > 1 instead of > 0? If a cpu is running 1 non-idle task and any
> number of idle tasks, I'd think it is still "occupied" in the way
> you've defined. We'd want to steer wakeups to cpus running 0 non-idle
> tasks.

The idea behind "> 1" is to tell the unoccupied cpus that they can pull
non-idle tasks from this cpu (in the next patch). Although "> 0" is more
efficient in wakeup, it blinds us when pulling tasks.
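
To spell out the three states this implies, here is a small userspace
sketch. The struct and helper names are hypothetical; only the two
counters and the "> 1" comparison come from the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two rq->cfs counters used by the patch. */
struct cfs_counts {
	unsigned int h_nr_running;	/* all runnable CFS tasks */
	unsigned int idle_h_nr_running;	/* SCHED_IDLE / idle-cgroup tasks */
};

static unsigned int nr_non_idle(const struct cfs_counts *c)
{
	return c->h_nr_running - c->idle_h_nr_running;
}

/* Idle or sched-idle cpu: a good wakeup target, and a puller. */
static bool is_unoccupied(const struct cfs_counts *c)
{
	return nr_non_idle(c) == 0;
}

/* At least one non-idle task can be pulled away without idling the cpu. */
static bool is_overloaded(const struct cfs_counts *c)
{
	return nr_non_idle(c) > 1;
}
```

A cpu with exactly one non-idle task is neither unoccupied nor
overloaded: it is skipped as a pull source, but it is not filtered out
of the wakeup search by the overloaded mask either.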

>
>> +static void update_overload_status(struct rq *rq)
>> +{
>> + struct sched_domain_shared *sds;
>> + bool overloaded = cfs_rq_overloaded(rq);
>> + int cpu = cpu_of(rq);
>> +
>> + lockdep_assert_rq_held(rq);
>> +
>> + if (rq->overloaded == overloaded)
>> + return;
>> +
>> + rcu_read_lock();
>> + sds = rcu_dereference(per_cpu(sd_llc_shared, cpu));
>> + if (unlikely(!sds))
>> + goto unlock;
>> +
>> + if (overloaded) {
>> + cpumask_set_cpu(cpu, sdo_mask(sds));
>> + atomic_inc(&sds->nr_overloaded);
>> + } else {
>> + cpumask_clear_cpu(cpu, sdo_mask(sds));
>> + atomic_dec(&sds->nr_overloaded);
>> + }
>
> Why are these cpu mask writes not atomic?

They are atomic. The non-atomic version is __cpumask_{set,clear}_cpu.
Did I miss something?
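
To make the distinction concrete, here is a rough userspace analogy
using C11 atomics. The mask_*() helpers are hypothetical; in the kernel,
cpumask_set_cpu() is built on set_bit() and __cpumask_set_cpu() on
__set_bit().

```c
#include <limits.h>
#include <stdatomic.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Atomic read-modify-write, analogous to set_bit()/cpumask_set_cpu(). */
static void mask_set_atomic(_Atomic unsigned long *mask, unsigned int bit)
{
	atomic_fetch_or(&mask[bit / BITS_PER_WORD],
			1UL << (bit % BITS_PER_WORD));
}

/*
 * Plain load/or/store, analogous to __set_bit()/__cpumask_set_cpu().
 * Concurrent writers to the same word can lose each other's updates.
 */
static void mask_set_nonatomic(unsigned long *mask, unsigned int bit)
{
	mask[bit / BITS_PER_WORD] |= 1UL << (bit % BITS_PER_WORD);
}

/* Returns 1 if the bit is set, 0 otherwise. */
static int mask_test(_Atomic unsigned long *mask, unsigned int bit)
{
	return (atomic_load(&mask[bit / BITS_PER_WORD]) >>
		(bit % BITS_PER_WORD)) & 1UL;
}
```

Since several cpus in one LLC can update the shared mask at the same
time, only the atomic variant is safe there.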

>
>> +
>> + rq->overloaded = overloaded;
>> +unlock:
>> + rcu_read_unlock();
>> +}
>> +
>> +#else
>> +
>> +static inline void update_overload_status(struct rq *rq) { }
>> +
>> #endif /* CONFIG_SMP */
>>
>> static unsigned long wakeup_gran(struct sched_entity *se)
>> @@ -7365,6 +7425,8 @@ done: __maybe_unused;
>> if (new_tasks > 0)
>> goto again;
>>
>> + update_overload_status(rq);
>> +
>> /*
>> * rq is about to be idle, check if we need to update the
>> * lost_idle_time of clock_pelt
>> @@ -11183,6 +11245,7 @@ static void task_tick_fair(struct rq *rq, struct task_struct *curr, int queued)
>> if (static_branch_unlikely(&sched_numa_balancing))
>> task_tick_numa(rq, curr);
>>
>> + update_overload_status(rq);
>> update_misfit_status(curr, rq);
>> update_overutilized_status(task_rq(curr));
>
> I'd caution about using task_tick and pick_next_task_fair as the
> places we set and clear overload.
>
> Some issues with task_tick:
> - ticks may be disabled in NO_HZ_FULL (an issue if we define overload
> as > 0 non-idle tasks)
> - most ticks will have the same state, so somewhat redundant checking.
> Could use an edge based trigger instead, such as enqueue/dequeue
> (somewhat similar to rq->rd->overload).

1. In NO_HZ_FULL, given an overloaded rq, say with non-idle tasks A and
B enqueued, if A is dequeued before the next tick then the tick will
be off and the rq will stay "overloaded" while it actually is not.
But this isn't necessarily a bad thing, because this cpu will be
skipped in the wakeup path, which helps improve searching efficiency.

2. Yes, that's why I use rq->overloaded to save the last updated state.
So when the overloaded state doesn't change, all we do is a simple
check on a local variable.
The enqueue/dequeue path is not well bounded, and it could be very
frequent on short-running workloads, which would introduce great
overhead from updating the LLC shared atomic/cpumask.
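
The caching in point 2 can be modelled in userspace roughly as follows.
This is a sketch under simplifying assumptions: the struct layouts, the
shared_writes counter, and the single-word mask (cpu numbers below 32)
are all made up for illustration, and locking/RCU are left out.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the LLC-shared state; not the kernel layout. */
struct llc_shared {
	atomic_int nr_overloaded;
	atomic_ulong mask;		/* one bit per cpu, cpu < 32 assumed */
	unsigned long shared_writes;	/* instrumentation for this sketch */
};

/* Hypothetical model of the per-cpu runqueue state. */
struct rq_model {
	int cpu;
	unsigned int nr_non_idle;
	bool overloaded;		/* cached copy of the last published state */
};

static void model_update_overload_status(struct rq_model *rq,
					 struct llc_shared *sds)
{
	bool overloaded = rq->nr_non_idle > 1;

	/* Common case: nothing changed, only cpu-local data is touched. */
	if (rq->overloaded == overloaded)
		return;

	if (overloaded) {
		atomic_fetch_or(&sds->mask, 1UL << rq->cpu);
		atomic_fetch_add(&sds->nr_overloaded, 1);
	} else {
		atomic_fetch_and(&sds->mask, ~(1UL << rq->cpu));
		atomic_fetch_sub(&sds->nr_overloaded, 1);
	}
	sds->shared_writes++;
	rq->overloaded = overloaded;
}
```

Calling this on every tick only dirties the shared cache lines when the
state actually flips, however many ticks pass in between.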

>
> With pick_next_task_fair():
> - there's a window between a thread dequeuing, and then scheduler
> running through to the end of pick_next_task_fair(), during which we
> falsely observe the cpu as overloaded
> - this breaks with core scheduling, since we may use pick_task_fair
> rather than pick_next_task_fair

1. I'm afraid I don't understand what the problem is, can you explain
more on this? Thanks.

2. Nice catch, I will fix it in next update. (Maybe by updating the
overloaded status in do_idle()?)

Thanks & BR,
Abel

2022-04-12 21:42:30

by Josh Don

Subject: Re: [RFC v2 1/2] sched/fair: filter out overloaded cpus in SIS

Hi Abel,

Thanks for the patch, a few comments:

> /*
> + * It would be very unlikely to find an unoccupied cpu when system is heavily
> + * overloaded. Even if we could, the cost might bury the benefit.
> + */
> +static inline bool sched_domain_overloaded(struct sched_domain *sd, int nr_overloaded)
> +{
> + return nr_overloaded > sd->span_weight - (sd->span_weight >> 4);
> +}
> +
> +/*
> * Scan the LLC domain for idle CPUs; this is dynamically regulated by
> * comparing the average scan cost (tracked in sd->avg_scan_cost) against the
> * average idle time for this rq (as found in rq->avg_idle).
> @@ -6291,7 +6300,7 @@ static inline int select_idle_smt(struct task_struct *p, struct sched_domain *sd
> static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool has_idle_core, int target)
> {
> struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
> - int i, cpu, idle_cpu = -1, nr = INT_MAX;
> + int i, cpu, idle_cpu = -1, nr = INT_MAX, nro;
> struct rq *this_rq = this_rq();
> int this = smp_processor_id();
> struct sched_domain *this_sd;
> @@ -6301,7 +6310,13 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
> if (!this_sd)
> return -1;
>
> + nro = atomic_read(&sd->shared->nr_overloaded);
> + if (sched_domain_overloaded(sd, nro))
> + return -1;

This early bail out doesn't seem to be related to the main idea of
your patch. Apart from deciding the exact heuristic value for what is
considered too unlikely to find an idle cpu, this doesn't work well
with tasks constrained by affinity; a task may have a small affinity
mask containing idle cpus it may wake onto, even if most of the node
is overloaded.

> cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> + if (nro)
> + cpumask_andnot(cpus, cpus, sdo_mask(sd->shared));

To prevent us from exhausting our search attempts too quickly, this
only needs to go under the sched_feat(SIS_PROP) && !has_idle_core case
below. But by doing this unconditionally here, I guess your secondary
goal is to reduce total search cost in both cases. Just wondering, did
you observe significant time spent here that you are trying to
optimize? By reducing our search space by the overload mask, it is
important that the mask is relatively up to date, or else we could
miss an opportunity to find an idle cpu.
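
As a toy illustration of the filtering and of the staleness concern,
assuming a domain small enough to fit in one 64-bit word (the function
name and the mask values are hypothetical):

```c
#include <stdint.h>

/*
 * Hypothetical single-word model of the mask operations in
 * select_idle_cpu(): start from span & affinity, then drop every cpu
 * currently marked overloaded.
 */
static uint64_t sis_candidates(uint64_t span, uint64_t affinity,
			       uint64_t overloaded)
{
	uint64_t cpus = span & affinity;	/* like cpumask_and() */

	return cpus & ~overloaded;		/* like cpumask_andnot() */
}
```

If a cpu has just gone idle but its bit is still set in the overloaded
mask, it drops out of the result and the wakeup cannot land on it.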

> if (sched_feat(SIS_PROP) && !has_idle_core) {
> u64 avg_cost, avg_idle, span_avg;
> @@ -7018,6 +7033,51 @@ balance_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
>
> return newidle_balance(rq, rf) != 0;
> }
> +
> +static inline bool cfs_rq_overloaded(struct rq *rq)
> +{
> + return rq->cfs.h_nr_running - rq->cfs.idle_h_nr_running > 1;
> +}

Why > 1 instead of > 0? If a cpu is running 1 non-idle task and any
number of idle tasks, I'd think it is still "occupied" in the way
you've defined. We'd want to steer wakeups to cpus running 0 non-idle
tasks.

> +static void update_overload_status(struct rq *rq)
> +{
> + struct sched_domain_shared *sds;
> + bool overloaded = cfs_rq_overloaded(rq);
> + int cpu = cpu_of(rq);
> +
> + lockdep_assert_rq_held(rq);
> +
> + if (rq->overloaded == overloaded)
> + return;
> +
> + rcu_read_lock();
> + sds = rcu_dereference(per_cpu(sd_llc_shared, cpu));
> + if (unlikely(!sds))
> + goto unlock;
> +
> + if (overloaded) {
> + cpumask_set_cpu(cpu, sdo_mask(sds));
> + atomic_inc(&sds->nr_overloaded);
> + } else {
> + cpumask_clear_cpu(cpu, sdo_mask(sds));
> + atomic_dec(&sds->nr_overloaded);
> + }

Why are these cpu mask writes not atomic?

> +
> + rq->overloaded = overloaded;
> +unlock:
> + rcu_read_unlock();
> +}
> +
> +#else
> +
> +static inline void update_overload_status(struct rq *rq) { }
> +
> #endif /* CONFIG_SMP */
>
> static unsigned long wakeup_gran(struct sched_entity *se)
> @@ -7365,6 +7425,8 @@ done: __maybe_unused;
> if (new_tasks > 0)
> goto again;
>
> + update_overload_status(rq);
> +
> /*
> * rq is about to be idle, check if we need to update the
> * lost_idle_time of clock_pelt
> @@ -11183,6 +11245,7 @@ static void task_tick_fair(struct rq *rq, struct task_struct *curr, int queued)
> if (static_branch_unlikely(&sched_numa_balancing))
> task_tick_numa(rq, curr);
>
> + update_overload_status(rq);
> update_misfit_status(curr, rq);
> update_overutilized_status(task_rq(curr));

I'd caution about using task_tick and pick_next_task_fair as the
places we set and clear overload.

Some issues with task_tick:
- ticks may be disabled in NO_HZ_FULL (an issue if we define overload
as > 0 non-idle tasks)
- most ticks will have the same state, so somewhat redundant checking.
Could use an edge based trigger instead, such as enqueue/dequeue
(somewhat similar to rq->rd->overload).

With pick_next_task_fair():
- there's a window between a thread dequeuing, and then scheduler
running through to the end of pick_next_task_fair(), during which we
falsely observe the cpu as overloaded
- this breaks with core scheduling, since we may use pick_task_fair
rather than pick_next_task_fair

Thanks,
Josh

2022-04-14 13:19:37

by Josh Don

Subject: Re: [RFC v2 1/2] sched/fair: filter out overloaded cpus in SIS

On Tue, Apr 12, 2022 at 10:55 AM Abel Wu <[email protected]> wrote:
>
> On 4/12/22 9:23 AM, Josh Don Wrote:

> >> static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool has_idle_core, int target)
> >> {
> >> struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
> >> - int i, cpu, idle_cpu = -1, nr = INT_MAX;
> >> + int i, cpu, idle_cpu = -1, nr = INT_MAX, nro;
> >> struct rq *this_rq = this_rq();
> >> int this = smp_processor_id();
> >> struct sched_domain *this_sd;
> >> @@ -6301,7 +6310,13 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
> >> if (!this_sd)
> >> return -1;
> >>
> >> + nro = atomic_read(&sd->shared->nr_overloaded);
> >> + if (sched_domain_overloaded(sd, nro))
> >> + return -1;
> >
> > This early bail out doesn't seem to be related to the main idea of
> > your patch. Apart from deciding the exact heuristic value for what is
>
> I agree that this early check doesn't seem to have a strong bound with
> the idea "filter out the overloaded cpus", but this check is aligned
> with the goal of "search less when becoming more overloaded".
>
> As to the heuristic value, which is about 95%, I think it would be nice
> if I can show more test results? I also have tested sd->imbalance_pct
> and 100% (nro == sd->span_weight), seems like 95% is a better choice.
>
> > considered too unlikely to find an idle cpu, this doesn't work well
> > with tasks constrained by affinity; a task may have a small affinity
> > mask containing idle cpus it may wake onto, even if most of the node
> > is overloaded.
>
> Yes, indeed. And I haven't come to a solution except that remove this
> check entirely. Ideas?

Does this check help that much? Given that you added the filter below
to cut out searching overloaded cpus, I would think that the below is
sufficient.

Another use case that would break with the above:

A few cpus are reserved for a job, so that it always has a couple cpus
dedicated to it. It can run across the entire machine though (no
affinity restriction). If the rest of the machine is very busy, we'd
still want to be able to search for and find the idle reserved cpus
for the job.

> >> cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> >> + if (nro)
> >> + cpumask_andnot(cpus, cpus, sdo_mask(sd->shared));
> >
> > To prevent us from exhausting our search attempts too quickly, this
> > only needs to go under the sched_feat(SIS_PROP) && !has_idle_core case
> > below. But by doing this unconditionally here, I guess your secondary
> > goal is to reduce total search cost in both cases. Just wondering, did
>
> Yes, it's unnecessary to try the overloaded cpus. But this makes sense
> only if the overloaded cpumask is relatively accurate as you pointed
> out below.
>
> > you observe significant time spent here that you are trying to
> > optimize? By reducing our search space by the overload mask, it is
> > important that the mask is relatively up to date, or else we could
> > miss an opportunity to find an idle cpu.
>
> I think that's why Mel asked for the SIS statistics. The result in the
> cover letter shows improvement on the search efficiency, and that is
> what the overhead of the cpumask calculation trade for. Would it be
> better if skip the update when nro is small?

Just pointing out that with a very fast wake/sleep rate, you could hit
cases where you potentially fail to consider waking onto a cpu that is
actually idle. But I think this concern is addressed below.

> >
> >> if (sched_feat(SIS_PROP) && !has_idle_core) {
> >> u64 avg_cost, avg_idle, span_avg;
> >> @@ -7018,6 +7033,51 @@ balance_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
> >>
> >> return newidle_balance(rq, rf) != 0;
> >> }
> >> +
> >> +static inline bool cfs_rq_overloaded(struct rq *rq)
> >> +{
> >> + return rq->cfs.h_nr_running - rq->cfs.idle_h_nr_running > 1;
> >> +}
> >
> > Why > 1 instead of > 0? If a cpu is running 1 non-idle task and any
> > number of idle tasks, I'd think it is still "occupied" in the way
> > you've defined. We'd want to steer wakeups to cpus running 0 non-idle
> > tasks.
>
> The idea behind "> 1" is telling the unoccupied cpus to pull non-idle
> tasks from it (in the next patch). Although "> 0" is more efficient in
> wakeup, it blinds us when pulling tasks.

Ok, I was a bit confused because there are different considerations
for >=1 and >=2 non-idle tasks.

So you consider >= 2 non-idle tasks to be "overloaded". TBH I do
prefer this to ">=1" for the wakeup filtering anyway, because if
there are at least two tasks, that makes it less likely for us to race
with seeing a spurious wakeup/sleep causing a cpu to be fully
idle/non-idle (ie. we have more confidence that we can safely filter
out the overload mask).

> > Why are these cpu mask writes not atomic?
>
> They are atomic. The non-atomic version is __cpumask_{set,clear}_cpu.
> Did I miss something?

Ah, I confused these, my bad.

> > I'd caution about using task_tick and pick_next_task_fair as the
> > places we set and clear overload.
> >
> > Some issues with task_tick:
> > - ticks may be disabled in NO_HZ_FULL (an issue if we define overload
> > as > 0 non-idle tasks)
> > - most ticks will have the same state, so somewhat redundant checking.
> > Could use an edge based trigger instead, such as enqueue/dequeue
> > (somewhat similar to rq->rd->overload).
>
> 1. In NO_HZ_FULL, given rq is overloaded, say have non-idle task A and
> B enqueued, if A is dequeued before next tick then tick will be off
> and the rq will keep "overloaded" while it's actually not. But this
> doesn't necessarily be a bad thing because this cpu will be skipped
> in wakeup path which helps in improving searching efficiency.

Yea this concern is alleviated because overload is actually >=2 tasks
(I had been incorrectly assuming you wanted to mark overload for >=1
non-idle task).

> 2. Yes, that's why I use rq->overloaded to save the last update state.
> So when the overloaded state doesn't change, what we all do is a
> simple check on a local variable.
> The enqueue/dequeue path is not well bounded, and it could be very
> frequent on short running workloads, which would introduce great
> overhead to update the LLC shared atomic/cpumask.

Yea, the frequent update would be an issue. I now see the check on the
cpu-local variable.

So the rate limit on updates comes from the fact that
!overloaded->overloaded requires a tick. We can quickly go from
overloaded->!overloaded, but will take another tick until we can go
back to overloaded.

> > With pick_next_task_fair():
> > - there's a window between a thread dequeuing, and then scheduler
> > running through to the end of pick_next_task_fair(), during which we
> > falsely observe the cpu as overloaded
> > - this breaks with core scheduling, since we may use pick_task_fair
> > rather than pick_next_task_fair
>
> 1. I'm afraid I don't understand what the problem is, can you explain
> more on this? Thanks.

Can ignore this comment, I don't think it is relevant given that this
isn't really a regression vs. the latency between the last thread
dequeuing and available_idle_cpu() returning true.

> 2. Nice catch, I will fix it in next update. (Maybe by updating the
> overloaded status in do_idle()?)

Ideally can catch it before we actually switch to rq->idle (just
trying to minimize latency to mark as !overloaded).

Thanks,
Josh

2022-04-16 01:44:49

by Josh Don

Subject: Re: [RFC v2 1/2] sched/fair: filter out overloaded cpus in SIS

> > Does this check help that much? Given that you added the filter below
> > to cut out searching overloaded cpus, I would think that the below is
> > sufficient.
>
> I see a ~10% performance drop in the higher load part of the hackbench
> and tbench without this check, in which cases system is quite overloaded
> and idle cpus can hardly exist.
>
> >
> > Another use case that would break with the above:
> >
> > A few cpus are reserved for a job, so that it always has a couple cpus
> > dedicated to it. It can run across the entire machine though (no
> > affinity restriction). If the rest of the machine is very busy, we'd
> > still want to be able to search for and find the idle reserved cpus
> > for the job.
>
> Yes, this could be true if very few cpus are reserved for the job. Along
> with the previous affinity case, I think the following might help both:
>
> static inline bool
> sched_domain_overloaded(struct sched_domain *sd, int nr_overloaded)
> {
> return nr_overloaded == sd->span_weight;
> }
>
> Besides, I think sched_idle_balance() will work well on this case.

The change to sched_domain_overloaded SGTM. But note that an async
load balancing operation such as sched_idle_balance() can't be relied
on for keeping wakeup latency low if we fail to find an idle cpu to
wake on (and one exists).

2022-04-16 01:48:26

by Abel Wu

Subject: Re: [RFC v2 1/2] sched/fair: filter out overloaded cpus in SIS

On 4/14/22 7:49 AM, Josh Don Wrote:
> On Tue, Apr 12, 2022 at 10:55 AM Abel Wu <[email protected]> wrote:
>>
>> On 4/12/22 9:23 AM, Josh Don Wrote:
>
>>>> static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool has_idle_core, int target)
>>>> {
>>>> struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
>>>> - int i, cpu, idle_cpu = -1, nr = INT_MAX;
>>>> + int i, cpu, idle_cpu = -1, nr = INT_MAX, nro;
>>>> struct rq *this_rq = this_rq();
>>>> int this = smp_processor_id();
>>>> struct sched_domain *this_sd;
>>>> @@ -6301,7 +6310,13 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
>>>> if (!this_sd)
>>>> return -1;
>>>>
>>>> + nro = atomic_read(&sd->shared->nr_overloaded);
>>>> + if (sched_domain_overloaded(sd, nro))
>>>> + return -1;
>>>
>>> This early bail out doesn't seem to be related to the main idea of
>>> your patch. Apart from deciding the exact heuristic value for what is
>>
>> I agree that this early check doesn't seem to have a strong bound with
>> the idea "filter out the overloaded cpus", but this check is aligned
>> with the goal of "search less when becoming more overloaded".
>>
>> As to the heuristic value, which is about 95%, I think it would be nice
>> if I can show more test results? I also have tested sd->imbalance_pct
>> and 100% (nro == sd->span_weight), seems like 95% is a better choice.
>>
>>> considered too unlikely to find an idle cpu, this doesn't work well
>>> with tasks constrained by affinity; a task may have a small affinity
>>> mask containing idle cpus it may wake onto, even if most of the node
>>> is overloaded.
>>
>> Yes, indeed. And I haven't come to a solution except that remove this
>> check entirely. Ideas?
>
> Does this check help that much? Given that you added the filter below
> to cut out searching overloaded cpus, I would think that the below is
> sufficient.

I see a ~10% performance drop in the higher-load part of hackbench and
tbench without this check, in which case the system is quite overloaded
and idle cpus hardly exist.

>
> Another use case that would break with the above:
>
> A few cpus are reserved for a job, so that it always has a couple cpus
> dedicated to it. It can run across the entire machine though (no
> affinity restriction). If the rest of the machine is very busy, we'd
> still want to be able to search for and find the idle reserved cpus
> for the job.

Yes, this could be true if very few cpus are reserved for the job. Along
with the previous affinity case, I think the following might help both:

static inline bool
sched_domain_overloaded(struct sched_domain *sd, int nr_overloaded)
{
return nr_overloaded == sd->span_weight;
}

Besides, I think sched_idle_balance() will work well in this case.
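
To illustrate how the two checks differ on the reserved-cpu case, here
is a userspace sketch. The function names and the numbers in the note
below are hypothetical; the two comparisons are the ones from the
posted patch and from the proposal above.

```c
#include <stdbool.h>

/* Check from the posted patch: bail out above 15/16 of the domain. */
static bool overloaded_posted(unsigned int span_weight,
			      unsigned int nr_overloaded)
{
	return nr_overloaded > span_weight - (span_weight >> 4);
}

/* Relaxed check: bail out only when every cpu in the domain is overloaded. */
static bool overloaded_relaxed(unsigned int span_weight,
			       unsigned int nr_overloaded)
{
	return nr_overloaded == span_weight;
}
```

Take a 128-cpu LLC where 4 reserved cpus sit idle and the other 124 are
overloaded. The posted check skips the search (124 > 120), while the
relaxed one still scans and can find the reserved cpus.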

>
>>>> cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
>>>> + if (nro)
>>>> + cpumask_andnot(cpus, cpus, sdo_mask(sd->shared));
>>>
>>> To prevent us from exhausting our search attempts too quickly, this
>>> only needs to go under the sched_feat(SIS_PROP) && !has_idle_core case
>>> below. But by doing this unconditionally here, I guess your secondary
>>> goal is to reduce total search cost in both cases. Just wondering, did
>>
>> Yes, it's unnecessary to try the overloaded cpus. But this makes sense
>> only if the overloaded cpumask is relatively accurate as you pointed
>> out below.
>>
>>> you observe significant time spent here that you are trying to
>>> optimize? By reducing our search space by the overload mask, it is
>>> important that the mask is relatively up to date, or else we could
>>> miss an opportunity to find an idle cpu.
>>
>> I think that's why Mel asked for the SIS statistics. The result in the
>> cover letter shows improvement on the search efficiency, and that is
>> what the overhead of the cpumask calculation trade for. Would it be
>> better if skip the update when nro is small?
>
> Just pointing out that with a very fast wake/sleep rate, you could hit
> cases where you potentially fail to consider waking onto a cpu that is
> actually idle. But I think this concern is addressed below.
>
>>>
>>>> if (sched_feat(SIS_PROP) && !has_idle_core) {
>>>> u64 avg_cost, avg_idle, span_avg;
>>>> @@ -7018,6 +7033,51 @@ balance_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
>>>>
>>>> return newidle_balance(rq, rf) != 0;
>>>> }
>>>> +
>>>> +static inline bool cfs_rq_overloaded(struct rq *rq)
>>>> +{
>>>> + return rq->cfs.h_nr_running - rq->cfs.idle_h_nr_running > 1;
>>>> +}
>>>
>>> Why > 1 instead of > 0? If a cpu is running 1 non-idle task and any
>>> number of idle tasks, I'd think it is still "occupied" in the way
>>> you've defined. We'd want to steer wakeups to cpus running 0 non-idle
>>> tasks.
>>
>> The idea behind "> 1" is telling the unoccupied cpus to pull non-idle
>> tasks from it (in the next patch). Although "> 0" is more efficient in
>> wakeup, it blinds us when pulling tasks.
>
> Ok, I was a bit confused because there are different considerations
> for >=1 and >=2 non-idle tasks.
>
> So you consider >= 2 non-idle tasks to be "overloaded". TBH I do
> prefer this than ">=1" for the wakeup filtering anyway, because if
> there are at least two tasks, that makes it less likely for us to race
> with seeing a spurious wakeup/sleep causing a cpu to be fully
> idle/non-idle (ie. we have more confidence that we can safely filter
> out the overload mask).

Agreed.

>
>>> Why are these cpu mask writes not atomic?
>>
>> They are atomic. The non-atomic version is __cpumask_{set,clear}_cpu.
>> Did I miss something?
>
> Ah, I confused these, my bad.
>
>>> I'd caution about using task_tick and pick_next_task_fair as the
>>> places we set and clear overload.
>>>
>>> Some issues with task_tick:
>>> - ticks may be disabled in NO_HZ_FULL (an issue if we define overload
>>> as > 0 non-idle tasks)
>>> - most ticks will have the same state, so somewhat redundant checking.
>>> Could use an edge based trigger instead, such as enqueue/dequeue
>>> (somewhat similar to rq->rd->overload).
>>
>> 1. In NO_HZ_FULL, given rq is overloaded, say have non-idle task A and
>> B enqueued, if A is dequeued before next tick then tick will be off
>> and the rq will keep "overloaded" while it's actually not. But this
>> doesn't necessarily be a bad thing because this cpu will be skipped
>> in wakeup path which helps in improving searching efficiency.
>
> Yea this concern is alleviated because overload is actually >=2 tasks
> (I had been incorrectly assuming you wanted to mark overload for >=1
> non-idle task.
>
>> 2. Yes, that's why I use rq->overloaded to save the last update state.
>> So when the overloaded state doesn't change, what we all do is a
>> simple check on a local variable.
>> The enqueue/dequeue path is not well bounded, and it could be very
>> frequent on short running workloads, which would introduce great
>> overhead to update the LLC shared atomic/cpumask.
>
> Yea, the frequent update would be an issue. I now see the check on the
> cpu-local variable.
>
> So the rate limit on updates comes from the fact that
> !overloaded->overloaded requires a tick. We can quickly go from
> overloaded->!overloaded, but will take another tick until we can go
> back to overloaded.

From the SIS's point of view, yes. But sched_idle_balance() will help
in the overloaded->!overloaded transition, since an update will be
issued on the overloaded cpu after pulling. This tendency towards
!overload makes the overload bits more reliable than the !overload bits
in the cpumask, and in the worst case, that is when all bits are
!overload, it just falls back to the original SIS code.

>
>>> With pick_next_task_fair():
>>> - there's a window between a thread dequeuing, and then scheduler
>>> running through to the end of pick_next_task_fair(), during which we
>>> falsely observe the cpu as overloaded
>>> - this breaks with core scheduling, since we may use pick_task_fair
>>> rather than pick_next_task_fair
>>
>> 1. I'm afraid I don't understand what the problem is, can you explain
>> more on this? Thanks.
>
> Can ignore this comment, I don't think it is relevant given that this
> isn't really a regression vs. the latency between the last thread
> dequeuing and available_idle_cpu() returning true.
>
>> 2. Nice catch, I will fix it in next update. (Maybe by updating the
>> overloaded status in do_idle()?)
>
> Ideally can catch it before we actually switch to rq->idle (just
> trying to minimize latency to mark as !overloaded).

Makes sense.

Thanks & BR,
Abel

2022-04-25 22:28:02

by kernel test robot

Subject: [sched/fair] 6b433275e3: stress-ng.sock.ops_per_sec 16.2% improvement


(please note that we also observed a performance drop,
"netperf: netperf.Throughput_Mbps -44.4% regression",
but the data is not stable, especially the data for 6b433275e3.
details are as below)

Greetings,

FYI, we noticed a 16.2% improvement of stress-ng.sock.ops_per_sec due to commit:


commit: 6b433275e3a3cf18a16c0d2afb7fbc7a43bd3dbc ("[RFC v2 1/2] sched/fair: filter out overloaded cpus in SIS")
url: https://github.com/intel-lab-lkp/linux/commits/Abel-Wu/introduece-sched-idle-balance/20220409-215303
base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git 089c02ae2771a14af2928c59c56abfb9b885a8d7
patch link: https://lore.kernel.org/lkml/[email protected]

in testcase: stress-ng
on test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz with 128G memory
with following parameters:

nr_threads: 100%
testtime: 60s
class: network
test: sock
cpufreq_governor: performance
ucode: 0xd000331


In addition to that, the commit also has a significant impact on the following tests:

+------------------+-------------------------------------------------------------------------------------+
| testcase: change | netperf: netperf.Throughput_Mbps -44.4% regression                                  |
| test machine     | 192 threads 4 sockets Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory |
| test parameters  | cluster=cs-localhost                                                                |
|                  | cpufreq_governor=performance                                                        |
|                  | ip=ipv4                                                                             |
|                  | nr_threads=200%                                                                     |
|                  | runtime=900s                                                                        |
|                  | test=TCP_STREAM                                                                     |
|                  | ucode=0x500320a                                                                     |
+------------------+-------------------------------------------------------------------------------------+
| testcase: change | stress-ng: stress-ng.fifo.ops_per_sec 26.8% improvement                             |
| test machine     | 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz with 128G memory |
| test parameters  | class=pipe                                                                          |
|                  | cpufreq_governor=performance                                                        |
|                  | nr_threads=100%                                                                     |
|                  | test=fifo                                                                           |
|                  | testtime=60s                                                                        |
|                  | ucode=0xd000331                                                                     |
+------------------+-------------------------------------------------------------------------------------+




Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file

# if you come across any failure that blocks the test,
# please remove the ~/.lkp and /lkp dirs to run from a clean state.

=========================================================================================
class/compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime/ucode:
network/gcc-11/performance/x86_64-rhel-8.3/100%/debian-10.4-x86_64-20200603.cgz/lkp-icl-2sp6/sock/stress-ng/60s/0xd000331

commit:
089c02ae27 ("ftrace: Use preemption model accessors for trace header printout")
6b433275e3 ("sched/fair: filter out overloaded cpus in SIS")

089c02ae2771a14a 6b433275e3a3cf18a16c0d2afb7
---------------- ---------------------------
%stddev %change %stddev
\ | \
2770364 +16.2% 3218069 stress-ng.sock.ops
46171 +16.2% 53633 stress-ng.sock.ops_per_sec
5.952e+08 -30.3% 4.151e+08 ± 2% stress-ng.time.involuntary_context_switches
9199 +4.1% 9579 stress-ng.time.percent_of_cpu_this_job_got
5427 +4.4% 5666 stress-ng.time.system_time
6.112e+08 -28.0% 4.403e+08 stress-ng.time.voluntary_context_switches
11029514 ± 2% +21.5% 13402519 ± 2% cpuidle..usage
18689062 -29.1% 13251715 vmstat.system.cs
60526 ± 23% -42.3% 34928 ± 23% numa-meminfo.node1.Active
60526 ± 23% -42.3% 34928 ± 23% numa-meminfo.node1.Active(anon)
32260894 ± 50% +135.6% 76016130 ± 6% numa-numastat.node0.local_node
32321995 ± 49% +135.3% 76056280 ± 6% numa-numastat.node0.numa_hit
62966 ± 24% -41.7% 36737 ± 18% meminfo.Active
62966 ± 24% -41.7% 36737 ± 18% meminfo.Active(anon)
120624 ± 11% +20.5% 145331 ± 6% meminfo.Mapped
5.48 ± 9% +0.7 6.21 ± 10% mpstat.cpu.all.idle%
1.23 +0.2 1.45 ± 3% mpstat.cpu.all.irq%
22.46 -3.4 19.03 mpstat.cpu.all.soft%
32324845 ± 49% +135.3% 76071014 ± 6% numa-vmstat.node0.numa_hit
32263743 ± 50% +135.7% 76030863 ± 6% numa-vmstat.node0.numa_local
14955 ± 24% -43.9% 8397 ± 24% numa-vmstat.node1.nr_active_anon
14955 ± 24% -43.9% 8397 ± 24% numa-vmstat.node1.nr_zone_active_anon
336.75 ± 14% +28.8% 433.89 ± 11% sched_debug.cfs_rq:/.runnable_avg.stddev
-574367 +51.0% -867232 sched_debug.cfs_rq:/.spread0.min
4523462 -28.5% 3234077 sched_debug.cpu.nr_switches.avg
5422243 ± 10% -33.1% 3628496 ± 4% sched_debug.cpu.nr_switches.max
2895 +3.0% 2982 turbostat.Bzy_MHz
8257844 ± 3% +29.9% 10727822 ± 2% turbostat.C1
1695590 ± 16% -39.8% 1020687 ± 14% turbostat.C1E
183102 ±105% +167.3% 489441 ± 6% turbostat.C6
1.87 ±109% +3.3 5.19 ± 13% turbostat.C6%
1.64 ±122% +200.5% 4.94 ± 13% turbostat.CPU%c6
0.28 ± 7% -16.4% 0.24 turbostat.IPC
874435 ± 2% +31.4% 1148985 ± 3% turbostat.POLL
0.02 +0.0 0.03 turbostat.POLL%
15257 ± 23% -41.6% 8903 ± 25% proc-vmstat.nr_active_anon
98281 ± 3% +6.9% 105084 ± 3% proc-vmstat.nr_inactive_anon
30793 ± 12% +19.1% 36686 ± 7% proc-vmstat.nr_mapped
87672 +1.4% 88924 proc-vmstat.nr_slab_unreclaimable
15257 ± 23% -41.6% 8903 ± 25% proc-vmstat.nr_zone_active_anon
98281 ± 3% +6.9% 105084 ± 3% proc-vmstat.nr_zone_inactive_anon
1.005e+08 ± 2% +56.5% 1.573e+08 proc-vmstat.numa_hit
1.004e+08 ± 2% +56.6% 1.572e+08 proc-vmstat.numa_local
2209 ± 14% +28.4% 2837 ± 9% proc-vmstat.numa_pages_migrated
1.004e+08 ± 2% +56.5% 1.572e+08 proc-vmstat.pgalloc_normal
435497 +2.0% 444333 proc-vmstat.pgfault
1.002e+08 ± 2% +56.6% 1.57e+08 proc-vmstat.pgfree
2209 ± 14% +28.4% 2837 ± 9% proc-vmstat.pgmigrate_success
8.30 ± 3% +65.2% 13.72 perf-stat.i.MPKI
5.662e+10 -10.7% 5.059e+10 perf-stat.i.branch-instructions
5.571e+08 -13.4% 4.824e+08 perf-stat.i.branch-misses
11.62 ± 4% -2.3 9.29 perf-stat.i.cache-miss-rate%
2.871e+08 ± 6% +16.8% 3.354e+08 perf-stat.i.cache-misses
2.404e+09 ± 2% +47.0% 3.533e+09 perf-stat.i.cache-references
19355289 -28.7% 13800045 perf-stat.i.context-switches
1.20 +15.6% 1.39 perf-stat.i.cpi
3.491e+11 +3.1% 3.601e+11 perf-stat.i.cpu-cycles
102803 ± 2% +154.7% 261853 ± 2% perf-stat.i.cpu-migrations
1397 ± 8% -15.3% 1184 perf-stat.i.cycles-between-cache-misses
0.01 ± 40% +0.0 0.01 ± 2% perf-stat.i.dTLB-load-miss-rate%
2927467 ± 8% +98.1% 5798140 ± 4% perf-stat.i.dTLB-load-misses
8.529e+10 -11.0% 7.595e+10 perf-stat.i.dTLB-loads
0.01 ± 9% +0.0 0.02 ± 4% perf-stat.i.dTLB-store-miss-rate%
3922667 ± 4% +51.1% 5927414 ± 5% perf-stat.i.dTLB-store-misses
4.909e+10 -11.3% 4.355e+10 perf-stat.i.dTLB-stores
2.892e+11 -10.9% 2.578e+11 perf-stat.i.instructions
0.83 -13.5% 0.72 perf-stat.i.ipc
12.76 +3.0% 13.14 perf-stat.i.major-faults
2.73 +3.1% 2.81 perf-stat.i.metric.GHz
1511 -10.2% 1356 perf-stat.i.metric.M/sec
6.04 ± 9% +1.5 7.54 ± 3% perf-stat.i.node-load-miss-rate%
3432454 ± 7% +37.7% 4726278 ± 5% perf-stat.i.node-load-misses
4.08 ± 5% +1.4 5.51 ± 3% perf-stat.i.node-store-miss-rate%
2905147 ± 3% +100.9% 5837508 ± 2% perf-stat.i.node-store-misses
94641010 ± 7% +29.5% 1.226e+08 perf-stat.i.node-stores
8.31 ± 3% +64.7% 13.68 perf-stat.overall.MPKI
0.98 -0.0 0.95 perf-stat.overall.branch-miss-rate%
11.99 ± 4% -2.5 9.52 perf-stat.overall.cache-miss-rate%
1.21 +15.6% 1.40 perf-stat.overall.cpi
1216 ± 6% -11.9% 1072 perf-stat.overall.cycles-between-cache-misses
0.00 ± 8% +0.0 0.01 ± 3% perf-stat.overall.dTLB-load-miss-rate%
0.01 ± 4% +0.0 0.01 ± 4% perf-stat.overall.dTLB-store-miss-rate%
0.83 -13.5% 0.72 perf-stat.overall.ipc
3.92 ± 11% +1.9 5.79 ± 5% perf-stat.overall.node-load-miss-rate%
2.99 ± 7% +1.6 4.54 ± 3% perf-stat.overall.node-store-miss-rate%
5.58e+10 -10.5% 4.994e+10 perf-stat.ps.branch-instructions
5.489e+08 -13.3% 4.76e+08 perf-stat.ps.branch-misses
2.84e+08 ± 6% +16.6% 3.312e+08 perf-stat.ps.cache-misses
2.368e+09 ± 2% +47.0% 3.48e+09 perf-stat.ps.cache-references
19072464 -28.5% 13641191 perf-stat.ps.context-switches
3.442e+11 +3.2% 3.55e+11 perf-stat.ps.cpu-cycles
100880 ± 2% +155.1% 257340 ± 3% perf-stat.ps.cpu-migrations
2896052 ± 8% +96.9% 5703524 ± 4% perf-stat.ps.dTLB-load-misses
8.405e+10 -10.8% 7.497e+10 perf-stat.ps.dTLB-loads
3866307 ± 4% +50.8% 5831234 ± 5% perf-stat.ps.dTLB-store-misses
4.837e+10 -11.1% 4.3e+10 perf-stat.ps.dTLB-stores
2.851e+11 -10.7% 2.545e+11 perf-stat.ps.instructions
3390204 ± 7% +37.5% 4661478 ± 5% perf-stat.ps.node-load-misses
2868113 ± 3% +100.3% 5743781 ± 2% perf-stat.ps.node-store-misses
93732924 ± 8% +29.0% 1.209e+08 perf-stat.ps.node-stores
1.806e+13 -11.0% 1.608e+13 perf-stat.total.instructions
19.88 ± 2% -5.6 14.29 perf-profile.calltrace.cycles-pp.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg
18.37 ± 2% -5.1 13.29 perf-profile.calltrace.cycles-pp.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames.tcp_sendmsg_locked
23.38 -5.1 18.33 perf-profile.calltrace.cycles-pp.tcp_write_xmit.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg
23.67 -5.1 18.61 perf-profile.calltrace.cycles-pp.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg.__sys_sendto
23.43 -4.5 18.92 perf-profile.calltrace.cycles-pp.do_softirq.__local_bh_enable_ip.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb
23.11 -4.4 18.69 perf-profile.calltrace.cycles-pp.__softirqentry_text_start.do_softirq.__local_bh_enable_ip.ip_finish_output2.__ip_queue_xmit
32.14 -4.3 27.80 perf-profile.calltrace.cycles-pp.tcp_recvmsg_locked.tcp_recvmsg.inet_recvmsg.__sys_recvfrom.__x64_sys_recvfrom
21.73 -4.0 17.72 perf-profile.calltrace.cycles-pp.net_rx_action.__softirqentry_text_start.do_softirq.__local_bh_enable_ip.ip_finish_output2
18.02 ± 2% -4.0 14.07 perf-profile.calltrace.cycles-pp.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames
21.20 ± 2% -3.8 17.41 perf-profile.calltrace.cycles-pp.__napi_poll.net_rx_action.__softirqentry_text_start.do_softirq.__local_bh_enable_ip
21.09 ± 2% -3.8 17.33 perf-profile.calltrace.cycles-pp.process_backlog.__napi_poll.net_rx_action.__softirqentry_text_start.do_softirq
40.66 -3.7 36.96 perf-profile.calltrace.cycles-pp.recv
18.40 ± 2% -3.5 14.87 perf-profile.calltrace.cycles-pp.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog.__napi_poll.net_rx_action
20.08 ± 2% -3.5 16.58 perf-profile.calltrace.cycles-pp.__netif_receive_skb_one_core.process_backlog.__napi_poll.net_rx_action.__softirqentry_text_start
13.86 ± 2% -3.4 10.49 perf-profile.calltrace.cycles-pp.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish
17.69 ± 2% -3.4 14.33 perf-profile.calltrace.cycles-pp.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog.__napi_poll
38.54 -3.3 35.21 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.recv
17.30 -3.3 13.98 perf-profile.calltrace.cycles-pp.tcp_v4_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog
14.71 ± 2% -3.3 11.42 perf-profile.calltrace.cycles-pp.tcp_v4_do_rcv.tcp_v4_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core
38.16 -3.3 34.90 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.recv
11.91 -3.1 8.80 perf-profile.calltrace.cycles-pp.__tcp_transmit_skb.tcp_recvmsg_locked.tcp_recvmsg.inet_recvmsg.__sys_recvfrom
37.21 -3.1 34.12 perf-profile.calltrace.cycles-pp.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe.recv
37.06 -3.1 34.00 perf-profile.calltrace.cycles-pp.__sys_recvfrom.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe.recv
35.66 -3.0 32.61 perf-profile.calltrace.cycles-pp.inet_recvmsg.__sys_recvfrom.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe
35.39 -3.0 32.39 perf-profile.calltrace.cycles-pp.tcp_recvmsg.inet_recvmsg.__sys_recvfrom.__x64_sys_recvfrom.do_syscall_64
11.15 -2.8 8.32 perf-profile.calltrace.cycles-pp.__ip_queue_xmit.__tcp_transmit_skb.tcp_recvmsg_locked.tcp_recvmsg.inet_recvmsg
8.63 ± 3% -2.8 5.84 perf-profile.calltrace.cycles-pp.sock_def_readable.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv.ip_protocol_deliver_rcu
8.50 ± 3% -2.8 5.73 perf-profile.calltrace.cycles-pp.__wake_up_common_lock.sock_def_readable.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv
9.58 ± 3% -2.7 6.84 perf-profile.calltrace.cycles-pp.sk_wait_data.tcp_recvmsg_locked.tcp_recvmsg.inet_recvmsg.__sys_recvfrom
8.25 ± 3% -2.7 5.54 perf-profile.calltrace.cycles-pp.__wake_up_common.__wake_up_common_lock.sock_def_readable.tcp_rcv_established.tcp_v4_do_rcv
15.50 ± 2% -2.7 12.83 perf-profile.calltrace.cycles-pp.__local_bh_enable_ip.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit
8.02 ± 3% -2.7 5.37 perf-profile.calltrace.cycles-pp.try_to_wake_up.__wake_up_common.__wake_up_common_lock.sock_def_readable.tcp_rcv_established
8.25 ± 3% -2.5 5.78 perf-profile.calltrace.cycles-pp.wait_woken.sk_wait_data.tcp_recvmsg_locked.tcp_recvmsg.inet_recvmsg
9.96 -2.4 7.52 perf-profile.calltrace.cycles-pp.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb.tcp_recvmsg_locked.tcp_recvmsg
7.88 ± 3% -2.4 5.44 perf-profile.calltrace.cycles-pp.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked.tcp_recvmsg
7.70 ± 3% -2.4 5.32 perf-profile.calltrace.cycles-pp.schedule.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
7.48 ± 3% -2.3 5.16 perf-profile.calltrace.cycles-pp.__schedule.schedule.schedule_timeout.wait_woken.sk_wait_data
8.16 -1.9 6.26 perf-profile.calltrace.cycles-pp.__local_bh_enable_ip.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb.tcp_recvmsg_locked
4.87 ± 2% -1.8 3.05 ± 2% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.send
4.69 ± 2% -1.8 2.90 ± 2% perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.send
3.83 ± 2% -1.6 2.28 ± 2% perf-profile.calltrace.cycles-pp.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
3.66 ± 2% -1.5 2.19 ± 2% perf-profile.calltrace.cycles-pp.__schedule.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode
2.98 ± 3% -1.1 1.91 ± 2% perf-profile.calltrace.cycles-pp.select_task_rq.try_to_wake_up.__wake_up_common.__wake_up_common_lock.sock_def_readable
2.88 ± 3% -1.0 1.84 ± 2% perf-profile.calltrace.cycles-pp.select_task_rq_fair.select_task_rq.try_to_wake_up.__wake_up_common.__wake_up_common_lock
4.00 ± 2% -1.0 2.98 perf-profile.calltrace.cycles-pp.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.63 ± 3% -0.9 1.68 ± 2% perf-profile.calltrace.cycles-pp.select_idle_sibling.select_task_rq_fair.select_task_rq.try_to_wake_up.__wake_up_common
2.45 ± 2% -0.8 1.63 ± 2% perf-profile.calltrace.cycles-pp.tcp_ack.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv.ip_protocol_deliver_rcu
2.85 ± 3% -0.8 2.02 perf-profile.calltrace.cycles-pp.dequeue_task_fair.__schedule.schedule.schedule_timeout.wait_woken
2.61 ± 2% -0.8 1.81 perf-profile.calltrace.cycles-pp.ttwu_do_activate.try_to_wake_up.__wake_up_common.__wake_up_common_lock.sock_def_readable
2.52 ± 3% -0.8 1.75 perf-profile.calltrace.cycles-pp.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.__wake_up_common.__wake_up_common_lock
2.31 ± 2% -0.6 1.70 perf-profile.calltrace.cycles-pp.dev_hard_start_xmit.__dev_queue_xmit.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb
2.18 -0.6 1.60 ± 2% perf-profile.calltrace.cycles-pp.__dev_queue_xmit.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit
2.22 -0.6 1.65 perf-profile.calltrace.cycles-pp.xmit_one.dev_hard_start_xmit.__dev_queue_xmit.ip_finish_output2.__ip_queue_xmit
1.76 ± 3% -0.6 1.21 ± 2% perf-profile.calltrace.cycles-pp.select_idle_cpu.select_idle_sibling.select_task_rq_fair.select_task_rq.try_to_wake_up
1.64 ± 2% -0.5 1.10 ± 3% perf-profile.calltrace.cycles-pp.tcp_clean_rtx_queue.tcp_ack.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv
2.02 -0.5 1.52 perf-profile.calltrace.cycles-pp.loopback_xmit.xmit_one.dev_hard_start_xmit.__dev_queue_xmit.ip_finish_output2
0.98 ± 3% -0.4 0.59 ± 4% perf-profile.calltrace.cycles-pp.pick_next_task_fair.__schedule.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare
1.02 ± 2% -0.4 0.65 ± 2% perf-profile.calltrace.cycles-pp.switch_mm_irqs_off.__schedule.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare
1.53 ± 2% -0.4 1.18 perf-profile.calltrace.cycles-pp.__dev_queue_xmit.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb.tcp_recvmsg_locked
1.15 ± 2% -0.3 0.84 perf-profile.calltrace.cycles-pp.switch_mm_irqs_off.__schedule.schedule.schedule_timeout.wait_woken
1.03 ± 3% -0.3 0.72 ± 2% perf-profile.calltrace.cycles-pp.enqueue_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.__wake_up_common
0.94 ± 2% -0.3 0.64 ± 2% perf-profile.calltrace.cycles-pp.__tcp_send_ack.tcp_recvmsg_locked.tcp_recvmsg.inet_recvmsg.__sys_recvfrom
0.87 ± 2% -0.3 0.58 ± 2% perf-profile.calltrace.cycles-pp.__alloc_skb.__tcp_send_ack.tcp_recvmsg_locked.tcp_recvmsg.inet_recvmsg
1.59 -0.3 1.32 perf-profile.calltrace.cycles-pp.__alloc_skb.tcp_stream_alloc_skb.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg
0.75 ± 2% -0.2 0.54 ± 2% perf-profile.calltrace.cycles-pp.pick_next_task_fair.__schedule.schedule.schedule_timeout.wait_woken
0.98 ± 3% -0.2 0.80 ± 2% perf-profile.calltrace.cycles-pp.dequeue_entity.dequeue_task_fair.__schedule.schedule.schedule_timeout
1.78 -0.1 1.65 perf-profile.calltrace.cycles-pp.tcp_stream_alloc_skb.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg.__sys_sendto
0.71 ± 2% -0.1 0.59 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.recv
0.74 -0.1 0.67 perf-profile.calltrace.cycles-pp.tcp_rcv_space_adjust.tcp_recvmsg_locked.tcp_recvmsg.inet_recvmsg.__sys_recvfrom
0.70 ± 3% -0.0 0.65 perf-profile.calltrace.cycles-pp.aa_sk_perm.security_socket_recvmsg.sock_recvmsg.__sys_recvfrom.__x64_sys_recvfrom
0.64 -0.0 0.60 ± 2% perf-profile.calltrace.cycles-pp.sockfd_lookup_light.__sys_sendto.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.84 +0.1 0.94 perf-profile.calltrace.cycles-pp.simple_copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.tcp_recvmsg_locked.tcp_recvmsg
0.58 +0.1 0.70 perf-profile.calltrace.cycles-pp.__kfree_skb.tcp_recvmsg_locked.tcp_recvmsg.inet_recvmsg.__sys_recvfrom
1.21 +0.1 1.34 perf-profile.calltrace.cycles-pp.sock_do_ioctl.sock_ioctl.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64
0.98 +0.1 1.12 perf-profile.calltrace.cycles-pp.tcp_ioctl.sock_do_ioctl.sock_ioctl.do_vfs_ioctl.__x64_sys_ioctl
1.34 +0.1 1.48 perf-profile.calltrace.cycles-pp.sock_ioctl.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.64 +0.1 0.78 perf-profile.calltrace.cycles-pp.__check_object_size.simple_copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.tcp_recvmsg_locked
0.80 ± 9% +0.2 0.96 perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages.skb_page_frag_refill.sk_page_frag_refill.tcp_sendmsg_locked
1.49 +0.2 1.65 perf-profile.calltrace.cycles-pp.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
1.03 ± 9% +0.2 1.20 perf-profile.calltrace.cycles-pp.skb_release_data.__kfree_skb.tcp_recvmsg.inet_recvmsg.__sys_recvfrom
1.03 ± 9% +0.2 1.21 perf-profile.calltrace.cycles-pp.__kfree_skb.tcp_recvmsg.inet_recvmsg.__sys_recvfrom.__x64_sys_recvfrom
0.85 ± 9% +0.2 1.03 perf-profile.calltrace.cycles-pp.__alloc_pages.skb_page_frag_refill.sk_page_frag_refill.tcp_sendmsg_locked.tcp_sendmsg
2.33 +0.3 2.58 perf-profile.calltrace.cycles-pp.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
0.41 ± 71% +0.3 0.66 perf-profile.calltrace.cycles-pp.tcp_rcv_state_process.tcp_v4_do_rcv.tcp_v4_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish
2.58 +0.3 2.83 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
0.62 ± 2% +0.3 0.88 perf-profile.calltrace.cycles-pp.__inet_stream_connect.inet_stream_connect.__sys_connect.__x64_sys_connect.do_syscall_64
2.70 +0.3 2.96 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.ioctl
0.62 ± 2% +0.3 0.88 perf-profile.calltrace.cycles-pp.inet_stream_connect.__sys_connect.__x64_sys_connect.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.64 ± 2% +0.3 0.90 perf-profile.calltrace.cycles-pp.__x64_sys_connect.do_syscall_64.entry_SYSCALL_64_after_hwframe.__connect
0.64 ± 2% +0.3 0.91 perf-profile.calltrace.cycles-pp.__connect
0.64 ± 2% +0.3 0.90 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__connect
0.64 ± 2% +0.3 0.90 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__connect
0.64 ± 2% +0.3 0.90 perf-profile.calltrace.cycles-pp.__sys_connect.__x64_sys_connect.do_syscall_64.entry_SYSCALL_64_after_hwframe.__connect
0.59 ± 5% +0.3 0.86 ± 4% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.secondary_startup_64_no_verify
0.59 ± 4% +0.3 0.86 ± 4% perf-profile.calltrace.cycles-pp.cpu_startup_entry.secondary_startup_64_no_verify
0.60 ± 4% +0.3 0.87 ± 4% perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
3.48 +0.3 3.76 perf-profile.calltrace.cycles-pp.ioctl
0.63 ± 2% +0.3 0.92 perf-profile.calltrace.cycles-pp.tcp_send_mss.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg.__sys_sendto
1.20 ± 8% +0.3 1.51 perf-profile.calltrace.cycles-pp.skb_page_frag_refill.sk_page_frag_refill.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg
1.22 ± 8% +0.3 1.54 perf-profile.calltrace.cycles-pp.sk_page_frag_refill.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg.__sys_sendto
0.60 +0.3 0.93 perf-profile.calltrace.cycles-pp.lock_sock_nested.tcp_sendmsg.sock_sendmsg.__sys_sendto.__x64_sys_sendto
0.43 ± 44% +0.3 0.77 perf-profile.calltrace.cycles-pp.tcp_current_mss.tcp_send_mss.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg
0.17 ±141% +0.4 0.58 perf-profile.calltrace.cycles-pp.__check_object_size.skb_do_copy_data_nocache.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg
0.60 ± 3% +0.4 1.02 perf-profile.calltrace.cycles-pp.tcp_ack.tcp_rcv_established.tcp_v4_do_rcv.__release_sock.release_sock
0.00 +0.5 0.52 perf-profile.calltrace.cycles-pp.__entry_text_start.send
0.00 +0.5 0.54 perf-profile.calltrace.cycles-pp.__lock_sock_fast.tcp_ioctl.sock_do_ioctl.sock_ioctl.do_vfs_ioctl
0.00 +0.6 0.55 ± 4% perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.secondary_startup_64_no_verify
0.00 +0.6 0.56 perf-profile.calltrace.cycles-pp.__fput.task_work_run.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode
0.00 +0.6 0.59 perf-profile.calltrace.cycles-pp.task_work_run.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
0.00 +0.6 0.60 perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
0.00 +0.6 0.60 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
0.00 +0.6 0.65 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
0.00 +0.7 0.66 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__close
0.00 +0.7 0.66 ± 2% perf-profile.calltrace.cycles-pp._raw_spin_lock_bh.lock_sock_nested.tcp_sendmsg.sock_sendmsg.__sys_sendto
0.00 +0.7 0.68 perf-profile.calltrace.cycles-pp.__close
0.55 ± 6% +0.7 1.26 ± 4% perf-profile.calltrace.cycles-pp.tcp_rcv_established.tcp_v4_do_rcv.__release_sock.release_sock.tcp_recvmsg
0.00 +0.7 0.71 ± 6% perf-profile.calltrace.cycles-pp.tcp_data_queue.tcp_rcv_established.tcp_v4_do_rcv.__release_sock.release_sock
0.00 +0.7 0.73 ± 3% perf-profile.calltrace.cycles-pp.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit.tcp_sendmsg_locked
0.00 +0.8 0.75 ± 2% perf-profile.calltrace.cycles-pp.tcp_clean_rtx_queue.tcp_ack.tcp_rcv_established.tcp_v4_do_rcv.__release_sock
0.00 +0.8 0.78 ± 2% perf-profile.calltrace.cycles-pp.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit.tcp_sendmsg_locked.tcp_sendmsg
0.72 ± 7% +0.8 1.52 ± 3% perf-profile.calltrace.cycles-pp.tcp_v4_do_rcv.__release_sock.release_sock.tcp_recvmsg.inet_recvmsg
0.69 ± 5% +0.8 1.52 ± 2% perf-profile.calltrace.cycles-pp.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames.tcp_rcv_established
0.00 +0.9 0.89 ± 2% perf-profile.calltrace.cycles-pp.__tcp_transmit_skb.tcp_write_xmit.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg
1.04 ± 8% +0.9 1.94 ± 2% perf-profile.calltrace.cycles-pp.__release_sock.release_sock.tcp_recvmsg.inet_recvmsg.__sys_recvfrom
0.77 ± 5% +0.9 1.68 ± 3% perf-profile.calltrace.cycles-pp.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames.tcp_rcv_established.tcp_v4_do_rcv
0.09 ±223% +1.0 1.08 ± 2% perf-profile.calltrace.cycles-pp.tcp_write_xmit.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg.__sys_sendto
1.34 ± 6% +1.0 2.35 ± 2% perf-profile.calltrace.cycles-pp.release_sock.tcp_recvmsg.inet_recvmsg.__sys_recvfrom.__x64_sys_recvfrom
0.87 ± 5% +1.0 1.90 ± 2% perf-profile.calltrace.cycles-pp.tcp_write_xmit.__tcp_push_pending_frames.tcp_rcv_established.tcp_v4_do_rcv.__release_sock
0.88 ± 5% +1.0 1.91 ± 2% perf-profile.calltrace.cycles-pp.__tcp_push_pending_frames.tcp_rcv_established.tcp_v4_do_rcv.__release_sock.release_sock
1.49 ± 3% +1.3 2.82 perf-profile.calltrace.cycles-pp.tcp_rcv_established.tcp_v4_do_rcv.__release_sock.release_sock.tcp_sendmsg
1.52 ± 4% +1.3 2.87 perf-profile.calltrace.cycles-pp.tcp_v4_do_rcv.__release_sock.release_sock.tcp_sendmsg.sock_sendmsg
4.52 ± 6% +1.4 5.88 perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyout._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter
4.61 ± 6% +1.4 5.97 perf-profile.calltrace.cycles-pp.copyout._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.tcp_recvmsg_locked
3.50 ± 5% +1.4 4.87 perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyin._copy_from_iter.skb_do_copy_data_nocache.tcp_sendmsg_locked
3.62 ± 5% +1.4 5.01 perf-profile.calltrace.cycles-pp.copyin._copy_from_iter.skb_do_copy_data_nocache.tcp_sendmsg_locked.tcp_sendmsg
5.01 ± 6% +1.4 6.42 perf-profile.calltrace.cycles-pp._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.tcp_recvmsg_locked.tcp_recvmsg
4.24 ± 4% +1.6 5.80 perf-profile.calltrace.cycles-pp._copy_from_iter.skb_do_copy_data_nocache.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg
6.42 ± 5% +1.6 8.04 perf-profile.calltrace.cycles-pp.__skb_datagram_iter.skb_copy_datagram_iter.tcp_recvmsg_locked.tcp_recvmsg.inet_recvmsg
6.51 ± 4% +1.6 8.12 perf-profile.calltrace.cycles-pp.skb_copy_datagram_iter.tcp_recvmsg_locked.tcp_recvmsg.inet_recvmsg.__sys_recvfrom
4.94 ± 4% +1.7 6.63 perf-profile.calltrace.cycles-pp.skb_do_copy_data_nocache.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg.__sys_sendto
2.35 ± 3% +1.9 4.21 ± 2% perf-profile.calltrace.cycles-pp.__release_sock.release_sock.tcp_sendmsg.sock_sendmsg.__sys_sendto
36.64 +2.0 38.63 perf-profile.calltrace.cycles-pp.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg.__sys_sendto.__x64_sys_sendto
51.16 +2.0 53.20 perf-profile.calltrace.cycles-pp.send
2.77 ± 3% +2.0 4.82 ± 2% perf-profile.calltrace.cycles-pp.release_sock.tcp_sendmsg.sock_sendmsg.__sys_sendto.__x64_sys_sendto
49.13 +2.3 51.47 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.send
48.76 +2.4 51.13 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.send
1.00 ± 9% +3.4 4.43 ± 8% perf-profile.calltrace.cycles-pp.__sk_mem_raise_allocated.__sk_mem_schedule.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg
1.02 ± 9% +3.4 4.46 ± 8% perf-profile.calltrace.cycles-pp.__sk_mem_schedule.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg.__sys_sendto
43.50 +4.2 47.74 perf-profile.calltrace.cycles-pp.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe.send
43.21 +4.3 47.52 perf-profile.calltrace.cycles-pp.__sys_sendto.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe.send
41.92 +4.4 46.33 perf-profile.calltrace.cycles-pp.sock_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe
40.49 +4.4 44.93 perf-profile.calltrace.cycles-pp.tcp_sendmsg.sock_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64
35.21 -6.5 28.67 perf-profile.children.cycles-pp.__tcp_transmit_skb
32.41 -6.0 26.44 perf-profile.children.cycles-pp.__ip_queue_xmit
29.91 -5.4 24.53 perf-profile.children.cycles-pp.ip_finish_output2
25.44 -4.4 21.05 perf-profile.children.cycles-pp.__local_bh_enable_ip
32.21 -4.3 27.87 perf-profile.children.cycles-pp.tcp_recvmsg_locked
24.83 -4.3 20.52 perf-profile.children.cycles-pp.do_softirq
24.72 -4.1 20.63 perf-profile.children.cycles-pp.__softirqentry_text_start
11.63 ± 2% -3.9 7.72 perf-profile.children.cycles-pp.schedule
23.16 -3.9 19.27 perf-profile.children.cycles-pp.net_rx_action
25.57 -3.8 21.80 perf-profile.children.cycles-pp.__tcp_push_pending_frames
22.67 -3.8 18.92 perf-profile.children.cycles-pp.__napi_poll
22.58 -3.7 18.85 perf-profile.children.cycles-pp.process_backlog
40.99 -3.7 37.30 perf-profile.children.cycles-pp.recv
21.77 -3.6 18.15 perf-profile.children.cycles-pp.__netif_receive_skb_one_core
11.89 ± 2% -3.6 8.28 perf-profile.children.cycles-pp.__schedule
19.98 -3.3 16.68 perf-profile.children.cycles-pp.ip_local_deliver_finish
25.81 -3.1 22.66 perf-profile.children.cycles-pp.tcp_write_xmit
19.32 -3.1 16.18 perf-profile.children.cycles-pp.ip_protocol_deliver_rcu
18.99 -3.1 15.86 perf-profile.children.cycles-pp.tcp_v4_rcv
37.25 -3.1 34.15 perf-profile.children.cycles-pp.__x64_sys_recvfrom
37.10 -3.1 34.04 perf-profile.children.cycles-pp.__sys_recvfrom
35.67 -3.0 32.63 perf-profile.children.cycles-pp.inet_recvmsg
35.48 -3.0 32.47 perf-profile.children.cycles-pp.tcp_recvmsg
9.62 ± 3% -2.8 6.87 perf-profile.children.cycles-pp.sk_wait_data
8.26 ± 3% -2.5 5.79 perf-profile.children.cycles-pp.wait_woken
8.95 ± 3% -2.4 6.50 perf-profile.children.cycles-pp.__wake_up_common_lock
9.06 ± 3% -2.4 6.62 perf-profile.children.cycles-pp.sock_def_readable
7.95 ± 2% -2.4 5.54 perf-profile.children.cycles-pp.schedule_timeout
8.67 ± 3% -2.4 6.27 perf-profile.children.cycles-pp.__wake_up_common
8.48 ± 3% -2.4 6.10 perf-profile.children.cycles-pp.try_to_wake_up
6.19 ± 2% -1.7 4.45 perf-profile.children.cycles-pp.syscall_exit_to_user_mode
5.82 ± 2% -1.7 4.09 perf-profile.children.cycles-pp.exit_to_user_mode_prepare
4.42 ± 2% -1.4 3.01 perf-profile.children.cycles-pp.exit_to_user_mode_loop
16.59 ± 2% -1.4 15.23 perf-profile.children.cycles-pp.tcp_rcv_established
3.11 ± 3% -1.0 2.12 perf-profile.children.cycles-pp.select_task_rq
3.01 ± 3% -1.0 2.04 perf-profile.children.cycles-pp.select_task_rq_fair
2.77 ± 3% -0.9 1.86 ± 2% perf-profile.children.cycles-pp.select_idle_sibling
18.43 -0.9 17.54 perf-profile.children.cycles-pp.tcp_v4_do_rcv
2.90 ± 3% -0.8 2.08 perf-profile.children.cycles-pp.dequeue_task_fair
1.58 ± 3% -0.7 0.84 ± 2% perf-profile.children.cycles-pp.prepare_task_switch
2.75 ± 2% -0.7 2.04 perf-profile.children.cycles-pp.ttwu_do_activate
2.67 ± 3% -0.7 1.98 perf-profile.children.cycles-pp.enqueue_task_fair
2.32 ± 2% -0.7 1.67 perf-profile.children.cycles-pp.switch_mm_irqs_off
4.38 -0.6 3.75 perf-profile.children.cycles-pp.__dev_queue_xmit
2.00 ± 3% -0.6 1.42 ± 2% perf-profile.children.cycles-pp.pick_next_task_fair
1.88 ± 3% -0.5 1.35 ± 2% perf-profile.children.cycles-pp.select_idle_cpu
2.73 -0.5 2.21 perf-profile.children.cycles-pp.__alloc_skb
1.69 ± 3% -0.5 1.20 ± 2% perf-profile.children.cycles-pp.update_curr
2.03 ± 3% -0.5 1.54 perf-profile.children.cycles-pp.update_load_avg
1.12 ± 4% -0.5 0.66 perf-profile.children.cycles-pp.available_idle_cpu
2.61 -0.4 2.18 perf-profile.children.cycles-pp.dev_hard_start_xmit
2.51 -0.4 2.11 perf-profile.children.cycles-pp.xmit_one
0.87 ± 2% -0.4 0.48 ± 3% perf-profile.children.cycles-pp.reweight_entity
2.34 -0.4 1.97 perf-profile.children.cycles-pp.loopback_xmit
3.40 -0.3 3.08 perf-profile.children.cycles-pp.tcp_ack
1.13 ± 2% -0.3 0.84 perf-profile.children.cycles-pp.switch_fpu_return
1.14 ± 3% -0.3 0.87 perf-profile.children.cycles-pp.enqueue_entity
1.13 -0.2 0.88 perf-profile.children.cycles-pp.__tcp_send_ack
2.40 ± 4% -0.2 2.16 perf-profile.children.cycles-pp._raw_spin_lock
0.83 ± 3% -0.2 0.58 ± 3% perf-profile.children.cycles-pp.restore_fpregs_from_fpstate
0.87 ± 2% -0.2 0.63 ± 2% perf-profile.children.cycles-pp.__switch_to
0.76 ± 2% -0.2 0.52 ± 2% perf-profile.children.cycles-pp.update_rq_clock
0.73 ± 4% -0.2 0.49 ± 2% perf-profile.children.cycles-pp.__update_load_avg_se
0.81 ± 2% -0.2 0.58 ± 2% perf-profile.children.cycles-pp.__switch_to_asm
0.71 ± 3% -0.2 0.48 ± 2% perf-profile.children.cycles-pp.___perf_sw_event
0.63 ± 3% -0.2 0.41 ± 4% perf-profile.children.cycles-pp.__update_load_avg_cfs_rq
1.16 -0.2 0.93 ± 2% perf-profile.children.cycles-pp.__netif_rx
0.93 -0.2 0.71 perf-profile.children.cycles-pp.kmalloc_reserve
0.91 ? 2% -0.2 0.68 ? 3% perf-profile.children.cycles-pp.sched_clock_cpu
0.65 -0.2 0.43 ? 2% perf-profile.children.cycles-pp.save_fpregs_to_fpstate
0.70 ? 3% -0.2 0.49 ? 4% perf-profile.children.cycles-pp.perf_trace_sched_wakeup_template
1.10 -0.2 0.89 ? 2% perf-profile.children.cycles-pp.netif_rx_internal
0.78 -0.2 0.57 ? 3% perf-profile.children.cycles-pp.irqtime_account_irq
0.75 ? 3% -0.2 0.54 ? 2% perf-profile.children.cycles-pp.set_next_entity
0.66 ? 4% -0.2 0.46 perf-profile.children.cycles-pp.ttwu_do_wakeup
0.83 -0.2 0.64 perf-profile.children.cycles-pp.__kmalloc_node_track_caller
1.02 -0.2 0.83 perf-profile.children.cycles-pp.enqueue_to_backlog
0.67 -0.2 0.48 ? 2% perf-profile.children.cycles-pp.validate_xmit_skb
1.03 ? 3% -0.2 0.84 perf-profile.children.cycles-pp.dequeue_entity
0.72 ? 2% -0.2 0.54 ? 3% perf-profile.children.cycles-pp.native_sched_clock
0.63 ? 4% -0.2 0.44 perf-profile.children.cycles-pp.check_preempt_curr
0.54 ? 4% -0.2 0.36 perf-profile.children.cycles-pp.check_preempt_wakeup
2.31 -0.2 2.15 perf-profile.children.cycles-pp.tcp_clean_rtx_queue
0.45 ? 2% -0.2 0.29 ? 3% perf-profile.children.cycles-pp.put_prev_entity
0.57 ? 2% -0.2 0.42 ? 2% perf-profile.children.cycles-pp.ip_output
0.67 -0.2 0.52 ? 2% perf-profile.children.cycles-pp.__netif_receive_skb_core
0.54 -0.1 0.40 ± 2% perf-profile.children.cycles-pp.ktime_get_with_offset
0.38 ± 2% -0.1 0.24 ± 4% perf-profile.children.cycles-pp.cpumask_next_wrap
0.36 ± 2% -0.1 0.23 ± 4% perf-profile.children.cycles-pp._find_next_bit
0.77 -0.1 0.64 perf-profile.children.cycles-pp.kmem_cache_alloc_node
0.46 ± 2% -0.1 0.34 ± 3% perf-profile.children.cycles-pp.ip_rcv_core
1.82 -0.1 1.70 perf-profile.children.cycles-pp.tcp_stream_alloc_skb
0.83 -0.1 0.72 perf-profile.children.cycles-pp.tcp_mstamp_refresh
0.49 -0.1 0.38 ± 2% perf-profile.children.cycles-pp.ip_local_out
0.32 ± 2% -0.1 0.22 ± 2% perf-profile.children.cycles-pp.netif_skb_features
0.30 ± 2% -0.1 0.20 ± 3% perf-profile.children.cycles-pp.__wrgsbase_inactive
0.90 ± 2% -0.1 0.80 ± 2% perf-profile.children.cycles-pp.ip_rcv
0.27 ± 2% -0.1 0.17 ± 4% perf-profile.children.cycles-pp.perf_trace_buf_alloc
1.10 -0.1 1.00 perf-profile.children.cycles-pp._raw_spin_lock_irqsave
0.31 ± 3% -0.1 0.21 ± 2% perf-profile.children.cycles-pp.nf_hook_slow
0.42 ± 2% -0.1 0.33 ± 2% perf-profile.children.cycles-pp.__ip_local_out
0.24 ± 6% -0.1 0.16 ± 4% perf-profile.children.cycles-pp.update_min_vruntime
0.21 ± 2% -0.1 0.13 ± 6% perf-profile.children.cycles-pp.__calc_delta
1.09 -0.1 1.01 perf-profile.children.cycles-pp.read_tsc
0.20 ± 4% -0.1 0.13 ± 5% perf-profile.children.cycles-pp.__rdgsbase_inactive
0.28 ± 3% -0.1 0.21 ± 5% perf-profile.children.cycles-pp.perf_tp_event
0.75 -0.1 0.68 perf-profile.children.cycles-pp.tcp_rcv_space_adjust
0.21 ± 3% -0.1 0.14 ± 3% perf-profile.children.cycles-pp.pick_next_entity
0.18 ± 3% -0.1 0.12 ± 4% perf-profile.children.cycles-pp.apparmor_ip_postroute
0.32 ± 2% -0.1 0.26 ± 2% perf-profile.children.cycles-pp.remove_wait_queue
0.30 ± 2% -0.1 0.24 perf-profile.children.cycles-pp.tcp_ack_update_rtt
0.14 ± 3% -0.1 0.09 ± 4% perf-profile.children.cycles-pp.skb_network_protocol
0.25 ± 3% -0.1 0.20 ± 4% perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
0.20 ± 4% -0.1 0.14 ± 3% perf-profile.children.cycles-pp.cpumask_next
0.35 -0.1 0.29 ± 2% perf-profile.children.cycles-pp.__list_add_valid
0.28 -0.1 0.22 ± 3% perf-profile.children.cycles-pp.ip_send_check
0.21 ± 3% -0.1 0.16 ± 3% perf-profile.children.cycles-pp.ip_finish_output
0.20 -0.0 0.15 ± 2% perf-profile.children.cycles-pp.__fdget
0.73 -0.0 0.68 perf-profile.children.cycles-pp.kfree
0.28 ± 2% -0.0 0.23 ± 3% perf-profile.children.cycles-pp.__ip_finish_output
0.18 ± 2% -0.0 0.13 ± 5% perf-profile.children.cycles-pp.perf_trace_sched_stat_runtime
0.21 ± 2% -0.0 0.17 ± 2% perf-profile.children.cycles-pp.inet_ehashfn
0.54 -0.0 0.49 perf-profile.children.cycles-pp.tcp_event_new_data_sent
0.28 ± 3% -0.0 0.23 perf-profile.children.cycles-pp.sk_filter_trim_cap
0.16 ± 2% -0.0 0.12 ± 4% perf-profile.children.cycles-pp.tcp_rtt_estimator
0.10 ± 8% -0.0 0.06 perf-profile.children.cycles-pp.switch_ldt
0.14 ± 2% -0.0 0.10 ± 4% perf-profile.children.cycles-pp.__copy_skb_header
0.16 ± 6% -0.0 0.12 ± 4% perf-profile.children.cycles-pp.perf_trace_sched_switch
0.15 ± 5% -0.0 0.11 ± 7% perf-profile.children.cycles-pp.perf_trace_buf_update
0.14 ± 3% -0.0 0.10 ± 4% perf-profile.children.cycles-pp.update_irq_load_avg
0.12 -0.0 0.08 ± 5% perf-profile.children.cycles-pp.ip_local_deliver
0.11 ± 5% -0.0 0.07 ± 6% perf-profile.children.cycles-pp.kmalloc_slab
0.32 -0.0 0.29 ± 2% perf-profile.children.cycles-pp.import_single_range
0.14 ± 2% -0.0 0.11 ± 4% perf-profile.children.cycles-pp.tcp_v4_fill_cb
0.19 ± 4% -0.0 0.16 ± 3% perf-profile.children.cycles-pp.add_wait_queue
0.18 ± 3% -0.0 0.14 ± 3% perf-profile.children.cycles-pp.__cgroup_account_cputime
0.14 ± 3% -0.0 0.11 ± 5% perf-profile.children.cycles-pp.tcp_inbound_md5_hash
0.06 -0.0 0.02 ± 99% perf-profile.children.cycles-pp.__raise_softirq_irqoff
0.14 ± 5% -0.0 0.11 ± 4% perf-profile.children.cycles-pp.set_next_buddy
0.12 ± 4% -0.0 0.09 ± 5% perf-profile.children.cycles-pp.tcp_options_write
0.14 ± 3% -0.0 0.11 ± 3% perf-profile.children.cycles-pp.neigh_hh_output
0.32 ± 3% -0.0 0.29 ± 3% perf-profile.children.cycles-pp.__ksize
0.20 ± 2% -0.0 0.16 ± 3% perf-profile.children.cycles-pp.__tcp_select_window
0.09 ± 5% -0.0 0.06 ± 6% perf-profile.children.cycles-pp.sched_clock
0.09 ± 5% -0.0 0.06 ± 6% perf-profile.children.cycles-pp.clear_buddies
0.48 -0.0 0.46 perf-profile.children.cycles-pp.__list_del_entry_valid
0.10 ± 5% -0.0 0.06 ± 7% perf-profile.children.cycles-pp.bpf_skops_write_hdr_opt
0.18 ± 2% -0.0 0.15 ± 3% perf-profile.children.cycles-pp.tcp_update_skb_after_send
0.08 ± 4% -0.0 0.05 ± 7% perf-profile.children.cycles-pp.perf_swevent_get_recursion_context
0.20 ± 2% -0.0 0.16 ± 3% perf-profile.children.cycles-pp.tcp_skb_entail
0.40 -0.0 0.38 ± 2% perf-profile.children.cycles-pp.tcp_schedule_loss_probe
0.10 ± 3% -0.0 0.07 ± 5% perf-profile.children.cycles-pp.skb_clone_tx_timestamp
0.10 ± 4% -0.0 0.07 perf-profile.children.cycles-pp.qdisc_pkt_len_init
0.12 ± 4% -0.0 0.10 ± 3% perf-profile.children.cycles-pp.__usecs_to_jiffies
0.12 ± 4% -0.0 0.09 ± 5% perf-profile.children.cycles-pp.rb_erase
0.13 ± 3% -0.0 0.10 ± 3% perf-profile.children.cycles-pp.security_sock_rcv_skb
0.15 ± 7% -0.0 0.13 ± 6% perf-profile.children.cycles-pp.cpuacct_charge
0.12 ± 6% -0.0 0.10 ± 5% perf-profile.children.cycles-pp.eth_type_trans
0.10 ± 3% -0.0 0.08 ± 6% perf-profile.children.cycles-pp.__build_skb_around
0.07 ± 6% -0.0 0.05 perf-profile.children.cycles-pp.schedule_debug
0.10 ± 4% -0.0 0.08 ± 5% perf-profile.children.cycles-pp.tcp_rate_skb_sent
0.09 -0.0 0.07 ± 5% perf-profile.children.cycles-pp.tcp_v4_send_check
0.09 ± 5% -0.0 0.06 ± 7% perf-profile.children.cycles-pp.validate_xmit_xfrm
0.09 ± 5% -0.0 0.06 ± 7% perf-profile.children.cycles-pp.tracing_gen_ctx_irq_test
0.13 ± 3% -0.0 0.11 ± 4% perf-profile.children.cycles-pp.tcp_event_data_recv
0.07 -0.0 0.05 perf-profile.children.cycles-pp.validate_xmit_vlan
0.10 ± 3% -0.0 0.08 perf-profile.children.cycles-pp.skb_push
0.08 ± 4% -0.0 0.06 perf-profile.children.cycles-pp.tcp_rate_gen
0.11 ± 3% -0.0 0.09 perf-profile.children.cycles-pp.tcp_rearm_rto
0.43 ± 2% -0.0 0.41 perf-profile.children.cycles-pp._raw_spin_lock_irq
0.10 ± 4% -0.0 0.08 perf-profile.children.cycles-pp.tcp_rate_skb_delivered
0.08 ± 6% -0.0 0.06 perf-profile.children.cycles-pp.netdev_core_pick_tx
0.16 ± 3% -0.0 0.14 ± 3% perf-profile.children.cycles-pp.ip_skb_dst_mtu
0.11 ± 4% -0.0 0.09 perf-profile.children.cycles-pp.skb_clone
0.10 ± 4% -0.0 0.09 ± 5% perf-profile.children.cycles-pp.tcp_update_pacing_rate
0.08 ± 5% -0.0 0.07 perf-profile.children.cycles-pp.tcp_newly_delivered
0.08 -0.0 0.06 ± 7% perf-profile.children.cycles-pp.rb_insert_color
0.08 -0.0 0.07 perf-profile.children.cycles-pp.__xfrm_policy_check2
0.08 -0.0 0.07 perf-profile.children.cycles-pp.rb_next
0.07 ± 5% +0.0 0.08 ± 4% perf-profile.children.cycles-pp.kthread
0.10 ± 3% +0.0 0.11 ± 3% perf-profile.children.cycles-pp.inet_sendmsg
0.12 ± 3% +0.0 0.13 ± 2% perf-profile.children.cycles-pp.tcp_update_recv_tstamps
0.05 +0.0 0.06 ± 7% perf-profile.children.cycles-pp.run_ksoftirqd
0.06 +0.0 0.08 ± 6% perf-profile.children.cycles-pp.do_tcp_getsockopt
0.05 ± 8% +0.0 0.07 ± 5% perf-profile.children.cycles-pp.slab_pre_alloc_hook
0.05 ± 8% +0.0 0.07 ± 5% perf-profile.children.cycles-pp.rb_first
0.10 ± 3% +0.0 0.12 ± 3% perf-profile.children.cycles-pp.tcp_check_space
0.05 ± 8% +0.0 0.07 perf-profile.children.cycles-pp.tcp_downgrade_zcopy_pure
0.04 ± 44% +0.0 0.06 perf-profile.children.cycles-pp.security_file_alloc
0.04 ± 44% +0.0 0.06 perf-profile.children.cycles-pp.__d_alloc
0.04 ± 44% +0.0 0.06 perf-profile.children.cycles-pp.getsockname
0.04 ± 44% +0.0 0.06 perf-profile.children.cycles-pp.inode_init_always
0.06 ± 7% +0.0 0.08 ± 5% perf-profile.children.cycles-pp.tcp_rate_check_app_limited
0.05 +0.0 0.07 perf-profile.children.cycles-pp.__sk_destruct
0.06 +0.0 0.08 perf-profile.children.cycles-pp.tcp_small_queue_check
0.06 ± 7% +0.0 0.09 ± 5% perf-profile.children.cycles-pp.do_dentry_open
0.08 ± 6% +0.0 0.10 ± 4% perf-profile.children.cycles-pp.alloc_file_pseudo
0.06 ± 6% +0.0 0.08 ± 4% perf-profile.children.cycles-pp.ip_route_output_key_hash
0.33 +0.0 0.35 perf-profile.children.cycles-pp.__mod_timer
0.08 ± 5% +0.0 0.11 perf-profile.children.cycles-pp.sock_alloc
0.11 +0.0 0.14 ± 3% perf-profile.children.cycles-pp.__sock_wfree
0.08 ± 6% +0.0 0.10 perf-profile.children.cycles-pp.sock_alloc_file
0.07 ± 10% +0.0 0.10 perf-profile.children.cycles-pp.free_unref_page_commit
0.09 ± 5% +0.0 0.12 ± 4% perf-profile.children.cycles-pp.tcp_conn_request
0.08 ± 5% +0.0 0.11 ± 3% perf-profile.children.cycles-pp.dentry_open
0.14 ± 2% +0.0 0.17 ± 2% perf-profile.children.cycles-pp.__sys_getsockopt
0.08 +0.0 0.11 perf-profile.children.cycles-pp.seq_read
0.08 +0.0 0.11 perf-profile.children.cycles-pp.seq_read_iter
0.05 ± 7% +0.0 0.08 ± 4% perf-profile.children.cycles-pp.check_new_pages
0.04 ± 44% +0.0 0.07 ± 5% perf-profile.children.cycles-pp.lookup_fast
0.10 ± 5% +0.0 0.13 perf-profile.children.cycles-pp.kmem_cache_alloc_lru
0.07 +0.0 0.10 perf-profile.children.cycles-pp.tcp_v4_syn_recv_sock
0.07 +0.0 0.10 perf-profile.children.cycles-pp.do_tcp_setsockopt
0.10 ± 3% +0.0 0.13 perf-profile.children.cycles-pp.__dentry_kill
0.09 ± 4% +0.0 0.12 ± 4% perf-profile.children.cycles-pp.sk_free
0.09 ± 4% +0.0 0.12 perf-profile.children.cycles-pp.tcp_mtu_probe
0.11 ± 9% +0.0 0.14 ± 4% perf-profile.children.cycles-pp.kmem_cache_alloc
0.12 ± 5% +0.0 0.15 ± 2% perf-profile.children.cycles-pp.alloc_inode
0.09 ± 5% +0.0 0.12 perf-profile.children.cycles-pp.tcp_check_req
0.10 ± 4% +0.0 0.14 ± 2% perf-profile.children.cycles-pp.dentry_kill
0.09 ± 5% +0.0 0.13 ± 3% perf-profile.children.cycles-pp.vfs_read
0.14 ± 3% +0.0 0.18 ± 2% perf-profile.children.cycles-pp.__x64_sys_getsockopt
0.12 ± 5% +0.0 0.16 ± 3% perf-profile.children.cycles-pp.alloc_empty_file
0.12 ± 4% +0.0 0.15 ± 2% perf-profile.children.cycles-pp.__alloc_file
0.09 ± 5% +0.0 0.13 ± 2% perf-profile.children.cycles-pp.ksys_read
0.02 ± 99% +0.0 0.06 perf-profile.children.cycles-pp.do_open
0.14 +0.0 0.18 ± 2% perf-profile.children.cycles-pp.tcp_cleanup_rbuf
0.10 ± 4% +0.0 0.14 ± 2% perf-profile.children.cycles-pp.read
0.06 ± 7% +0.0 0.10 perf-profile.children.cycles-pp.__zone_watermark_ok
0.20 ± 2% +0.0 0.23 ± 2% perf-profile.children.cycles-pp.kfree_skbmem
0.58 +0.0 0.62 perf-profile.children.cycles-pp.__might_fault
0.14 +0.0 0.18 ± 2% perf-profile.children.cycles-pp.__x64_sys_setsockopt
0.21 ± 3% +0.0 0.25 ± 2% perf-profile.children.cycles-pp.getsockopt
0.14 ± 2% +0.0 0.18 ± 2% perf-profile.children.cycles-pp.dput
0.14 ± 3% +0.0 0.18 ± 2% perf-profile.children.cycles-pp.__sys_setsockopt
0.09 ± 8% +0.0 0.13 ± 2% perf-profile.children.cycles-pp.__x64_sys_accept
0.02 ± 99% +0.0 0.07 ± 7% perf-profile.children.cycles-pp.link_path_walk
0.38 ± 2% +0.0 0.42 perf-profile.children.cycles-pp.__sk_dst_check
0.09 ± 6% +0.0 0.13 ± 3% perf-profile.children.cycles-pp.__x64_sys_accept4
0.12 ± 4% +0.0 0.16 perf-profile.children.cycles-pp.__ns_get_path
0.02 ±141% +0.0 0.06 ± 6% perf-profile.children.cycles-pp.tcp_ack_update_window
0.14 ± 3% +0.0 0.18 ± 2% perf-profile.children.cycles-pp.new_inode_pseudo
0.10 ± 4% +0.0 0.15 perf-profile.children.cycles-pp.accept
0.08 ± 8% +0.0 0.13 ± 2% perf-profile.children.cycles-pp.inet_csk_accept
0.10 ± 3% +0.0 0.15 ± 3% perf-profile.children.cycles-pp.tcp_fin
0.01 ±223% +0.0 0.06 ± 8% perf-profile.children.cycles-pp.alloc_file
0.00 +0.1 0.05 perf-profile.children.cycles-pp.sk_clone_lock
0.00 +0.1 0.05 perf-profile.children.cycles-pp.___slab_alloc
0.00 +0.1 0.05 perf-profile.children.cycles-pp.sock_alloc_inode
0.00 +0.1 0.05 perf-profile.children.cycles-pp.sk_alloc
0.00 +0.1 0.05 perf-profile.children.cycles-pp.seq_show
0.00 +0.1 0.05 perf-profile.children.cycles-pp.__vsnprintf_chk
0.00 +0.1 0.05 perf-profile.children.cycles-pp.evict
0.00 +0.1 0.05 perf-profile.children.cycles-pp.rcu_cblist_dequeue
0.00 +0.1 0.05 perf-profile.children.cycles-pp.walk_component
0.08 ± 6% +0.1 0.13 ± 2% perf-profile.children.cycles-pp.autoremove_wake_function
0.11 ± 6% +0.1 0.16 ± 3% perf-profile.children.cycles-pp.accept4
0.00 +0.1 0.05 ± 7% perf-profile.children.cycles-pp.tcp_done
0.28 ± 6% +0.1 0.33 perf-profile.children.cycles-pp.ip_rcv_finish_core
0.20 ± 2% +0.1 0.25 perf-profile.children.cycles-pp.setsockopt
0.08 ± 26% +0.1 0.13 ± 28% perf-profile.children.cycles-pp.mod_objcg_state
0.00 +0.1 0.05 ± 8% perf-profile.children.cycles-pp.dev_ifconf
0.00 +0.1 0.05 ± 8% perf-profile.children.cycles-pp.inet_csk_clone_lock
0.02 ±141% +0.1 0.07 perf-profile.children.cycles-pp.inet_create
0.04 ± 71% +0.1 0.09 ± 11% perf-profile.children.cycles-pp.obj_cgroup_uncharge_pages
0.12 ± 4% +0.1 0.18 perf-profile.children.cycles-pp.__sock_create
0.41 ± 5% +0.1 0.46 perf-profile.children.cycles-pp.skb_release_head_state
0.09 ± 6% +0.1 0.15 ± 3% perf-profile.children.cycles-pp.inet_accept
0.02 ± 99% +0.1 0.08 ± 13% perf-profile.children.cycles-pp.page_counter_uncharge
0.00 +0.1 0.06 ± 8% perf-profile.children.cycles-pp.set_task_cpu
0.00 +0.1 0.06 ± 8% perf-profile.children.cycles-pp.asm_sysvec_call_function_single
0.00 +0.1 0.06 ± 8% perf-profile.children.cycles-pp.ip_route_output_flow
0.00 +0.1 0.06 ± 6% perf-profile.children.cycles-pp.tcp_validate_incoming
0.15 ± 4% +0.1 0.21 ± 2% perf-profile.children.cycles-pp.do_filp_open
0.15 ± 4% +0.1 0.21 perf-profile.children.cycles-pp.path_openat
0.00 +0.1 0.06 perf-profile.children.cycles-pp.obj_cgroup_charge
0.11 ± 4% +0.1 0.17 ± 2% perf-profile.children.cycles-pp.tcp_child_process
0.00 +0.1 0.06 ± 7% perf-profile.children.cycles-pp.tcp_create_openreq_child
0.00 +0.1 0.06 ± 7% perf-profile.children.cycles-pp.inet_csk_destroy_sock
0.18 ± 5% +0.1 0.24 perf-profile.children.cycles-pp.do_sys_openat2
0.18 ± 6% +0.1 0.24 ± 2% perf-profile.children.cycles-pp.__x64_sys_openat
0.20 ± 2% +0.1 0.27 perf-profile.children.cycles-pp.tcp_tso_segs
0.17 ± 4% +0.1 0.24 perf-profile.children.cycles-pp.__sys_socket
0.02 ±146% +0.1 0.09 ± 38% perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
0.00 +0.1 0.07 perf-profile.children.cycles-pp.lock_timer_base
0.17 ± 4% +0.1 0.24 perf-profile.children.cycles-pp.__x64_sys_socket
0.20 ± 6% +0.1 0.27 perf-profile.children.cycles-pp.tcp_tx_timestamp
0.18 ± 4% +0.1 0.26 perf-profile.children.cycles-pp.__socket
0.17 ± 4% +0.1 0.24 perf-profile.children.cycles-pp.do_accept
0.11 ± 8% +0.1 0.18 ± 2% perf-profile.children.cycles-pp.skb_try_coalesce
0.22 ± 4% +0.1 0.30 ± 3% perf-profile.children.cycles-pp.sock_put
0.19 ± 5% +0.1 0.26 perf-profile.children.cycles-pp.open
0.18 ± 4% +0.1 0.26 ± 2% perf-profile.children.cycles-pp.__sys_accept4
0.10 ± 4% +0.1 0.18 ± 4% perf-profile.children.cycles-pp.__sk_flush_backlog
1.36 +0.1 1.45 perf-profile.children.cycles-pp.syscall_return_via_sysret
0.23 ± 3% +0.1 0.31 perf-profile.children.cycles-pp.open_related_ns
0.13 ± 5% +0.1 0.21 perf-profile.children.cycles-pp.__free_one_page
0.17 ± 4% +0.1 0.25 perf-profile.children.cycles-pp.__sys_accept4_file
0.29 +0.1 0.37 perf-profile.children.cycles-pp.tcp_queue_rcv
0.10 ± 8% +0.1 0.18 ± 3% perf-profile.children.cycles-pp.free_pcp_prepare
0.98 +0.1 1.07 perf-profile.children.cycles-pp.__cond_resched
0.33 ± 4% +0.1 0.42 ± 5% perf-profile.children.cycles-pp.mwait_idle_with_hints
0.13 ± 2% +0.1 0.22 ± 3% perf-profile.children.cycles-pp.__tcp_close
1.17 +0.1 1.26 perf-profile.children.cycles-pp.__entry_text_start
0.22 ± 2% +0.1 0.31 perf-profile.children.cycles-pp.tcp_connect
0.33 ± 4% +0.1 0.42 ± 4% perf-profile.children.cycles-pp.intel_idle
0.14 ± 2% +0.1 0.24 ± 3% perf-profile.children.cycles-pp.tcp_close
0.12 ± 7% +0.1 0.22 perf-profile.children.cycles-pp.tcp_try_coalesce
0.12 ± 4% +0.1 0.21 perf-profile.children.cycles-pp.tcp_add_backlog
0.86 +0.1 0.96 perf-profile.children.cycles-pp.simple_copy_to_iter
0.32 ± 3% +0.1 0.43 perf-profile.children.cycles-pp.__sys_shutdown
0.17 ± 2% +0.1 0.28 ± 2% perf-profile.children.cycles-pp.inet_release
0.33 ± 2% +0.1 0.44 perf-profile.children.cycles-pp.shutdown
0.24 ± 4% +0.1 0.35 perf-profile.children.cycles-pp.tcp_push
0.37 ± 5% +0.1 0.48 ± 5% perf-profile.children.cycles-pp.cpuidle_enter_state
0.32 ± 3% +0.1 0.43 perf-profile.children.cycles-pp.__x64_sys_shutdown
0.14 ± 3% +0.1 0.24 ± 4% perf-profile.children.cycles-pp.schedule_idle
0.18 ± 2% +0.1 0.29 ± 2% perf-profile.children.cycles-pp.__sock_release
0.32 ± 2% +0.1 0.43 perf-profile.children.cycles-pp.inet_shutdown
0.38 ± 5% +0.1 0.49 ± 4% perf-profile.children.cycles-pp.cpuidle_enter
0.00 +0.1 0.11 ± 9% perf-profile.children.cycles-pp.sk_forced_mem_schedule
0.18 ± 2% +0.1 0.29 ± 2% perf-profile.children.cycles-pp.sock_close
0.44 ± 4% +0.1 0.55 ± 2% perf-profile.children.cycles-pp.ipv4_dst_check
0.14 ± 6% +0.1 0.26 ± 3% perf-profile.children.cycles-pp.__slab_free
0.29 ± 3% +0.1 0.41 perf-profile.children.cycles-pp.tcp_rcv_synsent_state_process
0.00 +0.1 0.12 ± 8% perf-profile.children.cycles-pp.__sk_mem_reduce_allocated
0.17 ± 2% +0.1 0.30 ± 5% perf-profile.children.cycles-pp.__irq_exit_rcu
0.31 ± 2% +0.1 0.44 perf-profile.children.cycles-pp.tcp_v4_connect
0.19 ± 3% +0.1 0.32 ± 7% perf-profile.children.cycles-pp.rcu_do_batch
1.02 +0.1 1.16 perf-profile.children.cycles-pp.tcp_ioctl
1.22 +0.1 1.36 perf-profile.children.cycles-pp.sock_do_ioctl
0.21 ± 3% +0.1 0.35 ± 6% perf-profile.children.cycles-pp.rcu_core
0.43 +0.1 0.57 perf-profile.children.cycles-pp.__lock_sock_fast
0.42 ± 5% +0.1 0.56 ± 4% perf-profile.children.cycles-pp.cpuidle_idle_call
0.81 ± 9% +0.2 0.97 perf-profile.children.cycles-pp.get_page_from_freelist
0.26 ± 4% +0.2 0.43 ± 11% perf-profile.children.cycles-pp.tick_sched_timer
1.50 +0.2 1.67 perf-profile.children.cycles-pp.do_vfs_ioctl
0.34 ± 6% +0.2 0.50 perf-profile.children.cycles-pp.dst_release
0.16 ± 4% +0.2 0.32 ± 10% perf-profile.children.cycles-pp.scheduler_tick
1.14 ± 8% +0.2 1.31 ± 2% perf-profile.children.cycles-pp.free_unref_page
0.45 ± 3% +0.2 0.62 ± 8% perf-profile.children.cycles-pp.hrtimer_interrupt
0.22 ± 4% +0.2 0.39 ± 11% perf-profile.children.cycles-pp.update_process_times
0.11 ± 4% +0.2 0.28 ± 10% perf-profile.children.cycles-pp.task_tick_fair
0.45 ± 2% +0.2 0.62 ± 8% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
0.22 ± 3% +0.2 0.40 ± 10% perf-profile.children.cycles-pp.tick_sched_handle
0.39 +0.2 0.57 perf-profile.children.cycles-pp.__fput
0.31 ± 4% +0.2 0.49 ± 8% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.86 ± 9% +0.2 1.04 perf-profile.children.cycles-pp.__alloc_pages
0.41 +0.2 0.59 perf-profile.children.cycles-pp.task_work_run
1.41 +0.2 1.60 perf-profile.children.cycles-pp.__check_object_size
92.88 +0.2 93.08 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
0.48 +0.2 0.69 perf-profile.children.cycles-pp.__close
0.25 ± 6% +0.2 0.47 ± 2% perf-profile.children.cycles-pp.ipv4_mtu
2.36 +0.3 2.62 perf-profile.children.cycles-pp.__x64_sys_ioctl
0.62 ± 2% +0.3 0.88 perf-profile.children.cycles-pp.__inet_stream_connect
2.52 ± 3% +0.3 2.78 perf-profile.children.cycles-pp.skb_release_data
0.62 ± 2% +0.3 0.88 perf-profile.children.cycles-pp.inet_stream_connect
1.69 +0.3 1.96 perf-profile.children.cycles-pp.sock_ioctl
0.64 ± 2% +0.3 0.90 perf-profile.children.cycles-pp.__x64_sys_connect
0.64 ± 2% +0.3 0.91 perf-profile.children.cycles-pp.__connect
0.64 ± 2% +0.3 0.90 perf-profile.children.cycles-pp.__sys_connect
0.60 ± 4% +0.3 0.87 ± 4% perf-profile.children.cycles-pp.secondary_startup_64_no_verify
0.60 ± 4% +0.3 0.87 ± 4% perf-profile.children.cycles-pp.cpu_startup_entry
0.59 ± 4% +0.3 0.86 ± 4% perf-profile.children.cycles-pp.do_idle
0.56 ± 2% +0.3 0.84 perf-profile.children.cycles-pp.tcp_current_mss
2.94 ± 2% +0.3 3.22 perf-profile.children.cycles-pp.__kfree_skb
1.48 ± 7% +0.3 1.76 ± 2% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
0.66 ± 2% +0.3 0.95 perf-profile.children.cycles-pp.tcp_send_mss
3.74 +0.3 4.02 perf-profile.children.cycles-pp.ioctl
0.70 ± 8% +0.3 0.98 ± 2% perf-profile.children.cycles-pp.update_cfs_group
0.69 ± 2% +0.3 0.99 ± 6% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
0.64 ± 2% +0.3 0.93 ± 5% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
92.11 +0.3 92.41 perf-profile.children.cycles-pp.do_syscall_64
1.21 ± 8% +0.3 1.52 perf-profile.children.cycles-pp.skb_page_frag_refill
1.25 ± 8% +0.3 1.58 perf-profile.children.cycles-pp.sk_page_frag_refill
1.36 ± 9% +0.3 1.70 perf-profile.children.cycles-pp.tcp_rcv_state_process
1.06 +0.5 1.56 perf-profile.children.cycles-pp.lock_sock_nested
0.16 ± 9% +0.5 0.67 ± 7% perf-profile.children.cycles-pp.tcp_try_rmem_schedule
0.53 ± 5% +0.7 1.23 ± 5% perf-profile.children.cycles-pp.tcp_data_queue
1.46 +0.9 2.41 ± 2% perf-profile.children.cycles-pp._raw_spin_lock_bh
4.63 ± 6% +1.4 5.99 perf-profile.children.cycles-pp.copyout
3.64 ± 4% +1.4 5.04 perf-profile.children.cycles-pp.copyin
5.04 ± 6% +1.4 6.45 perf-profile.children.cycles-pp._copy_to_iter
4.27 ± 4% +1.6 5.84 perf-profile.children.cycles-pp._copy_from_iter
6.46 ± 4% +1.6 8.08 perf-profile.children.cycles-pp.__skb_datagram_iter
6.52 ± 4% +1.6 8.14 perf-profile.children.cycles-pp.skb_copy_datagram_iter
4.99 ± 4% +1.7 6.68 perf-profile.children.cycles-pp.skb_do_copy_data_nocache
36.84 +2.0 38.87 perf-profile.children.cycles-pp.tcp_sendmsg_locked
51.57 +2.1 53.65 perf-profile.children.cycles-pp.send
8.21 ± 5% +2.8 10.97 perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
3.87 ± 2% +3.0 6.87 ± 2% perf-profile.children.cycles-pp.__release_sock
4.68 +3.2 7.90 ± 2% perf-profile.children.cycles-pp.release_sock
1.21 ± 9% +4.1 5.29 ± 8% perf-profile.children.cycles-pp.__sk_mem_raise_allocated
1.24 ± 9% +4.1 5.33 ± 8% perf-profile.children.cycles-pp.__sk_mem_schedule
43.55 +4.2 47.78 perf-profile.children.cycles-pp.__x64_sys_sendto
43.26 +4.3 47.57 perf-profile.children.cycles-pp.__sys_sendto
41.95 +4.4 46.37 perf-profile.children.cycles-pp.sock_sendmsg
40.58 +4.4 45.01 perf-profile.children.cycles-pp.tcp_sendmsg
2.29 ± 2% -0.6 1.65 perf-profile.self.cycles-pp.switch_mm_irqs_off
0.87 ± 4% -0.5 0.38 ± 3% perf-profile.self.cycles-pp.prepare_task_switch
1.09 ± 4% -0.4 0.64 ± 2% perf-profile.self.cycles-pp.available_idle_cpu
1.24 ± 3% -0.4 0.86 perf-profile.self.cycles-pp.__schedule
0.61 ± 3% -0.3 0.26 perf-profile.self.cycles-pp.dequeue_task_fair
1.74 -0.3 1.45 perf-profile.self.cycles-pp.__tcp_transmit_skb
0.49 ± 3% -0.2 0.24 ± 3% perf-profile.self.cycles-pp.reweight_entity
0.82 ± 3% -0.2 0.57 ± 2% perf-profile.self.cycles-pp.restore_fpregs_from_fpstate
0.51 ± 3% -0.2 0.27 ± 3% perf-profile.self.cycles-pp.enqueue_task_fair
0.80 ± 2% -0.2 0.57 ± 2% perf-profile.self.cycles-pp.__switch_to_asm
0.83 ± 2% -0.2 0.60 ± 2% perf-profile.self.cycles-pp.__switch_to
0.82 ± 2% -0.2 0.60 ± 2% perf-profile.self.cycles-pp.__ip_queue_xmit
0.67 ± 4% -0.2 0.45 ± 2% perf-profile.self.cycles-pp.__update_load_avg_se
0.79 ± 3% -0.2 0.57 ± 3% perf-profile.self.cycles-pp.ip_finish_output2
0.64 -0.2 0.43 ± 2% perf-profile.self.cycles-pp.save_fpregs_to_fpstate
0.73 ± 3% -0.2 0.52 ± 2% perf-profile.self.cycles-pp.update_curr
0.60 ± 3% -0.2 0.38 ± 3% perf-profile.self.cycles-pp.__update_load_avg_cfs_rq
0.62 ± 2% -0.2 0.41 ± 2% perf-profile.self.cycles-pp.___perf_sw_event
1.16 ± 2% -0.2 0.97 perf-profile.self.cycles-pp._raw_spin_lock
0.63 ± 4% -0.2 0.44 ± 2% perf-profile.self.cycles-pp.select_idle_cpu
0.70 ± 2% -0.2 0.52 ± 3% perf-profile.self.cycles-pp.native_sched_clock
0.59 -0.2 0.42 ± 2% perf-profile.self.cycles-pp.loopback_xmit
0.56 ± 2% -0.2 0.40 ± 2% perf-profile.self.cycles-pp.__softirqentry_text_start
0.66 -0.2 0.50 ± 2% perf-profile.self.cycles-pp.__netif_receive_skb_core
0.45 ± 3% -0.1 0.31 ± 3% perf-profile.self.cycles-pp.pick_next_task_fair
0.33 ± 3% -0.1 0.21 ± 3% perf-profile.self.cycles-pp.select_idle_sibling
0.50 ± 3% -0.1 0.38 perf-profile.self.cycles-pp.__kmalloc_node_track_caller
0.32 ± 3% -0.1 0.20 ± 5% perf-profile.self.cycles-pp._find_next_bit
0.37 ± 3% -0.1 0.25 perf-profile.self.cycles-pp.update_rq_clock
0.34 ± 4% -0.1 0.23 ± 2% perf-profile.self.cycles-pp.sk_wait_data
0.64 ± 2% -0.1 0.52 perf-profile.self.cycles-pp.__alloc_skb
0.41 ± 2% -0.1 0.30 ± 2% perf-profile.self.cycles-pp.net_rx_action
0.86 -0.1 0.75 perf-profile.self.cycles-pp.tcp_v4_rcv
0.44 -0.1 0.33 ± 3% perf-profile.self.cycles-pp.ip_rcv_core
0.30 ± 3% -0.1 0.20 ± 3% perf-profile.self.cycles-pp.schedule
0.29 ± 2% -0.1 0.19 ± 2% perf-profile.self.cycles-pp.__wrgsbase_inactive
0.53 -0.1 0.43 perf-profile.self.cycles-pp.kmem_cache_alloc_node
0.61 -0.1 0.51 perf-profile.self.cycles-pp.tcp_ack
0.64 -0.1 0.54 ± 2% perf-profile.self.cycles-pp.do_syscall_64
1.06 -0.1 0.97 perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.32 ± 3% -0.1 0.22 ± 3% perf-profile.self.cycles-pp.enqueue_entity
0.53 ± 2% -0.1 0.43 perf-profile.self.cycles-pp.kmem_cache_free
0.40 -0.1 0.31 ± 2% perf-profile.self.cycles-pp.process_backlog
0.80 -0.1 0.70 ± 2% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.46 ± 2% -0.1 0.37 ± 3% perf-profile.self.cycles-pp.kfree
0.32 -0.1 0.24 ± 3% perf-profile.self.cycles-pp.do_softirq
0.20 ± 4% -0.1 0.13 ± 3% perf-profile.self.cycles-pp.check_preempt_wakeup
1.04 -0.1 0.96 perf-profile.self.cycles-pp.read_tsc
0.28 ± 3% -0.1 0.21 ± 3% perf-profile.self.cycles-pp.__x64_sys_sendto
0.19 ± 3% -0.1 0.12 ± 4% perf-profile.self.cycles-pp.perf_trace_buf_alloc
0.19 -0.1 0.12 ± 7% perf-profile.self.cycles-pp.__calc_delta
0.56 -0.1 0.48 perf-profile.self.cycles-pp.__local_bh_enable_ip
0.24 ± 2% -0.1 0.16 ± 2% perf-profile.self.cycles-pp.select_task_rq_fair
0.25 ± 2% -0.1 0.18 ± 2% perf-profile.self.cycles-pp.ktime_get_with_offset
0.30 ± 2% -0.1 0.23 ± 2% perf-profile.self.cycles-pp.enqueue_to_backlog
0.20 ± 7% -0.1 0.14 ± 4% perf-profile.self.cycles-pp.update_min_vruntime
0.19 ± 3% -0.1 0.12 ± 4% perf-profile.self.cycles-pp.__rdgsbase_inactive
0.27 ± 2% -0.1 0.21 ± 2% perf-profile.self.cycles-pp.validate_xmit_skb
0.38 -0.1 0.32 perf-profile.self.cycles-pp.tcp_rcv_space_adjust
0.80 -0.1 0.74 perf-profile.self.cycles-pp.tcp_recvmsg_locked
0.27 ± 2% -0.1 0.21 ± 3% perf-profile.self.cycles-pp.security_socket_sendmsg
0.17 ± 4% -0.1 0.11 ± 5% perf-profile.self.cycles-pp.exit_to_user_mode_loop
0.26 ± 2% -0.1 0.20 ± 3% perf-profile.self.cycles-pp.ip_output
0.62 ± 2% -0.1 0.56 perf-profile.self.cycles-pp.tcp_clean_rtx_queue
0.25 ± 3% -0.1 0.19 ± 3% perf-profile.self.cycles-pp.irqtime_account_irq
0.19 ± 2% -0.1 0.13 ± 2% perf-profile.self.cycles-pp.netif_skb_features
0.16 ± 4% -0.1 0.10 perf-profile.self.cycles-pp.apparmor_ip_postroute
0.38 -0.1 0.33 perf-profile.self.cycles-pp.memcg_slab_free_hook
0.20 ± 3% -0.1 0.14 perf-profile.self.cycles-pp.dequeue_entity
0.18 ± 4% -0.1 0.12 ± 4% perf-profile.self.cycles-pp.pick_next_entity
0.17 ± 2% -0.1 0.12 ± 4% perf-profile.self.cycles-pp.__fdget
0.26 ± 2% -0.1 0.21 ± 2% perf-profile.self.cycles-pp.ip_send_check
0.19 ± 2% -0.0 0.14 ± 3% perf-profile.self.cycles-pp.finish_task_switch
0.16 ± 3% -0.0 0.12 ± 4% perf-profile.self.cycles-pp.cpumask_next_wrap
0.71 ± 3% -0.0 0.66 perf-profile.self.cycles-pp.update_load_avg
0.19 ± 4% -0.0 0.14 ± 3% perf-profile.self.cycles-pp.ip_finish_output
0.18 ± 2% -0.0 0.14 ± 3% perf-profile.self.cycles-pp.ip_rcv
0.12 ± 4% -0.0 0.08 ± 4% perf-profile.self.cycles-pp.skb_network_protocol
0.14 ± 3% -0.0 0.10 ± 6% perf-profile.self.cycles-pp.put_prev_entity
0.17 ± 2% -0.0 0.12 ± 3% perf-profile.self.cycles-pp.schedule_timeout
0.16 ± 4% -0.0 0.12 ± 5% perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
0.31 -0.0 0.26 ± 2% perf-profile.self.cycles-pp.__list_add_valid
0.18 ± 2% -0.0 0.13 ± 2% perf-profile.self.cycles-pp.set_next_entity
0.16 ± 4% -0.0 0.12 ± 4% perf-profile.self.cycles-pp.perf_trace_sched_stat_runtime
0.37 ± 3% -0.0 0.33 perf-profile.self.cycles-pp.wait_woken
0.18 ± 2% -0.0 0.14 ± 3% perf-profile.self.cycles-pp.xmit_one
0.30 -0.0 0.26 ± 3% perf-profile.self.cycles-pp.import_single_range
0.15 ± 3% -0.0 0.11 ± 4% perf-profile.self.cycles-pp.tcp_rtt_estimator
0.21 ± 3% -0.0 0.17 ± 2% perf-profile.self.cycles-pp.tcp_event_new_data_sent
0.13 -0.0 0.09 ± 4% perf-profile.self.cycles-pp.__copy_skb_header
0.14 ± 6% -0.0 0.11 ± 4% perf-profile.self.cycles-pp.perf_trace_sched_switch
0.19 ± 2% -0.0 0.15 ± 3% perf-profile.self.cycles-pp.inet_ehashfn
0.13 ± 3% -0.0 0.09 perf-profile.self.cycles-pp.update_irq_load_avg
0.14 ± 3% -0.0 0.11 perf-profile.self.cycles-pp.__x64_sys_recvfrom
0.28 ± 3% -0.0 0.25 perf-profile.self.cycles-pp.switch_fpu_return
0.14 ± 3% -0.0 0.10 ± 3% perf-profile.self.cycles-pp.ip_local_deliver_finish
0.11 ± 3% -0.0 0.08 ± 6% perf-profile.self.cycles-pp.ip_local_deliver
0.10 ± 3% -0.0 0.06 ± 7% perf-profile.self.cycles-pp.kmalloc_slab
0.09 ± 10% -0.0 0.06 perf-profile.self.cycles-pp.switch_ldt
0.08 ± 6% -0.0 0.04 ± 44% perf-profile.self.cycles-pp.perf_swevent_get_recursion_context
0.19 ± 3% -0.0 0.16 ± 2% perf-profile.self.cycles-pp.inet_recvmsg
0.10 ± 4% -0.0 0.07 ± 5% perf-profile.self.cycles-pp.perf_trace_sched_wakeup_template
0.18 ± 2% -0.0 0.15 ± 2% perf-profile.self.cycles-pp.__tcp_select_window
0.31 -0.0 0.28 perf-profile.self.cycles-pp.tcp_recvmsg
0.13 ± 2% -0.0 0.10 perf-profile.self.cycles-pp.__netif_receive_skb_one_core
0.14 ± 2% -0.0 0.11 ± 4% perf-profile.self.cycles-pp.nf_hook_slow
0.13 ± 5% -0.0 0.10 ± 6% perf-profile.self.cycles-pp.perf_tp_event
0.13 ± 2% -0.0 0.10 perf-profile.self.cycles-pp.tcp_v4_fill_cb
0.11 ± 4% -0.0 0.08 ± 4% perf-profile.self.cycles-pp.skb_release_head_state
0.13 ± 2% -0.0 0.10 ± 4% perf-profile.self.cycles-pp.__ip_local_out
0.12 ± 4% -0.0 0.08 ± 5% perf-profile.self.cycles-pp.__ip_finish_output
0.32 -0.0 0.29 perf-profile.self.cycles-pp.__sys_recvfrom
0.31 ± 3% -0.0 0.28 ± 2% perf-profile.self.cycles-pp.__ksize
0.13 ± 5% -0.0 0.10 ± 3% perf-profile.self.cycles-pp.set_next_buddy
0.13 ± 2% -0.0 0.10 ± 3% perf-profile.self.cycles-pp.neigh_hh_output
0.27 -0.0 0.24 ± 3% perf-profile.self.cycles-pp.exit_to_user_mode_prepare
0.10 ± 3% -0.0 0.07 ± 5% perf-profile.self.cycles-pp.__cgroup_account_cputime
0.10 ± 4% -0.0 0.07 ± 5% perf-profile.self.cycles-pp.tcp_options_write
0.10 ± 3% -0.0 0.07 ± 6% perf-profile.self.cycles-pp.dev_hard_start_xmit
0.11 -0.0 0.08 ± 5% perf-profile.self.cycles-pp.__sk_dst_check
0.10 ± 4% -0.0 0.07 perf-profile.self.cycles-pp.kmalloc_reserve
0.09 ± 5% -0.0 0.06 perf-profile.self.cycles-pp.qdisc_pkt_len_init
0.08 ± 4% -0.0 0.05 ± 7% perf-profile.self.cycles-pp.__wake_up_common_lock
0.41 -0.0 0.38 ± 2% perf-profile.self.cycles-pp.__sys_sendto
0.22 ± 2% -0.0 0.20 ± 2% perf-profile.self.cycles-pp.try_to_wake_up
0.20 -0.0 0.18 ± 2% perf-profile.self.cycles-pp.tcp_schedule_loss_probe
0.08 -0.0 0.06 ± 9% perf-profile.self.cycles-pp.bpf_skops_write_hdr_opt
0.10 ± 4% -0.0 0.08 perf-profile.self.cycles-pp.tcp_inbound_md5_hash
0.13 ± 3% -0.0 0.11 ± 5% perf-profile.self.cycles-pp.sched_clock_cpu
0.12 ± 4% -0.0 0.09 ± 7% perf-profile.self.cycles-pp.eth_type_trans
0.23 -0.0 0.21 ± 2% perf-profile.self.cycles-pp.__mod_timer
0.08 -0.0 0.06 ± 6% perf-profile.self.cycles-pp.tcp_v4_send_check
0.08 ± 4% -0.0 0.06 ± 8% perf-profile.self.cycles-pp.validate_xmit_xfrm
0.08 ± 4% -0.0 0.06 ± 8% perf-profile.self.cycles-pp.tcp_rate_gen
0.15 ± 3% -0.0 0.12 ± 4% perf-profile.self.cycles-pp.tcp_skb_entail
0.08 ± 5% -0.0 0.06 ± 6% perf-profile.self.cycles-pp.skb_clone_tx_timestamp
0.10 -0.0 0.08 perf-profile.self.cycles-pp.tcp_rate_skb_sent
0.10 ± 3% -0.0 0.08 ± 8% perf-profile.self.cycles-pp.select_task_rq
0.09 ± 5% -0.0 0.07 ± 7% perf-profile.self.cycles-pp.netif_rx_internal
0.08 -0.0 0.06 perf-profile.self.cycles-pp.__build_skb_around
0.07 -0.0 0.05 perf-profile.self.cycles-pp.cpumask_next
0.12 ± 3% -0.0 0.10 ± 3% perf-profile.self.cycles-pp.tcp_event_data_recv
0.20 -0.0 0.18 ± 2% perf-profile.self.cycles-pp.tcp_sendmsg
0.08 ± 6% -0.0 0.06 ± 6% perf-profile.self.cycles-pp.__kfree_skb
0.19 ± 2% -0.0 0.18 ± 2% perf-profile.self.cycles-pp.ip_protocol_deliver_rcu
0.10 ± 3% -0.0 0.08 ± 5% perf-profile.self.cycles-pp.tcp_update_pacing_rate
0.10 ± 5% -0.0 0.08 ± 6% perf-profile.self.cycles-pp.tcp_update_skb_after_send
0.07 ± 5% -0.0 0.05 ± 8% perf-profile.self.cycles-pp.ip_local_out
0.07 ± 5% -0.0 0.05 perf-profile.self.cycles-pp.clear_buddies
0.09 ± 4% -0.0 0.07 perf-profile.self.cycles-pp.__napi_poll
0.08 ± 4% -0.0 0.06 ± 7% perf-profile.self.cycles-pp.tracing_gen_ctx_irq_test
0.10 ± 5% -0.0 0.08 ± 4% perf-profile.self.cycles-pp.security_sock_rcv_skb
0.10 ± 5% -0.0 0.08 ± 4% perf-profile.self.cycles-pp.tcp_rearm_rto
0.15 ± 5% -0.0 0.13 ± 3% perf-profile.self.cycles-pp.resched_curr
0.09 ± 5% -0.0 0.08 ± 6% perf-profile.self.cycles-pp.rb_erase
0.09 ± 4% -0.0 0.07 ± 5% perf-profile.self.cycles-pp.tcp_rate_skb_delivered
0.07 ? 6% -0.0 0.06 ? 6% perf-profile.self.cycles-pp.check_preempt_curr
0.10 ? 5% -0.0 0.08 perf-profile.self.cycles-pp.__tcp_ack_snd_check
0.13 ? 3% -0.0 0.12 ? 3% perf-profile.self.cycles-pp.tcp_v4_do_rcv
0.11 ? 3% -0.0 0.09 ? 5% perf-profile.self.cycles-pp.tcp_mstamp_refresh
0.08 -0.0 0.06 ? 7% perf-profile.self.cycles-pp.__usecs_to_jiffies
0.14 ? 3% -0.0 0.13 perf-profile.self.cycles-pp.sk_filter_trim_cap
0.15 ? 3% -0.0 0.13 ? 2% perf-profile.self.cycles-pp.memcg_slab_post_alloc_hook
0.12 ? 4% -0.0 0.10 perf-profile.self.cycles-pp.tcp_ack_update_rtt
0.19 ? 2% -0.0 0.18 perf-profile.self.cycles-pp.syscall_exit_to_user_mode
0.10 ? 4% -0.0 0.09 perf-profile.self.cycles-pp.__wake_up_common
0.07 ? 5% -0.0 0.06 perf-profile.self.cycles-pp.ttwu_do_activate
0.08 ? 4% -0.0 0.07 perf-profile.self.cycles-pp.tcp_stream_alloc_skb
0.06 -0.0 0.05 perf-profile.self.cycles-pp.skb_push
0.06 -0.0 0.05 perf-profile.self.cycles-pp.cubictcp_acked
0.05 +0.0 0.06 perf-profile.self.cycles-pp.copyin
0.06 +0.0 0.07 perf-profile.self.cycles-pp.inet_sendmsg
0.10 ? 4% +0.0 0.11 perf-profile.self.cycles-pp.tcp_check_space
0.13 ? 3% +0.0 0.15 ? 2% perf-profile.self.cycles-pp.sockfd_lookup_light
0.05 ? 8% +0.0 0.07 ? 5% perf-profile.self.cycles-pp.tcp_rate_check_app_limited
0.13 ? 3% +0.0 0.15 ? 3% perf-profile.self.cycles-pp.tcp_send_mss
0.40 +0.0 0.42 ? 2% perf-profile.self.cycles-pp.recv
0.15 ? 2% +0.0 0.17 ? 2% perf-profile.self.cycles-pp.do_vfs_ioctl
0.05 +0.0 0.07 ? 5% perf-profile.self.cycles-pp.tcp_small_queue_check
0.06 ? 11% +0.0 0.09 ? 4% perf-profile.self.cycles-pp.free_unref_page_commit
0.08 ? 6% +0.0 0.10 perf-profile.self.cycles-pp.tcp_mtu_probe
0.10 +0.0 0.13 ? 2% perf-profile.self.cycles-pp.__sock_wfree
0.05 +0.0 0.08 perf-profile.self.cycles-pp.tcp_add_backlog
0.05 ? 7% +0.0 0.08 ? 4% perf-profile.self.cycles-pp.check_new_pages
0.32 +0.0 0.35 perf-profile.self.cycles-pp.__entry_text_start
0.27 ? 2% +0.0 0.31 perf-profile.self.cycles-pp._copy_to_iter
0.08 ? 4% +0.0 0.12 ? 4% perf-profile.self.cycles-pp.sk_free
0.12 +0.0 0.16 ? 3% perf-profile.self.cycles-pp.tcp_cleanup_rbuf
0.18 ? 2% +0.0 0.22 ? 2% perf-profile.self.cycles-pp.tcp_current_mss
0.17 ? 2% +0.0 0.20 ? 2% perf-profile.self.cycles-pp.skb_do_copy_data_nocache
0.25 +0.0 0.29 ? 2% perf-profile.self.cycles-pp.__skb_clone
0.06 ? 6% +0.0 0.10 perf-profile.self.cycles-pp.__zone_watermark_ok
0.18 ? 2% +0.0 0.22 perf-profile.self.cycles-pp.kfree_skbmem
0.08 ? 9% +0.0 0.12 perf-profile.self.cycles-pp.rmqueue
0.07 +0.0 0.12 ? 4% perf-profile.self.cycles-pp.get_page_from_freelist
0.08 ? 5% +0.0 0.13 ? 3% perf-profile.self.cycles-pp.__free_one_page
0.01 ?223% +0.0 0.06 ? 6% perf-profile.self.cycles-pp.kmem_cache_alloc
0.02 ?141% +0.1 0.07 ? 7% perf-profile.self.cycles-pp.__alloc_pages
0.00 +0.1 0.05 perf-profile.self.cycles-pp.tcp_ack_update_window
0.00 +0.1 0.05 perf-profile.self.cycles-pp.rcu_cblist_dequeue
0.00 +0.1 0.05 ? 7% perf-profile.self.cycles-pp.rb_first
0.26 ? 6% +0.1 0.32 perf-profile.self.cycles-pp.ip_rcv_finish_core
0.14 ? 7% +0.1 0.20 perf-profile.self.cycles-pp.tcp_tx_timestamp
0.00 +0.1 0.06 ? 6% perf-profile.self.cycles-pp.rmqueue_bulk
0.83 +0.1 0.89 perf-profile.self.cycles-pp.__dev_queue_xmit
0.19 ? 3% +0.1 0.25 perf-profile.self.cycles-pp.tcp_tso_segs
0.08 ? 4% +0.1 0.15 ? 2% perf-profile.self.cycles-pp.__release_sock
0.10 ? 9% +0.1 0.18 ? 2% perf-profile.self.cycles-pp.skb_try_coalesce
0.00 +0.1 0.07 ? 10% perf-profile.self.cycles-pp.page_counter_uncharge
0.21 ? 3% +0.1 0.29 ? 3% perf-profile.self.cycles-pp.sock_put
0.01 ?223% +0.1 0.09 ? 41% perf-profile.self.cycles-pp.__mod_memcg_lruvec_state
1.36 +0.1 1.44 perf-profile.self.cycles-pp.syscall_return_via_sysret
0.40 +0.1 0.48 perf-profile.self.cycles-pp.__skb_datagram_iter
0.09 ? 5% +0.1 0.18 ? 2% perf-profile.self.cycles-pp.free_pcp_prepare
0.32 ? 5% +0.1 0.40 ? 5% perf-profile.self.cycles-pp.mwait_idle_with_hints
0.00 +0.1 0.10 ? 7% perf-profile.self.cycles-pp.sk_forced_mem_schedule
0.22 ? 4% +0.1 0.32 perf-profile.self.cycles-pp.tcp_push
0.31 ? 7% +0.1 0.42 ? 2% perf-profile.self.cycles-pp.skb_page_frag_refill
0.14 ? 4% +0.1 0.26 ? 2% perf-profile.self.cycles-pp.__slab_free
0.41 ? 4% +0.1 0.53 perf-profile.self.cycles-pp.ipv4_dst_check
0.00 +0.1 0.12 ? 9% perf-profile.self.cycles-pp.__sk_mem_reduce_allocated
0.41 ? 3% +0.1 0.54 ? 2% perf-profile.self.cycles-pp._copy_from_iter
0.59 +0.1 0.72 perf-profile.self.cycles-pp.skb_release_data
0.33 ? 6% +0.2 0.49 ? 2% perf-profile.self.cycles-pp.dst_release
0.80 +0.2 0.98 perf-profile.self.cycles-pp.__check_object_size
0.23 ? 5% +0.2 0.46 ? 2% perf-profile.self.cycles-pp.ipv4_mtu
1.48 ? 7% +0.3 1.76 ? 2% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
0.67 ? 7% +0.3 0.97 ? 2% perf-profile.self.cycles-pp.update_cfs_group
1.78 ? 5% +0.5 2.32 perf-profile.self.cycles-pp.tcp_write_xmit
1.12 +0.6 1.73 ? 3% perf-profile.self.cycles-pp._raw_spin_lock_bh
2.30 ? 4% +0.7 2.99 perf-profile.self.cycles-pp.tcp_sendmsg_locked
8.12 ? 5% +2.7 10.82 perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
1.19 ? 9% +4.0 5.22 ? 8% perf-profile.self.cycles-pp.__sk_mem_raise_allocated
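
The change column in these comparison tables can be reproduced from the two value columns. The sketch below is an illustration only and is not part of the LKP tooling: the helper names are hypothetical, and it assumes that rows whose change carries a '%' suffix report relative change against the base commit, while rows without the suffix (rate and perf-profile rows) report a plain difference in percentage points.

```python
def relative_change(base: float, patched: float) -> float:
    """Relative change of `patched` against `base`, in percent.

    Assumed to correspond to change values printed with a '%' suffix.
    """
    return (patched - base) / base * 100.0


def point_delta(base: float, patched: float) -> float:
    """Plain difference `patched - base`.

    Assumed to correspond to change values printed without a '%' suffix,
    where both columns are already percentages.
    """
    return patched - base


if __name__ == "__main__":
    # netperf.Throughput_Mbps: 4701 on the base commit, 2612 with the patch
    print(round(relative_change(4701, 2612), 1))   # -44.4
    # perf-profile.self.cycles-pp.__sk_mem_raise_allocated: 1.19 vs 5.22
    print(round(point_delta(1.19, 5.22), 1))       # 4.0
```

The two forms are kept as separate helpers because mixing them up overstates or understates a row: a move from 1.19 to 5.22 is +4.0 points but more than a fourfold relative increase.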


***************************************************************************************************
lkp-csl-2ap4: 192 threads 4 sockets Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
=========================================================================================
cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase/ucode:
cs-localhost/gcc-11/performance/ipv4/x86_64-rhel-8.3/200%/debian-10.4-x86_64-20200603.cgz/900s/lkp-csl-2ap4/TCP_STREAM/netperf/0x500320a

commit:
089c02ae27 ("ftrace: Use preemption model accessors for trace header printout")
6b433275e3 ("sched/fair: filter out overloaded cpus in SIS")

089c02ae2771a14a 6b433275e3a3cf18a16c0d2afb7
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
      4701 ± 11%     -44.4%       2612 ± 33%  netperf.Throughput_Mbps
   1805303 ± 11%     -44.4%    1003138 ± 33%  netperf.Throughput_total_Mbps
 7.318e+09 ± 16%     -60.3%  2.904e+09 ± 65%  netperf.time.involuntary_context_switches
      9183 ±  2%      -8.5%       8398 ±  4%  netperf.time.percent_of_cpu_this_job_got
     76163 ±  2%     -10.1%      68461 ±  5%  netperf.time.system_time
      7722 ±  2%      +7.8%       8324 ±  3%  netperf.time.user_time
  1.24e+10 ± 11%     -44.4%  6.888e+09 ± 33%  netperf.workload
      2.03 ±  9%      +0.9        2.97 ± 14%  mpstat.cpu.all.irq%
     16.05 ±  9%      -5.1       10.95 ± 18%  mpstat.cpu.all.soft%
  16081667 ± 16%     -59.9%    6441541 ± 64%  vmstat.system.cs
    412129           -10.6%     368582 ± 14%  vmstat.system.in
      2953            +4.1%       3073        turbostat.Bzy_MHz
      1.08 ± 12%     -24.2%       0.82 ± 17%  turbostat.CPU%c1
      0.10 ± 16%     -57.1%       0.04 ± 50%  turbostat.IPC
144406 ? 39% -42.2% 83397 ? 41% numa-meminfo.node1.AnonPages
280909 ? 80% -66.5% 94203 ? 50% numa-meminfo.node1.Inactive
280909 ? 80% -66.5% 94203 ? 50% numa-meminfo.node1.Inactive(anon)
7513 ? 5% +33.8% 10054 ? 28% numa-meminfo.node2.KernelStack
1.329e+09 ? 3% -57.8% 5.603e+08 ? 8% numa-numastat.node0.local_node
1.329e+09 ? 3% -57.8% 5.603e+08 ? 8% numa-numastat.node0.numa_hit
1.003e+09 ? 26% -37.4% 6.283e+08 ? 24% numa-numastat.node1.local_node
1.003e+09 ? 26% -37.3% 6.283e+08 ? 24% numa-numastat.node1.numa_hit
105957 +2.7% 108777 proc-vmstat.nr_slab_unreclaimable
4.376e+09 ? 9% -37.8% 2.723e+09 ? 25% proc-vmstat.numa_hit
4.376e+09 ? 9% -37.8% 2.722e+09 ? 25% proc-vmstat.numa_local
4.374e+09 ? 9% -37.8% 2.722e+09 ? 25% proc-vmstat.pgalloc_normal
4.374e+09 ? 9% -37.8% 2.721e+09 ? 25% proc-vmstat.pgfree
245123 -2.8% 238345 ? 2% proc-vmstat.pgreuse
1.329e+09 ? 3% -57.8% 5.603e+08 ? 8% numa-vmstat.node0.numa_hit
1.329e+09 ? 3% -57.8% 5.603e+08 ? 8% numa-vmstat.node0.numa_local
36082 ? 39% -42.2% 20856 ? 41% numa-vmstat.node1.nr_anon_pages
70727 ? 82% -66.7% 23573 ? 50% numa-vmstat.node1.nr_inactive_anon
70727 ? 82% -66.7% 23573 ? 50% numa-vmstat.node1.nr_zone_inactive_anon
1.003e+09 ? 26% -37.3% 6.283e+08 ? 24% numa-vmstat.node1.numa_hit
1.003e+09 ? 26% -37.4% 6.283e+08 ? 24% numa-vmstat.node1.numa_local
7514 ? 5% +33.7% 10050 ? 28% numa-vmstat.node2.nr_kernel_stack
9484366 ? 81% +111.4% 20049256 ? 27% sched_debug.cfs_rq:/.MIN_vruntime.max
102.85 ? 20% +45.9% 150.08 ? 15% sched_debug.cfs_rq:/.load_avg.max
9484366 ? 81% +111.4% 20049256 ? 27% sched_debug.cfs_rq:/.max_vruntime.max
-37498528 +21.4% -45520159 sched_debug.cfs_rq:/.spread0.avg
102.01 ? 4% -12.8% 88.92 ? 7% sched_debug.cfs_rq:/.util_avg.stddev
190.37 ? 12% -38.2% 117.62 ? 18% sched_debug.cfs_rq:/.util_est_enqueued.min
6324 ? 38% +156.7% 16237 ? 41% sched_debug.cpu.avg_idle.min
286287 ? 7% -11.3% 253842 ? 6% sched_debug.cpu.avg_idle.stddev
204.01 ? 31% +102.8% 413.65 ? 21% sched_debug.cpu.clock.stddev
10208 ? 37% -41.3% 5989 ? 55% sched_debug.cpu.clock_task.stddev
0.00 ? 30% +101.7% 0.00 ? 21% sched_debug.cpu.next_balance.stddev
32000235 ? 18% -57.5% 13593568 ? 56% sched_debug.cpu.nr_switches.avg
0.00 ?135% +9.5e+05% 0.09 ?105% sched_debug.rt_rq:/.rt_time.avg
0.00 ?135% +9.5e+05% 16.48 ?105% sched_debug.rt_rq:/.rt_time.max
0.00 ?135% +9.5e+05% 1.19 ?105% sched_debug.rt_rq:/.rt_time.stddev
13.35 ? 7% +107.1% 27.65 ? 27% perf-stat.i.MPKI
4.085e+10 ? 15% -55.4% 1.822e+10 ? 50% perf-stat.i.branch-instructions
5.149e+08 ? 16% -55.9% 2.273e+08 ? 53% perf-stat.i.branch-misses
16690647 ? 17% -60.8% 6537000 ? 65% perf-stat.i.context-switches
3.08 ? 13% +146.2% 7.59 ? 32% perf-stat.i.cpi
5.517e+11 +3.6% 5.716e+11 perf-stat.i.cpu-cycles
757.31 ? 12% -22.9% 583.92 ? 10% perf-stat.i.cpu-migrations
1123 ? 44% -52.8% 530.23 ? 29% perf-stat.i.cycles-between-cache-misses
26399845 ? 34% -59.3% 10744768 ? 54% perf-stat.i.dTLB-load-misses
6.126e+10 ? 15% -55.2% 2.745e+10 ? 50% perf-stat.i.dTLB-loads
5174658 ? 66% -68.0% 1657325 ? 61% perf-stat.i.dTLB-store-misses
3.495e+10 ? 15% -55.5% 1.555e+10 ? 50% perf-stat.i.dTLB-stores
61.10 +6.1 67.20 ? 4% perf-stat.i.iTLB-load-miss-rate%
2.707e+08 ? 16% -55.0% 1.219e+08 ? 50% perf-stat.i.iTLB-load-misses
1.771e+08 ? 18% -63.5% 64622781 ? 67% perf-stat.i.iTLB-loads
2.069e+11 ? 15% -55.3% 9.243e+10 ? 50% perf-stat.i.instructions
0.38 ? 16% -56.3% 0.17 ? 50% perf-stat.i.ipc
2.87 +3.6% 2.98 perf-stat.i.metric.GHz
728.27 ? 15% -54.2% 333.34 ? 48% perf-stat.i.metric.M/sec
34.65 ? 69% -26.6 8.02 ? 79% perf-stat.i.node-load-miss-rate%
38032587 ? 23% -48.0% 19778776 ? 40% perf-stat.i.node-load-misses
26.05 ?108% -23.1 2.97 ? 62% perf-stat.i.node-store-miss-rate%
10666695 ? 19% -48.0% 5547155 ? 32% perf-stat.i.node-store-misses
12.37 ? 7% +119.3% 27.12 ? 28% perf-stat.overall.MPKI
2.82 ? 15% +164.4% 7.46 ? 33% perf-stat.overall.cpi
814.23 ? 25% -40.9% 481.05 ? 24% perf-stat.overall.cycles-between-cache-misses
60.62 +6.3 66.95 ? 4% perf-stat.overall.iTLB-load-miss-rate%
0.36 ? 15% -55.8% 0.16 ? 50% perf-stat.overall.ipc
20.73 ? 47% -14.2 6.51 ? 72% perf-stat.overall.node-load-miss-rate%
7.26 ? 58% -5.4 1.83 ? 75% perf-stat.overall.node-store-miss-rate%
14719 ? 2% -20.9% 11643 ? 12% perf-stat.overall.path-length
3.953e+10 ? 14% -54.5% 1.8e+10 ? 49% perf-stat.ps.branch-instructions
4.98e+08 ? 15% -54.9% 2.244e+08 ? 52% perf-stat.ps.branch-misses
16095450 ? 16% -60.0% 6440979 ? 64% perf-stat.ps.context-switches
5.527e+11 +3.6% 5.725e+11 perf-stat.ps.cpu-cycles
770.48 ? 12% -24.9% 578.71 ? 11% perf-stat.ps.cpu-migrations
25360360 ? 32% -58.0% 10639294 ? 53% perf-stat.ps.dTLB-load-misses
5.93e+10 ? 14% -54.2% 2.713e+10 ? 49% perf-stat.ps.dTLB-loads
4898387 ? 64% -66.5% 1640892 ? 60% perf-stat.ps.dTLB-store-misses
3.382e+10 ? 14% -54.6% 1.537e+10 ? 49% perf-stat.ps.dTLB-stores
2.62e+08 ? 15% -54.1% 1.204e+08 ? 49% perf-stat.ps.iTLB-load-misses
1.707e+08 ? 17% -62.7% 63724182 ? 66% perf-stat.ps.iTLB-loads
2.003e+11 ? 14% -54.4% 9.135e+10 ? 49% perf-stat.ps.instructions
36791027 ? 22% -47.1% 19465718 ? 39% perf-stat.ps.node-load-misses
10396698 ? 18% -47.7% 5442218 ? 33% perf-stat.ps.node-store-misses
1.831e+14 ? 14% -54.3% 8.36e+13 ? 49% perf-stat.total.instructions
12.76 ? 18% -8.3 4.43 ? 53% perf-profile.calltrace.cycles-pp.__local_bh_enable_ip.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit
11.20 ? 18% -6.9 4.33 ? 46% perf-profile.calltrace.cycles-pp.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames
18.04 ? 11% -6.7 11.31 ? 16% perf-profile.calltrace.cycles-pp.do_softirq.__local_bh_enable_ip.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb
17.79 ? 11% -6.6 11.20 ? 16% perf-profile.calltrace.cycles-pp.__softirqentry_text_start.do_softirq.__local_bh_enable_ip.ip_finish_output2.__ip_queue_xmit
52.16 ? 3% -6.5 45.70 ? 4% perf-profile.calltrace.cycles-pp.send_tcp_stream.main.__libc_start_main
52.12 ? 3% -6.4 45.68 ? 4% perf-profile.calltrace.cycles-pp.send_omni_inner.send_tcp_stream.main.__libc_start_main
51.94 ? 3% -6.1 45.85 ? 4% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.send.send_omni_inner.send_tcp_stream
51.29 ? 3% -6.1 45.20 ? 4% perf-profile.calltrace.cycles-pp.send.send_omni_inner.send_tcp_stream.main.__libc_start_main
50.98 ? 3% -6.1 44.91 ? 4% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.send.send_omni_inner.send_tcp_stream.main
16.90 ? 11% -6.1 10.84 ? 15% perf-profile.calltrace.cycles-pp.net_rx_action.__softirqentry_text_start.do_softirq.__local_bh_enable_ip.ip_finish_output2
10.15 ? 17% -6.0 4.10 ? 46% perf-profile.calltrace.cycles-pp.tcp_write_xmit.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg
10.26 ? 16% -6.0 4.24 ? 44% perf-profile.calltrace.cycles-pp.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg.__sys_sendto
46.61 ? 4% -5.9 40.70 ? 4% perf-profile.calltrace.cycles-pp.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe.send.send_omni_inner
9.62 ? 18% -5.9 3.73 ? 52% perf-profile.calltrace.cycles-pp.release_sock.tcp_sendmsg.sock_sendmsg.__sys_sendto.__x64_sys_sendto
46.49 ? 4% -5.8 40.65 ? 4% perf-profile.calltrace.cycles-pp.__sys_sendto.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe.send
16.50 ? 11% -5.8 10.69 ? 15% perf-profile.calltrace.cycles-pp.__napi_poll.net_rx_action.__softirqentry_text_start.do_softirq.__local_bh_enable_ip
9.36 ? 18% -5.8 3.58 ? 53% perf-profile.calltrace.cycles-pp.__release_sock.release_sock.tcp_sendmsg.sock_sendmsg.__sys_sendto
16.42 ? 11% -5.8 10.66 ? 15% perf-profile.calltrace.cycles-pp.process_backlog.__napi_poll.net_rx_action.__softirqentry_text_start.do_softirq
8.82 ? 18% -5.6 3.18 ? 55% perf-profile.calltrace.cycles-pp.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg
45.72 ? 4% -5.6 40.10 ? 4% perf-profile.calltrace.cycles-pp.sock_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe
15.93 ? 11% -5.5 10.44 ? 14% perf-profile.calltrace.cycles-pp.__netif_receive_skb_one_core.process_backlog.__napi_poll.net_rx_action.__softirqentry_text_start
44.98 ? 4% -5.3 39.65 ? 4% perf-profile.calltrace.cycles-pp.tcp_sendmsg.sock_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64
7.28 ? 17% -5.3 1.97 ? 75% perf-profile.calltrace.cycles-pp.sock_def_readable.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv.ip_protocol_deliver_rcu
7.14 ? 17% -5.2 1.92 ? 74% perf-profile.calltrace.cycles-pp.__wake_up_common_lock.sock_def_readable.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv
6.92 ? 17% -5.2 1.76 ? 78% perf-profile.calltrace.cycles-pp.__wake_up_common.__wake_up_common_lock.sock_def_readable.tcp_rcv_established.tcp_v4_do_rcv
8.06 ? 18% -5.1 2.92 ? 55% perf-profile.calltrace.cycles-pp.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames.tcp_sendmsg_locked
6.67 ? 17% -5.1 1.60 ? 84% perf-profile.calltrace.cycles-pp.try_to_wake_up.__wake_up_common.__wake_up_common_lock.sock_def_readable.tcp_rcv_established
7.40 ? 18% -5.0 2.35 ? 61% perf-profile.calltrace.cycles-pp.sk_wait_data.tcp_recvmsg_locked.tcp_recvmsg.inet_recvmsg.__sys_recvfrom
6.42 ? 18% -4.5 1.96 ? 66% perf-profile.calltrace.cycles-pp.wait_woken.sk_wait_data.tcp_recvmsg_locked.tcp_recvmsg.inet_recvmsg
13.96 ? 9% -4.4 9.56 ? 11% perf-profile.calltrace.cycles-pp.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog.__napi_poll.net_rx_action
13.58 ? 9% -4.3 9.30 ? 11% perf-profile.calltrace.cycles-pp.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog.__napi_poll
6.13 ? 18% -4.3 1.86 ? 66% perf-profile.calltrace.cycles-pp.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked.tcp_recvmsg
10.48 ? 11% -4.2 6.24 ? 19% perf-profile.calltrace.cycles-pp.tcp_v4_do_rcv.tcp_v4_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core
6.02 ? 18% -4.2 1.82 ? 66% perf-profile.calltrace.cycles-pp.schedule.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
13.34 ? 9% -4.2 9.18 ? 11% perf-profile.calltrace.cycles-pp.tcp_v4_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog
5.89 ? 18% -4.1 1.78 ? 66% perf-profile.calltrace.cycles-pp.__schedule.schedule.schedule_timeout.wait_woken.sk_wait_data
10.14 ? 10% -4.0 6.13 ? 18% perf-profile.calltrace.cycles-pp.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish
6.29 ? 18% -3.8 2.52 ? 54% perf-profile.calltrace.cycles-pp.tcp_v4_do_rcv.__release_sock.release_sock.tcp_sendmsg.sock_sendmsg
5.95 ? 18% -3.6 2.36 ? 53% perf-profile.calltrace.cycles-pp.tcp_rcv_established.tcp_v4_do_rcv.__release_sock.release_sock.tcp_sendmsg
5.26 ? 19% -3.3 1.94 ? 53% perf-profile.calltrace.cycles-pp.tcp_write_xmit.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg.__sys_sendto
5.16 ? 18% -3.2 1.95 ? 52% perf-profile.calltrace.cycles-pp.__tcp_push_pending_frames.tcp_rcv_established.tcp_v4_do_rcv.__release_sock.release_sock
5.13 ? 18% -3.2 1.94 ? 52% perf-profile.calltrace.cycles-pp.tcp_write_xmit.__tcp_push_pending_frames.tcp_rcv_established.tcp_v4_do_rcv.__release_sock
4.74 ? 19% -3.0 1.75 ? 53% perf-profile.calltrace.cycles-pp.__tcp_transmit_skb.tcp_write_xmit.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg
4.36 ? 19% -2.8 1.60 ? 53% perf-profile.calltrace.cycles-pp.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit.tcp_sendmsg_locked.tcp_sendmsg
3.99 ? 19% -2.5 1.44 ? 52% perf-profile.calltrace.cycles-pp.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit.tcp_sendmsg_locked
4.18 ? 18% -2.2 2.00 ? 29% perf-profile.calltrace.cycles-pp.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames.tcp_rcv_established
2.28 ? 24% -1.7 0.54 ?116% perf-profile.calltrace.cycles-pp.dequeue_task_fair.__schedule.schedule.schedule_timeout.wait_woken
4.84 ? 11% -1.7 3.18 ? 17% perf-profile.calltrace.cycles-pp.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames.tcp_rcv_established.tcp_v4_do_rcv
2.30 ? 16% -1.4 0.88 ? 53% perf-profile.calltrace.cycles-pp.tcp_ack.tcp_rcv_established.tcp_v4_do_rcv.__release_sock.release_sock
1.77 ? 16% -1.3 0.48 ?112% perf-profile.calltrace.cycles-pp.tcp_clean_rtx_queue.tcp_ack.tcp_rcv_established.tcp_v4_do_rcv.__release_sock
1.86 ? 17% -1.3 0.58 ? 88% perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.send
1.68 ? 44% -1.3 0.41 ? 76% perf-profile.calltrace.cycles-pp.__dev_queue_xmit.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit
8.40 ? 9% -1.2 7.16 ? 8% perf-profile.calltrace.cycles-pp.__tcp_transmit_skb.tcp_recvmsg_locked.tcp_recvmsg.inet_recvmsg.__sys_recvfrom
1.58 ? 17% -1.2 0.42 ?111% perf-profile.calltrace.cycles-pp.recv
1.56 ? 17% -1.1 0.44 ?111% perf-profile.calltrace.cycles-pp.send
1.24 ? 21% -0.7 0.52 ? 83% perf-profile.calltrace.cycles-pp.__dev_queue_xmit.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb.tcp_recvmsg_locked
2.26 ? 5% -0.5 1.78 ? 8% perf-profile.calltrace.cycles-pp.tcp_stream_alloc_skb.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg.__sys_sendto
2.11 ? 4% -0.4 1.70 ? 7% perf-profile.calltrace.cycles-pp.__alloc_skb.tcp_stream_alloc_skb.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg
6.95 ? 2% +0.3 7.27 perf-profile.calltrace.cycles-pp.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb.tcp_recvmsg_locked.tcp_recvmsg
0.55 ? 45% +0.4 0.92 ? 24% perf-profile.calltrace.cycles-pp.simple_copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.tcp_recvmsg_locked.tcp_recvmsg
0.48 ? 46% +0.4 0.87 ? 26% perf-profile.calltrace.cycles-pp.__check_object_size.simple_copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.tcp_recvmsg_locked
0.74 ? 9% +0.5 1.22 ? 13% perf-profile.calltrace.cycles-pp.sk_page_frag_refill.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg.__sys_sendto
0.72 ? 10% +0.5 1.21 ? 14% perf-profile.calltrace.cycles-pp.skb_page_frag_refill.sk_page_frag_refill.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg
0.09 ?223% +0.7 0.81 ? 24% perf-profile.calltrace.cycles-pp.release_sock.tcp_recvmsg.inet_recvmsg.__sys_recvfrom.__x64_sys_recvfrom
0.18 ?141% +0.7 0.92 ? 18% perf-profile.calltrace.cycles-pp.__alloc_pages.skb_page_frag_refill.sk_page_frag_refill.tcp_sendmsg_locked.tcp_sendmsg
0.00 +0.8 0.81 ? 21% perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages.skb_page_frag_refill.sk_page_frag_refill.tcp_sendmsg_locked
34.53 +1.1 35.64 perf-profile.calltrace.cycles-pp.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg.__sys_sendto.__x64_sys_sendto
0.19 ?142% +1.2 1.41 ? 35% perf-profile.calltrace.cycles-pp.skb_release_data.__kfree_skb.__sk_defer_free_flush.tcp_v4_rcv.ip_protocol_deliver_rcu
0.19 ?142% +1.2 1.41 ? 35% perf-profile.calltrace.cycles-pp.__kfree_skb.__sk_defer_free_flush.tcp_v4_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish
0.20 ?142% +1.3 1.49 ? 35% perf-profile.calltrace.cycles-pp.__sk_defer_free_flush.tcp_v4_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core
5.46 ? 3% +1.5 6.93 ? 6% perf-profile.calltrace.cycles-pp.__local_bh_enable_ip.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb.tcp_recvmsg_locked
0.52 ? 81% +1.5 2.00 ? 25% perf-profile.calltrace.cycles-pp.tcp_write_xmit.__tcp_push_pending_frames.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv
0.52 ? 81% +1.5 2.00 ? 25% perf-profile.calltrace.cycles-pp.__tcp_push_pending_frames.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv.ip_protocol_deliver_rcu
0.34 ?152% +1.8 2.12 ? 27% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.copy_user_enhanced_fast_string.copyout._copy_to_iter.__skb_datagram_iter
41.80 ? 6% +8.9 50.66 ? 5% perf-profile.calltrace.cycles-pp.recv.recv_omni.process_requests.spawn_child.accept_connection
41.35 ? 6% +9.1 50.45 ? 5% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.recv.recv_omni.process_requests.spawn_child
37.18 ? 7% +9.2 46.39 ? 6% perf-profile.calltrace.cycles-pp.tcp_recvmsg_locked.tcp_recvmsg.inet_recvmsg.__sys_recvfrom.__x64_sys_recvfrom
41.12 ? 7% +9.2 50.36 ? 6% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.recv.recv_omni.process_requests
39.12 ? 7% +9.3 48.38 ? 6% perf-profile.calltrace.cycles-pp.__sys_recvfrom.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe.recv
38.29 ? 7% +9.6 47.85 ? 6% perf-profile.calltrace.cycles-pp.inet_recvmsg.__sys_recvfrom.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe
38.13 ? 7% +9.7 47.78 ? 7% perf-profile.calltrace.cycles-pp.tcp_recvmsg.inet_recvmsg.__sys_recvfrom.__x64_sys_recvfrom.do_syscall_64
38.50 ? 7% +9.8 48.32 ? 6% perf-profile.calltrace.cycles-pp.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe.recv.recv_omni
10.96 ? 28% +10.0 21.01 ? 16% perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyin._copy_from_iter.skb_do_copy_data_nocache.tcp_sendmsg_locked
12.20 ? 25% +10.1 22.33 ? 16% perf-profile.calltrace.cycles-pp.skb_do_copy_data_nocache.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg.__sys_sendto
11.21 ? 28% +10.2 21.40 ? 16% perf-profile.calltrace.cycles-pp.copyin._copy_from_iter.skb_do_copy_data_nocache.tcp_sendmsg_locked.tcp_sendmsg
11.68 ? 27% +10.3 22.02 ? 16% perf-profile.calltrace.cycles-pp._copy_from_iter.skb_do_copy_data_nocache.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg
15.86 ? 31% +14.9 30.77 ? 16% perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyout._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter
16.71 ? 30% +15.3 32.02 ? 16% perf-profile.calltrace.cycles-pp._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.tcp_recvmsg_locked.tcp_recvmsg
16.30 ? 31% +15.5 31.79 ? 16% perf-profile.calltrace.cycles-pp.copyout._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.tcp_recvmsg_locked
17.87 ? 29% +15.7 33.59 ? 16% perf-profile.calltrace.cycles-pp.skb_copy_datagram_iter.tcp_recvmsg_locked.tcp_recvmsg.inet_recvmsg.__sys_recvfrom
17.84 ? 29% +15.7 33.57 ? 16% perf-profile.calltrace.cycles-pp.__skb_datagram_iter.skb_copy_datagram_iter.tcp_recvmsg_locked.tcp_recvmsg.inet_recvmsg
27.74 ? 14% -12.2 15.57 ? 25% perf-profile.children.cycles-pp.__tcp_transmit_skb
21.97 ? 16% -11.7 10.31 ? 35% perf-profile.children.cycles-pp.tcp_write_xmit
25.43 ? 14% -10.8 14.64 ? 24% perf-profile.children.cycles-pp.__ip_queue_xmit
22.89 ? 13% -9.5 13.43 ? 21% perf-profile.children.cycles-pp.ip_finish_output2
16.64 ? 14% -8.0 8.60 ? 29% perf-profile.children.cycles-pp.__tcp_push_pending_frames
17.72 ? 13% -7.8 9.91 ? 22% perf-profile.children.cycles-pp.tcp_v4_do_rcv
55.10 ? 4% -7.5 47.59 ? 5% perf-profile.children.cycles-pp.send
17.03 ? 12% -7.4 9.59 ? 21% perf-profile.children.cycles-pp.tcp_rcv_established
18.87 ? 12% -7.3 11.56 ? 17% perf-profile.children.cycles-pp.__local_bh_enable_ip
18.56 ? 12% -7.1 11.44 ? 17% perf-profile.children.cycles-pp.do_softirq
54.21 ? 4% -6.9 47.33 ? 4% perf-profile.children.cycles-pp.send_omni_inner
18.36 ? 12% -6.8 11.53 ? 16% perf-profile.children.cycles-pp.__softirqentry_text_start
53.87 ? 3% -6.8 47.07 ? 4% perf-profile.children.cycles-pp.send_tcp_stream
17.39 ? 12% -6.4 11.00 ? 16% perf-profile.children.cycles-pp.net_rx_action
16.98 ? 12% -6.1 10.84 ? 15% perf-profile.children.cycles-pp.__napi_poll
16.89 ? 12% -6.1 10.80 ? 15% perf-profile.children.cycles-pp.process_backlog
47.46 ? 4% -6.1 41.38 ? 4% perf-profile.children.cycles-pp.__x64_sys_sendto
47.34 ? 4% -6.0 41.33 ? 4% perf-profile.children.cycles-pp.__sys_sendto
46.65 ? 4% -5.9 40.77 ? 4% perf-profile.children.cycles-pp.sock_sendmsg
9.54 ? 17% -5.9 3.68 ? 48% perf-profile.children.cycles-pp.__schedule
10.69 ? 17% -5.8 4.85 ? 39% perf-profile.children.cycles-pp.release_sock
16.39 ? 11% -5.8 10.58 ? 15% perf-profile.children.cycles-pp.__netif_receive_skb_one_core
45.99 ? 4% -5.7 40.31 ? 4% perf-profile.children.cycles-pp.tcp_sendmsg
10.14 ? 17% -5.6 4.59 ? 40% perf-profile.children.cycles-pp.__release_sock
7.57 ? 18% -5.1 2.45 ? 50% perf-profile.children.cycles-pp.sock_def_readable
7.90 ? 17% -4.9 3.02 ? 48% perf-profile.children.cycles-pp.schedule
7.47 ? 17% -4.8 2.62 ? 42% perf-profile.children.cycles-pp.__wake_up_common_lock
7.42 ? 18% -4.8 2.58 ? 55% perf-profile.children.cycles-pp.sk_wait_data
14.46 ? 10% -4.8 9.70 ? 11% perf-profile.children.cycles-pp.ip_local_deliver_finish
7.25 ? 17% -4.7 2.53 ? 42% perf-profile.children.cycles-pp.__wake_up_common
7.02 ? 17% -4.6 2.43 ? 42% perf-profile.children.cycles-pp.try_to_wake_up
14.05 ? 10% -4.5 9.51 ? 11% perf-profile.children.cycles-pp.ip_protocol_deliver_rcu
13.84 ? 10% -4.4 9.39 ? 11% perf-profile.children.cycles-pp.tcp_v4_rcv
6.46 ? 17% -4.0 2.48 ? 46% perf-profile.children.cycles-pp.wait_woken
6.18 ? 17% -3.8 2.37 ? 46% perf-profile.children.cycles-pp.schedule_timeout
2.58 ? 16% -1.7 0.91 ? 50% perf-profile.children.cycles-pp.ttwu_do_activate
2.53 ? 17% -1.7 0.87 ? 52% perf-profile.children.cycles-pp.enqueue_task_fair
2.82 ? 17% -1.7 1.17 ? 45% perf-profile.children.cycles-pp.exit_to_user_mode_prepare
2.43 ? 18% -1.6 0.85 ? 53% perf-profile.children.cycles-pp.dequeue_task_fair
2.34 ? 17% -1.6 0.78 ? 30% perf-profile.children.cycles-pp.select_task_rq
3.63 ? 15% -1.5 2.10 ? 31% perf-profile.children.cycles-pp.__dev_queue_xmit
3.32 ± 13% -1.5 1.83 ± 25% perf-profile.children.cycles-pp.tcp_ack
2.19 ± 18% -1.5 0.71 ± 30% perf-profile.children.cycles-pp.select_task_rq_fair
1.92 ± 18% -1.3 0.60 ± 27% perf-profile.children.cycles-pp.select_idle_sibling
2.10 ± 18% -1.3 0.80 ± 55% perf-profile.children.cycles-pp.pick_next_task_fair
2.10 ± 17% -1.2 0.86 ± 47% perf-profile.children.cycles-pp.__cond_resched
1.93 ± 17% -1.2 0.75 ± 51% perf-profile.children.cycles-pp.exit_to_user_mode_loop
1.74 ± 16% -1.1 0.65 ± 51% perf-profile.children.cycles-pp.update_load_avg
2.51 ± 12% -1.1 1.44 ± 24% perf-profile.children.cycles-pp.tcp_clean_rtx_queue
1.70 ± 16% -1.0 0.67 ± 46% perf-profile.children.cycles-pp.update_curr
1.40 ± 19% -1.0 0.37 ± 21% perf-profile.children.cycles-pp.select_idle_cpu
1.74 ± 15% -1.0 0.76 ± 40% perf-profile.children.cycles-pp.dev_hard_start_xmit
1.61 ± 17% -1.0 0.64 ± 52% perf-profile.children.cycles-pp.switch_mm_irqs_off
1.67 ± 15% -0.9 0.74 ± 39% perf-profile.children.cycles-pp.xmit_one
3.06 ± 8% -0.9 2.15 ± 13% perf-profile.children.cycles-pp.__alloc_skb
1.56 ± 16% -0.9 0.68 ± 40% perf-profile.children.cycles-pp.loopback_xmit
1.52 ± 16% -0.8 0.67 ± 40% perf-profile.children.cycles-pp.kmem_cache_alloc_node
1.32 ± 16% -0.8 0.49 ± 52% perf-profile.children.cycles-pp.enqueue_entity
1.20 ± 27% -0.7 0.49 ± 63% perf-profile.children.cycles-pp.dst_release
1.06 ± 26% -0.6 0.41 ± 64% perf-profile.children.cycles-pp.skb_release_head_state
0.90 ± 19% -0.6 0.26 ± 56% perf-profile.children.cycles-pp.reweight_entity
1.04 ± 17% -0.6 0.40 ± 52% perf-profile.children.cycles-pp.dequeue_entity
2.54 ± 7% -0.6 1.92 ± 10% perf-profile.children.cycles-pp.tcp_stream_alloc_skb
0.90 ± 19% -0.6 0.28 ± 60% perf-profile.children.cycles-pp.set_next_entity
1.14 ± 25% -0.6 0.54 ± 56% perf-profile.children.cycles-pp.ip_rcv
0.70 ± 20% -0.6 0.14 ± 21% perf-profile.children.cycles-pp.available_idle_cpu
0.89 ± 18% -0.5 0.34 ± 39% perf-profile.children.cycles-pp.prepare_task_switch
0.96 ± 29% -0.5 0.48 ± 51% perf-profile.children.cycles-pp.ipv4_dst_check
0.87 ± 18% -0.5 0.42 ± 12% perf-profile.children.cycles-pp._raw_spin_lock
0.75 ± 17% -0.4 0.31 ± 44% perf-profile.children.cycles-pp.__switch_to_asm
0.65 ± 17% -0.4 0.22 ± 60% perf-profile.children.cycles-pp.ttwu_do_wakeup
0.64 ± 17% -0.4 0.22 ± 50% perf-profile.children.cycles-pp.sched_clock_cpu
0.75 ± 19% -0.4 0.33 ± 41% perf-profile.children.cycles-pp._raw_spin_lock_bh
0.62 ± 17% -0.4 0.21 ± 60% perf-profile.children.cycles-pp.check_preempt_curr
0.80 ± 16% -0.4 0.38 ± 36% perf-profile.children.cycles-pp.switch_fpu_return
0.77 ± 29% -0.4 0.36 ± 61% perf-profile.children.cycles-pp.ip_rcv_finish_core
0.57 ± 17% -0.4 0.18 ± 63% perf-profile.children.cycles-pp.check_preempt_wakeup
0.68 ± 15% -0.4 0.30 ± 39% perf-profile.children.cycles-pp.__netif_rx
0.78 ± 16% -0.4 0.40 ± 32% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
0.66 ± 15% -0.4 0.29 ± 40% perf-profile.children.cycles-pp.netif_rx_internal
0.54 ± 18% -0.4 0.17 ± 54% perf-profile.children.cycles-pp.native_sched_clock
0.70 ± 16% -0.4 0.33 ± 32% perf-profile.children.cycles-pp.tcp_event_new_data_sent
0.73 ± 14% -0.4 0.37 ± 31% perf-profile.children.cycles-pp.__tcp_send_ack
0.64 ± 15% -0.4 0.28 ± 40% perf-profile.children.cycles-pp.__netif_receive_skb_core
0.56 ± 17% -0.4 0.20 ± 41% perf-profile.children.cycles-pp.update_rq_clock
0.58 ± 15% -0.3 0.26 ± 39% perf-profile.children.cycles-pp.enqueue_to_backlog
0.52 ± 16% -0.3 0.21 ± 41% perf-profile.children.cycles-pp.irqtime_account_irq
0.59 ± 16% -0.3 0.28 ± 38% perf-profile.children.cycles-pp.__switch_to
0.60 ± 18% -0.3 0.29 ± 31% perf-profile.children.cycles-pp.__inet_lookup_established
0.53 ± 14% -0.3 0.23 ± 32% perf-profile.children.cycles-pp.perf_trace_sched_wakeup_template
0.46 ± 18% -0.3 0.16 ± 59% perf-profile.children.cycles-pp.put_prev_entity
0.63 ± 13% -0.3 0.34 ± 23% perf-profile.children.cycles-pp.sk_reset_timer
0.48 ± 16% -0.3 0.19 ± 43% perf-profile.children.cycles-pp.read_tsc
0.49 ± 17% -0.3 0.21 ± 37% perf-profile.children.cycles-pp.lock_sock_nested
0.49 ± 18% -0.3 0.22 ± 46% perf-profile.children.cycles-pp.tcp_schedule_loss_probe
0.53 ± 17% -0.3 0.26 ± 34% perf-profile.children.cycles-pp.restore_fpregs_from_fpstate
0.43 ± 16% -0.3 0.17 ± 41% perf-profile.children.cycles-pp.syscall_return_via_sysret
0.41 ± 17% -0.3 0.15 ± 52% perf-profile.children.cycles-pp.__update_load_avg_cfs_rq
0.58 ± 13% -0.3 0.33 ± 21% perf-profile.children.cycles-pp.__mod_timer
0.41 ± 17% -0.3 0.16 ± 46% perf-profile.children.cycles-pp.__update_load_avg_se
0.82 ± 14% -0.2 0.57 ± 5% perf-profile.children.cycles-pp.ktime_get
0.43 ± 16% -0.2 0.19 ± 41% perf-profile.children.cycles-pp.___perf_sw_event
0.47 ± 17% -0.2 0.24 ± 38% perf-profile.children.cycles-pp.validate_xmit_skb
0.70 ± 9% -0.2 0.48 ± 15% perf-profile.children.cycles-pp.kmalloc_reserve
0.42 ± 16% -0.2 0.20 ± 39% perf-profile.children.cycles-pp.save_fpregs_to_fpstate
0.41 ± 15% -0.2 0.19 ± 38% perf-profile.children.cycles-pp.__entry_text_start
0.46 ± 14% -0.2 0.25 ± 25% perf-profile.children.cycles-pp.__might_resched
0.41 ± 15% -0.2 0.20 ± 29% perf-profile.children.cycles-pp.tcp_mstamp_refresh
0.29 ± 16% -0.2 0.11 ± 54% perf-profile.children.cycles-pp._raw_spin_lock_irq
0.32 ± 17% -0.2 0.13 ± 64% perf-profile.children.cycles-pp.set_next_buddy
0.32 ± 19% -0.2 0.14 ± 50% perf-profile.children.cycles-pp.tcp_ack_update_rtt
0.48 ± 8% -0.2 0.31 ± 12% perf-profile.children.cycles-pp.__virt_addr_valid
0.36 ± 22% -0.2 0.19 ± 39% perf-profile.children.cycles-pp.tcp_add_backlog
0.63 ± 8% -0.2 0.46 ± 13% perf-profile.children.cycles-pp.__kmalloc_node_track_caller
0.35 ± 17% -0.2 0.18 ± 28% perf-profile.children.cycles-pp.update_cfs_group
0.32 ± 12% -0.2 0.15 ± 25% perf-profile.children.cycles-pp.__might_fault
0.30 ± 15% -0.2 0.13 ± 40% perf-profile.children.cycles-pp.__might_sleep
0.32 ± 14% -0.2 0.16 ± 27% perf-profile.children.cycles-pp.ktime_get_with_offset
0.28 ± 17% -0.2 0.11 ± 48% perf-profile.children.cycles-pp.ip_local_out
0.23 ± 20% -0.2 0.07 ± 47% perf-profile.children.cycles-pp.rb_erase
0.27 ± 18% -0.2 0.11 ± 52% perf-profile.children.cycles-pp.__calc_delta
0.33 ± 18% -0.2 0.18 ± 41% perf-profile.children.cycles-pp.sk_filter_trim_cap
0.32 ± 15% -0.1 0.17 ± 19% perf-profile.children.cycles-pp.send_data
0.24 ± 16% -0.1 0.09 ± 55% perf-profile.children.cycles-pp.update_min_vruntime
0.25 ± 16% -0.1 0.11 ± 47% perf-profile.children.cycles-pp.ip_rcv_core
0.18 ± 18% -0.1 0.04 ±109% perf-profile.children.cycles-pp.perf_trace_buf_alloc
0.24 ± 17% -0.1 0.10 ± 47% perf-profile.children.cycles-pp.__ip_local_out
0.25 ± 17% -0.1 0.11 ± 59% perf-profile.children.cycles-pp.pick_next_entity
0.26 ± 14% -0.1 0.12 ± 29% perf-profile.children.cycles-pp.perf_tp_event
0.24 ± 18% -0.1 0.10 ± 36% perf-profile.children.cycles-pp.ip_finish_output
0.22 ± 18% -0.1 0.09 ± 41% perf-profile.children.cycles-pp.cpuacct_charge
0.32 ± 16% -0.1 0.18 ± 14% perf-profile.children.cycles-pp.recv_data
0.45 ± 9% -0.1 0.32 ± 9% perf-profile.children.cycles-pp.kmem_cache_free
0.26 ± 13% -0.1 0.14 ± 38% perf-profile.children.cycles-pp.ip_output
0.22 ± 18% -0.1 0.10 ± 43% perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
0.21 ± 16% -0.1 0.09 ± 36% perf-profile.children.cycles-pp._find_next_bit
0.21 ± 20% -0.1 0.09 ± 56% perf-profile.children.cycles-pp.tcp_rtt_estimator
0.27 ± 12% -0.1 0.16 ± 27% perf-profile.children.cycles-pp.tcp_tso_segs
0.16 ± 18% -0.1 0.04 ±111% perf-profile.children.cycles-pp.ip_send_check
0.21 ± 17% -0.1 0.10 ± 40% perf-profile.children.cycles-pp.netif_skb_features
0.53 ± 7% -0.1 0.42 ± 10% perf-profile.children.cycles-pp.aa_sk_perm
0.46 ± 7% -0.1 0.35 ± 8% perf-profile.children.cycles-pp.security_socket_sendmsg
0.24 ± 18% -0.1 0.12 ± 45% perf-profile.children.cycles-pp.finish_task_switch
0.15 ± 21% -0.1 0.04 ±115% perf-profile.children.cycles-pp.sock_put
0.16 ± 19% -0.1 0.05 ± 88% perf-profile.children.cycles-pp.tcp_rearm_rto
0.14 ± 11% -0.1 0.04 ± 75% perf-profile.children.cycles-pp.__copy_skb_header
0.15 ± 14% -0.1 0.05 ± 82% perf-profile.children.cycles-pp.inet_ehashfn
0.17 ± 12% -0.1 0.07 ± 33% perf-profile.children.cycles-pp.tcp_update_skb_after_send
0.19 ± 19% -0.1 0.09 ± 39% perf-profile.children.cycles-pp.tcp_wfree
0.17 ± 17% -0.1 0.07 ± 44% perf-profile.children.cycles-pp.remove_wait_queue
0.15 ± 14% -0.1 0.05 ± 77% perf-profile.children.cycles-pp.perf_trace_sched_switch
0.72 ± 5% -0.1 0.62 ± 4% perf-profile.children.cycles-pp.sockfd_lookup_light
0.18 ± 19% -0.1 0.08 ± 41% perf-profile.children.cycles-pp.add_wait_queue
0.33 ± 6% -0.1 0.24 ± 14% perf-profile.children.cycles-pp.__skb_clone
0.22 ± 13% -0.1 0.13 ± 18% perf-profile.children.cycles-pp.tcp_rcv_space_adjust
0.18 ± 16% -0.1 0.08 ± 36% perf-profile.children.cycles-pp.memcg_slab_free_hook
0.13 ± 16% -0.1 0.03 ±106% perf-profile.children.cycles-pp.kmalloc_slab
0.20 ± 17% -0.1 0.11 ± 41% perf-profile.children.cycles-pp.__tcp_ack_snd_check
0.14 ± 17% -0.1 0.05 ± 80% perf-profile.children.cycles-pp.perf_trace_sched_stat_runtime
0.24 ± 11% -0.1 0.16 ± 15% perf-profile.children.cycles-pp.__ksize
0.16 ± 16% -0.1 0.08 ± 44% perf-profile.children.cycles-pp.cpumask_next
0.17 ± 13% -0.1 0.09 ± 29% perf-profile.children.cycles-pp.tcp_established_options
0.15 ± 15% -0.1 0.07 ± 37% perf-profile.children.cycles-pp.__tcp_select_window
0.14 ± 14% -0.1 0.07 ± 29% perf-profile.children.cycles-pp.tcp_event_data_recv
0.14 ± 14% -0.1 0.08 ± 22% perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
0.13 ± 12% -0.1 0.06 ± 28% perf-profile.children.cycles-pp.__list_add_valid
0.13 ± 17% -0.1 0.07 ± 37% perf-profile.children.cycles-pp.rb_next
0.10 ± 7% -0.1 0.04 ± 73% perf-profile.children.cycles-pp.eth_type_trans
0.11 ± 16% -0.1 0.06 ± 25% perf-profile.children.cycles-pp.perf_trace_buf_update
0.13 ± 9% -0.1 0.08 ± 13% perf-profile.children.cycles-pp.kfree_skbmem
0.56 ± 5% -0.1 0.52 ± 2% perf-profile.children.cycles-pp.__fget_light
0.10 ± 16% -0.0 0.06 ± 25% perf-profile.children.cycles-pp.tcp_cleanup_rbuf
0.09 ± 12% -0.0 0.05 ± 47% perf-profile.children.cycles-pp.check_stack_object
0.10 ± 8% -0.0 0.07 ± 15% perf-profile.children.cycles-pp.rcu_all_qs
0.11 ± 6% -0.0 0.09 ± 4% perf-profile.children.cycles-pp.lock_timer_base
0.07 ± 10% -0.0 0.05 ± 8% perf-profile.children.cycles-pp.enqueue_timer
0.26 ± 3% +0.1 0.31 ± 4% perf-profile.children.cycles-pp.tcp_eat_recv_skb
0.02 ±142% +0.1 0.08 ± 21% perf-profile.children.cycles-pp.native_irq_return_iret
0.00 +0.1 0.08 ± 25% perf-profile.children.cycles-pp.write
0.02 ±141% +0.1 0.10 ± 20% perf-profile.children.cycles-pp.__zone_watermark_ok
0.35 ± 3% +0.1 0.42 ± 11% perf-profile.children.cycles-pp.__list_del_entry_valid
0.18 ± 8% +0.1 0.26 ± 8% perf-profile.children.cycles-pp.sock_rfree
0.02 ±142% +0.1 0.11 ± 31% perf-profile.children.cycles-pp.call_timer_fn
0.02 ±142% +0.1 0.12 ± 27% perf-profile.children.cycles-pp.run_timer_softirq
0.02 ±141% +0.1 0.12 ± 27% perf-profile.children.cycles-pp.__run_timers
0.02 ±141% +0.1 0.14 ± 29% perf-profile.children.cycles-pp.__free_one_page
0.02 ±141% +0.1 0.15 ± 25% perf-profile.children.cycles-pp.check_new_pages
0.07 ± 76% +0.1 0.21 ± 24% perf-profile.children.cycles-pp.__irq_exit_rcu
0.51 ± 7% +0.1 0.66 ± 9% perf-profile.children.cycles-pp.kfree
0.03 ±141% +0.2 0.19 ± 28% perf-profile.children.cycles-pp.___slab_alloc
0.12 ± 27% +0.2 0.28 ± 27% perf-profile.children.cycles-pp.__put_page
0.12 ± 27% +0.2 0.32 ± 17% perf-profile.children.cycles-pp.task_tick_fair
0.06 ± 78% +0.2 0.26 ± 33% perf-profile.children.cycles-pp.PageHuge
0.26 ± 21% +0.2 0.46 ± 18% perf-profile.children.cycles-pp.tcp_queue_rcv
0.02 ±142% +0.2 0.22 ± 39% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
0.18 ± 37% +0.2 0.41 ± 18% perf-profile.children.cycles-pp.skb_clone
0.04 ±108% +0.2 0.28 ± 29% perf-profile.children.cycles-pp.sk_stream_write_space
0.13 ± 29% +0.2 0.38 ± 23% perf-profile.children.cycles-pp.tcp_check_space
0.04 ±141% +0.2 0.28 ± 29% perf-profile.children.cycles-pp.sk_stream_wait_memory
0.19 ± 23% +0.2 0.44 ± 16% perf-profile.children.cycles-pp.scheduler_tick
0.03 ±141% +0.3 0.29 ± 33% perf-profile.children.cycles-pp.free_pcppages_bulk
0.10 ± 82% +0.3 0.36 ± 23% perf-profile.children.cycles-pp.__slab_free
0.09 ± 84% +0.3 0.36 ± 27% perf-profile.children.cycles-pp.skb_try_coalesce
0.04 ±107% +0.3 0.31 ± 30% perf-profile.children.cycles-pp.rmqueue_bulk
0.10 ± 84% +0.3 0.38 ± 27% perf-profile.children.cycles-pp.tcp_try_coalesce
0.32 ± 19% +0.4 0.68 ± 17% perf-profile.children.cycles-pp.update_process_times
0.33 ± 20% +0.4 0.72 ± 17% perf-profile.children.cycles-pp.tick_sched_handle
0.39 ± 13% +0.4 0.78 ± 17% perf-profile.children.cycles-pp.tick_sched_timer
0.24 ± 28% +0.4 0.67 ± 24% perf-profile.children.cycles-pp.rmqueue
0.32 ± 29% +0.5 0.78 ± 24% perf-profile.children.cycles-pp.free_pcp_prepare
0.49 ± 15% +0.5 0.96 ± 16% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.49 ± 16% +0.5 0.96 ± 19% perf-profile.children.cycles-pp.__alloc_pages
0.37 ± 21% +0.5 0.85 ± 21% perf-profile.children.cycles-pp.get_page_from_freelist
0.80 ± 9% +0.5 1.30 ± 13% perf-profile.children.cycles-pp.sk_page_frag_refill
0.76 ± 10% +0.5 1.26 ± 14% perf-profile.children.cycles-pp.skb_page_frag_refill
0.74 ± 8% +0.5 1.25 ± 16% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
0.72 ± 9% +0.5 1.24 ± 16% perf-profile.children.cycles-pp.hrtimer_interrupt
0.84 ± 10% +0.6 1.48 ± 16% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
0.53 ± 23% +0.7 1.22 ± 23% perf-profile.children.cycles-pp.free_unref_page
35.38 +0.8 36.21 perf-profile.children.cycles-pp.tcp_sendmsg_locked
1.66 ± 8% +0.9 2.52 ± 15% perf-profile.children.cycles-pp.skb_release_data
0.32 ± 84% +1.2 1.54 ± 31% perf-profile.children.cycles-pp.__sk_defer_free_flush
1.26 ± 27% +1.4 2.69 ± 20% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
95.58 +2.1 97.64 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
95.15 +2.3 97.46 perf-profile.children.cycles-pp.do_syscall_64
43.52 ± 5% +7.9 51.38 ± 5% perf-profile.children.cycles-pp.recv
42.50 ± 6% +8.6 51.08 ± 5% perf-profile.children.cycles-pp.accept_connection
42.50 ± 6% +8.6 51.08 ± 5% perf-profile.children.cycles-pp.spawn_child
42.50 ± 6% +8.6 51.08 ± 5% perf-profile.children.cycles-pp.process_requests
42.48 ± 6% +8.6 51.07 ± 5% perf-profile.children.cycles-pp.recv_omni
39.24 ± 7% +9.4 48.60 ± 6% perf-profile.children.cycles-pp.__x64_sys_recvfrom
37.21 ± 7% +9.4 46.58 ± 6% perf-profile.children.cycles-pp.tcp_recvmsg_locked
39.13 ± 7% +9.4 48.56 ± 6% perf-profile.children.cycles-pp.__sys_recvfrom
38.29 ± 7% +9.7 48.03 ± 7% perf-profile.children.cycles-pp.inet_recvmsg
38.16 ± 7% +9.8 47.97 ± 7% perf-profile.children.cycles-pp.tcp_recvmsg
11.67 ± 27% +10.1 21.79 ± 15% perf-profile.children.cycles-pp.copyin
12.75 ± 25% +10.2 22.91 ± 15% perf-profile.children.cycles-pp.skb_do_copy_data_nocache
12.15 ± 26% +10.2 22.39 ± 15% perf-profile.children.cycles-pp._copy_from_iter
16.73 ± 30% +15.3 32.04 ± 16% perf-profile.children.cycles-pp._copy_to_iter
16.31 ± 31% +15.5 31.79 ± 16% perf-profile.children.cycles-pp.copyout
17.87 ± 29% +15.7 33.60 ± 16% perf-profile.children.cycles-pp.skb_copy_datagram_iter
17.84 ± 29% +15.7 33.57 ± 16% perf-profile.children.cycles-pp.__skb_datagram_iter
27.76 ± 30% +25.7 53.48 ± 16% perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
1.58 ± 17% -0.9 0.64 ± 52% perf-profile.self.cycles-pp.switch_mm_irqs_off
1.17 ± 27% -0.7 0.49 ± 62% perf-profile.self.cycles-pp.dst_release
0.84 ± 20% -0.6 0.21 ± 34% perf-profile.self.cycles-pp._raw_spin_lock
0.90 ± 16% -0.6 0.33 ± 53% perf-profile.self.cycles-pp.update_load_avg
0.69 ± 20% -0.6 0.13 ± 19% perf-profile.self.cycles-pp.available_idle_cpu
0.94 ± 15% -0.5 0.44 ± 36% perf-profile.self.cycles-pp.__schedule
0.96 ± 28% -0.5 0.47 ± 51% perf-profile.self.cycles-pp.ipv4_dst_check
0.77 ± 15% -0.5 0.30 ± 47% perf-profile.self.cycles-pp.update_curr
0.75 ± 17% -0.4 0.31 ± 44% perf-profile.self.cycles-pp.__switch_to_asm
1.24 ± 8% -0.4 0.83 ± 16% perf-profile.self.cycles-pp.__tcp_transmit_skb
0.76 ± 29% -0.4 0.36 ± 62% perf-profile.self.cycles-pp.ip_rcv_finish_core
0.70 ± 19% -0.4 0.30 ± 44% perf-profile.self.cycles-pp._raw_spin_lock_bh
0.77 ± 16% -0.4 0.40 ± 32% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.53 ± 18% -0.4 0.16 ± 55% perf-profile.self.cycles-pp.native_sched_clock
0.63 ± 15% -0.4 0.28 ± 39% perf-profile.self.cycles-pp.__netif_receive_skb_core
0.44 ± 23% -0.3 0.10 ± 50% perf-profile.self.cycles-pp.reweight_entity
0.66 ± 21% -0.3 0.32 ± 49% perf-profile.self.cycles-pp.tcp_v4_rcv
0.62 ± 18% -0.3 0.28 ± 46% perf-profile.self.cycles-pp.ip_finish_output2
0.46 ± 19% -0.3 0.16 ± 32% perf-profile.self.cycles-pp.prepare_task_switch
0.47 ± 17% -0.3 0.17 ± 20% perf-profile.self.cycles-pp.select_idle_cpu
0.56 ± 16% -0.3 0.27 ± 37% perf-profile.self.cycles-pp.__switch_to
0.46 ± 16% -0.3 0.19 ± 43% perf-profile.self.cycles-pp.read_tsc
0.53 ± 17% -0.3 0.26 ± 34% perf-profile.self.cycles-pp.restore_fpregs_from_fpstate
0.63 ± 10% -0.3 0.36 ± 24% perf-profile.self.cycles-pp.tcp_recvmsg_locked
0.37 ± 15% -0.3 0.12 ± 46% perf-profile.self.cycles-pp.enqueue_task_fair
0.44 ± 17% -0.3 0.18 ± 47% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.42 ± 15% -0.3 0.17 ± 41% perf-profile.self.cycles-pp.syscall_return_via_sysret
0.40 ± 17% -0.3 0.14 ± 58% perf-profile.self.cycles-pp.enqueue_entity
0.40 ± 18% -0.3 0.15 ± 50% perf-profile.self.cycles-pp.loopback_xmit
0.40 ± 16% -0.2 0.14 ± 53% perf-profile.self.cycles-pp.__update_load_avg_cfs_rq
0.40 ± 17% -0.2 0.16 ± 45% perf-profile.self.cycles-pp.__update_load_avg_se
0.43 ± 16% -0.2 0.19 ± 36% perf-profile.self.cycles-pp.tcp_ack
0.47 ± 14% -0.2 0.23 ± 33% perf-profile.self.cycles-pp.tcp_clean_rtx_queue
0.40 ± 16% -0.2 0.16 ± 43% perf-profile.self.cycles-pp.__softirqentry_text_start
0.36 ± 17% -0.2 0.13 ± 43% perf-profile.self.cycles-pp.net_rx_action
0.32 ± 19% -0.2 0.10 ± 50% perf-profile.self.cycles-pp.dequeue_task_fair
0.38 ± 16% -0.2 0.16 ± 45% perf-profile.self.cycles-pp.___perf_sw_event
0.31 ± 18% -0.2 0.09 ± 34% perf-profile.self.cycles-pp.update_rq_clock
0.41 ± 16% -0.2 0.20 ± 38% perf-profile.self.cycles-pp.save_fpregs_to_fpstate
0.44 ± 20% -0.2 0.23 ± 27% perf-profile.self.cycles-pp.__inet_lookup_established
0.45 ± 13% -0.2 0.24 ± 24% perf-profile.self.cycles-pp.__might_resched
0.42 ± 12% -0.2 0.22 ± 28% perf-profile.self.cycles-pp.kmem_cache_alloc_node
0.30 ± 17% -0.2 0.12 ± 45% perf-profile.self.cycles-pp.__local_bh_enable_ip
0.30 ± 18% -0.2 0.12 ± 65% perf-profile.self.cycles-pp.set_next_buddy
0.29 ± 17% -0.2 0.11 ± 56% perf-profile.self.cycles-pp._raw_spin_lock_irq
0.42 ± 14% -0.2 0.25 ± 22% perf-profile.self.cycles-pp.tcp_rcv_established
0.24 ± 18% -0.2 0.07 ± 71% perf-profile.self.cycles-pp.sk_wait_data
0.30 ± 24% -0.2 0.13 ± 64% perf-profile.self.cycles-pp.tcp_v4_do_rcv
0.34 ± 15% -0.2 0.18 ± 27% perf-profile.self.cycles-pp.update_cfs_group
0.28 ± 17% -0.2 0.11 ± 47% perf-profile.self.cycles-pp.wait_woken
0.20 ± 14% -0.2 0.04 ±113% perf-profile.self.cycles-pp.check_preempt_wakeup
0.46 ± 8% -0.2 0.30 ± 12% perf-profile.self.cycles-pp.__virt_addr_valid
0.36 ± 11% -0.2 0.21 ± 25% perf-profile.self.cycles-pp.__kmalloc_node_track_caller
0.26 ± 17% -0.2 0.11 ± 52% perf-profile.self.cycles-pp.__calc_delta
0.29 ± 16% -0.2 0.14 ± 33% perf-profile.self.cycles-pp.__mod_timer
0.26 ± 17% -0.2 0.11 ± 43% perf-profile.self.cycles-pp.select_task_rq_fair
0.30 ± 13% -0.2 0.15 ± 32% perf-profile.self.cycles-pp.do_syscall_64
0.26 ± 15% -0.2 0.11 ± 44% perf-profile.self.cycles-pp.do_softirq
0.34 ± 14% -0.2 0.19 ± 36% perf-profile.self.cycles-pp.pick_next_task_fair
0.39 ± 10% -0.1 0.24 ± 18% perf-profile.self.cycles-pp.send_omni_inner
0.34 ± 16% -0.1 0.19 ± 22% perf-profile.self.cycles-pp.send
0.26 ± 14% -0.1 0.12 ± 39% perf-profile.self.cycles-pp.__might_sleep
0.23 ± 15% -0.1 0.08 ± 55% perf-profile.self.cycles-pp.update_min_vruntime
0.25 ± 16% -0.1 0.11 ± 38% perf-profile.self.cycles-pp.enqueue_to_backlog
0.26 ± 15% -0.1 0.12 ± 39% perf-profile.self.cycles-pp.select_idle_sibling
0.24 ± 18% -0.1 0.10 ± 34% perf-profile.self.cycles-pp.ip_finish_output
0.24 ± 17% -0.1 0.11 ± 47% perf-profile.self.cycles-pp.ip_rcv_core
0.23 ± 18% -0.1 0.10 ± 61% perf-profile.self.cycles-pp.pick_next_entity
0.18 ± 17% -0.1 0.05 ± 87% perf-profile.self.cycles-pp.tcp_recvmsg
0.21 ± 18% -0.1 0.08 ± 38% perf-profile.self.cycles-pp.tcp_event_new_data_sent
0.22 ± 18% -0.1 0.08 ± 38% perf-profile.self.cycles-pp.cpuacct_charge
0.25 ± 15% -0.1 0.12 ± 24% perf-profile.self.cycles-pp.send_data
0.24 ± 14% -0.1 0.11 ± 37% perf-profile.self.cycles-pp.switch_fpu_return
0.18 ± 18% -0.1 0.06 ± 63% perf-profile.self.cycles-pp.set_next_entity
0.18 ± 20% -0.1 0.06 ± 92% perf-profile.self.cycles-pp.put_prev_entity
0.17 ± 19% -0.1 0.06 ± 89% perf-profile.self.cycles-pp.__cond_resched
0.24 ± 14% -0.1 0.13 ± 22% perf-profile.self.cycles-pp.__sys_sendto
0.25 ± 15% -0.1 0.14 ± 21% perf-profile.self.cycles-pp.recv_data
0.18 ± 15% -0.1 0.06 ± 65% perf-profile.self.cycles-pp.schedule
0.15 ± 17% -0.1 0.04 ±111% perf-profile.self.cycles-pp.ip_send_check
0.26 ± 11% -0.1 0.16 ± 26% perf-profile.self.cycles-pp.tcp_tso_segs
0.15 ± 14% -0.1 0.04 ±110% perf-profile.self.cycles-pp.inet_ehashfn
0.30 ± 12% -0.1 0.19 ± 9% perf-profile.self.cycles-pp.recv_omni
0.19 ± 22% -0.1 0.09 ± 53% perf-profile.self.cycles-pp.tcp_rtt_estimator
0.19 ± 16% -0.1 0.09 ± 36% perf-profile.self.cycles-pp._find_next_bit
0.14 ± 19% -0.1 0.04 ±112% perf-profile.self.cycles-pp.sock_put
0.23 ± 15% -0.1 0.13 ± 18% perf-profile.self.cycles-pp.recv
0.42 ± 9% -0.1 0.32 ± 11% perf-profile.self.cycles-pp.skb_release_data
0.14 ± 13% -0.1 0.04 ± 75% perf-profile.self.cycles-pp.__copy_skb_header
0.21 ± 10% -0.1 0.11 ± 32% perf-profile.self.cycles-pp.process_backlog
0.19 ± 15% -0.1 0.09 ± 32% perf-profile.self.cycles-pp.irqtime_account_irq
0.15 ± 14% -0.1 0.05 ± 77% perf-profile.self.cycles-pp.perf_trace_sched_switch
0.33 ± 9% -0.1 0.23 ± 9% perf-profile.self.cycles-pp.kmem_cache_free
0.18 ± 16% -0.1 0.08 ± 38% perf-profile.self.cycles-pp.memcg_slab_free_hook
0.19 ± 17% -0.1 0.09 ± 39% perf-profile.self.cycles-pp.tcp_wfree
0.19 ± 20% -0.1 0.10 ± 33% perf-profile.self.cycles-pp.sk_filter_trim_cap
0.17 ± 17% -0.1 0.08 ± 30% perf-profile.self.cycles-pp.tcp_queue_rcv
0.17 ± 13% -0.1 0.08 ± 34% perf-profile.self.cycles-pp.ip_output
0.14 ± 17% -0.1 0.05 ± 80% perf-profile.self.cycles-pp.perf_trace_sched_stat_runtime
0.18 ± 12% -0.1 0.10 ± 30% perf-profile.self.cycles-pp.__sys_recvfrom
0.42 ± 7% -0.1 0.33 ± 10% perf-profile.self.cycles-pp.aa_sk_perm
0.12 ± 17% -0.1 0.04 ±105% perf-profile.self.cycles-pp.__x64_sys_sendto
0.33 ± 10% -0.1 0.25 ± 8% perf-profile.self.cycles-pp.kfree
0.18 ± 16% -0.1 0.09 ± 30% perf-profile.self.cycles-pp.__entry_text_start
0.15 ± 16% -0.1 0.07 ± 33% perf-profile.self.cycles-pp.release_sock
0.27 ± 9% -0.1 0.19 ± 11% perf-profile.self.cycles-pp._copy_to_iter
0.18 ± 18% -0.1 0.09 ± 42% perf-profile.self.cycles-pp.__tcp_ack_snd_check
0.13 ± 16% -0.1 0.04 ± 79% perf-profile.self.cycles-pp.inet_recvmsg
0.16 ± 14% -0.1 0.08 ± 38% perf-profile.self.cycles-pp.try_to_wake_up
0.24 ± 12% -0.1 0.16 ± 14% perf-profile.self.cycles-pp.__ksize
0.18 ± 11% -0.1 0.10 ± 15% perf-profile.self.cycles-pp.ktime_get_with_offset
0.16 ± 14% -0.1 0.09 ± 36% perf-profile.self.cycles-pp.ip_protocol_deliver_rcu
0.14 ± 13% -0.1 0.06 ± 29% perf-profile.self.cycles-pp.perf_tp_event
0.15 ± 14% -0.1 0.07 ± 30% perf-profile.self.cycles-pp.select_task_rq
0.12 ± 17% -0.1 0.04 ±110% perf-profile.self.cycles-pp.schedule_timeout
0.16 ± 15% -0.1 0.09 ± 29% perf-profile.self.cycles-pp.tcp_established_options
0.11 ± 13% -0.1 0.03 ±103% perf-profile.self.cycles-pp.tcp_add_backlog
0.14 ± 16% -0.1 0.07 ± 33% perf-profile.self.cycles-pp.__tcp_select_window
0.14 ± 16% -0.1 0.07 ± 29% perf-profile.self.cycles-pp.tcp_event_data_recv
0.17 ± 16% -0.1 0.10 ± 30% perf-profile.self.cycles-pp.validate_xmit_skb
0.10 ± 7% -0.1 0.03 ±102% perf-profile.self.cycles-pp.eth_type_trans
0.14 ± 12% -0.1 0.07 ± 20% perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
0.12 ± 16% -0.1 0.06 ± 60% perf-profile.self.cycles-pp.tcp_skb_entail
0.12 ± 12% -0.1 0.06 ± 30% perf-profile.self.cycles-pp.__list_add_valid
0.56 ± 5% -0.1 0.49 ± 3% perf-profile.self.cycles-pp.__fget_light
0.13 ± 14% -0.1 0.07 ± 33% perf-profile.self.cycles-pp.rb_next
0.13 ± 10% -0.1 0.07 ± 24% perf-profile.self.cycles-pp.__might_fault
0.14 ± 14% -0.1 0.09 ± 25% perf-profile.self.cycles-pp.sock_def_readable
0.09 ± 14% -0.1 0.03 ±103% perf-profile.self.cycles-pp.__wake_up_common
0.11 ± 11% -0.1 0.05 ± 51% perf-profile.self.cycles-pp.xmit_one
0.08 ± 17% -0.1 0.03 ±103% perf-profile.self.cycles-pp.tcp_cleanup_rbuf
0.13 ± 10% -0.0 0.08 ± 13% perf-profile.self.cycles-pp.kfree_skbmem
0.10 ± 12% -0.0 0.06 ± 19% perf-profile.self.cycles-pp.security_socket_sendmsg
0.11 ± 11% -0.0 0.06 ± 23% perf-profile.self.cycles-pp.sock_sendmsg
0.10 ± 16% -0.0 0.06 ± 21% perf-profile.self.cycles-pp.__release_sock
0.13 ± 14% -0.0 0.09 ± 15% perf-profile.self.cycles-pp.tcp_stream_alloc_skb
0.08 ± 10% -0.0 0.05 ± 47% perf-profile.self.cycles-pp.tcp_eat_recv_skb
0.08 ± 8% -0.0 0.06 ± 11% perf-profile.self.cycles-pp.rcu_all_qs
0.08 ± 8% -0.0 0.06 ± 11% perf-profile.self.cycles-pp.free_unref_page
0.07 ± 6% +0.0 0.09 ± 11% perf-profile.self.cycles-pp.tcp_check_space
0.18 ± 2% +0.0 0.20 ± 2% perf-profile.self.cycles-pp.skb_page_frag_refill
0.10 +0.0 0.12 ± 6% perf-profile.self.cycles-pp.rmqueue
0.02 ±141% +0.0 0.06 ± 8% perf-profile.self.cycles-pp.alloc_pages
0.27 ± 3% +0.0 0.32 ± 6% perf-profile.self.cycles-pp._copy_from_iter
0.76 +0.0 0.81 perf-profile.self.cycles-pp.tcp_write_xmit
0.02 ±142% +0.1 0.08 ± 20% perf-profile.self.cycles-pp.native_irq_return_iret
0.02 ±141% +0.1 0.10 ± 20% perf-profile.self.cycles-pp.__zone_watermark_ok
0.18 ± 9% +0.1 0.25 ± 7% perf-profile.self.cycles-pp.sock_rfree
0.32 ± 4% +0.1 0.42 ± 12% perf-profile.self.cycles-pp.__list_del_entry_valid
0.03 ±141% +0.1 0.14 ± 24% perf-profile.self.cycles-pp.___slab_alloc
0.01 ±223% +0.1 0.13 ± 26% perf-profile.self.cycles-pp.__free_one_page
0.02 ±141% +0.1 0.15 ± 25% perf-profile.self.cycles-pp.check_new_pages
0.32 ± 15% +0.1 0.45 ± 7% perf-profile.self.cycles-pp.__skb_datagram_iter
0.06 ± 77% +0.2 0.25 ± 34% perf-profile.self.cycles-pp.PageHuge
0.16 ± 27% +0.2 0.35 ± 25% perf-profile.self.cycles-pp.skb_do_copy_data_nocache
0.02 ±142% +0.2 0.22 ± 39% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
0.17 ± 40% +0.2 0.40 ± 18% perf-profile.self.cycles-pp.skb_clone
0.19 ± 43% +0.2 0.42 ± 17% perf-profile.self.cycles-pp.__tcp_push_pending_frames
0.52 ± 15% +0.2 0.76 ± 10% perf-profile.self.cycles-pp.__alloc_skb
0.09 ± 84% +0.3 0.35 ± 28% perf-profile.self.cycles-pp.skb_try_coalesce
0.10 ± 82% +0.3 0.36 ± 23% perf-profile.self.cycles-pp.__slab_free
0.61 ± 12% +0.4 0.98 ± 19% perf-profile.self.cycles-pp.__check_object_size
0.32 ± 29% +0.4 0.76 ± 24% perf-profile.self.cycles-pp.free_pcp_prepare
1.50 ± 13% +0.6 2.10 ± 9% perf-profile.self.cycles-pp.tcp_sendmsg_locked
5.06 ± 8% +0.8 5.89 ± 3% perf-profile.self.cycles-pp.syscall_exit_to_user_mode
27.43 ± 30% +25.1 52.52 ± 16% perf-profile.self.cycles-pp.copy_user_enhanced_fast_string



***************************************************************************************************
lkp-icl-2sp6: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz with 128G memory
=========================================================================================
class/compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime/ucode:
pipe/gcc-11/performance/x86_64-rhel-8.3/100%/debian-10.4-x86_64-20200603.cgz/lkp-icl-2sp6/fifo/stress-ng/60s/0xd000331

commit:
089c02ae27 ("ftrace: Use preemption model accessors for trace header printout")
6b433275e3 ("sched/fair: filter out overloaded cpus in SIS")

089c02ae2771a14a 6b433275e3a3cf18a16c0d2afb7
---------------- ---------------------------
%stddev %change %stddev
\ | \
1.736e+09 ± 6% +26.8% 2.2e+09 ± 6% stress-ng.fifo.ops
28931140 ± 6% +26.8% 36670758 ± 6% stress-ng.fifo.ops_per_sec
8.772e+08 ± 3% -38.2% 5.421e+08 stress-ng.time.involuntary_context_switches
408.02 ± 3% +12.8% 460.42 stress-ng.time.user_time
1.417e+09 ± 3% -12.6% 1.238e+09 ± 7% stress-ng.time.voluntary_context_switches
21260022 ± 3% -53.5% 9882616 ± 5% cpuidle..usage
5.80 ± 9% -0.8 5.03 mpstat.cpu.all.idle%
0.02 ± 5% -0.0 0.02 ± 8% mpstat.cpu.all.soft%
5.24 ± 3% +0.6 5.83 mpstat.cpu.all.usr%
18344723 ± 3% -58.5% 7614636 ± 6% turbostat.C1
0.77 ± 4% -0.5 0.30 ± 7% turbostat.C1%
46526185 ± 9% -35.6% 29952171 ± 5% turbostat.IRQ
257.33 +16.1% 298.67 ± 2% vmstat.procs.r
35575264 ± 3% -22.5% 27561102 ± 5% vmstat.system.cs
714764 ± 9% -35.4% 461616 ± 5% vmstat.system.in
1415904 ± 12% +88.8% 2673914 ± 6% proc-vmstat.numa_hit
1300187 ± 13% +96.8% 2558221 ± 6% proc-vmstat.numa_local
1415865 ± 12% +88.9% 2673934 ± 6% proc-vmstat.pgalloc_normal
1214846 ± 14% +103.7% 2474149 ± 6% proc-vmstat.pgfree
27150 ± 2% -2.3% 26529 proc-vmstat.pgreuse
1.09 +18.9% 1.30 ± 4% sched_debug.cfs_rq:/.h_nr_running.avg
0.64 ± 3% +20.4% 0.77 ± 7% sched_debug.cfs_rq:/.h_nr_running.stddev
404114 ± 21% -42.3% 233298 ± 21% sched_debug.cfs_rq:/.min_vruntime.stddev
1225 +16.0% 1422 ± 2% sched_debug.cfs_rq:/.runnable_avg.avg
2533 ± 9% +37.9% 3493 ± 19% sched_debug.cfs_rq:/.runnable_avg.max
404214 ± 21% -42.3% 233408 ± 21% sched_debug.cfs_rq:/.spread0.stddev
1.10 ± 3% +19.3% 1.31 ± 3% sched_debug.cpu.nr_running.avg
3.00 +55.6% 4.67 ± 23% sched_debug.cpu.nr_running.max
0.65 ± 3% +19.7% 0.78 ± 7% sched_debug.cpu.nr_running.stddev
8624987 ± 2% -22.6% 6679073 ± 5% sched_debug.cpu.nr_switches.avg
12669773 ± 4% -21.8% 9903407 ± 12% sched_debug.cpu.nr_switches.max
2489911 ± 18% -54.7% 1128032 ± 15% sched_debug.cpu.nr_switches.stddev
4.46 ± 5% -27.0% 3.26 ± 5% perf-stat.i.MPKI
5.637e+10 ± 2% +3.8% 5.853e+10 perf-stat.i.branch-instructions
0.79 ± 3% -0.1 0.69 ± 3% perf-stat.i.branch-miss-rate%
4.063e+08 -11.7% 3.585e+08 ± 4% perf-stat.i.branch-misses
0.80 ± 19% +0.3 1.08 ± 11% perf-stat.i.cache-miss-rate%
1.189e+09 ± 2% -25.3% 8.875e+08 ± 2% perf-stat.i.cache-references
36910521 ± 2% -22.6% 28555484 ± 5% perf-stat.i.context-switches
7.988e+10 ± 2% +4.5% 8.347e+10 perf-stat.i.dTLB-loads
4.885e+10 ± 2% +4.6% 5.112e+10 perf-stat.i.dTLB-stores
2.795e+11 ± 2% +4.4% 2.917e+11 perf-stat.i.instructions
350.05 ± 2% -18.2% 286.28 ± 6% perf-stat.i.metric.K/sec
1455 ± 2% +4.1% 1515 perf-stat.i.metric.M/sec
4.25 ± 3% -28.5% 3.04 ± 2% perf-stat.overall.MPKI
0.72 ± 2% -0.1 0.61 ± 4% perf-stat.overall.branch-miss-rate%
0.66 ± 15% +0.2 0.91 ± 5% perf-stat.overall.cache-miss-rate%
0.73 ± 2% +3.6% 0.75 perf-stat.overall.ipc
5.551e+10 ± 2% +3.9% 5.766e+10 perf-stat.ps.branch-instructions
3.999e+08 -11.8% 3.527e+08 ± 4% perf-stat.ps.branch-misses
1.17e+09 ± 2% -25.4% 8.73e+08 ± 2% perf-stat.ps.cache-references
36346244 ± 2% -22.7% 28101939 ± 5% perf-stat.ps.context-switches
7.867e+10 ± 2% +4.5% 8.223e+10 perf-stat.ps.dTLB-loads
4.811e+10 ± 2% +4.7% 5.036e+10 perf-stat.ps.dTLB-stores
2.753e+11 ± 2% +4.4% 2.873e+11 perf-stat.ps.instructions
1.738e+13 +4.6% 1.819e+13 perf-stat.total.instructions
28.90 ± 4% -8.5 20.41 ± 13% perf-profile.calltrace.cycles-pp.read
26.32 ± 4% -8.3 18.04 ± 13% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.read
25.93 ± 4% -8.2 17.71 ± 13% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
11.73 ± 9% -7.1 4.66 ± 22% perf-profile.calltrace.cycles-pp.__wake_up_common_lock.pipe_read.new_sync_read.vfs_read.ksys_read
11.38 ± 10% -6.9 4.46 ± 22% perf-profile.calltrace.cycles-pp.__wake_up_common.__wake_up_common_lock.pipe_read.new_sync_read.vfs_read
11.28 ± 10% -6.9 4.39 ± 22% perf-profile.calltrace.cycles-pp.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_read.new_sync_read
11.17 ± 10% -6.8 4.34 ± 23% perf-profile.calltrace.cycles-pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_read
17.47 ± 5% -5.7 11.78 ± 14% perf-profile.calltrace.cycles-pp.pipe_read.new_sync_read.vfs_read.ksys_read.do_syscall_64
17.99 ± 5% -5.6 12.40 ± 13% perf-profile.calltrace.cycles-pp.new_sync_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
9.14 ± 11% -5.4 3.72 ± 22% perf-profile.calltrace.cycles-pp.schedule.pipe_write.new_sync_write.vfs_write.ksys_write
8.94 ± 11% -5.3 3.64 ± 23% perf-profile.calltrace.cycles-pp.__schedule.schedule.pipe_write.new_sync_write.vfs_write
19.71 ± 4% -5.3 14.44 ± 12% perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
20.13 ± 4% -5.2 14.98 ± 12% perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
5.23 ± 6% -3.1 2.09 ± 20% perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
5.46 ± 6% -3.1 2.34 ± 19% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
4.98 ± 19% -2.9 2.06 ± 27% perf-profile.calltrace.cycles-pp.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
4.89 ± 19% -2.9 2.02 ± 28% perf-profile.calltrace.cycles-pp.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common
4.88 ± 20% -2.9 2.03 ± 27% perf-profile.calltrace.cycles-pp.dequeue_task_fair.__schedule.schedule.pipe_write.new_sync_write
5.94 ± 3% -2.6 3.30 ± 4% perf-profile.calltrace.cycles-pp.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
5.72 ± 3% -2.6 3.17 ± 4% perf-profile.calltrace.cycles-pp.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
5.51 ± 3% -2.5 3.03 ± 4% perf-profile.calltrace.cycles-pp.__schedule.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode
3.34 ± 5% -2.1 1.28 ± 19% perf-profile.calltrace.cycles-pp.select_task_rq.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
3.24 ± 5% -2.0 1.24 ± 19% perf-profile.calltrace.cycles-pp.select_task_rq_fair.select_task_rq.try_to_wake_up.autoremove_wake_function.__wake_up_common
2.99 ± 5% -1.9 1.13 ± 19% perf-profile.calltrace.cycles-pp.select_idle_sibling.select_task_rq_fair.select_task_rq.try_to_wake_up.autoremove_wake_function
1.08 ± 5% -0.8 0.28 ±100% perf-profile.calltrace.cycles-pp.switch_mm_irqs_off.__schedule.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare
1.21 ± 5% -0.6 0.57 ± 11% perf-profile.calltrace.cycles-pp.pick_next_task_fair.__schedule.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare
0.87 ± 19% -0.6 0.31 ±100% perf-profile.calltrace.cycles-pp.available_idle_cpu.select_idle_cpu.select_idle_sibling.select_task_rq_fair.select_task_rq
1.41 ± 3% -0.4 0.98 ± 7% perf-profile.calltrace.cycles-pp.cpu_startup_entry.secondary_startup_64_no_verify
1.43 ± 2% -0.4 1.00 ± 6% perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
1.41 ± 3% -0.4 0.98 ± 6% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.secondary_startup_64_no_verify
0.55 ± 7% +0.2 0.71 ± 5% perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.core_sys_select.kern_select.__x64_sys_select.do_syscall_64
0.99 ± 8% +0.2 1.16 ± 7% perf-profile.calltrace.cycles-pp._copy_from_user.kern_select.__x64_sys_select.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.59 ± 9% +0.2 0.78 ± 7% perf-profile.calltrace.cycles-pp.copyin.copy_page_from_iter.pipe_write.new_sync_write.vfs_write
0.80 ± 9% +0.2 1.05 ± 7% perf-profile.calltrace.cycles-pp.select_estimate_accuracy.do_select.core_sys_select.kern_select.__x64_sys_select
0.76 ± 8% +0.3 1.02 ± 7% perf-profile.calltrace.cycles-pp._copy_to_user.poll_select_finish.kern_select.__x64_sys_select.do_syscall_64
0.82 ± 10% +0.3 1.08 ± 8% perf-profile.calltrace.cycles-pp._copy_from_user.core_sys_select.kern_select.__x64_sys_select.do_syscall_64
1.31 ± 4% +0.3 1.59 ± 2% perf-profile.calltrace.cycles-pp.stress_fifo
0.36 ± 70% +0.3 0.66 ± 7% perf-profile.calltrace.cycles-pp.ktime_get_ts64.kern_select.__x64_sys_select.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.34 ? 70% +0.3 0.64 ? 7% perf-profile.calltrace.cycles-pp.ktime_get_ts64.select_estimate_accuracy.do_select.core_sys_select.kern_select
0.36 ? 71% +0.3 0.67 ? 7% perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyin.copy_page_from_iter.pipe_write.new_sync_write
0.35 ? 70% +0.3 0.67 ? 7% perf-profile.calltrace.cycles-pp.ktime_get_ts64.poll_select_finish.kern_select.__x64_sys_select.do_syscall_64
1.49 ? 10% +0.4 1.87 ? 8% perf-profile.calltrace.cycles-pp.copy_page_to_iter.pipe_read.new_sync_read.vfs_read.ksys_read
1.29 ? 9% +0.4 1.71 ? 7% perf-profile.calltrace.cycles-pp.copy_page_from_iter.pipe_write.new_sync_write.vfs_write.ksys_write
0.17 ?141% +0.5 0.62 ? 7% perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string._copy_to_user.poll_select_finish.kern_select.__x64_sys_select
0.86 ? 16% +0.5 1.32 ? 11% perf-profile.calltrace.cycles-pp.mutex_lock.pipe_write.new_sync_write.vfs_write.ksys_write
0.36 ? 70% +0.5 0.84 ? 35% perf-profile.calltrace.cycles-pp.__entry_text_start.write
0.09 ?223% +0.5 0.58 ? 7% perf-profile.calltrace.cycles-pp.__fsnotify_parent.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.36 ? 71% +0.5 0.86 ? 36% perf-profile.calltrace.cycles-pp.__entry_text_start.read
0.36 ? 70% +0.5 0.86 ? 35% perf-profile.calltrace.cycles-pp.__entry_text_start.select
0.09 ?223% +0.5 0.60 ? 6% perf-profile.calltrace.cycles-pp.touch_atime.pipe_read.new_sync_read.vfs_read.ksys_read
0.19 ?142% +0.5 0.72 ? 3% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.try_to_wake_up.pollwake.__wake_up_common.__wake_up_common_lock
0.09 ?223% +0.5 0.62 ? 9% perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string._copy_from_user.core_sys_select.kern_select.__x64_sys_select
1.63 ? 8% +0.5 2.15 ? 7% perf-profile.calltrace.cycles-pp.poll_select_finish.kern_select.__x64_sys_select.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.51 ? 9% +0.5 2.05 ? 7% perf-profile.calltrace.cycles-pp.memset_erms.core_sys_select.kern_select.__x64_sys_select.do_syscall_64
0.09 ?223% +0.5 0.63 ? 7% perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string._copy_from_user.kern_select.__x64_sys_select.do_syscall_64
0.00 +0.5 0.54 ? 4% perf-profile.calltrace.cycles-pp.security_file_permission.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.10 ?223% +0.6 0.69 ? 13% perf-profile.calltrace.cycles-pp.mutex_unlock.pipe_write.new_sync_write.vfs_write.ksys_write
22.08 ? 5% +4.8 26.90 ? 6% perf-profile.calltrace.cycles-pp.do_select.core_sys_select.kern_select.__x64_sys_select.do_syscall_64
26.20 ? 4% +6.2 32.36 ? 4% perf-profile.calltrace.cycles-pp.core_sys_select.kern_select.__x64_sys_select.do_syscall_64.entry_SYSCALL_64_after_hwframe
30.07 ? 3% +7.2 37.31 ? 3% perf-profile.calltrace.cycles-pp.kern_select.__x64_sys_select.do_syscall_64.entry_SYSCALL_64_after_hwframe.select
30.28 ? 3% +7.2 37.52 ? 3% perf-profile.calltrace.cycles-pp.__x64_sys_select.do_syscall_64.entry_SYSCALL_64_after_hwframe.select
30.92 ? 3% +7.4 38.29 ? 3% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.select
31.20 ? 3% +7.5 38.66 ? 3% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.select
33.41 ? 3% +8.2 41.66 ? 2% perf-profile.calltrace.cycles-pp.select
29.32 ? 4% -8.6 20.76 ? 12% perf-profile.children.cycles-pp.read
11.29 ? 10% -6.9 4.40 ? 22% perf-profile.children.cycles-pp.autoremove_wake_function
24.00 ? 8% -6.3 17.74 ? 9% perf-profile.children.cycles-pp.schedule
24.16 ? 8% -6.2 18.00 ? 9% perf-profile.children.cycles-pp.__schedule
17.57 ? 5% -5.7 11.91 ? 14% perf-profile.children.cycles-pp.pipe_read
18.03 ? 5% -5.6 12.44 ? 13% perf-profile.children.cycles-pp.new_sync_read
19.78 ? 4% -5.3 14.52 ? 12% perf-profile.children.cycles-pp.vfs_read
20.17 ? 4% -5.1 15.03 ? 12% perf-profile.children.cycles-pp.ksys_read
7.09 ? 3% -2.8 4.34 ? 4% perf-profile.children.cycles-pp.exit_to_user_mode_prepare
6.02 ? 3% -2.7 3.33 ? 4% perf-profile.children.cycles-pp.exit_to_user_mode_loop
7.58 ? 3% -2.6 5.00 ? 3% perf-profile.children.cycles-pp.syscall_exit_to_user_mode
6.24 ? 8% -1.5 4.72 ? 13% perf-profile.children.cycles-pp.select_task_rq
6.06 ? 8% -1.5 4.56 ? 14% perf-profile.children.cycles-pp.select_task_rq_fair
5.61 ? 9% -1.4 4.20 ? 14% perf-profile.children.cycles-pp.select_idle_sibling
2.31 ? 6% -1.2 1.07 ? 6% perf-profile.children.cycles-pp.prepare_task_switch
3.63 ? 4% -1.1 2.54 ? 5% perf-profile.children.cycles-pp.pick_next_task_fair
4.55 ? 6% -1.1 3.47 ? 9% perf-profile.children.cycles-pp.update_load_avg
1.95 ? 8% -1.0 0.98 ? 12% perf-profile.children.cycles-pp.available_idle_cpu
3.06 ? 5% -0.9 2.16 ? 8% perf-profile.children.cycles-pp.update_curr
3.70 ? 3% -0.8 2.88 ? 7% perf-profile.children.cycles-pp.switch_mm_irqs_off
1.24 ? 5% -0.7 0.59 ? 4% perf-profile.children.cycles-pp.ttwu_do_wakeup
1.18 ? 5% -0.6 0.53 ? 3% perf-profile.children.cycles-pp.check_preempt_curr
1.01 ? 6% -0.6 0.40 ? 4% perf-profile.children.cycles-pp.check_preempt_wakeup
1.59 ? 5% -0.5 1.08 ? 8% perf-profile.children.cycles-pp.__update_load_avg_se
1.38 ? 5% -0.5 0.87 ? 8% perf-profile.children.cycles-pp.update_rq_clock
1.61 ? 7% -0.5 1.13 ? 10% perf-profile.children.cycles-pp.reweight_entity
0.79 ? 2% -0.4 0.35 ? 17% perf-profile.children.cycles-pp.cpuidle_idle_call
1.43 ? 2% -0.4 1.00 ? 6% perf-profile.children.cycles-pp.secondary_startup_64_no_verify
1.43 ? 2% -0.4 1.00 ? 6% perf-profile.children.cycles-pp.cpu_startup_entry
1.43 ? 2% -0.4 1.00 ? 6% perf-profile.children.cycles-pp.do_idle
1.57 ? 5% -0.4 1.16 ? 7% perf-profile.children.cycles-pp.set_next_entity
1.03 ? 6% -0.4 0.63 ? 6% perf-profile.children.cycles-pp.___perf_sw_event
0.70 ? 2% -0.4 0.30 ? 18% perf-profile.children.cycles-pp.cpuidle_enter
0.70 ? 2% -0.4 0.30 ? 18% perf-profile.children.cycles-pp.cpuidle_enter_state
1.27 ? 6% -0.4 0.89 ? 8% perf-profile.children.cycles-pp.__update_load_avg_cfs_rq
0.62 ? 7% -0.4 0.26 ? 20% perf-profile.children.cycles-pp.prepare_to_wait_event
0.62 ? 2% -0.4 0.26 ? 17% perf-profile.children.cycles-pp.intel_idle
0.62 ? 2% -0.4 0.26 ? 17% perf-profile.children.cycles-pp.mwait_idle_with_hints
1.38 ? 4% -0.4 1.03 ? 7% perf-profile.children.cycles-pp.__switch_to_asm
2.05 ? 5% -0.3 1.70 ? 9% perf-profile.children.cycles-pp.dequeue_entity
1.42 ? 2% -0.3 1.08 ? 6% perf-profile.children.cycles-pp.__switch_to
0.70 ? 3% -0.2 0.48 ? 3% perf-profile.children.cycles-pp.finish_task_switch
0.64 ? 2% -0.2 0.42 ? 5% perf-profile.children.cycles-pp.put_prev_entity
0.31 ? 6% -0.2 0.14 ? 7% perf-profile.children.cycles-pp.resched_curr
0.79 ? 4% -0.1 0.66 ? 8% perf-profile.children.cycles-pp.perf_trace_sched_wakeup_template
0.50 -0.1 0.37 ? 6% perf-profile.children.cycles-pp.__wrgsbase_inactive
0.48 ? 2% -0.1 0.36 ? 7% perf-profile.children.cycles-pp.save_fpregs_to_fpstate
0.74 ? 3% -0.1 0.63 ? 9% perf-profile.children.cycles-pp.switch_fpu_return
0.24 ? 29% -0.1 0.14 ? 7% perf-profile.children.cycles-pp._raw_spin_lock_irq
0.18 ? 20% -0.1 0.09 ? 16% perf-profile.children.cycles-pp.get_nohz_timer_target
0.46 ? 6% -0.1 0.37 ? 10% perf-profile.children.cycles-pp.update_min_vruntime
0.36 ? 3% -0.1 0.27 ? 6% perf-profile.children.cycles-pp.pick_next_entity
0.32 ? 2% -0.1 0.24 ? 5% perf-profile.children.cycles-pp.__rdgsbase_inactive
0.18 ? 10% -0.1 0.10 ? 17% perf-profile.children.cycles-pp.asm_sysvec_call_function_single
0.35 ? 6% -0.1 0.28 ? 8% perf-profile.children.cycles-pp.cpumask_next
0.43 ? 3% -0.1 0.36 ? 5% perf-profile.children.cycles-pp.__list_add_valid
0.41 ? 6% -0.1 0.34 ? 10% perf-profile.children.cycles-pp.__calc_delta
0.29 ? 5% -0.1 0.23 ? 9% perf-profile.children.cycles-pp.__cgroup_account_cputime
0.41 ? 5% -0.1 0.35 ? 10% perf-profile.children.cycles-pp.perf_tp_event
0.17 ? 4% -0.1 0.11 ? 6% perf-profile.children.cycles-pp.clear_buddies
0.13 ? 11% -0.1 0.08 ? 16% perf-profile.children.cycles-pp.sysvec_call_function_single
0.29 ? 4% -0.1 0.24 ? 10% perf-profile.children.cycles-pp.perf_trace_sched_stat_runtime
0.11 ? 23% -0.1 0.06 ? 8% perf-profile.children.cycles-pp.switch_ldt
0.20 ? 3% -0.0 0.16 ? 8% perf-profile.children.cycles-pp.perf_trace_sched_switch
0.20 ? 7% -0.0 0.16 ? 12% perf-profile.children.cycles-pp.set_next_buddy
0.12 ? 8% -0.0 0.08 ? 17% perf-profile.children.cycles-pp.__sysvec_call_function_single
0.11 ? 9% -0.0 0.07 ? 18% perf-profile.children.cycles-pp.sched_ttwu_pending
0.22 ? 6% -0.0 0.19 ? 11% perf-profile.children.cycles-pp.perf_trace_buf_update
0.06 ? 6% -0.0 0.03 ? 70% perf-profile.children.cycles-pp.perf_swevent_event
0.12 ? 4% -0.0 0.09 ? 7% perf-profile.children.cycles-pp.schedule_debug
0.14 ? 3% -0.0 0.12 ? 4% perf-profile.children.cycles-pp.ttwu_queue_wakelist
0.08 ? 4% -0.0 0.06 ? 8% perf-profile.children.cycles-pp.perf_exclude_event
0.11 ? 6% -0.0 0.09 ? 5% perf-profile.children.cycles-pp.rb_next
0.08 ? 4% -0.0 0.06 ? 11% perf-profile.children.cycles-pp.rcu_note_context_switch
0.13 ? 4% -0.0 0.11 ? 7% perf-profile.children.cycles-pp.perf_trace_buf_alloc
0.06 ? 6% -0.0 0.04 ? 44% perf-profile.children.cycles-pp.cr4_update_irqsoff
0.09 ? 4% -0.0 0.07 ? 9% perf-profile.children.cycles-pp.perf_trace_run_bpf_submit
0.08 ? 10% +0.0 0.10 ? 8% perf-profile.children.cycles-pp.timespec64_add_safe
0.10 ? 9% +0.0 0.12 ? 8% perf-profile.children.cycles-pp.set_normalized_timespec64
0.08 ? 4% +0.0 0.11 ? 5% perf-profile.children.cycles-pp.__x64_sys_write
0.15 ? 8% +0.0 0.19 ? 5% perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
0.12 ? 9% +0.0 0.15 ? 8% perf-profile.children.cycles-pp.rw_verify_area
0.14 ? 9% +0.0 0.18 ? 4% perf-profile.children.cycles-pp.ktime_get_coarse_real_ts64
0.14 ? 8% +0.0 0.19 ? 7% perf-profile.children.cycles-pp.check_stack_object
0.00 +0.1 0.05 perf-profile.children.cycles-pp.__fdget
0.02 ?142% +0.1 0.07 ? 10% perf-profile.children.cycles-pp.__bitmap_andnot
0.16 ? 9% +0.1 0.22 ? 6% perf-profile.children.cycles-pp.memset
0.11 ? 11% +0.1 0.17 ? 6% perf-profile.children.cycles-pp.fsnotify_perm
0.39 ? 3% +0.1 0.46 ? 6% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
0.44 ? 3% +0.1 0.51 ? 6% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
0.23 ? 8% +0.1 0.30 ? 6% perf-profile.children.cycles-pp.aa_file_perm
0.36 ? 3% +0.1 0.43 ? 6% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
0.36 ? 4% +0.1 0.43 ? 6% perf-profile.children.cycles-pp.hrtimer_interrupt
0.25 ? 7% +0.1 0.33 ? 6% perf-profile.children.cycles-pp.rcu_all_qs
0.23 ? 3% +0.1 0.32 ? 7% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.20 ? 3% +0.1 0.28 ? 8% perf-profile.children.cycles-pp.tick_sched_timer
0.15 ? 2% +0.1 0.24 ? 9% perf-profile.children.cycles-pp.scheduler_tick
0.18 ? 2% +0.1 0.26 ? 8% perf-profile.children.cycles-pp.update_process_times
0.18 ? 2% +0.1 0.27 ? 8% perf-profile.children.cycles-pp.tick_sched_handle
0.59 ? 3% +0.1 0.68 ? 3% perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
0.12 ? 6% +0.1 0.21 ? 11% perf-profile.children.cycles-pp.task_tick_fair
0.24 ? 15% +0.1 0.35 ? 12% perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
0.40 ? 8% +0.1 0.53 ? 5% perf-profile.children.cycles-pp.atime_needs_update
0.34 ? 10% +0.1 0.47 ? 10% perf-profile.children.cycles-pp.syscall_enter_from_user_mode
0.37 ? 8% +0.1 0.50 ? 5% perf-profile.children.cycles-pp.file_update_time
0.38 ? 9% +0.1 0.51 ? 6% perf-profile.children.cycles-pp.__check_object_size
1.21 ? 5% +0.1 1.35 ? 4% perf-profile.children.cycles-pp.security_file_permission
0.40 ? 8% +0.1 0.54 ? 5% perf-profile.children.cycles-pp.current_time
0.49 ? 8% +0.2 0.65 ? 6% perf-profile.children.cycles-pp.touch_atime
0.51 ? 9% +0.2 0.66 ? 4% perf-profile.children.cycles-pp.__fdget_pos
0.55 ? 8% +0.2 0.70 ? 6% perf-profile.children.cycles-pp.__cond_resched
0.77 ? 7% +0.2 0.94 ? 5% perf-profile.children.cycles-pp.apparmor_file_permission
0.62 ? 8% +0.2 0.81 ? 7% perf-profile.children.cycles-pp.copyin
0.67 ? 9% +0.2 0.87 ? 7% perf-profile.children.cycles-pp.__fsnotify_parent
0.82 ? 5% +0.2 1.03 perf-profile.children.cycles-pp.__fget_light
0.84 ? 9% +0.3 1.10 ? 7% perf-profile.children.cycles-pp.select_estimate_accuracy
0.77 ? 10% +0.3 1.03 ? 8% perf-profile.children.cycles-pp.__might_sleep
0.81 ? 8% +0.3 1.08 ? 6% perf-profile.children.cycles-pp._copy_to_user
1.39 ? 4% +0.3 1.68 ? 2% perf-profile.children.cycles-pp.stress_fifo
1.01 ? 9% +0.3 1.33 ? 7% perf-profile.children.cycles-pp.read_tsc
1.57 ? 10% +0.4 1.97 ? 8% perf-profile.children.cycles-pp.copy_page_to_iter
1.37 ? 9% +0.4 1.79 ? 7% perf-profile.children.cycles-pp.__might_fault
1.35 ? 9% +0.4 1.79 ? 7% perf-profile.children.cycles-pp.copy_page_from_iter
1.35 ? 8% +0.4 1.80 ? 6% perf-profile.children.cycles-pp.__might_resched
1.90 ? 9% +0.5 2.37 ? 7% perf-profile.children.cycles-pp._copy_from_user
1.74 ? 8% +0.5 2.22 ? 7% perf-profile.children.cycles-pp.syscall_return_via_sysret
1.56 ? 9% +0.5 2.05 ? 7% perf-profile.children.cycles-pp.ktime_get_ts64
1.54 ? 9% +0.5 2.04 ? 7% perf-profile.children.cycles-pp.__entry_text_start
1.68 ? 8% +0.5 2.23 ? 7% perf-profile.children.cycles-pp.poll_select_finish
1.58 ? 9% +0.6 2.14 ? 7% perf-profile.children.cycles-pp.memset_erms
1.91 ? 16% +0.9 2.78 ? 11% perf-profile.children.cycles-pp.mutex_lock
3.45 ? 9% +1.0 4.44 ? 7% perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
22.20 ? 5% +4.9 27.06 ? 6% perf-profile.children.cycles-pp.do_select
26.42 ? 3% +6.2 32.66 ? 4% perf-profile.children.cycles-pp.core_sys_select
30.31 ? 3% +7.2 37.56 ? 3% perf-profile.children.cycles-pp.__x64_sys_select
30.15 ? 3% +7.3 37.41 ? 3% perf-profile.children.cycles-pp.kern_select
33.84 ? 3% +8.2 42.03 ? 3% perf-profile.children.cycles-pp.select
1.89 ? 8% -1.0 0.94 ? 12% perf-profile.self.cycles-pp.available_idle_cpu
1.30 ? 7% -0.8 0.46 ? 7% perf-profile.self.cycles-pp.prepare_task_switch
3.64 ? 3% -0.8 2.84 ? 7% perf-profile.self.cycles-pp.switch_mm_irqs_off
1.36 ? 7% -0.7 0.70 ? 8% perf-profile.self.cycles-pp._raw_spin_lock
2.06 ? 4% -0.6 1.48 ? 8% perf-profile.self.cycles-pp.__schedule
1.47 ? 5% -0.5 0.99 ? 8% perf-profile.self.cycles-pp.__update_load_avg_se
1.38 ? 5% -0.5 0.91 ? 8% perf-profile.self.cycles-pp.update_curr
0.85 ? 6% -0.4 0.42 ? 8% perf-profile.self.cycles-pp.update_rq_clock
0.80 ? 9% -0.4 0.43 ? 11% perf-profile.self.cycles-pp.dequeue_task_fair
0.90 ? 6% -0.4 0.53 ? 7% perf-profile.self.cycles-pp.___perf_sw_event
1.21 ? 6% -0.4 0.84 ? 8% perf-profile.self.cycles-pp.__update_load_avg_cfs_rq
0.61 ? 2% -0.4 0.25 ? 18% perf-profile.self.cycles-pp.mwait_idle_with_hints
1.36 ? 4% -0.3 1.01 ? 7% perf-profile.self.cycles-pp.__switch_to_asm
1.37 ? 3% -0.3 1.04 ? 6% perf-profile.self.cycles-pp.__switch_to
0.82 ? 5% -0.3 0.50 ? 6% perf-profile.self.cycles-pp.pick_next_task_fair
0.75 ? 7% -0.3 0.44 ? 9% perf-profile.self.cycles-pp.enqueue_task_fair
0.75 ? 7% -0.3 0.48 ? 10% perf-profile.self.cycles-pp.reweight_entity
0.77 ? 9% -0.2 0.54 ? 12% perf-profile.self.cycles-pp.select_idle_sibling
0.38 ? 6% -0.2 0.17 ? 4% perf-profile.self.cycles-pp.check_preempt_wakeup
0.44 ? 4% -0.2 0.27 ? 5% perf-profile.self.cycles-pp.finish_task_switch
0.30 ? 7% -0.2 0.13 ? 6% perf-profile.self.cycles-pp.resched_curr
0.71 ? 5% -0.2 0.55 ? 9% perf-profile.self.cycles-pp.enqueue_entity
0.48 ? 2% -0.1 0.36 ? 7% perf-profile.self.cycles-pp.__wrgsbase_inactive
0.47 ? 2% -0.1 0.36 ? 6% perf-profile.self.cycles-pp.save_fpregs_to_fpstate
0.20 ? 6% -0.1 0.09 ? 18% perf-profile.self.cycles-pp.prepare_to_wait_event
0.18 ? 19% -0.1 0.08 ? 17% perf-profile.self.cycles-pp.get_nohz_timer_target
0.32 ? 4% -0.1 0.22 ? 5% perf-profile.self.cycles-pp.set_next_entity
0.39 ? 6% -0.1 0.30 ? 10% perf-profile.self.cycles-pp.dequeue_entity
0.38 ? 6% -0.1 0.29 ? 9% perf-profile.self.cycles-pp.select_task_rq_fair
0.31 ? 2% -0.1 0.22 ? 6% perf-profile.self.cycles-pp.__rdgsbase_inactive
0.22 ? 3% -0.1 0.14 ? 5% perf-profile.self.cycles-pp.exit_to_user_mode_loop
0.42 ? 4% -0.1 0.34 ? 7% perf-profile.self.cycles-pp.schedule
0.30 ? 5% -0.1 0.22 ? 3% perf-profile.self.cycles-pp.switch_fpu_return
0.39 ? 7% -0.1 0.31 ? 10% perf-profile.self.cycles-pp.update_min_vruntime
0.31 ? 4% -0.1 0.23 ? 6% perf-profile.self.cycles-pp.pick_next_entity
0.38 ? 2% -0.1 0.32 ? 6% perf-profile.self.cycles-pp.__list_add_valid
0.10 ? 28% -0.1 0.03 ? 70% perf-profile.self.cycles-pp.switch_ldt
0.38 ? 5% -0.1 0.32 ? 10% perf-profile.self.cycles-pp.__calc_delta
0.13 ? 4% -0.1 0.08 ? 7% perf-profile.self.cycles-pp.clear_buddies
0.19 ? 4% -0.0 0.14 ? 7% perf-profile.self.cycles-pp.perf_trace_sched_switch
0.20 -0.0 0.15 ? 4% perf-profile.self.cycles-pp.put_prev_entity
0.27 ? 4% -0.0 0.22 ? 11% perf-profile.self.cycles-pp.perf_trace_sched_stat_runtime
0.18 ? 3% -0.0 0.14 ? 11% perf-profile.self.cycles-pp.__cgroup_account_cputime
0.20 ? 6% -0.0 0.17 ? 8% perf-profile.self.cycles-pp.perf_tp_event
0.17 ? 5% -0.0 0.14 ? 7% perf-profile.self.cycles-pp.perf_trace_sched_wakeup_template
0.14 ? 5% -0.0 0.11 ? 9% perf-profile.self.cycles-pp.check_preempt_curr
0.06 ? 7% -0.0 0.03 ? 70% perf-profile.self.cycles-pp.perf_exclude_event
0.11 ? 6% -0.0 0.09 ? 10% perf-profile.self.cycles-pp.cpumask_next
0.11 ? 4% -0.0 0.08 ? 8% perf-profile.self.cycles-pp.schedule_debug
0.15 ? 3% -0.0 0.13 ? 8% perf-profile.self.cycles-pp.ttwu_do_activate
0.09 ? 5% -0.0 0.08 ? 6% perf-profile.self.cycles-pp.rb_next
0.08 ? 4% -0.0 0.07 ? 7% perf-profile.self.cycles-pp.perf_trace_run_bpf_submit
0.06 ? 11% +0.0 0.08 ? 6% perf-profile.self.cycles-pp.memset
0.06 ? 11% +0.0 0.08 ? 5% perf-profile.self.cycles-pp.timespec64_add_safe
0.06 ? 9% +0.0 0.08 ? 7% perf-profile.self.cycles-pp.__x64_sys_write
0.04 ? 44% +0.0 0.06 ? 7% perf-profile.self.cycles-pp.copyin
0.07 ? 8% +0.0 0.10 ? 8% perf-profile.self.cycles-pp.set_normalized_timespec64
0.04 ? 45% +0.0 0.07 ? 5% perf-profile.self.cycles-pp.copyout
0.10 ? 9% +0.0 0.13 ? 9% perf-profile.self.cycles-pp.touch_atime
0.09 ? 9% +0.0 0.12 ? 8% perf-profile.self.cycles-pp.remove_wait_queue
0.09 ? 7% +0.0 0.12 ? 5% perf-profile.self.cycles-pp.rw_verify_area
0.10 ? 6% +0.0 0.13 ? 8% perf-profile.self.cycles-pp.__fdget_pos
0.14 ? 7% +0.0 0.18 ? 6% perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
0.11 ? 8% +0.0 0.14 ? 6% perf-profile.self.cycles-pp._copy_to_user
0.12 ? 9% +0.0 0.16 ? 4% perf-profile.self.cycles-pp.ktime_get_coarse_real_ts64
0.12 ? 8% +0.0 0.16 ? 7% perf-profile.self.cycles-pp.check_stack_object
0.16 ? 9% +0.0 0.20 ? 7% perf-profile.self.cycles-pp.ksys_read
0.14 ? 11% +0.0 0.18 ? 7% perf-profile.self.cycles-pp.poll_freewait
0.14 ? 7% +0.0 0.18 ? 6% perf-profile.self.cycles-pp.rcu_all_qs
0.02 ?142% +0.0 0.07 ? 7% perf-profile.self.cycles-pp.__bitmap_andnot
0.17 ? 8% +0.1 0.23 ? 6% perf-profile.self.cycles-pp.file_update_time
0.10 ? 12% +0.1 0.16 ? 6% perf-profile.self.cycles-pp.fsnotify_perm
0.19 ? 10% +0.1 0.25 ? 6% perf-profile.self.cycles-pp.atime_needs_update
0.28 ? 9% +0.1 0.34 ? 6% perf-profile.self.cycles-pp.__might_fault
0.20 ? 10% +0.1 0.27 ? 6% perf-profile.self.cycles-pp.aa_file_perm
0.17 ? 12% +0.1 0.24 ? 9% perf-profile.self.cycles-pp.ksys_write
0.36 ? 5% +0.1 0.43 ? 3% perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
0.20 ? 9% +0.1 0.27 ? 8% perf-profile.self.cycles-pp.__check_object_size
0.29 ? 9% +0.1 0.38 ? 7% perf-profile.self.cycles-pp.select_estimate_accuracy
0.25 ? 8% +0.1 0.34 ? 6% perf-profile.self.cycles-pp.current_time
0.54 ? 7% +0.1 0.62 ? 5% perf-profile.self.cycles-pp.apparmor_file_permission
0.81 ? 4% +0.1 0.90 ? 3% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.24 ? 10% +0.1 0.34 ? 6% perf-profile.self.cycles-pp.new_sync_write
0.34 ? 8% +0.1 0.43 ? 7% perf-profile.self.cycles-pp.poll_select_finish
0.30 ? 9% +0.1 0.40 ? 6% perf-profile.self.cycles-pp.copy_page_from_iter
0.31 ? 8% +0.1 0.41 ? 6% perf-profile.self.cycles-pp.__cond_resched
0.20 ? 17% +0.1 0.31 ? 13% perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
0.30 ? 7% +0.1 0.40 ? 5% perf-profile.self.cycles-pp.new_sync_read
0.42 ? 9% +0.1 0.54 ? 8% perf-profile.self.cycles-pp.vfs_read
0.29 ? 11% +0.1 0.41 ? 11% perf-profile.self.cycles-pp.syscall_enter_from_user_mode
0.39 ? 9% +0.1 0.50 ? 7% perf-profile.self.cycles-pp.copy_page_to_iter
0.36 ? 10% +0.1 0.47 ? 7% perf-profile.self.cycles-pp.vfs_write
0.50 ? 8% +0.1 0.62 ? 6% perf-profile.self.cycles-pp.read
0.51 ? 6% +0.1 0.65 ? 5% perf-profile.self.cycles-pp.select
0.46 ? 6% +0.1 0.59 ? 3% perf-profile.self.cycles-pp.kern_select
0.41 ? 9% +0.1 0.54 ? 6% perf-profile.self.cycles-pp.__entry_text_start
0.51 ? 6% +0.1 0.65 ? 4% perf-profile.self.cycles-pp.write
0.25 ? 14% +0.2 0.41 ? 3% perf-profile.self.cycles-pp.pollwake
0.56 ? 8% +0.2 0.73 ? 8% perf-profile.self.cycles-pp.ktime_get_ts64
0.38 ? 11% +0.2 0.55 ? 7% perf-profile.self.cycles-pp.pipe_poll
0.63 ? 7% +0.2 0.82 ? 6% perf-profile.self.cycles-pp.core_sys_select
0.78 ? 5% +0.2 0.98 ? 2% perf-profile.self.cycles-pp.__fget_light
0.61 ? 9% +0.2 0.81 ? 7% perf-profile.self.cycles-pp.__fsnotify_parent
0.64 ? 10% +0.2 0.85 ? 8% perf-profile.self.cycles-pp.__might_sleep
1.22 ? 5% +0.3 1.51 ? 2% perf-profile.self.cycles-pp.stress_fifo
0.97 ? 9% +0.3 1.28 ? 7% perf-profile.self.cycles-pp.read_tsc
1.20 ? 8% +0.4 1.60 ? 6% perf-profile.self.cycles-pp.__might_resched
1.73 ? 8% +0.5 2.22 ? 7% perf-profile.self.cycles-pp.syscall_return_via_sysret
1.52 ? 9% +0.5 2.06 ? 7% perf-profile.self.cycles-pp.memset_erms
1.96 ? 3% +0.5 2.50 ? 2% perf-profile.self.cycles-pp.do_select
1.34 ? 20% +0.7 2.02 ? 12% perf-profile.self.cycles-pp.mutex_lock
3.32 ? 9% +1.0 4.27 ? 7% perf-profile.self.cycles-pp.copy_user_enhanced_fast_string

Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://01.org/lkp

Attachments:
(No filename) (190.99 kB)
config-5.18.0-rc1-00004-g6b433275e3a3 (165.11 kB)
job-script (8.12 kB)
job.yaml (5.54 kB)
reproduce (348.00 B)