Date: 2023-05-10 08:13:31
From: Oliver Sang

Subject: [linus:master] [sched/numa] fc137c0dda: autonuma-benchmark.numa01.seconds 118.9% regression



Hello,

kernel test robot noticed a 118.9% regression of autonuma-benchmark.numa01.seconds on:


commit: fc137c0ddab29b591db6a091dc6d7ce20ccb73f2 ("sched/numa: enhance vma scanning logic")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
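
To inspect the offending commit locally (plain git, not lkp-specific; the
full sha is quoted above):

  git log -1 fc137c0ddab29b591db6a091dc6d7ce20ccb73f2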

testcase: autonuma-benchmark
test machine: 88 threads 2 sockets Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz (Cascade Lake) with 128G memory
parameters:

iterations: 4x
test: numa02_SMT
cpufreq_governor: performance


In addition to that, the commit also has significant impact on the following tests:

+------------------+------------------------------------------------------------------------------------------------+
| testcase: change | autonuma-benchmark: autonuma-benchmark.numa01.seconds 39.3% regression |
| test machine | 224 threads 2 sockets (Sapphire Rapids) with 256G memory |
| test parameters | cpufreq_governor=performance |
| | iterations=4x |
| | test=numa02_SMT |
+------------------+------------------------------------------------------------------------------------------------+
| testcase: change | autonuma-benchmark: autonuma-benchmark.numa01.seconds 48.9% regression |
| test machine | 88 threads 2 sockets Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz (Cascade Lake) with 128G memory |
| test parameters | cpufreq_governor=performance |
| | debug-setup=no-monitor |
| | iterations=4x |
| | test=numa02_SMT |
+------------------+------------------------------------------------------------------------------------------------+


One thing we want to mention: although this job runs the numa02_SMT test, we
found the regression mostly in autonuma-benchmark.numa01.seconds (the
benchmark reports timings for all of its sub-tests, as shown below).
A summary is below; the detailed data is in [1].

=========================================================================================
compiler/cpufreq_governor/iterations/kconfig/rootfs/tbox_group/test/testcase:
gcc-11/performance/4x/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/lkp-csl-2sp9/numa02_SMT/autonuma-benchmark

ef6a22b70f6d9044 fc137c0ddab29b591db6a091dc6
---------------- ---------------------------
%stddev %change %stddev
\ | \
    198.95 ±  7%    +118.9%     435.43 ±  2%  autonuma-benchmark.numa01.seconds
     14.30 ±  2%      -3.3%      13.83 ±  2%  autonuma-benchmark.numa02.seconds
     12.46            -1.1%      12.33 ±  2%  autonuma-benchmark.numa02_SMT.seconds
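
(For reference, %change is measured against the parent commit ef6a22b70f,
e.g. for numa01: (435.43 - 198.95) / 198.95 ≈ +118.9%.)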


=========================================================================================
compiler/cpufreq_governor/iterations/kconfig/rootfs/tbox_group/test/testcase:
gcc-11/performance/4x/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/lkp-spr-r02/numa02_SMT/autonuma-benchmark

ef6a22b70f6d9044 fc137c0ddab29b591db6a091dc6
---------------- ---------------------------
%stddev %change %stddev
\ | \
    191.58 ±  2%     +39.3%     266.89        autonuma-benchmark.numa01.seconds
      8.06            +0.2%       8.08        autonuma-benchmark.numa02.seconds
      6.41 ± 12%      +8.1%       6.93        autonuma-benchmark.numa02_SMT.seconds


We then tested again with the perf monitor disabled, to reduce the impact of
perf itself, since the perf profile data contains many perf-command-related
functions. With monitoring off the performance delta is reduced, though numa01
still shows a roughly 50% regression.
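
For reference, the no-monitor run corresponds to the debug-setup=no-monitor
parameter shown in the table above; in the job file this would look roughly
like the following (a sketch, assuming the same key/value layout as the other
parameters):

  cpufreq_governor: performance
  debug-setup: no-monitor
  iterations: 4x
  test: numa02_SMT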

=========================================================================================
compiler/cpufreq_governor/debug-setup/iterations/kconfig/rootfs/tbox_group/test/testcase:
gcc-11/performance/no-monitor/4x/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/lkp-csl-2sp9/numa02_SMT/autonuma-benchmark

ef6a22b70f6d9044 fc137c0ddab29b591db6a091dc6
---------------- ---------------------------
%stddev %change %stddev
\ | \
    203.37 ±  8%     +48.9%     302.87 ±  3%  autonuma-benchmark.numa01.seconds
     13.85            -0.4%      13.80        autonuma-benchmark.numa02.seconds
     12.20            -0.2%      12.18        autonuma-benchmark.numa02_SMT.seconds


If you fix the issue, kindly add the following tags
| Reported-by: kernel test robot <[email protected]>
| Link: https://lore.kernel.org/oe-lkp/[email protected]


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file

# if you come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.
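
As a quick sanity check outside the lkp flow (a sketch, not part of the
robot's tooling), the NUMA-scanning counters that moved in this report can be
read directly from /proc/vmstat before and after a benchmark run:

  # these counters dropped sharply (roughly 88-95%) with the commit applied
  grep -E 'numa_(pte_updates|huge_pte_updates|hint_faults|pages_migrated)' /proc/vmstat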

[1]

=========================================================================================
compiler/cpufreq_governor/iterations/kconfig/rootfs/tbox_group/test/testcase:
gcc-11/performance/4x/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/lkp-csl-2sp9/numa02_SMT/autonuma-benchmark

commit:
ef6a22b70f ("sched/numa: apply the scan delay to every new vma")
fc137c0dda ("sched/numa: enhance vma scanning logic")

ef6a22b70f6d9044 fc137c0ddab29b591db6a091dc6
---------------- ---------------------------
%stddev %change %stddev
\ | \
198.95 ? 7% +118.9% 435.43 ? 2% autonuma-benchmark.numa01.seconds
14.30 ? 2% -3.3% 13.83 ? 2% autonuma-benchmark.numa02.seconds
12.46 -1.1% 12.33 ? 2% autonuma-benchmark.numa02_SMT.seconds
907.24 ? 6% +104.0% 1850 autonuma-benchmark.time.elapsed_time
907.24 ? 6% +104.0% 1850 autonuma-benchmark.time.elapsed_time.max
391359 ? 7% +109.6% 820115 ? 2% autonuma-benchmark.time.involuntary_context_switches
1674215 ? 3% +25.4% 2099248 autonuma-benchmark.time.minor_page_faults
68259 ? 7% +104.6% 139628 autonuma-benchmark.time.user_time
33292 ? 17% -31.5% 22801 autonuma-benchmark.time.voluntary_context_switches
90254 -1.3% 89057 vmstat.system.in
8.255e+09 ? 4% +97.9% 1.634e+10 ? 6% cpuidle..time
8518920 ? 4% +98.1% 16871964 ? 6% cpuidle..usage
953.10 ? 6% +99.0% 1896 uptime.boot
11931 ? 2% +68.5% 20105 ? 5% uptime.idle
31722 ? 2% +35.7% 43051 ? 3% meminfo.Active
31610 ? 2% +35.9% 42947 ? 3% meminfo.Active(anon)
1567 ? 3% +28.0% 2006 meminfo.Mlocked
44483 ? 4% +20.2% 53481 ? 5% meminfo.Shmem
0.05 ? 17% -0.1 0.00 ?115% mpstat.cpu.all.iowait%
1.57 +0.9 2.44 mpstat.cpu.all.irq%
0.05 +0.0 0.08 mpstat.cpu.all.soft%
2.47 ? 3% -1.0 1.51 ? 4% mpstat.cpu.all.sys%
1325319 ? 13% +67.1% 2214191 ? 6% numa-numastat.node0.local_node
1669069 ? 11% +46.6% 2446395 ? 6% numa-numastat.node0.numa_hit
343780 ? 19% -32.5% 232190 ? 24% numa-numastat.node0.other_node
1351528 ? 11% +58.0% 2134780 ? 5% numa-numastat.node1.local_node
1632897 ? 8% +53.8% 2511008 ? 4% numa-numastat.node1.numa_hit
25990 ? 6% +24.2% 32278 ? 9% turbostat.C1
0.08 ? 14% -0.1 0.03 ? 17% turbostat.C1E%
8339796 ? 4% +99.9% 16670602 ? 6% turbostat.C6
0.42 ? 11% -44.0% 0.24 ? 7% turbostat.CPU%c6
83201587 ? 6% +97.6% 1.644e+08 turbostat.IRQ
32712 ? 16% +51.9% 49691 ? 7% turbostat.POLL
0.34 ? 13% -43.8% 0.19 ? 7% turbostat.Pkg%pc2
250.53 -5.1% 237.66 turbostat.PkgWatt
27.99 -20.6% 22.23 turbostat.RAMWatt
4846 ? 8% +87.7% 9098 ? 12% numa-meminfo.node0.Active
4790 ? 8% +88.9% 9047 ? 12% numa-meminfo.node0.Active(anon)
8801 ? 7% +53.5% 13508 ? 8% numa-meminfo.node0.Shmem
26801 +26.6% 33928 ? 5% numa-meminfo.node1.Active
26745 +26.7% 33875 ? 5% numa-meminfo.node1.Active(anon)
2666372 ? 2% +40.1% 3735457 ? 18% numa-meminfo.node1.AnonHugePages
2730009 ? 2% +39.4% 3806916 ? 17% numa-meminfo.node1.AnonPages
2738755 ? 2% +39.2% 3813200 ? 17% numa-meminfo.node1.Inactive
2738657 ? 2% +39.2% 3813099 ? 17% numa-meminfo.node1.Inactive(anon)
35654 ? 5% +11.9% 39902 ? 6% numa-meminfo.node1.Shmem
1198 ? 8% +89.0% 2264 ? 12% numa-vmstat.node0.nr_active_anon
2201 ? 7% +53.5% 3380 ? 8% numa-vmstat.node0.nr_shmem
1198 ? 8% +89.0% 2264 ? 12% numa-vmstat.node0.nr_zone_active_anon
1669280 ? 11% +46.5% 2445929 ? 5% numa-vmstat.node0.numa_hit
1325531 ? 13% +67.0% 2213724 ? 6% numa-vmstat.node0.numa_local
343780 ? 19% -32.5% 232190 ? 24% numa-vmstat.node0.numa_other
6697 +26.5% 8473 ? 5% numa-vmstat.node1.nr_active_anon
682644 ? 2% +39.4% 951464 ? 17% numa-vmstat.node1.nr_anon_pages
1301 ? 2% +40.0% 1822 ? 18% numa-vmstat.node1.nr_anon_transparent_hugepages
684804 ? 2% +39.2% 953008 ? 17% numa-vmstat.node1.nr_inactive_anon
8923 ? 5% +11.8% 9977 ? 6% numa-vmstat.node1.nr_shmem
6697 +26.5% 8473 ? 5% numa-vmstat.node1.nr_zone_active_anon
684803 ? 2% +39.2% 953007 ? 17% numa-vmstat.node1.nr_zone_inactive_anon
1632842 ? 8% +53.7% 2510369 ? 4% numa-vmstat.node1.numa_hit
1351473 ? 11% +57.9% 2134142 ? 5% numa-vmstat.node1.numa_local
7883 ? 2% +36.3% 10741 ? 3% proc-vmstat.nr_active_anon
1436177 +7.4% 1541812 proc-vmstat.nr_anon_pages
2739 +7.9% 2956 proc-vmstat.nr_anon_transparent_hugepages
1439302 +7.3% 1544428 proc-vmstat.nr_inactive_anon
391.83 ? 3% +28.1% 502.00 proc-vmstat.nr_mlock
3729 +5.2% 3925 proc-vmstat.nr_page_table_pages
11118 ? 4% +20.2% 13362 ? 5% proc-vmstat.nr_shmem
7883 ? 2% +36.3% 10741 ? 3% proc-vmstat.nr_zone_active_anon
1439302 +7.3% 1544428 proc-vmstat.nr_zone_inactive_anon
175921 ? 3% -79.8% 35622 ? 13% proc-vmstat.numa_hint_faults
115702 ? 8% -78.5% 24909 ? 19% proc-vmstat.numa_hint_faults_local
3304700 ? 4% +50.1% 4959330 proc-vmstat.numa_hit
102422 ? 4% -94.8% 5353 ? 7% proc-vmstat.numa_huge_pte_updates
2679582 ? 4% +62.4% 4350898 proc-vmstat.numa_local
623652 -2.4% 608410 ? 2% proc-vmstat.numa_other
8685106 ? 6% -87.5% 1082922 proc-vmstat.numa_pages_migrated
52587827 ? 4% -94.6% 2850465 ? 7% proc-vmstat.numa_pte_updates
4004609 ? 4% +58.4% 6344664 proc-vmstat.pgfault
8685106 ? 6% -87.5% 1082922 proc-vmstat.pgmigrate_success
165180 ? 6% +77.9% 293828 ? 2% proc-vmstat.pgreuse
16935 ? 6% -87.6% 2100 proc-vmstat.thp_migration_success
6874624 ? 6% +99.4% 13705216 ? 2% proc-vmstat.unevictable_pgs_scanned
1366547 ? 4% +65.9% 2267715 proc-vmstat.vma_lock_abort
2047 ? 83% -100.0% 0.00 proc-vmstat.vma_lock_retry
1291100 ? 2% +13.6% 1467326 proc-vmstat.vma_lock_success
4716 ? 9% -21.9% 3683 ? 16% sched_debug.cfs_rq:/.load.min
43237186 ? 10% +123.3% 96553100 ? 4% sched_debug.cfs_rq:/.min_vruntime.avg
44834019 ? 10% +125.1% 1.009e+08 ? 4% sched_debug.cfs_rq:/.min_vruntime.max
39418790 ? 11% +123.0% 87908268 ? 4% sched_debug.cfs_rq:/.min_vruntime.min
1144476 ? 6% +146.3% 2818840 ? 6% sched_debug.cfs_rq:/.min_vruntime.stddev
0.08 ? 28% +40.8% 0.11 ? 15% sched_debug.cfs_rq:/.nr_running.stddev
6.39 ? 45% -69.0% 1.98 ? 18% sched_debug.cfs_rq:/.removed.load_avg.avg
69.51 ? 8% -42.9% 39.71 ? 34% sched_debug.cfs_rq:/.removed.load_avg.max
19.45 ? 25% -56.8% 8.40 ? 23% sched_debug.cfs_rq:/.removed.load_avg.stddev
2.31 ? 46% -65.1% 0.81 ? 25% sched_debug.cfs_rq:/.removed.runnable_avg.avg
7.72 ? 26% -53.9% 3.56 ? 30% sched_debug.cfs_rq:/.removed.runnable_avg.stddev
2.31 ? 46% -65.1% 0.81 ? 25% sched_debug.cfs_rq:/.removed.util_avg.avg
7.72 ? 26% -53.9% 3.56 ? 30% sched_debug.cfs_rq:/.removed.util_avg.stddev
163.15 ? 13% +22.1% 199.24 ? 9% sched_debug.cfs_rq:/.runnable_avg.stddev
3361686 ? 17% +112.3% 7136786 ? 18% sched_debug.cfs_rq:/.spread0.avg
4951283 ? 11% +132.6% 11515787 ? 15% sched_debug.cfs_rq:/.spread0.max
1139698 ? 6% +146.6% 2810782 ? 6% sched_debug.cfs_rq:/.spread0.stddev
113.85 ? 22% +28.3% 146.07 ? 10% sched_debug.cfs_rq:/.util_avg.stddev
451.84 ? 6% -98.4% 7.03 ? 41% sched_debug.cfs_rq:/.util_est_enqueued.avg
1078 ? 8% -78.1% 235.96 ? 34% sched_debug.cfs_rq:/.util_est_enqueued.max
290.02 ? 7% -87.9% 35.06 ? 30% sched_debug.cfs_rq:/.util_est_enqueued.stddev
489663 ? 22% +65.4% 809711 ? 5% sched_debug.cpu.avg_idle.min
192021 ? 10% +25.1% 240152 ? 4% sched_debug.cpu.avg_idle.stddev
465375 ? 7% +101.8% 939116 ? 4% sched_debug.cpu.clock.avg
465585 ? 7% +101.8% 939460 ? 4% sched_debug.cpu.clock.max
465153 ? 7% +101.8% 938746 ? 4% sched_debug.cpu.clock.min
124.32 ? 11% +66.1% 206.49 ? 7% sched_debug.cpu.clock.stddev
458182 ? 7% +99.9% 916080 ? 4% sched_debug.cpu.clock_task.avg
459113 ? 7% +100.3% 919749 ? 4% sched_debug.cpu.clock_task.max
442945 ? 8% +103.3% 900390 ? 4% sched_debug.cpu.clock_task.min
1728 ? 6% +82.2% 3149 ? 31% sched_debug.cpu.clock_task.stddev
13277 ? 7% +61.4% 21433 sched_debug.cpu.curr->pid.avg
16890 ? 6% +69.9% 28705 ? 4% sched_debug.cpu.curr->pid.max
9462 ? 17% +57.5% 14904 ? 14% sched_debug.cpu.curr->pid.min
1346 ? 36% +132.5% 3129 ? 27% sched_debug.cpu.curr->pid.stddev
675236 ? 3% +40.6% 949396 sched_debug.cpu.max_idle_balance_cost.max
38536 ? 16% +141.1% 92913 ? 4% sched_debug.cpu.max_idle_balance_cost.stddev
0.00 ? 11% +63.6% 0.00 ? 7% sched_debug.cpu.next_balance.stddev
11174 ? 7% +87.4% 20935 ? 4% sched_debug.cpu.nr_switches.avg
50841 ? 15% +60.9% 81805 ? 20% sched_debug.cpu.nr_switches.max
3594 ? 12% +77.5% 6378 ? 9% sched_debug.cpu.nr_switches.min
8462 ? 9% +65.1% 13969 ? 12% sched_debug.cpu.nr_switches.stddev
465146 ? 7% +101.8% 938733 ? 4% sched_debug.cpu_clk
462205 ? 7% +102.5% 935792 ? 4% sched_debug.ktime
465863 ? 7% +101.7% 939438 ? 4% sched_debug.sched_clk
56.31 ? 4% -37.4% 35.24 perf-stat.i.MPKI
1.195e+08 -12.0% 1.051e+08 perf-stat.i.branch-instructions
2170451 ? 3% -16.8% 1806326 perf-stat.i.branch-misses
65.87 +1.1 67.01 perf-stat.i.cache-miss-rate%
19697358 ? 3% -39.5% 11911557 perf-stat.i.cache-misses
29660372 ? 4% -39.7% 17871038 perf-stat.i.cache-references
422.84 +4.4% 441.24 perf-stat.i.cpi
135.40 -15.3% 114.62 perf-stat.i.cpu-migrations
11842 ? 3% +56.8% 18564 perf-stat.i.cycles-between-cache-misses
0.04 ? 2% -0.0 0.04 ? 5% perf-stat.i.dTLB-load-miss-rate%
74639 ? 2% -23.4% 57162 ? 4% perf-stat.i.dTLB-load-misses
1.655e+08 -9.8% 1.494e+08 perf-stat.i.dTLB-loads
0.26 -0.0 0.23 perf-stat.i.dTLB-store-miss-rate%
215751 -13.8% 185908 perf-stat.i.dTLB-store-misses
91617620 -8.4% 83893934 perf-stat.i.dTLB-stores
498608 ? 8% -24.3% 377270 ? 3% perf-stat.i.iTLB-load-misses
6.137e+08 -11.2% 5.447e+08 perf-stat.i.instructions
1366 ? 3% +12.4% 1535 perf-stat.i.instructions-per-iTLB-miss
0.01 ? 5% -25.2% 0.00 ? 4% perf-stat.i.ipc
3.26 ? 2% -19.7% 2.62 perf-stat.i.metric.M/sec
4232 -20.9% 3346 perf-stat.i.minor-faults
54.83 -2.4 52.43 perf-stat.i.node-load-miss-rate%
397861 ? 2% -39.9% 239208 perf-stat.i.node-load-misses
311437 ? 3% -29.3% 220078 ? 4% perf-stat.i.node-loads
36.23 ? 8% +14.7 50.97 perf-stat.i.node-store-miss-rate%
11382299 ? 9% -54.5% 5178565 perf-stat.i.node-stores
4232 -20.9% 3346 perf-stat.i.page-faults
48.36 ? 2% -32.1% 32.86 perf-stat.overall.MPKI
1.82 -0.1 1.72 perf-stat.overall.branch-miss-rate%
357.56 ? 2% +12.8% 403.43 perf-stat.overall.cpi
11082 ? 4% +64.5% 18230 perf-stat.overall.cycles-between-cache-misses
0.04 ? 2% -0.0 0.04 ? 4% perf-stat.overall.dTLB-load-miss-rate%
0.24 -0.0 0.22 perf-stat.overall.dTLB-store-miss-rate%
1246 ? 8% +17.4% 1462 ? 3% perf-stat.overall.instructions-per-iTLB-miss
0.00 ? 2% -11.4% 0.00 perf-stat.overall.ipc
54.93 -3.7 51.19 ? 2% perf-stat.overall.node-load-miss-rate%
33.16 ? 11% +16.9 50.11 perf-stat.overall.node-store-miss-rate%
1.192e+08 -12.5% 1.043e+08 perf-stat.ps.branch-instructions
2173761 ? 3% -17.6% 1791060 perf-stat.ps.branch-misses
19774101 ? 3% -39.5% 11970134 perf-stat.ps.cache-misses
29613162 ? 4% -40.0% 17774173 perf-stat.ps.cache-references
135.25 -15.7% 113.97 perf-stat.ps.cpu-migrations
72513 ? 2% -24.4% 54824 ? 4% perf-stat.ps.dTLB-load-misses
1.649e+08 -10.1% 1.483e+08 perf-stat.ps.dTLB-loads
216266 -13.5% 186980 perf-stat.ps.dTLB-store-misses
91180718 -8.7% 83287491 perf-stat.ps.dTLB-stores
495096 ? 9% -25.2% 370277 ? 3% perf-stat.ps.iTLB-load-misses
6.121e+08 -11.6% 5.409e+08 perf-stat.ps.instructions
4183 -21.4% 3288 perf-stat.ps.minor-faults
396633 ? 2% -40.5% 236105 perf-stat.ps.node-load-misses
325540 ? 3% -30.8% 225344 ? 4% perf-stat.ps.node-loads
11391808 ? 9% -54.5% 5184100 perf-stat.ps.node-stores
4183 -21.4% 3288 perf-stat.ps.page-faults
5.551e+11 ? 4% +80.4% 1.001e+12 perf-stat.total.instructions
12.12 ? 49% -12.1 0.00 perf-profile.calltrace.cycles-pp.__cmd_record.cmd_record.cmd_sched.run_builtin.main
12.12 ? 49% -12.1 0.00 perf-profile.calltrace.cycles-pp.cmd_record.cmd_sched.run_builtin.main.__libc_start_main
12.12 ? 49% -12.1 0.00 perf-profile.calltrace.cycles-pp.cmd_sched.run_builtin.main.__libc_start_main
14.02 ? 39% -11.7 2.32 ? 88% perf-profile.calltrace.cycles-pp.main.__libc_start_main
14.02 ? 39% -11.7 2.32 ? 88% perf-profile.calltrace.cycles-pp.run_builtin.main.__libc_start_main
14.02 ? 39% -11.7 2.32 ? 88% perf-profile.calltrace.cycles-pp.__libc_start_main
12.52 ? 39% -10.9 1.60 ? 95% perf-profile.calltrace.cycles-pp.record__pushfn.perf_mmap__push.record__mmap_read_evlist.__cmd_record.cmd_record
10.90 ? 49% -10.9 0.00 perf-profile.calltrace.cycles-pp.record__mmap_read_evlist.__cmd_record.cmd_record.cmd_sched.run_builtin
10.85 ? 49% -10.9 0.00 perf-profile.calltrace.cycles-pp.perf_mmap__push.record__mmap_read_evlist.__cmd_record.cmd_record.cmd_sched
21.20 ? 41% -10.1 11.08 ? 18% perf-profile.calltrace.cycles-pp.record__finish_output.__cmd_record
21.20 ? 41% -10.1 11.08 ? 18% perf-profile.calltrace.cycles-pp.perf_session__process_events.record__finish_output.__cmd_record
21.16 ? 41% -10.1 11.04 ? 18% perf-profile.calltrace.cycles-pp.reader__read_event.perf_session__process_events.record__finish_output.__cmd_record
7.20 ? 32% -5.9 1.25 ?175% perf-profile.calltrace.cycles-pp.__ordered_events__flush.perf_session__process_user_event.reader__read_event.perf_session__process_events.record__finish_output
7.20 ? 32% -5.9 1.25 ?175% perf-profile.calltrace.cycles-pp.perf_session__process_user_event.reader__read_event.perf_session__process_events.record__finish_output.__cmd_record
7.02 ? 31% -5.8 1.22 ?176% perf-profile.calltrace.cycles-pp.perf_session__deliver_event.__ordered_events__flush.perf_session__process_user_event.reader__read_event.perf_session__process_events
1.88 ? 13% -0.7 1.18 ? 11% perf-profile.calltrace.cycles-pp.wait_for_lsr.serial8250_console_write.console_flush_all.console_unlock.vprintk_emit
1.89 ? 13% -0.6 1.24 ? 18% perf-profile.calltrace.cycles-pp.irq_work_run_list.irq_work_run.__sysvec_irq_work.sysvec_irq_work.asm_sysvec_irq_work
1.89 ? 13% -0.6 1.24 ? 18% perf-profile.calltrace.cycles-pp.irq_work_single.irq_work_run_list.irq_work_run.__sysvec_irq_work.sysvec_irq_work
1.89 ? 13% -0.6 1.24 ? 18% perf-profile.calltrace.cycles-pp._printk.irq_work_single.irq_work_run_list.irq_work_run.__sysvec_irq_work
1.89 ? 13% -0.6 1.24 ? 18% perf-profile.calltrace.cycles-pp.vprintk_emit._printk.irq_work_single.irq_work_run_list.irq_work_run
1.89 ? 13% -0.6 1.24 ? 18% perf-profile.calltrace.cycles-pp.console_unlock.vprintk_emit._printk.irq_work_single.irq_work_run_list
1.89 ? 13% -0.6 1.24 ? 18% perf-profile.calltrace.cycles-pp.console_flush_all.console_unlock.vprintk_emit._printk.irq_work_single
1.89 ? 13% -0.6 1.24 ? 18% perf-profile.calltrace.cycles-pp.serial8250_console_write.console_flush_all.console_unlock.vprintk_emit._printk
0.84 ? 22% -0.4 0.40 ? 71% perf-profile.calltrace.cycles-pp.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.71 ? 21% -0.4 0.29 ?101% perf-profile.calltrace.cycles-pp.task_work_run.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
0.76 ? 21% -0.4 0.40 ? 71% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close_nocancel
0.76 ? 21% -0.4 0.40 ? 71% perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close_nocancel
0.90 ? 18% -0.3 0.56 ? 45% perf-profile.calltrace.cycles-pp.__close_nocancel
0.86 ? 19% -0.3 0.54 ? 45% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__close_nocancel
0.84 ? 19% -0.3 0.54 ? 45% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close_nocancel
0.30 ?101% +0.7 1.04 ? 28% perf-profile.calltrace.cycles-pp.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
0.00 +0.7 0.74 ? 11% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read.readn.evsel__read_counter
0.30 ?101% +0.7 1.04 ? 28% perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
0.00 +0.7 0.74 ? 11% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_read.readn.evsel__read_counter.read_counters
0.00 +0.8 0.77 ? 11% perf-profile.calltrace.cycles-pp.__libc_read.readn.evsel__read_counter.read_counters.process_interval
0.00 +0.8 0.77 ? 12% perf-profile.calltrace.cycles-pp.readn.evsel__read_counter.read_counters.process_interval.dispatch_events
0.00 +0.8 0.78 ? 15% perf-profile.calltrace.cycles-pp.__do_softirq.__irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
0.00 +0.8 0.80 ? 15% perf-profile.calltrace.cycles-pp.__irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
0.00 +0.8 0.80 ? 25% perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read.readn
0.00 +1.1 1.05 ? 22% perf-profile.calltrace.cycles-pp.__schedule.schedule.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread
0.00 +1.1 1.05 ? 22% perf-profile.calltrace.cycles-pp.schedule.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
0.00 +1.1 1.08 ? 23% perf-profile.calltrace.cycles-pp.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread.ret_from_fork
0.26 ?141% +1.1 1.36 ? 13% perf-profile.calltrace.cycles-pp.evsel__read_counter.read_counters.process_interval.dispatch_events.cmd_stat
0.00 +1.1 1.15 ? 24% perf-profile.calltrace.cycles-pp.rcu_gp_fqs_loop.rcu_gp_kthread.kthread.ret_from_fork
0.00 +1.4 1.37 ? 24% perf-profile.calltrace.cycles-pp.rcu_gp_kthread.kthread.ret_from_fork
5.16 ?124% +15.0 20.16 ? 44% perf-profile.calltrace.cycles-pp.record__pushfn.perf_mmap__push.record__mmap_read_evlist.__cmd_record
1.78 ? 42% +15.6 17.34 ? 9% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt
5.22 ?124% +15.6 20.79 ? 45% perf-profile.calltrace.cycles-pp.perf_mmap__push.record__mmap_read_evlist.__cmd_record
5.26 ?124% +15.8 21.07 ? 45% perf-profile.calltrace.cycles-pp.record__mmap_read_evlist.__cmd_record
14.02 ? 39% -12.2 1.82 ? 83% perf-profile.children.cycles-pp.cmd_record
12.12 ? 49% -12.1 0.00 perf-profile.children.cycles-pp.cmd_sched
14.02 ? 39% -11.6 2.41 ? 81% perf-profile.children.cycles-pp.main
14.02 ? 39% -11.6 2.41 ? 81% perf-profile.children.cycles-pp.run_builtin
14.02 ? 39% -11.6 2.42 ? 80% perf-profile.children.cycles-pp.__libc_start_main
21.20 ? 41% -10.1 11.08 ? 18% perf-profile.children.cycles-pp.record__finish_output
21.20 ? 41% -10.1 11.08 ? 18% perf-profile.children.cycles-pp.perf_session__process_events
21.16 ? 41% -10.1 11.04 ? 18% perf-profile.children.cycles-pp.reader__read_event
7.20 ? 32% -5.9 1.25 ?175% perf-profile.children.cycles-pp.__ordered_events__flush
7.20 ? 32% -5.9 1.25 ?175% perf-profile.children.cycles-pp.perf_session__process_user_event
8.07 ? 26% -5.6 2.52 ? 68% perf-profile.children.cycles-pp.perf_session__deliver_event
2.08 ? 45% -1.3 0.80 ? 83% perf-profile.children.cycles-pp.do_sys_poll
2.09 ? 45% -1.3 0.81 ? 82% perf-profile.children.cycles-pp.__poll
2.08 ? 45% -1.3 0.80 ? 83% perf-profile.children.cycles-pp.__x64_sys_poll
1.73 ? 45% -1.1 0.67 ? 82% perf-profile.children.cycles-pp.do_poll
2.38 ? 15% -1.0 1.37 ? 12% perf-profile.children.cycles-pp.wait_for_lsr
2.40 ? 14% -1.0 1.44 ? 16% perf-profile.children.cycles-pp.serial8250_console_write
2.41 ? 14% -1.0 1.45 ? 16% perf-profile.children.cycles-pp.console_flush_all
2.40 ? 14% -1.0 1.45 ? 16% perf-profile.children.cycles-pp.irq_work_run_list
2.41 ? 14% -1.0 1.46 ? 16% perf-profile.children.cycles-pp.console_unlock
2.40 ? 14% -1.0 1.45 ? 16% perf-profile.children.cycles-pp.asm_sysvec_irq_work
2.40 ? 14% -1.0 1.45 ? 16% perf-profile.children.cycles-pp.sysvec_irq_work
2.40 ? 14% -1.0 1.45 ? 16% perf-profile.children.cycles-pp.__sysvec_irq_work
2.40 ? 14% -1.0 1.45 ? 16% perf-profile.children.cycles-pp.irq_work_run
2.40 ? 14% -1.0 1.45 ? 16% perf-profile.children.cycles-pp.irq_work_single
2.40 ? 14% -1.0 1.45 ? 16% perf-profile.children.cycles-pp._printk
2.40 ? 14% -1.0 1.45 ? 16% perf-profile.children.cycles-pp.vprintk_emit
1.07 ? 30% -0.8 0.30 ? 94% perf-profile.children.cycles-pp.machine__process_mmap2_event
0.92 ? 31% -0.7 0.25 ? 91% perf-profile.children.cycles-pp.map__new
0.88 ? 36% -0.6 0.29 ? 52% perf-profile.children.cycles-pp.vsnprintf
0.73 ? 41% -0.5 0.22 ? 67% perf-profile.children.cycles-pp.seq_printf
0.56 ? 37% -0.4 0.14 ?119% perf-profile.children.cycles-pp.machine__findnew_vdso
0.90 ? 25% -0.4 0.50 ? 13% perf-profile.children.cycles-pp.rcu_do_batch
0.57 ? 23% -0.3 0.29 ? 21% perf-profile.children.cycles-pp.run_ksoftirqd
0.93 ? 16% -0.2 0.68 ? 14% perf-profile.children.cycles-pp.__close_nocancel
0.35 ? 42% -0.2 0.12 ? 76% perf-profile.children.cycles-pp.__libc_calloc
0.30 ? 19% -0.2 0.12 ? 19% perf-profile.children.cycles-pp.__slab_free
0.48 ? 17% -0.2 0.33 ? 36% perf-profile.children.cycles-pp.__fxstat64
0.34 ? 26% -0.1 0.21 ? 22% perf-profile.children.cycles-pp.single_release
0.36 ? 20% -0.1 0.24 ? 23% perf-profile.children.cycles-pp.lock_vma_under_rcu
0.18 ? 34% -0.1 0.06 ?106% perf-profile.children.cycles-pp.number
0.14 ? 31% -0.1 0.03 ?100% perf-profile.children.cycles-pp.flush_memcg_stats_dwork
0.14 ? 33% -0.1 0.03 ?100% perf-profile.children.cycles-pp.cgroup_rstat_flush_locked
0.14 ? 33% -0.1 0.03 ?100% perf-profile.children.cycles-pp.__mem_cgroup_flush_stats
0.14 ? 33% -0.1 0.03 ?100% perf-profile.children.cycles-pp.cgroup_rstat_flush_irqsafe
0.44 ? 11% -0.1 0.35 ? 10% perf-profile.children.cycles-pp.__list_del_entry_valid
0.24 ? 25% -0.1 0.16 ? 10% perf-profile.children.cycles-pp.__kmem_cache_free
0.29 ? 16% -0.1 0.21 ? 25% perf-profile.children.cycles-pp.cgroup_rstat_updated
0.02 ?144% +0.1 0.08 ? 29% perf-profile.children.cycles-pp.activate_task
0.04 ?102% +0.1 0.10 ? 19% perf-profile.children.cycles-pp.__perf_event_read
0.11 ? 31% +0.1 0.18 ? 18% perf-profile.children.cycles-pp.__send_signal_locked
0.05 ? 71% +0.1 0.12 ? 26% perf-profile.children.cycles-pp.__x64_sys_ioctl
0.08 ? 34% +0.1 0.16 ? 15% perf-profile.children.cycles-pp.complete_signal
0.06 ? 51% +0.1 0.14 ? 46% perf-profile.children.cycles-pp.ioctl
0.06 ? 51% +0.1 0.14 ? 50% perf-profile.children.cycles-pp.perf_evsel__disable_cpu
0.01 ?223% +0.1 0.10 ? 53% perf-profile.children.cycles-pp.__get_task_ioprio
0.05 ? 73% +0.1 0.15 ? 51% perf-profile.children.cycles-pp.perf_evsel__run_ioctl
0.06 ? 75% +0.1 0.16 ? 5% perf-profile.children.cycles-pp.__kmalloc
0.09 ? 55% +0.1 0.21 ? 13% perf-profile.children.cycles-pp.perf_event_read
0.03 ?144% +0.2 0.18 ? 9% perf-profile.children.cycles-pp.__perf_read_group_add
0.04 ?105% +0.2 0.21 ? 29% perf-profile.children.cycles-pp.swake_up_one
0.03 ?103% +0.2 0.22 ? 23% perf-profile.children.cycles-pp.rcu_report_qs_rdp
0.01 ?223% +0.2 0.22 ? 25% perf-profile.children.cycles-pp.detach_tasks
0.21 ? 36% +0.2 0.42 ? 17% perf-profile.children.cycles-pp.__evlist__disable
0.53 ? 21% +0.2 0.76 ? 8% perf-profile.children.cycles-pp.__orc_find
0.08 ? 84% +0.2 0.31 ? 12% perf-profile.children.cycles-pp.evlist__id2evsel
0.35 ? 33% +0.3 0.62 ? 28% perf-profile.children.cycles-pp.apparmor_file_permission
0.42 ? 28% +0.4 0.78 ? 30% perf-profile.children.cycles-pp.security_file_permission
0.39 ? 42% +0.4 0.79 ? 12% perf-profile.children.cycles-pp.perf_read
0.93 ? 19% +0.6 1.49 ? 11% perf-profile.children.cycles-pp.dequeue_entity
0.06 ? 80% +0.6 0.63 ? 24% perf-profile.children.cycles-pp.rebalance_domains
1.08 ? 21% +0.6 1.66 ? 12% perf-profile.children.cycles-pp.dequeue_task_fair
1.55 ? 12% +0.6 2.12 ? 10% perf-profile.children.cycles-pp.unwind_next_frame
0.78 ? 21% +0.6 1.35 ? 19% perf-profile.children.cycles-pp.perf_trace_sched_switch
0.67 ? 29% +0.6 1.26 ? 12% perf-profile.children.cycles-pp.__irq_exit_rcu
1.76 ? 17% +0.6 2.36 ? 6% perf-profile.children.cycles-pp.perf_trace_sched_stat_runtime
1.79 ? 12% +0.6 2.42 ? 10% perf-profile.children.cycles-pp.perf_callchain_kernel
0.68 ? 35% +0.6 1.33 ? 9% perf-profile.children.cycles-pp.readn
2.11 ? 14% +0.7 2.78 ? 7% perf-profile.children.cycles-pp.update_curr
2.23 ? 15% +0.8 3.01 ? 11% perf-profile.children.cycles-pp.get_perf_callchain
2.29 ? 15% +0.8 3.08 ? 11% perf-profile.children.cycles-pp.perf_callchain
2.49 ? 16% +0.8 3.33 ? 10% perf-profile.children.cycles-pp.perf_prepare_sample
0.50 ? 45% +0.9 1.37 ? 13% perf-profile.children.cycles-pp.evsel__read_counter
2.95 ? 16% +0.9 3.87 ? 10% perf-profile.children.cycles-pp.perf_event_output_forward
2.99 ? 16% +1.0 3.95 ? 10% perf-profile.children.cycles-pp.__perf_event_overflow
0.14 ? 71% +1.0 1.16 ? 25% perf-profile.children.cycles-pp.schedule_timeout
0.07 ?101% +1.1 1.15 ? 24% perf-profile.children.cycles-pp.rcu_gp_fqs_loop
3.25 ? 15% +1.1 4.32 ? 10% perf-profile.children.cycles-pp.perf_tp_event
2.68 ? 19% +1.2 3.92 ? 14% perf-profile.children.cycles-pp.cmd_stat
2.54 ? 20% +1.2 3.79 ? 16% perf-profile.children.cycles-pp.read_counters
2.66 ? 19% +1.3 3.92 ? 14% perf-profile.children.cycles-pp.dispatch_events
2.64 ? 19% +1.3 3.90 ? 14% perf-profile.children.cycles-pp.process_interval
0.09 ? 99% +1.3 1.37 ? 24% perf-profile.children.cycles-pp.rcu_gp_kthread
2.73 ? 17% +1.7 4.39 ? 16% perf-profile.children.cycles-pp.schedule
3.41 ? 14% +1.9 5.33 ? 16% perf-profile.children.cycles-pp.__schedule
3.34 ? 7% +15.6 18.99 ? 9% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
2.20 ? 16% -0.9 1.28 ? 16% perf-profile.self.cycles-pp.io_serial_in
0.52 ? 17% -0.2 0.28 ? 21% perf-profile.self.cycles-pp.__alloc_pages
0.26 ? 24% -0.2 0.06 ? 53% perf-profile.self.cycles-pp.delay_tsc
0.32 ? 48% -0.2 0.12 ? 78% perf-profile.self.cycles-pp.__libc_calloc
0.37 ? 44% -0.2 0.19 ? 52% perf-profile.self.cycles-pp.mutex_lock
0.30 ? 20% -0.2 0.12 ? 19% perf-profile.self.cycles-pp.__slab_free
0.44 ? 11% -0.1 0.35 ? 10% perf-profile.self.cycles-pp.__list_del_entry_valid
0.05 ? 83% +0.1 0.12 ? 19% perf-profile.self.cycles-pp.dequeue_task_fair
0.00 +0.1 0.07 ? 18% perf-profile.self.cycles-pp.__perf_read_group_add
0.09 ? 27% +0.1 0.17 ? 24% perf-profile.self.cycles-pp.update_rq_clock
0.03 ?101% +0.1 0.12 ? 31% perf-profile.self.cycles-pp.evlist_cpu_iterator__next
0.00 +0.1 0.09 ? 46% perf-profile.self.cycles-pp.__get_task_ioprio
0.06 ? 52% +0.1 0.17 ? 37% perf-profile.self.cycles-pp.security_file_permission
0.09 ? 81% +0.2 0.28 ? 23% perf-profile.self.cycles-pp.evsel__read_counter
0.21 ? 33% +0.2 0.42 ? 37% perf-profile.self.cycles-pp.apparmor_file_permission
0.53 ? 21% +0.2 0.76 ? 8% perf-profile.self.cycles-pp.__orc_find
0.08 ? 83% +0.2 0.31 ? 12% perf-profile.self.cycles-pp.evlist__id2evsel
0.19 ? 64% +15.1 15.31 ? 8% perf-profile.self.cycles-pp.asm_sysvec_apic_timer_interrupt


***************************************************************************************************
lkp-spr-r02: 224 threads 2 sockets (Sapphire Rapids) with 256G memory
=========================================================================================
compiler/cpufreq_governor/iterations/kconfig/rootfs/tbox_group/test/testcase:
gcc-11/performance/4x/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/lkp-spr-r02/numa02_SMT/autonuma-benchmark

commit:
ef6a22b70f ("sched/numa: apply the scan delay to every new vma")
fc137c0dda ("sched/numa: enhance vma scanning logic")

ef6a22b70f6d9044 fc137c0ddab29b591db6a091dc6
---------------- ---------------------------
%stddev %change %stddev
\ | \
191.58 ? 2% +39.3% 266.89 autonuma-benchmark.numa01.seconds
8.06 +0.2% 8.08 autonuma-benchmark.numa02.seconds
6.41 ? 12% +8.1% 6.93 autonuma-benchmark.numa02_SMT.seconds
829.64 ? 2% +36.4% 1131 autonuma-benchmark.time.elapsed_time
829.64 ? 2% +36.4% 1131 autonuma-benchmark.time.elapsed_time.max
549857 ? 4% +38.7% 762520 autonuma-benchmark.time.involuntary_context_switches
20703 -1.9% 20305 autonuma-benchmark.time.percent_of_cpu_this_job_got
2041 +2.4% 2091 autonuma-benchmark.time.system_time
169755 ? 3% +34.1% 227625 autonuma-benchmark.time.user_time
36362 ? 7% -61.8% 13887 autonuma-benchmark.time.voluntary_context_switches
1.172e+10 ? 8% +72.4% 2.021e+10 ? 2% cpuidle..time
12059145 ? 8% +72.0% 20740506 ? 2% cpuidle..usage
1469 +16.9% 1717 meminfo.Mlocked
219033 ? 7% +41.2% 309205 ? 28% meminfo.Shmem
1421566 ? 3% +38.5% 1968458 ? 10% numa-numastat.node0.local_node
1844276 ? 6% +29.3% 2384098 ? 7% numa-numastat.node0.numa_hit
887.93 ? 2% +34.0% 1189 uptime.boot
23259 ? 4% +37.1% 31889 uptime.idle
2533 ? 2% -4.2% 2426 vmstat.system.cs
232791 -1.7% 228729 vmstat.system.in
6.30 ? 10% +1.7 8.00 ? 2% mpstat.cpu.all.idle%
0.02 -0.0 0.00 mpstat.cpu.all.iowait%
1.05 -0.1 0.96 mpstat.cpu.all.irq%
0.07 +0.0 0.10 mpstat.cpu.all.soft%
1.27 ? 2% -0.3 0.97 mpstat.cpu.all.sys%
4988 ? 6% +55.2% 7744 ? 10% numa-meminfo.node0.Active
4924 ? 7% +57.3% 7744 ? 10% numa-meminfo.node0.Active(anon)
68570 ? 29% +75.8% 120551 ? 34% numa-meminfo.node0.Mapped
840.67 ? 70% +94.2% 1632 ? 7% numa-meminfo.node0.Mlocked
10013 ? 6% +37.3% 13744 ? 5% numa-meminfo.node0.Shmem
208910 ? 8% +41.7% 295980 ? 29% numa-meminfo.node1.Shmem
1231 ? 7% +57.1% 1934 ? 10% numa-vmstat.node0.nr_active_anon
17238 ? 29% +77.0% 30511 ? 34% numa-vmstat.node0.nr_mapped
210.67 ? 70% +93.0% 406.67 ? 7% numa-vmstat.node0.nr_mlock
2503 ? 6% +37.2% 3434 ? 5% numa-vmstat.node0.nr_shmem
1231 ? 7% +57.1% 1934 ? 10% numa-vmstat.node0.nr_zone_active_anon
1844152 ? 6% +29.2% 2382388 ? 7% numa-vmstat.node0.numa_hit
1421442 ? 3% +38.4% 1966749 ? 10% numa-vmstat.node0.numa_local
52210 ? 8% +42.3% 74312 ? 29% numa-vmstat.node1.nr_shmem
5516135 ? 8% +73.2% 9554162 ? 2% turbostat.C1E
2.89 ? 10% +0.8 3.67 ? 2% turbostat.C1E%
6357647 ? 8% +73.1% 11003447 ? 2% turbostat.C6
3.38 ? 11% +0.9 4.27 ? 2% turbostat.C6%
6.47 ? 10% +25.7% 8.13 ? 2% turbostat.CPU%c1
1.964e+08 ? 2% +31.3% 2.58e+08 turbostat.IRQ
666.93 -3.9% 640.93 turbostat.PkgWatt
71.58 -30.3% 49.92 turbostat.RAMWatt
30776 ? 3% +5.8% 32575 ? 3% proc-vmstat.nr_active_anon
1525450 +2.1% 1557803 proc-vmstat.nr_anon_pages
2907 +2.6% 2982 proc-vmstat.nr_anon_transparent_hugepages
760615 +3.0% 783336 ? 2% proc-vmstat.nr_file_pages
1549400 +3.4% 1602815 proc-vmstat.nr_inactive_anon
366.67 +16.6% 427.67 proc-vmstat.nr_mlock
4181 +3.4% 4322 proc-vmstat.nr_page_table_pages
54864 ? 7% +41.4% 77580 ? 28% proc-vmstat.nr_shmem
30776 ? 3% +5.8% 32575 ? 3% proc-vmstat.nr_zone_active_anon
1549399 +3.4% 1602815 proc-vmstat.nr_zone_inactive_anon
3979410 +14.1% 4539497 proc-vmstat.numa_hit
80851 ? 3% -97.6% 1967 ? 6% proc-vmstat.numa_huge_pte_updates
3261402 +17.0% 3816708 proc-vmstat.numa_local
7983190 -94.9% 405403 ? 8% proc-vmstat.numa_pages_migrated
41654608 ? 2% -96.9% 1294611 ? 4% proc-vmstat.numa_pte_updates
2.339e+08 -4.1% 2.243e+08 proc-vmstat.pgalloc_normal
4206095 +16.6% 4903734 proc-vmstat.pgfault
2.338e+08 -4.2% 2.24e+08 proc-vmstat.pgfree
7983190 -94.9% 405403 ? 8% proc-vmstat.pgmigrate_success
160314 ? 2% +22.5% 196345 proc-vmstat.pgreuse
15546 -95.0% 778.00 ? 8% proc-vmstat.thp_migration_success
6675456 ? 2% +32.3% 8834816 proc-vmstat.unevictable_pgs_scanned
1398244 +20.6% 1686461 proc-vmstat.vma_lock_abort
1691 ? 8% -100.0% 0.00 proc-vmstat.vma_lock_retry
1495234 +2.7% 1535106 proc-vmstat.vma_lock_success
18882 ? 35% +385.5% 91667 ? 24% sched_debug.cfs_rq:/.MIN_vruntime.avg
4227647 ? 35% +382.2% 20387480 ? 23% sched_debug.cfs_rq:/.MIN_vruntime.max
281894 ? 35% +383.8% 1363727 ? 24% sched_debug.cfs_rq:/.MIN_vruntime.stddev
8.84 ? 23% -27.1% 6.44 ? 7% sched_debug.cfs_rq:/.load_avg.avg
495.77 ? 85% -69.3% 152.19 ? 15% sched_debug.cfs_rq:/.load_avg.max
41.39 ? 71% -59.5% 16.77 ? 16% sched_debug.cfs_rq:/.load_avg.stddev
18882 ? 35% +385.5% 91667 ? 24% sched_debug.cfs_rq:/.max_vruntime.avg
4227648 ? 35% +382.2% 20387480 ? 23% sched_debug.cfs_rq:/.max_vruntime.max
281894 ? 35% +383.8% 1363727 ? 24% sched_debug.cfs_rq:/.max_vruntime.stddev
99585977 ? 5% +41.4% 1.408e+08 ? 2% sched_debug.cfs_rq:/.min_vruntime.avg
1.022e+08 ? 4% +41.5% 1.446e+08 ? 2% sched_debug.cfs_rq:/.min_vruntime.max
80885546 ? 2% +36.8% 1.107e+08 ? 10% sched_debug.cfs_rq:/.min_vruntime.min
2263985 ? 3% +36.3% 3086252 ? 17% sched_debug.cfs_rq:/.min_vruntime.stddev
384.30 ?113% -85.5% 55.89 ? 2% sched_debug.cfs_rq:/.removed.load_avg.max
42.99 ? 11% -31.4% 29.51 ? 5% sched_debug.cfs_rq:/.removed.runnable_avg.max
42.97 ? 11% -31.3% 29.51 ? 5% sched_debug.cfs_rq:/.removed.util_avg.max
7991223 ? 11% +32.6% 10600334 ? 16% sched_debug.cfs_rq:/.spread0.max
-13413285 +75.5% -23533995 sched_debug.cfs_rq:/.spread0.min
2251246 ? 3% +35.7% 3054867 ? 17% sched_debug.cfs_rq:/.spread0.stddev
488.32 ? 6% -98.4% 7.59 ? 94% sched_debug.cfs_rq:/.util_est_enqueued.avg
1316 ? 3% -65.2% 458.48 ? 36% sched_debug.cfs_rq:/.util_est_enqueued.max
312.92 ? 6% -85.7% 44.71 ? 56% sched_debug.cfs_rq:/.util_est_enqueued.stddev
1427145 ? 2% +30.6% 1864124 ? 2% sched_debug.cpu.avg_idle.avg
376380 ? 20% +136.3% 889390 ? 3% sched_debug.cpu.avg_idle.min
439922 ? 3% +36.5% 600395 ? 2% sched_debug.cpu.clock.avg
440560 ? 3% +36.5% 601438 ? 2% sched_debug.cpu.clock.max
439196 ? 3% +36.4% 599201 ? 2% sched_debug.cpu.clock.min
384.40 ? 8% +67.5% 643.80 sched_debug.cpu.clock.stddev
435381 ? 3% +36.5% 594324 ? 2% sched_debug.cpu.clock_task.avg
436669 ? 3% +36.5% 596141 ? 2% sched_debug.cpu.clock_task.max
407687 ? 3% +38.7% 565411 ? 2% sched_debug.cpu.clock_task.min
1972 +12.1% 2211 sched_debug.cpu.clock_task.stddev
15199 ? 2% +25.9% 19133 ? 2% sched_debug.cpu.curr->pid.avg
19055 ? 2% +21.8% 23203 ? 2% sched_debug.cpu.curr->pid.max
1263 ? 18% +30.2% 1644 ? 16% sched_debug.cpu.curr->pid.stddev
0.00 ? 7% +67.7% 0.00 sched_debug.cpu.next_balance.stddev
5438 +27.6% 6938 ? 2% sched_debug.cpu.nr_switches.avg
1684 ? 2% +19.6% 2014 ? 4% sched_debug.cpu.nr_switches.min
439184 ? 3% +36.4% 599183 ? 2% sched_debug.cpu_clk
434534 ? 3% +36.8% 594518 ? 2% sched_debug.ktime
440279 ? 3% +36.3% 600265 ? 2% sched_debug.sched_clk
47.32 ? 5% -34.1% 31.20 ? 2% perf-stat.i.MPKI
3.453e+08 -9.5% 3.125e+08 perf-stat.i.branch-instructions
1882849 -6.8% 1755021 perf-stat.i.branch-misses
29.52 -0.7 28.82 perf-stat.i.cache-miss-rate%
39865923 ? 3% -32.7% 26846308 ? 2% perf-stat.i.cache-misses
90451157 ? 3% -30.0% 63286362 perf-stat.i.cache-references
2483 ? 2% -4.2% 2377 perf-stat.i.context-switches
5.997e+11 -2.1% 5.869e+11 perf-stat.i.cpu-cycles
237.67 -8.7% 217.07 perf-stat.i.cpu-migrations
87198 ? 3% +44.5% 126003 ? 2% perf-stat.i.cycles-between-cache-misses
0.08 ? 3% +0.0 0.12 ? 10% perf-stat.i.dTLB-load-miss-rate%
275303 ? 5% +58.6% 436747 ? 11% perf-stat.i.dTLB-load-misses
4.474e+08 -7.7% 4.13e+08 perf-stat.i.dTLB-loads
0.56 -0.0 0.53 perf-stat.i.dTLB-store-miss-rate%
1032613 -6.0% 970353 perf-stat.i.dTLB-store-misses
1.959e+08 -2.9% 1.903e+08 perf-stat.i.dTLB-stores
1.675e+09 -8.4% 1.534e+09 perf-stat.i.instructions
2.68 -2.1% 2.62 perf-stat.i.metric.GHz
1498 -1.5% 1476 perf-stat.i.metric.K/sec
3.31 -12.8% 2.89 perf-stat.i.metric.M/sec
4947 -13.5% 4277 perf-stat.i.minor-faults
45.78 ? 2% -8.0 37.82 ? 7% perf-stat.i.node-load-miss-rate%
808466 -53.6% 375110 ? 3% perf-stat.i.node-load-misses
984274 ? 4% -22.8% 760195 ? 11% perf-stat.i.node-loads
4947 -13.5% 4278 perf-stat.i.page-faults
51.07 ? 3% -25.2% 38.19 perf-stat.overall.MPKI
0.55 +0.0 0.56 perf-stat.overall.branch-miss-rate%
42.92 -1.4 41.52 ? 2% perf-stat.overall.cache-miss-rate%
369.25 +8.2% 399.67 perf-stat.overall.cpi
16877 ? 5% +49.5% 25239 ? 3% perf-stat.overall.cycles-between-cache-misses
0.06 ? 4% +0.0 0.10 ? 9% perf-stat.overall.dTLB-load-miss-rate%
0.53 -0.0 0.51 perf-stat.overall.dTLB-store-miss-rate%
0.00 -7.6% 0.00 perf-stat.overall.ipc
42.61 -10.8 31.85 ? 7% perf-stat.overall.node-load-miss-rate%
3.361e+08 -10.5% 3.009e+08 perf-stat.ps.branch-instructions
1845829 -8.4% 1690701 perf-stat.ps.branch-misses
35855895 ? 4% -34.3% 23550666 ? 3% perf-stat.ps.cache-misses
83542707 ? 4% -32.1% 56718514 perf-stat.ps.cache-references
2469 ? 2% -4.1% 2368 perf-stat.ps.context-switches
6.037e+11 -1.7% 5.936e+11 perf-stat.ps.cpu-cycles
235.31 -9.1% 213.98 perf-stat.ps.cpu-migrations
262765 ? 5% +53.5% 403343 ? 9% perf-stat.ps.dTLB-load-misses
4.376e+08 -8.4% 4.008e+08 perf-stat.ps.dTLB-loads
1034201 -6.0% 972631 perf-stat.ps.dTLB-store-misses
1.941e+08 -2.8% 1.886e+08 perf-stat.ps.dTLB-stores
1.635e+09 -9.2% 1.485e+09 perf-stat.ps.instructions
4761 -13.8% 4104 perf-stat.ps.minor-faults
801748 -54.5% 364866 ? 3% perf-stat.ps.node-load-misses
1080815 ? 4% -27.3% 785292 ? 8% perf-stat.ps.node-loads
4761 -13.8% 4104 perf-stat.ps.page-faults
1.358e+12 +23.8% 1.681e+12 perf-stat.total.instructions
16.32 ? 53% -16.3 0.00 perf-profile.calltrace.cycles-pp.__cmd_record.cmd_record.cmd_sched.run_builtin.main
16.32 ? 53% -16.3 0.00 perf-profile.calltrace.cycles-pp.cmd_record.cmd_sched.run_builtin.main.__libc_start_main
16.32 ? 53% -16.3 0.00 perf-profile.calltrace.cycles-pp.cmd_sched.run_builtin.main.__libc_start_main
14.04 ? 45% -14.0 0.00 perf-profile.calltrace.cycles-pp.record__mmap_read_evlist.__cmd_record.cmd_record.cmd_sched.run_builtin
14.02 ? 45% -14.0 0.00 perf-profile.calltrace.cycles-pp.perf_mmap__push.record__mmap_read_evlist.__cmd_record.cmd_record.cmd_sched
15.72 ? 49% -13.9 1.80 ? 7% perf-profile.calltrace.cycles-pp.asm_exc_page_fault
15.58 ? 49% -13.9 1.70 ? 7% perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault
15.56 ? 49% -13.9 1.69 ? 7% perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
15.18 ? 50% -13.7 1.52 ? 8% perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
17.01 ? 44% -11.7 5.27 ? 4% perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
11.57 ? 65% -11.6 0.00 perf-profile.calltrace.cycles-pp.do_huge_pmd_numa_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
11.37 ? 65% -11.4 0.00 perf-profile.calltrace.cycles-pp.migrate_misplaced_page.do_huge_pmd_numa_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
11.36 ? 65% -11.4 0.00 perf-profile.calltrace.cycles-pp.migrate_pages.migrate_misplaced_page.do_huge_pmd_numa_page.__handle_mm_fault.handle_mm_fault
11.36 ? 65% -11.4 0.00 perf-profile.calltrace.cycles-pp.migrate_pages_batch.migrate_pages.migrate_misplaced_page.do_huge_pmd_numa_page.__handle_mm_fault
10.69 ? 65% -10.7 0.00 perf-profile.calltrace.cycles-pp.move_to_new_folio.migrate_pages_batch.migrate_pages.migrate_misplaced_page.do_huge_pmd_numa_page
10.69 ? 65% -10.7 0.00 perf-profile.calltrace.cycles-pp.migrate_folio_extra.move_to_new_folio.migrate_pages_batch.migrate_pages.migrate_misplaced_page
10.67 ? 66% -10.7 0.00 perf-profile.calltrace.cycles-pp.folio_copy.migrate_folio_extra.move_to_new_folio.migrate_pages_batch.migrate_pages
10.52 ? 65% -10.5 0.00 perf-profile.calltrace.cycles-pp.copy_page.folio_copy.migrate_folio_extra.move_to_new_folio.migrate_pages_batch
8.81 ? 15% -3.5 5.30 perf-profile.calltrace.cycles-pp.read
8.70 ? 15% -3.5 5.24 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
8.71 ? 15% -3.5 5.25 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.read
8.57 ? 15% -3.4 5.18 perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
8.50 ? 15% -3.4 5.12 perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
5.64 ? 15% -2.7 2.91 ? 4% perf-profile.calltrace.cycles-pp.seq_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
5.60 ? 15% -2.7 2.90 ? 4% perf-profile.calltrace.cycles-pp.seq_read_iter.seq_read.vfs_read.ksys_read.do_syscall_64
4.62 ? 16% -2.5 2.09 ? 13% perf-profile.calltrace.cycles-pp.do_filp_open.do_sys_openat2.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe
4.60 ? 16% -2.5 2.09 ? 13% perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_sys_openat2.__x64_sys_openat.do_syscall_64
4.27 ? 16% -2.3 1.97 ? 6% perf-profile.calltrace.cycles-pp.proc_single_show.seq_read_iter.seq_read.vfs_read.ksys_read
5.22 ? 14% -1.9 3.34 ? 5% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
5.20 ? 14% -1.9 3.33 ? 5% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
4.14 ? 15% -1.8 2.32 ? 12% perf-profile.calltrace.cycles-pp.open64
3.77 ? 17% -1.8 1.95 ? 6% perf-profile.calltrace.cycles-pp.do_task_stat.proc_single_show.seq_read_iter.seq_read.vfs_read
4.08 ? 16% -1.8 2.28 ? 12% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.open64
4.07 ? 15% -1.8 2.27 ? 12% perf-profile.calltrace.cycles-pp.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe.open64
4.09 ? 16% -1.8 2.28 ? 12% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.open64
4.05 ? 15% -1.8 2.25 ? 12% perf-profile.calltrace.cycles-pp.do_sys_openat2.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe.open64
2.48 ? 11% -1.6 0.83 ? 21% perf-profile.calltrace.cycles-pp.vfs_fstatat.__do_sys_newstat.do_syscall_64.entry_SYSCALL_64_after_hwframe.__xstat64
2.37 ? 12% -1.6 0.75 ? 21% perf-profile.calltrace.cycles-pp.vfs_statx.vfs_fstatat.__do_sys_newstat.do_syscall_64.entry_SYSCALL_64_after_hwframe
5.56 ? 10% -1.5 4.04 ? 6% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.execve
5.56 ? 10% -1.5 4.04 ? 6% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.execve
5.56 ? 10% -1.5 4.04 ? 6% perf-profile.calltrace.cycles-pp.__x64_sys_execve.do_syscall_64.entry_SYSCALL_64_after_hwframe.execve
5.55 ? 10% -1.5 4.03 ? 6% perf-profile.calltrace.cycles-pp.do_execveat_common.__x64_sys_execve.do_syscall_64.entry_SYSCALL_64_after_hwframe.execve
5.56 ? 10% -1.5 4.04 ? 6% perf-profile.calltrace.cycles-pp.execve
2.23 ? 13% -1.5 0.73 ? 22% perf-profile.calltrace.cycles-pp.filename_lookup.vfs_statx.vfs_fstatat.__do_sys_newstat.do_syscall_64
2.21 ? 13% -1.5 0.73 ? 22% perf-profile.calltrace.cycles-pp.path_lookupat.filename_lookup.vfs_statx.vfs_fstatat.__do_sys_newstat
3.92 ? 9% -1.0 2.88 ? 7% perf-profile.calltrace.cycles-pp.bprm_execve.do_execveat_common.__x64_sys_execve.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.59 ? 8% -0.9 0.68 ? 75% perf-profile.calltrace.cycles-pp.__poll.__cmd_record.cmd_record.run_builtin.main
1.58 ? 8% -0.9 0.68 ? 75% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__poll.__cmd_record.cmd_record.run_builtin
1.58 ? 8% -0.9 0.68 ? 75% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll.__cmd_record.cmd_record
2.44 ? 9% -0.9 1.54 ? 6% perf-profile.calltrace.cycles-pp.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.44 ? 9% -0.9 1.54 ? 6% perf-profile.calltrace.cycles-pp.do_group_exit.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.57 ? 8% -0.9 0.68 ? 75% perf-profile.calltrace.cycles-pp.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll.__cmd_record
1.57 ? 8% -0.9 0.68 ? 75% perf-profile.calltrace.cycles-pp.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
2.43 ? 9% -0.9 1.54 ? 6% perf-profile.calltrace.cycles-pp.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.70 ? 10% -0.8 1.91 ? 6% perf-profile.calltrace.cycles-pp.load_elf_binary.search_binary_handler.exec_binprm.bprm_execve.do_execveat_common
2.71 ? 10% -0.8 1.92 ? 7% perf-profile.calltrace.cycles-pp.search_binary_handler.exec_binprm.bprm_execve.do_execveat_common.__x64_sys_execve
2.71 ? 10% -0.8 1.93 ? 7% perf-profile.calltrace.cycles-pp.exec_binprm.bprm_execve.do_execveat_common.__x64_sys_execve.do_syscall_64
1.80 ? 10% -0.7 1.06 ? 5% perf-profile.calltrace.cycles-pp.exit_mmap.__mmput.exit_mm.do_exit.do_group_exit
1.82 ? 10% -0.7 1.08 ? 5% perf-profile.calltrace.cycles-pp.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64
1.81 ? 10% -0.7 1.07 ? 6% perf-profile.calltrace.cycles-pp.__mmput.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group
1.27 ? 8% -0.7 0.54 ? 75% perf-profile.calltrace.cycles-pp.do_poll.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.29 ? 6% -0.6 1.67 ? 18% perf-profile.calltrace.cycles-pp.smpboot_thread_fn.kthread.ret_from_fork
1.31 ? 13% -0.6 0.76 ? 7% perf-profile.calltrace.cycles-pp.do_open.path_openat.do_filp_open.do_sys_openat2.__x64_sys_openat
0.94 ? 3% -0.5 0.43 ? 74% perf-profile.calltrace.cycles-pp.perf_poll.do_poll.do_sys_poll.__x64_sys_poll.do_syscall_64
1.35 ? 18% -0.5 0.86 ? 20% perf-profile.calltrace.cycles-pp.__xstat64
1.32 ? 18% -0.5 0.84 ? 21% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__xstat64
1.32 ? 18% -0.5 0.84 ? 21% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__xstat64
1.29 ? 18% -0.5 0.81 ? 22% perf-profile.calltrace.cycles-pp.cpu_stopper_thread.smpboot_thread_fn.kthread.ret_from_fork
1.31 ? 17% -0.5 0.83 ? 21% perf-profile.calltrace.cycles-pp.__do_sys_newstat.do_syscall_64.entry_SYSCALL_64_after_hwframe.__xstat64
1.02 ? 15% -0.4 0.62 ? 6% perf-profile.calltrace.cycles-pp.do_dentry_open.do_open.path_openat.do_filp_open.do_sys_openat2
1.13 ? 14% -0.4 0.76 ? 5% perf-profile.calltrace.cycles-pp.alloc_bprm.do_execveat_common.__x64_sys_execve.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.74 ? 13% -0.4 0.37 ? 71% perf-profile.calltrace.cycles-pp.pcpu_alloc.__percpu_counter_init.mm_init.alloc_bprm.do_execveat_common
1.11 ? 12% -0.4 0.74 ? 19% perf-profile.calltrace.cycles-pp.seq_read_iter.proc_reg_read_iter.vfs_read.ksys_read.do_syscall_64
1.11 ? 12% -0.4 0.74 ? 19% perf-profile.calltrace.cycles-pp.proc_reg_read_iter.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.07 ? 14% -0.4 0.71 ? 6% perf-profile.calltrace.cycles-pp.mm_init.alloc_bprm.do_execveat_common.__x64_sys_execve.do_syscall_64
1.01 ? 14% -0.3 0.68 ? 6% perf-profile.calltrace.cycles-pp.__percpu_counter_init.mm_init.alloc_bprm.do_execveat_common.__x64_sys_execve
1.02 ? 4% -0.3 0.69 ? 7% perf-profile.calltrace.cycles-pp.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
0.68 ? 8% -0.3 0.36 ? 70% perf-profile.calltrace.cycles-pp.__mmput.exec_mmap.begin_new_exec.load_elf_binary.search_binary_handler
0.68 ? 9% -0.3 0.36 ? 70% perf-profile.calltrace.cycles-pp.exit_mmap.__mmput.exec_mmap.begin_new_exec.load_elf_binary
1.27 ? 11% -0.3 0.97 ? 9% perf-profile.calltrace.cycles-pp.begin_new_exec.load_elf_binary.search_binary_handler.exec_binprm.bprm_execve
1.10 ? 17% -0.3 0.80 ? 22% perf-profile.calltrace.cycles-pp.migration_cpu_stop.cpu_stopper_thread.smpboot_thread_fn.kthread.ret_from_fork
1.18 ? 14% -0.3 0.89 ? 13% perf-profile.calltrace.cycles-pp.__vfork
1.10 ? 10% -0.3 0.82 ? 5% perf-profile.calltrace.cycles-pp.exec_mmap.begin_new_exec.load_elf_binary.search_binary_handler.exec_binprm
1.11 ? 3% -0.3 0.84 ? 8% perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
1.11 ? 3% -0.3 0.84 ? 8% perf-profile.calltrace.cycles-pp.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
1.14 ? 15% -0.3 0.87 ? 14% perf-profile.calltrace.cycles-pp.kernel_clone.__x64_sys_vfork.do_syscall_64.entry_SYSCALL_64_after_hwframe.__vfork
1.15 ? 15% -0.3 0.88 ? 14% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__vfork
1.14 ? 15% -0.3 0.88 ? 14% perf-profile.calltrace.cycles-pp.__x64_sys_vfork.do_syscall_64.entry_SYSCALL_64_after_hwframe.__vfork
1.15 ? 14% -0.3 0.89 ? 13% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__vfork
1.05 ? 7% -0.2 0.86 ? 6% perf-profile.calltrace.cycles-pp.sched_exec.bprm_execve.do_execveat_common.__x64_sys_execve.do_syscall_64
0.92 ? 6% -0.1 0.78 ? 5% perf-profile.calltrace.cycles-pp.find_idlest_cpu.select_task_rq_fair.sched_exec.bprm_execve.do_execveat_common
0.92 ? 6% -0.1 0.78 ? 5% perf-profile.calltrace.cycles-pp.select_task_rq_fair.sched_exec.bprm_execve.do_execveat_common.__x64_sys_execve
0.87 ? 6% -0.1 0.74 ? 5% perf-profile.calltrace.cycles-pp.update_sg_wakeup_stats.find_idlest_group.find_idlest_cpu.select_task_rq_fair.sched_exec
0.89 ? 5% -0.1 0.76 ? 5% perf-profile.calltrace.cycles-pp.find_idlest_group.find_idlest_cpu.select_task_rq_fair.sched_exec.bprm_execve
0.17 ?141% +0.4 0.61 ? 6% perf-profile.calltrace.cycles-pp.get_perf_callchain.perf_callchain.perf_prepare_sample.perf_event_output_forward.__perf_event_overflow
0.17 ?141% +0.4 0.61 ? 6% perf-profile.calltrace.cycles-pp.perf_callchain.perf_prepare_sample.perf_event_output_forward.__perf_event_overflow.perf_tp_event
0.57 ? 4% +0.5 1.03 ? 14% perf-profile.calltrace.cycles-pp.perf_mmap_fault.__do_fault.do_read_fault.do_fault.__handle_mm_fault
0.64 ? 7% +0.5 1.12 ? 15% perf-profile.calltrace.cycles-pp.__do_fault.do_read_fault.do_fault.__handle_mm_fault.handle_mm_fault
1.76 ? 8% +0.5 2.26 ? 7% perf-profile.calltrace.cycles-pp.do_read_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
0.18 ?141% +0.5 0.68 ? 6% perf-profile.calltrace.cycles-pp.perf_prepare_sample.perf_event_output_forward.__perf_event_overflow.perf_tp_event.perf_trace_sched_stat_runtime
0.22 ?141% +0.6 0.78 ? 7% perf-profile.calltrace.cycles-pp.perf_event_output_forward.__perf_event_overflow.perf_tp_event.perf_trace_sched_stat_runtime.update_curr
0.00 +0.6 0.57 ? 7% perf-profile.calltrace.cycles-pp.rmqueue_bulk.rmqueue.get_page_from_freelist.__alloc_pages.__folio_alloc
0.00 +0.6 0.59 ? 7% perf-profile.calltrace.cycles-pp.perf_callchain_kernel.get_perf_callchain.perf_callchain.perf_prepare_sample.perf_event_output_forward
0.00 +0.6 0.59 ? 13% perf-profile.calltrace.cycles-pp.finish_fault.do_read_fault.do_fault.__handle_mm_fault.handle_mm_fault
0.00 +0.6 0.60 ? 10% perf-profile.calltrace.cycles-pp.filemap_get_entry.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.__generic_file_write_iter
0.00 +0.6 0.63 ? 13% perf-profile.calltrace.cycles-pp.__perf_sw_event.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.fault_in_readable
0.00 +0.8 0.75 ? 32% perf-profile.calltrace.cycles-pp.find_busiest_group.load_balance.rebalance_domains.__do_softirq.__irq_exit_rcu
0.00 +0.8 0.81 ? 6% perf-profile.calltrace.cycles-pp.__perf_event_overflow.perf_tp_event.perf_trace_sched_stat_runtime.update_curr.dequeue_entity
0.00 +0.8 0.82 ? 27% perf-profile.calltrace.cycles-pp.load_balance.rebalance_domains.__do_softirq.__irq_exit_rcu.sysvec_apic_timer_interrupt
0.00 +0.8 0.82 ? 28% perf-profile.calltrace.cycles-pp.rebalance_domains.__do_softirq.__irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
0.00 +0.9 0.85 ? 6% perf-profile.calltrace.cycles-pp.perf_tp_event.perf_trace_sched_stat_runtime.update_curr.dequeue_entity.dequeue_task_fair
0.00 +0.9 0.89 ? 6% perf-profile.calltrace.cycles-pp.perf_trace_sched_stat_runtime.update_curr.dequeue_entity.dequeue_task_fair.__schedule
0.18 ?141% +0.9 1.08 ? 4% perf-profile.calltrace.cycles-pp.find_vma.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.fault_in_readable
0.18 ?141% +0.9 1.08 ? 4% perf-profile.calltrace.cycles-pp.mt_find.find_vma.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
1.98 ? 7% +0.9 2.88 ? 7% perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
0.00 +0.9 0.91 ? 27% perf-profile.calltrace.cycles-pp.evlist__id2evsel.evsel__read_counter.read_counters.process_interval.dispatch_events
0.00 +1.0 0.98 ? 18% perf-profile.calltrace.cycles-pp.__do_softirq.__irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
0.00 +1.0 0.98 ? 18% perf-profile.calltrace.cycles-pp.__irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
0.00 +1.0 1.01 ? 6% perf-profile.calltrace.cycles-pp.update_curr.dequeue_entity.dequeue_task_fair.__schedule.schedule
0.00 +1.0 1.05 ? 7% perf-profile.calltrace.cycles-pp.dequeue_entity.dequeue_task_fair.__schedule.schedule.schedule_timeout
0.00 +1.1 1.07 ? 7% perf-profile.calltrace.cycles-pp.dequeue_task_fair.__schedule.schedule.schedule_timeout.rcu_gp_fqs_loop
0.80 ? 21% +1.1 1.88 ? 11% perf-profile.calltrace.cycles-pp.__alloc_pages.__folio_alloc.vma_alloc_folio.shmem_alloc_folio.shmem_alloc_and_acct_folio
0.80 ? 21% +1.1 1.88 ? 11% perf-profile.calltrace.cycles-pp.__folio_alloc.vma_alloc_folio.shmem_alloc_folio.shmem_alloc_and_acct_folio.shmem_get_folio_gfp
0.42 ? 72% +1.1 1.52 ? 12% perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages.__folio_alloc.vma_alloc_folio.shmem_alloc_folio
0.18 ?141% +1.2 1.39 ? 11% perf-profile.calltrace.cycles-pp.rmqueue.get_page_from_freelist.__alloc_pages.__folio_alloc.vma_alloc_folio
5.51 ? 8% +1.3 6.78 ? 13% perf-profile.calltrace.cycles-pp.ret_from_fork
5.51 ? 8% +1.3 6.78 ? 13% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork
0.58 ? 70% +1.3 1.87 ? 28% perf-profile.calltrace.cycles-pp.evsel__read_counter.read_counters.process_interval.dispatch_events.cmd_stat
0.00 +1.5 1.49 ? 7% perf-profile.calltrace.cycles-pp.__schedule.schedule.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread
0.00 +1.5 1.50 ? 6% perf-profile.calltrace.cycles-pp.schedule.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
0.00 +1.5 1.54 ? 53% perf-profile.calltrace.cycles-pp.__pthread_disable_asynccancel.writen.record__pushfn.perf_mmap__push.record__mmap_read_evlist
0.00 +1.6 1.58 ? 4% perf-profile.calltrace.cycles-pp.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread.ret_from_fork
0.00 +1.7 1.75 ? 52% perf-profile.calltrace.cycles-pp.ring_buffer_read_head.perf_mmap__read_head.perf_mmap__push.record__mmap_read_evlist.__cmd_record
0.00 +1.8 1.75 ? 52% perf-profile.calltrace.cycles-pp.perf_mmap__read_head.perf_mmap__push.record__mmap_read_evlist.__cmd_record.cmd_record
0.00 +1.9 1.91 ? 3% perf-profile.calltrace.cycles-pp.rcu_gp_fqs_loop.rcu_gp_kthread.kthread.ret_from_fork
0.90 ? 20% +2.0 2.94 ? 6% perf-profile.calltrace.cycles-pp.vma_alloc_folio.shmem_alloc_folio.shmem_alloc_and_acct_folio.shmem_get_folio_gfp.shmem_write_begin
0.93 ? 18% +2.1 3.05 ? 7% perf-profile.calltrace.cycles-pp.shmem_alloc_folio.shmem_alloc_and_acct_folio.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write
1.46 ? 9% +2.2 3.61 ? 7% perf-profile.calltrace.cycles-pp.shmem_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.vfs_write
0.00 +2.2 2.23 ? 3% perf-profile.calltrace.cycles-pp.rcu_gp_kthread.kthread.ret_from_fork
2.63 ? 12% +2.4 4.99 ? 2% perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.fault_in_readable
2.05 ? 29% +2.9 4.97 ? 3% perf-profile.calltrace.cycles-pp.shmem_alloc_and_acct_folio.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.__generic_file_write_iter
4.09 ? 3% +3.9 8.01 perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.fault_in_readable.fault_in_iov_iter_readable
4.16 ? 3% +4.0 8.15 perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.fault_in_readable.fault_in_iov_iter_readable.generic_perform_write
4.77 ? 11% +4.0 8.78 ? 3% perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
4.86 ? 11% +4.1 8.99 ? 3% perf-profile.calltrace.cycles-pp.shmem_write_begin.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.vfs_write
5.19 ? 8% +4.5 9.68 perf-profile.calltrace.cycles-pp.asm_exc_page_fault.fault_in_readable.fault_in_iov_iter_readable.generic_perform_write.__generic_file_write_iter
6.87 ? 6% +7.1 13.95 ? 5% perf-profile.calltrace.cycles-pp.fault_in_iov_iter_readable.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.vfs_write
6.38 ? 8% +7.7 14.12 ? 4% perf-profile.calltrace.cycles-pp.fault_in_readable.fault_in_iov_iter_readable.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
9.62 ? 8% +11.6 21.24 ? 6% perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyin.copy_page_from_iter_atomic.generic_perform_write.__generic_file_write_iter
9.67 ? 8% +11.8 21.50 ? 6% perf-profile.calltrace.cycles-pp.copyin.copy_page_from_iter_atomic.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
9.78 ? 8% +12.0 21.78 ? 6% perf-profile.calltrace.cycles-pp.copy_page_from_iter_atomic.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.vfs_write
9.55 ? 13% +13.0 22.53 ? 30% perf-profile.calltrace.cycles-pp.cmd_record.run_builtin.main.__libc_start_main
9.54 ? 13% +13.0 22.53 ? 30% perf-profile.calltrace.cycles-pp.__cmd_record.cmd_record.run_builtin.main.__libc_start_main
7.33 ? 15% +13.2 20.53 ? 32% perf-profile.calltrace.cycles-pp.perf_mmap__push.record__mmap_read_evlist.__cmd_record.cmd_record.run_builtin
7.87 ? 15% +13.8 21.70 ? 33% perf-profile.calltrace.cycles-pp.record__mmap_read_evlist.__cmd_record.cmd_record.run_builtin.main
23.97 ? 7% +25.5 49.46 ? 3% perf-profile.calltrace.cycles-pp.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.vfs_write.ksys_write
24.04 ? 7% +25.7 49.71 ? 3% perf-profile.calltrace.cycles-pp.__generic_file_write_iter.generic_file_write_iter.vfs_write.ksys_write.do_syscall_64
24.12 ? 7% +25.9 49.98 ? 3% perf-profile.calltrace.cycles-pp.generic_file_write_iter.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
24.34 ? 7% +26.2 50.56 ? 3% perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
24.46 ? 7% +26.4 50.86 ? 3% perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write.writen
24.57 ? 7% +26.5 51.08 ? 3% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write.writen.record__pushfn
24.58 ? 7% +26.5 51.10 ? 3% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_write.writen.record__pushfn.perf_mmap__push
24.66 ? 7% +26.6 51.29 ? 3% perf-profile.calltrace.cycles-pp.__libc_write.writen.record__pushfn.perf_mmap__push.record__mmap_read_evlist
24.91 ? 7% +28.1 52.99 ? 4% perf-profile.calltrace.cycles-pp.writen.record__pushfn.perf_mmap__push.record__mmap_read_evlist.__cmd_record
6.71 ? 95% +29.8 36.47 ? 11% perf-profile.calltrace.cycles-pp.__cmd_record
4.58 ?118% +31.1 35.66 ? 10% perf-profile.calltrace.cycles-pp.record__pushfn.perf_mmap__push.record__mmap_read_evlist.__cmd_record
4.67 ?117% +31.2 35.86 ? 10% perf-profile.calltrace.cycles-pp.perf_mmap__push.record__mmap_read_evlist.__cmd_record
4.72 ?116% +31.2 35.97 ? 10% perf-profile.calltrace.cycles-pp.record__mmap_read_evlist.__cmd_record
16.32 ? 53% -16.3 0.00 perf-profile.children.cycles-pp.cmd_sched
18.49 ? 40% -12.4 6.13 ? 3% perf-profile.children.cycles-pp.__handle_mm_fault
19.38 ? 38% -12.0 7.43 ? 2% perf-profile.children.cycles-pp.handle_mm_fault
11.95 ? 64% -11.9 0.07 ?141% perf-profile.children.cycles-pp.migrate_misplaced_page
11.93 ? 64% -11.9 0.07 ?141% perf-profile.children.cycles-pp.migrate_pages
11.93 ? 64% -11.9 0.07 ?141% perf-profile.children.cycles-pp.migrate_pages_batch
11.66 ? 65% -11.6 0.06 ?141% perf-profile.children.cycles-pp.do_huge_pmd_numa_page
10.90 ? 64% -10.8 0.06 ?141% perf-profile.children.cycles-pp.move_to_new_folio
10.90 ? 64% -10.8 0.06 ?141% perf-profile.children.cycles-pp.migrate_folio_extra
11.40 ? 61% -10.8 0.57 ? 68% perf-profile.children.cycles-pp.copy_page
10.86 ? 64% -10.8 0.06 ?141% perf-profile.children.cycles-pp.folio_copy
21.02 ? 35% -10.5 10.55 perf-profile.children.cycles-pp.do_user_addr_fault
21.21 ? 35% -10.4 10.77 perf-profile.children.cycles-pp.exc_page_fault
22.22 ? 33% -10.3 11.95 perf-profile.children.cycles-pp.asm_exc_page_fault
8.84 ? 15% -3.5 5.31 perf-profile.children.cycles-pp.read
9.33 ? 14% -3.2 6.13 ? 5% perf-profile.children.cycles-pp.vfs_read
9.53 ? 13% -3.2 6.36 ? 6% perf-profile.children.cycles-pp.ksys_read
7.48 ? 14% -3.1 4.39 ? 3% perf-profile.children.cycles-pp.seq_read_iter
5.64 ? 15% -2.7 2.91 ? 4% perf-profile.children.cycles-pp.seq_read
5.39 ? 14% -2.4 2.99 ? 11% perf-profile.children.cycles-pp.do_sys_openat2
5.41 ? 14% -2.4 3.02 ? 11% perf-profile.children.cycles-pp.__x64_sys_openat
5.12 ? 15% -2.3 2.80 ? 12% perf-profile.children.cycles-pp.do_filp_open
5.11 ? 15% -2.3 2.79 ? 12% perf-profile.children.cycles-pp.path_openat
4.27 ? 16% -2.3 1.97 ? 6% perf-profile.children.cycles-pp.proc_single_show
3.78 ? 17% -1.8 1.96 ? 5% perf-profile.children.cycles-pp.do_task_stat
4.15 ? 15% -1.8 2.33 ? 12% perf-profile.children.cycles-pp.open64
2.56 ? 11% -1.7 0.86 ? 20% perf-profile.children.cycles-pp.__xstat64
2.57 ? 10% -1.6 0.93 ? 19% perf-profile.children.cycles-pp.__do_sys_newstat
2.56 ? 10% -1.6 0.92 ? 19% perf-profile.children.cycles-pp.vfs_fstatat
2.44 ? 12% -1.6 0.84 ? 21% perf-profile.children.cycles-pp.vfs_statx
2.49 ? 13% -1.6 0.94 ? 21% perf-profile.children.cycles-pp.filename_lookup
5.59 ? 10% -1.5 4.05 ? 6% perf-profile.children.cycles-pp.__x64_sys_execve
5.58 ? 10% -1.5 4.05 ? 6% perf-profile.children.cycles-pp.do_execveat_common
2.46 ? 13% -1.5 0.93 ? 21% perf-profile.children.cycles-pp.path_lookupat
5.56 ? 10% -1.5 4.04 ? 6% perf-profile.children.cycles-pp.execve
2.20 ? 14% -1.3 0.86 ? 9% perf-profile.children.cycles-pp.walk_component
2.12 ? 13% -1.2 0.94 ? 4% perf-profile.children.cycles-pp.link_path_walk
3.96 ? 9% -1.0 2.91 ? 7% perf-profile.children.cycles-pp.bprm_execve
2.17 ? 17% -1.0 1.13 ? 23% perf-profile.children.cycles-pp.lookup_fast
2.52 ? 9% -0.9 1.60 ? 6% perf-profile.children.cycles-pp.__x64_sys_exit_group
2.52 ? 9% -0.9 1.60 ? 6% perf-profile.children.cycles-pp.do_group_exit
2.51 ? 9% -0.9 1.59 ? 6% perf-profile.children.cycles-pp.do_exit
2.01 ? 2% -0.9 1.12 ? 57% perf-profile.children.cycles-pp.__poll
2.49 ? 10% -0.9 1.60 ? 5% perf-profile.children.cycles-pp.__mmput
1.99 ? 2% -0.9 1.11 ? 58% perf-profile.children.cycles-pp.do_sys_poll
2.47 ? 9% -0.9 1.59 ? 5% perf-profile.children.cycles-pp.exit_mmap
1.99 ? 2% -0.9 1.11 ? 58% perf-profile.children.cycles-pp.__x64_sys_poll
2.72 ? 10% -0.8 1.93 ? 7% perf-profile.children.cycles-pp.exec_binprm
1.92 ? 21% -0.8 1.13 ? 3% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
2.71 ? 10% -0.8 1.93 ? 7% perf-profile.children.cycles-pp.search_binary_handler
2.70 ? 10% -0.8 1.91 ? 6% perf-profile.children.cycles-pp.load_elf_binary
1.88 ? 21% -0.8 1.11 ? 3% perf-profile.children.cycles-pp.hrtimer_interrupt
1.69 ? 21% -0.7 0.95 ? 3% perf-profile.children.cycles-pp.__hrtimer_run_queues
1.82 ? 10% -0.7 1.09 ? 6% perf-profile.children.cycles-pp.exit_mm
1.61 ? 4% -0.7 0.89 ? 57% perf-profile.children.cycles-pp.do_poll
1.58 ? 10% -0.7 0.89 ? 6% perf-profile.children.cycles-pp.do_open
1.13 ? 12% -0.7 0.47 ? 21% perf-profile.children.cycles-pp.step_into
2.29 ? 6% -0.6 1.67 ? 18% perf-profile.children.cycles-pp.smpboot_thread_fn
1.16 ? 7% -0.6 0.54 ? 5% perf-profile.children.cycles-pp.filemap_map_pages
1.44 ? 21% -0.6 0.82 ? 5% perf-profile.children.cycles-pp.tick_sched_timer
1.41 ? 21% -0.6 0.80 ? 6% perf-profile.children.cycles-pp.tick_sched_handle
1.01 ? 14% -0.6 0.40 ? 23% perf-profile.children.cycles-pp.pick_link
1.39 ? 21% -0.6 0.79 ? 7% perf-profile.children.cycles-pp.update_process_times
2.36 ? 12% -0.6 1.76 ? 2% perf-profile.children.cycles-pp.exit_to_user_mode_loop
1.30 ? 10% -0.6 0.71 ? 10% perf-profile.children.cycles-pp.task_work_run
0.92 ? 8% -0.6 0.34 ? 3% perf-profile.children.cycles-pp.__fxstat64
0.91 ? 18% -0.6 0.33 ? 14% perf-profile.children.cycles-pp.try_charge_memcg
2.59 ? 11% -0.6 2.01 perf-profile.children.cycles-pp.exit_to_user_mode_prepare
1.14 ? 26% -0.6 0.57 ? 7% perf-profile.children.cycles-pp.task_tick_fair
0.82 ? 13% -0.6 0.25 ? 18% perf-profile.children.cycles-pp.__lookup_slow
0.65 ? 41% -0.6 0.08 ? 17% perf-profile.children.cycles-pp.flush_tlb_mm_range
1.34 ? 10% -0.6 0.78 ? 2% perf-profile.children.cycles-pp.unmap_vmas
1.31 ? 22% -0.6 0.75 ? 7% perf-profile.children.cycles-pp.scheduler_tick
1.21 ? 16% -0.6 0.66 ? 36% perf-profile.children.cycles-pp.open_last_lookups
1.18 ? 11% -0.5 0.65 perf-profile.children.cycles-pp.zap_pmd_range
1.16 ? 11% -0.5 0.63 perf-profile.children.cycles-pp.zap_pte_range
1.21 ? 10% -0.5 0.68 perf-profile.children.cycles-pp.unmap_page_range
0.90 ? 8% -0.5 0.39 ? 10% perf-profile.children.cycles-pp.pid_revalidate
0.56 ? 7% -0.5 0.06 ? 71% perf-profile.children.cycles-pp.newidle_balance
0.86 ? 3% -0.5 0.36 ? 11% perf-profile.children.cycles-pp.next_uptodate_page
1.16 ? 19% -0.5 0.66 ? 2% perf-profile.children.cycles-pp.kmem_cache_alloc
1.42 ? 14% -0.5 0.92 ? 9% perf-profile.children.cycles-pp.vm_mmap_pgoff
1.22 ? 12% -0.5 0.73 ? 6% perf-profile.children.cycles-pp.do_dentry_open
1.37 ? 14% -0.5 0.89 ? 9% perf-profile.children.cycles-pp.do_mmap
1.29 ? 18% -0.5 0.81 ? 22% perf-profile.children.cycles-pp.cpu_stopper_thread
1.28 ? 13% -0.5 0.81 ? 9% perf-profile.children.cycles-pp.mmap_region
0.85 ? 21% -0.4 0.40 ? 4% perf-profile.children.cycles-pp.__close_nocancel
1.39 ? 16% -0.4 0.94 ? 7% perf-profile.children.cycles-pp.__mem_cgroup_charge
0.90 ? 17% -0.4 0.46 ? 10% perf-profile.children.cycles-pp.getdents64
0.90 ? 17% -0.4 0.46 ? 10% perf-profile.children.cycles-pp.__x64_sys_getdents64
0.90 ? 17% -0.4 0.46 ? 10% perf-profile.children.cycles-pp.iterate_dir
0.89 ? 16% -0.4 0.45 ? 9% perf-profile.children.cycles-pp.proc_pid_readdir
1.45 ? 16% -0.4 1.01 ? 10% perf-profile.children.cycles-pp.mm_init
1.03 ? 26% -0.4 0.62 ? 39% perf-profile.children.cycles-pp.__d_lookup_rcu
1.37 ? 16% -0.4 0.96 ? 9% perf-profile.children.cycles-pp.__percpu_counter_init
1.34 ? 17% -0.4 0.94 ? 9% perf-profile.children.cycles-pp.pcpu_alloc
1.05 ? 15% -0.4 0.65 ? 5% perf-profile.children.cycles-pp.charge_memcg
0.73 -0.4 0.35 ? 21% perf-profile.children.cycles-pp.pick_next_task_fair
0.64 ? 22% -0.4 0.26 ? 7% perf-profile.children.cycles-pp.ptrace_may_access
0.69 ? 4% -0.4 0.31 ? 34% perf-profile.children.cycles-pp.wait_for_lsr
1.13 ? 14% -0.4 0.76 ? 5% perf-profile.children.cycles-pp.alloc_bprm
1.11 ? 12% -0.4 0.74 ? 19% perf-profile.children.cycles-pp.proc_reg_read_iter
0.70 ? 9% -0.4 0.34 ? 10% perf-profile.children.cycles-pp.tlb_finish_mmu
1.04 ? 9% -0.4 0.69 perf-profile.children.cycles-pp.finish_task_switch
1.16 ? 20% -0.4 0.80 ? 11% perf-profile.children.cycles-pp.syscall_exit_to_user_mode
0.62 ? 12% -0.4 0.27 ? 19% perf-profile.children.cycles-pp.inode_permission
0.95 ? 24% -0.3 0.60 ? 7% perf-profile.children.cycles-pp.rcu_do_batch
0.99 ? 14% -0.3 0.65 ? 13% perf-profile.children.cycles-pp.ksys_mmap_pgoff
2.00 ? 9% -0.3 1.67 ? 7% perf-profile.children.cycles-pp.select_task_rq_fair
1.13 ? 12% -0.3 0.80 ? 4% perf-profile.children.cycles-pp.__mmdrop
1.57 ? 10% -0.3 1.25 ? 5% perf-profile.children.cycles-pp.try_to_wake_up
0.91 ? 8% -0.3 0.59 ? 3% perf-profile.children.cycles-pp._raw_spin_lock
0.65 ? 18% -0.3 0.34 ? 7% perf-profile.children.cycles-pp.__fput
1.89 ? 8% -0.3 1.58 ? 6% perf-profile.children.cycles-pp.find_idlest_cpu
1.27 ? 11% -0.3 0.97 ? 9% perf-profile.children.cycles-pp.begin_new_exec
1.10 ? 17% -0.3 0.80 ? 22% perf-profile.children.cycles-pp.migration_cpu_stop
0.50 ? 18% -0.3 0.20 ? 18% perf-profile.children.cycles-pp.d_alloc_parallel
0.71 ? 25% -0.3 0.41 ? 7% perf-profile.children.cycles-pp.alloc_empty_file
0.83 ? 9% -0.3 0.54 ? 21% perf-profile.children.cycles-pp.serial8250_console_write
0.83 ? 9% -0.3 0.54 ? 21% perf-profile.children.cycles-pp.console_unlock
0.83 ? 9% -0.3 0.54 ? 21% perf-profile.children.cycles-pp.console_flush_all
1.18 ? 14% -0.3 0.89 ? 13% perf-profile.children.cycles-pp.__vfork
0.69 ? 26% -0.3 0.40 ? 7% perf-profile.children.cycles-pp.__alloc_file
0.59 ? 5% -0.3 0.30 ? 36% perf-profile.children.cycles-pp.__wait_for_common
0.83 ? 9% -0.3 0.55 ? 18% perf-profile.children.cycles-pp.asm_sysvec_irq_work
0.83 ? 9% -0.3 0.55 ? 18% perf-profile.children.cycles-pp.sysvec_irq_work
0.83 ? 9% -0.3 0.55 ? 18% perf-profile.children.cycles-pp.__sysvec_irq_work
0.83 ? 9% -0.3 0.55 ? 18% perf-profile.children.cycles-pp.irq_work_run
0.83 ? 9% -0.3 0.55 ? 18% perf-profile.children.cycles-pp.irq_work_run_list
0.83 ? 9% -0.3 0.55 ? 18% perf-profile.children.cycles-pp.irq_work_single
0.83 ? 9% -0.3 0.55 ? 18% perf-profile.children.cycles-pp._printk
0.83 ? 9% -0.3 0.55 ? 18% perf-profile.children.cycles-pp.vprintk_emit
0.59 ? 12% -0.3 0.31 ? 16% perf-profile.children.cycles-pp.kmem_cache_free
0.74 ? 21% -0.3 0.46 ? 20% perf-profile.children.cycles-pp.vmstat_start
1.10 ? 10% -0.3 0.82 ? 5% perf-profile.children.cycles-pp.exec_mmap
0.56 ? 12% -0.3 0.28 ? 2% perf-profile.children.cycles-pp.kmem_cache_alloc_lru
1.02 ? 13% -0.3 0.75 ? 3% perf-profile.children.cycles-pp.__percpu_counter_sum
0.49 ? 5% -0.3 0.22 ? 32% perf-profile.children.cycles-pp.change_prot_numa
1.14 ? 15% -0.3 0.88 ? 14% perf-profile.children.cycles-pp.__x64_sys_vfork
0.65 ? 11% -0.3 0.39 ? 6% perf-profile.children.cycles-pp.memcg_slab_post_alloc_hook
0.64 ? 19% -0.3 0.38 ? 4% perf-profile.children.cycles-pp.do_vmi_munmap
0.63 ? 19% -0.3 0.37 ? 6% perf-profile.children.cycles-pp.do_vmi_align_munmap
0.45 ? 16% -0.3 0.19 ? 17% perf-profile.children.cycles-pp.d_alloc
0.50 ? 21% -0.3 0.25 ? 13% perf-profile.children.cycles-pp.proc_task_name
0.41 ? 25% -0.2 0.16 ? 6% perf-profile.children.cycles-pp.security_ptrace_access_check
0.28 ? 17% -0.2 0.04 ? 71% perf-profile.children.cycles-pp.flush_tlb_func
0.50 ? 16% -0.2 0.26 ? 9% perf-profile.children.cycles-pp.proc_fill_cache
0.96 ? 13% -0.2 0.73 ? 14% perf-profile.children.cycles-pp.perf_trace_sched_wakeup_template
0.27 ? 14% -0.2 0.04 ? 70% perf-profile.children.cycles-pp.lookup_open
0.78 ? 7% -0.2 0.55 perf-profile.children.cycles-pp.wp_page_copy
1.73 ? 8% -0.2 1.50 ? 6% perf-profile.children.cycles-pp.update_sg_wakeup_stats
1.76 ? 8% -0.2 1.54 ? 6% perf-profile.children.cycles-pp.find_idlest_group
0.94 ? 12% -0.2 0.72 ? 5% perf-profile.children.cycles-pp.io_serial_in
1.12 ? 11% -0.2 0.90 ? 6% perf-profile.children.cycles-pp.wake_up_new_task
0.60 ? 19% -0.2 0.39 ? 18% perf-profile.children.cycles-pp.set_task_cpu
0.52 ? 15% -0.2 0.31 perf-profile.children.cycles-pp.elf_map
0.67 ? 21% -0.2 0.46 ? 25% perf-profile.children.cycles-pp.write
0.54 ? 18% -0.2 0.33 ? 17% perf-profile.children.cycles-pp.perf_trace_sched_migrate_task
0.49 ? 4% -0.2 0.28 ? 30% perf-profile.children.cycles-pp.task_numa_work
0.65 ? 14% -0.2 0.44 ? 7% perf-profile.children.cycles-pp.do_anonymous_page
0.48 ? 13% -0.2 0.28 ? 3% perf-profile.children.cycles-pp.setlocale
0.27 ? 33% -0.2 0.07 ? 18% perf-profile.children.cycles-pp.free_unref_page
0.56 ? 14% -0.2 0.36 ? 10% perf-profile.children.cycles-pp.free_pgtables
1.45 ? 3% -0.2 1.25 ? 8% perf-profile.children.cycles-pp.irqentry_exit_to_user_mode
0.30 ? 19% -0.2 0.10 ? 16% perf-profile.children.cycles-pp.___slab_alloc
0.44 ? 11% -0.2 0.25 ? 13% perf-profile.children.cycles-pp.__do_sys_newfstat
1.05 ? 7% -0.2 0.86 ? 6% perf-profile.children.cycles-pp.sched_exec
0.58 ? 23% -0.2 0.39 ? 7% perf-profile.children.cycles-pp.fold_vm_numa_events
0.23 ? 37% -0.2 0.04 ? 73% perf-profile.children.cycles-pp.free_unref_page_prepare
0.34 ? 17% -0.2 0.15 ? 6% perf-profile.children.cycles-pp.__d_alloc
0.79 ? 5% -0.2 0.60 ? 9% perf-profile.children.cycles-pp.__cond_resched
0.30 ? 8% -0.2 0.12 ? 21% perf-profile.children.cycles-pp.may_open
0.35 ? 10% -0.2 0.17 ? 16% perf-profile.children.cycles-pp.vfs_fstat
0.38 ? 30% -0.2 0.20 ? 16% perf-profile.children.cycles-pp.wq_worker_comm
0.29 ? 19% -0.2 0.11 perf-profile.children.cycles-pp.single_release
0.36 ? 20% -0.2 0.18 ? 21% perf-profile.children.cycles-pp.next_tgid
0.44 ? 16% -0.2 0.26 ? 8% perf-profile.children.cycles-pp.__open64_nocancel
0.37 ? 32% -0.2 0.20 ? 6% perf-profile.children.cycles-pp.__slab_free
0.36 ? 8% -0.2 0.19 ? 75% perf-profile.children.cycles-pp.poll_freewait
0.37 ? 18% -0.2 0.19 ? 6% perf-profile.children.cycles-pp.__d_lookup
0.33 ? 11% -0.2 0.16 ? 13% perf-profile.children.cycles-pp.dput
0.32 ? 18% -0.2 0.15 ? 14% perf-profile.children.cycles-pp.vsnprintf
0.43 ? 11% -0.2 0.27 ? 13% perf-profile.children.cycles-pp.mod_objcg_state
0.25 ? 17% -0.2 0.09 ? 24% perf-profile.children.cycles-pp.__lock_task_sighand
0.44 ? 6% -0.2 0.29 ? 8% perf-profile.children.cycles-pp.__mmap
0.25 ? 28% -0.2 0.09 ? 13% perf-profile.children.cycles-pp.apparmor_ptrace_access_check
0.41 ? 15% -0.2 0.25 ? 26% perf-profile.children.cycles-pp.cap_vm_enough_memory
0.23 ? 12% -0.2 0.08 ? 36% perf-profile.children.cycles-pp.generic_fillattr
0.20 ? 13% -0.2 0.04 ? 76% perf-profile.children.cycles-pp.shuffle_freelist
0.51 ? 16% -0.2 0.36 ? 18% perf-profile.children.cycles-pp.perf_callchain_user
0.22 ? 13% -0.2 0.07 ? 28% perf-profile.children.cycles-pp.allocate_slab
0.52 ? 11% -0.1 0.37 ? 7% perf-profile.children.cycles-pp.wait4
0.68 ? 4% -0.1 0.54 ? 5% perf-profile.children.cycles-pp.enqueue_task_fair
0.23 ? 17% -0.1 0.08 ? 5% perf-profile.children.cycles-pp.__kmem_cache_free
0.45 ? 3% -0.1 0.31 ? 5% perf-profile.children.cycles-pp._dl_addr
0.32 ? 19% -0.1 0.17 ? 9% perf-profile.children.cycles-pp.d_hash_and_lookup
0.23 ? 12% -0.1 0.09 ? 35% perf-profile.children.cycles-pp.vfs_getattr_nosec
0.25 ? 19% -0.1 0.11 ? 73% perf-profile.children.cycles-pp.add_wait_queue
0.23 ? 14% -0.1 0.09 ? 18% perf-profile.children.cycles-pp.sync_regs
0.25 ? 24% -0.1 0.11 ? 37% perf-profile.children.cycles-pp.pid_nr_ns
0.21 ? 10% -0.1 0.07 ? 28% perf-profile.children.cycles-pp.seq_printf
0.46 ? 11% -0.1 0.32 ? 10% perf-profile.children.cycles-pp.kernel_wait4
0.45 ? 11% -0.1 0.32 ? 9% perf-profile.children.cycles-pp.do_wait
0.31 ? 10% -0.1 0.17 ? 30% perf-profile.children.cycles-pp.put_prev_task_fair
0.46 ? 8% -0.1 0.33 ? 5% perf-profile.children.cycles-pp.__kmem_cache_alloc_node
0.32 ? 5% -0.1 0.19 ? 4% perf-profile.children.cycles-pp.getname_flags
0.28 ? 16% -0.1 0.15 ? 10% perf-profile.children.cycles-pp.__vm_munmap
0.32 ? 24% -0.1 0.20 ? 4% perf-profile.children.cycles-pp.__memset
0.39 ? 16% -0.1 0.27 ? 12% perf-profile.children.cycles-pp.__task_pid_nr_ns
0.22 ? 11% -0.1 0.10 ? 4% perf-profile.children.cycles-pp.vm_area_alloc
0.26 ? 18% -0.1 0.14 ? 12% perf-profile.children.cycles-pp.proc_pid_get_link
0.44 ? 13% -0.1 0.33 ? 6% perf-profile.children.cycles-pp.__get_user_pages
0.17 ? 7% -0.1 0.05 ? 74% perf-profile.children.cycles-pp.asm_sysvec_call_function_single
0.42 ? 6% -0.1 0.30 ? 5% perf-profile.children.cycles-pp.copy_mc_enhanced_fast_string
0.45 ? 13% -0.1 0.33 ? 7% perf-profile.children.cycles-pp.get_user_pages_remote
0.22 ? 11% -0.1 0.11 ? 8% perf-profile.children.cycles-pp.lock_vma_under_rcu
0.26 ? 13% -0.1 0.15 ? 17% perf-profile.children.cycles-pp.vma_interval_tree_remove
0.32 ? 10% -0.1 0.21 ? 5% perf-profile.children.cycles-pp.setup_arg_pages
0.22 ? 24% -0.1 0.12 ? 4% perf-profile.children.cycles-pp.task_dump_owner
0.20 ? 23% -0.1 0.10 ? 17% perf-profile.children.cycles-pp.__kmalloc_node
0.34 ? 7% -0.1 0.23 ? 24% perf-profile.children.cycles-pp.perf_event_mmap
0.33 ? 7% -0.1 0.22 ? 24% perf-profile.children.cycles-pp.perf_event_mmap_event
0.53 ? 4% -0.1 0.43 ? 4% perf-profile.children.cycles-pp.enqueue_entity
0.27 ? 17% -0.1 0.16 ? 5% perf-profile.children.cycles-pp.load_elf_interp
0.16 ? 16% -0.1 0.06 perf-profile.children.cycles-pp.__dentry_kill
0.29 ? 16% -0.1 0.19 ? 2% perf-profile.children.cycles-pp.perf_iterate_sb
0.32 ? 12% -0.1 0.22 ? 11% perf-profile.children.cycles-pp.single_open
0.40 ? 11% -0.1 0.30 ? 16% perf-profile.children.cycles-pp.put_prev_entity
0.23 ? 15% -0.1 0.13 ? 21% perf-profile.children.cycles-pp.drm_gem_vunmap
0.23 ? 15% -0.1 0.13 ? 21% perf-profile.children.cycles-pp.drm_gem_shmem_vunmap
0.39 ? 10% -0.1 0.29 ? 8% perf-profile.children.cycles-pp.tlb_batch_pages_flush
0.28 ? 16% -0.1 0.18 ? 13% perf-profile.children.cycles-pp.__x64_sys_readlink
0.23 ? 15% -0.1 0.14 ? 20% perf-profile.children.cycles-pp.drm_gem_vunmap_unlocked
0.27 ? 9% -0.1 0.18 ? 5% perf-profile.children.cycles-pp.shift_arg_pages
0.19 ? 24% -0.1 0.10 ? 9% perf-profile.children.cycles-pp.__x64_sys_close
0.17 ? 14% -0.1 0.08 ? 10% perf-profile.children.cycles-pp.page_remove_rmap
0.27 ? 16% -0.1 0.18 ? 13% perf-profile.children.cycles-pp.do_readlinkat
0.23 ? 14% -0.1 0.14 ? 15% perf-profile.children.cycles-pp.user_path_at_empty
0.35 ? 6% -0.1 0.26 ? 18% perf-profile.children.cycles-pp.diskstats_show
0.16 ? 25% -0.1 0.07 ? 12% perf-profile.children.cycles-pp.path_init
0.13 ? 9% -0.1 0.05 ? 70% perf-profile.children.cycles-pp.getenv
0.13 ? 9% -0.1 0.04 ? 73% perf-profile.children.cycles-pp.mas_alloc_nodes
0.13 ? 9% -0.1 0.04 ? 73% perf-profile.children.cycles-pp.mas_preallocate
0.17 ? 19% -0.1 0.09 ? 31% perf-profile.children.cycles-pp.do_open_execat
0.17 ? 17% -0.1 0.08 ? 31% perf-profile.children.cycles-pp.__ptrace_may_access
0.21 ? 12% -0.1 0.12 ? 16% perf-profile.children.cycles-pp.strncpy_from_user
0.21 ? 6% -0.1 0.13 ? 13% perf-profile.children.cycles-pp.unlink_file_vma
0.14 ? 26% -0.1 0.06 ? 14% perf-profile.children.cycles-pp.perf_mux_hrtimer_handler
0.20 ? 8% -0.1 0.12 ? 6% perf-profile.children.cycles-pp.unmap_region
0.36 ? 11% -0.1 0.28 ? 8% perf-profile.children.cycles-pp.release_pages
0.16 ? 36% -0.1 0.08 ? 22% perf-profile.children.cycles-pp.errseq_sample
0.11 ? 22% -0.1 0.03 ? 70% perf-profile.children.cycles-pp.perf_rotate_context
0.22 ? 9% -0.1 0.14 ? 28% perf-profile.children.cycles-pp.atime_needs_update
0.25 ? 6% -0.1 0.17 ? 7% perf-profile.children.cycles-pp.proc_pid_cmdline_read
0.17 ? 14% -0.1 0.10 ? 4% perf-profile.children.cycles-pp.mas_walk
0.22 ? 8% -0.1 0.15 ? 17% perf-profile.children.cycles-pp.kmalloc_trace
0.13 ? 9% -0.1 0.06 ? 8% perf-profile.children.cycles-pp.vmstat_shepherd
0.11 ? 16% -0.1 0.04 ? 71% perf-profile.children.cycles-pp.slab_show
0.12 ? 13% -0.1 0.05 ? 84% perf-profile.children.cycles-pp.free_pcppages_bulk
0.18 ? 16% -0.1 0.11 ? 26% perf-profile.children.cycles-pp.fput
0.15 ? 23% -0.1 0.08 ? 17% perf-profile.children.cycles-pp.filp_close
0.31 ? 7% -0.1 0.25 ? 8% perf-profile.children.cycles-pp.__check_object_size
0.13 ? 17% -0.1 0.07 ? 79% perf-profile.children.cycles-pp.free_unref_page_list
0.20 ? 12% -0.1 0.13 ? 19% perf-profile.children.cycles-pp.select_task_rq
0.10 ? 21% -0.1 0.03 ? 70% perf-profile.children.cycles-pp.touch_atime
0.16 ? 10% -0.1 0.10 ? 14% perf-profile.children.cycles-pp.__call_rcu_common
0.24 ? 3% -0.1 0.18 ? 17% perf-profile.children.cycles-pp.malloc
0.14 ? 9% -0.1 0.08 ? 22% perf-profile.children.cycles-pp.__x64_sys_munmap
0.10 ? 16% -0.1 0.04 ? 76% perf-profile.children.cycles-pp.get_pid_task
0.28 ? 3% -0.1 0.22 ? 18% perf-profile.children.cycles-pp.do_cow_fault
0.11 ? 12% -0.1 0.05 ? 72% perf-profile.children.cycles-pp.thread_group_cputime_adjusted
0.11 ? 15% -0.1 0.05 ? 71% perf-profile.children.cycles-pp.thread_group_cputime
0.17 ? 7% -0.1 0.11 ? 14% perf-profile.children.cycles-pp.seq_puts
0.14 ? 18% -0.1 0.08 ? 14% perf-profile.children.cycles-pp.move_page_tables
0.16 ? 7% -0.1 0.10 ? 19% perf-profile.children.cycles-pp.drm_gem_shmem_put_pages_locked
0.16 ? 7% -0.1 0.10 ? 19% perf-profile.children.cycles-pp.drm_gem_put_pages
0.10 ? 4% -0.1 0.04 ? 71% perf-profile.children.cycles-pp.strlen
0.15 ? 26% -0.1 0.10 ? 8% perf-profile.children.cycles-pp.node_read_vmstat
0.16 ? 15% -0.1 0.10 ? 4% perf-profile.children.cycles-pp.seq_open
0.18 ? 14% -0.1 0.13 ? 7% perf-profile.children.cycles-pp.unlink_anon_vmas
0.15 ? 11% -0.1 0.10 ? 12% perf-profile.children.cycles-pp.create_elf_tables
0.13 ? 16% -0.1 0.08 ? 20% perf-profile.children.cycles-pp.find_extend_vma
0.14 ? 14% -0.1 0.09 ? 5% perf-profile.children.cycles-pp.__access_remote_vm
0.09 ? 18% -0.1 0.04 ? 76% perf-profile.children.cycles-pp.d_path
0.26 ? 6% -0.0 0.21 ? 4% perf-profile.children.cycles-pp.copy_strings
0.11 ? 14% -0.0 0.06 ? 19% perf-profile.children.cycles-pp.stop_one_cpu
0.10 ? 12% -0.0 0.06 ? 71% perf-profile.children.cycles-pp.__list_add_valid
0.08 ? 14% -0.0 0.04 ? 71% perf-profile.children.cycles-pp.do_brk_flags
0.12 ? 20% -0.0 0.07 ? 11% perf-profile.children.cycles-pp.obj_cgroup_charge
0.14 ? 26% -0.0 0.10 ? 17% perf-profile.children.cycles-pp.get_zeroed_page
0.09 ? 18% -0.0 0.05 ? 72% perf-profile.children.cycles-pp._exit
0.11 ? 4% -0.0 0.06 ? 14% perf-profile.children.cycles-pp.part_stat_read_all
0.17 ? 7% -0.0 0.12 ? 10% perf-profile.children.cycles-pp.strnlen_user
0.27 ? 6% -0.0 0.23 ? 7% perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
0.08 ? 22% -0.0 0.04 ? 71% perf-profile.children.cycles-pp.__d_add
0.15 ? 12% -0.0 0.11 ? 19% perf-profile.children.cycles-pp.do_notify_parent
0.15 ? 9% -0.0 0.11 ? 8% perf-profile.children.cycles-pp.clear_page_erms
0.12 ? 13% -0.0 0.08 ? 11% perf-profile.children.cycles-pp.complete_signal
0.12 ? 4% -0.0 0.08 ? 10% perf-profile.children.cycles-pp.unmap_single_vma
0.09 ? 18% -0.0 0.06 ? 8% perf-profile.children.cycles-pp.__get_obj_cgroup_from_memcg
0.16 ? 10% -0.0 0.13 ? 13% perf-profile.children.cycles-pp.exit_notify
0.11 ? 8% -0.0 0.08 ? 30% perf-profile.children.cycles-pp.check_move_unevictable_pages
0.12 ? 13% -0.0 0.09 ? 9% perf-profile.children.cycles-pp.vma_complete
0.12 ? 6% -0.0 0.09 ? 10% perf-profile.children.cycles-pp.mas_split
0.13 ? 12% -0.0 0.10 ? 19% perf-profile.children.cycles-pp.free_pages_and_swap_cache
0.09 ? 13% -0.0 0.06 ? 7% perf-profile.children.cycles-pp.map_vdso
0.08 ? 14% -0.0 0.06 ? 8% perf-profile.children.cycles-pp.__install_special_mapping
0.12 ? 10% -0.0 0.09 ? 18% perf-profile.children.cycles-pp.free_swap_cache
0.09 ? 14% -0.0 0.06 ? 7% perf-profile.children.cycles-pp._copy_to_user
0.07 ? 17% -0.0 0.05 ? 8% perf-profile.children.cycles-pp.pgd_alloc
0.08 ? 5% -0.0 0.07 ? 7% perf-profile.children.cycles-pp.__queue_work
0.07 ? 6% +0.0 0.10 ? 9% perf-profile.children.cycles-pp.get_cpu_idle_time_us
0.07 ? 6% +0.0 0.10 ? 9% perf-profile.children.cycles-pp.get_idle_time
0.12 ? 11% +0.0 0.16 ? 3% perf-profile.children.cycles-pp.perf_swevent_event
0.02 ?141% +0.0 0.06 ? 8% perf-profile.children.cycles-pp.page_counter_try_charge
0.04 ? 71% +0.0 0.08 ? 10% perf-profile.children.cycles-pp.error_entry
0.07 ? 11% +0.0 0.12 ? 29% perf-profile.children.cycles-pp.__kmalloc
0.04 ? 73% +0.1 0.09 ? 15% perf-profile.children.cycles-pp.set_next_entity
0.09 ? 9% +0.1 0.15 ? 8% perf-profile.children.cycles-pp.swake_up_one
0.00 +0.1 0.06 ? 13% perf-profile.children.cycles-pp.perf_mmap__write_tail
0.00 +0.1 0.06 ? 23% perf-profile.children.cycles-pp._IO_file_fopen
0.00 +0.1 0.06 ? 7% perf-profile.children.cycles-pp.__mod_timer
0.13 ? 21% +0.1 0.20 ? 17% perf-profile.children.cycles-pp.folio_unlock
0.38 ? 10% +0.1 0.44 ? 8% perf-profile.children.cycles-pp.apparmor_file_permission
0.07 ? 23% +0.1 0.14 ? 10% perf-profile.children.cycles-pp.rcu_report_qs_rdp
0.07 ? 17% +0.1 0.14 ? 29% perf-profile.children.cycles-pp.xas_start
0.03 ? 70% +0.1 0.11 ? 19% perf-profile.children.cycles-pp.access_error
0.02 ?141% +0.1 0.09 ? 13% perf-profile.children.cycles-pp.perf_mmap__consume
0.06 ? 79% +0.1 0.14 ? 13% perf-profile.children.cycles-pp.p4d_offset
0.08 ? 14% +0.1 0.16 ? 17% perf-profile.children.cycles-pp.ktime_get_coarse_real_ts64
0.11 ? 11% +0.1 0.19 ? 23% perf-profile.children.cycles-pp.current_time
0.08 ? 27% +0.1 0.17 ? 29% perf-profile.children.cycles-pp.__perf_read_group_add
0.04 ? 71% +0.1 0.12 ? 10% perf-profile.children.cycles-pp.__get_vma_policy
0.17 ? 28% +0.1 0.26 ? 17% perf-profile.children.cycles-pp._raw_spin_lock_irq
0.05 ? 77% +0.1 0.15 ? 21% perf-profile.children.cycles-pp.shmem_is_huge
0.47 ? 12% +0.1 0.57 ? 9% perf-profile.children.cycles-pp.security_file_permission
0.30 ? 14% +0.1 0.40 ? 10% perf-profile.children.cycles-pp.__mod_node_page_state
0.05 ? 72% +0.1 0.15 ? 13% perf-profile.children.cycles-pp.down_read_trylock
0.20 ? 16% +0.1 0.30 ? 10% perf-profile.children.cycles-pp.page_add_file_rmap
0.43 ? 13% +0.1 0.54 ? 6% perf-profile.children.cycles-pp.__mod_lruvec_page_state
0.10 ? 32% +0.1 0.21 ? 8% perf-profile.children.cycles-pp.blk_cgroup_congested
0.11 ? 32% +0.1 0.22 ? 7% perf-profile.children.cycles-pp.__folio_throttle_swaprate
0.00 +0.1 0.11 ? 12% perf-profile.children.cycles-pp.folio_mark_dirty
0.03 ?141% +0.1 0.14 ? 40% perf-profile.children.cycles-pp.update_curr_rt
0.03 ?141% +0.1 0.14 ? 46% perf-profile.children.cycles-pp.task_tick_rt
0.34 ? 7% +0.1 0.46 ? 6% perf-profile.children.cycles-pp.folio_batch_move_lru
0.21 ? 21% +0.1 0.32 ? 17% perf-profile.children.cycles-pp.asm_sysvec_reschedule_ipi
0.31 ? 6% +0.1 0.43 ? 12% perf-profile.children.cycles-pp.folio_add_lru
0.05 ? 71% +0.1 0.17 ? 7% perf-profile.children.cycles-pp.xas_alloc
0.49 ? 13% +0.1 0.62 ? 6% perf-profile.children.cycles-pp.orc_find
0.00 +0.1 0.13 ? 13% perf-profile.children.cycles-pp.policy_node
0.00 +0.1 0.13 ? 43% perf-profile.children.cycles-pp.generic_write_check_limits
0.10 ? 27% +0.1 0.22 ? 9% perf-profile.children.cycles-pp.xas_store
0.00 +0.1 0.14 ? 13% perf-profile.children.cycles-pp._IO_fread
0.36 ? 12% +0.1 0.50 ? 11% perf-profile.children.cycles-pp.drm_fbdev_damage_blit_real
0.37 ? 11% +0.1 0.51 ? 2% perf-profile.children.cycles-pp.percpu_counter_add_batch
0.07 ? 23% +0.1 0.21 ? 9% perf-profile.children.cycles-pp.xas_create
0.00 +0.1 0.14 ? 36% perf-profile.children.cycles-pp.generic_write_checks
0.66 ? 12% +0.1 0.81 ? 5% perf-profile.children.cycles-pp.perf_trace_sched_switch
0.07 ? 11% +0.1 0.21 ? 36% perf-profile.children.cycles-pp.file_update_time
0.75 ? 7% +0.2 0.90 ? 4% perf-profile.children.cycles-pp.__unwind_start
0.32 ? 7% +0.2 0.48 perf-profile.children.cycles-pp.__fdget_pos
0.12 ? 16% +0.2 0.28 ? 33% perf-profile.children.cycles-pp.workingset_age_nonresident
0.10 ? 45% +0.2 0.27 ? 8% perf-profile.children.cycles-pp.inode_to_bdi
0.13 ? 10% +0.2 0.31 ? 29% perf-profile.children.cycles-pp.workingset_activation
0.21 ? 37% +0.2 0.41 ? 34% perf-profile.children.cycles-pp.__evlist__disable
0.60 ? 6% +0.2 0.79 perf-profile.children.cycles-pp.__orc_find
0.00 +0.2 0.21 ? 29% perf-profile.children.cycles-pp.rcu_gp_cleanup
0.00 +0.2 0.22 ? 20% perf-profile.children.cycles-pp.__snprintf_chk
0.00 +0.2 0.23 ? 9% perf-profile.children.cycles-pp.xas_nomem
2.67 ? 3% +0.2 2.92 ? 3% perf-profile.children.cycles-pp.update_curr
0.39 ? 9% +0.2 0.64 ? 3% perf-profile.children.cycles-pp.do_set_pte
0.17 ? 17% +0.2 0.42 ? 24% perf-profile.children.cycles-pp.folio_mark_accessed
0.41 ? 12% +0.3 0.67 ? 6% perf-profile.children.cycles-pp.__count_memcg_events
0.92 ? 10% +0.3 1.18 ? 5% perf-profile.children.cycles-pp.filemap_get_entry
0.00 +0.3 0.26 ? 39% perf-profile.children.cycles-pp.force_qs_rnp
0.15 ? 19% +0.3 0.42 ? 10% perf-profile.children.cycles-pp.balance_dirty_pages_ratelimited_flags
0.09 ? 21% +0.3 0.36 ? 8% perf-profile.children.cycles-pp.zero_user_segments
0.45 ? 15% +0.3 0.73 ? 5% perf-profile.children.cycles-pp.xas_load
0.22 ? 16% +0.3 0.51 ? 9% perf-profile.children.cycles-pp.__vm_enough_memory
1.52 ? 8% +0.3 1.82 ? 7% perf-profile.children.cycles-pp.__do_softirq
1.58 ? 15% +0.3 1.87 ? 4% perf-profile.children.cycles-pp.shmem_add_to_page_cache
0.35 +0.3 0.66 ? 12% perf-profile.children.cycles-pp.perf_mmap_to_page
2.34 ? 3% +0.3 2.67 ? 2% perf-profile.children.cycles-pp.perf_trace_sched_stat_runtime
2.18 ? 7% +0.4 2.53 perf-profile.children.cycles-pp.perf_callchain_kernel
1.97 ? 9% +0.4 2.34 ? 2% perf-profile.children.cycles-pp.unwind_next_frame
0.77 ? 9% +0.4 1.14 ? 6% perf-profile.children.cycles-pp.___perf_sw_event
1.04 ? 14% +0.4 1.42 ? 10% perf-profile.children.cycles-pp.__irq_exit_rcu
0.37 ? 15% +0.4 0.75 ? 2% perf-profile.children.cycles-pp.rmqueue_bulk
0.75 ? 10% +0.4 1.15 ? 5% perf-profile.children.cycles-pp.__perf_sw_event
3.86 ? 8% +0.4 4.26 ? 3% perf-profile.children.cycles-pp.__schedule
0.44 ? 10% +0.4 0.85 ? 3% perf-profile.children.cycles-pp.finish_fault
0.85 ? 4% +0.5 1.31 ? 9% perf-profile.children.cycles-pp.perf_mmap_fault
1.11 ? 3% +0.5 1.57 ? 10% perf-profile.children.cycles-pp.__do_fault
1.18 ? 4% +0.5 1.72 ? 13% perf-profile.children.cycles-pp.native_irq_return_iret
0.32 ? 45% +0.6 0.89 ? 26% perf-profile.children.cycles-pp.rebalance_domains
0.32 ? 31% +0.6 0.91 ? 28% perf-profile.children.cycles-pp.evlist__id2evsel
1.99 ? 11% +0.7 2.66 ? 2% perf-profile.children.cycles-pp.__alloc_pages
1.30 ? 12% +0.7 2.03 ? 3% perf-profile.children.cycles-pp.get_page_from_freelist
0.71 ? 13% +0.7 1.44 ? 7% perf-profile.children.cycles-pp.mt_find
1.55 ? 8% +0.7 2.29 ? 2% perf-profile.children.cycles-pp.__folio_alloc
2.98 ? 11% +0.8 3.73 ? 3% perf-profile.children.cycles-pp.schedule
0.66 ? 13% +0.8 1.43 ? 7% perf-profile.children.cycles-pp.find_vma
0.93 ? 15% +0.8 1.75 ? 3% perf-profile.children.cycles-pp.rmqueue
1.07 ? 10% +0.9 1.95 perf-profile.children.cycles-pp.dequeue_entity
1.16 ? 10% +0.9 2.05 ? 2% perf-profile.children.cycles-pp.dequeue_task_fair
0.74 ? 24% +1.1 1.87 ? 28% perf-profile.children.cycles-pp.evsel__read_counter
5.53 ? 8% +1.3 6.79 ? 13% perf-profile.children.cycles-pp.ret_from_fork
5.51 ? 8% +1.3 6.78 ? 13% perf-profile.children.cycles-pp.kthread
0.20 ? 44% +1.4 1.62 ? 49% perf-profile.children.cycles-pp.__pthread_disable_asynccancel
0.53 ? 20% +1.4 1.96 ? 43% perf-profile.children.cycles-pp.perf_mmap__read_head
0.53 ? 20% +1.4 1.97 ? 42% perf-profile.children.cycles-pp.ring_buffer_read_head
0.17 ? 33% +1.5 1.63 ? 4% perf-profile.children.cycles-pp.schedule_timeout
1.65 ? 11% +1.6 3.28 perf-profile.children.cycles-pp.vma_alloc_folio
0.13 ? 47% +1.8 1.91 ? 3% perf-profile.children.cycles-pp.rcu_gp_fqs_loop
1.40 ? 9% +1.8 3.22 perf-profile.children.cycles-pp.shmem_alloc_folio
1.62 ? 3% +2.0 3.62 ? 7% perf-profile.children.cycles-pp.shmem_write_end
0.16 ? 38% +2.1 2.23 ? 3% perf-profile.children.cycles-pp.rcu_gp_kthread
2.34 ? 11% +2.6 4.97 ? 3% perf-profile.children.cycles-pp.shmem_alloc_and_acct_folio
5.43 ? 10% +3.8 9.22 perf-profile.children.cycles-pp.shmem_get_folio_gfp
4.95 ? 11% +4.0 9.00 ? 2% perf-profile.children.cycles-pp.shmem_write_begin
6.97 ? 6% +6.9 13.84 ? 5% perf-profile.children.cycles-pp.fault_in_readable
7.00 ? 6% +7.0 13.97 ? 5% perf-profile.children.cycles-pp.fault_in_iov_iter_readable
10.06 ? 7% +11.6 21.68 ? 6% perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
9.84 ? 8% +11.7 21.52 ? 6% perf-profile.children.cycles-pp.copyin
9.94 ? 7% +11.8 21.79 ? 6% perf-profile.children.cycles-pp.copy_page_from_iter_atomic
64.07 ? 10% +12.2 76.28 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
63.99 ? 10% +12.2 76.21 perf-profile.children.cycles-pp.do_syscall_64
24.02 ? 7% +25.5 49.50 ? 3% perf-profile.children.cycles-pp.generic_perform_write
24.08 ? 7% +25.7 49.74 ? 3% perf-profile.children.cycles-pp.__generic_file_write_iter
24.15 ? 7% +25.9 50.01 ? 3% perf-profile.children.cycles-pp.generic_file_write_iter
24.90 ? 7% +26.1 51.01 ? 3% perf-profile.children.cycles-pp.vfs_write
25.04 ? 7% +26.3 51.30 ? 3% perf-profile.children.cycles-pp.ksys_write
32.57 ? 9% +26.4 59.00 ? 5% perf-profile.children.cycles-pp.__cmd_record
24.72 ? 7% +26.7 51.39 ? 3% perf-profile.children.cycles-pp.__libc_write
24.94 ? 7% +28.1 53.03 ? 4% perf-profile.children.cycles-pp.writen
25.22 ? 7% +28.7 53.97 ? 4% perf-profile.children.cycles-pp.record__pushfn
26.03 ? 6% +30.4 56.40 ? 6% perf-profile.children.cycles-pp.perf_mmap__push
26.64 ? 6% +31.0 57.67 ? 6% perf-profile.children.cycles-pp.record__mmap_read_evlist
11.30 ? 61% -10.7 0.57 ? 68% perf-profile.self.cycles-pp.copy_page
1.88 ? 16% -0.9 0.95 ? 7% perf-profile.self.cycles-pp.do_task_stat
0.80 ? 15% -0.6 0.24 ? 18% perf-profile.self.cycles-pp.try_charge_memcg
1.01 ? 26% -0.6 0.46 ? 8% perf-profile.self.cycles-pp.__d_lookup_rcu
0.85 ? 4% -0.5 0.36 ? 10% perf-profile.self.cycles-pp.next_uptodate_page
0.92 ? 9% -0.4 0.49 perf-profile.self.cycles-pp.zap_pte_range
0.54 -0.3 0.22 ? 23% perf-profile.self.cycles-pp.pid_revalidate
0.48 ? 10% -0.3 0.18 ? 31% perf-profile.self.cycles-pp.inode_permission
0.98 ? 13% -0.3 0.70 ? 4% perf-profile.self.cycles-pp.__percpu_counter_sum
0.52 ? 22% -0.2 0.28 ? 5% perf-profile.self.cycles-pp.kmem_cache_alloc
0.45 ? 14% -0.2 0.22 ? 19% perf-profile.self.cycles-pp.kmem_cache_free
1.73 ? 8% -0.2 1.50 ? 6% perf-profile.self.cycles-pp.update_sg_wakeup_stats
0.57 ? 18% -0.2 0.38 ? 5% perf-profile.self.cycles-pp.pcpu_alloc
0.40 ? 17% -0.2 0.21 ? 18% perf-profile.self.cycles-pp.vma_interval_tree_insert
0.37 ? 32% -0.2 0.19 ? 6% perf-profile.self.cycles-pp.__slab_free
0.56 ? 22% -0.2 0.38 ? 6% perf-profile.self.cycles-pp.fold_vm_numa_events
0.37 ? 8% -0.2 0.20 ? 14% perf-profile.self.cycles-pp.do_dentry_open
0.73 ? 11% -0.2 0.58 ? 3% perf-profile.self.cycles-pp._raw_spin_lock
0.40 ? 15% -0.2 0.25 ? 25% perf-profile.self.cycles-pp.cap_vm_enough_memory
0.22 ? 12% -0.1 0.08 ? 40% perf-profile.self.cycles-pp.generic_fillattr
0.30 ? 21% -0.1 0.15 ? 16% perf-profile.self.cycles-pp.__d_lookup
0.33 ? 16% -0.1 0.19 ? 2% perf-profile.self.cycles-pp.memcg_slab_post_alloc_hook
0.23 ? 14% -0.1 0.09 ? 18% perf-profile.self.cycles-pp.sync_regs
0.42 ? 2% -0.1 0.29 ? 5% perf-profile.self.cycles-pp._dl_addr
0.23 ? 14% -0.1 0.10 ? 35% perf-profile.self.cycles-pp.pid_nr_ns
0.64 ? 10% -0.1 0.52 ? 10% perf-profile.self.cycles-pp.io_serial_in
0.29 ? 33% -0.1 0.17 ? 14% perf-profile.self.cycles-pp.__alloc_file
0.42 ? 6% -0.1 0.30 ? 4% perf-profile.self.cycles-pp.copy_mc_enhanced_fast_string
0.39 ? 16% -0.1 0.27 ? 12% perf-profile.self.cycles-pp.__task_pid_nr_ns
0.22 ? 23% -0.1 0.11 ? 4% perf-profile.self.cycles-pp.task_dump_owner
0.25 ? 11% -0.1 0.15 ? 17% perf-profile.self.cycles-pp.vma_interval_tree_remove
0.16 ? 16% -0.1 0.07 ? 11% perf-profile.self.cycles-pp.proc_fill_cache
0.25 ? 11% -0.1 0.16 ? 24% perf-profile.self.cycles-pp.mod_objcg_state
0.29 ? 10% -0.1 0.21 ? 11% perf-profile.self.cycles-pp.do_syscall_64
0.13 ? 30% -0.1 0.05 ? 8% perf-profile.self.cycles-pp.filemap_map_pages
0.16 ? 34% -0.1 0.08 ? 27% perf-profile.self.cycles-pp.errseq_sample
0.13 ? 24% -0.1 0.06 ? 23% perf-profile.self.cycles-pp.__lock_task_sighand
0.16 ? 25% -0.1 0.08 ? 22% perf-profile.self.cycles-pp.lookup_fast
0.15 ? 16% -0.1 0.08 ? 31% perf-profile.self.cycles-pp.__ptrace_may_access
0.17 ? 12% -0.1 0.10 ? 4% perf-profile.self.cycles-pp.mas_walk
0.47 ? 9% -0.1 0.40 ? 3% perf-profile.self.cycles-pp.__alloc_pages
0.15 ? 8% -0.1 0.09 ? 5% perf-profile.self.cycles-pp.malloc
0.13 ? 33% -0.1 0.06 ? 7% perf-profile.self.cycles-pp.wq_worker_comm
0.12 ? 21% -0.1 0.05 ? 71% perf-profile.self.cycles-pp.__call_rcu_common
0.14 ? 11% -0.1 0.08 ? 44% perf-profile.self.cycles-pp.atime_needs_update
0.17 ? 10% -0.1 0.11 ? 15% perf-profile.self.cycles-pp.io_serial_out
0.24 ? 8% -0.1 0.18 ? 14% perf-profile.self.cycles-pp.update_load_avg
0.14 ? 33% -0.1 0.08 ? 10% perf-profile.self.cycles-pp.next_tgid
0.10 ? 19% -0.1 0.04 ? 73% perf-profile.self.cycles-pp.pick_link
0.10 ? 14% -0.1 0.05 ? 72% perf-profile.self.cycles-pp.kmem_cache_alloc_lru
0.09 ? 9% -0.1 0.04 ? 73% perf-profile.self.cycles-pp.strlen
0.08 ? 5% -0.0 0.03 ? 70% perf-profile.self.cycles-pp.getenv
0.21 ? 7% -0.0 0.17 ? 14% perf-profile.self.cycles-pp.__get_user_nocheck_8
0.22 ? 2% -0.0 0.18 ? 12% perf-profile.self.cycles-pp.get_page_from_freelist
0.17 ? 7% -0.0 0.12 ? 10% perf-profile.self.cycles-pp.strnlen_user
0.13 ? 3% -0.0 0.09 ? 33% perf-profile.self.cycles-pp.fput
0.10 ? 4% -0.0 0.06 ? 14% perf-profile.self.cycles-pp.part_stat_read_all
0.08 ? 14% -0.0 0.04 ? 71% perf-profile.self.cycles-pp.__output_copy
0.15 ? 8% -0.0 0.11 ? 7% perf-profile.self.cycles-pp.clear_page_erms
0.25 -0.0 0.22 ? 3% perf-profile.self.cycles-pp.__kmem_cache_alloc_node
0.09 ? 18% -0.0 0.06 ? 8% perf-profile.self.cycles-pp.__get_obj_cgroup_from_memcg
0.11 ? 7% -0.0 0.08 ? 10% perf-profile.self.cycles-pp.unmap_single_vma
0.12 ? 4% -0.0 0.10 ? 12% perf-profile.self.cycles-pp.fsnotify_perm
0.08 ? 6% -0.0 0.06 perf-profile.self.cycles-pp.__check_object_size
0.07 ? 14% +0.0 0.09 ? 5% perf-profile.self.cycles-pp.get_cpu_idle_time_us
0.09 ? 5% +0.0 0.12 ? 21% perf-profile.self.cycles-pp.__perf_sw_event
0.12 ? 8% +0.0 0.15 ? 3% perf-profile.self.cycles-pp.perf_swevent_event
0.20 ? 10% +0.0 0.25 ? 10% perf-profile.self.cycles-pp.shmem_get_folio_gfp
0.11 ? 8% +0.1 0.16 ? 10% perf-profile.self.cycles-pp.exc_page_fault
0.02 ?141% +0.1 0.08 ? 16% perf-profile.self.cycles-pp.error_entry
0.28 ? 13% +0.1 0.34 ? 8% perf-profile.self.cycles-pp.handle_pte_fault
0.13 ? 16% +0.1 0.19 ? 19% perf-profile.self.cycles-pp.__mod_lruvec_page_state
0.13 ? 25% +0.1 0.20 ? 17% perf-profile.self.cycles-pp.folio_unlock
0.07 ? 20% +0.1 0.14 ? 28% perf-profile.self.cycles-pp.xas_start
0.18 ? 18% +0.1 0.25 ? 15% perf-profile.self.cycles-pp.apparmor_file_permission
0.20 ? 16% +0.1 0.27 ? 23% perf-profile.self.cycles-pp.perf_exclude_event
0.07 ? 11% +0.1 0.14 ? 11% perf-profile.self.cycles-pp.folio_batch_move_lru
0.00 +0.1 0.07 ? 11% perf-profile.self.cycles-pp.do_fault
0.00 +0.1 0.07 ? 17% perf-profile.self.cycles-pp.folio_mark_dirty
0.00 +0.1 0.07 ? 35% perf-profile.self.cycles-pp.__perf_read_group_add
0.05 ? 84% +0.1 0.13 ? 16% perf-profile.self.cycles-pp.p4d_offset
0.03 ? 70% +0.1 0.11 ? 19% perf-profile.self.cycles-pp.access_error
0.08 ? 14% +0.1 0.16 ? 17% perf-profile.self.cycles-pp.ktime_get_coarse_real_ts64
0.09 ? 35% +0.1 0.18 ? 2% perf-profile.self.cycles-pp.blk_cgroup_congested
0.10 ? 19% +0.1 0.19 ? 12% perf-profile.self.cycles-pp.balance_dirty_pages_ratelimited_flags
0.17 ? 28% +0.1 0.26 ? 17% perf-profile.self.cycles-pp._raw_spin_lock_irq
0.05 ? 71% +0.1 0.15 ? 14% perf-profile.self.cycles-pp.down_read_trylock
0.02 ?141% +0.1 0.12 ? 10% perf-profile.self.cycles-pp.__get_vma_policy
0.05 ? 74% +0.1 0.15 ? 19% perf-profile.self.cycles-pp.shmem_is_huge
0.29 ? 12% +0.1 0.39 ? 11% perf-profile.self.cycles-pp.__mod_node_page_state
0.14 ? 13% +0.1 0.25 ? 6% perf-profile.self.cycles-pp.handle_mm_fault
0.00 +0.1 0.10 ? 19% perf-profile.self.cycles-pp._IO_fread
0.04 ? 73% +0.1 0.15 ? 43% perf-profile.self.cycles-pp.__fsnotify_parent
0.06 ? 8% +0.1 0.17 ? 14% perf-profile.self.cycles-pp.do_set_pte
0.00 +0.1 0.12 ? 13% perf-profile.self.cycles-pp.policy_node
0.00 +0.1 0.12 ? 10% perf-profile.self.cycles-pp.fault_in_iov_iter_readable
0.00 +0.1 0.13 ? 43% perf-profile.self.cycles-pp.generic_write_check_limits
0.49 ? 12% +0.1 0.62 ? 7% perf-profile.self.cycles-pp.orc_find
0.36 ? 10% +0.1 0.49 ? 2% perf-profile.self.cycles-pp.percpu_counter_add_batch
0.05 ? 71% +0.1 0.19 ? 12% perf-profile.self.cycles-pp.shmem_write_begin
0.50 ? 8% +0.1 0.64 ? 7% perf-profile.self.cycles-pp.perf_mmap_fault
0.11 ? 14% +0.2 0.27 ? 33% perf-profile.self.cycles-pp.workingset_age_nonresident
0.10 ? 49% +0.2 0.26 ? 9% perf-profile.self.cycles-pp.inode_to_bdi
0.11 ? 12% +0.2 0.28 ? 32% perf-profile.self.cycles-pp.copy_page_from_iter_atomic
0.10 ? 12% +0.2 0.26 ? 17% perf-profile.self.cycles-pp.security_vm_enough_memory_mm
0.17 ? 26% +0.2 0.34 ? 18% perf-profile.self.cycles-pp.generic_perform_write
0.08 ? 16% +0.2 0.26 ? 17% perf-profile.self.cycles-pp.__vm_enough_memory
0.59 ? 7% +0.2 0.79 perf-profile.self.cycles-pp.__orc_find
0.00 +0.2 0.21 ? 22% perf-profile.self.cycles-pp.__snprintf_chk
0.34 ? 15% +0.2 0.55 ? 6% perf-profile.self.cycles-pp.__count_memcg_events
0.00 +0.2 0.21 ? 8% perf-profile.self.cycles-pp.xas_nomem
0.37 ? 14% +0.2 0.58 ? 2% perf-profile.self.cycles-pp.xas_load
0.12 ? 26% +0.2 0.34 ? 39% perf-profile.self.cycles-pp.evsel__read_counter
0.49 ? 4% +0.3 0.75 ? 2% perf-profile.self.cycles-pp.___perf_sw_event
0.34 ? 5% +0.3 0.60 ? 9% perf-profile.self.cycles-pp.do_user_addr_fault
0.34 ? 4% +0.3 0.64 ? 13% perf-profile.self.cycles-pp.perf_mmap_to_page
0.16 ? 13% +0.3 0.47 ? 5% perf-profile.self.cycles-pp.rmqueue
0.28 ? 18% +0.3 0.61 ? 5% perf-profile.self.cycles-pp.rmqueue_bulk
0.17 ? 28% +0.4 0.62 ? 13% perf-profile.self.cycles-pp.shmem_alloc_and_acct_folio
0.54 ? 12% +0.5 1.01 ? 8% perf-profile.self.cycles-pp.asm_sysvec_apic_timer_interrupt
0.65 ? 12% +0.5 1.16 ? 6% perf-profile.self.cycles-pp.__handle_mm_fault
1.18 ? 4% +0.5 1.72 ? 13% perf-profile.self.cycles-pp.native_irq_return_iret
0.32 ? 31% +0.6 0.90 ? 27% perf-profile.self.cycles-pp.evlist__id2evsel
0.13 ? 16% +0.6 0.73 ? 5% perf-profile.self.cycles-pp.vma_alloc_folio
0.59 ? 20% +0.6 1.20 ? 43% perf-profile.self.cycles-pp.record__mmap_read_evlist
0.21 ? 34% +0.6 0.86 ? 46% perf-profile.self.cycles-pp.record__pushfn
0.68 ? 12% +0.7 1.42 ? 7% perf-profile.self.cycles-pp.mt_find
0.50 ? 20% +1.4 1.89 ? 44% perf-profile.self.cycles-pp.ring_buffer_read_head
0.20 ? 41% +1.4 1.59 ? 49% perf-profile.self.cycles-pp.__pthread_disable_asynccancel
1.40 ? 3% +1.6 2.97 ? 11% perf-profile.self.cycles-pp.shmem_write_end
2.00 ? 14% +2.2 4.16 ? 19% perf-profile.self.cycles-pp.fault_in_readable
9.90 ? 7% +11.4 21.28 ? 6% perf-profile.self.cycles-pp.copy_user_enhanced_fast_string



***************************************************************************************************
lkp-csl-2sp9: 88 threads 2 sockets Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz (Cascade Lake) with 128G memory
=========================================================================================
compiler/cpufreq_governor/debug-setup/iterations/kconfig/rootfs/tbox_group/test/testcase:
gcc-11/performance/no-monitor/4x/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/lkp-csl-2sp9/numa02_SMT/autonuma-benchmark

commit:
ef6a22b70f ("sched/numa: apply the scan delay to every new vma")
fc137c0dda ("sched/numa: enhance vma scanning logic")

ef6a22b70f6d9044 fc137c0ddab29b591db6a091dc6
---------------- ---------------------------
%stddev %change %stddev
\ | \
203.37 ? 8% +48.9% 302.87 ? 3% autonuma-benchmark.numa01.seconds
13.85 -0.4% 13.80 autonuma-benchmark.numa02.seconds
12.20 -0.2% 12.18 autonuma-benchmark.numa02_SMT.seconds
921.30 ? 7% +43.2% 1318 ? 3% autonuma-benchmark.time.elapsed_time
921.30 ? 7% +43.2% 1318 ? 3% autonuma-benchmark.time.elapsed_time.max
100152 ? 7% +25.0% 125198 ? 6% autonuma-benchmark.time.involuntary_context_switches
1693712 ? 2% +5.0% 1778818 ? 2% autonuma-benchmark.time.minor_page_faults
7776 -4.7% 7409 autonuma-benchmark.time.percent_of_cpu_this_job_got
69860 ? 8% +37.4% 95961 ? 4% autonuma-benchmark.time.user_time
39149 ? 8% -56.8% 16895 ? 3% autonuma-benchmark.time.voluntary_context_switches





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests



Attachments:
(No filename) (130.50 kB)
config-6.3.0-rc4-00203-gfc137c0ddab2 (159.59 kB)
job-script (7.97 kB)
job.yaml (5.68 kB)
reproduce (263.00 B)

2023-05-11 06:58:05

by Raghavendra K T

[permalink] [raw]
Subject: Re: [linus:master] [sched/numa] fc137c0dda: autonuma-benchmark.numa01.seconds 118.9% regression

On 5/10/2023 1:25 PM, kernel test robot wrote:
>
>
> Hello,
>
> kernel test robot noticed a 118.9% regression of autonuma-benchmark.numa01.seconds on:
>
>
> commit: fc137c0ddab29b591db6a091dc6d7ce20ccb73f2 ("sched/numa: enhance vma scanning logic")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
[...]

Hello,

Thanks for the detailed analysis. I have posted an RFC patch to address
this issue [1]. (FYI, that patch needs windows = 0 initialized if it is
to be applied.) I will be posting an RFC V2 soon, and will add your
Reported-by to that patchset. One thing to note is that [1] will bring
back *some* of the system overhead of VMA scanning.

Here are some observations/clarifications on the numa01 test:

- The numa01 benchmark improvements I got for the numa-scan improvement
patchset [2] were based on mmtests' numa01; let's call it mmtest_numa01.
(Somehow this one is not run in LKP?)

- lkp_numa01 = mmtests' numa01_THREAD_ALLOC case, mentioned in the
patch [1].

With the numa-scan enhancement patches there is a huge improvement in
the system-time overhead of VMA scanning, since we filter out scanning
by tasks that have not accessed the VMA. This has benefited
mmtest_numa01.
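
To make the filtering concrete, here is a minimal kernel-style sketch
of the idea. This is a simplified illustration under my own naming
(vma_mark_accessed, the access_pids bitmap, and the omitted NULL/init
checks are assumptions), not a quote of fc137c0dda:

	/*
	 * Remember which tasks recently faulted in a VMA: hash the pid
	 * into a word-sized bitmap hanging off the VMA (hash collisions
	 * just make the filter more permissive).
	 */
	static void vma_mark_accessed(struct vm_area_struct *vma)
	{
		unsigned int bit = hash_32(current->pid, ilog2(BITS_PER_LONG));

		__set_bit(bit, &vma->numab_state->access_pids);	/* fault path */
	}

	/* Scan path: has the current task touched this VMA recently? */
	static bool vma_is_accessed(struct vm_area_struct *vma)
	{
		unsigned int bit = hash_32(current->pid, ilog2(BITS_PER_LONG));

		return test_bit(bit, &vma->numab_state->access_pids);
	}

In the task_numa_work() scan loop this becomes
"if (!vma_is_accessed(vma)) continue;", so VMAs the current task never
touched receive no prot_none (NUMA-hinting) PTE updates from that task.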

However, in the case of lkp_numa01 we are observing that fewer PTE
updates happen because of the filtering (we could call it a corner case
of disjoint-set VMA access). This has caused the regression you
reported.

backup:
----------
lkp_numa01:
3GB of memory is allocated and distributed evenly across threads (24MB
chunks); each 24MB chunk is then bzeroed by its owning thread 1000
times.
mmtest_numa01:
the entire 3GB is bzeroed by all threads, 50 times.
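
For reference, the two access patterns can be sketched in userspace C
roughly as below. This is only a sketch of the patterns described
above; the thread count (128, chosen so that 128 * 24MB = 3GB), the
mmap flags, and the omitted error handling are my assumptions, not the
benchmark's actual source:

	#include <pthread.h>
	#include <string.h>
	#include <sys/mman.h>

	#define NTHREADS	128
	#define CHUNK		(24UL << 20)		/* 24MB per thread */
	#define REGION		(NTHREADS * CHUNK)	/* 3GB total */

	static char *shared;	/* used only by the mmtest-style pattern */

	/* lkp_numa01 style: each thread bzeroes only its own disjoint
	 * chunk, so no other task ever touches that memory. */
	static void *lkp_worker(void *arg)
	{
		char *chunk = mmap(NULL, CHUNK, PROT_READ | PROT_WRITE,
				   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		for (int i = 0; i < 1000; i++)
			memset(chunk, 0, CHUNK);	/* bzero 24MB */
		return NULL;
	}

	/* mmtest_numa01 style: every thread sweeps the whole shared
	 * region, so all tasks access all of the memory. */
	static void *mmtest_worker(void *arg)
	{
		for (int i = 0; i < 50; i++)
			memset(shared, 0, REGION);	/* bzero 3GB */
		return NULL;
	}

	int main(void)
	{
		pthread_t t[NTHREADS];

		shared = mmap(NULL, REGION, PROT_READ | PROT_WRITE,
			      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		/* run the disjoint (lkp) pattern; swap in mmtest_worker
		 * to get the shared-sweep pattern instead */
		for (int i = 0; i < NTHREADS; i++)
			pthread_create(&t[i], NULL, lkp_worker, NULL);
		for (int i = 0; i < NTHREADS; i++)
			pthread_join(t[i], NULL);
		return 0;
	}

Under the lkp pattern each chunk is only ever accessed by one task, so
a per-task access filter would (presumably) keep every other task's
scan pass from issuing PTE updates there; that is the disjoint-set
corner case mentioned above.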

[1]
https://lore.kernel.org/lkml/[email protected]/

[2]
https://lore.kernel.org/lkml/[email protected]/T/#t

Thanks and Regards
- Raghu