2024-04-19 05:56:19

by Yujie Liu

Subject: [linus:master] [x86/syscall] 1e3ad78334: will-it-scale.per_process_ops 1.4% improvement

Hi Linus,

We noticed that commit 1e3ad78334a6 caused performance fluctuations in
various micro-benchmarks. The perf stat metrics related to branch
instructions changed noticeably, which may be an expected result of
this commit. We are sending this report to share the data, in the hope
that it helps in assessing the overall impact or in any further
investigation. Thanks.

kernel test robot noticed a 1.4% improvement of will-it-scale.per_process_ops on:

commit: 1e3ad78334a69b36e107232e337f9d693dcc9df2 ("x86/syscall: Don't force use of indirect calls for system calls")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: will-it-scale
test machine: 104 threads 2 sockets (Skylake) with 192G memory
parameters:

nr_task: 16
mode: process
test: futex4
cpufreq_governor: performance


In addition, the commit has a significant impact on the following tests:

+------------------+-------------------------------------------------------------------------------------------+
| testcase: change | stress-ng: stress-ng.null.ops_per_sec -4.0% regression |
| test machine | 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory |
| test parameters | cpufreq_governor=performance |
| | nr_threads=100% |
| | test=null |
| | testtime=60s |
+------------------+-------------------------------------------------------------------------------------------+
| testcase: change | stress-ng: stress-ng.fpunch.ops_per_sec -1.6% regression |
| test machine | 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory |
| test parameters | cpufreq_governor=performance |
| | disk=1HDD |
| | fs=ext4 |
| | nr_threads=100% |
| | test=fpunch |
| | testtime=60s |
+------------------+-------------------------------------------------------------------------------------------+
| testcase: change | unixbench: unixbench.throughput -1.4% regression |
| test machine | 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory |
| test parameters | cpufreq_governor=performance |
| | nr_task=100% |
| | runtime=300s |
| | test=fsbuffer |
+------------------+-------------------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops -1.1% regression |
| test machine | 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory |
| test parameters | cpufreq_governor=performance |
| | mode=process |
| | nr_task=100% |
| | test=pread1 |
+------------------+-------------------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_thread_ops -3.4% regression |
| test machine | 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory |
| test parameters | cpufreq_governor=performance |
| | mode=thread |
| | nr_task=100% |
| | test=poll1 |
+------------------+-------------------------------------------------------------------------------------------+

Details are as below:

The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240419/[email protected]

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-13/performance/x86_64-rhel-8.3/process/16/debian-12-x86_64-20240206.cgz/lkp-skl-fpga01/futex4/will-it-scale

commit:
0cd01ac5dc ("x86/bugs: Change commas to semicolons in 'spectre_v2' sysfs file")
1e3ad78334 ("x86/syscall: Don't force use of indirect calls for system calls")

0cd01ac5dcb1e18e 1e3ad78334a69b36e107232e337
---------------- ---------------------------
%stddev %change %stddev
\ | \
860611 -1.4% 848885 proc-vmstat.numa_hit
753301 -1.6% 741136 proc-vmstat.numa_local
21797058 +1.4% 22102512 will-it-scale.16.processes
1362315 +1.4% 1381406 will-it-scale.per_process_ops
21797058 +1.4% 22102512 will-it-scale.workload
0.04 ± 7% -7.4% 0.04 perf-stat.i.MPKI
1.98e+09 +19.2% 2.36e+09 perf-stat.i.branch-instructions
1.47 -1.2 0.30 perf-stat.i.branch-miss-rate%
30820475 -70.4% 9118612 perf-stat.i.branch-misses
3.45 -4.4% 3.30 perf-stat.i.cpi
1.504e+10 +5.1% 1.58e+10 perf-stat.i.instructions
0.29 +4.5% 0.31 perf-stat.i.ipc
0.05 ± 2% -4.2% 0.04 perf-stat.overall.MPKI
1.56 -1.2 0.39 perf-stat.overall.branch-miss-rate%
3.43 -4.3% 3.28 perf-stat.overall.cpi
0.29 +4.5% 0.30 perf-stat.overall.ipc
208138 +3.4% 215312 perf-stat.overall.path-length
1.973e+09 +19.2% 2.353e+09 perf-stat.ps.branch-instructions
30729762 -70.4% 9109071 perf-stat.ps.branch-misses
1.499e+10 +5.1% 1.575e+10 perf-stat.ps.instructions
4.537e+12 +4.9% 4.759e+12 perf-stat.total.instructions
12.23 -0.6 11.60 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
10.09 -0.6 9.51 perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
22.31 -0.4 21.88 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
9.25 +0.2 9.43 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.syscall
8.79 +0.2 9.02 perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
7.13 +0.2 7.36 perf-profile.calltrace.cycles-pp.__futex_wait.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
8.37 +0.3 8.63 perf-profile.calltrace.cycles-pp.futex_wait.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
12.38 -0.6 11.78 perf-profile.children.cycles-pp.do_syscall_64
10.12 -0.5 9.57 perf-profile.children.cycles-pp.__x64_sys_futex
22.63 -0.4 22.20 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
0.48 ± 2% -0.0 0.46 perf-profile.children.cycles-pp.get_futex_key
0.00 +0.2 0.18 ± 2% perf-profile.children.cycles-pp.x64_sys_call
9.11 +0.2 9.29 perf-profile.children.cycles-pp.entry_SYSCALL_64
8.88 +0.2 9.11 perf-profile.children.cycles-pp.do_futex
7.13 +0.2 7.36 perf-profile.children.cycles-pp.__futex_wait
8.43 +0.3 8.70 perf-profile.children.cycles-pp.futex_wait
1.20 -0.7 0.47 perf-profile.self.cycles-pp.__x64_sys_futex
1.46 -0.2 1.27 perf-profile.self.cycles-pp.do_syscall_64
0.51 -0.1 0.44 perf-profile.self.cycles-pp.do_futex
0.38 ± 5% -0.1 0.32 ± 4% perf-profile.self.cycles-pp.syscall_exit_to_user_mode
0.48 ± 2% -0.0 0.45 perf-profile.self.cycles-pp.get_futex_key
0.00 +0.1 0.15 ± 2% perf-profile.self.cycles-pp.x64_sys_call
7.97 +0.1 8.12 perf-profile.self.cycles-pp.entry_SYSCALL_64
10.43 +0.2 10.60 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.72 ± 3% +0.2 0.94 ± 3% perf-profile.self.cycles-pp.__futex_wait
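
As a cross-check of the futex4 table above, the derived columns can be
recomputed from the raw counters. The sketch below is not part of the
robot's tooling: the helper names pct_change and miss_rate_pct are made
up for illustration, and it assumes that %change is (new - base) / base,
that branch-miss-rate% is branch-misses / branch-instructions, and that
path-length is total instructions divided by the workload count. Under
those assumptions the reported values are reproduced to within rounding.

```python
# Hypothetical helpers; all numbers are copied from the futex4 table above.

def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


def miss_rate_pct(misses: float, branches: float) -> float:
    """Branch misses as a percentage of branch instructions."""
    return misses / branches * 100.0


# will-it-scale.per_process_ops, parent commit vs. 1e3ad78334
ops_change = pct_change(1362315, 1381406)

# perf-stat.ps.branch-misses / perf-stat.ps.branch-instructions
miss_base = miss_rate_pct(30729762, 1.973e9)
miss_new = miss_rate_pct(9109071, 2.353e9)

# Assumed definition: perf-stat.total.instructions / will-it-scale.workload
path_base = 4.537e12 / 21797058
path_new = 4.759e12 / 22102512
path_change = pct_change(path_base, path_new)

print(f"per_process_ops change: {ops_change:+.1f}%")
print(f"branch-miss-rate: {miss_base:.2f}% -> {miss_new:.2f}%")
print(f"path-length change: {path_change:+.1f}%")
```

Read this way, the patched kernel retires more instructions per
operation (longer path length) but mispredicts far fewer branches,
which is consistent with the small net gain in per_process_ops.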


***************************************************************************************************
lkp-icl-2sp8: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-13/performance/1HDD/btrfs/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp8/utime/stress-ng/60s

commit:
0cd01ac5dc ("x86/bugs: Change commas to semicolons in 'spectre_v2' sysfs file")
1e3ad78334 ("x86/syscall: Don't force use of indirect calls for system calls")

0cd01ac5dcb1e18e 1e3ad78334a69b36e107232e337
---------------- ---------------------------
%stddev %change %stddev
\ | \
136026 ± 3% +20.6% 164016 ± 11% meminfo.DirectMap4k
5.516e+10 +1.5% 5.598e+10 perf-stat.i.branch-instructions
5.427e+10 +1.5% 5.508e+10 perf-stat.ps.branch-instructions
137060 ± 23% +35.5% 185722 ± 7% numa-numastat.node0.local_node
50345 ± 26% -56.2% 22060 ± 77% numa-numastat.node0.other_node
289383 ± 9% -17.6% 238445 ± 6% numa-numastat.node1.local_node
15965 ± 85% +177.3% 44264 ± 38% numa-numastat.node1.other_node
136562 ± 23% +35.6% 185165 ± 7% numa-vmstat.node0.numa_local
50345 ± 26% -56.2% 22060 ± 77% numa-vmstat.node0.numa_other
288523 ± 9% -17.7% 237526 ± 6% numa-vmstat.node1.numa_local
15965 ± 85% +177.3% 44264 ± 38% numa-vmstat.node1.numa_other
1.71 -0.5 1.18 perf-profile.calltrace.cycles-pp.mnt_want_write.vfs_utimes.do_utimes.__x64_sys_utimensat.do_syscall_64
43.01 -0.3 42.68 perf-profile.calltrace.cycles-pp.user_path_at_empty.do_utimes.__x64_sys_utimensat.do_syscall_64.entry_SYSCALL_64_after_hwframe
23.61 -0.3 23.34 perf-profile.calltrace.cycles-pp.do_utimes.__x64_sys_utimensat.do_syscall_64.entry_SYSCALL_64_after_hwframe.utimensat
26.52 -0.2 26.27 perf-profile.calltrace.cycles-pp.__x64_sys_utimensat.do_syscall_64.entry_SYSCALL_64_after_hwframe.utimensat
16.22 -0.2 16.00 perf-profile.calltrace.cycles-pp.do_utimes.__x64_sys_utime.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
13.89 -0.2 13.68 perf-profile.calltrace.cycles-pp.user_path_at_empty.do_utimes.__x64_sys_utime.do_syscall_64.entry_SYSCALL_64_after_hwframe
39.07 -0.2 38.87 perf-profile.calltrace.cycles-pp.do_utimes.__x64_sys_utimensat.do_syscall_64.entry_SYSCALL_64_after_hwframe
16.75 -0.2 16.56 perf-profile.calltrace.cycles-pp.__x64_sys_utime.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
15.77 -0.2 15.58 perf-profile.calltrace.cycles-pp.filename_lookup.user_path_at_empty.do_utimes.__x64_sys_utimensat.do_syscall_64
10.55 -0.2 10.37 perf-profile.calltrace.cycles-pp.getname_flags.user_path_at_empty.do_utimes.__x64_sys_utime.do_syscall_64
13.78 -0.2 13.60 perf-profile.calltrace.cycles-pp.path_lookupat.filename_lookup.user_path_at_empty.do_utimes.__x64_sys_utimensat
9.48 -0.2 9.31 perf-profile.calltrace.cycles-pp.strncpy_from_user.getname_flags.user_path_at_empty.do_utimes.__x64_sys_utime
29.46 -0.1 29.32 perf-profile.calltrace.cycles-pp.utimensat
25.18 -0.1 25.05 perf-profile.calltrace.cycles-pp.getname_flags.user_path_at_empty.do_utimes.__x64_sys_utimensat.do_syscall_64
21.74 -0.1 21.62 perf-profile.calltrace.cycles-pp.strncpy_from_user.getname_flags.user_path_at_empty.do_utimes.__x64_sys_utimensat
27.48 -0.1 27.35 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.utimensat
43.89 -0.1 43.77 perf-profile.calltrace.cycles-pp.__x64_sys_utimensat.do_syscall_64.entry_SYSCALL_64_after_hwframe
17.24 -0.1 17.13 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
27.21 -0.1 27.11 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.utimensat
17.10 -0.1 17.00 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
18.02 -0.1 17.93 perf-profile.calltrace.cycles-pp.syscall
3.82 -0.1 3.76 perf-profile.calltrace.cycles-pp.__check_object_size.strncpy_from_user.getname_flags.user_path_at_empty.do_utimes
0.57 -0.0 0.54 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.61 -0.0 1.58 perf-profile.calltrace.cycles-pp.check_heap_object.__check_object_size.strncpy_from_user.getname_flags.user_path_at_empty
2.91 -0.0 2.88 perf-profile.calltrace.cycles-pp.filename_lookup.user_path_at_empty.do_utimes.__x64_sys_utime.do_syscall_64
2.43 -0.0 2.40 perf-profile.calltrace.cycles-pp.path_lookupat.filename_lookup.user_path_at_empty.do_utimes.__x64_sys_utime
45.81 +0.1 45.96 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
45.27 +0.2 45.45 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
79.22 -0.7 78.54 perf-profile.children.cycles-pp.do_utimes
57.10 -0.5 56.56 perf-profile.children.cycles-pp.user_path_at_empty
70.66 -0.4 70.29 perf-profile.children.cycles-pp.__x64_sys_utimensat
36.81 -0.3 36.49 perf-profile.children.cycles-pp.getname_flags
31.75 -0.3 31.45 perf-profile.children.cycles-pp.strncpy_from_user
20.12 -0.2 19.91 perf-profile.children.cycles-pp.filename_lookup
17.70 -0.2 17.50 perf-profile.children.cycles-pp.path_lookupat
16.79 -0.2 16.60 perf-profile.children.cycles-pp.__x64_sys_utime
29.54 -0.1 29.40 perf-profile.children.cycles-pp.utimensat
18.34 -0.1 18.25 perf-profile.children.cycles-pp.syscall
19.31 -0.1 19.22 perf-profile.children.cycles-pp.vfs_utimes
4.47 -0.1 4.40 perf-profile.children.cycles-pp.__check_object_size
1.32 -0.1 1.26 perf-profile.children.cycles-pp.syscall_exit_to_user_mode
3.38 -0.1 3.34 perf-profile.children.cycles-pp.walk_component
2.56 -0.0 2.52 perf-profile.children.cycles-pp.lookup_fast
2.08 -0.0 2.04 perf-profile.children.cycles-pp.__d_lookup_rcu
2.33 -0.0 2.30 perf-profile.children.cycles-pp.check_heap_object
2.44 -0.0 2.41 perf-profile.children.cycles-pp.complete_walk
1.07 -0.0 1.05 perf-profile.children.cycles-pp.make_vfsuid
1.30 -0.0 1.28 perf-profile.children.cycles-pp.path_put
0.84 +0.0 0.88 perf-profile.children.cycles-pp.syscall_return_via_sysret
0.00 +0.6 0.63 perf-profile.children.cycles-pp.x64_sys_call
27.25 -0.2 27.02 perf-profile.self.cycles-pp.strncpy_from_user
1.30 -0.1 1.22 perf-profile.self.cycles-pp.do_syscall_64
0.24 -0.0 0.23 perf-profile.self.cycles-pp.may_setattr
0.12 +0.0 0.15 ± 3% perf-profile.self.cycles-pp.__x64_sys_utime
0.84 +0.0 0.88 perf-profile.self.cycles-pp.syscall_return_via_sysret
0.92 +0.1 1.04 perf-profile.self.cycles-pp.__x64_sys_utimensat
0.00 +0.5 0.55 perf-profile.self.cycles-pp.x64_sys_call



***************************************************************************************************
lkp-icl-2sp7: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/readahead/stress-ng/60s

commit:
0cd01ac5dc ("x86/bugs: Change commas to semicolons in 'spectre_v2' sysfs file")
1e3ad78334 ("x86/syscall: Don't force use of indirect calls for system calls")

0cd01ac5dcb1e18e 1e3ad78334a69b36e107232e337
---------------- ---------------------------
%stddev %change %stddev
\ | \
5.631e+10 +2.8% 5.787e+10 perf-stat.i.branch-instructions
5.54e+10 +2.8% 5.695e+10 perf-stat.ps.branch-instructions
55177 ± 10% +36.4% 75281 ± 12% sched_debug.cfs_rq:/.avg_vruntime.stddev
55177 ± 10% +36.4% 75281 ± 12% sched_debug.cfs_rq:/.min_vruntime.stddev
46.20 -0.5 45.74 perf-profile.calltrace.cycles-pp.vfs_read.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread
35.83 -0.4 35.38 perf-profile.calltrace.cycles-pp.filemap_read.vfs_read.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe
20.24 -0.3 19.90 perf-profile.calltrace.cycles-pp._copy_to_iter.copy_page_to_iter.filemap_read.vfs_read.__x64_sys_pread64
20.87 -0.3 20.54 perf-profile.calltrace.cycles-pp.copy_page_to_iter.filemap_read.vfs_read.__x64_sys_pread64.do_syscall_64
1.66 -0.1 1.58 perf-profile.calltrace.cycles-pp.__fdget.ksys_readahead.do_syscall_64.entry_SYSCALL_64_after_hwframe.readahead
0.66 ± 3% -0.1 0.60 ± 2% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.readahead.stress_run
0.63 ± 4% -0.0 0.58 ± 2% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.readahead
4.29 -0.0 4.25 perf-profile.calltrace.cycles-pp.__fsnotify_parent.vfs_read.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.20 -0.0 2.16 perf-profile.calltrace.cycles-pp.touch_atime.filemap_read.vfs_read.__x64_sys_pread64.do_syscall_64
1.88 -0.0 1.85 perf-profile.calltrace.cycles-pp.atime_needs_update.touch_atime.filemap_read.vfs_read.__x64_sys_pread64
4.33 ± 3% +0.3 4.68 ± 3% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.readahead
3.66 ± 3% +0.4 4.05 ± 3% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.readahead
46.41 -0.5 45.94 perf-profile.children.cycles-pp.vfs_read
48.17 -0.5 47.71 perf-profile.children.cycles-pp.__x64_sys_pread64
36.13 -0.5 35.68 perf-profile.children.cycles-pp.filemap_read
20.30 -0.3 19.96 perf-profile.children.cycles-pp._copy_to_iter
20.97 -0.3 20.64 perf-profile.children.cycles-pp.copy_page_to_iter
55.86 -0.3 55.60 perf-profile.children.cycles-pp.__libc_pread
24.71 -0.2 24.48 perf-profile.children.cycles-pp.stress_readahead
2.62 -0.2 2.46 perf-profile.children.cycles-pp.syscall_exit_to_user_mode
4.54 -0.1 4.45 perf-profile.children.cycles-pp.ksys_readahead
5.33 -0.0 5.28 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
4.40 -0.0 4.36 perf-profile.children.cycles-pp.__fsnotify_parent
2.28 -0.0 2.26 perf-profile.children.cycles-pp.touch_atime
2.06 -0.0 2.04 perf-profile.children.cycles-pp.atime_needs_update
0.08 ± 8% -0.0 0.05 ± 8% perf-profile.children.cycles-pp.ktime_get_update_offsets_now
0.78 +0.0 0.81 perf-profile.children.cycles-pp.posix_fadvise
59.97 +0.3 60.27 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
58.28 +0.3 58.60 perf-profile.children.cycles-pp.do_syscall_64
18.84 +0.5 19.32 perf-profile.children.cycles-pp.readahead
0.00 +1.2 1.22 perf-profile.children.cycles-pp.x64_sys_call
20.09 -0.3 19.76 perf-profile.self.cycles-pp._copy_to_iter
24.32 -0.2 24.08 perf-profile.self.cycles-pp.stress_readahead
2.65 -0.2 2.47 perf-profile.self.cycles-pp.do_syscall_64
4.84 -0.0 4.80 perf-profile.self.cycles-pp.filemap_read
5.16 -0.0 5.11 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
4.27 -0.0 4.22 perf-profile.self.cycles-pp.__fsnotify_parent
0.08 ± 6% -0.0 0.05 ± 7% perf-profile.self.cycles-pp.ktime_get_update_offsets_now
1.82 -0.0 1.80 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.70 +0.0 0.72 perf-profile.self.cycles-pp.__x64_sys_pread64
0.00 +1.1 1.06 perf-profile.self.cycles-pp.x64_sys_call



***************************************************************************************************
lkp-icl-2sp7: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/null/stress-ng/60s

commit:
0cd01ac5dc ("x86/bugs: Change commas to semicolons in 'spectre_v2' sysfs file")
1e3ad78334 ("x86/syscall: Don't force use of indirect calls for system calls")

0cd01ac5dcb1e18e 1e3ad78334a69b36e107232e337
---------------- ---------------------------
%stddev %change %stddev
\ | \
19402 ± 14% +63.7% 31762 ± 28% sched_debug.cpu.nr_switches.max
3272 ± 10% +40.4% 4595 ± 21% sched_debug.cpu.nr_switches.stddev
3241 +10.1% 3569 ± 9% vmstat.system.cs
162368 -0.9% 160961 vmstat.system.in
6303220 -3.7% 6068707 proc-vmstat.numa_hit
6236896 -3.8% 6002419 proc-vmstat.numa_local
6341375 -3.7% 6107478 proc-vmstat.pgalloc_normal
6171078 -3.7% 5941105 proc-vmstat.pgfault
6144519 -3.8% 5913179 proc-vmstat.pgfree
19272 -3.3% 18627 stress-ng.null.MB_per_sec_/dev/null_write_rate
2.902e+09 -4.0% 2.787e+09 stress-ng.null.ops
48365768 -4.0% 46449880 stress-ng.null.ops_per_sec
5809136 -3.9% 5580207 stress-ng.time.minor_page_faults
2394 +1.6% 2431 stress-ng.time.system_time
1324 -2.7% 1289 stress-ng.time.user_time
3.529e+10 +18.8% 4.19e+10 perf-stat.i.branch-instructions
0.24 ± 3% -0.1 0.19 ± 3% perf-stat.i.branch-miss-rate%
85202098 ± 4% -9.4% 77223454 ± 3% perf-stat.i.branch-misses
3168 ± 2% +11.4% 3529 ± 10% perf-stat.i.context-switches
1.03 -2.7% 1.00 perf-stat.i.cpi
1.897e+11 +2.7% 1.949e+11 perf-stat.i.instructions
0.97 +2.7% 1.00 perf-stat.i.ipc
3.14 -3.8% 3.03 perf-stat.i.metric.K/sec
100663 -3.8% 96871 perf-stat.i.minor-faults
100663 -3.8% 96871 perf-stat.i.page-faults
0.24 ± 3% -0.1 0.18 ± 3% perf-stat.overall.branch-miss-rate%
1.03 -2.7% 1.00 perf-stat.overall.cpi
0.97 +2.7% 1.00 perf-stat.overall.ipc
3.471e+10 +18.7% 4.121e+10 perf-stat.ps.branch-instructions
83783190 ± 3% -9.8% 75603241 ± 3% perf-stat.ps.branch-misses
3114 ± 2% +11.0% 3457 ± 10% perf-stat.ps.context-switches
1.866e+11 +2.7% 1.916e+11 perf-stat.ps.instructions
98965 -3.8% 95242 perf-stat.ps.minor-faults
98966 -3.8% 95242 perf-stat.ps.page-faults
1.139e+13 +2.6% 1.169e+13 perf-stat.total.instructions
4.88 ± 2% -0.3 4.62 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.llseek
4.94 ± 2% -0.2 4.74 perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
3.29 ± 2% -0.2 3.12 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.ioctl
3.26 -0.1 3.13 perf-profile.calltrace.cycles-pp.setfl.do_fcntl.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
3.30 -0.1 3.17 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.fallocate64
2.48 -0.1 2.36 perf-profile.calltrace.cycles-pp.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
2.34 ± 2% -0.1 2.21 perf-profile.calltrace.cycles-pp.do_fcntl.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.12 -0.1 2.01 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64
0.84 ± 2% -0.1 0.75 ± 2% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.86 ± 2% -0.1 0.76 ± 2% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
0.89 ± 3% -0.1 0.80 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek.stress_run
1.63 ± 2% -0.1 1.55 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.write
0.86 ± 3% -0.1 0.79 ± 2% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.stress_run
2.40 ± 2% -0.1 2.33 perf-profile.calltrace.cycles-pp.do_fcntl.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe.stress_run
1.26 -0.1 1.20 perf-profile.calltrace.cycles-pp.__put_user_4.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.58 ± 3% -0.1 0.52 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
1.45 -0.1 1.40 perf-profile.calltrace.cycles-pp._raw_spin_lock.setfl.do_fcntl.__x64_sys_fcntl.do_syscall_64
0.55 +0.0 0.58 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.fallocate64
0.86 ± 3% +0.1 0.91 ± 2% perf-profile.calltrace.cycles-pp.__fdget_raw.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe.stress_run
3.11 +0.1 3.19 perf-profile.calltrace.cycles-pp.__x64_sys_fallocate.do_syscall_64.entry_SYSCALL_64_after_hwframe.fallocate64
1.50 ± 2% +0.1 1.60 ± 2% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.fdatasync
0.70 +0.1 0.83 perf-profile.calltrace.cycles-pp.__fdget.__x64_sys_fallocate.do_syscall_64.entry_SYSCALL_64_after_hwframe.fallocate64
1.15 ± 2% +0.1 1.29 ± 2% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.fdatasync
1.41 ± 2% +0.1 1.55 ± 2% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.fdatasync.stress_run
1.18 ± 2% +0.2 1.35 ± 2% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.fdatasync.stress_run
3.90 ± 2% +0.2 4.08 ± 2% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write.stress_run
3.95 ± 2% +0.2 4.14 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
1.36 ± 2% +0.2 1.55 ± 2% perf-profile.calltrace.cycles-pp.__fdget.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
4.08 ± 2% +0.2 4.31 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
4.66 ± 3% +0.3 4.91 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.ioctl.stress_run
4.06 ± 3% +0.3 4.36 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl.stress_run
4.58 +0.3 4.91 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.fallocate64
5.07 +0.3 5.40 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.fallocate64
4.20 ± 3% +0.3 4.54 ± 2% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek.stress_run
6.71 ± 2% +0.4 7.07 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.stress_run
0.17 ±141% +0.4 0.58 ± 2% perf-profile.calltrace.cycles-pp.__x64_sys_fdatasync.do_syscall_64.entry_SYSCALL_64_after_hwframe.fdatasync.stress_run
0.00 +0.5 0.55 ± 2% perf-profile.calltrace.cycles-pp.__x64_sys_fdatasync.do_syscall_64.entry_SYSCALL_64_after_hwframe.fdatasync
8.22 -0.6 7.61 perf-profile.children.cycles-pp.syscall_exit_to_user_mode
16.53 -0.6 15.94 perf-profile.children.cycles-pp.entry_SYSCALL_64
16.36 -0.6 15.80 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
5.04 -0.2 4.83 perf-profile.children.cycles-pp.do_fcntl
5.14 ± 2% -0.2 4.95 perf-profile.children.cycles-pp.vfs_write
9.50 -0.2 9.31 perf-profile.children.cycles-pp.__x64_sys_fcntl
3.54 -0.1 3.40 perf-profile.children.cycles-pp.setfl
2.62 -0.1 2.48 perf-profile.children.cycles-pp.do_vfs_ioctl
2.56 -0.1 2.47 perf-profile.children.cycles-pp.stress_null
1.89 -0.1 1.80 perf-profile.children.cycles-pp.amd_clear_divider
1.74 -0.1 1.65 perf-profile.children.cycles-pp.__libc_fcntl64
1.38 -0.1 1.30 perf-profile.children.cycles-pp.__put_user_4
1.54 -0.1 1.49 perf-profile.children.cycles-pp._raw_spin_lock
0.44 ± 4% -0.1 0.39 perf-profile.children.cycles-pp.__munmap
0.42 ± 4% -0.1 0.37 perf-profile.children.cycles-pp.__vm_munmap
0.42 ± 4% -0.1 0.37 perf-profile.children.cycles-pp.__x64_sys_munmap
0.40 ± 4% -0.0 0.36 perf-profile.children.cycles-pp.do_vmi_align_munmap
2.46 -0.0 2.41 perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
0.42 ± 4% -0.0 0.38 perf-profile.children.cycles-pp.do_vmi_munmap
0.32 ± 4% -0.0 0.28 perf-profile.children.cycles-pp.unmap_region
0.24 ± 3% -0.0 0.22 ± 2% perf-profile.children.cycles-pp.asm_exc_page_fault
0.31 ± 3% -0.0 0.29 perf-profile.children.cycles-pp.__mmap
0.22 ± 3% -0.0 0.19 perf-profile.children.cycles-pp.do_user_addr_fault
0.55 -0.0 0.52 perf-profile.children.cycles-pp.fcntl64@plt
0.75 -0.0 0.72 perf-profile.children.cycles-pp.security_file_fcntl
0.29 ± 3% -0.0 0.27 perf-profile.children.cycles-pp.vm_mmap_pgoff
0.28 ± 3% -0.0 0.25 perf-profile.children.cycles-pp.do_mmap
0.22 ± 2% -0.0 0.20 ± 2% perf-profile.children.cycles-pp.exc_page_fault
0.20 ± 3% -0.0 0.18 ± 2% perf-profile.children.cycles-pp.mmap_region
0.56 -0.0 0.54 perf-profile.children.cycles-pp.null_lseek
0.53 -0.0 0.51 perf-profile.children.cycles-pp.security_file_ioctl
0.18 ± 3% -0.0 0.16 ± 2% perf-profile.children.cycles-pp.handle_mm_fault
0.15 ± 7% -0.0 0.13 ± 2% perf-profile.children.cycles-pp.tlb_finish_mmu
0.16 ± 3% -0.0 0.15 ± 3% perf-profile.children.cycles-pp.__handle_mm_fault
0.12 ± 6% -0.0 0.10 perf-profile.children.cycles-pp.__tlb_batch_free_encoded_pages
0.07 -0.0 0.06 perf-profile.children.cycles-pp.__anon_vma_prepare
3.28 +0.1 3.35 perf-profile.children.cycles-pp.__x64_sys_fallocate
7.43 +0.1 7.51 perf-profile.children.cycles-pp.fdatasync
4.15 +0.1 4.23 perf-profile.children.cycles-pp.syscall_return_via_sysret
1.07 +0.1 1.22 perf-profile.children.cycles-pp.__x64_sys_fdatasync
2.93 +0.4 3.35 ± 2% perf-profile.children.cycles-pp.__fdget
52.11 +1.6 53.71 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
47.02 +1.9 48.87 perf-profile.children.cycles-pp.do_syscall_64
0.00 +3.4 3.40 perf-profile.children.cycles-pp.x64_sys_call
8.38 -0.7 7.68 perf-profile.self.cycles-pp.do_syscall_64
15.83 -0.6 15.27 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
5.47 -0.3 5.20 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
5.17 -0.2 4.98 perf-profile.self.cycles-pp.entry_SYSCALL_64
4.64 -0.2 4.49 perf-profile.self.cycles-pp.llseek
4.24 -0.1 4.10 perf-profile.self.cycles-pp.ioctl
4.36 -0.1 4.22 perf-profile.self.cycles-pp.syscall_exit_to_user_mode
1.94 -0.1 1.84 perf-profile.self.cycles-pp.stress_null
2.02 -0.1 1.93 perf-profile.self.cycles-pp.fdatasync
2.18 -0.1 2.10 perf-profile.self.cycles-pp.fallocate64
2.01 -0.1 1.94 perf-profile.self.cycles-pp.setfl
1.33 -0.1 1.26 perf-profile.self.cycles-pp.__put_user_4
1.54 -0.1 1.47 perf-profile.self.cycles-pp.do_fcntl
1.19 -0.1 1.14 ± 2% perf-profile.self.cycles-pp.do_vfs_ioctl
1.34 -0.1 1.29 perf-profile.self.cycles-pp._raw_spin_lock
2.30 -0.0 2.26 perf-profile.self.cycles-pp.__x64_sys_fcntl
1.97 -0.0 1.93 perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
0.97 -0.0 0.94 perf-profile.self.cycles-pp.amd_clear_divider
0.56 -0.0 0.54 perf-profile.self.cycles-pp.security_file_fcntl
0.39 -0.0 0.37 perf-profile.self.cycles-pp.fcntl64@plt
0.30 -0.0 0.29 perf-profile.self.cycles-pp.rw_verify_area
0.36 -0.0 0.35 perf-profile.self.cycles-pp.security_file_ioctl
0.44 +0.0 0.48 ± 2% perf-profile.self.cycles-pp.__x64_sys_fallocate
0.34 +0.0 0.37 ± 6% perf-profile.self.cycles-pp.__x64_sys_fdatasync
4.14 +0.1 4.23 perf-profile.self.cycles-pp.syscall_return_via_sysret
1.57 +0.1 1.66 ± 2% perf-profile.self.cycles-pp.__fdget_raw
2.61 +0.4 3.06 ± 2% perf-profile.self.cycles-pp.__fdget
0.00 +2.9 2.89 perf-profile.self.cycles-pp.x64_sys_call
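
One way to read the stress-ng.null numbers above: throughput fell 4.0%
while retired instructions rose 2.6%, so the work done per operation
grew by more than either figure alone suggests. The sketch below is a
rough estimate and not something the robot reports. It assumes the
system-wide perf-stat counters can simply be divided by the stress-ng
operation counts, and the variable names are invented here.

```python
# Rough per-operation estimate; all numbers come from the
# stress-ng.null table above (parent commit vs. 1e3ad78334).

def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


# perf-stat.total.instructions / stress-ng.null.ops
instr_per_op_base = 1.139e13 / 2.902e9
instr_per_op_new = 1.169e13 / 2.787e9

# perf-stat.ps.branch-instructions / stress-ng.null.ops_per_sec
branch_per_op_base = 3.471e10 / 48365768
branch_per_op_new = 4.121e10 / 46449880

instr_growth = pct_change(instr_per_op_base, instr_per_op_new)
branch_growth = pct_change(branch_per_op_base, branch_per_op_new)

print(f"instructions per op: {instr_growth:+.1f}%")
print(f"branch instructions per op: {branch_growth:+.1f}%")
```

Because the counters are system-wide, these ratios include work that
is unrelated to the benchmark loop, so they are best treated as an
upper-level indication rather than an exact per-syscall cost.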



***************************************************************************************************
lkp-icl-2sp7: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/monte-carlo/stress-ng/60s

commit:
0cd01ac5dc ("x86/bugs: Change commas to semicolons in 'spectre_v2' sysfs file")
1e3ad78334 ("x86/syscall: Don't force use of indirect calls for system calls")

0cd01ac5dcb1e18e 1e3ad78334a69b36e107232e337
---------------- ---------------------------
%stddev %change %stddev
\ | \
1.74 -0.1 1.62 perf-stat.overall.branch-miss-rate%
1.411e+13 +1.2% 1.427e+13 perf-stat.total.instructions
2838242 -1.6% 2793803 stress-ng.monte-carlo.samples/sec,_e_using_arc4
7122323 -1.2% 7036665 stress-ng.monte-carlo.samples/sec,_exp_using_arc4
3972723 -1.5% 3911813 stress-ng.monte-carlo.samples/sec,_pi_using_arc4
1.016e+08 -1.3% 1.004e+08 stress-ng.monte-carlo.samples/sec,_pi_using_lcg
6407021 -1.4% 6319313 stress-ng.monte-carlo.samples/sec,_sin_using_arc4
7374513 -1.3% 7277983 stress-ng.monte-carlo.samples/sec,_sqrt_using_arc4
3962914 -1.5% 3904274 stress-ng.monte-carlo.samples/sec,_squircle_using_arc4
1108 +1.8% 1128 stress-ng.time.system_time
3.02 -0.3 2.69 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__getpid.stress_mc_arc4_rand
3.40 ± 2% -0.2 3.16 ± 2% perf-profile.calltrace.cycles-pp.__x64_sys_getpid.do_syscall_64.entry_SYSCALL_64_after_hwframe.__getpid.stress_mc_arc4_rand
2.93 ± 3% -0.2 2.71 ± 3% perf-profile.calltrace.cycles-pp.__task_pid_nr_ns.__x64_sys_getpid.do_syscall_64.entry_SYSCALL_64_after_hwframe.__getpid
17.72 -0.2 17.52 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.__getpid.stress_mc_arc4_rand
0.94 -0.0 0.91 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__getpid
3.13 -0.0 3.10 perf-profile.calltrace.cycles-pp.stress_mc_mwc64_rand
2.07 -0.0 2.04 perf-profile.calltrace.cycles-pp.stress_mwc64.stress_mc_mwc64_rand
0.56 -0.0 0.53 perf-profile.calltrace.cycles-pp.stress_monte_carlo_sqrt.stress_mc_arc4_rand
43.17 +0.5 43.65 perf-profile.calltrace.cycles-pp.stress_mc_arc4_rand
34.06 +0.5 34.58 perf-profile.calltrace.cycles-pp.__getpid.stress_mc_arc4_rand
13.54 +0.8 14.33 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__getpid.stress_mc_arc4_rand
10.40 +1.0 11.40 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__getpid.stress_mc_arc4_rand
0.00 +1.4 1.38 perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe.__getpid.stress_mc_arc4_rand
3.94 -0.4 3.58 perf-profile.children.cycles-pp.syscall_exit_to_user_mode
3.85 ± 2% -0.2 3.61 ± 2% perf-profile.children.cycles-pp.__x64_sys_getpid
3.15 ± 2% -0.2 2.93 ± 3% perf-profile.children.cycles-pp.__task_pid_nr_ns
7.88 -0.1 7.76 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
4.44 -0.1 4.37 perf-profile.children.cycles-pp.stress_mc_xorshift_rand
1.18 -0.1 1.13 perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
2.10 -0.0 2.06 perf-profile.children.cycles-pp.stress_monte_carlo_pi
2.38 -0.0 2.35 perf-profile.children.cycles-pp.stress_monte_carlo_sqrt
2.32 -0.0 2.29 perf-profile.children.cycles-pp.stress_mwc64
35.10 +0.6 35.66 perf-profile.children.cycles-pp.__getpid
27.59 +0.7 28.32 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
25.10 +0.7 25.84 perf-profile.children.cycles-pp.do_syscall_64
0.00 +1.6 1.61 perf-profile.children.cycles-pp.x64_sys_call
3.90 -0.2 3.68 perf-profile.self.cycles-pp.do_syscall_64
2.90 ± 2% -0.2 2.69 ± 3% perf-profile.self.cycles-pp.__task_pid_nr_ns
7.64 -0.1 7.52 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
2.08 -0.1 2.00 perf-profile.self.cycles-pp.syscall_exit_to_user_mode
0.95 -0.0 0.90 perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
2.76 -0.0 2.72 perf-profile.self.cycles-pp.stress_mc_xorshift_rand
1.64 -0.0 1.60 perf-profile.self.cycles-pp.stress_monte_carlo_pi
2.05 -0.0 2.02 perf-profile.self.cycles-pp.stress_mwc64
0.00 +1.4 1.37 perf-profile.self.cycles-pp.x64_sys_call



***************************************************************************************************
lkp-icl-2sp8: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-13/performance/1HDD/ext4/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp8/fpunch/stress-ng/60s

commit:
0cd01ac5dc ("x86/bugs: Change commas to semicolons in 'spectre_v2' sysfs file")
1e3ad78334 ("x86/syscall: Don't force use of indirect calls for system calls")

0cd01ac5dcb1e18e 1e3ad78334a69b36e107232e337
---------------- ---------------------------
%stddev %change %stddev
\ | \
4.408e+10 +4.9% 4.623e+10 perf-stat.i.branch-instructions
0.21 -0.0 0.19 ± 3% perf-stat.overall.branch-miss-rate%
4.336e+10 +4.9% 4.547e+10 perf-stat.ps.branch-instructions
1.054e+08 -1.6% 1.037e+08 stress-ng.fpunch.ops
1756286 -1.6% 1727644 stress-ng.fpunch.ops_per_sec
879217 -2.0% 861604 stress-ng.time.voluntary_context_switches
38.90 -0.6 38.29 perf-profile.calltrace.cycles-pp.vfs_write.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
31.84 -0.5 31.30 perf-profile.calltrace.cycles-pp.generic_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe
40.48 -0.4 40.04 perf-profile.calltrace.cycles-pp.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
19.84 -0.4 19.48 perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
22.29 -0.4 21.94 perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
16.32 -0.3 16.01 perf-profile.calltrace.cycles-pp.generic_file_write_iter.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
26.13 -0.3 25.87 perf-profile.calltrace.cycles-pp.write
23.37 -0.3 23.11 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write
47.74 -0.2 47.50 perf-profile.calltrace.cycles-pp.__libc_pwrite
0.52 -0.2 0.34 ± 70% perf-profile.calltrace.cycles-pp.__mutex_unlock_slowpath.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
2.06 -0.1 1.98 perf-profile.calltrace.cycles-pp.up_write.generic_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64
1.33 -0.1 1.28 perf-profile.calltrace.cycles-pp.rwsem_wake.up_write.generic_file_write_iter.vfs_write.__x64_sys_pwrite64
0.64 ± 2% -0.1 0.59 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
2.62 -0.0 2.58 perf-profile.calltrace.cycles-pp.simple_write_end.generic_perform_write.generic_file_write_iter.vfs_write.__x64_sys_pwrite64
1.02 -0.0 0.99 perf-profile.calltrace.cycles-pp.up_write.generic_file_write_iter.vfs_write.ksys_write.do_syscall_64
0.66 -0.0 0.64 perf-profile.calltrace.cycles-pp.rwsem_wake.up_write.generic_file_write_iter.vfs_write.ksys_write
1.48 -0.0 1.46 perf-profile.calltrace.cycles-pp.simple_write_end.generic_perform_write.generic_file_write_iter.vfs_write.ksys_write
0.66 -0.0 0.64 perf-profile.calltrace.cycles-pp.security_file_permission.rw_verify_area.vfs_write.ksys_write.do_syscall_64
1.06 +0.0 1.08 perf-profile.calltrace.cycles-pp.vfs_fallocate.__x64_sys_fallocate.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
0.83 ± 3% +0.1 0.91 perf-profile.calltrace.cycles-pp.__fdget.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
1.60 +0.1 1.71 perf-profile.calltrace.cycles-pp.__x64_sys_fallocate.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
0.71 ± 2% +0.1 0.82 perf-profile.calltrace.cycles-pp.__fdget.__x64_sys_fallocate.do_syscall_64.entry_SYSCALL_64_after_hwframe.fallocate64
3.00 +0.2 3.18 perf-profile.calltrace.cycles-pp.__x64_sys_fallocate.do_syscall_64.entry_SYSCALL_64_after_hwframe.fallocate64
5.63 +0.2 5.85 perf-profile.calltrace.cycles-pp.syscall
2.37 +0.2 2.60 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
2.65 +0.2 2.89 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
10.09 +0.3 10.44 perf-profile.calltrace.cycles-pp.fallocate64
4.37 +0.4 4.74 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.fallocate64
4.84 +0.4 5.21 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.fallocate64
59.21 -1.0 58.23 perf-profile.children.cycles-pp.vfs_write
48.48 -0.9 47.62 perf-profile.children.cycles-pp.generic_file_write_iter
40.60 -0.4 40.17 perf-profile.children.cycles-pp.__x64_sys_pwrite64
22.42 -0.4 22.06 perf-profile.children.cycles-pp.ksys_write
26.21 -0.3 25.96 perf-profile.children.cycles-pp.write
47.88 -0.2 47.65 perf-profile.children.cycles-pp.__libc_pwrite
3.21 -0.1 3.10 perf-profile.children.cycles-pp.up_write
2.62 -0.1 2.52 perf-profile.children.cycles-pp.syscall_exit_to_user_mode
2.07 -0.1 2.00 perf-profile.children.cycles-pp.rwsem_wake
4.34 -0.1 4.27 perf-profile.children.cycles-pp.simple_write_end
1.28 -0.0 1.24 perf-profile.children.cycles-pp.wake_up_q
0.69 -0.0 0.66 perf-profile.children.cycles-pp.wake_q_add
1.06 -0.0 1.03 ± 2% perf-profile.children.cycles-pp.__mutex_unlock_slowpath
0.84 ± 2% -0.0 0.81 perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
0.76 -0.0 0.73 perf-profile.children.cycles-pp._raw_spin_lock_irqsave
0.65 -0.0 0.62 perf-profile.children.cycles-pp.rwsem_mark_wake
0.81 -0.0 0.79 perf-profile.children.cycles-pp.try_to_wake_up
5.70 +0.2 5.92 perf-profile.children.cycles-pp.syscall
2.03 ± 2% +0.3 2.29 perf-profile.children.cycles-pp.__fdget
4.83 +0.3 5.10 perf-profile.children.cycles-pp.__x64_sys_fallocate
10.18 +0.3 10.52 perf-profile.children.cycles-pp.fallocate64
0.00 +1.1 1.12 perf-profile.children.cycles-pp.x64_sys_call
2.68 -0.1 2.55 perf-profile.self.cycles-pp.do_syscall_64
4.24 -0.1 4.13 perf-profile.self.cycles-pp.fault_in_readable
2.04 -0.0 1.99 perf-profile.self.cycles-pp.simple_write_end
4.80 -0.0 4.76 perf-profile.self.cycles-pp.vfs_write
1.67 -0.0 1.64 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.66 ± 2% -0.0 0.63 perf-profile.self.cycles-pp.wake_q_add
0.72 -0.0 0.70 perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.68 ± 3% -0.0 0.66 ± 2% perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
0.50 -0.0 0.48 perf-profile.self.cycles-pp.wake_up_q
0.59 +0.1 0.66 perf-profile.self.cycles-pp.__x64_sys_fallocate
0.96 +0.1 1.05 perf-profile.self.cycles-pp.__fdget_pos
0.57 +0.1 0.68 perf-profile.self.cycles-pp.__x64_sys_pwrite64
1.87 ± 2% +0.3 2.14 perf-profile.self.cycles-pp.__fdget
0.00 +1.0 0.96 perf-profile.self.cycles-pp.x64_sys_call
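
For readers unfamiliar with the report layout, the rows above seem to use two kinds of delta. Rows whose middle column ends in % (perf-stat counters, stress-ng throughput) give a relative change, while the perf-profile rows give an absolute difference in percentage points of cycles. A minimal Python sketch that reproduces both from values in the table; the helper names are hypothetical and not part of the lkp tooling:

```python
def relative_change(base: float, new: float) -> str:
    """Relative change in percent, formatted like the '%change' column."""
    return f"{(new - base) / base * 100:+.1f}%"


def point_delta(base: float, new: float) -> str:
    """Absolute difference, as the perf-profile cycles-pp rows appear to use."""
    return f"{new - base:+.1f}"


# Values copied from the fpunch table above.
print(relative_change(1756286, 1727644))    # stress-ng.fpunch.ops_per_sec
print(relative_change(4.408e10, 4.623e10))  # perf-stat.i.branch-instructions
print(point_delta(59.21, 58.23))            # children.cycles-pp.vfs_write
print(point_delta(0.00, 1.12))              # children.cycles-pp.x64_sys_call
```

With these inputs the sketch prints -1.6%, +4.9%, -1.0 and +1.1, which matches the middle column of the corresponding rows.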





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



2024-04-19 07:33:55

by Josh Poimboeuf

Subject: Re: [linus:master] [x86/syscall] 1e3ad78334: will-it-scale.per_process_ops 1.4% improvement

On Fri, Apr 19, 2024 at 01:49:26PM +0800, kernel test robot wrote:
> Hi Linus,
>
> We noticed that commit 1e3ad78334a6 caused performance fluctuations in
> various micro benchmarks. The perf stat metrics related with branch
> instructions do have noticeable changes, which may be an expected
> result of this commit. We are sending this report to provide these data
> and hope it can be helpful for the awareness of overall impact or any
> further investigation. Thanks.
>
> kernel test robot noticed a 1.4% improvement of will-it-scale.per_process_ops on:
>
> commit: 1e3ad78334a69b36e107232e337f9d693dcc9df2 ("x86/syscall: Don't force use of indirect calls for system calls")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

Thanks, these are significant regressions.

Since this is on Skylake (with IBRS enabled, presumably) I'd expect that
these regressions are fixed by my "Only harden syscalls when needed"
patch. I'm planning on posting a new version of that tomorrow, but v3
[*] should be good enough to fix it. Could you run these tests on the
same Skylake system with my patch added?

Also it would be helpful to see the same tests on Cascade/Ice Lake, or
some other system for which the 'spectre_v2' sysfs vulnerabilities file
shows "BHI: SW loop". On such a system it shouldn't matter whether my
patch is added as it won't disable Linus' syscall change. But it would
be very helpful to see the performance impact of that combination.
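
A quick way to check which kind of system a test box is, sketched in Python. It assumes the usual sysfs location /sys/devices/system/cpu/vulnerabilities/spectre_v2 and simply looks for the "BHI: SW loop" substring mentioned above; the function names are illustrative only.

```python
from pathlib import Path
from typing import Optional

# Usual sysfs location of the Spectre v2 status line on Linux.
SPECTRE_V2 = Path("/sys/devices/system/cpu/vulnerabilities/spectre_v2")


def has_bhi_sw_loop(status: str) -> bool:
    """Return True if the status line reports the software BHI loop."""
    return "BHI: SW loop" in status


def read_spectre_v2(path: Path = SPECTRE_V2) -> Optional[str]:
    """Return the status line, or None if the file cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


status = read_spectre_v2()
if status is None:
    print("spectre_v2 status not available on this system")
else:
    print(status)
    print("BHI: SW loop ->", has_bhi_sw_loop(status))
```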

[*] https://lkml.kernel.org/lkml/eda0ec65f4612cc66875aaf76e738643f41fbc01.1713296762.git.jpoimboe@kernel.org

--
Josh

2024-04-22 07:48:34

by Yujie Liu

Subject: Re: [linus:master] [x86/syscall] 1e3ad78334: will-it-scale.per_process_ops 1.4% improvement

Hi Josh,

On Fri, Apr 19, 2024 at 12:33:46AM -0700, Josh Poimboeuf wrote:
> On Fri, Apr 19, 2024 at 01:49:26PM +0800, kernel test robot wrote:
> > Hi Linus,
> >
> > We noticed that commit 1e3ad78334a6 caused performance fluctuations in
> > various micro benchmarks. The perf stat metrics related with branch
> > instructions do have noticeable changes, which may be an expected
> > result of this commit. We are sending this report to provide these data
> > and hope it can be helpful for the awareness of overall impact or any
> > further investigation. Thanks.
> >
> > kernel test robot noticed a 1.4% improvement of will-it-scale.per_process_ops on:
> >
> > commit: 1e3ad78334a69b36e107232e337f9d693dcc9df2 ("x86/syscall: Don't force use of indirect calls for system calls")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
> Thanks, these are significant regressions.

First we need to clarify that by running this specific will-it-scale
futex4 benchmark on a Skylake machine, we observed a +1.4% performance
improvement, not a regression.

> Since this is on Skylake (with IBRS enabled, presumably) I'd expect that
> these regressions are fixed by my "Only harden syscalls when needed"
> patch. I'm planning on posting a new version of that tomorrow, but v3
> [*] should be good enough to fix it. Could you run these tests on the
> same Skylake system with my patch added?

The v3 patch [*] does not apply to commit 1e3ad78334a6. The code base seems
to have changed considerably since then, so we cannot directly compare
1e3ad78334a6 against 1e3ad78334a6+v3_patch.

The patch does apply cleanly to v6.9-rc4, so we tested v6.9-rc4 and
v6.9-rc4+v3_patch instead. Here are the test results for your reference:

Skylake
=========================================================================================
tbox_group/testcase/rootfs/kconfig/compiler/nr_task/mode/test/cpufreq_governor:
lkp-skl-fpga01/will-it-scale/debian-12-x86_64-20240206.cgz/x86_64-rhel-8.3/gcc-13/16/process/futex4/performance

commit:
0cd01ac5dcb1 ("x86/bugs: Change commas to semicolons in 'spectre_v2' sysfs file")
1e3ad78334a6 ("x86/syscall: Don't force use of indirect calls for system calls")
v6.9-rc4
v6.9-rc4+v3_patch

0cd01ac5dcb1 1e3ad78334a6 v6.9-rc4 v6.9-rc4+v3_patch
---------------- --------------------------- --------------------------- ---------------------------
%stddev %change %stddev %change %stddev %change %stddev
\ | \ | \ | \
1362315 +1.4% 1381406 +1.5% 1382652 +0.5% 1369778 will-it-scale.per_process_ops
21797058 +1.4% 22102512 +1.5% 22122442 +0.5% 21916453 will-it-scale.workload
0.04 ± 7% -7.4% 0.04 -6.1% 0.04 ± 2% -4.0% 0.04 perf-stat.i.MPKI
1.98e+09 +19.2% 2.36e+09 +19.2% 2.359e+09 +1.7% 2.014e+09 perf-stat.i.branch-instructions
1.47 -1.2 0.30 -1.2 0.30 ± 3% -0.0 1.45 perf-stat.i.branch-miss-rate%
30820475 -70.4% 9118612 -71.0% 8945551 +0.5% 30985854 perf-stat.i.branch-misses
7767463 -1.2% 7676829 -1.0% 7686158 -1.3% 7664542 perf-stat.i.cache-references
3.45 -4.4% 3.30 -4.4% 3.30 -0.4% 3.43 perf-stat.i.cpi
1.504e+10 +5.1% 1.58e+10 +5.2% 1.582e+10 +1.2% 1.522e+10 perf-stat.i.instructions
0.29 +4.5% 0.31 +4.6% 0.31 +0.4% 0.29 perf-stat.i.ipc
1.01 ±100% -0.6% 1.00 ±100% +104.1% 2.06 +0.3% 1.01 ±100% perf-stat.i.metric.K/sec
0.05 ± 2% -4.2% 0.04 -3.9% 0.04 ± 2% +0.4% 0.05 perf-stat.overall.MPKI
1.56 -1.2 0.39 -1.2 0.38 -0.0 1.54 perf-stat.overall.branch-miss-rate%
3.43 -4.3% 3.28 -4.4% 3.28 -0.5% 3.41 perf-stat.overall.cpi
0.29 +4.5% 0.30 +4.6% 0.30 +0.5% 0.29 perf-stat.overall.ipc
208138 +3.4% 215312 +3.5% 215474 +0.5% 209279 perf-stat.overall.path-length
1.973e+09 +19.2% 2.353e+09 +19.1% 2.351e+09 +1.8% 2.008e+09 perf-stat.ps.branch-instructions
30729762 -70.4% 9109071 -71.0% 8918595 +0.6% 30911752 perf-stat.ps.branch-misses
7745419 -1.1% 7663567 -1.1% 7663740 -1.3% 7647834 perf-stat.ps.cache-references
1.499e+10 +5.1% 1.575e+10 +5.2% 1.577e+10 +1.2% 1.517e+10 perf-stat.ps.instructions
4.537e+12 +4.9% 4.759e+12 +5.1% 4.767e+12 +1.1% 4.587e+12 perf-stat.total.instructions
12.23 -0.6 11.60 -0.6 11.64 -0.0 12.21 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
10.09 -0.6 9.51 -0.5 9.56 -0.1 10.01 perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
22.31 -0.4 21.88 -0.4 21.94 +0.0 22.36 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
19.15 +0.2 19.30 +0.2 19.38 -0.1 19.04 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.syscall
9.25 +0.2 9.43 +0.0 9.25 -0.0 9.23 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.syscall
8.79 +0.2 9.02 +0.3 9.07 -0.1 8.72 perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
7.13 +0.2 7.36 +0.3 7.41 -0.1 7.07 perf-profile.calltrace.cycles-pp.__futex_wait.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
8.37 +0.3 8.63 +0.3 8.68 -0.1 8.28 perf-profile.calltrace.cycles-pp.futex_wait.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
12.38 -0.6 11.78 -0.5 11.84 -0.0 12.38 perf-profile.children.cycles-pp.do_syscall_64
10.12 -0.5 9.57 -0.5 9.63 -0.1 10.04 perf-profile.children.cycles-pp.__x64_sys_futex
22.63 -0.4 22.20 -0.4 22.24 +0.0 22.65 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
0.48 ± 2% -0.0 0.46 -0.0 0.47 ± 2% -0.0 0.46 perf-profile.children.cycles-pp.get_futex_key
19.34 +0.1 19.49 +0.2 19.57 -0.1 19.24 perf-profile.children.cycles-pp.syscall_return_via_sysret
0.00 +0.2 0.18 ± 2% +0.2 0.18 ± 3% +0.0 0.00 perf-profile.children.cycles-pp.x64_sys_call
9.11 +0.2 9.29 +0.0 9.12 +0.0 9.12 perf-profile.children.cycles-pp.entry_SYSCALL_64
8.88 +0.2 9.11 +0.3 9.16 -0.1 8.81 perf-profile.children.cycles-pp.do_futex
7.13 +0.2 7.36 +0.3 7.41 -0.1 7.07 perf-profile.children.cycles-pp.__futex_wait
8.43 +0.3 8.70 +0.3 8.75 -0.1 8.34 perf-profile.children.cycles-pp.futex_wait
1.20 -0.7 0.47 -0.7 0.46 ± 3% -0.0 1.20 ± 2% perf-profile.self.cycles-pp.__x64_sys_futex
1.46 -0.2 1.27 -0.2 1.26 ± 2% +0.0 1.48 ± 2% perf-profile.self.cycles-pp.do_syscall_64
0.51 -0.1 0.44 -0.1 0.45 ± 2% +0.0 0.52 perf-profile.self.cycles-pp.do_futex
0.38 ± 5% -0.1 0.32 ± 4% -0.1 0.32 ± 5% +0.0 0.39 ± 7% perf-profile.self.cycles-pp.syscall_exit_to_user_mode
0.48 ± 2% -0.0 0.45 -0.0 0.45 ± 2% -0.0 0.46 ± 2% perf-profile.self.cycles-pp.get_futex_key
1.21 +0.0 1.24 ± 2% +0.0 1.23 ± 3% -0.0 1.18 perf-profile.self.cycles-pp.futex_wait
0.09 ± 14% +0.0 0.12 ± 8% +0.0 0.13 ± 6% +0.0 0.12 ± 5% perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
0.00 +0.1 0.15 ± 2% +0.2 0.15 ± 3% +0.0 0.00 perf-profile.self.cycles-pp.x64_sys_call
7.97 +0.1 8.12 -0.0 7.95 -0.0 7.96 perf-profile.self.cycles-pp.entry_SYSCALL_64
19.28 +0.2 19.44 +0.2 19.53 -0.1 19.21 perf-profile.self.cycles-pp.syscall_return_via_sysret
10.43 +0.2 10.60 +0.2 10.59 +0.0 10.46 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.72 ± 3% +0.2 0.94 ± 3% +0.2 0.93 ± 4% +0.0 0.74 perf-profile.self.cycles-pp.__futex_wait
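
A note on reading the four-column table above: the %change columns appear to be computed against the first column (0cd01ac5dcb1) rather than against the neighbouring column, and the per_process_ops figures are consistent with that reading. Under that assumption, the effect of the v3 patch on its own has to be derived by re-basing on v6.9-rc4. A small Python sketch, with illustrative variable names:

```python
# will-it-scale.per_process_ops on the Skylake machine, copied from the table.
ops = {
    "0cd01ac5dcb1": 1362315,
    "1e3ad78334a6": 1381406,
    "v6.9-rc4": 1382652,
    "v6.9-rc4+v3_patch": 1369778,
}


def pct(base: float, new: float) -> float:
    """Relative change of new versus base, in percent."""
    return (new - base) / base * 100


base = ops["0cd01ac5dcb1"]
for name, value in ops.items():
    print(f"{name:>18}: {pct(base, value):+.1f}% vs 0cd01ac5dcb1")

# Re-base on v6.9-rc4 to isolate the v3 patch itself.
v3_only = pct(ops["v6.9-rc4"], ops["v6.9-rc4+v3_patch"])
print(f"v3 patch vs v6.9-rc4: {v3_only:+.1f}%")
```

The loop reproduces the +1.4%, +1.5% and +0.5% shown in the table, and the re-based figure comes out at about -0.9%. In other words, on this Skylake box the patched kernel lands close to the base commit again.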

> Also it would be helpful to see the same tests on Cascade/Ice Lake, or
> some other system for which the 'spectre_v2' sysfs vulnerabilities file
> shows "BHI: SW loop". On such a system it shouldn't matter whether my
> patch is added as it won't disable Linus' syscall change. But it would
> be very helpful to see the performance impact of that combination.

The test results on Cascade/Ice Lake are as follows:

Intel Xeon Platinum 8260L (Cascade Lake)
=========================================================================================
tbox_group/testcase/rootfs/kconfig/compiler/nr_task/mode/test/cpufreq_governor:
lkp-csl-2sp3/will-it-scale/debian-12-x86_64-20240206.cgz/x86_64-rhel-8.3/gcc-13/16/process/futex4/performance

commit:
0cd01ac5dcb1 ("x86/bugs: Change commas to semicolons in 'spectre_v2' sysfs file")
1e3ad78334a6 ("x86/syscall: Don't force use of indirect calls for system calls")
v6.9-rc4
v6.9-rc4+v3_patch

0cd01ac5dcb1 1e3ad78334a6 v6.9-rc4 v6.9-rc4+v3_patch
---------------- --------------------------- --------------------------- ---------------------------
%stddev %change %stddev %change %stddev %change %stddev
\ | \ | \ | \
3237910 -0.3% 3229309 -10.1% 2911031 -11.0% 2882769 will-it-scale.per_process_ops
51806565 -0.3% 51668961 -10.1% 46576504 -11.0% 46124311 will-it-scale.workload
0.02 ± 7% -6.4% 0.02 ± 3% -9.5% 0.02 ± 2% -2.4% 0.02 ± 12% perf-stat.i.MPKI
4.649e+09 +17.4% 5.459e+09 +76.1% 8.186e+09 +75.4% 8.156e+09 perf-stat.i.branch-instructions
0.72 -0.6 0.15 ± 4% -0.6 0.12 -0.6 0.12 ± 2% perf-stat.i.branch-miss-rate%
34188248 -74.0% 8872232 ± 3% -69.9% 10285664 -70.0% 10244122 perf-stat.i.branch-misses
1.70 -4.2% 1.63 -8.0% 1.56 -8.3% 1.56 perf-stat.i.cpi
3.326e+10 +3.6% 3.444e+10 +9.1% 3.628e+10 +8.2% 3.599e+10 perf-stat.i.instructions
0.59 +4.3% 0.61 +8.7% 0.64 +9.0% 0.64 perf-stat.i.ipc
0.18 ± 16% -11.5% 0.16 ± 22% -33.9% 0.12 ± 46% -58.6% 0.08 ± 49% perf-stat.i.major-faults
0.02 ± 7% -6.3% 0.02 ± 4% -11.0% 0.02 ± 3% -2.3% 0.02 ± 13% perf-stat.overall.MPKI
0.74 -0.6 0.16 ± 3% -0.6 0.13 -0.6 0.13 perf-stat.overall.branch-miss-rate%
1.70 -4.1% 1.63 -8.0% 1.56 -8.3% 1.56 perf-stat.overall.cpi
0.59 +4.3% 0.61 +8.7% 0.64 +9.0% 0.64 perf-stat.overall.ipc
193210 +3.9% 200708 +21.4% 234594 +21.5% 234812 perf-stat.overall.path-length
4.633e+09 +17.4% 5.441e+09 +76.1% 8.159e+09 +75.4% 8.129e+09 perf-stat.ps.branch-instructions
34084869 -74.0% 8860998 ± 2% -69.9% 10274305 -70.0% 10220106 perf-stat.ps.branch-misses
3.315e+10 +3.6% 3.433e+10 +9.1% 3.616e+10 +8.2% 3.587e+10 perf-stat.ps.instructions
0.18 ± 16% -11.5% 0.16 ± 22% -33.8% 0.12 ± 46% -58.6% 0.08 ± 49% perf-stat.ps.major-faults
1.001e+13 +3.6% 1.037e+13 +9.2% 1.093e+13 +8.2% 1.083e+13 perf-stat.total.instructions
18.55 -0.3 18.23 -1.1 17.45 -1.1 17.46 perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
1.82 -0.1 1.74 -0.2 1.60 -0.2 1.57 perf-profile.calltrace.cycles-pp.futex_q_unlock.futex_wait_setup.__futex_wait.futex_wait.do_futex
3.58 -0.1 3.51 -0.5 3.11 -0.5 3.11 perf-profile.calltrace.cycles-pp.futex_get_value_locked.futex_wait_setup.__futex_wait.futex_wait.do_futex
17.39 -0.1 17.32 -1.1 16.32 -1.1 16.30 perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
0.68 ± 2% -0.0 0.66 -0.1 0.60 ± 2% -0.1 0.60 ± 2% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.syscall
2.73 -0.0 2.72 -0.3 2.40 -0.3 2.40 perf-profile.calltrace.cycles-pp.__get_user_nocheck_4.futex_get_value_locked.futex_wait_setup.__futex_wait.futex_wait
0.60 ± 2% +0.0 0.60 ± 2% -0.0 0.57 -0.0 0.59 ± 2% perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
0.00 +0.0 0.00 +6.3 6.26 +6.2 6.22 perf-profile.calltrace.cycles-pp.clear_bhb_loop.syscall
0.61 ± 2% +0.0 0.61 ± 2% -0.0 0.58 -0.0 0.60 ± 2% perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
72.70 +0.0 72.72 +0.1 72.78 +0.1 72.80 perf-profile.calltrace.cycles-pp.syscall
1.78 +0.0 1.80 -0.2 1.59 -0.2 1.58 perf-profile.calltrace.cycles-pp._raw_spin_lock.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait
16.67 +0.0 16.71 -1.1 15.61 -1.1 15.59 perf-profile.calltrace.cycles-pp.futex_wait.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
20.50 +0.0 20.55 -0.5 20.04 -0.5 20.01 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
21.76 +0.0 21.81 -0.5 21.22 -0.5 21.24 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
2.07 +0.1 2.13 -0.2 1.90 -0.2 1.91 perf-profile.calltrace.cycles-pp.futex_hash.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait
0.76 +0.1 0.84 ± 3% -0.0 0.74 -0.0 0.74 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
5.09 +0.1 5.17 -0.5 4.60 -0.5 4.62 perf-profile.calltrace.cycles-pp.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait.do_futex
7.85 +0.1 7.94 -0.7 7.10 -0.8 7.07 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.syscall
0.91 ± 2% +0.1 1.04 +0.0 0.92 +0.0 0.92 perf-profile.calltrace.cycles-pp.get_futex_key.futex_wait_setup.__futex_wait.futex_wait.do_futex
39.86 +0.2 40.02 -4.4 35.46 -4.4 35.48 perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.syscall
14.13 +0.2 14.35 -0.9 13.20 -1.0 13.18 perf-profile.calltrace.cycles-pp.__futex_wait.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
12.44 +0.3 12.70 -1.1 11.33 -1.1 11.34 perf-profile.calltrace.cycles-pp.futex_wait_setup.__futex_wait.futex_wait.do_futex.__x64_sys_futex
18.62 -0.3 18.30 -1.1 17.57 -1.0 17.58 perf-profile.children.cycles-pp.__x64_sys_futex
17.59 -0.1 17.46 -1.0 16.54 -1.1 16.52 perf-profile.children.cycles-pp.do_futex
1.82 -0.1 1.74 -0.2 1.60 -0.2 1.57 perf-profile.children.cycles-pp.futex_q_unlock
3.19 -0.1 3.13 -0.4 2.77 -0.4 2.77 perf-profile.children.cycles-pp.__get_user_nocheck_4
0.68 ± 2% -0.0 0.66 -0.1 0.60 ± 2% -0.1 0.60 ± 2% perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
3.58 -0.0 3.57 -0.4 3.16 -0.4 3.16 perf-profile.children.cycles-pp.futex_get_value_locked
0.80 -0.0 0.79 ± 4% -0.0 0.76 ± 2% -0.1 0.75 ± 3% perf-profile.children.cycles-pp.hrtimer_interrupt
0.81 -0.0 0.80 ± 4% -0.0 0.77 ± 2% -0.0 0.76 ± 3% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
0.46 ± 2% -0.0 0.45 ± 5% -0.0 0.43 ± 2% -0.0 0.42 ± 4% perf-profile.children.cycles-pp.tick_nohz_handler
0.35 -0.0 0.34 -0.0 0.30 ± 3% -0.0 0.30 ± 2% perf-profile.children.cycles-pp.testcase
0.66 -0.0 0.65 ± 4% -0.0 0.62 -0.0 0.62 ± 2% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.38 ± 2% -0.0 0.38 ± 5% -0.0 0.36 -0.0 0.35 ± 5% perf-profile.children.cycles-pp.update_process_times
0.14 ± 5% -0.0 0.14 ± 5% -0.0 0.12 ± 4% -0.0 0.12 ± 5% perf-profile.children.cycles-pp.amd_clear_divider
0.00 +0.0 0.00 +6.3 6.32 +6.3 6.29 perf-profile.children.cycles-pp.clear_bhb_loop
0.28 ± 6% +0.0 0.28 ± 4% -0.0 0.25 ± 3% -0.0 0.24 ± 2% perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
0.90 +0.0 0.91 ± 3% -0.1 0.80 -0.1 0.81 perf-profile.children.cycles-pp.syscall_exit_to_user_mode
1.88 +0.0 1.90 ± 2% -0.2 1.69 -0.2 1.68 perf-profile.children.cycles-pp._raw_spin_lock
0.17 ± 4% +0.0 0.20 ± 2% -0.1 0.12 ± 3% -0.1 0.12 ± 7% perf-profile.children.cycles-pp.futex_setup_timer
20.64 +0.0 20.69 -0.5 20.18 -0.4 20.21 perf-profile.children.cycles-pp.do_syscall_64
21.90 +0.0 21.94 -0.5 21.41 -0.5 21.43 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
2.07 +0.1 2.13 -0.2 1.90 -0.2 1.91 perf-profile.children.cycles-pp.futex_hash
5.13 +0.1 5.20 -0.5 4.64 -0.5 4.64 perf-profile.children.cycles-pp.entry_SYSCALL_64
16.84 +0.1 16.91 -1.1 15.73 -1.1 15.71 perf-profile.children.cycles-pp.futex_wait
5.30 +0.1 5.37 -0.5 4.79 -0.5 4.80 perf-profile.children.cycles-pp.futex_q_lock
0.91 ± 2% +0.1 1.05 +0.0 0.92 +0.0 0.92 perf-profile.children.cycles-pp.get_futex_key
42.79 +0.2 42.98 -4.6 38.15 -4.6 38.18 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
14.13 +0.2 14.36 -0.9 13.20 -1.0 13.18 perf-profile.children.cycles-pp.__futex_wait
12.58 +0.3 12.83 -1.1 11.44 -1.1 11.44 perf-profile.children.cycles-pp.futex_wait_setup
0.00 +0.4 0.41 ± 2% +0.6 0.56 ± 2% +0.6 0.58 ± 3% perf-profile.children.cycles-pp.x64_sys_call
4.04 -0.3 3.77 -0.7 3.36 -0.7 3.33 perf-profile.self.cycles-pp.syscall
1.03 ± 2% -0.3 0.76 ± 2% -0.0 1.00 -0.0 1.02 perf-profile.self.cycles-pp.__x64_sys_futex
0.88 -0.3 0.62 +0.0 0.91 +0.0 0.91 perf-profile.self.cycles-pp.do_futex
2.50 -0.1 2.42 -0.2 2.30 -0.2 2.31 perf-profile.self.cycles-pp.futex_wait
1.74 -0.1 1.68 -0.2 1.55 -0.2 1.52 perf-profile.self.cycles-pp.futex_q_unlock
3.18 -0.1 3.12 -0.4 2.76 -0.4 2.76 perf-profile.self.cycles-pp.__get_user_nocheck_4
0.54 -0.1 0.48 ± 3% -0.1 0.43 -0.1 0.44 ± 3% perf-profile.self.cycles-pp.syscall_exit_to_user_mode
1.48 -0.0 1.45 ± 2% +0.2 1.69 ± 2% +0.2 1.67 perf-profile.self.cycles-pp.__futex_wait
0.68 ± 2% -0.0 0.66 -0.1 0.60 ± 2% -0.1 0.60 ± 2% perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
0.35 -0.0 0.34 -0.0 0.30 ± 3% -0.0 0.30 ± 2% perf-profile.self.cycles-pp.testcase
0.00 +0.0 0.00 +6.3 6.26 +6.2 6.22 perf-profile.self.cycles-pp.clear_bhb_loop
1.33 +0.0 1.33 -0.1 1.23 -0.1 1.22 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
1.44 +0.0 1.44 -0.1 1.30 -0.1 1.30 perf-profile.self.cycles-pp.futex_q_lock
1.03 +0.0 1.05 ± 2% +0.2 1.22 +0.3 1.28 perf-profile.self.cycles-pp.do_syscall_64
1.80 +0.0 1.84 ± 2% -0.2 1.62 -0.2 1.62 perf-profile.self.cycles-pp._raw_spin_lock
2.42 +0.0 2.46 -0.2 2.19 -0.2 2.20 perf-profile.self.cycles-pp.entry_SYSCALL_64
0.38 ± 6% +0.1 0.44 +0.0 0.38 -0.0 0.38 ± 3% perf-profile.self.cycles-pp.futex_get_value_locked
2.00 +0.1 2.06 -0.2 1.83 -0.2 1.84 perf-profile.self.cycles-pp.futex_hash
0.21 ± 6% +0.1 0.28 ± 4% +0.0 0.25 ± 3% +0.0 0.24 ± 2% perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
1.11 ± 3% +0.1 1.22 ± 3% -0.0 1.08 ± 2% -0.0 1.10 perf-profile.self.cycles-pp.futex_wait_setup
0.90 ± 2% +0.1 1.04 +0.0 0.92 +0.0 0.92 ± 2% perf-profile.self.cycles-pp.get_futex_key
42.61 +0.2 42.81 -4.6 38.00 -4.6 38.02 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
0.00 +0.4 0.41 ± 2% +0.6 0.55 ± 2% +0.5 0.52 ± 3% perf-profile.self.cycles-pp.x64_sys_call
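
perf-stat.overall.path-length is not defined anywhere in this thread. The numbers are consistent with it being perf-stat.total.instructions divided by will-it-scale.workload, i.e. instructions per benchmark operation. Taking that definition as an assumption, a short Python check against the Cascade Lake rows above (the inputs are the rounded table values, so the estimate only matches to within rounding):

```python
def path_length(total_instructions: float, workload: float) -> float:
    """Instructions retired per benchmark operation (assumed definition)."""
    return total_instructions / workload


# (perf-stat.total.instructions, will-it-scale.workload, reported path-length)
rows = {
    "0cd01ac5dcb1": (1.001e13, 51806565, 193210),
    "1e3ad78334a6": (1.037e13, 51668961, 200708),
    "v6.9-rc4": (1.093e13, 46576504, 234594),
    "v6.9-rc4+v3_patch": (1.083e13, 46124311, 234812),
}

for name, (instructions, workload, reported) in rows.items():
    estimate = path_length(instructions, workload)
    print(f"{name:>18}: estimated {estimate:,.0f}, reported {reported:,}")
```

On this reading, the roughly 21% longer path on both v6.9-rc4 columns may correspond to the clear_bhb_loop entries, which show up in the profile only for those two kernels.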


Intel Xeon Gold 6346 (Ice Lake)
=========================================================================================
tbox_group/testcase/rootfs/kconfig/compiler/nr_task/mode/test/cpufreq_governor:
lkp-icl-2sp9/will-it-scale/debian-12-x86_64-20240206.cgz/x86_64-rhel-8.3/gcc-13/16/process/futex4/performance

commit:
0cd01ac5dcb1 ("x86/bugs: Change commas to semicolons in 'spectre_v2' sysfs file")
1e3ad78334a6 ("x86/syscall: Don't force use of indirect calls for system calls")
v6.9-rc4
v6.9-rc4+v3_patch

0cd01ac5dcb1 1e3ad78334a6 v6.9-rc4 v6.9-rc4+v3_patch
---------------- --------------------------- --------------------------- ---------------------------
%stddev %change %stddev %change %stddev %change %stddev
\ | \ | \ | \
7907214 -1.8% 7763496 -15.4% 6686457 -15.5% 6678350 will-it-scale.per_process_ops
1.265e+08 -1.8% 1.242e+08 -15.4% 1.07e+08 -15.5% 1.069e+08 will-it-scale.workload
1.112e+10 +16.0% 1.29e+10 +67.3% 1.86e+10 +68.0% 1.868e+10 perf-stat.i.branch-instructions
0.06 ± 2% -0.0 0.06 ± 2% -0.0 0.05 -0.0 0.05 perf-stat.i.branch-miss-rate%
6858604 ± 2% +0.6% 6900573 ± 2% +8.4% 7434422 +7.7% 7388238 perf-stat.i.branch-misses
0.72 -2.0% 0.71 -2.7% 0.70 -2.7% 0.70 perf-stat.i.cpi
8.004e+10 +2.1% 8.17e+10 +2.8% 8.231e+10 +2.8% 8.232e+10 perf-stat.i.instructions
1.38 +2.1% 1.41 +2.8% 1.42 +2.8% 1.42 perf-stat.i.ipc
0.06 ± 2% -0.0 0.05 ± 2% -0.0 0.04 -0.0 0.04 perf-stat.overall.branch-miss-rate%
0.72 -2.0% 0.71 -2.8% 0.70 -2.8% 0.70 perf-stat.overall.cpi
1.38 +2.1% 1.41 +2.8% 1.42 +2.8% 1.42 perf-stat.overall.ipc
190470 +3.9% 197929 +21.7% 231786 +21.8% 231973 perf-stat.overall.path-length
1.108e+10 +16.0% 1.286e+10 +67.3% 1.854e+10 +68.0% 1.862e+10 perf-stat.ps.branch-instructions
6893534 ± 2% +0.5% 6924998 ± 2% +8.3% 7462919 +7.5% 7410265 perf-stat.ps.branch-misses
7.978e+10 +2.1% 8.143e+10 +2.8% 8.204e+10 +2.8% 8.205e+10 perf-stat.ps.instructions
2.41e+13 +2.0% 2.459e+13 +2.9% 2.48e+13 +2.9% 2.479e+13 perf-stat.total.instructions
48.06 -2.8 45.31 -9.9 38.20 ± 3% -9.1 38.94 perf-profile.calltrace.cycles-pp.__futex_wait.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
42.93 -2.5 40.41 -8.8 34.12 ± 4% -8.1 34.84 perf-profile.calltrace.cycles-pp.futex_wait_setup.__futex_wait.futex_wait.do_futex.__x64_sys_futex
56.45 -2.4 54.10 -12.0 44.44 -11.4 45.05 perf-profile.calltrace.cycles-pp.futex_wait.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
58.78 -2.3 56.48 -12.5 46.31 -11.9 46.86 perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
61.14 -2.3 58.86 -12.9 48.20 -12.5 48.67 perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
71.08 -1.4 69.71 -12.1 58.96 -12.1 58.95 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
68.07 -1.2 66.88 -11.8 56.28 -11.8 56.26 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
17.13 -1.1 16.06 -3.8 13.38 ± 7% -2.9 14.20 ± 5% perf-profile.calltrace.cycles-pp.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait.do_futex
100.03 -0.8 99.22 -1.0 99.01 -1.4 98.60 perf-profile.calltrace.cycles-pp.syscall
15.22 -0.8 14.42 -2.7 12.53 ± 5% -2.7 12.52 perf-profile.calltrace.cycles-pp.futex_get_value_locked.futex_wait_setup.__futex_wait.futex_wait.do_futex
12.02 -0.6 11.37 -2.1 9.89 ± 6% -2.1 9.92 perf-profile.calltrace.cycles-pp.__get_user_nocheck_4.futex_get_value_locked.futex_wait_setup.__futex_wait.futex_wait
3.12 ± 9% -0.5 2.61 -0.9 2.22 ± 10% -1.0 2.08 ± 7% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
7.38 ± 2% -0.4 6.99 -1.9 5.44 ± 5% -1.6 5.76 ± 5% perf-profile.calltrace.cycles-pp.futex_hash.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait
5.12 -0.3 4.78 -0.9 4.20 ± 10% -0.6 4.47 ± 5% perf-profile.calltrace.cycles-pp._raw_spin_lock.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait
4.99 ± 2% -0.3 4.66 -1.0 4.00 ± 5% -1.2 3.79 ± 6% perf-profile.calltrace.cycles-pp.futex_q_unlock.futex_wait_setup.__futex_wait.futex_wait.do_futex
3.02 ± 3% -0.2 2.80 -0.9 2.17 -0.8 2.25 ± 3% perf-profile.calltrace.cycles-pp.get_futex_key.futex_wait_setup.__futex_wait.futex_wait.do_futex
1.58 ± 3% -0.1 1.51 -0.3 1.29 ± 10% -0.4 1.22 ± 7% perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.syscall
0.94 ± 3% -0.1 0.87 -0.2 0.75 ± 9% -0.2 0.70 ± 8% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
0.69 ? 3% -0.0 0.65 -0.2 0.47 ? 46% -0.3 0.36 ? 71% perf-profile.calltrace.cycles-pp.amd_clear_divider.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
0.68 ? 3% -0.0 0.66 -0.3 0.38 ? 71% -0.2 0.45 ? 45% perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.syscall
0.00 +0.0 0.00 +15.2 15.21 ? 2% +14.8 14.83 perf-profile.calltrace.cycles-pp.clear_bhb_loop.syscall
1.04 ? 2% +0.1 1.13 ? 2% -0.1 0.96 ? 3% -0.0 1.00 ? 5% perf-profile.calltrace.cycles-pp.testcase
1.57 +0.1 1.70 -0.1 1.48 ? 9% -0.0 1.54 ? 5% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.syscall
0.00 +1.3 1.29 +1.6 1.62 ? 8% +1.3 1.34 ? 4% perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
16.18 +1.6 17.80 -0.6 15.63 ? 7% +0.0 16.22 ? 6% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.syscall
48.52 -2.8 45.75 -10.0 38.57 ? 2% -9.2 39.30 perf-profile.children.cycles-pp.__futex_wait
44.20 -2.6 41.60 -9.1 35.12 ? 3% -8.4 35.84 perf-profile.children.cycles-pp.futex_wait_setup
57.11 -2.4 54.75 -12.1 44.98 -11.5 45.58 perf-profile.children.cycles-pp.futex_wait
59.22 -2.3 56.91 -12.7 46.54 -12.1 47.10 perf-profile.children.cycles-pp.do_futex
61.79 -2.3 59.51 -13.0 48.74 -12.6 49.19 perf-profile.children.cycles-pp.__x64_sys_futex
69.05 -1.5 67.59 -12.1 56.90 -12.2 56.85 perf-profile.children.cycles-pp.do_syscall_64
71.36 -1.4 70.00 -11.9 59.44 -11.9 59.43 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
17.82 -1.1 16.70 -3.9 13.94 ? 7% -3.0 14.79 ? 5% perf-profile.children.cycles-pp.futex_q_lock
14.54 -0.8 13.76 -2.6 11.96 ? 5% -2.6 11.98 perf-profile.children.cycles-pp.futex_get_value_locked
13.16 -0.7 12.46 -2.3 10.83 ? 5% -2.3 10.84 perf-profile.children.cycles-pp.__get_user_nocheck_4
3.96 ? 3% -0.5 3.46 -1.0 2.95 ? 10% -1.2 2.78 ? 7% perf-profile.children.cycles-pp.syscall_exit_to_user_mode
7.61 ? 2% -0.4 7.21 -2.0 5.62 ? 4% -1.6 5.96 ? 5% perf-profile.children.cycles-pp.futex_hash
5.35 -0.4 5.00 -1.0 4.40 ? 10% -0.7 4.67 ? 5% perf-profile.children.cycles-pp._raw_spin_lock
5.22 ? 2% -0.3 4.88 -1.0 4.19 ? 5% -1.3 3.97 ? 6% perf-profile.children.cycles-pp.futex_q_unlock
3.26 ? 3% -0.2 3.02 -0.9 2.35 ? 2% -0.8 2.44 ? 3% perf-profile.children.cycles-pp.get_futex_key
98.50 -0.1 98.40 +0.1 98.60 +0.1 98.55 perf-profile.children.cycles-pp.syscall
1.81 ? 3% -0.1 1.73 -0.3 1.47 ? 10% -0.4 1.40 ? 7% perf-profile.children.cycles-pp.syscall_return_via_sysret
1.17 ? 3% -0.1 1.09 -0.2 0.94 ? 9% -0.3 0.87 ? 8% perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
0.93 ? 3% -0.1 0.86 ? 2% -0.2 0.73 ? 10% -0.2 0.70 ? 8% perf-profile.children.cycles-pp.amd_clear_divider
0.16 ? 2% -0.0 0.15 -0.0 0.15 ? 5% -0.0 0.15 ? 4% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
0.16 ? 3% -0.0 0.14 ? 3% -0.0 0.14 ? 3% -0.0 0.14 ? 5% perf-profile.children.cycles-pp.hrtimer_interrupt
9.15 ? 2% -0.0 9.14 -1.5 7.65 ? 7% -1.6 7.54 ? 3% perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
0.65 ? 4% -0.0 0.64 -0.1 0.54 ? 7% -0.1 0.54 ? 2% perf-profile.children.cycles-pp.futex_setup_timer
0.00 +0.0 0.00 +15.4 15.40 ? 2% +15.0 15.01 perf-profile.children.cycles-pp.clear_bhb_loop
0.62 ? 2% +0.0 0.66 ? 2% -0.1 0.56 ? 5% -0.0 0.58 ? 5% perf-profile.children.cycles-pp.syscall@plt
1.45 +0.1 1.57 -0.1 1.33 ? 2% -0.1 1.39 ? 5% perf-profile.children.cycles-pp.testcase
1.59 +0.1 1.72 -0.1 1.49 ? 9% -0.0 1.56 ? 5% perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
9.45 +0.9 10.37 -0.3 9.12 ? 7% +0.0 9.45 ? 5% perf-profile.children.cycles-pp.entry_SYSCALL_64
0.00 +1.5 1.51 +1.8 1.81 ? 8% +1.5 1.51 ? 4% perf-profile.children.cycles-pp.x64_sys_call
12.90 -0.7 12.23 -2.3 10.63 ? 5% -2.3 10.64 perf-profile.self.cycles-pp.__get_user_nocheck_4
7.38 ? 2% -0.4 6.99 -1.9 5.43 ? 5% -1.6 5.76 ? 5% perf-profile.self.cycles-pp.futex_hash
5.09 -0.4 4.70 -1.0 4.11 ? 8% -0.7 4.35 ? 6% perf-profile.self.cycles-pp.futex_q_lock
5.11 -0.3 4.78 -0.9 4.21 ? 10% -0.6 4.47 ? 5% perf-profile.self.cycles-pp._raw_spin_lock
4.86 ? 2% -0.3 4.54 -0.9 3.92 ? 5% -1.1 3.71 ? 7% perf-profile.self.cycles-pp.futex_q_unlock
3.68 -0.2 3.45 -0.1 3.62 ? 9% -0.1 3.56 ? 7% perf-profile.self.cycles-pp.do_syscall_64
3.02 ? 3% -0.2 2.80 -0.9 2.16 ? 2% -0.8 2.25 ? 3% perf-profile.self.cycles-pp.get_futex_key
4.33 ? 3% -0.2 4.14 -0.9 3.45 ? 6% -0.9 3.45 ? 2% perf-profile.self.cycles-pp.__futex_wait
4.17 -0.2 3.99 ? 3% -0.9 3.32 -0.9 3.32 perf-profile.self.cycles-pp.futex_wait_setup
2.09 ? 3% -0.2 1.93 -0.4 1.64 ? 9% -0.5 1.56 ? 7% perf-profile.self.cycles-pp.syscall_exit_to_user_mode
1.39 ? 2% -0.1 1.29 -0.3 1.13 ? 2% -0.3 1.12 ? 2% perf-profile.self.cycles-pp.futex_get_value_locked
1.81 ? 3% -0.1 1.73 -0.3 1.47 ? 10% -0.4 1.40 ? 7% perf-profile.self.cycles-pp.syscall_return_via_sysret
0.94 ? 3% -0.1 0.88 -0.2 0.75 ? 10% -0.2 0.70 ? 8% perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
0.46 ? 2% -0.0 0.43 ? 2% -0.1 0.37 ? 11% -0.1 0.34 ? 6% perf-profile.self.cycles-pp.amd_clear_divider
0.43 ? 4% +0.0 0.43 ? 2% -0.1 0.36 ? 7% -0.1 0.36 ? 2% perf-profile.self.cycles-pp.futex_setup_timer
0.00 +0.0 0.00 +15.2 15.22 ? 2% +14.8 14.81 perf-profile.self.cycles-pp.clear_bhb_loop
8.92 ? 2% +0.0 8.92 -1.4 7.47 ? 7% -1.6 7.36 ? 2% perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
0.20 +0.0 0.22 ? 3% -0.0 0.18 ? 8% -0.0 0.20 ? 7% perf-profile.self.cycles-pp.syscall@plt
2.76 +0.0 2.81 -0.4 2.38 ? 10% -0.5 2.27 ? 7% perf-profile.self.cycles-pp.__x64_sys_futex
1.90 ? 3% +0.0 1.95 -0.7 1.23 ? 11% -0.7 1.22 ? 8% perf-profile.self.cycles-pp.do_futex
2.50 ? 2% +0.1 2.61 +0.1 2.56 ? 8% +0.1 2.60 ? 4% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
1.24 +0.1 1.35 ? 2% -0.1 1.14 ? 3% -0.0 1.19 ? 5% perf-profile.self.cycles-pp.testcase
1.59 +0.1 1.72 -0.1 1.49 ? 9% -0.0 1.56 ? 5% perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
2.72 +0.2 2.94 ? 2% -0.1 2.61 ? 9% -0.0 2.69 ? 6% perf-profile.self.cycles-pp.entry_SYSCALL_64
8.15 ? 3% +0.4 8.56 -2.0 6.18 ? 10% -2.1 6.04 ? 3% perf-profile.self.cycles-pp.futex_wait
11.95 +1.0 12.92 -1.0 10.97 ? 4% -0.6 11.35 ? 4% perf-profile.self.cycles-pp.syscall
0.00 +1.3 1.30 +1.6 1.62 ? 8% +1.3 1.33 ? 4% perf-profile.self.cycles-pp.x64_sys_call


BTW, we did observe some regressions when running other benchmarks on
commit 1e3ad78334a6, but they show up on Ice Lake, not Skylake.
Please contact us if you are interested in looking into them.

stress-ng.null.ops_per_sec -4.0% regression on Intel Xeon Gold 6346 (Ice Lake)
unixbench.fsbuffer.throughput -1.4% regression on Intel Xeon Gold 6346 (Ice Lake)

Thanks,
Yujie

>
> [*] https://lkml.kernel.org/lkml/eda0ec65f4612cc66875aaf76e738643f41fbc01.1713296762.git.jpoimboe@kernel.org
>
> --
> Josh
>