2022-07-21 14:01:59

by kernel test robot

Subject: [x86/bugs] 6ad0ad2bf8: will-it-scale.per_process_ops -33.5% regression



Greetings,

FYI, we noticed a -33.5% regression of will-it-scale.per_process_ops due to commit:


commit: 6ad0ad2bf8a67e27d1f9d006a1dabb0e1c360cc3 ("x86/bugs: Report Intel retbleed vulnerability")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: will-it-scale
on test machine: 104 threads 2 sockets Skylake with 192G memory
with the following parameters:

nr_task: 16
mode: process
test: futex3
cpufreq_governor: performance
ucode: 0x2006c0a

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale

In addition to that, the commit also has significant impact on the following tests:

+------------------+----------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops -31.4% regression |
| test machine     | 104 threads 2 sockets Skylake with 192G memory                 |
| test parameters  | cpufreq_governor=performance                                   |
|                  | mode=process                                                   |
|                  | nr_task=50%                                                    |
|                  | test=futex4                                                    |
|                  | ucode=0x2006c0a                                                |
+------------------+----------------------------------------------------------------+


If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <[email protected]>


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file

# if you come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-11/performance/x86_64-rhel-8.3/process/16/debian-11.1-x86_64-20220510.cgz/lkp-skl-fpga01/futex3/will-it-scale/0x2006c0a

commit:
166115c08a ("x86/bugs: Split spectre_v2_select_mitigation() and spectre_v2_user_select_mitigation()")
6ad0ad2bf8 ("x86/bugs: Report Intel retbleed vulnerability")

166115c08a9b0b84 6ad0ad2bf8a67e27d1f9d006a1d
---------------- ---------------------------
%stddev %change %stddev
\ | \
30835845 -33.5% 20516991 will-it-scale.16.processes
1927239 -33.5% 1282311 will-it-scale.per_process_ops
30835845 -33.5% 20516991 will-it-scale.workload
8.59 ± 17% -1.1 7.44 ± 18% mpstat.cpu.all.usr%
0.09 ± 4% -32.1% 0.06 turbostat.IPC
226.11 ± 7% -4.9% 215.05 ± 6% turbostat.PkgWatt
1.875e+09 ± 16% -37.4% 1.174e+09 ± 15% perf-stat.i.branch-instructions
58612335 ± 18% -32.1% 39806456 ± 14% perf-stat.i.branch-misses
3.39 ± 6% +40.4% 4.76 ± 11% perf-stat.i.cpi
26700443 ± 17% -36.2% 17025890 ± 15% perf-stat.i.dTLB-load-misses
3.194e+09 ± 16% -34.1% 2.105e+09 ± 15% perf-stat.i.dTLB-loads
2.298e+09 ± 16% -36.9% 1.451e+09 ± 16% perf-stat.i.dTLB-stores
26226991 ± 17% -36.4% 16676481 ± 16% perf-stat.i.iTLB-load-misses
1.26e+10 ± 16% -33.6% 8.37e+09 ± 15% perf-stat.i.instructions
0.30 ± 4% -27.9% 0.22 ± 7% perf-stat.i.ipc
70.83 ± 16% -35.8% 45.47 ± 15% perf-stat.i.metric.M/sec
3.22 +43.0% 4.60 ± 2% perf-stat.overall.cpi
0.31 -30.0% 0.22 ± 2% perf-stat.overall.ipc
145784 +6.1% 154711 perf-stat.overall.path-length
1.873e+09 ± 15% -37.5% 1.171e+09 ± 15% perf-stat.ps.branch-instructions
58546994 ± 18% -32.2% 39721505 ± 14% perf-stat.ps.branch-misses
26668535 ± 17% -36.3% 16989617 ± 15% perf-stat.ps.dTLB-load-misses
3.19e+09 ± 16% -34.2% 2.1e+09 ± 15% perf-stat.ps.dTLB-loads
2.295e+09 ± 16% -36.9% 1.448e+09 ± 16% perf-stat.ps.dTLB-stores
26197064 ± 17% -36.5% 16641071 ± 16% perf-stat.ps.iTLB-load-misses
1.258e+10 ± 16% -33.6% 8.352e+09 ± 15% perf-stat.ps.instructions
4.495e+12 -29.4% 3.174e+12 perf-stat.total.instructions
34.16 ± 12% -11.6 22.54 ± 10% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
26.91 ± 12% -10.2 16.68 ± 10% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
14.09 ± 12% -5.1 8.96 ± 10% perf-profile.calltrace.cycles-pp.__entry_text_start.syscall
5.76 ± 12% -2.0 3.79 ± 10% perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
4.26 ± 12% -1.6 2.66 ± 10% perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
3.74 ± 12% -1.4 2.35 ± 10% perf-profile.calltrace.cycles-pp.futex_wake.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
3.38 ± 13% -1.2 2.15 ± 9% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.syscall
1.75 ± 12% -0.6 1.11 ± 11% perf-profile.calltrace.cycles-pp.futex_hash.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
1.04 ± 14% -0.4 0.65 ± 11% perf-profile.calltrace.cycles-pp.testcase
12.45 ± 11% +6.2 18.66 ± 10% perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.syscall
34.69 ± 12% -11.9 22.82 ± 10% perf-profile.children.cycles-pp.do_syscall_64
27.06 ± 12% -10.3 16.76 ± 10% perf-profile.children.cycles-pp.syscall_exit_to_user_mode
13.51 ± 12% -4.9 8.57 ± 10% perf-profile.children.cycles-pp.__entry_text_start
5.81 ± 12% -2.0 3.82 ± 10% perf-profile.children.cycles-pp.__x64_sys_futex
4.30 ± 12% -1.6 2.69 ± 10% perf-profile.children.cycles-pp.do_futex
3.86 ± 12% -1.4 2.42 ± 11% perf-profile.children.cycles-pp.futex_wake
2.20 ± 12% -0.8 1.38 ± 9% perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
2.19 ± 13% -0.8 1.40 ± 9% perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
1.75 ± 12% -0.6 1.11 ± 11% perf-profile.children.cycles-pp.futex_hash
1.04 ± 14% -0.4 0.65 ± 11% perf-profile.children.cycles-pp.testcase
0.75 ± 13% -0.3 0.46 ± 12% perf-profile.children.cycles-pp.get_futex_key
0.32 ± 13% -0.2 0.12 ± 9% perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
0.39 ± 11% +0.2 0.55 ± 13% perf-profile.children.cycles-pp.syscall_enter_from_user_mode
12.74 ± 12% +6.1 18.84 ± 10% perf-profile.children.cycles-pp.syscall_return_via_sysret
25.92 ± 12% -9.8 16.16 ± 10% perf-profile.self.cycles-pp.syscall_exit_to_user_mode
11.73 ± 13% -4.3 7.41 ± 10% perf-profile.self.cycles-pp.__entry_text_start
3.49 ± 12% -1.3 2.23 ± 9% perf-profile.self.cycles-pp.syscall
1.93 ± 12% -0.7 1.22 ± 10% perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
1.66 ± 12% -0.6 1.06 ± 11% perf-profile.self.cycles-pp.futex_hash
1.43 ± 11% -0.5 0.89 ± 10% perf-profile.self.cycles-pp.futex_wake
0.98 ± 13% -0.4 0.63 ± 10% perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
0.72 ± 15% -0.3 0.45 ± 10% perf-profile.self.cycles-pp.testcase
0.69 ± 14% -0.3 0.43 ± 12% perf-profile.self.cycles-pp.get_futex_key
0.43 ± 14% -0.2 0.27 ± 9% perf-profile.self.cycles-pp.do_futex
0.28 ± 13% -0.2 0.12 ± 9% perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
0.28 ± 11% +0.1 0.39 ± 13% perf-profile.self.cycles-pp.syscall_enter_from_user_mode
12.68 ± 12% +6.1 18.81 ± 10% perf-profile.self.cycles-pp.syscall_return_via_sysret
1.62 ± 12% +9.3 10.96 ± 10% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
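
For readers who want to check the headline numbers, here is a minimal sketch in Python that recomputes the %change column from the values in the futex3 table above. It assumes that per_process_ops is the workload divided by nr_task and that %change is a plain relative difference; neither definition is stated in this report, and the variable and function names are made up for illustration. Small differences from the table are expected because the table values are rounded averages.

```python
# Values copied from the futex3 comparison table above (nr_task: 16).
NR_TASK = 16
base = {"workload": 30835845, "per_process_ops": 1927239, "cpi": 3.22}
head = {"workload": 20516991, "per_process_ops": 1282311, "cpi": 4.60}


def pct_change(old, new):
    """Relative change in percent (assumed meaning of the %change column)."""
    return (new - old) / old * 100.0


workload_change = pct_change(base["workload"], head["workload"])
cpi_change = pct_change(base["cpi"], head["cpi"])

# Assumption: per_process_ops is the workload divided by the task count.
derived_base_ops = base["workload"] / NR_TASK
derived_head_ops = head["workload"] / NR_TASK

print(f"workload change: {workload_change:+.1f}%")
print(f"cpi change:      {cpi_change:+.1f}%")
print(f"derived per-process ops: {derived_base_ops:.0f} -> {derived_head_ops:.0f}")
```

The workload drops by about a third while cycles per instruction rise by roughly 43%, so each futex call costs noticeably more cycles with the commit applied.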


***************************************************************************************************
lkp-skl-fpga01: 104 threads 2 sockets Skylake with 192G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-11/performance/x86_64-rhel-8.3/process/50%/debian-11.1-x86_64-20220510.cgz/lkp-skl-fpga01/futex4/will-it-scale/0x2006c0a

commit:
166115c08a ("x86/bugs: Split spectre_v2_select_mitigation() and spectre_v2_user_select_mitigation()")
6ad0ad2bf8 ("x86/bugs: Report Intel retbleed vulnerability")

166115c08a9b0b84 6ad0ad2bf8a67e27d1f9d006a1d
---------------- ---------------------------
%stddev %change %stddev
\ | \
90733512 -31.4% 62258677 will-it-scale.52.processes
1744874 -31.4% 1197281 will-it-scale.per_process_ops
90733512 -31.4% 62258677 will-it-scale.workload
655664 ± 10% -22.6% 507216 ± 5% meminfo.DirectMap4k
0.04 ± 9% +0.0 0.05 ± 6% mpstat.cpu.all.soft%
3.85 +14.0% 4.39 ± 4% sched_debug.cpu.clock.stddev
0.11 ± 3% -35.4% 0.07 turbostat.IPC
96.34 ± 2% -96.3 0.00 turbostat.PKG_%
5838 ± 40% -48.7% 2993 ± 13% turbostat.POLL
371.79 -4.2% 356.19 turbostat.PkgWatt
8.115e+09 -33.8% 5.372e+09 perf-stat.i.branch-instructions
2.31 +0.1 2.41 perf-stat.i.branch-miss-rate%
1.865e+08 -30.5% 1.296e+08 perf-stat.i.branch-misses
11.17 ± 3% +1.2 12.38 ± 3% perf-stat.i.cache-miss-rate%
580470 ± 8% +15.5% 670300 ± 6% perf-stat.i.cache-misses
2.69 +42.6% 3.84 perf-stat.i.cpi
310120 ± 8% -17.0% 257300 ± 8% perf-stat.i.cycles-between-cache-misses
0.62 -0.0 0.62 perf-stat.i.dTLB-load-miss-rate%
90617994 -31.3% 62233060 perf-stat.i.dTLB-load-misses
1.443e+10 -30.5% 1.003e+10 perf-stat.i.dTLB-loads
70201 -5.3% 66506 perf-stat.i.dTLB-store-misses
1.106e+10 -32.1% 7.505e+09 perf-stat.i.dTLB-stores
91097784 -31.2% 62707380 perf-stat.i.iTLB-load-misses
5.418e+10 -29.8% 3.801e+10 perf-stat.i.instructions
596.50 +1.9% 608.12 perf-stat.i.instructions-per-iTLB-miss
0.37 -29.9% 0.26 perf-stat.i.ipc
323.07 -31.8% 220.28 perf-stat.i.metric.M/sec
120820 ± 5% +14.4% 138242 ± 4% perf-stat.i.node-load-misses
22505 ± 5% +13.4% 25514 ± 3% perf-stat.i.node-store-misses
8944 ± 6% +20.5% 10779 ± 9% perf-stat.i.node-stores
0.09 ± 6% +50.0% 0.14 ± 4% perf-stat.overall.MPKI
2.30 +0.1 2.41 perf-stat.overall.branch-miss-rate%
11.51 ± 2% +1.1 12.62 ± 2% perf-stat.overall.cache-miss-rate%
2.69 +42.8% 3.84 perf-stat.overall.cpi
249231 ± 7% -13.4% 215874 ± 6% perf-stat.overall.cycles-between-cache-misses
0.62 -0.0 0.62 perf-stat.overall.dTLB-load-miss-rate%
0.00 +0.0 0.00 perf-stat.overall.dTLB-store-miss-rate%
594.79 +1.9% 606.25 perf-stat.overall.instructions-per-iTLB-miss
0.37 -30.0% 0.26 perf-stat.overall.ipc
180499 +2.2% 184469 perf-stat.overall.path-length
8.088e+09 -33.8% 5.355e+09 perf-stat.ps.branch-instructions
1.86e+08 -30.5% 1.292e+08 perf-stat.ps.branch-misses
586425 ± 8% +15.4% 676887 ± 6% perf-stat.ps.cache-misses
90306393 -31.3% 62021744 perf-stat.ps.dTLB-load-misses
1.438e+10 -30.5% 1e+10 perf-stat.ps.dTLB-loads
70004 -5.2% 66332 perf-stat.ps.dTLB-store-misses
1.102e+10 -32.1% 7.48e+09 perf-stat.ps.dTLB-stores
90782034 -31.2% 62496607 perf-stat.ps.iTLB-load-misses
5.4e+10 -29.8% 3.789e+10 perf-stat.ps.instructions
121778 ± 5% +14.9% 139936 ± 4% perf-stat.ps.node-load-misses
22472 ± 6% +13.4% 25477 ± 3% perf-stat.ps.node-store-misses
9020 ± 6% +20.4% 10864 ± 9% perf-stat.ps.node-stores
1.638e+13 -29.9% 1.148e+13 perf-stat.total.instructions
40.11 -11.3 28.83 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
25.35 -8.1 17.26 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
13.72 -4.3 9.41 perf-profile.calltrace.cycles-pp.__entry_text_start.syscall
13.44 -3.8 9.65 perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
12.01 -3.6 8.44 perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
11.57 -3.4 8.17 perf-profile.calltrace.cycles-pp.futex_wait.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
10.06 -2.9 7.12 perf-profile.calltrace.cycles-pp.futex_wait_setup.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
41.56 -1.7 39.84 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
4.32 -1.4 2.95 perf-profile.calltrace.cycles-pp.futex_q_lock.futex_wait_setup.futex_wait.do_futex.__x64_sys_futex
3.38 -1.1 2.31 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.syscall
3.03 -0.9 2.14 perf-profile.calltrace.cycles-pp.futex_get_value_locked.futex_wait_setup.futex_wait.do_futex.__x64_sys_futex
2.58 ± 2% -0.8 1.78 perf-profile.calltrace.cycles-pp.__get_user_nocheck_4.futex_get_value_locked.futex_wait_setup.futex_wait.do_futex
1.53 -0.5 1.06 perf-profile.calltrace.cycles-pp.futex_hash.futex_q_lock.futex_wait_setup.futex_wait.do_futex
1.23 ± 2% -0.4 0.84 perf-profile.calltrace.cycles-pp._raw_spin_lock.futex_q_lock.futex_wait_setup.futex_wait.do_futex
1.29 -0.3 0.98 perf-profile.calltrace.cycles-pp.futex_q_unlock.futex_wait_setup.futex_wait.do_futex.__x64_sys_futex
0.76 ± 2% -0.2 0.54 perf-profile.calltrace.cycles-pp.get_futex_key.futex_wait_setup.futex_wait.do_futex.__x64_sys_futex
0.00 +0.6 0.57 perf-profile.calltrace.cycles-pp.syscall_enter_from_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
12.10 +7.8 19.90 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.syscall
40.40 -11.3 29.11 perf-profile.children.cycles-pp.do_syscall_64
25.50 -8.1 17.37 perf-profile.children.cycles-pp.syscall_exit_to_user_mode
13.14 -4.1 9.02 perf-profile.children.cycles-pp.__entry_text_start
13.53 -3.8 9.71 perf-profile.children.cycles-pp.__x64_sys_futex
12.09 -3.6 8.50 perf-profile.children.cycles-pp.do_futex
11.62 -3.4 8.21 perf-profile.children.cycles-pp.futex_wait
10.28 -3.0 7.25 perf-profile.children.cycles-pp.futex_wait_setup
41.78 -1.4 40.35 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
4.46 -1.4 3.05 perf-profile.children.cycles-pp.futex_q_lock
3.07 -0.9 2.18 perf-profile.children.cycles-pp.futex_get_value_locked
2.85 -0.9 1.98 perf-profile.children.cycles-pp.__get_user_nocheck_4
2.22 -0.7 1.52 perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
2.14 -0.7 1.46 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
1.53 -0.5 1.06 perf-profile.children.cycles-pp.futex_hash
1.28 ± 2% -0.4 0.88 perf-profile.children.cycles-pp._raw_spin_lock
1.29 -0.3 0.98 perf-profile.children.cycles-pp.futex_q_unlock
0.80 ± 2% -0.2 0.57 perf-profile.children.cycles-pp.get_futex_key
0.25 -0.1 0.17 ± 3% perf-profile.children.cycles-pp.exit_to_user_mode_prepare
0.23 ± 2% -0.1 0.16 ± 4% perf-profile.children.cycles-pp.testcase
0.15 ± 2% -0.0 0.10 ± 4% perf-profile.children.cycles-pp.futex_setup_timer
0.09 ± 5% -0.0 0.06 ± 9% perf-profile.children.cycles-pp.syscall@plt
0.10 ± 5% -0.0 0.07 ± 8% perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
0.38 ± 3% +0.2 0.62 perf-profile.children.cycles-pp.syscall_enter_from_user_mode
12.37 +7.7 20.09 perf-profile.children.cycles-pp.syscall_return_via_sysret
25.21 -8.0 17.17 perf-profile.self.cycles-pp.syscall_exit_to_user_mode
11.39 -3.6 7.82 perf-profile.self.cycles-pp.__entry_text_start
4.00 -1.2 2.75 perf-profile.self.cycles-pp.syscall
2.79 ± 2% -0.8 1.94 perf-profile.self.cycles-pp.__get_user_nocheck_4
1.86 -0.6 1.28 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
1.56 -0.5 1.07 perf-profile.self.cycles-pp.futex_q_lock
1.47 -0.4 1.02 perf-profile.self.cycles-pp.futex_hash
1.23 ± 2% -0.4 0.85 perf-profile.self.cycles-pp._raw_spin_lock
1.20 ± 2% -0.4 0.84 perf-profile.self.cycles-pp.futex_wait
1.07 -0.3 0.72 ± 2% perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
1.23 -0.3 0.94 ± 2% perf-profile.self.cycles-pp.futex_q_unlock
0.88 -0.3 0.60 perf-profile.self.cycles-pp.futex_wait_setup
1.43 -0.2 1.20 perf-profile.self.cycles-pp.__x64_sys_futex
0.79 ± 3% -0.2 0.57 perf-profile.self.cycles-pp.get_futex_key
0.45 ± 2% -0.2 0.30 ± 2% perf-profile.self.cycles-pp.do_futex
0.23 ± 2% -0.1 0.16 ± 5% perf-profile.self.cycles-pp.testcase
0.19 ± 2% -0.1 0.13 ± 4% perf-profile.self.cycles-pp.exit_to_user_mode_prepare
0.26 ± 7% -0.0 0.22 ± 5% perf-profile.self.cycles-pp.futex_get_value_locked
0.10 -0.0 0.07 ± 5% perf-profile.self.cycles-pp.futex_setup_timer
0.28 ± 3% +0.2 0.43 ± 2% perf-profile.self.cycles-pp.syscall_enter_from_user_mode
0.88 +0.2 1.04 perf-profile.self.cycles-pp.do_syscall_64
12.31 +7.8 20.06 perf-profile.self.cycles-pp.syscall_return_via_sysret
1.47 +10.1 11.58 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
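
A note on reading the perf-profile rows: the middle column there has no percent sign, which suggests it is a difference in percentage points of total cycles rather than a relative change. The Python sketch below works through that reading for a few perf-profile.self.cycles-pp rows of the futex4 table above. It is an interpretation, not something this report states, and it also assumes that nr_task=50% means half of the 104 hardware threads. All names in the code are illustrative.

```python
# Values copied from the futex4 comparison table above.
ONLINE_THREADS = 104  # "104 threads" test machine

# Assumption: nr_task=50% resolves to half of the hardware threads.
nr_task = ONLINE_THREADS * 50 // 100

# (base, head) share of cycles in percent, from perf-profile.self.cycles-pp.
self_cycles = {
    "syscall_exit_to_user_mode": (25.21, 17.17),
    "__entry_text_start": (11.39, 7.82),
    "syscall_return_via_sysret": (12.31, 20.06),
    "entry_SYSCALL_64_after_hwframe": (1.47, 11.58),
}


def point_delta(base_share, head_share):
    """Difference in percentage points, not a relative change."""
    return head_share - base_share


deltas = {
    sym: round(point_delta(b, h), 2) for sym, (b, h) in self_cycles.items()
}

print(f"nr_task resolves to {nr_task} processes")
for sym, delta in sorted(deltas.items(), key=lambda kv: kv[1]):
    print(f"{delta:+6.2f}  {sym}")
```

Because every row is a share of the same total, a symbol whose share shrinks did not necessarily get faster. It may only be a smaller fraction of a larger per-call cost, so the rows with growing shares are the more direct indicators of where the extra cycles go.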





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://01.org/lkp



Attachments:
(No filename) (20.70 kB)
config-5.19.0-rc4-00026-g6ad0ad2bf8a6 (166.51 kB)
job-script (7.75 kB)
job.yaml (5.28 kB)
reproduce (356.00 B)