2021-05-13 10:53:28

by kernel test robot

Subject: [x86/entry] fe950f6020: will-it-scale.per_thread_ops 5.2% improvement



Greetings,

FYI, we noticed a 5.2% improvement of will-it-scale.per_thread_ops due to commit:


commit: fe950f6020338c8ac668ef823bb692d36b7542a2 ("x86/entry: Enable random_kstack_offset support")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


in testcase: will-it-scale
on test machine: 88 threads 2 sockets Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz with 128G memory
with following parameters:

nr_task: 50%
mode: thread
test: lseek1
cpufreq_governor: performance
ucode: 0x5003006

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process-based and a thread-based version of each test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
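
For orientation, the kind of loop that an lseek microbenchmark measures can be sketched as below. This is only an illustrative Python approximation, not the will-it-scale testcase itself (which is written in C); the function name `lseek_ops` and the 0.2 second duration are assumptions made for this sketch, and interpreter overhead will dominate the numbers it prints.

```python
import os
import tempfile
import time


def lseek_ops(duration_s=0.2):
    """Count how many lseek(fd, 0, SEEK_SET) calls complete in duration_s seconds.

    Hypothetical helper for illustration; not part of will-it-scale or lkp-tests.
    """
    fd, path = tempfile.mkstemp()
    os.unlink(path)  # the open descriptor keeps the file usable after unlink
    ops = 0
    deadline = time.monotonic() + duration_s
    try:
        while time.monotonic() < deadline:
            os.lseek(fd, 0, os.SEEK_SET)  # one lseek syscall per iteration
            ops += 1
    finally:
        os.close(fd)
    return ops


if __name__ == "__main__":
    print("lseek calls completed:", lseek_ops())
```

Each iteration enters and leaves the kernel once, which is why a change to the syscall entry path can show up directly in this kind of per-task operation count.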

In addition, the commit also has a significant impact on the following tests:

+------------------+-------------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_thread_ops 5.5% improvement                        |
| test machine     | 88 threads 2 sockets Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz with 128G memory     |
| test parameters  | cpufreq_governor=performance                                                        |
|                  | mode=thread                                                                         |
|                  | nr_task=50%                                                                         |
|                  | test=lseek2                                                                         |
|                  | ucode=0x5003006                                                                     |
+------------------+-------------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops 7.8% improvement                       |
| test machine     | 88 threads 2 sockets Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz with 128G memory     |
| test parameters  | cpufreq_governor=performance                                                        |
|                  | mode=process                                                                        |
|                  | nr_task=50%                                                                         |
|                  | test=lseek2                                                                         |
|                  | ucode=0x5003006                                                                     |
+------------------+-------------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_thread_ops 5.5% improvement                        |
| test machine     | 88 threads 2 sockets Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz with 128G memory     |
| test parameters  | cpufreq_governor=performance                                                        |
|                  | mode=thread                                                                         |
|                  | nr_task=16                                                                          |
|                  | test=lseek2                                                                         |
|                  | ucode=0x5003006                                                                     |
+------------------+-------------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops -1.7% regression                       |
| test machine     | 192 threads 4 sockets Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory |
| test parameters  | cpufreq_governor=performance                                                        |
|                  | mode=process                                                                        |
|                  | nr_task=50%                                                                         |
|                  | test=dup1                                                                           |
|                  | ucode=0x5003006                                                                     |
+------------------+-------------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops 7.9% improvement                       |
| test machine     | 88 threads 2 sockets Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz with 128G memory     |
| test parameters  | cpufreq_governor=performance                                                        |
|                  | mode=process                                                                        |
|                  | nr_task=16                                                                          |
|                  | test=lseek2                                                                         |
|                  | ucode=0x5003006                                                                     |
+------------------+-------------------------------------------------------------------------------------+




Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
bin/lkp run generated-yaml-file

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/thread/50%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp9/lseek1/will-it-scale/0x5003006

commit:
39218ff4c6 ("stack: Optionally randomize kernel stack offset each syscall")
fe950f6020 ("x86/entry: Enable random_kstack_offset support")

39218ff4c625dbf2 fe950f6020338c8ac668ef823bb
---------------- ---------------------------
%stddev %change %stddev
\ | \
3.039e+08 +5.2% 3.197e+08 will-it-scale.44.threads
6907369 +5.2% 7265144 will-it-scale.per_thread_ops
3.039e+08 +5.2% 3.197e+08 will-it-scale.workload
5861 ± 28% +59.2% 9333 ± 32% proc-vmstat.numa_hint_faults_local
38898 ± 96% -70.1% 11622 ±190% numa-meminfo.node0.Active
38725 ± 96% -70.4% 11481 ±192% numa-meminfo.node0.Active(anon)
9681 ± 96% -70.4% 2865 ±192% numa-vmstat.node0.nr_active_anon
9681 ± 96% -70.4% 2865 ±192% numa-vmstat.node0.nr_zone_active_anon
35732 ± 11% -34.2% 23499 ± 18% softirqs.CPU57.SCHED
32110 ± 10% -40.0% 19252 ± 43% softirqs.CPU84.SCHED
223.17 ± 14% -27.0% 162.83 ± 18% interrupts.CPU13.RES:Rescheduling_interrupts
7858 -34.7% 5132 ± 37% interrupts.CPU15.NMI:Non-maskable_interrupts
7858 -34.7% 5132 ± 37% interrupts.CPU15.PMI:Performance_monitoring_interrupts
0.01 ± 24% -100.0% 0.00 perf-sched.sch_delay.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
0.02 ± 46% -45.0% 0.01 ± 15% perf-sched.sch_delay.max.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
0.03 ± 7% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
459.17 ± 12% -100.0% 0.00 perf-sched.wait_and_delay.count.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
0.57 ±115% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
0.03 ± 7% -100.0% 0.00 perf-sched.wait_time.avg.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
0.57 ±115% -100.0% 0.00 perf-sched.wait_time.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
3.728e+10 ± 9% +13.0% 4.213e+10 ± 2% perf-stat.i.dTLB-loads
2.474e+10 ± 9% +13.8% 2.816e+10 ± 2% perf-stat.i.dTLB-stores
1827406 -4.0% 1754013 perf-stat.i.iTLB-loads
1.244e+11 ± 9% +13.1% 1.407e+11 ± 2% perf-stat.i.instructions
1.02 ± 7% +11.6% 1.14 perf-stat.i.ipc
1026 ± 9% +13.1% 1160 ± 2% perf-stat.i.metric.M/sec
0.93 -6.5% 0.87 perf-stat.overall.cpi
0.00 ± 2% -0.0 0.00 perf-stat.overall.dTLB-store-miss-rate%
398.77 +1.6% 404.98 perf-stat.overall.instructions-per-iTLB-miss
1.08 +7.0% 1.15 perf-stat.overall.ipc
86.31 -2.1 84.17 perf-stat.overall.node-load-miss-rate%
131961 +1.7% 134154 perf-stat.overall.path-length
3.718e+10 ± 9% +13.0% 4.2e+10 ± 2% perf-stat.ps.dTLB-loads
2.467e+10 ± 9% +13.8% 2.807e+10 ± 2% perf-stat.ps.dTLB-stores
1821416 -4.0% 1748171 perf-stat.ps.iTLB-loads
1.241e+11 ± 9% +13.0% 1.402e+11 ± 2% perf-stat.ps.instructions
4.011e+13 +6.9% 4.288e+13 perf-stat.total.instructions
36.32 -4.9 31.44 ± 10% perf-profile.calltrace.cycles-pp.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_lseek64
46.21 -4.6 41.65 ± 10% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_lseek64
21.18 -4.2 16.96 ± 10% perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_lseek64
11.48 -3.9 7.59 ± 9% perf-profile.calltrace.cycles-pp.__fget_light.__fdget_pos.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe
6.88 -0.8 6.11 ± 9% perf-profile.calltrace.cycles-pp.__fget_files.__fget_light.__fdget_pos.ksys_lseek.do_syscall_64
2.44 -0.4 2.02 ± 10% perf-profile.calltrace.cycles-pp.shmem_file_llseek.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_lseek64
0.00 +1.5 1.46 ± 9% perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_lseek64
0.00 +2.8 2.79 ± 10% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_lseek64
36.85 -4.7 32.16 ± 10% perf-profile.children.cycles-pp.ksys_lseek
46.46 -4.4 42.09 ± 10% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
21.59 -4.2 17.41 ± 10% perf-profile.children.cycles-pp.__fdget_pos
11.85 -3.9 7.94 ± 9% perf-profile.children.cycles-pp.__fget_light
7.17 -0.7 6.46 ± 9% perf-profile.children.cycles-pp.__fget_files
2.44 -0.4 2.02 ± 10% perf-profile.children.cycles-pp.shmem_file_llseek
0.28 ± 3% +0.1 0.36 ± 12% perf-profile.children.cycles-pp.rcu_read_unlock_strict
0.30 ± 4% +0.7 0.99 ±148% perf-profile.children.cycles-pp.update_process_times
4.60 -3.2 1.39 ± 10% perf-profile.self.cycles-pp.__fget_light
6.80 -0.7 6.06 ± 9% perf-profile.self.cycles-pp.__fget_files
2.25 -0.5 1.78 ± 10% perf-profile.self.cycles-pp.shmem_file_llseek
1.26 -0.2 1.06 ± 10% perf-profile.self.cycles-pp.syscall_enter_from_user_mode
0.62 ± 4% -0.1 0.55 ± 10% perf-profile.self.cycles-pp.testcase
0.22 ± 2% -0.0 0.19 ± 9% perf-profile.self.cycles-pp.__x64_sys_lseek
0.14 ± 4% +0.0 0.18 ± 12% perf-profile.self.cycles-pp.rcu_read_unlock_strict
0.73 +0.6 1.30 ± 10% perf-profile.self.cycles-pp.do_syscall_64
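
As a reading aid for the comparison table above, the %change column can be recomputed from the two value columns. The sketch below assumes that rows shown with a percent sign are relative changes and that perf-profile rows (shown without a percent sign) are differences in percentage points; the helper names `pct_change` and `point_change` are made up for this illustration and are not part of lkp-tests.

```python
def pct_change(base, new):
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


def point_change(base_pct, new_pct):
    """Absolute difference between two percentages, in percentage points."""
    return new_pct - base_pct


# will-it-scale.per_thread_ops: 6907369 -> 7265144
print(f"{pct_change(6907369, 7265144):+.1f}%")

# perf-profile ...ksys_lseek calltrace share of cycles: 36.32 -> 31.44
print(f"{point_change(36.32, 31.44):+.1f}")
```

With the values taken from the table, both results round to the figures the report prints (+5.2% and -4.9).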



[ASCII plots omitted: will-it-scale.44.threads, will-it-scale.per_thread_ops, will-it-scale.workload, and four further plots without titles]

[*] bisect-good sample
[O] bisect-bad sample

***************************************************************************************************
lkp-csl-2sp9: 88 threads 2 sockets Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz with 128G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/thread/50%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp9/lseek2/will-it-scale/0x5003006

commit:
39218ff4c6 ("stack: Optionally randomize kernel stack offset each syscall")
fe950f6020 ("x86/entry: Enable random_kstack_offset support")

39218ff4c625dbf2 fe950f6020338c8ac668ef823bb
---------------- ---------------------------
%stddev %change %stddev
\ | \
3.043e+08 +5.5% 3.212e+08 will-it-scale.44.threads
6915852 +5.5% 7299206 will-it-scale.per_thread_ops
3.043e+08 +5.5% 3.212e+08 will-it-scale.workload
45.50 ±178% -99.3% 0.33 ±223% interrupts.109:PCI-MSI.31981642-edge.i40e-eth0-TxRx-73
1318 ± 17% -27.1% 961.00 ± 23% interrupts.CPU17.CAL:Function_call_interrupts
4441 ± 55% +56.4% 6947 ± 14% interrupts.CPU27.NMI:Non-maskable_interrupts
4441 ± 55% +56.4% 6947 ± 14% interrupts.CPU27.PMI:Performance_monitoring_interrupts
969.67 ± 21% +44.8% 1404 ± 15% interrupts.CPU52.CAL:Function_call_interrupts
1048 ± 17% +28.6% 1348 ± 9% interrupts.CPU70.CAL:Function_call_interrupts
45.00 ±179% -99.6% 0.17 ±223% interrupts.CPU73.109:PCI-MSI.31981642-edge.i40e-eth0-TxRx-73
4684 ± 47% +52.1% 7122 ± 14% interrupts.CPU86.NMI:Non-maskable_interrupts
4684 ± 47% +52.1% 7122 ± 14% interrupts.CPU86.PMI:Performance_monitoring_interrupts
0.02 ± 7% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
466.83 ± 6% -100.0% 0.00 perf-sched.wait_and_delay.count.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
0.12 ± 35% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
440.02 ±222% -99.7% 1.54 ± 14% perf-sched.wait_time.avg.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
440.03 ±222% -99.6% 1.55 ± 14% perf-sched.wait_time.avg.ms.do_syslog.part.0.kmsg_read.vfs_read
0.02 ± 7% -100.0% 0.00 perf-sched.wait_time.avg.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
1318 ±222% -99.8% 3.08 ± 14% perf-sched.wait_time.max.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
1318 ±222% -99.8% 3.10 ± 14% perf-sched.wait_time.max.ms.do_syslog.part.0.kmsg_read.vfs_read
0.12 ± 35% -100.0% 0.00 perf-sched.wait_time.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
13627 ± 8% +30.1% 17730 ± 20% softirqs.CPU2.RCU
13987 ± 8% +24.5% 17410 ± 15% softirqs.CPU20.RCU
11774 ± 11% +32.8% 15633 ± 22% softirqs.CPU3.RCU
12955 ± 8% +34.2% 17380 ± 13% softirqs.CPU30.RCU
13872 ± 3% +23.5% 17126 ± 14% softirqs.CPU31.RCU
14367 ± 6% +13.4% 16292 ± 9% softirqs.CPU36.RCU
12724 ± 10% +31.0% 16667 ± 5% softirqs.CPU37.RCU
12093 ± 8% +20.4% 14559 ± 10% softirqs.CPU63.RCU
12010 ± 9% +22.6% 14726 ± 9% softirqs.CPU67.RCU
21091 ± 32% +50.0% 31645 ± 20% softirqs.CPU74.SCHED
13024 ± 8% +28.1% 16685 ± 14% softirqs.CPU9.RCU
35.56 ± 2% -6.2 29.33 ± 9% perf-profile.calltrace.cycles-pp.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_lseek64
45.14 -6.2 38.95 ± 9% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_lseek64
20.70 ± 2% -4.7 15.96 ± 9% perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_lseek64
11.24 ± 2% -4.4 6.87 ± 9% perf-profile.calltrace.cycles-pp.__fget_light.__fdget_pos.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe
6.74 ± 2% -1.1 5.61 ± 9% perf-profile.calltrace.cycles-pp.__fget_files.__fget_light.__fdget_pos.ksys_lseek.do_syscall_64
2.41 ± 3% -0.7 1.71 ± 9% perf-profile.calltrace.cycles-pp.shmem_file_llseek.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_lseek64
0.00 +1.4 1.36 ± 8% perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_lseek64
0.00 +2.6 2.59 ± 9% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_lseek64
36.05 ± 2% -6.1 29.96 ± 9% perf-profile.children.cycles-pp.ksys_lseek
45.41 -6.0 39.36 ± 9% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
21.10 ± 2% -4.7 16.36 ± 9% perf-profile.children.cycles-pp.__fdget_pos
11.59 ± 2% -4.4 7.20 ± 9% perf-profile.children.cycles-pp.__fget_light
7.05 ± 2% -1.1 5.95 ± 9% perf-profile.children.cycles-pp.__fget_files
2.41 ± 3% -0.7 1.71 ± 9% perf-profile.children.cycles-pp.shmem_file_llseek
0.34 ± 4% -0.0 0.30 ± 8% perf-profile.children.cycles-pp.update_process_times
4.45 ± 2% -3.3 1.18 ± 10% perf-profile.self.cycles-pp.__fget_light
6.67 ± 2% -1.1 5.57 ± 9% perf-profile.self.cycles-pp.__fget_files
2.21 ± 3% -0.7 1.51 ± 8% perf-profile.self.cycles-pp.shmem_file_llseek
1.26 -0.2 1.02 ± 8% perf-profile.self.cycles-pp.syscall_enter_from_user_mode
0.71 +0.5 1.24 ± 9% perf-profile.self.cycles-pp.do_syscall_64
3.05e+10 +5.8% 3.227e+10 perf-stat.i.branch-instructions
3.079e+08 +4.5% 3.218e+08 perf-stat.i.branch-misses
3.998e+10 +6.1% 4.244e+10 perf-stat.i.dTLB-loads
44220 +6.5% 47109 ± 3% perf-stat.i.dTLB-store-misses
2.649e+10 +7.2% 2.839e+10 perf-stat.i.dTLB-stores
2.976e+08 +3.9% 3.093e+08 perf-stat.i.iTLB-load-misses
1824718 -4.2% 1748236 perf-stat.i.iTLB-loads
1.334e+11 +6.4% 1.419e+11 perf-stat.i.instructions
448.74 +2.4% 459.53 perf-stat.i.instructions-per-iTLB-miss
1.08 +6.7% 1.15 perf-stat.i.ipc
1101 +6.3% 1171 perf-stat.i.metric.M/sec
1.01 -0.0 1.00 perf-stat.overall.branch-miss-rate%
0.93 -6.7% 0.86 perf-stat.overall.cpi
448.14 +2.4% 458.89 perf-stat.overall.instructions-per-iTLB-miss
1.08 +7.2% 1.16 perf-stat.overall.ipc
131961 +1.7% 134156 perf-stat.overall.path-length
3.04e+10 +5.8% 3.216e+10 perf-stat.ps.branch-instructions
3.069e+08 +4.5% 3.208e+08 perf-stat.ps.branch-misses
3.985e+10 +6.1% 4.23e+10 perf-stat.ps.dTLB-loads
44075 +6.6% 46980 ± 3% perf-stat.ps.dTLB-store-misses
2.64e+10 +7.2% 2.829e+10 perf-stat.ps.dTLB-stores
2.967e+08 +3.9% 3.083e+08 perf-stat.ps.iTLB-load-misses
1818554 -4.2% 1742364 perf-stat.ps.iTLB-loads
1.329e+11 +6.4% 1.415e+11 perf-stat.ps.instructions
4.016e+13 +7.3% 4.309e+13 perf-stat.total.instructions
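
Two of the derived perf-stat rows above can be cross-checked against the raw rows. The sketch below assumes that `cpi` is the reciprocal of `ipc` and that `path-length` is total instructions divided by completed workload operations; the second relationship is inferred from the numbers in this table, not taken from lkp-tests documentation, and the function names are illustrative only.

```python
def cpi_from_ipc(ipc):
    """Cycles per instruction as the reciprocal of instructions per cycle."""
    return 1.0 / ipc


def path_length(total_instructions, workload_ops):
    """Instructions retired per completed operation (assumed definition)."""
    return total_instructions / workload_ops


# Parent commit 39218ff4c6, lseek2 thread mode, values from the table above.
print(round(cpi_from_ipc(1.08), 2))           # table reports overall.cpi 0.93
print(round(path_length(4.016e13, 3.043e8)))  # table reports path-length 131961

# Commit fe950f6020.
print(round(cpi_from_ipc(1.16), 2))           # table reports overall.cpi 0.86
print(round(path_length(4.309e13, 3.212e8)))  # table reports path-length 134156
```

Because the inputs are already rounded in the report, the recomputed path lengths only agree with the printed ones to within a fraction of a percent.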



***************************************************************************************************
lkp-csl-2sp9: 88 threads 2 sockets Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz with 128G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/process/50%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp9/lseek2/will-it-scale/0x5003006

commit:
39218ff4c6 ("stack: Optionally randomize kernel stack offset each syscall")
fe950f6020 ("x86/entry: Enable random_kstack_offset support")

39218ff4c625dbf2 fe950f6020338c8ac668ef823bb
---------------- ---------------------------
%stddev %change %stddev
\ | \
4.093e+08 +7.8% 4.41e+08 will-it-scale.44.processes
9301344 +7.8% 10022385 will-it-scale.per_process_ops
4.093e+08 +7.8% 4.41e+08 will-it-scale.workload
126.29 ± 32% +77.8% 224.57 ± 22% interrupts.CPU0.RES:Rescheduling_interrupts
15.17 +1.5 16.68 mpstat.cpu.all.usr%
262.82 +2.5% 269.31 turbostat.PkgWatt
15.00 +6.7% 16.00 vmstat.cpu.us
15940 ± 11% +36.9% 21829 ± 23% softirqs.CPU20.RCU
14633 ± 13% +36.8% 20013 ± 17% softirqs.CPU39.RCU
17267 ± 13% +27.1% 21954 ± 19% softirqs.CPU43.RCU
0.01 ± 16% -100.0% 0.00 perf-sched.sch_delay.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
0.03 ± 10% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
529.73 ± 10% -21.5% 415.67 ± 19% perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_sys_poll
567.00 ± 5% -100.0% 0.00 perf-sched.wait_and_delay.count.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
0.57 ±179% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
0.03 ± 10% -100.0% 0.00 perf-sched.wait_time.avg.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
529.73 ± 10% -21.5% 415.67 ± 19% perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_sys_poll
0.57 ±179% -100.0% 0.00 perf-sched.wait_time.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
12.09 -6.8 5.29 ± 12% perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
11.05 -6.8 4.29 ± 12% perf-profile.calltrace.cycles-pp.__fget_light.__fdget_pos.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe
21.28 -6.4 14.86 ± 12% perf-profile.calltrace.cycles-pp.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
35.29 -5.3 30.00 ± 12% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.llseek
1.41 ± 2% -0.3 1.09 ± 14% perf-profile.calltrace.cycles-pp.testcase
0.00 +1.1 1.09 ± 13% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
0.00 +3.1 3.10 ± 11% perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
0.00 +5.2 5.19 ± 11% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
12.69 -6.9 5.83 ± 12% perf-profile.children.cycles-pp.__fdget_pos
11.06 -6.8 4.29 ± 12% perf-profile.children.cycles-pp.__fget_light
21.71 -6.5 15.25 ± 12% perf-profile.children.cycles-pp.ksys_lseek
35.62 -5.1 30.57 ± 12% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
1.54 ± 2% -0.4 1.18 ± 13% perf-profile.children.cycles-pp.testcase
1.36 -0.3 1.09 ± 13% perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
0.70 ± 2% -0.2 0.48 ± 13% perf-profile.children.cycles-pp.rcu_nocb_flush_deferred_wakeup
2.49 ± 2% +0.8 3.30 ± 11% perf-profile.children.cycles-pp.exit_to_user_mode_prepare
10.62 -6.7 3.93 ± 12% perf-profile.self.cycles-pp.__fget_light
1.23 ± 2% -0.3 0.89 ± 13% perf-profile.self.cycles-pp.testcase
1.35 -0.3 1.08 ± 12% perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
1.72 -0.3 1.45 ± 12% perf-profile.self.cycles-pp.syscall_enter_from_user_mode
0.45 ± 4% -0.1 0.37 ± 13% perf-profile.self.cycles-pp.rcu_nocb_flush_deferred_wakeup
1.79 ± 2% +0.9 2.69 ± 10% perf-profile.self.cycles-pp.exit_to_user_mode_prepare
1.29 +1.0 2.32 ± 12% perf-profile.self.cycles-pp.do_syscall_64
2.753e+10 +9.2% 3.007e+10 perf-stat.i.branch-instructions
1.50 -0.0 1.48 perf-stat.i.branch-miss-rate%
4.131e+08 +7.6% 4.446e+08 perf-stat.i.branch-misses
0.94 -9.1% 0.86 perf-stat.i.cpi
3.984e+10 +9.8% 4.376e+10 perf-stat.i.dTLB-loads
61463 +9.2% 67090 perf-stat.i.dTLB-store-misses
2.621e+10 +11.1% 2.912e+10 perf-stat.i.dTLB-stores
3.961e+08 +7.4% 4.255e+08 perf-stat.i.iTLB-load-misses
1836428 -3.3% 1776535 perf-stat.i.iTLB-loads
1.311e+11 +10.0% 1.442e+11 perf-stat.i.instructions
333.07 +3.4% 344.51 perf-stat.i.instructions-per-iTLB-miss
1.06 +10.0% 1.17 perf-stat.i.ipc
1063 +10.0% 1169 perf-stat.i.metric.M/sec
1.50 -0.0 1.48 perf-stat.overall.branch-miss-rate%
0.94 -9.1% 0.86 perf-stat.overall.cpi
331.01 +2.4% 338.86 perf-stat.overall.instructions-per-iTLB-miss
1.06 +10.0% 1.17 perf-stat.overall.ipc
96478 +2.1% 98513 perf-stat.overall.path-length
2.744e+10 +9.2% 2.997e+10 perf-stat.ps.branch-instructions
4.117e+08 +7.6% 4.432e+08 perf-stat.ps.branch-misses
3.971e+10 +9.8% 4.361e+10 perf-stat.ps.dTLB-loads
61264 +9.2% 66871 perf-stat.ps.dTLB-store-misses
2.613e+10 +11.1% 2.902e+10 perf-stat.ps.dTLB-stores
3.947e+08 +7.4% 4.241e+08 perf-stat.ps.iTLB-load-misses
1830250 -3.3% 1770517 perf-stat.ps.iTLB-loads
1.307e+11 +10.0% 1.437e+11 perf-stat.ps.instructions
3.948e+13 +10.0% 4.344e+13 perf-stat.total.instructions



***************************************************************************************************
lkp-csl-2sp9: 88 threads 2 sockets Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz with 128G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/thread/16/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp9/lseek2/will-it-scale/0x5003006

commit:
39218ff4c6 ("stack: Optionally randomize kernel stack offset each syscall")
fe950f6020 ("x86/entry: Enable random_kstack_offset support")

39218ff4c625dbf2 fe950f6020338c8ac668ef823bb
---------------- ---------------------------
%stddev %change %stddev
\ | \
1.107e+08 +5.5% 1.168e+08 will-it-scale.16.threads
6920338 +5.5% 7300711 will-it-scale.per_thread_ops
1.107e+08 +5.5% 1.168e+08 will-it-scale.workload
2.421e+08 ±158% -99.1% 2058964 ± 5% cpuidle.C1.time
166.58 +1.3% 168.78 turbostat.PkgWatt
1510 ± 10% +12.7% 1703 ± 4% slabinfo.khugepaged_mm_slot.active_objs
1510 ± 10% +12.7% 1703 ± 4% slabinfo.khugepaged_mm_slot.num_objs
41752 +14.6% 47840 ± 12% softirqs.CPU23.SCHED
23632 ± 9% +23.9% 29282 ± 8% softirqs.CPU47.SCHED
24724 ± 17% +26.0% 31150 ± 7% softirqs.CPU52.SCHED
19277 ± 17% -26.8% 14106 ± 21% softirqs.CPU8.SCHED
10.53 ± 10% -3.6 6.97 ± 12% perf-profile.calltrace.cycles-pp.__fget_light.__fdget_pos.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.00 +1.4 1.39 ± 10% perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_lseek64
0.00 +2.6 2.63 ± 10% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_lseek64
10.87 ± 10% -3.6 7.32 ± 12% perf-profile.children.cycles-pp.__fget_light
0.17 ± 17% -0.1 0.11 ± 18% perf-profile.children.cycles-pp.clockevents_program_event
0.10 ± 12% -0.1 0.05 ± 73% perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
4.17 ± 10% -3.0 1.19 ± 11% perf-profile.self.cycles-pp.__fget_light
0.68 ± 11% +0.6 1.23 ± 9% perf-profile.self.cycles-pp.do_syscall_64
0.02 ± 54% -84.2% 0.00 ± 85% perf-sched.sch_delay.avg.ms.schedule_timeout.wait_for_completion.__flush_work.lru_add_drain_all
0.02 ±132% -100.0% 0.00 perf-sched.sch_delay.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
0.03 ± 43% -61.2% 0.01 ± 40% perf-sched.sch_delay.max.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_sys_poll
0.03 ± 64% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
111.83 ± 32% -100.0% 0.00 perf-sched.wait_and_delay.count.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
0.87 ±209% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
0.03 ± 65% -100.0% 0.00 perf-sched.wait_time.avg.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
0.86 ±211% -100.0% 0.00 perf-sched.wait_time.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
169.67 ± 27% +105.3% 348.33 ± 22% interrupts.CPU3.TLB:TLB_shootdowns
146.67 ± 22% -51.2% 71.50 ± 26% interrupts.CPU37.NMI:Non-maskable_interrupts
146.67 ± 22% -51.2% 71.50 ± 26% interrupts.CPU37.PMI:Performance_monitoring_interrupts
130.17 ± 29% -44.4% 72.33 ± 30% interrupts.CPU38.NMI:Non-maskable_interrupts
130.17 ± 29% -44.4% 72.33 ± 30% interrupts.CPU38.PMI:Performance_monitoring_interrupts
155.17 ± 30% +114.4% 332.67 ± 31% interrupts.CPU4.TLB:TLB_shootdowns
42.67 ± 92% -83.6% 7.00 ± 82% interrupts.CPU48.RES:Rescheduling_interrupts
53.17 ± 95% -88.7% 6.00 ± 36% interrupts.CPU52.RES:Rescheduling_interrupts
80.50 ±210% +1302.5% 1129 ±162% interrupts.CPU62.RES:Rescheduling_interrupts
138.00 ± 22% -47.9% 71.83 ± 53% interrupts.CPU70.NMI:Non-maskable_interrupts
138.00 ± 22% -47.9% 71.83 ± 53% interrupts.CPU70.PMI:Performance_monitoring_interrupts
117.83 ± 24% -45.8% 63.83 ± 30% interrupts.CPU80.NMI:Non-maskable_interrupts
117.83 ± 24% -45.8% 63.83 ± 30% interrupts.CPU80.PMI:Performance_monitoring_interrupts
120.00 ± 26% -46.2% 64.50 ± 28% interrupts.CPU81.NMI:Non-maskable_interrupts
120.00 ± 26% -46.2% 64.50 ± 28% interrupts.CPU81.PMI:Performance_monitoring_interrupts
1.12e+10 +6.6% 1.194e+10 perf-stat.i.branch-instructions
1.08 ± 2% -0.1 1.02 ± 2% perf-stat.i.branch-miss-rate%
0.96 -8.0% 0.89 perf-stat.i.cpi
1.467e+10 +6.8% 1.567e+10 perf-stat.i.dTLB-loads
9.704e+09 +8.0% 1.048e+10 perf-stat.i.dTLB-stores
1.085e+08 +5.2% 1.141e+08 perf-stat.i.iTLB-load-misses
4.903e+10 +7.2% 5.257e+10 perf-stat.i.instructions
453.23 +1.8% 461.58 perf-stat.i.instructions-per-iTLB-miss
1.04 +8.3% 1.13 perf-stat.i.ipc
404.23 +7.1% 432.74 perf-stat.i.metric.M/sec
1.07 ± 2% -0.1 1.02 ± 2% perf-stat.overall.branch-miss-rate%
0.96 -7.6% 0.89 perf-stat.overall.cpi
451.97 +1.9% 460.75 perf-stat.overall.instructions-per-iTLB-miss
1.04 +8.2% 1.13 perf-stat.overall.ipc
133400 +1.6% 135521 perf-stat.overall.path-length
1.116e+10 +6.6% 1.19e+10 perf-stat.ps.branch-instructions
1.462e+10 +6.8% 1.561e+10 perf-stat.ps.dTLB-loads
9.672e+09 +8.0% 1.044e+10 perf-stat.ps.dTLB-stores
1.081e+08 +5.2% 1.137e+08 perf-stat.ps.iTLB-load-misses
4.886e+10 +7.2% 5.239e+10 perf-stat.ps.instructions
1.477e+13 +7.2% 1.583e+13 perf-stat.total.instructions
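
The per-thread figure in this table appears to be the total workload divided by the number of tasks (16 here). A minimal check under that assumption follows; `per_task_ops` is a name chosen for this sketch, not an lkp-tests function.

```python
def per_task_ops(workload_ops, nr_task):
    """Average operations per task, assuming workload is the sum over all tasks."""
    return workload_ops / nr_task


# Parent commit: workload 1.107e+08 across 16 threads (table reports 6920338).
print(round(per_task_ops(1.107e8, 16)))

# Commit fe950f6020: workload 1.168e+08 across 16 threads (table reports 7300711).
print(round(per_task_ops(1.168e8, 16)))
```

The small gap between the recomputed and reported values is consistent with the workload column being printed to four significant digits.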



***************************************************************************************************
lkp-csl-2ap2: 192 threads 4 sockets Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/process/50%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2ap2/dup1/will-it-scale/0x5003006

commit:
39218ff4c6 ("stack: Optionally randomize kernel stack offset each syscall")
fe950f6020 ("x86/entry: Enable random_kstack_offset support")

39218ff4c625dbf2 fe950f6020338c8ac668ef823bb
---------------- ---------------------------
%stddev %change %stddev
\ | \
4.653e+08 -1.7% 4.575e+08 will-it-scale.96.processes
4846984 -1.7% 4765791 will-it-scale.per_process_ops
4.653e+08 -1.7% 4.575e+08 will-it-scale.workload
0.84 ± 2% +0.2 1.04 ± 7% mpstat.cpu.all.irq%
50.00 -2.0% 49.00 vmstat.cpu.id
125.33 ± 57% -69.8% 37.83 ± 83% interrupts.CPU164.RES:Rescheduling_interrupts
7763 ± 11% -33.1% 5197 ± 48% interrupts.CPU177.NMI:Non-maskable_interrupts
7763 ± 11% -33.1% 5197 ± 48% interrupts.CPU177.PMI:Performance_monitoring_interrupts
7750 ± 18% -58.8% 3194 ± 37% interrupts.CPU92.NMI:Non-maskable_interrupts
7750 ± 18% -58.8% 3194 ± 37% interrupts.CPU92.PMI:Performance_monitoring_interrupts
14208 ± 14% +18.5% 16830 ± 8% softirqs.CPU15.RCU
25356 ± 36% +46.5% 37157 ± 7% softirqs.CPU164.SCHED
11031 ± 14% +31.3% 14481 ± 14% softirqs.CPU169.RCU
12602 ± 8% +17.5% 14804 ± 6% softirqs.CPU183.RCU
28486 ± 25% -44.9% 15685 ± 48% softirqs.CPU24.SCHED
15237 ± 12% +23.6% 18837 ± 7% softirqs.CPU68.RCU
0.01 ± 9% -16.7% 0.01 ± 11% perf-sched.sch_delay.avg.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
0.01 ± 19% +37.3% 0.01 ± 16% perf-sched.sch_delay.avg.ms.futex_wait_queue_me.futex_wait.do_futex.__x64_sys_futex
0.01 ± 23% -100.0% 0.00 perf-sched.sch_delay.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
0.01 ± 19% +37.3% 0.01 ± 16% perf-sched.sch_delay.max.ms.futex_wait_queue_me.futex_wait.do_futex.__x64_sys_futex
2.10 ±139% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
1188 ± 8% -100.0% 0.00 perf-sched.wait_and_delay.count.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
2621 ±141% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
2.10 ±139% -100.0% 0.00 perf-sched.wait_time.avg.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
2621 ±141% -100.0% 0.00 perf-sched.wait_time.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
0.03 ± 2% +7.3% 0.03 ± 5% perf-stat.i.MPKI
5.915e+08 -1.3% 5.835e+08 perf-stat.i.branch-misses
12.12 ± 2% +0.9 13.00 ± 4% perf-stat.i.cache-miss-rate%
1200067 ± 3% +13.7% 1364854 ± 4% perf-stat.i.cache-misses
9856054 +6.4% 10489701 perf-stat.i.cache-references
281116 ± 2% -12.7% 245501 ± 4% perf-stat.i.cycles-between-cache-misses
4298048 -4.2% 4118045 perf-stat.i.iTLB-loads
74.85 +3.4% 77.39 perf-stat.i.metric.K/sec
0.03 +5.5% 0.03 perf-stat.overall.MPKI
241364 ± 3% -12.1% 212085 ± 4% perf-stat.overall.cycles-between-cache-misses
233746 +1.8% 238022 perf-stat.overall.path-length
5.897e+08 -1.4% 5.817e+08 perf-stat.ps.branch-misses
1222801 ± 3% +13.7% 1390521 ± 4% perf-stat.ps.cache-misses
10040040 +5.7% 10608298 perf-stat.ps.cache-references
4283511 -4.2% 4104019 perf-stat.ps.iTLB-loads
0.93 ± 9% +0.2 1.14 ± 4% perf-profile.calltrace.cycles-pp.fd_install.__x64_sys_dup.do_syscall_64.entry_SYSCALL_64_after_hwframe.dup
0.00 +0.9 0.94 ± 5% perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.dup
0.00 +1.0 0.98 ± 4% perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
0.00 +1.8 1.75 ± 4% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.dup
0.00 +1.8 1.82 ± 4% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
11.57 ± 8% +2.4 13.96 ± 4% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
0.14 ± 9% +0.0 0.15 ± 3% perf-profile.children.cycles-pp.perf_trace_sched_stat_runtime
0.12 ± 9% +0.0 0.14 ± 5% perf-profile.children.cycles-pp.perf_prepare_sample
0.14 ± 10% +0.0 0.17 ± 3% perf-profile.children.cycles-pp.update_curr
0.14 ± 9% +0.0 0.16 ± 6% perf-profile.children.cycles-pp.perf_tp_event
0.38 ± 10% +0.1 0.45 ± 3% perf-profile.children.cycles-pp.close@plt
0.31 ± 11% +0.1 0.45 ± 40% perf-profile.children.cycles-pp.update_process_times
0.31 ± 12% +0.1 0.46 ± 41% perf-profile.children.cycles-pp.tick_sched_handle
0.32 ± 10% +0.2 0.49 ± 50% perf-profile.children.cycles-pp.tick_sched_timer
0.35 ± 10% +0.2 0.53 ± 49% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.58 ± 11% +1.6 2.13 ±154% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
28.26 ± 8% +5.5 33.75 ± 4% perf-profile.children.cycles-pp.do_syscall_64
0.92 ± 9% +0.2 1.09 ± 4% perf-profile.self.cycles-pp.fd_install
0.89 ± 9% +0.7 1.62 ± 4% perf-profile.self.cycles-pp.do_syscall_64
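
For readers checking the table above: the %change column appears to be reported in two ways. Count-like metrics show a relative change in percent, while metrics that are already percentages (the perf-profile cycles-pp rows, mpstat.cpu.all.irq%) show an absolute difference in percentage points. The Python sketch below reproduces one row of each kind. The helper names are hypothetical and are not part of the LKP tooling; the input numbers are copied from the table.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


def point_change(base: float, new: float) -> float:
    """Absolute difference, for metrics that are already percentages."""
    return new - base


# will-it-scale.per_process_ops: parent commit vs. fe950f6020
print(f"per_process_ops: {pct_change(4846984, 4765791):+.1f}%")

# perf-profile.children.cycles-pp.do_syscall_64 (percentage points)
print(f"do_syscall_64:   {point_change(28.26, 33.75):+.1f}")
```

Both values should agree with the -1.7% and +5.5 printed in the corresponding rows.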



***************************************************************************************************
lkp-csl-2sp9: 88 threads 2 sockets Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz with 128G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/process/16/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp9/lseek2/will-it-scale/0x5003006

commit:
39218ff4c6 ("stack: Optionally randomize kernel stack offset each syscall")
fe950f6020 ("x86/entry: Enable random_kstack_offset support")

39218ff4c625dbf2 fe950f6020338c8ac668ef823bb
---------------- ---------------------------
%stddev %change %stddev
\ | \
1.489e+08 +7.9% 1.606e+08 will-it-scale.16.processes
9303247 +7.9% 10038079 will-it-scale.per_process_ops
1.489e+08 +7.9% 1.606e+08 will-it-scale.workload
164.84 +1.4% 167.21 turbostat.PkgWatt
1.163e+10 ± 63% -99.6% 41700541 ±179% cpuidle.C6.time
14620637 ± 59% -99.6% 65713 ±156% cpuidle.C6.usage
40.08 ± 49% +107.9% 83.33 ± 14% sched_debug.cfs_rq:/.removed.runnable_avg.max
5.85 ± 54% +91.4% 11.19 ± 20% sched_debug.cfs_rq:/.removed.runnable_avg.stddev
40.08 ± 49% +107.9% 83.33 ± 14% sched_debug.cfs_rq:/.removed.util_avg.max
5.85 ± 54% +83.6% 10.74 ± 20% sched_debug.cfs_rq:/.removed.util_avg.stddev
99973 ± 53% +97.4% 197369 ± 33% interrupts.CAL:Function_call_interrupts
488.17 ± 6% +24.7% 608.67 ± 15% interrupts.CPU22.CAL:Function_call_interrupts
14.00 ± 87% +432.1% 74.50 ± 32% interrupts.CPU22.RES:Rescheduling_interrupts
218.17 ±107% -72.5% 60.00 ± 23% interrupts.CPU35.NMI:Non-maskable_interrupts
218.17 ±107% -72.5% 60.00 ± 23% interrupts.CPU35.PMI:Performance_monitoring_interrupts
123.17 ± 42% -43.0% 70.17 ± 30% interrupts.CPU37.NMI:Non-maskable_interrupts
123.17 ± 42% -43.0% 70.17 ± 30% interrupts.CPU37.PMI:Performance_monitoring_interrupts
2.67 ± 91% +756.2% 22.83 ±161% interrupts.CPU38.TLB:TLB_shootdowns
3.50 ± 69% +1423.8% 53.33 ±184% interrupts.CPU67.TLB:TLB_shootdowns
34.00 ± 14% +163.7% 89.67 ±111% interrupts.CPU8.RES:Rescheduling_interrupts
113.50 ± 38% -39.5% 68.67 ± 8% interrupts.CPU85.NMI:Non-maskable_interrupts
113.50 ± 38% -39.5% 68.67 ± 8% interrupts.CPU85.PMI:Performance_monitoring_interrupts
330.50 ± 54% +131.4% 764.67 ± 30% interrupts.TLB:TLB_shootdowns
0.01 ± 5% -100.0% 0.00 perf-sched.sch_delay.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
0.03 ±128% -70.0% 0.01 ± 18% perf-sched.sch_delay.max.ms.wait_for_partner.fifo_open.do_dentry_open.do_open.isra
0.03 ± 7% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
530.40 ± 14% +59.3% 845.16 ± 13% perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
186.83 ± 10% -100.0% 0.00 perf-sched.wait_and_delay.count.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
0.33 ±159% -89.4% 0.04 ± 44% perf-sched.wait_and_delay.max.ms.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
0.15 ± 61% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
0.03 ±128% -70.0% 0.01 ± 18% perf-sched.wait_and_delay.max.ms.wait_for_partner.fifo_open.do_dentry_open.do_open.isra
0.03 ± 7% -100.0% 0.00 perf-sched.wait_time.avg.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
2.55 ± 5% +15.4% 2.95 ± 12% perf-sched.wait_time.avg.ms.rcu_gp_kthread.kthread.ret_from_fork
530.39 ± 14% +59.3% 845.15 ± 13% perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
0.32 ±157% -87.1% 0.04 ± 5% perf-sched.wait_time.max.ms.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
0.15 ± 61% -100.0% 0.00 perf-sched.wait_time.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
10.84 ± 11% -6.1 4.69 ± 10% perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
9.92 ± 11% -6.1 3.81 ± 10% perf-profile.calltrace.cycles-pp.__fget_light.__fdget_pos.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe
19.15 ± 11% -6.0 13.17 ± 10% perf-profile.calltrace.cycles-pp.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
1.28 ± 15% -0.3 0.96 ± 9% perf-profile.calltrace.cycles-pp.testcase
0.77 ± 11% -0.2 0.61 ± 11% perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.do_idle
0.88 ± 9% -0.1 0.74 ± 11% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry
0.00 +1.0 0.95 ± 9% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
0.00 +2.7 2.72 ± 9% perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
0.00 +4.6 4.58 ± 9% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
11.38 ± 11% -6.2 5.17 ± 10% perf-profile.children.cycles-pp.__fdget_pos
9.93 ± 11% -6.1 3.81 ± 10% perf-profile.children.cycles-pp.__fget_light
19.53 ± 11% -6.0 13.53 ± 10% perf-profile.children.cycles-pp.ksys_lseek
1.39 ± 14% -0.4 1.04 ± 9% perf-profile.children.cycles-pp.testcase
1.22 ± 13% -0.3 0.95 ± 9% perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
0.63 ± 12% -0.2 0.42 ± 11% perf-profile.children.cycles-pp.rcu_nocb_flush_deferred_wakeup
2.20 ± 11% +0.7 2.89 ± 10% perf-profile.children.cycles-pp.exit_to_user_mode_prepare
9.54 ± 11% -6.0 3.49 ± 10% perf-profile.self.cycles-pp.__fget_light
1.12 ± 14% -0.3 0.78 ± 9% perf-profile.self.cycles-pp.testcase
1.22 ± 13% -0.3 0.94 ± 9% perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
1.53 ± 11% -0.3 1.27 ± 10% perf-profile.self.cycles-pp.syscall_enter_from_user_mode
0.40 ± 10% -0.1 0.32 ± 9% perf-profile.self.cycles-pp.rcu_nocb_flush_deferred_wakeup
1.58 ± 12% +0.8 2.36 ± 9% perf-profile.self.cycles-pp.exit_to_user_mode_prepare
1.16 ± 12% +0.9 2.04 ± 10% perf-profile.self.cycles-pp.do_syscall_64
0.53 ± 50% -72.9% 0.14 ± 3% perf-stat.i.MPKI
1.013e+10 +9.2% 1.107e+10 perf-stat.i.branch-instructions
1.55 -0.1 1.49 perf-stat.i.branch-miss-rate%
1.57e+08 +4.6% 1.643e+08 perf-stat.i.branch-misses
1167685 ± 23% -35.0% 758467 ± 3% perf-stat.i.cache-misses
0.97 -10.3% 0.87 perf-stat.i.cpi
46007 ± 30% +46.5% 67405 ± 3% perf-stat.i.cycles-between-cache-misses
0.01 ± 67% -0.0 0.00 ± 6% perf-stat.i.dTLB-load-miss-rate%
1.463e+10 +10.0% 1.609e+10 perf-stat.i.dTLB-loads
0.00 ± 52% -0.0 0.00 ± 2% perf-stat.i.dTLB-store-miss-rate%
9.612e+09 +11.1% 1.068e+10 perf-stat.i.dTLB-stores
1.445e+08 +7.8% 1.558e+08 perf-stat.i.iTLB-load-misses
4.825e+10 +10.0% 5.307e+10 perf-stat.i.instructions
338.28 +2.2% 345.61 perf-stat.i.instructions-per-iTLB-miss
1.03 +11.5% 1.15 perf-stat.i.ipc
1.18 ± 3% -13.8% 1.01 ± 8% perf-stat.i.metric.K/sec
390.93 +10.0% 430.08 perf-stat.i.metric.M/sec
0.53 ± 51% -73.6% 0.14 ± 3% perf-stat.overall.MPKI
1.55 -0.1 1.48 perf-stat.overall.branch-miss-rate%
0.97 -10.3% 0.87 perf-stat.overall.cpi
43056 ± 27% +42.1% 61177 ± 3% perf-stat.overall.cycles-between-cache-misses
0.01 ± 68% -0.0 0.00 ± 6% perf-stat.overall.dTLB-load-miss-rate%
0.00 ± 54% -0.0 0.00 perf-stat.overall.dTLB-store-miss-rate%
333.83 +2.0% 340.60 perf-stat.overall.instructions-per-iTLB-miss
1.03 +11.5% 1.15 perf-stat.overall.ipc
97642 +1.8% 99442 perf-stat.overall.path-length
1.01e+10 +9.2% 1.103e+10 perf-stat.ps.branch-instructions
1.565e+08 +4.6% 1.637e+08 perf-stat.ps.branch-misses
1163211 ± 23% -35.1% 755495 ± 3% perf-stat.ps.cache-misses
1.458e+10 +10.0% 1.604e+10 perf-stat.ps.dTLB-loads
9.58e+09 +11.1% 1.064e+10 perf-stat.ps.dTLB-stores
1.441e+08 +7.8% 1.553e+08 perf-stat.ps.iTLB-load-misses
4.809e+10 +10.0% 5.289e+10 perf-stat.ps.instructions
1.453e+13 +9.9% 1.597e+13 perf-stat.total.instructions
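
A few of the perf-stat rows in the lseek2 table can be cross-checked against each other. The Python sketch below does that under two assumptions. IPC is the reciprocal of CPI, which follows from the definitions. The second relation is only an inference from the numbers and is not stated in the report: perf-stat.overall.path-length looks like perf-stat.total.instructions divided by will-it-scale.workload. The function names are hypothetical; the inputs are the rounded values printed in the table.

```python
def ipc_from_cpi(cpi: float) -> float:
    """Instructions per cycle is the reciprocal of cycles per instruction."""
    return 1.0 / cpi


def path_length(total_instructions: float, workload_ops: float) -> float:
    """Assumed definition: instructions retired per benchmark operation."""
    return total_instructions / workload_ops


# (commit, cpi, perf-stat.total.instructions, will-it-scale.workload)
rows = [
    ("39218ff4c6", 0.97, 1.453e13, 1.489e8),
    ("fe950f6020", 0.87, 1.597e13, 1.606e8),
]

for commit, cpi, instructions, ops in rows:
    print(
        commit,
        f"ipc={ipc_from_cpi(cpi):.2f}",
        f"path_length={path_length(instructions, ops):.0f}",
    )
```

With the rounded inputs the IPC values match the reported 1.03 and 1.15, and the computed path lengths land within about 0.1% of the reported 97642 and 99442. That is consistent with the +7.9% throughput gain coming from higher IPC despite slightly more instructions per operation.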





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/[email protected] Intel Corporation

Thanks,
Oliver Sang


Attachments:
(No filename) (63.00 kB)
config-5.12.0-rc6-00004-gfe950f602033 (175.63 kB)
job-script (7.99 kB)
job.yaml (5.50 kB)
reproduce (347.00 B)