2019-03-13 05:43:35

by Chen, Rong A

Subject: [LKP] [x86, retpolines] ce02ef06fc: will-it-scale.per_thread_ops 3.1% improvement

Greetings,

FYI, we noticed a 3.1% improvement in will-it-scale.per_thread_ops due to the following commit:


commit: ce02ef06fcf7a399a6276adb83f37373d10cbbe1 ("x86, retpolines: Raise limit for generating indirect calls from switch-case")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: will-it-scale
on test machine: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory
with the following parameters:

nr_task: 100%
mode: thread
test: futex3
cpufreq_governor: performance
ucode: 0xb00002e

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process-based and a thread-based version of each test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale

In addition, the commit also has a significant impact on the following tests:

+------------------+---------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops 4.3% improvement |
| test machine | 112 threads Intel(R) Xeon(R) Platinum 8170 CPU @ 2.10GHz with 128G memory |
| test parameters | cpufreq_governor=performance |
| | mode=process |
| | nr_task=50% |
| | test=futex3 |
+------------------+---------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops 2.5% improvement |
| test machine | 112 threads Intel(R) Xeon(R) Platinum 8170 CPU @ 2.10GHz with 128G memory |
| test parameters | cpufreq_governor=performance |
| | mode=process |
| | nr_task=50% |
| | test=futex1 |
+------------------+---------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops 5.8% improvement |
| test machine | 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory |
| test parameters | cpufreq_governor=performance |
| | mode=process |
| | nr_task=50% |
| | test=futex3 |
+------------------+---------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops 2.6% improvement |
| test machine | 160 threads Intel(R) Xeon(R) CPU E7-8890 v4 @ 2.20GHz with 256G memory |
| test parameters | cpufreq_governor=performance |
| | test=futex1 |
| | ucode=0xb00002e |
+------------------+---------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops 2.5% improvement |
| test machine | 112 threads Intel(R) Xeon(R) Platinum 8170 CPU @ 2.10GHz with 128G memory |
| test parameters | cpufreq_governor=performance |
| | mode=process |
| | nr_task=50% |
| | test=futex2 |
+------------------+---------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_thread_ops 3.1% improvement |
| test machine | 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory |
| test parameters | cpufreq_governor=performance |
| | mode=thread |
| | nr_task=50% |
| | test=futex4 |
+------------------+---------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_thread_ops 5.4% improvement |
| test machine | 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory |
| test parameters | cpufreq_governor=performance |
| | mode=thread |
| | nr_task=16 |
| | test=futex3 |
+------------------+---------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_thread_ops 3.0% improvement |
| test machine | 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory |
| test parameters | cpufreq_governor=performance |
| | mode=thread |
| | nr_task=16 |
| | test=futex4 |
+------------------+---------------------------------------------------------------------------+


Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-8/performance/x86_64-rhel-7.6/thread/100%/debian-x86_64-2018-04-03.cgz/lkp-bdw-ep3b/futex3/will-it-scale/0xb00002e

commit:
e6d7bc0bdf ("x86/build: Use the single-argument OUTPUT_FORMAT() linker script command")
ce02ef06fc ("x86, retpolines: Raise limit for generating indirect calls from switch-case")

e6d7bc0bdf4155e8 ce02ef06fcf7a399a6276adb83f
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
1:4 -25% :4 dmesg.WARNING:at#for_ip_interrupt_entry/0x
%stddev %change %stddev
\ | \
2879155 +3.1% 2967432 will-it-scale.per_thread_ops
17102 -4.3% 16361 ± 4% will-it-scale.time.system_time
9373 +7.9% 10115 ± 6% will-it-scale.time.user_time
2.534e+08 +3.1% 2.611e+08 will-it-scale.workload
342984 ± 3% -17.5% 283080 ± 9% meminfo.DirectMap4k
731997 ± 3% +8.0% 790813 ± 3% numa-vmstat.node1.numa_hit
542485 ± 4% +9.1% 591663 ± 6% numa-vmstat.node1.numa_local
71365 ± 26% -61.7% 27347 ±103% turbostat.C3
93437 ± 8% +89.5% 177023 ± 62% turbostat.C6
64.00 -5.5% 60.50 ± 4% vmstat.cpu.sy
34.00 +10.3% 37.50 ± 6% vmstat.cpu.us
1102 -4.2% 1055 vmstat.system.cs
73.25 ± 35% +498.0% 438.00 ± 42% interrupts.CPU1.RES:Rescheduling_interrupts
14.00 ± 56% +937.5% 145.25 ± 96% interrupts.CPU44.RES:Rescheduling_interrupts
34.50 ± 36% +5674.6% 1992 ±152% interrupts.CPU71.RES:Rescheduling_interrupts
39.50 ± 27% +1055.1% 456.25 ± 86% interrupts.CPU81.RES:Rescheduling_interrupts
30470003 ± 26% -65.9% 10405023 ±132% cpuidle.C3.time
71763 ± 26% -61.2% 27879 ±100% cpuidle.C3.usage
101566 ± 7% +84.3% 187224 ± 59% cpuidle.C6.usage
3666 ± 9% +261.9% 13269 ±105% cpuidle.POLL.time
2069 ± 11% +60.3% 3317 ± 21% cpuidle.POLL.usage
20.55 -2.5 18.10 perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
26.37 -1.9 24.46 perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
34.04 -1.4 32.68 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
32.05 -0.9 31.17 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.66 +0.2 2.84 perf-profile.calltrace.cycles-pp.get_futex_key_refs.get_futex_key.futex_wake.do_futex.__x64_sys_futex
6.21 +0.5 6.68 perf-profile.calltrace.cycles-pp.get_futex_key.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
13.85 +0.7 14.59 perf-profile.calltrace.cycles-pp.futex_wake.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
30.81 +0.9 31.76 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64
28.95 +1.0 29.94 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret
25620 ± 3% +178.3% 71307 ±105% sched_debug.cfs_rq:/.load.max
28.67 ± 12% +36.0% 38.98 ± 7% sched_debug.cfs_rq:/.load_avg.stddev
1.90 ± 70% +127.6% 4.32 ± 36% sched_debug.cfs_rq:/.removed.load_avg.avg
15.09 ± 60% +73.2% 26.14 ± 19% sched_debug.cfs_rq:/.removed.load_avg.stddev
87.67 ± 70% +127.5% 199.49 ± 36% sched_debug.cfs_rq:/.removed.runnable_sum.avg
696.69 ± 60% +73.1% 1205 ± 19% sched_debug.cfs_rq:/.removed.runnable_sum.stddev
8.26 ± 6% -41.2% 4.86 ± 25% sched_debug.cpu.clock.stddev
8.26 ± 6% -41.2% 4.86 ± 25% sched_debug.cpu.clock_task.stddev
25549 ± 3% +13.0% 28872 ± 7% sched_debug.cpu.load.max
2572 ± 2% -7.0% 2391 ± 2% sched_debug.cpu.nr_switches.stddev
0.04 ± 8% +186.1% 0.11 ±105% perf-stat.i.MPKI
1.728e+10 +4.5% 1.806e+10 perf-stat.i.branch-instructions
2.95 -1.5 1.47 perf-stat.i.branch-miss-rate%
5.093e+08 -48.1% 2.641e+08 perf-stat.i.branch-misses
4.46 ± 4% -0.7 3.74 ± 6% perf-stat.i.cache-miss-rate%
1071 -4.3% 1025 perf-stat.i.context-switches
2.04 -3.1% 1.98 perf-stat.i.cpi
0.00 ± 3% +0.0 0.00 ± 91% perf-stat.i.dTLB-load-miss-rate%
122225 ± 3% +64.5% 201098 ± 4% perf-stat.i.dTLB-load-misses
3.044e+10 +1.3% 3.083e+10 perf-stat.i.dTLB-loads
1.195e+11 +3.4% 1.236e+11 perf-stat.i.instructions
0.49 +3.4% 0.51 perf-stat.i.ipc
2.95 -1.5 1.46 perf-stat.overall.branch-miss-rate%
4.36 ± 3% -0.7 3.65 ± 6% perf-stat.overall.cache-miss-rate%
2.04 -3.4% 1.97 perf-stat.overall.cpi
0.00 ± 3% +0.0 0.00 ± 4% perf-stat.overall.dTLB-load-miss-rate%
0.49 +3.5% 0.51 perf-stat.overall.ipc
1.722e+10 +4.5% 1.8e+10 perf-stat.ps.branch-instructions
5.076e+08 -48.1% 2.632e+08 perf-stat.ps.branch-misses
1068 -4.3% 1022 perf-stat.ps.context-switches
121934 ± 3% +64.5% 200584 ± 4% perf-stat.ps.dTLB-load-misses
3.034e+10 +1.3% 3.072e+10 perf-stat.ps.dTLB-loads
1.191e+11 +3.4% 1.232e+11 perf-stat.ps.instructions
3.597e+13 +3.6% 3.726e+13 perf-stat.total.instructions
27179 ± 3% -10.4% 24364 ± 4% softirqs.CPU0.RCU
26245 ± 3% -7.7% 24218 ± 2% softirqs.CPU1.RCU
99697 ± 4% +14.1% 113712 ± 12% softirqs.CPU22.TIMER
95660 ± 3% +9.2% 104482 ± 6% softirqs.CPU23.TIMER
95210 ± 2% +11.2% 105917 ± 7% softirqs.CPU24.TIMER
95588 ± 3% +14.8% 109761 ± 15% softirqs.CPU26.TIMER
95590 ± 3% +8.7% 103931 ± 6% softirqs.CPU27.TIMER
95292 ± 3% +9.8% 104606 ± 7% softirqs.CPU28.TIMER
29507 ± 4% -10.1% 26537 ± 4% softirqs.CPU30.RCU
95494 ± 2% +13.6% 108483 ± 14% softirqs.CPU30.TIMER
95431 ± 3% +7.3% 102396 ± 5% softirqs.CPU31.TIMER
95509 ± 2% +9.2% 104339 ± 7% softirqs.CPU32.TIMER
31809 ± 12% -18.3% 25999 ± 7% softirqs.CPU34.RCU
30307 ± 7% -10.2% 27211 ± 5% softirqs.CPU36.RCU
30101 ± 2% -10.1% 27065 ± 4% softirqs.CPU37.RCU
32484 ± 13% -16.7% 27065 ± 5% softirqs.CPU38.RCU
28942 ± 4% -22.8% 22330 ± 11% softirqs.CPU43.RCU
29357 ± 11% -16.5% 24505 ± 3% softirqs.CPU53.RCU
27787 ± 2% -12.8% 24225 ± 4% softirqs.CPU55.RCU
26839 ± 6% +25.9% 33785 ± 11% softirqs.CPU56.RCU
27853 ± 5% -9.1% 25319 ± 6% softirqs.CPU59.RCU
23808 ± 6% -11.5% 21068 ± 6% softirqs.CPU61.RCU
23696 ± 3% -11.1% 21065 ± 11% softirqs.CPU63.RCU
94741 ± 2% +14.8% 108720 ± 13% softirqs.CPU66.TIMER
95004 ± 3% +8.4% 102943 ± 6% softirqs.CPU67.TIMER
94691 ± 3% +12.2% 106209 ± 7% softirqs.CPU68.TIMER
95364 ± 3% +15.0% 109629 ± 16% softirqs.CPU70.TIMER
95723 ± 4% +8.6% 103936 ± 7% softirqs.CPU71.TIMER
95124 ± 3% +10.6% 105166 ± 7% softirqs.CPU72.TIMER
95167 ± 2% +14.1% 108618 ± 14% softirqs.CPU74.TIMER
95123 ± 3% +8.1% 102817 ± 5% softirqs.CPU75.TIMER
25308 ± 20% -17.3% 20941 ± 4% softirqs.CPU85.RCU
23201 ± 4% -10.0% 20876 softirqs.CPU87.RCU
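
The %change column in the table above can be checked by hand against the two means on either side of it. The sketch below assumes (this is inferred from the printed numbers, not stated in the report) that count-like metrics are shown as a relative change in percent, while metrics that are already percentages, such as perf-stat.i.branch-miss-rate%, are shown as a plain difference in percentage points. The function names are illustrative only.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


def point_change(base: float, new: float) -> float:
    """Plain difference, for metrics that are already percentages."""
    return new - base


# Values copied from the lkp-bdw-ep3b futex3 thread-mode table above.
per_thread_ops = pct_change(2879155, 2967432)
branch_misses = pct_change(5.093e08, 2.641e08)
miss_rate = point_change(2.95, 1.47)

print(f"will-it-scale.per_thread_ops  {per_thread_ops:+.1f}%")
print(f"perf-stat.i.branch-misses     {branch_misses:+.1f}%")
print(f"perf-stat.i.branch-miss-rate% {miss_rate:+.1f}")
```

The printed means are themselves rounded, so a recomputed change can differ from the reported one in the last digit.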



[ASCII plots not reproduced: will-it-scale.workload, will-it-scale.time.user_time, and will-it-scale.time.system_time, plotted per sample.]

[*] bisect-good sample
[O] bisect-bad sample

***************************************************************************************************
lkp-skl-2sp5: 112 threads Intel(R) Xeon(R) Platinum 8170 CPU @ 2.10GHz with 128G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-8/performance/x86_64-rhel-7.6/process/50%/debian-x86_64-2018-04-03.cgz/lkp-skl-2sp5/futex3/will-it-scale

commit:
e6d7bc0bdf ("x86/build: Use the single-argument OUTPUT_FORMAT() linker script command")
ce02ef06fc ("x86, retpolines: Raise limit for generating indirect calls from switch-case")

e6d7bc0bdf4155e8 ce02ef06fcf7a399a6276adb83f
---------------- ---------------------------
%stddev %change %stddev
\ | \
2735813 +4.3% 2852635 will-it-scale.per_process_ops
1.532e+08 +4.3% 1.597e+08 will-it-scale.workload
0.00 ± 5% -0.0 0.00 ±146% mpstat.cpu.-1.iowait%
3.125e+09 ±167% +221.3% 1.004e+10 ± 61% cpuidle.C6.time
3890771 ±168% +200.8% 11703776 ± 56% cpuidle.C6.usage
3870516 ±168% +201.8% 11681235 ± 56% turbostat.C6
9.20 ±167% +20.4 29.59 ± 61% turbostat.C6%
16963 -11.4% 15029 ± 6% numa-meminfo.node0.Mapped
39861 ± 48% -31.9% 27147 ± 49% numa-meminfo.node0.Shmem
7374 ± 4% +12.0% 8258 ± 8% numa-meminfo.node1.KernelStack
578.25 ± 6% -11.2% 513.50 sched_debug.cfs_rq:/.util_est_enqueued.max
5.54 ± 3% -17.4% 4.58 ± 2% sched_debug.cpu.clock.stddev
5.54 ± 3% -17.4% 4.58 ± 2% sched_debug.cpu.clock_task.stddev
22.21 ± 16% -50.5% 11.00 ± 14% sched_debug.cpu.nr_uninterruptible.max
5.85 ± 15% -32.3% 3.96 ± 10% sched_debug.cpu.nr_uninterruptible.stddev
4327 -11.7% 3821 ± 7% numa-vmstat.node0.nr_mapped
9956 ± 48% -31.9% 6782 ± 49% numa-vmstat.node0.nr_shmem
13.00 ±173% +830.8% 121.00 ± 61% numa-vmstat.node1.nr_active_file
11.00 ±173% +2050.0% 236.50 ± 94% numa-vmstat.node1.nr_dirtied
7375 ± 4% +12.0% 8260 ± 8% numa-vmstat.node1.nr_kernel_stack
9.75 ±173% +2300.0% 234.00 ± 95% numa-vmstat.node1.nr_written
13.00 ±173% +830.8% 121.00 ± 61% numa-vmstat.node1.nr_zone_active_file
44.34 ± 10% -8.2 36.17 ± 11% perf-profile.calltrace.cycles-pp.secondary_startup_64
43.59 ± 10% -8.1 35.52 ± 11% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
43.59 ± 10% -8.1 35.52 ± 11% perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64
43.59 ± 10% -8.1 35.52 ± 11% perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64
43.48 ± 10% -8.1 35.42 ± 11% perf-profile.calltrace.cycles-pp.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
43.30 ± 10% -8.0 35.30 ± 11% perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
1.44 ± 7% +0.2 1.69 ± 6% perf-profile.calltrace.cycles-pp.hash_futex.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
1.64 ± 7% +0.4 2.07 ± 7% perf-profile.calltrace.cycles-pp.get_futex_key_refs.get_futex_key.futex_wake.do_futex.__x64_sys_futex
3.12 ± 8% +0.7 3.86 ± 6% perf-profile.calltrace.cycles-pp.get_futex_key.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
6.78 ± 8% +1.0 7.80 ± 6% perf-profile.calltrace.cycles-pp.futex_wake.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
16.75 ± 7% +3.3 20.04 ± 6% perf-profile.calltrace.cycles-pp.syscall_return_via_sysret
19.34 ± 8% +3.8 23.12 ± 6% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64
6256 -38.7% 3832 ± 35% interrupts.CPU102.NMI:Non-maskable_interrupts
6256 -38.7% 3832 ± 35% interrupts.CPU102.PMI:Performance_monitoring_interrupts
101.25 ± 89% -85.2% 15.00 ±118% interrupts.CPU11.RES:Rescheduling_interrupts
13.00 ± 93% +41876.9% 5457 ±161% interrupts.CPU20.RES:Rescheduling_interrupts
590.75 ±111% -90.6% 55.50 ±110% interrupts.CPU29.RES:Rescheduling_interrupts
4676 ± 22% -42.3% 2697 ± 29% interrupts.CPU36.NMI:Non-maskable_interrupts
4676 ± 22% -42.3% 2697 ± 29% interrupts.CPU36.PMI:Performance_monitoring_interrupts
1338 ±166% -99.6% 5.25 ± 97% interrupts.CPU37.RES:Rescheduling_interrupts
4690 ± 22% -41.8% 2730 ± 30% interrupts.CPU39.NMI:Non-maskable_interrupts
4690 ± 22% -41.8% 2730 ± 30% interrupts.CPU39.PMI:Performance_monitoring_interrupts
4696 ± 22% -42.3% 2711 ± 29% interrupts.CPU47.NMI:Non-maskable_interrupts
4696 ± 22% -42.3% 2711 ± 29% interrupts.CPU47.PMI:Performance_monitoring_interrupts
4702 ± 22% -42.7% 2693 ± 28% interrupts.CPU49.NMI:Non-maskable_interrupts
4702 ± 22% -42.7% 2693 ± 28% interrupts.CPU49.PMI:Performance_monitoring_interrupts
1.058e+10 +5.6% 1.118e+10 perf-stat.i.branch-instructions
2.95 -1.5 1.48 perf-stat.i.branch-miss-rate%
3.123e+08 -46.9% 1.657e+08 perf-stat.i.branch-misses
4839609 +8.3% 5241207 perf-stat.i.cache-references
1.70 -4.4% 1.62 perf-stat.i.cpi
1.394e+08 ± 7% +16.6% 1.626e+08 ± 3% perf-stat.i.iTLB-load-misses
7.293e+10 +4.6% 7.629e+10 perf-stat.i.instructions
525.91 ± 6% -10.6% 469.98 ± 3% perf-stat.i.instructions-per-iTLB-miss
0.59 +4.7% 0.62 perf-stat.i.ipc
0.07 +3.6% 0.07 perf-stat.overall.MPKI
2.95 -1.5 1.48 perf-stat.overall.branch-miss-rate%
1.70 -4.4% 1.62 perf-stat.overall.cpi
525.68 ± 6% -10.6% 469.79 ± 3% perf-stat.overall.instructions-per-iTLB-miss
0.59 +4.7% 0.62 perf-stat.overall.ipc
1.054e+10 +5.6% 1.114e+10 perf-stat.ps.branch-instructions
3.112e+08 -46.9% 1.652e+08 perf-stat.ps.branch-misses
4829390 +8.3% 5231132 perf-stat.ps.cache-references
1.39e+08 ± 7% +16.6% 1.62e+08 ± 3% perf-stat.ps.iTLB-load-misses
7.268e+10 +4.6% 7.603e+10 perf-stat.ps.instructions
2.194e+13 +4.5% 2.293e+13 perf-stat.total.instructions
87108 ± 4% +7.0% 93168 ± 4% softirqs.CPU100.TIMER
87250 ± 4% +7.0% 93347 ± 4% softirqs.CPU101.TIMER
87532 ± 4% +7.1% 93724 ± 3% softirqs.CPU103.TIMER
87157 ± 4% +7.4% 93568 ± 3% softirqs.CPU104.TIMER
87006 ± 3% +8.0% 93944 ± 3% softirqs.CPU105.TIMER
87194 ± 4% +8.0% 94175 ± 3% softirqs.CPU106.TIMER
87091 ± 4% +7.2% 93323 ± 4% softirqs.CPU107.TIMER
87273 ± 4% +6.9% 93328 ± 3% softirqs.CPU109.TIMER
87091 ± 4% +15.1% 100221 ± 14% softirqs.CPU110.TIMER
87820 ± 4% +6.9% 93881 ± 4% softirqs.CPU56.TIMER
87539 ± 2% +5.8% 92644 ± 4% softirqs.CPU57.TIMER
87788 ± 4% +7.2% 94067 ± 5% softirqs.CPU60.TIMER
87845 ± 4% +6.7% 93725 ± 5% softirqs.CPU61.TIMER
87755 ± 4% +23.3% 108227 ± 26% softirqs.CPU62.TIMER
87710 ± 3% +7.0% 93811 ± 5% softirqs.CPU63.TIMER
87524 ± 4% +10.8% 97017 ± 4% softirqs.CPU65.TIMER
88028 ± 4% +6.8% 94003 ± 5% softirqs.CPU66.TIMER
87694 ± 4% +7.0% 93871 ± 5% softirqs.CPU67.TIMER
87750 ± 4% +7.0% 93888 ± 4% softirqs.CPU68.TIMER
8299 ±108% -64.7% 2925 ± 2% softirqs.CPU69.SCHED
87688 ± 4% +6.8% 93630 ± 5% softirqs.CPU71.TIMER
87272 ± 4% +11.2% 97026 ± 5% softirqs.CPU72.TIMER
87384 ± 4% +10.8% 96860 ± 4% softirqs.CPU75.TIMER
87742 ± 4% +6.7% 93629 ± 5% softirqs.CPU76.TIMER
87454 ± 4% +6.8% 93425 ± 5% softirqs.CPU77.TIMER
87407 ± 4% +7.7% 94142 ± 5% softirqs.CPU78.TIMER
88299 ± 4% +6.2% 93751 ± 5% softirqs.CPU79.TIMER
11503 ±126% -75.0% 2870 ± 2% softirqs.CPU80.SCHED
88007 ± 4% +5.9% 93173 ± 5% softirqs.CPU81.TIMER
87710 ± 4% +6.5% 93386 ± 5% softirqs.CPU82.TIMER
87681 ± 3% +6.7% 93587 ± 5% softirqs.CPU83.TIMER
87235 ± 4% +10.1% 96031 ± 3% softirqs.CPU85.TIMER
87644 ± 4% +7.9% 94599 ± 2% softirqs.CPU87.TIMER
88170 ± 4% +8.2% 95367 ± 2% softirqs.CPU88.TIMER
87392 ± 4% +7.5% 93982 ± 3% softirqs.CPU89.TIMER
87811 ± 4% +7.1% 94080 ± 3% softirqs.CPU90.TIMER
87043 ± 4% +10.4% 96088 ± 4% softirqs.CPU91.TIMER
86998 ± 4% +7.4% 93467 ± 4% softirqs.CPU92.TIMER
87124 ± 4% +7.3% 93481 ± 4% softirqs.CPU93.TIMER
87072 ± 4% +7.3% 93388 ± 3% softirqs.CPU94.TIMER
87099 ± 4% +6.8% 93051 ± 4% softirqs.CPU95.TIMER
86845 ± 4% +6.6% 92604 ± 5% softirqs.CPU96.TIMER
87200 ± 4% +7.3% 93598 ± 3% softirqs.CPU97.TIMER
87953 ± 4% +6.0% 93269 ± 4% softirqs.CPU98.TIMER
87211 ± 4% +6.9% 93271 ± 4% softirqs.CPU99.TIMER
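
The derived perf-stat metrics in the table above are consistent with the raw counters printed next to them. As a minimal sketch, assuming the usual definitions (miss rate = branch-misses / branch-instructions, IPC = 1 / CPI) and using hypothetical helper names, the values for this machine can be recomputed as follows:

```python
def branch_miss_rate(misses: float, branches: float) -> float:
    """Branch misses as a percentage of branch instructions."""
    return misses / branches * 100.0


def ipc_from_cpi(cpi: float) -> float:
    """Instructions per cycle is the reciprocal of cycles per instruction."""
    return 1.0 / cpi


# Values copied from the lkp-skl-2sp5 futex3 process-mode table above.
before = branch_miss_rate(3.123e08, 1.058e10)  # parent commit e6d7bc0bdf
after = branch_miss_rate(1.657e08, 1.118e10)   # commit ce02ef06fc

print(f"branch-miss-rate%: {before:.2f} -> {after:.2f}")
print(f"ipc: {ipc_from_cpi(1.70):.2f} -> {ipc_from_cpi(1.62):.2f}")
```

Because the inputs are rounded table entries, the recomputed figures only need to agree with the reported 2.95, 1.48, 0.59 and 0.62 to the printed precision.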



***************************************************************************************************
lkp-skl-2sp5: 112 threads Intel(R) Xeon(R) Platinum 8170 CPU @ 2.10GHz with 128G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-8/performance/x86_64-rhel-7.6/process/50%/debian-x86_64-2018-04-03.cgz/lkp-skl-2sp5/futex1/will-it-scale

commit:
e6d7bc0bdf ("x86/build: Use the single-argument OUTPUT_FORMAT() linker script command")
ce02ef06fc ("x86, retpolines: Raise limit for generating indirect calls from switch-case")

e6d7bc0bdf4155e8 ce02ef06fcf7a399a6276adb83f
---------------- ---------------------------
%stddev %change %stddev
\ | \
2117652 +2.5% 2170754 will-it-scale.per_process_ops
1.186e+08 +2.5% 1.216e+08 will-it-scale.workload
472944 ± 3% -8.8% 431109 ± 2% meminfo.Committed_AS
5506 ±169% +228.9% 18112 ± 34% numa-numastat.node0.other_node
70481 -1.7% 69272 proc-vmstat.nr_active_anon
70481 -1.7% 69272 proc-vmstat.nr_zone_active_anon
13912 ± 8% -10.8% 12404 ± 5% proc-vmstat.pgactivate
109265 ± 20% +35.4% 147962 ± 14% numa-meminfo.node0.AnonPages
45308 ± 31% -43.2% 25756 ± 47% numa-meminfo.node0.Shmem
95438 ± 13% -33.0% 63947 ± 35% numa-meminfo.node1.AnonHugePages
134950 ± 16% -29.2% 95525 ± 22% numa-meminfo.node1.AnonPages
27320 ± 20% +35.4% 36992 ± 14% numa-vmstat.node0.nr_anon_pages
11332 ± 32% -43.2% 6435 ± 47% numa-vmstat.node0.nr_shmem
33731 ± 16% -29.2% 23886 ± 22% numa-vmstat.node1.nr_anon_pages
207313 ± 4% -6.1% 194680 ± 3% numa-vmstat.node1.numa_other
714.00 ± 4% -17.6% 588.00 ± 5% slabinfo.mnt_cache.active_objs
714.00 ± 4% -17.6% 588.00 ± 5% slabinfo.mnt_cache.num_objs
7378 ± 2% -9.8% 6652 slabinfo.shmem_inode_cache.active_objs
7378 ± 2% -9.8% 6652 slabinfo.shmem_inode_cache.num_objs
4614486 ± 43% -49.5% 2332387 ± 10% sched_debug.cpu.avg_idle.max
398823 ± 42% -49.3% 202390 ± 12% sched_debug.cpu.avg_idle.stddev
5.52 ± 3% -17.9% 4.53 ± 3% sched_debug.cpu.clock.stddev
5.52 ± 3% -17.9% 4.53 ± 3% sched_debug.cpu.clock_task.stddev
0.00 ± 43% -32.2% 0.00 ± 2% sched_debug.cpu.next_balance.stddev
88789 ± 4% +14.2% 101399 ± 14% softirqs.CPU100.TIMER
21774 ± 10% +24.5% 27109 ± 13% softirqs.CPU18.RCU
99823 ± 2% +11.4% 111178 ± 10% softirqs.CPU18.TIMER
24198 ± 11% +17.6% 28451 ± 14% softirqs.CPU19.RCU
21333 ± 11% +30.0% 27729 ± 12% softirqs.CPU26.RCU
98344 ± 2% +13.5% 111624 ± 10% softirqs.CPU26.TIMER
102128 ± 4% +9.4% 111696 ± 11% softirqs.CPU34.TIMER
98109 +13.0% 110860 ± 11% softirqs.CPU37.TIMER
2890 ± 4% +303.7% 11667 ±126% softirqs.CPU59.SCHED
88353 ± 3% +8.8% 96087 ± 10% softirqs.CPU59.TIMER
16819 ± 8% +20.5% 20274 ± 11% softirqs.CPU6.RCU
29308 ± 52% +44.4% 42332 ± 4% softirqs.CPU6.SCHED
88560 ± 3% +27.9% 113276 ± 27% softirqs.CPU83.TIMER
88853 ± 4% +13.7% 101031 ± 15% softirqs.CPU94.TIMER
88666 ± 4% +4.9% 92996 ± 5% softirqs.CPU96.TIMER
1.52e+10 +3.4% 1.571e+10 perf-stat.i.branch-instructions
1.61 -0.8 0.81 perf-stat.i.branch-miss-rate%
2.427e+08 -47.4% 1.277e+08 perf-stat.i.branch-misses
494521 ± 3% +7.5% 531576 ± 2% perf-stat.i.cache-misses
4861656 ± 4% +6.1% 5157364 ± 3% perf-stat.i.cache-references
1.20 -3.1% 1.16 perf-stat.i.cpi
27.28 ± 3% -5.6% 25.76 ± 3% perf-stat.i.cpu-migrations
271004 ± 3% -7.6% 250275 ± 2% perf-stat.i.cycles-between-cache-misses
0.00 ±130% -0.0 0.00 ± 7% perf-stat.i.dTLB-load-miss-rate%
69139 ± 8% -39.5% 41823 ± 2% perf-stat.i.dTLB-load-misses
2.427e+10 +4.6% 2.539e+10 ± 3% perf-stat.i.dTLB-loads
1.037e+11 +2.8% 1.066e+11 perf-stat.i.instructions
0.84 +2.8% 0.86 perf-stat.i.ipc
95449 ± 2% +10.2% 105213 ± 4% perf-stat.i.node-load-misses
1.60 -0.8 0.81 perf-stat.overall.branch-miss-rate%
1.19 -2.6% 1.16 perf-stat.overall.cpi
249918 ± 3% -6.9% 232651 ± 2% perf-stat.overall.cycles-between-cache-misses
0.00 ± 9% -0.0 0.00 ± 4% perf-stat.overall.dTLB-load-miss-rate%
0.00 ± 2% -0.0 0.00 ± 3% perf-stat.overall.dTLB-store-miss-rate%
0.84 +2.7% 0.86 perf-stat.overall.ipc
83.36 +2.3 85.61 perf-stat.overall.node-load-miss-rate%
1.515e+10 +3.4% 1.566e+10 perf-stat.ps.branch-instructions
2.419e+08 -47.4% 1.273e+08 perf-stat.ps.branch-misses
493395 ± 3% +7.5% 530214 ± 2% perf-stat.ps.cache-misses
4852831 ± 4% +6.0% 5145509 ± 3% perf-stat.ps.cache-references
27.21 ± 3% -5.6% 25.69 ± 3% perf-stat.ps.cpu-migrations
69088 ± 8% -39.5% 41817 ± 2% perf-stat.ps.dTLB-load-misses
2.419e+10 +4.6% 2.53e+10 ± 3% perf-stat.ps.dTLB-loads
1.033e+11 +2.8% 1.062e+11 perf-stat.ps.instructions
95170 ± 2% +10.2% 104903 ± 4% perf-stat.ps.node-load-misses
3.122e+13 +2.6% 3.204e+13 perf-stat.total.instructions
179.50 ± 9% +33.4% 239.50 ± 6% interrupts.71:PCI-MSI.12589060-edge.eth3-TxRx-3
6526 ±165% -99.9% 7.00 ± 74% interrupts.CPU27.RES:Rescheduling_interrupts
2870 +25.2% 3594 ± 24% interrupts.CPU32.NMI:Non-maskable_interrupts
2870 +25.2% 3594 ± 24% interrupts.CPU32.PMI:Performance_monitoring_interrupts
2872 +25.6% 3607 ± 24% interrupts.CPU34.NMI:Non-maskable_interrupts
2872 +25.6% 3607 ± 24% interrupts.CPU34.PMI:Performance_monitoring_interrupts
2930 ± 3% +46.6% 4295 ± 32% interrupts.CPU35.NMI:Non-maskable_interrupts
2930 ± 3% +46.6% 4295 ± 32% interrupts.CPU35.PMI:Performance_monitoring_interrupts
2877 +24.9% 3593 ± 24% interrupts.CPU36.NMI:Non-maskable_interrupts
2877 +24.9% 3593 ± 24% interrupts.CPU36.PMI:Performance_monitoring_interrupts
2871 +64.6% 4727 ± 32% interrupts.CPU38.NMI:Non-maskable_interrupts
2871 +64.6% 4727 ± 32% interrupts.CPU38.PMI:Performance_monitoring_interrupts
2871 +52.1% 4367 ± 30% interrupts.CPU40.NMI:Non-maskable_interrupts
2871 +52.1% 4367 ± 30% interrupts.CPU40.PMI:Performance_monitoring_interrupts
2872 +23.9% 3560 ± 24% interrupts.CPU41.NMI:Non-maskable_interrupts
2872 +23.9% 3560 ± 24% interrupts.CPU41.PMI:Performance_monitoring_interrupts
12.25 ± 94% +5675.5% 707.50 ±110% interrupts.CPU42.RES:Rescheduling_interrupts
2875 +37.4% 3951 ± 34% interrupts.CPU44.NMI:Non-maskable_interrupts
2875 +37.4% 3951 ± 34% interrupts.CPU44.PMI:Performance_monitoring_interrupts
6196 -49.8% 3112 interrupts.CPU66.NMI:Non-maskable_interrupts
6196 -49.8% 3112 interrupts.CPU66.PMI:Performance_monitoring_interrupts
6223 -49.9% 3118 interrupts.CPU67.NMI:Non-maskable_interrupts
6223 -49.9% 3118 interrupts.CPU67.PMI:Performance_monitoring_interrupts
179.50 ± 9% +33.4% 239.50 ± 6% interrupts.CPU68.71:PCI-MSI.12589060-edge.eth3-TxRx-3
6225 -50.1% 3108 interrupts.CPU68.NMI:Non-maskable_interrupts
6225 -50.1% 3108 interrupts.CPU68.PMI:Performance_monitoring_interrupts
6198 -50.0% 3098 interrupts.CPU69.NMI:Non-maskable_interrupts
6198 -50.0% 3098 interrupts.CPU69.PMI:Performance_monitoring_interrupts
6193 -49.6% 3118 interrupts.CPU70.NMI:Non-maskable_interrupts
6193 -49.6% 3118 interrupts.CPU70.PMI:Performance_monitoring_interrupts
6234 -56.5% 2713 ± 25% interrupts.CPU71.NMI:Non-maskable_interrupts
6234 -56.5% 2713 ± 25% interrupts.CPU71.PMI:Performance_monitoring_interrupts
6195 -49.4% 3137 interrupts.CPU72.NMI:Non-maskable_interrupts
6195 -49.4% 3137 interrupts.CPU72.PMI:Performance_monitoring_interrupts
6239 -49.9% 3123 interrupts.CPU73.NMI:Non-maskable_interrupts
6239 -49.9% 3123 interrupts.CPU73.PMI:Performance_monitoring_interrupts
6255 -37.4% 3913 ± 34% interrupts.CPU75.NMI:Non-maskable_interrupts
6255 -37.4% 3913 ± 34% interrupts.CPU75.PMI:Performance_monitoring_interrupts
6215 -37.3% 3894 ± 34% interrupts.CPU76.NMI:Non-maskable_interrupts
6215 -37.3% 3894 ± 34% interrupts.CPU76.PMI:Performance_monitoring_interrupts
6198 -37.5% 3877 ± 34% interrupts.CPU77.NMI:Non-maskable_interrupts
6198 -37.5% 3877 ± 34% interrupts.CPU77.PMI:Performance_monitoring_interrupts
6225 -43.4% 3526 ± 46% interrupts.CPU78.NMI:Non-maskable_interrupts
6225 -43.4% 3526 ± 46% interrupts.CPU78.PMI:Performance_monitoring_interrupts
6236 -37.7% 3887 ± 34% interrupts.CPU79.NMI:Non-maskable_interrupts
6236 -37.7% 3887 ± 34% interrupts.CPU79.PMI:Performance_monitoring_interrupts
6239 -51.5% 3028 ± 3% interrupts.CPU83.NMI:Non-maskable_interrupts
6239 -51.5% 3028 ± 3% interrupts.CPU83.PMI:Performance_monitoring_interrupts



***************************************************************************************************
lkp-bdw-ep3d: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-8/performance/x86_64-rhel-7.6/process/50%/debian-x86_64-2018-04-03-no-ucode.cgz/lkp-bdw-ep3d/futex3/will-it-scale

commit:
e6d7bc0bdf ("x86/build: Use the single-argument OUTPUT_FORMAT() linker script command")
ce02ef06fc ("x86, retpolines: Raise limit for generating indirect calls from switch-case")

e6d7bc0bdf4155e8 ce02ef06fcf7a399a6276adb83f
---------------- ---------------------------
%stddev %change %stddev
\ | \
3606506 +5.8% 3817348 will-it-scale.per_process_ops
1.587e+08 +5.8% 1.68e+08 will-it-scale.workload
777050 ± 7% +3986.1% 31751409 ±168% cpuidle.C1.time
1630 -100.0% 0.00 meminfo.Mlocked
16.00 +10.9% 17.75 ± 2% vmstat.cpu.us
4032 ± 3% -9.6% 3647 ± 6% slabinfo.kmalloc-rcl-64.active_objs
4032 ± 3% -9.6% 3647 ± 6% slabinfo.kmalloc-rcl-64.num_objs
219.50 ± 12% -100.0% 0.00 numa-vmstat.node0.nr_mlock
3159 ± 20% -17.3% 2611 ± 11% numa-vmstat.node1.nr_mapped
187.00 ± 16% -100.0% 0.00 numa-vmstat.node1.nr_mlock
-1244 +479.3% -7208 sched_debug.cfs_rq:/.spread0.min
3.52 ± 5% -20.8% 2.79 sched_debug.cpu.clock.stddev
3.52 ± 5% -20.8% 2.79 sched_debug.cpu.clock_task.stddev
54.50 ± 59% -57.4% 23.21 ± 14% sched_debug.cpu.cpu_load[1].max
42.67 ± 10% +15.3% 49.21 ± 8% sched_debug.cpu.cpu_load[4].max
21.00 ± 12% -39.5% 12.71 ± 27% sched_debug.cpu.nr_uninterruptible.max
4.03 ± 8% -22.1% 3.14 ± 10% sched_debug.cpu.nr_uninterruptible.stddev
70467 -3.5% 67987 proc-vmstat.nr_active_anon
61221 -0.9% 60664 proc-vmstat.nr_anon_pages
4462 -1.2% 4407 proc-vmstat.nr_inactive_anon
6942 -2.6% 6765 proc-vmstat.nr_mapped
407.25 -100.0% 0.00 proc-vmstat.nr_mlock
70467 -3.5% 67987 proc-vmstat.nr_zone_active_anon
4462 -1.2% 4407 proc-vmstat.nr_zone_inactive_anon
697001 -1.0% 690247 proc-vmstat.numa_hit
679834 -1.0% 673107 proc-vmstat.numa_local
22560 ± 20% +30.4% 29420 ± 7% softirqs.CPU26.RCU
22786 ± 15% +34.9% 30730 ± 9% softirqs.CPU27.RCU
21637 ± 5% +11.2% 24052 ± 3% softirqs.CPU47.RCU
23104 ± 8% +16.0% 26810 ± 14% softirqs.CPU5.RCU
19255 ± 22% +27.1% 24481 ± 4% softirqs.CPU51.RCU
21604 ± 5% +11.9% 24173 ± 3% softirqs.CPU52.RCU
22073 ± 5% +9.4% 24150 ± 2% softirqs.CPU55.RCU
19672 ± 13% +18.7% 23358 ± 2% softirqs.CPU56.RCU
18526 ± 13% +16.4% 21570 ± 2% softirqs.CPU61.RCU
17650 ± 17% +22.7% 21662 ± 2% softirqs.CPU80.RCU
19156 ± 5% +12.1% 21468 ± 5% softirqs.CPU82.RCU
17920 ± 17% +22.6% 21966 ± 3% softirqs.CPU83.RCU
19130 ± 4% +12.2% 21459 ± 7% softirqs.CPU86.RCU
1545 ± 86% -89.5% 163.00 ± 87% interrupts.CPU1.RES:Rescheduling_interrupts
4908 ± 16% -33.1% 3282 ± 47% interrupts.CPU11.NMI:Non-maskable_interrupts
4908 ± 16% -33.1% 3282 ± 47% interrupts.CPU11.PMI:Performance_monitoring_interrupts
5416 ± 29% -39.4% 3284 ± 47% interrupts.CPU12.NMI:Non-maskable_interrupts
5416 ± 29% -39.4% 3284 ± 47% interrupts.CPU12.PMI:Performance_monitoring_interrupts
4951 ± 15% -33.7% 3283 ± 47% interrupts.CPU13.NMI:Non-maskable_interrupts
4951 ± 15% -33.7% 3283 ± 47% interrupts.CPU13.PMI:Performance_monitoring_interrupts
4929 ± 16% -26.1% 3641 ± 33% interrupts.CPU16.NMI:Non-maskable_interrupts
4929 ± 16% -26.1% 3641 ± 33% interrupts.CPU16.PMI:Performance_monitoring_interrupts
61.00 ± 76% +531.6% 385.25 ± 89% interrupts.CPU2.RES:Rescheduling_interrupts
41.00 ± 64% +1479.3% 647.50 ± 89% interrupts.CPU23.RES:Rescheduling_interrupts
30.75 ± 72% +155.3% 78.50 ± 31% interrupts.CPU25.RES:Rescheduling_interrupts
71.25 ±128% +1377.2% 1052 ± 95% interrupts.CPU26.RES:Rescheduling_interrupts
5.75 ± 60% +2652.2% 158.25 ±124% interrupts.CPU32.RES:Rescheduling_interrupts
13.00 ± 55% +640.4% 96.25 ± 42% interrupts.CPU36.RES:Rescheduling_interrupts
790.25 ± 57% -75.5% 194.00 ±136% interrupts.CPU5.RES:Rescheduling_interrupts
6924 ± 24% -32.5% 4672 ± 40% interrupts.CPU52.NMI:Non-maskable_interrupts
6924 ± 24% -32.5% 4672 ± 40% interrupts.CPU52.PMI:Performance_monitoring_interrupts
7881 -37.2% 4953 ± 34% interrupts.CPU57.NMI:Non-maskable_interrupts
7881 -37.2% 4953 ± 34% interrupts.CPU57.PMI:Performance_monitoring_interrupts
7912 -37.5% 4942 ± 34% interrupts.CPU58.NMI:Non-maskable_interrupts
7912 -37.5% 4942 ± 34% interrupts.CPU58.PMI:Performance_monitoring_interrupts
7902 -37.5% 4940 ± 34% interrupts.CPU60.NMI:Non-maskable_interrupts
7902 -37.5% 4940 ± 34% interrupts.CPU60.PMI:Performance_monitoring_interrupts
7884 -37.4% 4939 ± 34% interrupts.CPU62.NMI:Non-maskable_interrupts
7884 -37.4% 4939 ± 34% interrupts.CPU62.PMI:Performance_monitoring_interrupts
4932 ± 16% -33.3% 3290 ± 46% interrupts.CPU9.NMI:Non-maskable_interrupts
4932 ± 16% -33.3% 3290 ± 46% interrupts.CPU9.PMI:Performance_monitoring_interrupts
0.08 -6.9% 0.07 perf-stat.i.MPKI
1.09e+10 +7.4% 1.17e+10 perf-stat.i.branch-instructions
2.96 -1.5 1.48 perf-stat.i.branch-miss-rate%
3.218e+08 -46.2% 1.731e+08 perf-stat.i.branch-misses
1.64 -5.9% 1.54 perf-stat.i.cpi
0.00 ± 15% -0.0 0.00 ± 7% perf-stat.i.dTLB-load-miss-rate%
271193 ± 15% -57.7% 114745 ± 5% perf-stat.i.dTLB-load-misses
1.916e+10 +3.9% 1.991e+10 perf-stat.i.dTLB-loads
0.00 ± 27% -0.0 0.00 ± 15% perf-stat.i.dTLB-store-miss-rate%
245430 ± 25% -88.7% 27715 ± 9% perf-stat.i.dTLB-store-misses
1.451e+10 +3.5% 1.501e+10 perf-stat.i.dTLB-stores
7.52e+10 +6.3% 7.995e+10 perf-stat.i.instructions
0.61 +6.3% 0.65 perf-stat.i.ipc
0.07 -6.3% 0.06 perf-stat.overall.MPKI
2.95 -1.5 1.48 perf-stat.overall.branch-miss-rate%
1.64 -5.9% 1.54 perf-stat.overall.cpi
727812 ± 4% +9.0% 793642 ± 3% perf-stat.overall.cycles-between-cache-misses
0.00 ± 15% -0.0 0.00 ± 5% perf-stat.overall.dTLB-load-miss-rate%
0.00 ± 25% -0.0 0.00 ± 9% perf-stat.overall.dTLB-store-miss-rate%
0.61 +6.3% 0.65 perf-stat.overall.ipc
1.086e+10 +7.4% 1.166e+10 perf-stat.ps.branch-instructions
3.207e+08 -46.2% 1.725e+08 perf-stat.ps.branch-misses
270777 ± 15% -57.7% 114522 ± 5% perf-stat.ps.dTLB-load-misses
1.91e+10 +3.9% 1.984e+10 perf-stat.ps.dTLB-loads
245104 ± 25% -88.7% 27670 ± 9% perf-stat.ps.dTLB-store-misses
1.446e+10 +3.5% 1.496e+10 perf-stat.ps.dTLB-stores
7.495e+10 +6.3% 7.968e+10 perf-stat.ps.instructions
2.263e+13 +6.3% 2.406e+13 perf-stat.total.instructions
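
A note on reading the %change column, inferred from the rows above rather than from any lkp documentation: for counters the column appears to hold the relative change between the two commits, while for metrics that are already percentages (names ending in "%") it appears to hold the plain difference in percentage points. A minimal sketch of that arithmetic, using values copied from the perf-stat rows above (the helper names are hypothetical, not part of lkp-tests):

```python
def relative_change(base, new):
    """Percent change of `new` relative to `base` (used for counters)."""
    return (new - base) / base * 100.0


def absolute_change(base, new):
    """Plain difference (assumed here for metrics already expressed in %)."""
    return new - base


# perf-stat.i.branch-misses: 3.218e+08 -> 1.731e+08
print(f"branch-misses      {relative_change(3.218e08, 1.731e08):+.1f}%")

# perf-stat.i.instructions: 7.52e+10 -> 7.995e+10
print(f"instructions       {relative_change(7.52e10, 7.995e10):+.1f}%")

# perf-stat.i.branch-miss-rate%: 2.96 -> 1.48 (percentage points)
print(f"branch-miss-rate%  {absolute_change(2.96, 1.48):+.1f}")
```

Because the table prints rounded values, recomputing a change from the displayed columns can differ from the reported figure in the last digit.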



***************************************************************************************************
lkp-bdw-ex2: 160 threads Intel(R) Xeon(R) CPU E7-8890 v4 @ 2.20GHz with 256G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/ucode:
gcc-8/performance/x86_64-rhel-7.6/debian-x86_64-2018-04-03.cgz/lkp-bdw-ex2/futex1/will-it-scale/0xb00002e

commit:
e6d7bc0bdf ("x86/build: Use the single-argument OUTPUT_FORMAT() linker script command")
ce02ef06fc ("x86, retpolines: Raise limit for generating indirect calls from switch-case")

e6d7bc0bdf4155e8 ce02ef06fcf7a399a6276adb83f
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
:2 50% 1:3 dmesg.WARNING:at#for_ip_interrupt_entry/0x
1:2 -50% :3 stderr.mount.nfs:Connection_timed_out
:2 50% 1:3 kmsg.Not_tainted
:2 50% 1:3 kmsg.echo#>/proc/sys/kernel/hung_task_timeout_secs~disables_this_message
%stddev %change %stddev
\ | \
2496467 +2.6% 2560919 will-it-scale.per_process_ops
515041 -3.1% 499149 ± 2% will-it-scale.per_thread_ops
0.70 +6.4% 0.75 ± 2% will-it-scale.scalability
118.88 -4.3% 113.80 ± 4% will-it-scale.time.user_time
1.153e+09 +3.5% 1.194e+09 will-it-scale.workload
28.99 -0.8% 28.77 boot-time.dhcp
7866 ± 5% +1874.1% 155290 ± 57% numa-numastat.node0.local_node
26516 +532.8% 167786 ± 49% numa-numastat.node0.numa_hit
11983872 -14.0% 10309632 ± 5% meminfo.DirectMap2M
470472 ± 5% -15.6% 397085 ± 8% meminfo.DirectMap4k
816.50 ± 41% -47.4% 429.33 ± 36% meminfo.Mlocked
95.50 ± 4% -8.2% 87.67 ± 6% proc-vmstat.nr_inactive_file
203.50 ± 40% -47.3% 107.33 ± 37% proc-vmstat.nr_mlock
17078 ± 5% +9.8% 18746 ± 3% proc-vmstat.nr_shmem
95.50 ± 4% -8.2% 87.67 ± 6% proc-vmstat.nr_zone_inactive_file
11669 ± 10% +22.2% 14262 proc-vmstat.pgactivate
1847 ± 5% -9.3% 1676 ± 4% slabinfo.avtab_node.active_objs
1847 ± 5% -9.3% 1676 ± 4% slabinfo.avtab_node.num_objs
2192 +8.4% 2377 ± 4% slabinfo.pool_workqueue.num_objs
578.00 ± 52% -60.8% 226.67 ± 14% slabinfo.tw_sock_TCP.active_objs
578.00 ± 52% -60.8% 226.67 ± 14% slabinfo.tw_sock_TCP.num_objs
9566399 ± 51% -82.9% 1634223 ± 18% cpuidle.C1.time
204936 ± 53% -74.1% 53035 ± 8% cpuidle.C1.usage
2.985e+09 ± 14% -97.8% 66052113 ± 80% cpuidle.C1E.time
35865693 ± 13% -99.3% 265794 ± 66% cpuidle.C1E.usage
1.324e+10 ± 63% +148.3% 3.288e+10 ± 7% cpuidle.C3.time
29716500 ± 21% +151.6% 74755358 ± 3% cpuidle.C3.usage
3.532e+10 ± 21% -64.9% 1.241e+10 ± 98% cpuidle.C6.time
41301763 ± 25% -51.6% 20007429 ±104% cpuidle.C6.usage
1230402 ± 10% -49.7% 618946 ± 66% sched_debug.cfs_rq:/.min_vruntime.min
0.75 +72.9% 1.30 ± 26% sched_debug.cfs_rq:/.removed.load_avg.avg
10.35 +33.2% 13.80 ± 14% sched_debug.cfs_rq:/.removed.load_avg.stddev
34.63 ± 2% +71.6% 59.44 ± 26% sched_debug.cfs_rq:/.removed.runnable_sum.avg
478.65 ± 2% +32.2% 633.00 ± 14% sched_debug.cfs_rq:/.removed.runnable_sum.stddev
0.38 ± 2% +59.1% 0.60 ± 19% sched_debug.cfs_rq:/.removed.util_avg.avg
5.23 ± 2% +25.3% 6.55 ± 6% sched_debug.cfs_rq:/.removed.util_avg.stddev
173210 -57.9% 72904 ± 56% sched_debug.cpu.avg_idle.min
62774 ± 15% -35.9% 40218 ± 34% sched_debug.cpu.nr_switches.max
5673 ± 3% -32.6% 3823 ± 23% sched_debug.cpu.nr_switches.stddev
-12.00 -13.9% -10.33 sched_debug.cpu.nr_uninterruptible.min
202540 ± 54% -74.7% 51287 ± 8% turbostat.C1
0.01 ± 33% -0.0 0.00 turbostat.C1%
35860488 ± 13% -99.3% 261162 ± 68% turbostat.C1E
4.01 ± 15% -3.9 0.09 ± 62% turbostat.C1E%
29716034 ± 21% +151.6% 74753457 ± 3% turbostat.C3
17.71 ± 63% +32.3 49.97 ± 20% turbostat.C3%
41237740 ± 25% -51.7% 19937579 ±104% turbostat.C6
47.43 ± 21% -31.4 16.06 ± 87% turbostat.C6%
33.64 ± 4% +15.7% 38.94 ± 2% turbostat.CPU%c1
24.96 ± 28% -74.9% 6.26 ±111% turbostat.CPU%c6
25.88 -21.5% 20.30 ± 9% turbostat.Pkg%pc2
4673 ± 53% -62.2% 1767 ±117% numa-vmstat.node0.nr_inactive_anon
2835 ± 22% -20.9% 2242 ± 37% numa-vmstat.node0.nr_mapped
4673 ± 53% -62.2% 1767 ±117% numa-vmstat.node0.nr_zone_inactive_anon
31071 ± 7% -36.0% 19898 ± 65% numa-vmstat.node1.nr_active_anon
30838 ± 6% -47.3% 16251 ± 70% numa-vmstat.node1.nr_anon_pages
2712 ± 72% +187.0% 7784 ± 29% numa-vmstat.node1.nr_shmem
6316 ± 13% +31.2% 8288 ± 5% numa-vmstat.node1.nr_slab_reclaimable
31071 ± 7% -36.0% 19898 ± 65% numa-vmstat.node1.nr_zone_active_anon
18107 ± 22% -60.8% 7090 ±136% numa-vmstat.node2.nr_anon_pages
6591 ± 4% -10.9% 5873 ± 7% numa-vmstat.node2.nr_kernel_stack
65.00 ± 61% -66.7% 21.67 ± 39% numa-vmstat.node2.nr_mlock
567.00 ± 34% -57.5% 241.00 ± 76% numa-vmstat.node2.nr_page_table_pages
14392 ± 94% +143.6% 35066 ± 13% numa-vmstat.node3.nr_active_anon
9457 ± 95% +271.6% 35143 ± 13% numa-vmstat.node3.nr_anon_pages
198.50 ± 48% +89.6% 376.33 ± 6% numa-vmstat.node3.nr_page_table_pages
14392 ± 94% +143.6% 35066 ± 13% numa-vmstat.node3.nr_zone_active_anon
3.84 ± 15% -1.6 2.24 ± 64% perf-stat.i.branch-miss-rate%
1.547e+08 ± 2% -41.0% 91259839 ± 5% perf-stat.i.branch-misses
14.20 +6.7% 15.15 perf-stat.i.cpi
236467 +7.5% 254135 ± 4% perf-stat.i.cycles-between-cache-misses
638431 +52.6% 974297 ± 26% perf-stat.i.dTLB-store-misses
79.49 +6.8 86.31 perf-stat.i.iTLB-load-miss-rate%
1.265e+08 -29.3% 89456361 ± 11% perf-stat.i.iTLB-load-misses
738.29 ± 2% +18.9% 878.12 ± 8% perf-stat.i.instructions-per-iTLB-miss
1.81 ± 2% -0.9 0.93 ± 9% perf-stat.overall.branch-miss-rate%
2.62 -3.7% 2.52 ± 2% perf-stat.overall.cpi
96.82 -1.3 95.51 perf-stat.overall.iTLB-load-miss-rate%
464.17 ± 2% +63.7% 759.84 ± 5% perf-stat.overall.instructions-per-iTLB-miss
1.522e+08 ± 2% -41.0% 89728767 ± 4% perf-stat.ps.branch-misses
634910 +52.1% 965837 ± 26% perf-stat.ps.dTLB-store-misses
1.241e+08 ± 2% -29.3% 87765598 ± 11% perf-stat.ps.iTLB-load-misses
2.229e+13 +3.5% 2.307e+13 perf-stat.total.instructions
18696 ± 53% -61.0% 7295 ±115% numa-meminfo.node0.Inactive
18694 ± 53% -62.2% 7067 ±117% numa-meminfo.node0.Inactive(anon)
11339 ± 22% -23.8% 8636 ± 36% numa-meminfo.node0.Mapped
124395 ± 7% -36.0% 79620 ± 65% numa-meminfo.node1.Active
124287 ± 7% -35.9% 79620 ± 65% numa-meminfo.node1.Active(anon)
107555 ± 3% -49.5% 54262 ± 77% numa-meminfo.node1.AnonHugePages
123351 ± 6% -47.3% 64996 ± 70% numa-meminfo.node1.AnonPages
25266 ± 13% +31.2% 33157 ± 5% numa-meminfo.node1.KReclaimable
25266 ± 13% +31.2% 33157 ± 5% numa-meminfo.node1.SReclaimable
10856 ± 72% +187.1% 31171 ± 29% numa-meminfo.node1.Shmem
86531 +11.9% 96794 numa-meminfo.node1.Slab
56683 ± 38% -59.7% 22853 ±141% numa-meminfo.node2.AnonHugePages
72407 ± 22% -60.8% 28350 ±136% numa-meminfo.node2.AnonPages
6571 ± 3% -10.7% 5866 ± 7% numa-meminfo.node2.KernelStack
2259 ± 34% -57.8% 954.00 ± 76% numa-meminfo.node2.PageTables
57693 ± 94% +143.1% 140270 ± 13% numa-meminfo.node3.Active
57693 ± 94% +143.1% 140270 ± 13% numa-meminfo.node3.Active(anon)
25926 ±100% +354.0% 117714 ± 19% numa-meminfo.node3.AnonHugePages
37807 ± 95% +271.8% 140574 ± 13% numa-meminfo.node3.AnonPages
792.00 ± 48% +90.4% 1508 ± 6% numa-meminfo.node3.PageTables
30.68 ± 5% -6.2 24.43 ± 2% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
29.36 ± 4% -6.0 23.31 ± 2% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
28.09 ± 5% -5.9 22.18 ± 2% perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
24.84 ± 5% -5.1 19.71 ± 2% perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
22.09 ± 5% -3.6 18.49 ± 2% perf-profile.calltrace.cycles-pp.futex_wake.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
17.96 ± 5% -2.9 15.11 ± 2% perf-profile.calltrace.cycles-pp.get_futex_key.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
16.11 ± 4% -2.4 13.70 ± 2% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64
13.50 ± 5% -1.9 11.59 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret
12.85 ± 5% -1.9 10.96 ± 2% perf-profile.calltrace.cycles-pp.get_user_pages_fast.get_futex_key.futex_wake.do_futex.__x64_sys_futex
11.27 ± 5% -1.6 9.63 ± 2% perf-profile.calltrace.cycles-pp.gup_pgd_range.get_user_pages_fast.get_futex_key.futex_wake.do_futex
2.13 ± 18% -0.6 1.56 ± 13% perf-profile.calltrace.cycles-pp.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter_state.do_idle.cpu_startup_entry
2.30 ± 17% -0.5 1.75 ± 12% perf-profile.calltrace.cycles-pp.apic_timer_interrupt.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
1.31 ± 13% -0.4 0.94 ± 12% perf-profile.calltrace.cycles-pp.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter_state.do_idle
1.78 ± 5% -0.3 1.48 ± 2% perf-profile.calltrace.cycles-pp.get_futex_key_refs.get_futex_key.futex_wake.do_futex.__x64_sys_futex
0.85 ± 9% -0.2 0.61 ± 10% perf-profile.calltrace.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter_state
1.25 ± 6% -0.2 1.03 ± 2% perf-profile.calltrace.cycles-pp.drop_futex_key_refs.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
1.29 ± 2% -0.2 1.11 ± 2% perf-profile.calltrace.cycles-pp.hash_futex.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
0.83 ± 2% -0.1 0.69 ± 6% perf-profile.calltrace.cycles-pp.menu_select.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
36.46 ± 8% +11.1 47.52 perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64
36.46 ± 8% +11.1 47.52 perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64
36.45 ± 8% +11.1 47.52 perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
36.85 ± 8% +11.2 48.01 ± 2% perf-profile.calltrace.cycles-pp.secondary_startup_64
35.20 ± 8% +11.3 46.47 perf-profile.calltrace.cycles-pp.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
32.78 ± 8% +11.9 44.64 ± 2% perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
31888 ± 12% -21.0% 25192 ± 24% softirqs.CPU0.SCHED
34289 ± 14% -24.0% 26070 ± 29% softirqs.CPU1.RCU
34567 ± 10% -30.5% 24026 ± 33% softirqs.CPU1.SCHED
32531 ± 2% -20.8% 25760 ± 20% softirqs.CPU108.RCU
25490 ± 8% -22.4% 19783 ± 22% softirqs.CPU114.RCU
32019 -29.3% 22624 ± 30% softirqs.CPU118.RCU
30997 ± 8% -28.9% 22027 ± 31% softirqs.CPU119.RCU
31873 ± 6% -20.9% 25222 ± 20% softirqs.CPU132.RCU
33363 ± 6% -13.0% 29035 ± 14% softirqs.CPU137.SCHED
33894 ± 12% -15.8% 28533 ± 17% softirqs.CPU139.SCHED
32677 ± 7% -18.0% 26807 ± 21% softirqs.CPU140.SCHED
33447 ± 8% -14.2% 28692 ± 15% softirqs.CPU141.SCHED
36590 ± 4% -21.2% 28824 ± 15% softirqs.CPU142.SCHED
37643 -23.3% 28882 ± 15% softirqs.CPU143.SCHED
164778 ± 21% -20.9% 130261 ± 19% softirqs.CPU143.TIMER
35481 ± 6% -24.8% 26691 ± 26% softirqs.CPU146.RCU
32886 -32.2% 22305 ± 20% softirqs.CPU148.RCU
35817 ± 2% -26.8% 26208 ± 15% softirqs.CPU153.RCU
39524 ± 4% -32.0% 26867 ± 38% softirqs.CPU157.RCU
32255 ± 21% -22.9% 24866 ± 23% softirqs.CPU161.RCU
27970 ± 6% -25.8% 20754 ± 15% softirqs.CPU162.RCU
34679 ± 10% -38.6% 21287 ± 15% softirqs.CPU163.RCU
36891 ± 10% -29.2% 26124 ± 36% softirqs.CPU165.RCU
34348 -31.8% 23438 ± 19% softirqs.CPU168.RCU
135565 ± 5% -3.5% 130798 ± 6% softirqs.CPU168.TIMER
33609 ± 7% -38.0% 20840 ± 25% softirqs.CPU169.RCU
138004 ± 7% -7.0% 128364 ± 3% softirqs.CPU169.TIMER
31693 ± 3% -26.4% 23322 ± 18% softirqs.CPU170.RCU
137904 ± 3% -7.2% 127942 ± 2% softirqs.CPU170.TIMER
31190 ± 12% -37.9% 19366 ± 5% softirqs.CPU171.RCU
135343 ± 5% -9.4% 122587 ± 2% softirqs.CPU171.TIMER
32093 ± 6% -32.0% 21809 ± 14% softirqs.CPU172.RCU
135088 ± 6% -5.7% 127421 ± 3% softirqs.CPU172.TIMER
33396 -39.0% 20369 ± 22% softirqs.CPU173.RCU
33154 -36.0% 21227 ± 27% softirqs.CPU174.RCU
134831 ± 6% -5.2% 127815 ± 4% softirqs.CPU174.TIMER
32295 -45.1% 17733 ± 7% softirqs.CPU175.RCU
135104 ± 6% -5.8% 127215 ± 3% softirqs.CPU175.TIMER
38022 -22.4% 29512 ± 17% softirqs.CPU176.RCU
142550 ± 5% -7.7% 131547 ± 7% softirqs.CPU176.TIMER
144473 ± 4% -6.3% 135320 ± 6% softirqs.CPU177.TIMER
36694 ± 2% -25.4% 27366 ± 9% softirqs.CPU178.RCU
142513 ± 5% -7.4% 131975 ± 6% softirqs.CPU178.TIMER
35080 ± 16% -21.1% 27684 ± 6% softirqs.CPU179.RCU
142200 ± 5% -8.4% 130241 ± 5% softirqs.CPU179.TIMER
25564 ± 9% -16.8% 21275 ± 22% softirqs.CPU18.RCU
35852 ± 4% -25.2% 26821 ± 26% softirqs.CPU180.RCU
142217 ± 5% -7.0% 132249 ± 6% softirqs.CPU180.TIMER
38047 ± 2% -22.9% 29335 ± 17% softirqs.CPU181.RCU
141419 ± 6% -7.0% 131498 ± 6% softirqs.CPU181.TIMER
40784 -23.1% 31365 ± 17% softirqs.CPU182.RCU
142421 ± 6% -6.7% 132915 ± 7% softirqs.CPU182.TIMER
142474 ± 6% -5.8% 134151 ± 8% softirqs.CPU183.TIMER
40752 ± 3% -18.6% 33186 ± 20% softirqs.CPU186.SCHED
142550 ± 5% -6.8% 132791 ± 7% softirqs.CPU186.TIMER
141886 ± 6% -5.7% 133755 ± 8% softirqs.CPU187.TIMER
141386 ± 5% -6.0% 132865 ± 7% softirqs.CPU188.TIMER
141389 ± 6% -4.9% 134434 ± 8% softirqs.CPU190.TIMER
142156 ± 5% -5.7% 134100 ± 7% softirqs.CPU191.TIMER
32108 ± 4% -27.3% 23333 ± 28% softirqs.CPU23.RCU
37312 ± 13% -31.6% 25514 ± 29% softirqs.CPU23.SCHED
35687 ± 9% -22.9% 27503 ± 25% softirqs.CPU24.SCHED
167185 ± 18% -21.1% 131935 ± 20% softirqs.CPU27.TIMER
30756 ± 12% -24.3% 23280 ± 33% softirqs.CPU3.SCHED
31470 ± 4% -23.0% 24238 ± 14% softirqs.CPU35.RCU
35117 -22.3% 27296 ± 20% softirqs.CPU40.SCHED
33658 ± 4% -14.8% 28669 ± 15% softirqs.CPU41.SCHED
43560 ± 24% -30.9% 30087 ± 33% softirqs.CPU43.SCHED
39178 ± 15% -21.7% 30690 ± 24% softirqs.CPU48.SCHED
32934 -22.4% 25567 ± 15% softirqs.CPU52.RCU
38551 ± 5% -27.3% 28039 ± 29% softirqs.CPU56.RCU
36428 ± 2% -26.5% 26778 ± 17% softirqs.CPU57.RCU
34695 ± 3% -26.2% 25597 ± 21% softirqs.CPU6.RCU
36344 ± 10% -20.6% 28844 ± 27% softirqs.CPU60.RCU
40092 ± 2% -24.5% 30260 ± 25% softirqs.CPU63.RCU
32098 ± 20% -17.2% 26573 ± 21% softirqs.CPU65.RCU
32184 ± 14% -25.2% 24060 ± 13% softirqs.CPU67.RCU
35961 ± 12% -22.4% 27912 ± 30% softirqs.CPU69.RCU
29635 ± 24% -22.4% 22992 ± 32% softirqs.CPU7.SCHED
41084 ± 7% -19.4% 33099 ± 23% softirqs.CPU70.SCHED
35943 ± 2% -32.8% 24163 ± 14% softirqs.CPU72.RCU
33269 -34.7% 21719 ± 27% softirqs.CPU73.RCU
35198 -32.6% 23740 ± 15% softirqs.CPU74.RCU
143966 ± 3% -8.9% 131173 softirqs.CPU74.TIMER
33372 ± 5% -36.8% 21101 ± 4% softirqs.CPU75.RCU
32488 ± 8% -28.8% 23137 ± 10% softirqs.CPU76.RCU
144842 ± 2% -8.6% 132441 ± 2% softirqs.CPU76.TIMER
34795 -38.4% 21426 ± 15% softirqs.CPU77.RCU
144610 ± 2% -9.8% 130385 ± 2% softirqs.CPU77.TIMER
34656 ± 4% -32.8% 23299 ± 16% softirqs.CPU78.RCU
32988 ± 3% -32.8% 22171 ± 14% softirqs.CPU79.RCU
38438 ± 2% -22.8% 29682 ± 18% softirqs.CPU80.RCU
148881 ± 4% -8.1% 136791 ± 5% softirqs.CPU80.TIMER
147100 ± 4% -7.8% 135590 ± 7% softirqs.CPU81.TIMER
38189 -26.5% 28076 ± 12% softirqs.CPU82.RCU
35589 ± 8% -23.4% 27269 ± 11% softirqs.CPU83.RCU
149704 ± 3% -9.3% 135755 ± 3% softirqs.CPU83.TIMER
33771 ± 7% -16.2% 28300 ± 15% softirqs.CPU84.RCU
150278 ± 3% -9.1% 136582 ± 4% softirqs.CPU84.TIMER
39465 -27.4% 28644 ± 16% softirqs.CPU85.RCU
149894 ± 3% -8.7% 136794 ± 4% softirqs.CPU85.TIMER
38696 -19.9% 31006 ± 16% softirqs.CPU86.RCU
149493 ± 3% -7.3% 138570 ± 5% softirqs.CPU86.TIMER
150904 ± 3% -6.8% 140701 ± 7% softirqs.CPU87.TIMER
150105 ± 3% -8.1% 137895 ± 5% softirqs.CPU90.TIMER
149503 ± 3% -6.8% 139333 ± 7% softirqs.CPU91.TIMER
43187 -23.7% 32947 ± 23% softirqs.CPU92.SCHED
148634 ± 4% -6.5% 138999 ± 6% softirqs.CPU92.TIMER
148999 ± 4% -6.6% 139212 ± 6% softirqs.CPU93.TIMER
148447 ± 4% -5.6% 140189 ± 6% softirqs.CPU94.TIMER
149868 ± 3% -7.4% 138782 ± 6% softirqs.CPU95.TIMER
1198 ± 82% -84.7% 183.33 ± 14% interrupts.115:IR-PCI-MSI.1574917-edge.eth1-TxRx-5
220.50 ± 3% -5.8% 207.67 interrupts.116:IR-PCI-MSI.1574918-edge.eth1-TxRx-6
208.50 ± 8% -16.5% 174.00 ± 16% interrupts.118:IR-PCI-MSI.1574920-edge.eth1-TxRx-8
340.50 ± 43% -47.9% 177.33 ± 16% interrupts.119:IR-PCI-MSI.1574921-edge.eth1-TxRx-9
276.50 ± 29% -30.7% 191.67 ± 12% interrupts.124:IR-PCI-MSI.1574922-edge.eth1-TxRx-10
327.00 ± 4% -42.7% 187.33 ± 21% interrupts.126:IR-PCI-MSI.1574924-edge.eth1-TxRx-12
273.00 ± 30% -36.4% 173.67 ± 15% interrupts.139:IR-PCI-MSI.1574937-edge.eth1-TxRx-25
212.00 ± 10% -18.2% 173.33 ± 16% interrupts.142:IR-PCI-MSI.1574940-edge.eth1-TxRx-28
377.50 -46.0% 204.00 ± 70% interrupts.179:IR-PCI-MSI.512000-edge.ahci[0000:00:1f.2]
1634 ± 12% +208.9% 5049 ± 40% interrupts.CPU0.NMI:Non-maskable_interrupts
1634 ± 12% +208.9% 5049 ± 40% interrupts.CPU0.PMI:Performance_monitoring_interrupts
1648 ± 12% +222.6% 5316 ± 7% interrupts.CPU1.NMI:Non-maskable_interrupts
1648 ± 12% +222.6% 5316 ± 7% interrupts.CPU1.PMI:Performance_monitoring_interrupts
16162 ± 29% -91.5% 1378 ± 50% interrupts.CPU1.RES:Rescheduling_interrupts
276.50 ± 29% -30.7% 191.67 ± 12% interrupts.CPU10.124:IR-PCI-MSI.1574922-edge.eth1-TxRx-10
3295 ± 12% +60.6% 5294 ± 7% interrupts.CPU10.NMI:Non-maskable_interrupts
3295 ± 12% +60.6% 5294 ± 7% interrupts.CPU10.PMI:Performance_monitoring_interrupts
3282 ± 12% +61.4% 5298 ± 7% interrupts.CPU11.NMI:Non-maskable_interrupts
3282 ± 12% +61.4% 5298 ± 7% interrupts.CPU11.PMI:Performance_monitoring_interrupts
327.00 ± 4% -42.7% 187.33 ± 21% interrupts.CPU12.126:IR-PCI-MSI.1574924-edge.eth1-TxRx-12
3306 ± 13% +61.0% 5323 ± 6% interrupts.CPU12.NMI:Non-maskable_interrupts
3306 ± 13% +61.0% 5323 ± 6% interrupts.CPU12.PMI:Performance_monitoring_interrupts
4826 ± 4% -11.8% 4254 ± 2% interrupts.CPU120.CAL:Function_call_interrupts
3309 ± 12% +60.6% 5315 ± 6% interrupts.CPU13.NMI:Non-maskable_interrupts
3309 ± 12% +60.6% 5315 ± 6% interrupts.CPU13.PMI:Performance_monitoring_interrupts
263.50 ± 36% +54.6% 407.33 ± 2% interrupts.CPU138.NMI:Non-maskable_interrupts
263.50 ± 36% +54.6% 407.33 ± 2% interrupts.CPU138.PMI:Performance_monitoring_interrupts
3292 ± 12% +61.2% 5307 ± 6% interrupts.CPU14.NMI:Non-maskable_interrupts
3292 ± 12% +61.2% 5307 ± 6% interrupts.CPU14.PMI:Performance_monitoring_interrupts
334.50 ± 22% +46.9% 491.33 ± 4% interrupts.CPU144.NMI:Non-maskable_interrupts
334.50 ± 22% +46.9% 491.33 ± 4% interrupts.CPU144.PMI:Performance_monitoring_interrupts
329.00 ± 25% +47.8% 486.33 ± 2% interrupts.CPU145.NMI:Non-maskable_interrupts
329.00 ± 25% +47.8% 486.33 ± 2% interrupts.CPU145.PMI:Performance_monitoring_interrupts
341.50 ± 25% +46.0% 498.67 ± 3% interrupts.CPU146.NMI:Non-maskable_interrupts
341.50 ± 25% +46.0% 498.67 ± 3% interrupts.CPU146.PMI:Performance_monitoring_interrupts
99.00 ± 74% -78.5% 21.33 ± 37% interrupts.CPU146.RES:Rescheduling_interrupts
331.00 ± 20% +53.9% 509.33 interrupts.CPU147.NMI:Non-maskable_interrupts
331.00 ± 20% +53.9% 509.33 interrupts.CPU147.PMI:Performance_monitoring_interrupts
354.50 ± 31% +50.3% 532.67 ± 6% interrupts.CPU148.NMI:Non-maskable_interrupts
354.50 ± 31% +50.3% 532.67 ± 6% interrupts.CPU148.PMI:Performance_monitoring_interrupts
332.00 ± 23% +66.4% 552.33 ± 12% interrupts.CPU149.NMI:Non-maskable_interrupts
332.00 ± 23% +66.4% 552.33 ± 12% interrupts.CPU149.PMI:Performance_monitoring_interrupts
3312 ± 13% +60.9% 5329 ± 6% interrupts.CPU15.NMI:Non-maskable_interrupts
3312 ± 13% +60.9% 5329 ± 6% interrupts.CPU15.PMI:Performance_monitoring_interrupts
339.50 ± 23% +56.1% 530.00 ± 4% interrupts.CPU151.NMI:Non-maskable_interrupts
339.50 ± 23% +56.1% 530.00 ± 4% interrupts.CPU151.PMI:Performance_monitoring_interrupts
349.50 ± 24% +42.0% 496.33 ± 2% interrupts.CPU152.NMI:Non-maskable_interrupts
349.50 ± 24% +42.0% 496.33 ± 2% interrupts.CPU152.PMI:Performance_monitoring_interrupts
325.50 ± 21% +53.7% 500.33 ± 3% interrupts.CPU153.NMI:Non-maskable_interrupts
325.50 ± 21% +53.7% 500.33 ± 3% interrupts.CPU153.PMI:Performance_monitoring_interrupts
340.50 ± 26% +57.6% 536.67 ± 3% interrupts.CPU154.NMI:Non-maskable_interrupts
340.50 ± 26% +57.6% 536.67 ± 3% interrupts.CPU154.PMI:Performance_monitoring_interrupts
324.00 ± 22% +56.9% 508.33 ± 3% interrupts.CPU155.NMI:Non-maskable_interrupts
324.00 ± 22% +56.9% 508.33 ± 3% interrupts.CPU155.PMI:Performance_monitoring_interrupts
322.50 ± 25% +68.7% 544.00 ± 6% interrupts.CPU158.NMI:Non-maskable_interrupts
322.50 ± 25% +68.7% 544.00 ± 6% interrupts.CPU158.PMI:Performance_monitoring_interrupts
3309 ± 13% +59.8% 5286 ± 7% interrupts.CPU16.NMI:Non-maskable_interrupts
3309 ± 13% +59.8% 5286 ± 7% interrupts.CPU16.PMI:Performance_monitoring_interrupts
377.00 ± 99% -100.0% 0.00 interrupts.CPU160.TLB:TLB_shootdowns
478.00 +12.6% 538.00 ± 5% interrupts.CPU163.NMI:Non-maskable_interrupts
478.00 +12.6% 538.00 ± 5% interrupts.CPU163.PMI:Performance_monitoring_interrupts
217.50 ± 99% -100.0% 0.00 interrupts.CPU163.TLB:TLB_shootdowns
437.50 ± 9% +28.5% 562.00 ± 2% interrupts.CPU165.NMI:Non-maskable_interrupts
437.50 ± 9% +28.5% 562.00 ± 2% interrupts.CPU165.PMI:Performance_monitoring_interrupts
365.50 ± 75% -87.4% 46.00 ± 88% interrupts.CPU166.RES:Rescheduling_interrupts
3313 ± 13% +59.9% 5299 ± 6% interrupts.CPU17.NMI:Non-maskable_interrupts
3313 ± 13% +59.9% 5299 ± 6% interrupts.CPU17.PMI:Performance_monitoring_interrupts
333.00 ± 29% +48.5% 494.67 ± 2% interrupts.CPU170.NMI:Non-maskable_interrupts
333.00 ± 29% +48.5% 494.67 ± 2% interrupts.CPU170.PMI:Performance_monitoring_interrupts
333.50 ± 28% +49.0% 497.00 ± 5% interrupts.CPU171.NMI:Non-maskable_interrupts
333.50 ± 28% +49.0% 497.00 ± 5% interrupts.CPU171.PMI:Performance_monitoring_interrupts
339.50 ± 24% +47.0% 499.00 ± 4% interrupts.CPU173.NMI:Non-maskable_interrupts
339.50 ± 24% +47.0% 499.00 ± 4% interrupts.CPU173.PMI:Performance_monitoring_interrupts
16.50 ± 27% +1370.7% 242.67 ±125% interrupts.CPU175.RES:Rescheduling_interrupts
357.50 ± 26% +43.8% 514.00 ± 5% interrupts.CPU177.NMI:Non-maskable_interrupts
357.50 ± 26% +43.8% 514.00 ± 5% interrupts.CPU177.PMI:Performance_monitoring_interrupts
16.00 ± 50% +343.8% 71.00 ± 83% interrupts.CPU177.RES:Rescheduling_interrupts
365.00 ± 27% +43.6% 524.00 ± 6% interrupts.CPU178.NMI:Non-maskable_interrupts
365.00 ± 27% +43.6% 524.00 ± 6% interrupts.CPU178.PMI:Performance_monitoring_interrupts
352.50 ± 21% +47.7% 520.67 ± 2% interrupts.CPU179.NMI:Non-maskable_interrupts
352.50 ± 21% +47.7% 520.67 ± 2% interrupts.CPU179.PMI:Performance_monitoring_interrupts
3286 ± 12% +61.5% 5307 ± 6% interrupts.CPU18.NMI:Non-maskable_interrupts
3286 ± 12% +61.5% 5307 ± 6% interrupts.CPU18.PMI:Performance_monitoring_interrupts
349.00 ± 27% +45.6% 508.00 ± 3% interrupts.CPU180.NMI:Non-maskable_interrupts
349.00 ± 27% +45.6% 508.00 ± 3% interrupts.CPU180.PMI:Performance_monitoring_interrupts
331.00 ± 25% +48.3% 491.00 ± 6% interrupts.CPU181.NMI:Non-maskable_interrupts
331.00 ± 25% +48.3% 491.00 ± 6% interrupts.CPU181.PMI:Performance_monitoring_interrupts
350.00 ± 28% +54.5% 540.67 ± 5% interrupts.CPU183.NMI:Non-maskable_interrupts
350.00 ± 28% +54.5% 540.67 ± 5% interrupts.CPU183.PMI:Performance_monitoring_interrupts
326.50 ± 27% +64.2% 536.00 ± 7% interrupts.CPU184.NMI:Non-maskable_interrupts
326.50 ± 27% +64.2% 536.00 ± 7% interrupts.CPU184.PMI:Performance_monitoring_interrupts
349.50 ± 33% +58.1% 552.67 ± 6% interrupts.CPU185.NMI:Non-maskable_interrupts
349.50 ± 33% +58.1% 552.67 ± 6% interrupts.CPU185.PMI:Performance_monitoring_interrupts
456.00 ± 4% +13.3% 516.67 ± 4% interrupts.CPU191.NMI:Non-maskable_interrupts
456.00 ± 4% +13.3% 516.67 ± 4% interrupts.CPU191.PMI:Performance_monitoring_interrupts
1635 ± 12% +225.0% 5313 ± 7% interrupts.CPU2.NMI:Non-maskable_interrupts
1635 ± 12% +225.0% 5313 ± 7% interrupts.CPU2.PMI:Performance_monitoring_interrupts
3335 ± 12% +59.8% 5330 ± 6% interrupts.CPU20.NMI:Non-maskable_interrupts
3335 ± 12% +59.8% 5330 ± 6% interrupts.CPU20.PMI:Performance_monitoring_interrupts
2601 ± 44% +104.3% 5314 ± 6% interrupts.CPU21.NMI:Non-maskable_interrupts
2601 ± 44% +104.3% 5314 ± 6% interrupts.CPU21.PMI:Performance_monitoring_interrupts
2598 ± 44% +104.7% 5318 ± 6% interrupts.CPU22.NMI:Non-maskable_interrupts
2598 ± 44% +104.7% 5318 ± 6% interrupts.CPU22.PMI:Performance_monitoring_interrupts
26782 ± 46% -99.9% 25.33 ± 44% interrupts.CPU23.RES:Rescheduling_interrupts
9012 ± 99% -99.7% 31.00 ± 53% interrupts.CPU24.RES:Rescheduling_interrupts
273.00 ± 30% -36.4% 173.67 ± 15% interrupts.CPU25.139:IR-PCI-MSI.1574937-edge.eth1-TxRx-25
19.00 ± 31% +15038.6% 2876 ±139% interrupts.CPU25.RES:Rescheduling_interrupts
212.00 ± 10% -18.2% 173.33 ± 16% interrupts.CPU28.142:IR-PCI-MSI.1574940-edge.eth1-TxRx-28
1636 ± 12% +224.6% 5312 ± 6% interrupts.CPU3.NMI:Non-maskable_interrupts
1636 ± 12% +224.6% 5312 ± 6% interrupts.CPU3.PMI:Performance_monitoring_interrupts
3283 ± 12% +62.9% 5350 ± 6% interrupts.CPU33.NMI:Non-maskable_interrupts
3283 ± 12% +62.9% 5350 ± 6% interrupts.CPU33.PMI:Performance_monitoring_interrupts
3259 ± 11% +64.1% 5348 ± 6% interrupts.CPU34.NMI:Non-maskable_interrupts
3259 ± 11% +64.1% 5348 ± 6% interrupts.CPU34.PMI:Performance_monitoring_interrupts
3293 ± 12% +62.1% 5337 ± 6% interrupts.CPU35.NMI:Non-maskable_interrupts
3293 ± 12% +62.1% 5337 ± 6% interrupts.CPU35.PMI:Performance_monitoring_interrupts
3286 ± 12% +61.6% 5310 ± 6% interrupts.CPU36.NMI:Non-maskable_interrupts
3286 ± 12% +61.6% 5310 ± 6% interrupts.CPU36.PMI:Performance_monitoring_interrupts
3261 ± 11% +63.3% 5325 ± 6% interrupts.CPU37.NMI:Non-maskable_interrupts
3261 ± 11% +63.3% 5325 ± 6% interrupts.CPU37.PMI:Performance_monitoring_interrupts
3279 ± 12% +61.9% 5309 ± 6% interrupts.CPU38.NMI:Non-maskable_interrupts
3279 ± 12% +61.9% 5309 ± 6% interrupts.CPU38.PMI:Performance_monitoring_interrupts
3341 ± 13% +59.5% 5331 ± 6% interrupts.CPU39.NMI:Non-maskable_interrupts
3341 ± 13% +59.5% 5331 ± 6% interrupts.CPU39.PMI:Performance_monitoring_interrupts
1622 ± 11% +228.4% 5327 ± 6% interrupts.CPU4.NMI:Non-maskable_interrupts
1622 ± 11% +228.4% 5327 ± 6% interrupts.CPU4.PMI:Performance_monitoring_interrupts
365.00 ± 3% +17.0% 427.00 ± 7% interrupts.CPU41.NMI:Non-maskable_interrupts
365.00 ± 3% +17.0% 427.00 ± 7% interrupts.CPU41.PMI:Performance_monitoring_interrupts
11.00 ± 9% +375.8% 52.33 ± 90% interrupts.CPU41.RES:Rescheduling_interrupts
355.00 ± 5% +24.0% 440.33 interrupts.CPU42.NMI:Non-maskable_interrupts
355.00 ± 5% +24.0% 440.33 interrupts.CPU42.PMI:Performance_monitoring_interrupts
24378 ± 99% -75.3% 6030 ±140% interrupts.CPU43.RES:Rescheduling_interrupts
435.00 ± 10% +22.8% 534.33 ± 6% interrupts.CPU48.NMI:Non-maskable_interrupts
435.00 ± 10% +22.8% 534.33 ± 6% interrupts.CPU48.PMI:Performance_monitoring_interrupts
18756 ± 99% -99.9% 27.33 ± 33% interrupts.CPU48.RES:Rescheduling_interrupts
1198 ± 82% -84.7% 183.33 ± 14% interrupts.CPU5.115:IR-PCI-MSI.1574917-edge.eth1-TxRx-5
1619 ± 11% +226.6% 5289 ± 7% interrupts.CPU5.NMI:Non-maskable_interrupts
1619 ± 11% +226.6% 5289 ± 7% interrupts.CPU5.PMI:Performance_monitoring_interrupts
70.00 ± 77% -81.4% 13.00 ± 16% interrupts.CPU58.RES:Rescheduling_interrupts
60.50 ± 65% -70.2% 18.00 ± 31% interrupts.CPU59.RES:Rescheduling_interrupts
220.50 ± 3% -5.8% 207.67 interrupts.CPU6.116:IR-PCI-MSI.1574918-edge.eth1-TxRx-6
2583 ± 44% +105.6% 5309 ± 6% interrupts.CPU6.NMI:Non-maskable_interrupts
2583 ± 44% +105.6% 5309 ± 6% interrupts.CPU6.PMI:Performance_monitoring_interrupts
295.00 ± 92% -95.7% 12.67 ± 3% interrupts.CPU60.RES:Rescheduling_interrupts
334.50 ± 30% +63.1% 545.67 ± 3% interrupts.CPU62.NMI:Non-maskable_interrupts
334.50 ± 30% +63.1% 545.67 ± 3% interrupts.CPU62.PMI:Performance_monitoring_interrupts
343.00 ± 29% +55.3% 532.67 ± 3% interrupts.CPU63.NMI:Non-maskable_interrupts
343.00 ± 29% +55.3% 532.67 ± 3% interrupts.CPU63.PMI:Performance_monitoring_interrupts
357.50 ± 30% +58.2% 565.67 ± 2% interrupts.CPU64.NMI:Non-maskable_interrupts
357.50 ± 30% +58.2% 565.67 ± 2% interrupts.CPU64.PMI:Performance_monitoring_interrupts
433.00 ± 6% +16.7% 505.33 interrupts.CPU68.NMI:Non-maskable_interrupts
433.00 ± 6% +16.7% 505.33 interrupts.CPU68.PMI:Performance_monitoring_interrupts
350.50 ± 29% +52.0% 532.67 interrupts.CPU69.NMI:Non-maskable_interrupts
350.50 ± 29% +52.0% 532.67 interrupts.CPU69.PMI:Performance_monitoring_interrupts
2579 ± 44% +106.5% 5326 ± 6% interrupts.CPU7.NMI:Non-maskable_interrupts
2579 ± 44% +106.5% 5326 ± 6% interrupts.CPU7.PMI:Performance_monitoring_interrupts
351.50 ± 24% +60.1% 562.67 ± 4% interrupts.CPU70.NMI:Non-maskable_interrupts
351.50 ± 24% +60.1% 562.67 ± 4% interrupts.CPU70.PMI:Performance_monitoring_interrupts
8765 ± 98% -95.8% 366.67 ±137% interrupts.CPU70.RES:Rescheduling_interrupts
12.00 ± 25% +83638.9% 10048 ±140% interrupts.CPU76.RES:Rescheduling_interrupts
208.50 ± 8% -16.5% 174.00 ± 16% interrupts.CPU8.118:IR-PCI-MSI.1574920-edge.eth1-TxRx-8
2600 ± 44% +103.9% 5302 ± 7% interrupts.CPU8.NMI:Non-maskable_interrupts
2600 ± 44% +103.9% 5302 ± 7% interrupts.CPU8.PMI:Performance_monitoring_interrupts
18.00 +1.6e+05% 28221 ± 73% interrupts.CPU8.RES:Rescheduling_interrupts
11.50 ± 39% +1253.6% 155.67 ± 55% interrupts.CPU85.RES:Rescheduling_interrupts
5548 ± 99% -96.9% 170.67 ±129% interrupts.CPU87.RES:Rescheduling_interrupts
8.50 ± 17% +2182.4% 194.00 ±123% interrupts.CPU88.RES:Rescheduling_interrupts
340.50 ± 43% -47.9% 177.33 ± 16% interrupts.CPU9.119:IR-PCI-MSI.1574921-edge.eth1-TxRx-9
3298 ± 12% +62.7% 5366 ± 5% interrupts.CPU9.NMI:Non-maskable_interrupts
3298 ± 12% +62.7% 5366 ± 5% interrupts.CPU9.PMI:Performance_monitoring_interrupts
7.50 ± 46% +84597.8% 6352 ±140% interrupts.CPU90.RES:Rescheduling_interrupts
16.00 ± 68% +22747.9% 3655 ±122% interrupts.CPU91.RES:Rescheduling_interrupts
433127 ± 10% +26.8% 549204 ± 2% interrupts.NMI:Non-maskable_interrupts
433127 ± 10% +26.8% 549204 ± 2% interrupts.PMI:Performance_monitoring_interrupts



***************************************************************************************************
lkp-skl-2sp5: 112 threads Intel(R) Xeon(R) Platinum 8170 CPU @ 2.10GHz with 128G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-8/performance/x86_64-rhel-7.6/process/50%/debian-x86_64-2018-04-03.cgz/lkp-skl-2sp5/futex2/will-it-scale

commit:
e6d7bc0bdf ("x86/build: Use the single-argument OUTPUT_FORMAT() linker script command")
ce02ef06fc ("x86, retpolines: Raise limit for generating indirect calls from switch-case")

e6d7bc0bdf4155e8 ce02ef06fcf7a399a6276adb83f
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
1:4 -25% :4 kmsg.usb#-#:can't_read_configurations,error
1:4 -25% :4 kmsg.usb#-#:unable_to_read_config_index#descriptor/all
%stddev %change %stddev
\ | \
1780260 +2.5% 1824313 will-it-scale.per_process_ops
99694583 +2.5% 1.022e+08 will-it-scale.workload
0.00 ± 6% -0.0 0.00 ±167% mpstat.cpu.-1.iowait%
5.446e+09 ± 91% -92.5% 4.098e+08 ± 49% cpuidle.C6.time
7644348 ± 90% -92.7% 558721 ± 52% cpuidle.C6.usage
7621943 ± 91% -92.9% 541456 ± 54% turbostat.C6
16.06 ± 91% -14.9 1.19 ± 50% turbostat.C6%
0.74 ± 10% +0.1 0.86 ± 8% perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_kernel
0.74 ± 10% +0.1 0.88 ± 8% perf-profile.calltrace.cycles-pp.cpuidle_enter_state.do_idle.cpu_startup_entry.start_kernel.secondary_startup_64
0.74 ± 10% +0.1 0.88 ± 7% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_kernel.secondary_startup_64
0.74 ± 10% +0.1 0.88 ± 7% perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_kernel.secondary_startup_64
0.74 ± 10% +0.1 0.88 ± 7% perf-profile.calltrace.cycles-pp.start_kernel.secondary_startup_64
69598 +3.0% 71654 proc-vmstat.nr_active_anon
4658 +1.0% 4705 proc-vmstat.nr_inactive_anon
13814 ± 9% +13.2% 15637 proc-vmstat.nr_shmem
69598 +3.0% 71654 proc-vmstat.nr_zone_active_anon
4658 +1.0% 4705 proc-vmstat.nr_zone_inactive_anon
2144 ±130% -96.4% 76.25 ±144% proc-vmstat.numa_hint_faults
1909 ±149% -98.7% 24.75 ±164% proc-vmstat.numa_hint_faults_local
700177 +1.1% 707887 proc-vmstat.numa_hit
678374 +1.1% 686063 proc-vmstat.numa_local
27069 ± 14% -13.8% 23325 ± 15% softirqs.CPU16.RCU
27517 ± 10% -20.8% 21794 ± 14% softirqs.CPU33.RCU
25029 ± 6% -14.8% 21319 ± 10% softirqs.CPU34.RCU
26362 ± 10% -20.8% 20887 softirqs.CPU35.RCU
26078 ± 12% -16.0% 21918 ± 16% softirqs.CPU37.RCU
27103 ± 9% -23.8% 20659 ± 23% softirqs.CPU38.RCU
27840 ± 13% -23.7% 21236 ± 24% softirqs.CPU39.RCU
27102 ± 12% -18.1% 22205 ± 15% softirqs.CPU40.RCU
27838 ± 13% -18.2% 22779 ± 17% softirqs.CPU41.RCU
27996 ± 14% -18.4% 22836 ± 16% softirqs.CPU47.RCU
12.71 ± 8% +15.9% 14.73 ± 3% sched_debug.cfs_rq:/.load_avg.avg
30.68 ± 12% +46.2% 44.85 ± 22% sched_debug.cfs_rq:/.load_avg.stddev
1.11 ±109% +206.5% 3.40 ± 19% sched_debug.cfs_rq:/.removed.load_avg.avg
9.42 ±102% +205.7% 28.80 ± 38% sched_debug.cfs_rq:/.removed.load_avg.stddev
51.21 ±109% +206.5% 156.94 ± 19% sched_debug.cfs_rq:/.removed.runnable_sum.avg
435.14 ±101% +205.3% 1328 ± 38% sched_debug.cfs_rq:/.removed.runnable_sum.stddev
0.40 ±120% +314.7% 1.65 ± 25% sched_debug.cfs_rq:/.removed.util_avg.avg
32.33 ±107% +304.4% 130.75 ± 57% sched_debug.cfs_rq:/.removed.util_avg.max
3.35 ±110% +322.6% 14.17 ± 43% sched_debug.cfs_rq:/.removed.util_avg.stddev
5.66 ± 2% -20.5% 4.50 ± 3% sched_debug.cpu.clock.stddev
5.66 ± 2% -20.5% 4.50 ± 3% sched_debug.cpu.clock_task.stddev
1.421e+10 +2.3% 1.454e+10 perf-stat.i.branch-instructions
1.44 -0.7 0.74 perf-stat.i.branch-miss-rate%
2.051e+08 -47.4% 1.079e+08 perf-stat.i.branch-misses
479525 ± 5% +12.6% 539837 ± 5% perf-stat.i.cache-misses
4870102 ± 2% +5.6% 5143022 perf-stat.i.cache-references
1.27 -2.3% 1.24 perf-stat.i.cpi
278227 ± 5% -12.1% 244626 ± 5% perf-stat.i.cycles-between-cache-misses
1.563e+10 +2.5% 1.601e+10 perf-stat.i.dTLB-stores
1.078e+08 ± 9% -12.9% 93908148 ± 10% perf-stat.i.iTLB-load-misses
4631458 ± 14% -21.8% 3623934 ± 10% perf-stat.i.iTLB-loads
9.725e+10 +2.3% 9.947e+10 perf-stat.i.instructions
0.79 +2.4% 0.81 perf-stat.i.ipc
0.05 ± 2% +3.2% 0.05 perf-stat.overall.MPKI
1.44 -0.7 0.74 perf-stat.overall.branch-miss-rate%
1.27 -2.3% 1.24 perf-stat.overall.cpi
258391 ± 5% -11.3% 229282 ± 5% perf-stat.overall.cycles-between-cache-misses
0.79 +2.4% 0.81 perf-stat.overall.ipc
1.416e+10 +2.3% 1.449e+10 perf-stat.ps.branch-instructions
2.044e+08 -47.4% 1.075e+08 perf-stat.ps.branch-misses
478404 ± 5% +12.5% 538400 ± 5% perf-stat.ps.cache-misses
4859880 ± 2% +5.6% 5131025 perf-stat.ps.cache-references
1.558e+10 +2.5% 1.596e+10 perf-stat.ps.dTLB-stores
1.074e+08 ± 9% -12.9% 93588790 ± 10% perf-stat.ps.iTLB-load-misses
4615690 ± 14% -21.8% 3611652 ± 10% perf-stat.ps.iTLB-loads
9.691e+10 +2.3% 9.913e+10 perf-stat.ps.instructions
2.923e+13 +2.4% 2.992e+13 perf-stat.total.instructions
342.50 ± 13% +32.8% 455.00 ± 10% interrupts.68:PCI-MSI.12589057-edge.eth3-TxRx-0
3327 ± 15% +45.8% 4850 ± 9% interrupts.CPU0.NMI:Non-maskable_interrupts
3327 ± 15% +45.8% 4850 ± 9% interrupts.CPU0.PMI:Performance_monitoring_interrupts
3303 ± 14% +44.9% 4786 ± 9% interrupts.CPU1.NMI:Non-maskable_interrupts
3303 ± 14% +44.9% 4786 ± 9% interrupts.CPU1.PMI:Performance_monitoring_interrupts
3308 ± 14% +44.0% 4763 ± 10% interrupts.CPU10.NMI:Non-maskable_interrupts
3308 ± 14% +44.0% 4763 ± 10% interrupts.CPU10.PMI:Performance_monitoring_interrupts
3303 ± 14% +43.7% 4747 ± 10% interrupts.CPU11.NMI:Non-maskable_interrupts
3303 ± 14% +43.7% 4747 ± 10% interrupts.CPU11.PMI:Performance_monitoring_interrupts
3328 ± 15% +43.4% 4773 ± 10% interrupts.CPU12.NMI:Non-maskable_interrupts
3328 ± 15% +43.4% 4773 ± 10% interrupts.CPU12.PMI:Performance_monitoring_interrupts
19.50 ± 59% +2679.5% 542.00 ±139% interrupts.CPU15.RES:Rescheduling_interrupts
2013 ± 26% +109.1% 4208 ± 28% interrupts.CPU18.NMI:Non-maskable_interrupts
2013 ± 26% +109.1% 4208 ± 28% interrupts.CPU18.PMI:Performance_monitoring_interrupts
2005 ± 27% +108.9% 4189 ± 29% interrupts.CPU19.NMI:Non-maskable_interrupts
2005 ± 27% +108.9% 4189 ± 29% interrupts.CPU19.PMI:Performance_monitoring_interrupts
3343 ± 16% +42.6% 4769 ± 10% interrupts.CPU2.NMI:Non-maskable_interrupts
3343 ± 16% +42.6% 4769 ± 10% interrupts.CPU2.PMI:Performance_monitoring_interrupts
2187 ± 52% +90.6% 4170 ± 28% interrupts.CPU21.NMI:Non-maskable_interrupts
2187 ± 52% +90.6% 4170 ± 28% interrupts.CPU21.PMI:Performance_monitoring_interrupts
9.00 ± 27% +1941.7% 183.75 ±129% interrupts.CPU21.RES:Rescheduling_interrupts
2179 ± 52% +64.9% 3595 ± 37% interrupts.CPU22.NMI:Non-maskable_interrupts
2179 ± 52% +64.9% 3595 ± 37% interrupts.CPU22.PMI:Performance_monitoring_interrupts
2180 ± 52% +63.8% 3570 ± 37% interrupts.CPU23.NMI:Non-maskable_interrupts
2180 ± 52% +63.8% 3570 ± 37% interrupts.CPU23.PMI:Performance_monitoring_interrupts
2189 ± 52% +63.2% 3572 ± 37% interrupts.CPU24.NMI:Non-maskable_interrupts
2189 ± 52% +63.2% 3572 ± 37% interrupts.CPU24.PMI:Performance_monitoring_interrupts
2171 ± 52% +63.7% 3554 ± 38% interrupts.CPU25.NMI:Non-maskable_interrupts
2171 ± 52% +63.7% 3554 ± 38% interrupts.CPU25.PMI:Performance_monitoring_interrupts
2169 ± 52% +71.3% 3715 ± 43% interrupts.CPU27.NMI:Non-maskable_interrupts
2169 ± 52% +71.3% 3715 ± 43% interrupts.CPU27.PMI:Performance_monitoring_interrupts
2165 ± 52% +115.3% 4662 ± 32% interrupts.CPU28.NMI:Non-maskable_interrupts
2165 ± 52% +115.3% 4662 ± 32% interrupts.CPU28.PMI:Performance_monitoring_interrupts
2628 ± 41% +69.4% 4453 ± 27% interrupts.CPU29.NMI:Non-maskable_interrupts
2628 ± 41% +69.4% 4453 ± 27% interrupts.CPU29.PMI:Performance_monitoring_interrupts
2157 ± 50% +94.6% 4197 ± 28% interrupts.CPU30.NMI:Non-maskable_interrupts
2157 ± 50% +94.6% 4197 ± 28% interrupts.CPU30.PMI:Performance_monitoring_interrupts
2190 ± 52% +88.0% 4118 ± 29% interrupts.CPU31.NMI:Non-maskable_interrupts
2190 ± 52% +88.0% 4118 ± 29% interrupts.CPU31.PMI:Performance_monitoring_interrupts
2891 ± 29% +63.3% 4723 ± 10% interrupts.CPU33.NMI:Non-maskable_interrupts
2891 ± 29% +63.3% 4723 ± 10% interrupts.CPU33.PMI:Performance_monitoring_interrupts
2859 ± 27% +66.1% 4751 ± 10% interrupts.CPU34.NMI:Non-maskable_interrupts
2859 ± 27% +66.1% 4751 ± 10% interrupts.CPU34.PMI:Performance_monitoring_interrupts
2885 ± 28% +63.5% 4717 ± 10% interrupts.CPU35.NMI:Non-maskable_interrupts
2885 ± 28% +63.5% 4717 ± 10% interrupts.CPU35.PMI:Performance_monitoring_interrupts
3303 ± 14% +57.0% 5186 ± 14% interrupts.CPU36.NMI:Non-maskable_interrupts
3303 ± 14% +57.0% 5186 ± 14% interrupts.CPU36.PMI:Performance_monitoring_interrupts
3306 ± 14% +43.4% 4740 ± 10% interrupts.CPU37.NMI:Non-maskable_interrupts
3306 ± 14% +43.4% 4740 ± 10% interrupts.CPU37.PMI:Performance_monitoring_interrupts
3296 ± 14% +57.0% 5174 ± 14% interrupts.CPU38.NMI:Non-maskable_interrupts
3296 ± 14% +57.0% 5174 ± 14% interrupts.CPU38.PMI:Performance_monitoring_interrupts
3319 ± 14% +54.7% 5136 ± 15% interrupts.CPU39.NMI:Non-maskable_interrupts
3319 ± 14% +54.7% 5136 ± 15% interrupts.CPU39.PMI:Performance_monitoring_interrupts
3295 ± 14% +42.7% 4701 ± 10% interrupts.CPU4.NMI:Non-maskable_interrupts
3295 ± 14% +42.7% 4701 ± 10% interrupts.CPU4.PMI:Performance_monitoring_interrupts
3277 ± 14% +44.4% 4731 ± 10% interrupts.CPU40.NMI:Non-maskable_interrupts
3277 ± 14% +44.4% 4731 ± 10% interrupts.CPU40.PMI:Performance_monitoring_interrupts
9.25 ± 64% +6537.8% 614.00 ±112% interrupts.CPU48.RES:Rescheduling_interrupts
4174 ± 6% +22.3% 5106 ± 7% interrupts.CPU5.NMI:Non-maskable_interrupts
4174 ± 6% +22.3% 5106 ± 7% interrupts.CPU5.PMI:Performance_monitoring_interrupts
2620 ± 46% +80.3% 4725 ± 11% interrupts.CPU50.NMI:Non-maskable_interrupts
2620 ± 46% +80.3% 4725 ± 11% interrupts.CPU50.PMI:Performance_monitoring_interrupts
2612 ± 45% +82.4% 4763 ± 10% interrupts.CPU51.NMI:Non-maskable_interrupts
2612 ± 45% +82.4% 4763 ± 10% interrupts.CPU51.PMI:Performance_monitoring_interrupts
3355 ± 12% +51.8% 5094 ± 15% interrupts.CPU52.NMI:Non-maskable_interrupts
3355 ± 12% +51.8% 5094 ± 15% interrupts.CPU52.PMI:Performance_monitoring_interrupts
3381 ± 14% +64.2% 5553 ± 15% interrupts.CPU53.NMI:Non-maskable_interrupts
3381 ± 14% +64.2% 5553 ± 15% interrupts.CPU53.PMI:Performance_monitoring_interrupts
3350 ± 12% +52.2% 5099 ± 15% interrupts.CPU54.NMI:Non-maskable_interrupts
3350 ± 12% +52.2% 5099 ± 15% interrupts.CPU54.PMI:Performance_monitoring_interrupts
2949 ± 33% +59.4% 4701 ± 11% interrupts.CPU55.NMI:Non-maskable_interrupts
2949 ± 33% +59.4% 4701 ± 11% interrupts.CPU55.PMI:Performance_monitoring_interrupts
342.50 ± 13% +32.8% 455.00 ± 10% interrupts.CPU65.68:PCI-MSI.12589057-edge.eth3-TxRx-0
3325 ± 15% +42.4% 4735 ± 10% interrupts.CPU7.NMI:Non-maskable_interrupts
3325 ± 15% +42.4% 4735 ± 10% interrupts.CPU7.PMI:Performance_monitoring_interrupts
3333 ± 14% +42.2% 4739 ± 10% interrupts.CPU8.NMI:Non-maskable_interrupts
3333 ± 14% +42.2% 4739 ± 10% interrupts.CPU8.PMI:Performance_monitoring_interrupts
3333 ± 14% +43.3% 4777 ± 10% interrupts.CPU9.NMI:Non-maskable_interrupts
3333 ± 14% +43.3% 4777 ± 10% interrupts.CPU9.PMI:Performance_monitoring_interrupts
468484 ± 15% +21.6% 569837 interrupts.NMI:Non-maskable_interrupts
468484 ± 15% +21.6% 569837 interrupts.PMI:Performance_monitoring_interrupts



***************************************************************************************************
lkp-bdw-ep3d: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-8/performance/x86_64-rhel-7.6/thread/50%/debian-x86_64-2018-04-03-no-ucode.cgz/lkp-bdw-ep3d/futex4/will-it-scale

commit:
e6d7bc0bdf ("x86/build: Use the single-argument OUTPUT_FORMAT() linker script command")
ce02ef06fc ("x86, retpolines: Raise limit for generating indirect calls from switch-case")

e6d7bc0bdf4155e8 ce02ef06fcf7a399a6276adb83f
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
1:4 -25% :4 dmesg.WARNING:at#for_ip_interrupt_entry/0x
%stddev %change %stddev
\ | \
2901967 +3.1% 2990951 will-it-scale.per_thread_ops
9330 -1.8% 9159 will-it-scale.time.system_time
3914 +4.4% 4084 will-it-scale.time.user_time
1.277e+08 +3.1% 1.316e+08 will-it-scale.workload
86447 +67.7% 145013 ± 58% cpuidle.POLL.time
79592 ± 48% +5261.0% 4266935 ±168% turbostat.C1E
14.00 +7.1% 15.00 vmstat.cpu.us
10229 +17.3% 11997 ± 15% sched_debug.cfs_rq:/.load.avg
10210 +17.3% 11981 ± 15% sched_debug.cfs_rq:/.runnable_weight.avg
10427 +100.4% 20894 ± 51% sched_debug.cfs_rq:/.runnable_weight.stddev
3.65 -21.3% 2.87 ± 2% sched_debug.cpu.clock.stddev
3.65 -21.3% 2.87 ± 2% sched_debug.cpu.clock_task.stddev
539.75 ± 51% -58.6% 223.25 ± 33% interrupts.36:PCI-MSI.3145733-edge.eth0-TxRx-4
539.75 ± 51% -58.6% 223.25 ± 33% interrupts.CPU15.36:PCI-MSI.3145733-edge.eth0-TxRx-4
58.00 ±163% -97.4% 1.50 ± 57% interrupts.CPU24.RES:Rescheduling_interrupts
33.75 ± 63% +2124.4% 750.75 ± 48% interrupts.CPU47.RES:Rescheduling_interrupts
5374 ± 33% -34.0% 3546 ± 32% interrupts.CPU72.NMI:Non-maskable_interrupts
5374 ± 33% -34.0% 3546 ± 32% interrupts.CPU72.PMI:Performance_monitoring_interrupts
5411 ± 32% -34.6% 3540 ± 32% interrupts.CPU79.NMI:Non-maskable_interrupts
5411 ± 32% -34.6% 3540 ± 32% interrupts.CPU79.PMI:Performance_monitoring_interrupts
11276 ± 6% +694.2% 89560 ±148% interrupts.RES:Rescheduling_interrupts
20950 +11.1% 23285 ± 6% softirqs.CPU11.RCU
19588 ± 2% +19.7% 23450 ± 5% softirqs.CPU15.RCU
18353 ± 14% +24.2% 22793 ± 6% softirqs.CPU16.RCU
20000 +11.7% 22349 ± 6% softirqs.CPU18.RCU
19975 +14.1% 22794 ± 5% softirqs.CPU19.RCU
20267 ± 2% +12.6% 22814 ± 3% softirqs.CPU20.RCU
20115 +15.1% 23144 ± 9% softirqs.CPU23.RCU
17876 ± 15% +20.3% 21509 ± 6% softirqs.CPU25.RCU
19571 +16.3% 22768 ± 4% softirqs.CPU26.RCU
19793 +16.2% 22992 ± 6% softirqs.CPU27.RCU
20419 +9.5% 22360 ± 7% softirqs.CPU3.RCU
128667 ± 27% -29.1% 91177 ± 3% softirqs.CPU33.TIMER
24150 ± 20% +28.9% 31140 ± 14% softirqs.CPU68.RCU
19588 ± 8% +16.8% 22885 ± 6% softirqs.CPU7.RCU
39375 ± 8% -17.1% 32661 ± 26% softirqs.CPU70.SCHED
24754 ± 18% +26.9% 31423 ± 14% softirqs.CPU72.RCU
39326 ± 9% -9.6% 35566 ± 13% softirqs.CPU78.SCHED
0.08 ± 2% -4.0% 0.07 ± 2% perf-stat.i.MPKI
1.058e+10 +3.0% 1.09e+10 perf-stat.i.branch-instructions
2.46 -1.2 1.25 perf-stat.i.branch-miss-rate%
2.599e+08 -47.6% 1.361e+08 perf-stat.i.branch-misses
1.68 -3.0% 1.63 perf-stat.i.cpi
1.89e+10 +2.6% 1.939e+10 ± 2% perf-stat.i.dTLB-loads
0.00 ± 29% -0.0 0.00 ± 5% perf-stat.i.dTLB-store-miss-rate%
103680 ± 34% -72.3% 28748 ± 12% perf-stat.i.dTLB-store-misses
1.678e+10 +2.0% 1.711e+10 perf-stat.i.dTLB-stores
7.327e+10 +3.0% 7.55e+10 perf-stat.i.instructions
0.59 +3.1% 0.61 perf-stat.i.ipc
68719 ± 2% +9.5% 75261 ± 6% perf-stat.i.node-load-misses
0.07 ± 2% -3.9% 0.07 ± 2% perf-stat.overall.MPKI
2.46 -1.2 1.25 perf-stat.overall.branch-miss-rate%
1.68 -3.0% 1.63 perf-stat.overall.cpi
0.00 ± 34% -0.0 0.00 ± 12% perf-stat.overall.dTLB-store-miss-rate%
0.59 +3.1% 0.61 perf-stat.overall.ipc
1.055e+10 +3.0% 1.086e+10 perf-stat.ps.branch-instructions
2.59e+08 -47.6% 1.357e+08 perf-stat.ps.branch-misses
1.883e+10 +2.6% 1.932e+10 ± 2% perf-stat.ps.dTLB-loads
103440 ± 34% -72.2% 28709 ± 12% perf-stat.ps.dTLB-store-misses
1.673e+10 +2.0% 1.705e+10 perf-stat.ps.dTLB-stores
7.302e+10 +3.0% 7.525e+10 perf-stat.ps.instructions
68505 ± 2% +9.5% 75031 ± 6% perf-stat.ps.node-load-misses
2.205e+13 +3.0% 2.272e+13 perf-stat.total.instructions
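
A note on reading the comparison tables: the middle column appears to use two conventions. Rows whose change ends in "%" look like relative changes against the base commit, while rows for metrics that are already percentages (for example branch-miss-rate%) look like plain differences in percentage points. The short Python sketch below re-derives three rows of the table above under that assumption. The helper names pct_change and point_change are illustrative only and are not part of the LKP tooling.

```python
def pct_change(base, new):
    """Relative change in percent (rows whose change column ends in '%')."""
    return (new - base) / base * 100.0


def point_change(base, new):
    """Absolute difference in percentage points (rows for '...%' metrics)."""
    return new - base


# Values copied from the futex4 / thread / 50% table above:
# (metric, base commit e6d7bc0bdf, patched commit ce02ef06fc)
per_thread_ops = ("will-it-scale.per_thread_ops", 2901967, 2990951)
branch_misses = ("perf-stat.i.branch-misses", 2.599e08, 1.361e08)
branch_miss_rate = ("perf-stat.i.branch-miss-rate%", 2.46, 1.25)

for name, base, new in (per_thread_ops, branch_misses):
    print(f"{name}: {pct_change(base, new):+.1f}%")

name, base, new = branch_miss_rate
print(f"{name}: {point_change(base, new):+.1f} points")
```

Under this reading, the roughly halved branch-miss count is reported as a relative change, whereas the branch-miss-rate row is the simple difference between the two rates.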



***************************************************************************************************
lkp-bdw-ep3d: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-8/performance/x86_64-rhel-7.6/thread/16/debian-x86_64-2018-04-03-no-ucode.cgz/lkp-bdw-ep3d/futex3/will-it-scale

commit:
e6d7bc0bdf ("x86/build: Use the single-argument OUTPUT_FORMAT() linker script command")
ce02ef06fc ("x86, retpolines: Raise limit for generating indirect calls from switch-case")

e6d7bc0bdf4155e8 ce02ef06fcf7a399a6276adb83f
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
:4 25% 1:4 dmesg.WARNING:at#for_ip_interrupt_entry/0x
%stddev %change %stddev
\ | \
3600203 +5.4% 3793291 will-it-scale.per_thread_ops
3151 -3.0% 3058 will-it-scale.time.system_time
1664 +5.6% 1757 will-it-scale.time.user_time
57603256 +5.4% 60692675 will-it-scale.workload
1501 ±165% +378.2% 7180 ± 64% numa-numastat.node1.other_node
10.65 ± 7% -1.4 9.29 ± 7% perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
29694546 ± 20% -64.2% 10632710 ± 96% cpuidle.C3.usage
8.85e+09 ± 54% +83.0% 1.62e+10 ± 22% cpuidle.C6.time
9019 ± 3% -6.7% 8412 ± 3% proc-vmstat.nr_shmem
5981 ± 7% -15.6% 5046 ± 9% proc-vmstat.pgactivate
1333 ± 2% -11.8% 1176 ± 7% slabinfo.kmalloc-rcl-96.active_objs
1333 ± 2% -11.8% 1176 ± 7% slabinfo.kmalloc-rcl-96.num_objs
6.25 ±138% +9.1e+05% 56693 ± 94% interrupts.CPU27.RES:Rescheduling_interrupts
705.25 ± 8% -23.1% 542.00 ± 24% interrupts.CPU31.NMI:Non-maskable_interrupts
705.25 ± 8% -23.1% 542.00 ± 24% interrupts.CPU31.PMI:Performance_monitoring_interrupts
29694116 ± 20% -64.2% 10631962 ± 96% turbostat.C3
33.23 ± 54% +27.6 60.82 ± 22% turbostat.C6%
42.30 ± 5% -12.9% 36.82 ± 7% turbostat.CPU%c1
13.87 ± 89% +148.9% 34.52 ± 30% turbostat.CPU%c6
130.83 -2.6% 127.37 turbostat.PkgWatt
130462 ± 18% -31.6% 89297 ± 24% numa-meminfo.node0.AnonHugePages
173402 ± 19% -24.7% 130542 ± 15% numa-meminfo.node0.AnonPages
13748 ± 45% +91.1% 26267 ± 27% numa-meminfo.node0.Shmem
68045 ± 51% +61.9% 110144 ± 19% numa-meminfo.node1.AnonPages
5645 ±117% -88.0% 678.50 ± 55% numa-meminfo.node1.Inactive
22332 ± 33% -66.9% 7389 ± 84% numa-meminfo.node1.Shmem
3.24 ± 6% -24.3% 2.45 ± 4% sched_debug.cpu.clock.stddev
3.24 ± 6% -24.3% 2.45 ± 4% sched_debug.cpu.clock_task.stddev
19.63 +12.4% 22.06 ± 10% sched_debug.cpu.cpu_load[1].stddev
19.88 +8.7% 21.61 ± 5% sched_debug.cpu.cpu_load[2].stddev
2798 ± 33% +52.9% 4278 ± 17% sched_debug.cpu.nr_load_updates.stddev
343.42 ± 3% -10.5% 307.33 ± 4% sched_debug.cpu.nr_switches.min
43351 ± 19% -24.7% 32635 ± 15% numa-vmstat.node0.nr_anon_pages
3436 ± 45% +91.0% 6564 ± 26% numa-vmstat.node0.nr_shmem
651210 ± 5% +15.3% 750851 ± 7% numa-vmstat.node0.numa_hit
635494 ± 5% +16.6% 740858 ± 7% numa-vmstat.node0.numa_local
15715 ± 15% -36.4% 9992 ± 46% numa-vmstat.node0.numa_other
17011 ± 51% +61.9% 27534 ± 19% numa-vmstat.node1.nr_anon_pages
3002 ± 13% -15.3% 2541 ± 10% numa-vmstat.node1.nr_mapped
5575 ± 33% -66.9% 1846 ± 84% numa-vmstat.node1.nr_shmem
17950 ± 9% +40.4% 25202 ± 19% softirqs.CPU17.RCU
41369 ± 5% +69.8% 70245 ± 38% softirqs.CPU27.SCHED
26213 ± 15% -25.1% 19636 ± 8% softirqs.CPU30.RCU
147213 ± 16% -24.0% 111950 ± 25% softirqs.CPU31.TIMER
29368 ± 11% -23.7% 22409 ± 8% softirqs.CPU33.RCU
40180 ± 7% -31.5% 27524 ± 31% softirqs.CPU48.SCHED
36027 ± 18% -21.8% 28180 ± 23% softirqs.CPU59.SCHED
17332 ± 6% +38.8% 24057 ± 18% softirqs.CPU61.RCU
23112 ± 17% -19.8% 18526 ± 2% softirqs.CPU74.RCU
146744 ± 16% -24.4% 110909 ± 24% softirqs.CPU75.TIMER
25217 ± 11% -16.6% 21027 ± 7% softirqs.CPU77.RCU
4.052e+09 +7.8% 4.368e+09 perf-stat.i.branch-instructions
3.09 ± 2% -1.4 1.72 ± 3% perf-stat.i.branch-miss-rate%
1.253e+08 ± 2% -40.0% 75127073 ± 2% perf-stat.i.branch-misses
1.81 -6.9% 1.69 perf-stat.i.cpi
0.64 ± 10% +24.4% 0.79 ± 12% perf-stat.i.cpu-migrations
2399825 ± 9% -18.4% 1957160 ± 15% perf-stat.i.dTLB-load-misses
0.01 ± 15% -0.0 0.01 ± 13% perf-stat.i.dTLB-store-miss-rate%
816574 ± 16% -42.9% 466292 ± 15% perf-stat.i.dTLB-store-misses
5.561e+09 +2.6% 5.708e+09 perf-stat.i.dTLB-stores
82665670 ± 7% +48.0% 1.223e+08 ± 23% perf-stat.i.iTLB-load-misses
2.772e+10 +6.8% 2.961e+10 perf-stat.i.instructions
0.55 +7.4% 0.59 perf-stat.i.ipc
40.59 ± 20% +13.6 54.15 ± 5% perf-stat.i.node-store-miss-rate%
3.09 ± 2% -1.4 1.72 ± 3% perf-stat.overall.branch-miss-rate%
1.81 -6.9% 1.69 perf-stat.overall.cpi
0.01 ± 15% -0.0 0.01 ± 13% perf-stat.overall.dTLB-store-miss-rate%
336.83 ± 6% -24.0% 255.95 ± 23% perf-stat.overall.instructions-per-iTLB-miss
0.55 +7.4% 0.59 perf-stat.overall.ipc
41.10 ± 18% +12.7 53.75 ± 4% perf-stat.overall.node-store-miss-rate%
144710 +1.4% 146728 perf-stat.overall.path-length
4.038e+09 +7.8% 4.354e+09 perf-stat.ps.branch-instructions
1.249e+08 ± 2% -40.0% 74877676 ± 2% perf-stat.ps.branch-misses
0.64 ± 10% +24.3% 0.80 ± 12% perf-stat.ps.cpu-migrations
2391882 ± 9% -18.4% 1950697 ± 15% perf-stat.ps.dTLB-load-misses
813904 ± 16% -42.9% 464755 ± 15% perf-stat.ps.dTLB-store-misses
5.542e+09 +2.6% 5.689e+09 perf-stat.ps.dTLB-stores
82386090 ± 7% +48.0% 1.219e+08 ± 23% perf-stat.ps.iTLB-load-misses
2.763e+10 +6.8% 2.951e+10 perf-stat.ps.instructions
8.336e+12 +6.8% 8.905e+12 perf-stat.total.instructions



***************************************************************************************************
lkp-bdw-ep3d: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-8/performance/x86_64-rhel-7.6/thread/16/debian-x86_64-2018-04-03-no-ucode.cgz/lkp-bdw-ep3d/futex4/will-it-scale

commit:
e6d7bc0bdf ("x86/build: Use the single-argument OUTPUT_FORMAT() linker script command")
ce02ef06fc ("x86, retpolines: Raise limit for generating indirect calls from switch-case")

e6d7bc0bdf4155e8 ce02ef06fcf7a399a6276adb83f
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
1:4 -25% :4 dmesg.WARNING:at#for_ip_interrupt_entry/0x
%stddev %change %stddev
\ | \
2901594 +3.0% 2989036 will-it-scale.per_thread_ops
3403 -2.0% 3333 will-it-scale.time.system_time
1413 +4.9% 1482 will-it-scale.time.user_time
46425522 +3.0% 47824586 will-it-scale.workload
5680 ± 4% +11.8% 6349 ± 8% numa-meminfo.node1.KernelStack
71724 ± 5% +9568.2% 6934407 ±102% turbostat.C1E
1037 -5.4% 981.00 ± 2% vmstat.system.cs
1093 ± 8% +21.7% 1330 ± 6% slabinfo.Acpi-Parse.active_objs
1093 ± 8% +21.7% 1330 ± 6% slabinfo.Acpi-Parse.num_objs
787410 ± 9% -13.4% 682189 ± 8% numa-vmstat.node0.numa_hit
773168 ± 9% -12.7% 674989 ± 8% numa-vmstat.node0.numa_local
5679 ± 4% +11.8% 6349 ± 8% numa-vmstat.node1.nr_kernel_stack
521.67 ± 3% +43.0% 745.92 ± 30% sched_debug.cfs_rq:/.util_est_enqueued.max
69097 ± 42% +170.3% 186784 ± 31% sched_debug.cpu.avg_idle.min
2683 ± 18% +70.7% 4580 ± 17% sched_debug.cpu.nr_load_updates.stddev
10622 ± 2% +17.4% 12471 ± 9% softirqs.CPU15.RCU
25703 ± 27% -35.0% 16703 ± 5% softirqs.CPU24.RCU
36148 ± 18% -31.8% 24645 ± 39% softirqs.CPU40.RCU
41400 ± 5% -30.7% 28681 ± 12% softirqs.CPU45.SCHED
39116 ± 3% -10.1% 35171 ± 7% softirqs.CPU47.RCU
42614 -48.3% 22023 ± 48% softirqs.CPU48.SCHED
39308 ± 4% -13.1% 34165 ± 9% softirqs.CPU52.RCU
40085 ± 4% -8.6% 36638 ± 5% softirqs.CPU55.RCU
35150 ± 16% -53.0% 16504 ± 39% softirqs.CPU79.RCU
52.73 ± 15% -11.1 41.65 ± 13% perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64
52.73 ± 15% -11.1 41.65 ± 13% perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64
52.72 ± 15% -11.1 41.63 ± 13% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
1.38 ± 19% +0.4 1.77 ± 11% perf-profile.calltrace.cycles-pp._raw_spin_lock.futex_wait_setup.futex_wait.do_futex.__x64_sys_futex
1.07 ± 18% +0.5 1.54 ± 10% perf-profile.calltrace.cycles-pp.get_futex_key_refs.get_futex_key.futex_wait_setup.futex_wait.do_futex
3.06 ± 18% +1.0 4.03 ± 9% perf-profile.calltrace.cycles-pp.get_futex_value_locked.futex_wait_setup.futex_wait.do_futex.__x64_sys_futex
1.83 ± 17% +1.1 2.90 ± 11% perf-profile.calltrace.cycles-pp.get_futex_key.futex_wait_setup.futex_wait.do_futex.__x64_sys_futex
10.46 ± 19% +3.2 13.64 ± 8% perf-profile.calltrace.cycles-pp.syscall_return_via_sysret
11.64 ± 18% +3.6 15.28 ± 10% perf-profile.calltrace.cycles-pp.futex_wait_setup.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
12.36 ± 18% +3.9 16.29 ± 10% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64
13.77 ± 18% +4.1 17.91 ± 10% perf-profile.calltrace.cycles-pp.futex_wait.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
283155 -1.8% 278155 interrupts.CAL:Function_call_interrupts
686.25 ± 6% -22.3% 533.50 ± 24% interrupts.CPU26.NMI:Non-maskable_interrupts
686.25 ± 6% -22.3% 533.50 ± 24% interrupts.CPU26.PMI:Performance_monitoring_interrupts
651.25 ± 5% -27.1% 474.75 ± 33% interrupts.CPU27.NMI:Non-maskable_interrupts
651.25 ± 5% -27.1% 474.75 ± 33% interrupts.CPU27.PMI:Performance_monitoring_interrupts
30826 ±171% -100.0% 7.00 ± 67% interrupts.CPU27.RES:Rescheduling_interrupts
646.75 ± 7% -27.1% 471.50 ± 32% interrupts.CPU30.NMI:Non-maskable_interrupts
646.75 ± 7% -27.1% 471.50 ± 32% interrupts.CPU30.PMI:Performance_monitoring_interrupts
690.25 ± 9% -42.6% 396.50 ± 31% interrupts.CPU31.NMI:Non-maskable_interrupts
690.25 ± 9% -42.6% 396.50 ± 31% interrupts.CPU31.PMI:Performance_monitoring_interrupts
672.50 ± 8% -40.9% 397.75 ± 31% interrupts.CPU32.NMI:Non-maskable_interrupts
672.50 ± 8% -40.9% 397.75 ± 31% interrupts.CPU32.PMI:Performance_monitoring_interrupts
4895 ± 34% +61.0% 7883 interrupts.CPU5.NMI:Non-maskable_interrupts
4895 ± 34% +61.0% 7883 interrupts.CPU5.PMI:Performance_monitoring_interrupts
1.50 ±100% +3433.3% 53.00 ±154% interrupts.CPU67.RES:Rescheduling_interrupts
684.00 ± 6% -28.5% 488.75 ± 35% interrupts.CPU70.NMI:Non-maskable_interrupts
684.00 ± 6% -28.5% 488.75 ± 35% interrupts.CPU70.PMI:Performance_monitoring_interrupts
672.75 ± 5% -30.2% 469.50 ± 31% interrupts.CPU75.NMI:Non-maskable_interrupts
672.75 ± 5% -30.2% 469.50 ± 31% interrupts.CPU75.PMI:Performance_monitoring_interrupts
717.00 ± 8% -32.2% 486.00 ± 32% interrupts.CPU81.NMI:Non-maskable_interrupts
717.00 ± 8% -32.2% 486.00 ± 32% interrupts.CPU81.PMI:Performance_monitoring_interrupts
3.981e+09 +2.4% 4.078e+09 perf-stat.i.branch-instructions
2.71 -1.2 1.50 ± 2% perf-stat.i.branch-miss-rate%
1.076e+08 -43.4% 60927361 ± 2% perf-stat.i.branch-misses
0.26 ± 3% +0.1 0.36 ± 18% perf-stat.i.cache-miss-rate%
149100 +15.8% 172637 ± 9% perf-stat.i.cache-misses
1004 -5.5% 948.79 ± 3% perf-stat.i.context-switches
1.86 -4.3% 1.78 perf-stat.i.cpi
381642 -15.9% 321003 ± 9% perf-stat.i.cycles-between-cache-misses
2.733e+10 +2.4% 2.8e+10 perf-stat.i.instructions
442.76 ± 4% +19.7% 529.79 ± 18% perf-stat.i.instructions-per-iTLB-miss
0.54 +4.5% 0.56 perf-stat.i.ipc
61815 ± 9% +29.4% 80005 ± 9% perf-stat.i.node-load-misses
20179 ± 19% +49.9% 30255 ± 17% perf-stat.i.node-stores
2.70 -1.2 1.49 ± 2% perf-stat.overall.branch-miss-rate%
0.26 ± 3% +0.1 0.37 ± 19% perf-stat.overall.cache-miss-rate%
1.86 -4.3% 1.78 perf-stat.overall.cpi
341445 -14.6% 291501 ± 9% perf-stat.overall.cycles-between-cache-misses
438.67 ± 3% +20.4% 528.04 ± 18% perf-stat.overall.instructions-per-iTLB-miss
0.54 +4.5% 0.56 perf-stat.overall.ipc
3.968e+09 +2.4% 4.064e+09 perf-stat.ps.branch-instructions
1.073e+08 -43.4% 60725347 ± 2% perf-stat.ps.branch-misses
148665 +15.8% 172139 ± 9% perf-stat.ps.cache-misses
1000 -5.5% 945.65 ± 3% perf-stat.ps.context-switches
2.724e+10 +2.4% 2.79e+10 perf-stat.ps.instructions
61624 ± 9% +29.4% 79753 ± 9% perf-stat.ps.node-load-misses
20119 ± 19% +49.9% 30162 ± 17% perf-stat.ps.node-stores
8.225e+12 +2.4% 8.419e+12 perf-stat.total.instructions





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen


Attachments:
(No filename) (121.01 kB)
config-5.0.0-rc1-00004-gce02ef06 (195.18 kB)
job-script (7.32 kB)
job.yaml (4.91 kB)
reproduce (319.00 B)

2019-03-14 08:42:24

by Daniel Borkmann

Subject: Re: [LKP] [x86, retpolines] ce02ef06fc: will-it-scale.per_thread_ops 3.1% improvement

On 03/13/2019 06:27 AM, kernel test robot wrote:
> Greeting,
>
> FYI, we noticed a 3.1% improvement of will-it-scale.per_thread_ops due to commit:
>
>
> commit: ce02ef06fcf7a399a6276adb83f37373d10cbbe1 ("x86, retpolines: Raise limit for generating indirect calls from switch-case")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
> in testcase: will-it-scale
> on test machine: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory
> with following parameters:
>
> nr_task: 100%
> mode: thread
> test: futex3
> cpufreq_governor: performance
> ucode: 0xb00002e
>
> test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
> test-url: https://github.com/antonblanchard/will-it-scale
>
> In addition to that, the commit also has significant impact on the following tests:

Any thoughts on whether the above one-liner gcc work-around should be backported
to stable as well given these gains?

Thanks,
Daniel

> +------------------+---------------------------------------------------------------------------+
> | testcase: change | will-it-scale: will-it-scale.per_process_ops 4.3% improvement |
> | test machine | 112 threads Intel(R) Xeon(R) Platinum 8170 CPU @ 2.10GHz with 128G memory |
> | test parameters | cpufreq_governor=performance |
> | | mode=process |
> | | nr_task=50% |
> | | test=futex3 |
> +------------------+---------------------------------------------------------------------------+
> | testcase: change | will-it-scale: will-it-scale.per_process_ops 2.5% improvement |
> | test machine | 112 threads Intel(R) Xeon(R) Platinum 8170 CPU @ 2.10GHz with 128G memory |
> | test parameters | cpufreq_governor=performance |
> | | mode=process |
> | | nr_task=50% |
> | | test=futex1 |
> +------------------+---------------------------------------------------------------------------+
> | testcase: change | will-it-scale: will-it-scale.per_process_ops 5.8% improvement |
> | test machine | 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory |
> | test parameters | cpufreq_governor=performance |
> | | mode=process |
> | | nr_task=50% |
> | | test=futex3 |
> +------------------+---------------------------------------------------------------------------+
> | testcase: change | will-it-scale: will-it-scale.per_process_ops 2.6% improvement |
> | test machine | 160 threads Intel(R) Xeon(R) CPU E7-8890 v4 @ 2.20GHz with 256G memory |
> | test parameters | cpufreq_governor=performance |
> | | test=futex1 |
> | | ucode=0xb00002e |
> +------------------+---------------------------------------------------------------------------+
> | testcase: change | will-it-scale: will-it-scale.per_process_ops 2.5% improvement |
> | test machine | 112 threads Intel(R) Xeon(R) Platinum 8170 CPU @ 2.10GHz with 128G memory |
> | test parameters | cpufreq_governor=performance |
> | | mode=process |
> | | nr_task=50% |
> | | test=futex2 |
> +------------------+---------------------------------------------------------------------------+
> | testcase: change | will-it-scale: will-it-scale.per_thread_ops 3.1% improvement |
> | test machine | 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory |
> | test parameters | cpufreq_governor=performance |
> | | mode=thread |
> | | nr_task=50% |
> | | test=futex4 |
> +------------------+---------------------------------------------------------------------------+
> | testcase: change | will-it-scale: will-it-scale.per_thread_ops 5.4% improvement |
> | test machine | 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory |
> | test parameters | cpufreq_governor=performance |
> | | mode=thread |
> | | nr_task=16 |
> | | test=futex3 |
> +------------------+---------------------------------------------------------------------------+
> | testcase: change | will-it-scale: will-it-scale.per_thread_ops 3.0% improvement |
> | test machine | 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory |
> | test parameters | cpufreq_governor=performance |
> | | mode=thread |
> | | nr_task=16 |
> | | test=futex4 |
> +------------------+---------------------------------------------------------------------------+

2019-04-29 12:54:36

by Greg KH

Subject: Re: [LKP] [x86, retpolines] ce02ef06fc: will-it-scale.per_thread_ops 3.1% improvement

On Thu, Mar 14, 2019 at 09:39:35AM +0100, Daniel Borkmann wrote:
> On 03/13/2019 06:27 AM, kernel test robot wrote:
> > Greeting,
> >
> > FYI, we noticed a 3.1% improvement of will-it-scale.per_thread_ops due to commit:
> >
> >
> > commit: ce02ef06fcf7a399a6276adb83f37373d10cbbe1 ("x86, retpolines: Raise limit for generating indirect calls from switch-case")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> >
> > in testcase: will-it-scale
> > on test machine: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory
> > with following parameters:
> >
> > nr_task: 100%
> > mode: thread
> > test: futex3
> > cpufreq_governor: performance
> > ucode: 0xb00002e
> >
> > test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
> > test-url: https://github.com/antonblanchard/will-it-scale
> >
> > In addition to that, the commit also has significant impact on the following tests:
>
> Any thoughts on whether the above one-liner gcc work-around should be backported
> to stable as well given these gains?

I have now done so, thanks.

greg k-h