Greetings,
FYI, we noticed a -11.9% regression of aim9.exec_test.ops_per_sec due to commit:
commit: a97056a6fab541e1661fed9ced0f793bda34b717 ("cpuidle: poll_state: Add time limit to poll_idle()")
https://git.kernel.org/cgit/linux/kernel/git/rafael/linux-pm.git poll-idle
in testcase: aim9
on test machine: 72 threads Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz with 128G memory
with following parameters:
testtime: 5s
test: all
cpufreq_governor: performance
test-description: Suite IX is the "AIM Independent Resource Benchmark": the famous synthetic benchmark.
test-url: https://sourceforge.net/projects/aimbench/files/aim-suite9/
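For context on the mechanism: the commit subject says poll_idle() gains a time limit, i.e. instead of spinning on need_resched() until the next wakeup, the polling loop also bails out once it has run for longer than a fixed bound, so the governor can pick a deeper C-state on the next idle entry. A minimal sketch of that idea (not the actual patch; the POLL_IDLE_TIME_LIMIT name and its value are assumptions here):

/*
 * Illustrative sketch only, based on the commit subject.
 * POLL_IDLE_TIME_LIMIT and its value are assumed, not taken from the patch.
 */
#include <linux/cpuidle.h>
#include <linux/jiffies.h>
#include <linux/sched.h>
#include <linux/sched/clock.h>
#include <linux/sched/idle.h>

#define POLL_IDLE_TIME_LIMIT    (TICK_NSEC / 16)    /* assumed bound */

static int __cpuidle poll_idle(struct cpuidle_device *dev,
                               struct cpuidle_driver *drv, int index)
{
        u64 time_start = local_clock();

        local_irq_enable();
        if (!current_set_polling_and_test()) {
                while (!need_resched()) {
                        cpu_relax();
                        /* The new part: give up polling after the bound. */
                        if (local_clock() - time_start > POLL_IDLE_TIME_LIMIT)
                                break;
                }
        }
        current_clr_polling();

        return index;
}

Bounding the poll is consistent with the cpuidle and turbostat rows below: POLL residency collapses while C3 and Pkg%pc2 residency grow.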
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:
gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2016-08-31.cgz/lkp-hsw-ep4/all/aim9/5s
commit:
v4.16-rc5
a97056a6fa ("cpuidle: poll_state: Add time limit to poll_idle()")
       v4.16-rc5              a97056a6fab541e1661fed9ced
    ----------------          --------------------------
         %stddev     %change          %stddev
             \          |                \
      2010           -11.9%       1770        aim9.exec_test.ops_per_sec
    619320 ±  2%      +2.6%     635500        aim9.creat-clo.ops_per_sec
   1335376            +1.3%    1353216        aim9.disk_wrt.ops_per_sec
      4976            -6.9%       4633        aim9.fork_test.ops_per_sec
 4.555e+08            +1.4%   4.62e+08        aim9.fun_cal1.ops_per_sec
 2.498e+08 ±  5%      +6.8%  2.668e+08        aim9.fun_cal15.ops_per_sec
  48323640 ±  2%      +2.0%   49295400        aim9.jmp_test.ops_per_sec
    413494            +2.7%     424689        aim9.link_test.ops_per_sec
    509182            +1.2%     515440        aim9.page_test.ops_per_sec
    396.68           -12.2%     348.15 ±  9%  aim9.shell_rtns_1.ops_per_sec
    701520 ±  2%      +2.6%     720100        aim9.signal_test.ops_per_sec
    619952            +1.6%     630083        aim9.sync_disk_rw.ops_per_sec
      9496 ±  9%     -29.1%       6730 ± 13%  aim9.time.involuntary_context_switches
   6666695            -5.1%    6328538 ±  3%  aim9.time.minor_page_faults
    117287            -8.4%     107404 ±  3%  aim9.time.voluntary_context_switches
      5573 ± 62%    +108.4%      11614 ± 34%  numa-numastat.node0.other_node
      2172            -5.4%       2054        vmstat.system.cs
  49345778 ±  4%    +307.8%  2.012e+08 ± 74%  cpuidle.C3.time
 4.561e+08 ±  9%    -100.0%     123156 ± 62%  cpuidle.POLL.time
    106130 ±183%     -97.4%       2761 ±124%  numa-meminfo.node1.Inactive
    106067 ±183%     -97.5%       2612 ±131%  numa-meminfo.node1.Inactive(anon)
    132349 ±175%     -95.0%       6681 ± 44%  numa-meminfo.node1.Shmem
      2573 ± 17%     +25.6%       3232 ± 13%  numa-vmstat.node0.nr_mapped
      5961 ± 52%     +98.4%      11828 ± 33%  numa-vmstat.node0.numa_other
     26515 ±183%     -97.5%     651.75 ±132%  numa-vmstat.node1.nr_inactive_anon
     33087 ±175%     -95.0%       1665 ± 44%  numa-vmstat.node1.nr_shmem
     26515 ±183%     -97.5%     651.75 ±132%  numa-vmstat.node1.nr_zone_inactive_anon
      1349 ± 62%    +119.0%       2955 ± 11%  slabinfo.dmaengine-unmap-16.active_objs
      1368 ± 62%    +116.1%       2955 ± 11%  slabinfo.dmaengine-unmap-16.num_objs
    731.60 ±  8%     +21.1%     886.25 ±  4%  slabinfo.ip6_dst_cache.active_objs
    731.60 ±  8%     +21.1%     886.25 ±  4%  slabinfo.ip6_dst_cache.num_objs
      3184 ±  6%     -13.4%       2758 ±  3%  slabinfo.mm_struct.active_objs
      3184 ±  6%     -13.0%       2771 ±  3%  slabinfo.mm_struct.num_objs
    166.80 ±  3%     -36.2%     106.50        turbostat.Avg_MHz
      5.94 ±  2%      -1.1        4.84 ±  5%  turbostat.Busy%
      2812           -21.2%       2216 ±  4%  turbostat.Bzy_MHz
     27.11           -10.2%      24.34        turbostat.CPU%c1
     19.80 ±  9%     +81.7%      35.98        turbostat.Pkg%pc2
      0.04 ± 90%    +250.0%       0.14 ± 29%  turbostat.Pkg%pc6
    117.44 ±  3%     -15.9%      98.79        turbostat.PkgWatt
      8.64 ±  5%     -11.1%       7.68 ±  3%  turbostat.RAMWatt
 2.794e+11 ± 10%     -23.1%   2.15e+11 ±  4%  perf-stat.branch-instructions
      1.46 ±  5%      +0.4        1.87 ±  6%  perf-stat.branch-miss-rate%
      0.93 ±  5%      -0.1        0.84 ±  2%  perf-stat.cache-miss-rate%
  90185384 ±  2%      -6.9%   83996045 ±  2%  perf-stat.cache-misses
    653470            -5.7%     616530        perf-stat.context-switches
      2.28 ± 10%     -18.9%       1.85 ±  4%  perf-stat.cpi
 3.505e+12 ±  4%     -35.2%  2.273e+12 ±  4%  perf-stat.cpu-cycles
     31896            -6.3%      29892 ±  2%  perf-stat.cpu-migrations
      0.12 ± 18%      +0.0        0.17 ± 17%  perf-stat.dTLB-load-miss-rate%
 4.953e+11 ± 12%     -29.4%  3.497e+11 ± 10%  perf-stat.dTLB-loads
      0.44 ±  9%     +22.3%       0.54 ±  4%  perf-stat.ipc
   7355724            -4.6%    7019933 ±  3%  perf-stat.minor-faults
   7355724            -4.6%    7019932 ±  3%  perf-stat.page-faults
      2713 ±  6%     +20.4%       3268 ±  5%  sched_debug.cfs_rq:/.exec_clock.avg
     73594 ±  6%     +26.1%      92766 ± 10%  sched_debug.cfs_rq:/.exec_clock.max
     12795 ±  5%     +21.7%      15577 ±  6%  sched_debug.cfs_rq:/.exec_clock.stddev
     18587 ± 10%     +16.2%      21602 ±  7%  sched_debug.cfs_rq:/.min_vruntime.avg
    177486 ±  7%     +21.7%     215979 ±  2%  sched_debug.cfs_rq:/.min_vruntime.max
     28721 ±  5%     +18.6%      34065 ±  4%  sched_debug.cfs_rq:/.min_vruntime.stddev
      1.97 ± 33%     -57.6%       0.83        sched_debug.cfs_rq:/.nr_spread_over.max
      0.25 ± 62%    -100.0%       0.00        sched_debug.cfs_rq:/.nr_spread_over.stddev
      4.66 ± 35%     -65.4%       1.61 ± 87%  sched_debug.cfs_rq:/.removed.util_avg.avg
     25.10 ± 25%     -57.0%      10.79 ± 83%  sched_debug.cfs_rq:/.removed.util_avg.stddev
    166666 ±  8%     +22.4%     203936 ±  4%  sched_debug.cfs_rq:/.spread0.max
     28721 ±  5%     +18.6%      34066 ±  4%  sched_debug.cfs_rq:/.spread0.stddev
    649.62 ±  6%      -9.4%     588.34 ±  4%  sched_debug.cpu.curr->pid.avg
      4637 ±  3%      -9.0%       4220 ±  2%  sched_debug.cpu.curr->pid.stddev
      1954 ±  8%     +13.5%       2218 ± 10%  sched_debug.cpu.sched_goidle.stddev
    518.80 ± 24%     -35.4%     335.12 ± 36%  sched_debug.cpu.ttwu_count.min
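For what it's worth, the cpuidle.POLL.time and cpuidle.C3.time rows above are the kernel's per-state idle residency counters, which are also exported through sysfs, so the POLL -> C3 shift can be checked directly on the test box. A small stand-alone reader for cpu0 (the sysfs paths are the standard cpuidle interface; the program itself is just an illustration):

#include <stdio.h>

int main(void)
{
        char path[128], name[32];
        unsigned long long time_us;

        /* Walk state0, state1, ... until a state directory is missing. */
        for (int state = 0; ; state++) {
                FILE *f;

                snprintf(path, sizeof(path),
                         "/sys/devices/system/cpu/cpu0/cpuidle/state%d/name",
                         state);
                f = fopen(path, "r");
                if (!f)
                        break;
                if (fscanf(f, "%31s", name) != 1)
                        name[0] = '\0';
                fclose(f);

                snprintf(path, sizeof(path),
                         "/sys/devices/system/cpu/cpu0/cpuidle/state%d/time",
                         state);
                f = fopen(path, "r");
                if (!f)
                        break;
                /* 'time' is cumulative residency in microseconds. */
                if (fscanf(f, "%llu", &time_us) == 1)
                        printf("state%d (%s): %llu us\n", state, name, time_us);
                fclose(f);
        }
        return 0;
}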
aim9.exec_test.ops_per_sec
2500 +-+------------------------------------------------------------------+
| |
| |
2000 +-++..+..+...+..+..+..+..+..+..+...+..+..+..+..+..+..+..+...+..+..+..|
O O O O O O O O O O O O O O O O O O O O O O
| |
1500 +-+ |
| |
1000 +-+ |
| |
| |
500 +-+ |
| |
| |
0 +-+----------O-------------------------------------------------------+
[*] bisect-good sample
[O] bisect-bad sample
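exec_test itself is AIM9's program-load test, i.e. fork+exec throughput. A rough sketch of such a loop (an illustration, not the actual AIM9 source) shows why it is sensitive to the idle path: the parent blocks in waitpid() on every iteration, so each operation pays the wakeup latency of whatever idle state its CPU entered in the meantime:

#include <stdio.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
        struct timespec start, now;
        long ops = 0;
        double secs;

        clock_gettime(CLOCK_MONOTONIC, &start);
        do {
                pid_t pid = fork();

                if (pid == 0) {
                        /* Child: load a trivial program and exit. */
                        execl("/bin/true", "true", (char *)NULL);
                        _exit(127);     /* exec failed */
                }
                if (pid > 0)
                        waitpid(pid, NULL, 0);  /* parent sleeps here */
                ops++;
                clock_gettime(CLOCK_MONOTONIC, &now);
                secs = (now.tv_sec - start.tv_sec) +
                       (now.tv_nsec - start.tv_nsec) / 1e9;
        } while (secs < 5.0);   /* testtime: 5s, matching the job above */

        printf("%.2f ops_per_sec\n", ops / secs);
        return 0;
}

With the poll limit in place those brief waits can fall through to deeper C-states, which would fit the lower Bzy_MHz and the drops in exec_test, fork_test and shell_rtns_1 above.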
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Xiaolong