2018-03-30 01:27:36

by kernel test robot

Subject: [lkp-robot] [sched] f4b457d904: vm-scalability.throughput -8.2% regression


Greetings,

FYI, we noticed a -8.2% regression of vm-scalability.throughput due to commit:


commit: f4b457d90488952f0f685941a419e3f22ed56df4 ("sched: idle: Select idle state before stopping the tick")
https://git.kernel.org/cgit/linux/kernel/git/rafael/linux-pm.git idle-tick

in testcase: vm-scalability
on test machine: 56 threads Intel(R) Xeon(R) CPU E5-2695 v3 @ 2.30GHz with 256G memory
with the following parameters:

runtime: 300s
test: small-allocs-mt
cpufreq_governor: performance

test-description: This suite exercises the functions and regions of the Linux kernel's mm/ subsystem that are of interest to us.
test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/tbox_group/test/testcase:
gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2016-08-31.cgz/300s/lkp-hsw-ep5/small-allocs-mt/vm-scalability

commit:
616cc2d73a ("cpuidle: Return nohz hint from cpuidle_select()")
f4b457d904 ("sched: idle: Select idle state before stopping the tick")

616cc2d73a1f78f8 f4b457d90488952f0f685941a4
---------------- --------------------------
         %stddev  %change            %stddev
             \        |                  \
      9678          -8.2%       8888        vm-scalability.median
    542003          -8.2%     497740        vm-scalability.throughput
      8924         -14.4%       7642        vm-scalability.time.involuntary_context_switches
  36200385          -8.2%   33245872        vm-scalability.time.minor_page_faults
    431.00          +5.7%     455.75        vm-scalability.time.percent_of_cpu_this_job_got
    871.96          +9.4%     954.17        vm-scalability.time.system_time
    440.87          -2.0%     432.12        vm-scalability.time.user_time
  38355611          -8.2%   35207200        vm-scalability.time.voluntary_context_switches
 1.629e+08          -8.2%  1.496e+08        vm-scalability.workload
      1824 ± 10%    +7.9%       1969 ± 9%   proc-vmstat.pgactivate
    251096          -8.0%     230901        vmstat.system.cs
      0.00 ± 19%     +0.0       0.01 ± 7%   mpstat.cpu.soft%
      2.10           -0.3       1.84        mpstat.cpu.usr%
    536133 ± 2%    +18.1%     633055 ± 2%   softirqs.RCU
    528501         +37.0%     724182        softirqs.SCHED
   1080212         +66.7%    1800314        softirqs.TIMER
  16475040         -70.0%    4936102        cpuidle.C1.time
   1107989         -89.4%     116924 ± 3%   cpuidle.C1.usage
 9.244e+08 ± 3%    -91.2%   81145933 ± 9%   cpuidle.C1E.time
   8079537 ± 2%    -94.9%     414952 ± 6%   cpuidle.C1E.usage
 8.713e+09         -90.1%  8.668e+08        cpuidle.C3.time
  25043408         -88.1%    2974532        cpuidle.C3.usage
 5.406e+09 ± 2%   +159.2%  1.401e+10        cpuidle.C6.time
  18924649        +144.9%   46347159        cpuidle.C6.usage
  62626212 ± 5%    -96.2%    2350100 ± 15%  cpuidle.POLL.time
    136735         -98.5%       2067 ± 5%   cpuidle.POLL.usage
    159.25          +3.8%     165.25        turbostat.Avg_MHz
   1106579         -89.5%     116023 ± 4%   turbostat.C1
      0.10 ± 5%      -0.1       0.03        turbostat.C1%
   8078055 ± 2%    -94.9%     413304 ± 6%   turbostat.C1E
      5.40 ± 3%      -4.9       0.47 ± 9%   turbostat.C1E%
  25043071         -88.1%    2973755        turbostat.C3
     50.91          -45.8       5.07        turbostat.C3%
  18923465        +144.9%   46346009        turbostat.C6
     31.58 ± 2%     +50.4      81.99        turbostat.C6%
     16.76         -84.4%       2.61        turbostat.CPU%c3
      7.45 ± 5%   +206.3%      22.84        turbostat.CPU%c6
     68.41          -2.9%      66.40        turbostat.PkgWatt
    263.87 ± 11%   +58.4%     417.90 ± 7%   sched_debug.cfs_rq:/.exec_clock.stddev
     79744 ± 2%    +11.5%      88901        sched_debug.cfs_rq:/.min_vruntime.avg
     86046 ± 2%    +11.0%      95473        sched_debug.cfs_rq:/.min_vruntime.max
      2472 ± 9%    +29.4%       3198 ± 10%  sched_debug.cfs_rq:/.min_vruntime.stddev
     -2401        -382.2%       6777 ± 50%  sched_debug.cfs_rq:/.spread0.avg
      3899 ± 47%  +242.7%      13362 ± 29%  sched_debug.cfs_rq:/.spread0.max
      2473 ± 9%    +29.4%       3201 ± 10%  sched_debug.cfs_rq:/.spread0.stddev
      1048 ± 14%   +56.8%       1645 ± 16%  sched_debug.cpu.nr_load_updates.stddev
      8420 ± 11%   +49.9%      12620 ± 13%  sched_debug.cpu.nr_switches.stddev
    -12.79         +31.3%     -16.79        sched_debug.cpu.nr_uninterruptible.min
      6.83 ± 11%   +24.8%       8.52 ± 4%   sched_debug.cpu.nr_uninterruptible.stddev
      8033 ± 11%   +47.8%      11870 ± 14%  sched_debug.cpu.sched_count.stddev
      4015 ± 11%   +47.4%       5920 ± 14%  sched_debug.cpu.sched_goidle.stddev
    354735         +11.4%     395332        sched_debug.cpu.ttwu_count.max
    329194         -10.7%     294033        sched_debug.cpu.ttwu_count.min
      5653 ± 15%  +216.7%      17905 ± 5%   sched_debug.cpu.ttwu_count.stddev
 3.254e+11          -8.6%  2.974e+11        perf-stat.branch-instructions
      1.96           +1.6       3.52        perf-stat.branch-miss-rate%
 6.392e+09         +63.6%  1.046e+10        perf-stat.branch-misses
      4.73           -0.8       3.92        perf-stat.cache-miss-rate%
 1.687e+09          -9.3%   1.53e+09        perf-stat.cache-misses
 3.564e+10          +9.6%  3.905e+10        perf-stat.cache-references
  77069002          -8.2%   70758635        perf-stat.context-switches
      2.33         +15.7%       2.70        perf-stat.cpi
 2.818e+12          +5.8%   2.98e+12        perf-stat.cpu-cycles
    275787 ± 2%     +6.8%     294526        perf-stat.cpu-migrations
      0.96           +0.1       1.04 ± 3%   perf-stat.dTLB-load-miss-rate%
 3.217e+11          -6.1%   3.02e+11        perf-stat.dTLB-loads
 2.957e+08 ± 2%     +4.0%  3.077e+08        perf-stat.dTLB-store-misses
 1.792e+11          +6.1%  1.901e+11        perf-stat.dTLB-stores
     57.82           +5.1      62.95        perf-stat.iTLB-load-miss-rate%
 6.899e+08         +10.1%  7.595e+08        perf-stat.iTLB-load-misses
 5.034e+08 ± 2%    -11.2%  4.471e+08        perf-stat.iTLB-loads
  1.21e+12          -8.6%  1.105e+12        perf-stat.instructions
      1753         -17.0%       1455        perf-stat.instructions-per-iTLB-miss
      0.43         -13.6%       0.37        perf-stat.ipc
  36958457          -8.0%   34003884        perf-stat.minor-faults
 1.057e+09          -9.9%  9.532e+08        perf-stat.node-load-misses
 3.475e+08 ± 2%    -10.5%   3.11e+08        perf-stat.node-store-misses
 2.242e+08         -10.6%  2.005e+08        perf-stat.node-stores
  36958459          -8.0%   34003902        perf-stat.page-faults
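
As a rough cross-check of the comparison table, the Python sketch below recomputes the %change column for a few rows and derives perf-stat.cpi and perf-stat.ipc from the cycle and instruction counts. The helper names (pct_change, point_change) are hypothetical and not part of lkp-tests. The reading that rows whose metric name ends in "%" report a plain point difference rather than a relative change is an assumption inferred from the numbers, not something the report states.

```python
# Hypothetical helpers for cross-checking the comparison table; not part of lkp-tests.

def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


def point_change(base: float, new: float) -> float:
    """Absolute difference, assumed here for metrics that are already percentages."""
    return new - base


# (base, new) pairs copied from the table: 616cc2d73a versus f4b457d904.
relative_rows = {
    "vm-scalability.throughput": (542003, 497740),
    "vm-scalability.time.system_time": (871.96, 954.17),
    "softirqs.TIMER": (1080212, 1800314),
    "cpuidle.C6.usage": (18924649, 46347159),
}
point_rows = {
    "turbostat.C3%": (50.91, 5.07),
    "turbostat.C6%": (31.58, 81.99),
}

# perf-stat.cpu-cycles and perf-stat.instructions, as rounded in the table.
cycles = {"base": 2.818e12, "new": 2.98e12}
instructions = {"base": 1.21e12, "new": 1.105e12}

if __name__ == "__main__":
    for name, (base, new) in relative_rows.items():
        print(f"{name}: {pct_change(base, new):+.1f}%")
    for name, (base, new) in point_rows.items():
        print(f"{name}: {point_change(base, new):+.1f} points")
    for side in ("base", "new"):
        cpi = cycles[side] / instructions[side]
        print(f"cpi ({side}): {cpi:.2f}, ipc ({side}): {1 / cpi:.2f}")
```

For the rows listed, the printed values should agree with the table to the displayed precision. Because the inputs are the rounded figures from the table, small discrepancies in the last digit are possible for other rows.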


[ASCII plot: vm-scalability.throughput, y-axis from 0 to 600000]

[ASCII plot: vm-scalability.median, y-axis from 0 to 10000]

[ASCII plot: vm-scalability.workload, y-axis from 0 to 1.8e+08]





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Xiaolong


Attachments:
(No filename) (12.51 kB)
config-4.16.0-rc5-00005-gf4b457d (168.66 kB)
job-script (7.12 kB)
job.yaml (4.75 kB)
reproduce (817.00 B)