2018-04-18 01:12:00

by kernel test robot

Subject: [lkp-robot] [mm/cma] a57a290bd3: vm-scalability.throughput -15.5% regression


Greetings,

FYI, we noticed a -15.5% regression of vm-scalability.throughput due to commit:


commit: a57a290bd38f64bde9b8f797600aee3925109061 ("mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

in testcase: vm-scalability
on test machine: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 128G memory
with the following parameters:

runtime: 300s
test: lru-file-readonce
cpufreq_governor: performance

test-description: The motivation behind this suite is to exercise functions and regions of the mm/ of the Linux kernel which are of interest to us.
test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/tbox_group/test/testcase:
gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2016-08-31.cgz/300s/lkp-bdw-ep2/lru-file-readonce/vm-scalability

commit:
d92b1ec27c ("mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request")
a57a290bd3 ("mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE")

d92b1ec27cae2c99 a57a290bd38f64bde9b8f79760
---------------- --------------------------
%stddev %change %stddev
\ | \
308089 -15.7% 259785 vm-scalability.median
0.70 ± 56% +696.1% 5.56 ± 4% vm-scalability.stddev
27142495 -15.5% 22927797 vm-scalability.throughput
186.93 +23.9% 231.56 vm-scalability.time.elapsed_time
186.93 +23.9% 231.56 vm-scalability.time.elapsed_time.max
298206 +5.4% 314171 vm-scalability.time.involuntary_context_switches
4022 ± 3% -6.1% 3778 vm-scalability.time.maximum_resident_set_size
7329 -4.5% 7003 vm-scalability.time.percent_of_cpu_this_job_got
13376 +18.9% 15903 vm-scalability.time.system_time
324.95 -3.2% 314.71 vm-scalability.time.user_time
1345 +1.3% 1363 vm-scalability.time.voluntary_context_switches
12905097 -21.5% 10131265 ± 3% vmstat.memory.free
16.12 ± 3% +2.7 18.81 mpstat.cpu.idle%
1.98 -0.4 1.56 mpstat.cpu.usr%
7092821 ± 2% -22.0% 5529796 ± 12% numa-meminfo.node0.MemFree
6466966 ± 2% -21.4% 5082116 ± 11% numa-meminfo.node1.MemFree
467.06 -3.7% 449.96 pmeter.Average_Active_Power
58116 -12.3% 50956 pmeter.performance_per_watt
9546 ± 8% +50.7% 14381 ± 27% softirqs.NET_RX
157856 ± 2% +31.5% 207560 ± 3% softirqs.SCHED
5556773 +24.1% 6893563 softirqs.TIMER
118309 ± 3% +19.4% 141262 meminfo.Active
117637 ± 3% +19.6% 140648 meminfo.Active(anon)
31240 ± 2% -71.6% 8879 ± 25% meminfo.CmaFree
13541103 -22.3% 10519830 ± 3% meminfo.MemFree
97674 ± 4% +25.1% 122193 meminfo.Shmem
19884 ± 14% +39.9% 27813 ± 12% cpuidle.C1.usage
4626065 ± 4% +55.5% 7193168 ± 6% cpuidle.C3.time
16850 ± 6% +45.7% 24551 ± 3% cpuidle.C3.usage
2.556e+09 ± 4% +46.8% 3.752e+09 cpuidle.C6.time
2612869 ± 4% +47.0% 3841894 cpuidle.C6.usage
286.50 ± 18% +53.9% 441.00 ± 8% cpuidle.POLL.usage
12919583 ± 21% +584.0% 88376315 numa-numastat.node0.numa_foreign
13098152 ± 21% +80.1% 23585358 ± 15% numa-numastat.node0.numa_miss
13110934 ± 21% +80.0% 23593876 ± 15% numa-numastat.node0.other_node
5.278e+08 -17.0% 4.381e+08 numa-numastat.node1.local_node
13098152 ± 21% +80.1% 23585358 ± 15% numa-numastat.node1.numa_foreign
5.278e+08 -17.0% 4.381e+08 numa-numastat.node1.numa_hit
12919583 ± 21% +584.0% 88376315 numa-numastat.node1.numa_miss
12923938 ± 21% +583.9% 88384876 numa-numastat.node1.other_node
1779873 ± 4% -21.4% 1398188 ± 11% numa-vmstat.node0.nr_free_pages
6386636 ± 12% +691.1% 50521587 numa-vmstat.node0.numa_foreign
7869 ± 2% -71.2% 2268 ± 25% numa-vmstat.node1.nr_free_cma
1613920 -20.6% 1281336 ± 11% numa-vmstat.node1.nr_free_pages
305.75 ± 7% -15.3% 259.00 ± 3% numa-vmstat.node1.nr_isolated_file
3.11e+08 -17.7% 2.56e+08 numa-vmstat.node1.numa_hit
3.108e+08 -17.7% 2.559e+08 numa-vmstat.node1.numa_local
6388941 ± 12% +691.0% 50534071 numa-vmstat.node1.numa_miss
6564139 ± 12% +672.6% 50713295 numa-vmstat.node1.numa_other
2361 -3.4% 2281 turbostat.Avg_MHz
18023 ± 13% +43.9% 25927 ± 14% turbostat.C1
16372 ± 6% +45.6% 23844 ± 5% turbostat.C3
2610248 ± 4% +47.1% 3838846 turbostat.C6
15.38 ± 2% +2.9 18.30 turbostat.C6%
4.11 ± 6% +79.8% 7.38 turbostat.CPU%c1
18087951 ± 2% +22.2% 22101137 turbostat.IRQ
5.18 ± 2% -19.9% 4.15 turbostat.Pkg%pc2
235.95 -3.4% 227.98 turbostat.PkgWatt
24.09 -1.4% 23.76 turbostat.RAMWatt
2839703 ± 7% -31.7% 1939146 ± 13% sched_debug.cfs_rq:/.min_vruntime.min
397053 ± 4% +25.8% 499507 ± 3% sched_debug.cfs_rq:/.min_vruntime.stddev
6.28 ± 2% -16.4% 5.25 ± 7% sched_debug.cfs_rq:/.nr_spread_over.avg
192.50 ± 15% -26.1% 142.25 ± 19% sched_debug.cfs_rq:/.nr_spread_over.max
27.64 ± 13% -24.3% 20.92 ± 15% sched_debug.cfs_rq:/.nr_spread_over.stddev
-37835 +121.3% -83735 sched_debug.cfs_rq:/.spread0.avg
-2599772 +34.8% -3503649 sched_debug.cfs_rq:/.spread0.min
396913 ± 4% +25.8% 499460 ± 3% sched_debug.cfs_rq:/.spread0.stddev
21.88 ± 27% +114.6% 46.94 ± 6% sched_debug.cfs_rq:/.util_est_enqueued.avg
647.08 ± 19% +25.4% 811.67 ± 5% sched_debug.cfs_rq:/.util_est_enqueued.max
103.42 ± 23% +52.1% 157.30 sched_debug.cfs_rq:/.util_est_enqueued.stddev
126173 ± 4% -15.6% 106542 ± 10% sched_debug.cpu.nr_switches.max
1975 ± 3% -23.1% 1519 sched_debug.cpu.nr_switches.min
122036 ± 4% -18.0% 100120 ± 10% sched_debug.cpu.sched_count.max
1720 ± 2% -29.7% 1210 sched_debug.cpu.sched_count.min
60707 ± 5% -17.4% 50149 ± 10% sched_debug.cpu.ttwu_count.max
830.08 ± 3% -41.9% 482.17 ± 2% sched_debug.cpu.ttwu_count.min
60276 ± 5% -17.7% 49636 ± 11% sched_debug.cpu.ttwu_local.max
785.33 ± 3% -42.8% 449.00 ± 2% sched_debug.cpu.ttwu_local.min
2.823e+12 +15.8% 3.268e+12 perf-stat.branch-instructions
0.64 -0.1 0.56 perf-stat.branch-miss-rate%
7.68 +0.5 8.17 perf-stat.cache-miss-rate%
1.475e+10 +6.7% 1.573e+10 perf-stat.cache-misses
1689793 ± 3% +21.3% 2049799 ± 3% perf-stat.context-switches
2.97 +5.0% 3.12 perf-stat.cpi
3.878e+13 +19.4% 4.631e+13 perf-stat.cpu-cycles
11366 +25.0% 14205 perf-stat.cpu-migrations
3.505e+12 +13.0% 3.959e+12 perf-stat.dTLB-loads
1.306e+13 +13.7% 1.485e+13 perf-stat.instructions
4324 ± 6% +17.0% 5062 ± 6% perf-stat.instructions-per-iTLB-miss
0.34 -4.8% 0.32 perf-stat.ipc
529440 +18.3% 626390 perf-stat.minor-faults
46.54 ± 3% +15.3 61.85 perf-stat.node-load-miss-rate%
8.175e+08 ± 6% +68.9% 1.381e+09 perf-stat.node-load-misses
9.372e+08 -9.1% 8.517e+08 perf-stat.node-loads
13.14 ± 4% +10.2 23.31 perf-stat.node-store-miss-rate%
8.107e+08 ± 5% +84.7% 1.497e+09 perf-stat.node-store-misses
5.359e+09 -8.1% 4.925e+09 perf-stat.node-stores
529444 +18.3% 626391 perf-stat.page-faults
3040 +13.7% 3458 perf-stat.path-length
126389 -13.8% 108944 proc-vmstat.allocstall_movable
1015 ± 4% +29.0% 1309 ± 5% proc-vmstat.allocstall_normal
1672 ±106% +360.9% 7708 ± 49% proc-vmstat.compact_migrate_scanned
2597 ± 4% -31.8% 1771 ± 6% proc-vmstat.kswapd_low_wmark_hit_quickly
29424 ± 3% +19.9% 35279 proc-vmstat.nr_active_anon
7830 ± 3% -71.7% 2216 ± 26% proc-vmstat.nr_free_cma
3340670 ± 2% -21.6% 2618440 proc-vmstat.nr_free_pages
610.75 ± 2% -10.1% 549.00 proc-vmstat.nr_isolated_file
24391 ± 4% +25.6% 30642 proc-vmstat.nr_shmem
29425 ± 3% +19.9% 35285 proc-vmstat.nr_zone_active_anon
26017735 ± 21% +330.3% 1.12e+08 ± 2% proc-vmstat.numa_foreign
1247 ± 13% +26.3% 1575 ± 4% proc-vmstat.numa_hint_faults
26017735 ± 21% +330.3% 1.12e+08 ± 2% proc-vmstat.numa_miss
26034889 ± 21% +330.1% 1.12e+08 ± 2% proc-vmstat.numa_other
2602 ± 4% -31.5% 1782 ± 7% proc-vmstat.pageoutrun
544184 +18.3% 643659 proc-vmstat.pgfault
9.827e+08 -13.3% 8.519e+08 proc-vmstat.pgscan_direct
59757667 ± 12% +218.9% 1.906e+08 ± 2% proc-vmstat.pgscan_kswapd
9.827e+08 -13.3% 8.518e+08 proc-vmstat.pgsteal_direct
59757604 ± 12% +218.9% 1.906e+08 ± 2% proc-vmstat.pgsteal_kswapd
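
For readers checking the table by hand, the following is a minimal sketch of how the %change column appears to be derived from the two value columns. It assumes (this is not stated in the report) that entries with a "%" sign are relative changes against the parent commit, and that entries without one, such as mpstat.cpu.idle%, are absolute differences in percentage points. The helper names pct_change and point_change are hypothetical, and because the report's values are averages over several runs, recomputed figures may differ slightly in the last digit.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent (assumed formula)."""
    return (new - base) / base * 100.0


def point_change(base: float, new: float) -> float:
    """Absolute difference, used here for metrics that are already percentages."""
    return new - base


# (parent commit d92b1ec27c, tested commit a57a290bd3), values copied from the table
relative_rows = {
    "vm-scalability.throughput": (27142495, 22927797),
    "vm-scalability.median": (308089, 259785),
    "vm-scalability.time.elapsed_time": (186.93, 231.56),
}

for name, (base, new) in relative_rows.items():
    print(f"{pct_change(base, new):+.1f}%  {name}")

# Metrics whose names end in '%' are compared in percentage points.
print(f"{point_change(16.12, 18.81):+.1f}  mpstat.cpu.idle%")
```

Under these assumptions the recomputed throughput, median, and elapsed-time changes match the -15.5%, -15.7%, and +23.9% entries above.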



[ASCII plot: vm-scalability.throughput per sample. Samples marked "O" lie between roughly 2.2e+07 and 2.3e+07; the remaining samples lie between roughly 2.6e+07 and 2.8e+07.]

[ASCII plot: vm-scalability.median per sample. Samples marked "O" lie between roughly 250000 and 270000; the remaining samples lie between roughly 300000 and 320000.]




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Xiaolong


Attachments:
(No filename) (13.83 kB)
config-4.16.0-11359-ga57a290 (166.74 kB)
job-script (7.42 kB)
job.yaml (5.05 kB)
reproduce (8.72 kB)