2021-03-18 15:09:39

by kernel test robot

Subject: [hugetlb] 4eae4efa2c: vm-scalability.throughput 1.1% improvement



Greetings,

FYI, we noticed a 1.1% improvement of vm-scalability.throughput due to commit:


commit: 4eae4efa2c299f85b7ebfbeeda56c19c5eba2768 ("hugetlb: do early cow when page pinned on src mm")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
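The commit under test changes what fork() does with a hugetlb page that may be DMA-pinned in the source mm: instead of sharing it write-protected and copying lazily on the first write fault, the kernel copies it eagerly at fork time ("early COW"), so later DMA into the parent's pinned page cannot be observed by the child. The following is a toy userspace model of that decision only; the function and field names are illustrative, not the kernel's actual API.

```python
# Toy model of the decision added by commit 4eae4efa2c ("hugetlb: do
# early cow when page pinned on src mm").  Purely illustrative; this is
# not kernel code and the names are invented for the sketch.

def fork_page(page: bytearray, maybe_pinned: bool):
    """Model one source page at fork time.

    Returns (child_view, shared): the object the child mm ends up
    mapping, and whether it is still shared with the parent.
    """
    if maybe_pinned:
        # Early COW: the child gets its own copy immediately, so DMA
        # into the parent's pinned page stays invisible to the child.
        return bytearray(page), False
    # Unpinned: share the page read-only; copy lazily on first write.
    return page, True
```

With a pinned page, fork_page returns a distinct buffer with equal contents; with an unpinned page, both sides reference the same object until a write fault would copy it.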


in testcase: vm-scalability
on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 256G memory
with following parameters:

runtime: 300s
size: 8T
test: anon-cow-seq-hugetlb
cpufreq_governor: performance
ucode: 0x5003006

test-description: The motivation behind this suite is to exercise functions and regions of the mm/ subsystem of the Linux kernel which are of interest to us.
test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/
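The anon-cow-seq-hugetlb case stresses the code path the commit touches: a parent populates anonymous (hugetlb) memory, forks, and the child writes through it sequentially, forcing copy-on-write. Below is a minimal Python sketch of that pattern, assuming a Linux host; it uses plain anonymous pages rather than hugetlb pages and a single small mapping, so it shows the COW semantics, not the scalability workload itself.

```python
import mmap
import os

PAGE = 4096

def run_cow_demo() -> bytes:
    """Fork with a MAP_PRIVATE anonymous mapping; the child's write
    triggers copy-on-write, leaving the parent's view untouched."""
    buf = mmap.mmap(-1, PAGE, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    buf[:4] = b"orig"
    pid = os.fork()
    if pid == 0:
        buf[:4] = b"cow!"   # write fault: child gets its own copy
        os._exit(0)
    os.waitpid(pid, 0)
    return bytes(buf[:4])   # parent still sees the original data
```

Calling run_cow_demo() returns b"orig" in the parent even though the child overwrote its own copy of the page.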





Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml
bin/lkp run compatible-job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/debian-10.4-x86_64-20200603.cgz/300s/8T/lkp-csl-2sp6/anon-cow-seq-hugetlb/vm-scalability/0x5003006

commit:
ca6eb14d64 ("mm: use is_cow_mapping() across tree where proper")
4eae4efa2c ("hugetlb: do early cow when page pinned on src mm")

ca6eb14d6453bea8 4eae4efa2c299f85b7ebfbeeda5
---------------- ---------------------------
%stddev %change %stddev
\ | \
0.93 ± 9% +0.1 1.06 ± 5% vm-scalability.stddev%
36639545 +1.1% 37036215 vm-scalability.throughput
90908 ± 4% -4.8% 86547 vm-scalability.time.involuntary_context_switches
12834 -1.2% 12676 vm-scalability.time.system_time
7026 ± 36% +119.0% 15385 ± 62% softirqs.NET_RX
68.53 ± 18% -31.0% 47.25 ± 14% sched_debug.cpu.nr_uninterruptible.max
-28.19 +23.9% -34.94 sched_debug.cpu.nr_uninterruptible.min
3928701 -1.0% 3888371 proc-vmstat.htlb_buddy_alloc_success
2.013e+09 -1.0% 1.992e+09 proc-vmstat.pgalloc_normal
2.011e+09 -0.9% 1.992e+09 proc-vmstat.pgfree
3069 ± 18% -12.2% 2694 ± 4% interrupts.CPU32.CAL:Function_call_interrupts
264.17 ± 41% -33.6% 175.33 ± 8% interrupts.CPU74.RES:Rescheduling_interrupts
264.50 ± 22% -33.1% 177.00 ± 18% interrupts.CPU77.RES:Rescheduling_interrupts
246.00 ± 28% -33.6% 163.33 ± 16% interrupts.CPU82.RES:Rescheduling_interrupts
219.67 ± 20% -24.4% 166.17 ± 14% interrupts.CPU90.RES:Rescheduling_interrupts
1.937e+11 -1.0% 1.917e+11 perf-stat.i.cpu-cycles
1.045e+08 -2.2% 1.022e+08 perf-stat.i.node-load-misses
0.92 ± 17% +0.6 1.54 ± 10% perf-stat.i.node-store-miss-rate%
409065 ± 16% +88.8% 772315 ± 11% perf-stat.i.node-store-misses
0.69 ± 18% +0.6 1.30 ± 9% perf-stat.overall.node-store-miss-rate%
1.045e+08 -2.2% 1.023e+08 perf-stat.ps.node-load-misses
408265 ± 17% +87.9% 767022 ± 11% perf-stat.ps.node-store-misses
0.46 ± 39% -56.4% 0.20 ± 60% perf-sched.sch_delay.avg.ms.wait_for_partner.fifo_open.do_dentry_open.do_open.isra
2578 ± 3% -19.8% 2067 ± 10% perf-sched.wait_and_delay.max.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
2582 ± 3% -20.0% 2067 ± 10% perf-sched.wait_and_delay.max.ms.do_syslog.part.0.kmsg_read.vfs_read
2412 ± 3% -19.9% 1931 ± 11% perf-sched.wait_and_delay.max.ms.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64
2568 ± 3% -19.9% 2057 ± 10% perf-sched.wait_and_delay.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
2362 ± 4% -23.6% 1804 ± 12% perf-sched.wait_and_delay.max.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
2667 ± 3% -18.0% 2187 ± 9% perf-sched.wait_and_delay.max.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_select
2.80 ±186% -95.0% 0.14 ± 79% perf-sched.wait_time.avg.ms.wait_for_partner.fifo_open.do_dentry_open.do_open.isra
2578 ± 3% -19.8% 2067 ± 10% perf-sched.wait_time.max.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
2580 ± 3% -19.9% 2067 ± 10% perf-sched.wait_time.max.ms.do_syslog.part.0.kmsg_read.vfs_read
2412 ± 3% -19.9% 1931 ± 11% perf-sched.wait_time.max.ms.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64
2568 ± 3% -19.9% 2057 ± 10% perf-sched.wait_time.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
2362 ± 4% -23.6% 1804 ± 12% perf-sched.wait_time.max.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
2667 ± 3% -18.0% 2187 ± 9% perf-sched.wait_time.max.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_select
176.12 ±209% -96.6% 5.98 ± 64% perf-sched.wait_time.max.ms.wait_for_partner.fifo_open.do_dentry_open.do_open.isra



vm-scalability.throughput

3.72e+07 +----------------------------------------------------------------+
| |
3.71e+07 |-+ O O |
| O O O O O |
3.7e+07 |-+O O O O OO |
| O O O |
3.69e+07 |-+ O O OO O O O |
| |
3.68e+07 |-+ + |
| +. :: |
3.67e+07 |-+ : + + : :.+ |
|+. .+ :+ + + + .+ +. +.++ + +.++. |
3.66e+07 |-+++.++ + : : :.+ + + ++.++.+ ++.++.+ +|
| : +.+ : + + |
3.65e+07 +----------------------------------------------------------------+


[*] bisect-good sample
[O] bisect-bad sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/[email protected] Intel Corporation

Thanks,
Oliver Sang


Attachments:
(No filename) (7.63 kB)
config-5.12.0-rc2-00348-g4eae4efa2c29 (175.56 kB)
job-script (8.14 kB)
job.yaml (5.58 kB)
reproduce (6.67 kB)