Greetings,
FYI, we noticed a 34.5% improvement in vm-scalability.throughput due to commit:
commit: cb67f4282bf9693658dbda934a441ddbbb1446df ("mm,thp,rmap: simplify compound page mapcount handling")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: vm-scalability
on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Platinum 8260L CPU @ 2.40GHz (Cascade Lake) with 128G memory
with following parameters:
runtime: 300s
size: 128G
test: truncate-seq
cpufreq_governor: performance
test-description: The motivation behind this suite is to exercise functions and regions of the Linux kernel's mm/ subsystem that are of interest to us.
test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# if you come across any failure that blocks the test,
# please remove the ~/.lkp and /lkp directories to run from a clean state.
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase:
gcc-11/performance/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/300s/128G/lkp-csl-2sp3/truncate-seq/vm-scalability
commit:
dad6a5eb55 ("mm,hugetlb: use folio fields in second tail page")
cb67f4282b ("mm,thp,rmap: simplify compound page mapcount handling")
dad6a5eb55564845 cb67f4282bf9693658dbda934a4
---------------- ---------------------------
         %stddev  %change           %stddev
             \       |                  \
 2.352e+08         +34.5%   3.164e+08        vm-scalability.median
 2.352e+08         +34.5%   3.164e+08        vm-scalability.throughput
   1132841 ± 37%   +90.2%     2154205 ± 22%  proc-vmstat.compact_free_scanned
      2.40 ±107%     +5.3        7.74 ± 47%  perf-profile.children.cycles-pp.do_filp_open
      2.40 ±107%     +5.3        7.74 ± 47%  perf-profile.children.cycles-pp.path_openat
    105.57          +1.5%      107.20        perf-stat.i.cpu-migrations
     31.13 ±  4%     +4.8       35.91 ±  4%  perf-stat.i.iTLB-load-miss-rate%
    821423 ±  5%   +17.4%      963953 ±  2%  perf-stat.i.iTLB-load-misses
      1724 ±  5%   -15.5%        1456 ±  4%  perf-stat.i.instructions-per-iTLB-miss
    727.93         -11.3%      645.86 ± 18%  perf-stat.i.metric.K/sec
    572194 ±  3%   -29.0%      406083 ±  4%  perf-stat.i.node-load-misses
    603100 ±  3%   -29.9%      422840 ±  3%  perf-stat.i.node-loads
     31.13 ±  4%     +4.8       35.88 ±  4%  perf-stat.overall.iTLB-load-miss-rate%
      1769 ±  5%   -15.7%        1490 ±  4%  perf-stat.overall.instructions-per-iTLB-miss
    104.86          +1.5%      106.46        perf-stat.ps.cpu-migrations
    815914 ±  5%   +17.3%      957385 ±  2%  perf-stat.ps.iTLB-load-misses
    568201 ±  3%   -29.0%      403248 ±  4%  perf-stat.ps.node-load-misses
    598933 ±  3%   -29.9%      420034 ±  3%  perf-stat.ps.node-loads
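
As a reading aid, the change column above can be cross-checked against the two value columns. The sketch below is not part of lkp-tests; the helper names are hypothetical. It assumes, based only on the numbers in this report, that entries with a "%" suffix are relative changes against the parent commit, and that entries without one (the rate and perf-profile rows) are absolute differences.

```python
# Hypothetical helpers for sanity-checking the comparison table.
# Assumption (inferred from the figures in this report, not from lkp-tests
# documentation): "+34.5%" is a relative change, "+4.8" is an absolute delta.

def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


def abs_delta(base: float, new: float) -> float:
    """Absolute difference, used for metrics that are already percentages."""
    return new - base


# vm-scalability.throughput: parent commit vs. cb67f4282b
print(f"{pct_change(2.352e8, 3.164e8):+.1f}%")   # +34.5%

# perf-stat.i.node-loads
print(f"{pct_change(603100, 422840):+.1f}%")     # -29.9%

# perf-stat.i.iTLB-load-miss-rate% (reported in percentage points)
print(f"{abs_delta(31.13, 35.91):+.1f}")         # +4.8
```

The three printed values match the corresponding rows, which is consistent with the assumption stated above.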
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests