2023-06-07 07:00:35

by kernel test robot

Subject: [linus:master] [x86] 20f3337d35: stress-ng.lockofd.ops_per_sec 3.4% improvement



Hello,

kernel test robot noticed a 3.4% improvement in stress-ng.lockofd.ops_per_sec on:


commit: 20f3337d350c4e1b4ac66d731fd4e98565bf6cc0 ("x86: don't use REP_GOOD or ERMS for small memory clearing")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: stress-ng
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

nr_threads: 10%
disk: 1HDD
testtime: 60s
fs: ext4
class: os
test: lockofd
cpufreq_governor: performance






Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file

# if you come across any failure that blocks the test,
# please remove the ~/.lkp and /lkp directories to run from a clean state.

=========================================================================================
class/compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
os/gcc-12/performance/1HDD/ext4/x86_64-rhel-8.3/10%/debian-11.1-x86_64-20220510.cgz/lkp-icl-2sp7/lockofd/stress-ng/60s

commit:
68674f94ff ("x86: don't use REP_GOOD or ERMS for small memory copies")
20f3337d35 ("x86: don't use REP_GOOD or ERMS for small memory clearing")

68674f94ffc9dddc 20f3337d350c4e1b4ac66d731fd
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
     39.25                +2.2%      40.10        iostat.cpu.user
      0.05                -0.0        0.03 ± 16%  mpstat.cpu.all.soft%
 1.212e+08                +3.4%  1.253e+08        stress-ng.lockofd.ops
   2019682                +3.4%    2087883        stress-ng.lockofd.ops_per_sec
 9.311e+08                +6.8%  9.944e+08        perf-stat.i.branch-instructions
      0.67                -7.1%       0.63        perf-stat.i.cpi
 1.172e+09                +2.5%  1.202e+09        perf-stat.i.dTLB-loads
      0.00 ±  3%          -0.0        0.00 ±  2%  perf-stat.i.dTLB-store-miss-rate%
 8.309e+08               +15.2%  9.575e+08        perf-stat.i.dTLB-stores
 4.579e+09                +7.8%  4.934e+09        perf-stat.i.instructions
      1.49                +7.8%       1.60        perf-stat.i.ipc
      1.16               +62.7%       1.89 ±  2%  perf-stat.i.metric.G/sec
      1771               -28.8%       1261 ±  4%  perf-stat.i.metric.M/sec
     19.22 ±  2%          -1.8       17.39        perf-profile.calltrace.cycles-pp.fcntl_setlk.do_fcntl.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
     53.24                -1.8       51.44        perf-profile.calltrace.cycles-pp.do_fcntl.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
     60.31                -1.5       58.83        perf-profile.calltrace.cycles-pp.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
      6.48 ±  4%          -0.6        5.87 ±  4%  perf-profile.calltrace.cycles-pp.kmem_cache_alloc.fcntl_setlk.do_fcntl.__x64_sys_fcntl.do_syscall_64
      1.03 ±  7%          -0.2        0.88 ± 10%  perf-profile.calltrace.cycles-pp.flock64_to_posix_lock.fcntl_getlk.do_fcntl.__x64_sys_fcntl.do_syscall_64
      5.76 ±  3%          +0.4        6.12 ±  2%  perf-profile.calltrace.cycles-pp.stress_mwc64modn
      0.00                +1.5        1.51 ±  4%  perf-profile.calltrace.cycles-pp.memset_orig.kmem_cache_alloc.fcntl_setlk.do_fcntl.__x64_sys_fcntl
      0.00                +1.6        1.60 ±  7%  perf-profile.calltrace.cycles-pp.memset_orig.kmem_cache_alloc.fcntl_getlk.do_fcntl.__x64_sys_fcntl
     53.94                -1.7       52.24        perf-profile.children.cycles-pp.do_fcntl
     19.65                -1.6       18.07        perf-profile.children.cycles-pp.fcntl_setlk
     13.82                -1.0       12.83 ±  3%  perf-profile.children.cycles-pp.kmem_cache_alloc
      5.96 ±  3%          +0.4        6.37 ±  3%  perf-profile.children.cycles-pp.stress_mwc64modn
      5.88 ±  2%          +0.5        6.39 ±  3%  perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      0.00                +3.3        3.29 ±  5%  perf-profile.children.cycles-pp.memset_orig
      0.84 ±  9%          +0.2        0.99 ±  9%  perf-profile.self.cycles-pp.exit_to_user_mode_prepare
      5.60 ±  2%          +0.4        6.04 ±  3%  perf-profile.self.cycles-pp.stress_mwc64modn
      5.58 ±  2%          +0.5        6.09 ±  2%  perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      0.00                +3.1        3.11 ±  5%  perf-profile.self.cycles-pp.memset_orig
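
For readers checking the comparison table above by hand, the following is a minimal sketch of how its middle column can be recomputed from the two value columns. It assumes, based on how the rows read rather than on anything stated in this report, that entries ending in "%" are relative changes against the base commit, and that entries without "%" (the perf-profile rows, which are already percentages of cycles) are absolute differences in percentage points. The helper names rel_change_pct and point_delta are hypothetical and not part of lkp-tests. Because the table prints rounded values, a recomputation from the printed numbers may differ from the reported change in the last digit for some rows.

```python
def rel_change_pct(base: float, new: float) -> float:
    """Relative change of `new` against `base`, in percent (hypothetical helper)."""
    return (new - base) / base * 100.0


def point_delta(base: float, new: float) -> float:
    """Absolute difference, for metrics that are already percentages (hypothetical helper)."""
    return new - base


# (metric, value on 68674f94ff, value on 20f3337d35), copied from the table.
counter_rows = [
    ("stress-ng.lockofd.ops_per_sec", 2019682, 2087883),
    ("perf-stat.i.instructions", 4.579e9, 4.934e9),
    ("perf-stat.i.dTLB-stores", 8.309e8, 9.575e8),
]

# perf-profile rows: shares of sampled cycles, so the change is a plain difference.
profile_rows = [
    ("perf-profile.children.cycles-pp.fcntl_setlk", 19.65, 18.07),
    ("perf-profile.children.cycles-pp.kmem_cache_alloc", 13.82, 12.83),
    ("perf-profile.children.cycles-pp.memset_orig", 0.00, 3.29),
]

for name, base, new in counter_rows:
    print(f"{rel_change_pct(base, new):+6.1f}%  {name}")

for name, base, new in profile_rows:
    print(f"{point_delta(base, new):+6.1f}   {name}")
```

Under these assumptions the headline figure follows directly: (2087883 - 2019682) / 2019682 is roughly 0.034, which matches the reported 3.4% improvement.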




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



Attachments:
(No filename) (5.01 kB)
config-6.3.0-rc7-00002-g20f3337d350c (160.70 kB)
job-script (9.25 kB)
job.yaml (6.26 kB)
reproduce (552.00 B)