Greetings,
FYI, we noticed a 4.7% improvement in fio.write_iops due to commit:
commit: 9edeaea1bc452372718837ed2ba775811baf1ba1 ("sched: Core-wide rq->lock")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git sched/core
in testcase: fio-basic
on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 256G memory
with the following parameters:
disk: 2pmem
fs: xfs
mount_option: dax
runtime: 200s
nr_task: 50%
time_based: tb
rw: write
bs: 2M
ioengine: mmap
test_size: 200G
cpufreq_governor: performance
ucode: 0x5003006
test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
bin/lkp run generated-yaml-file
=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/mount_option/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase/time_based/ucode:
2M/gcc-9/performance/2pmem/xfs/mmap/x86_64-rhel-8.3/dax/50%/debian-10.4-x86_64-20200603.cgz/200s/write/lkp-csl-2sp6/200G/fio-basic/tb/0x5003006
commit:
d66f1b06b5 ("sched: Prepare for Core-wide rq->lock")
9edeaea1bc ("sched: Core-wide rq->lock")
d66f1b06b5b438cd 9edeaea1bc452372718837ed2ba
---------------- ---------------------------
%stddev %change %stddev
\ | \
2.15 ± 17% +69.8 71.93 ± 7% fio.latency_10ms%
97.83 -69.8 28.05 ± 18% fio.latency_20ms%
4.705e+08 +4.7% 4.927e+08 fio.time.minor_page_faults
547.26 +5.9% 579.60 fio.time.user_time
918927 +4.7% 962153 fio.workload
9188 +4.7% 9621 fio.write_bw_MBps
10573141 -4.3% 10114389 fio.write_clat_90%_us
10660522 -4.3% 10201770 fio.write_clat_95%_us
11141120 -4.3% 10660522 fio.write_clat_99%_us
10367263 -4.5% 9896727 fio.write_clat_mean_us
4594 +4.7% 4810 fio.write_iops
2.82 +5.9% 2.98 iostat.cpu.user
456542 +4.7% 477998 proc-vmstat.nr_page_table_pages
3072146 +2.5% 3147841 proc-vmstat.numa_hit
2985201 +2.5% 3060879 proc-vmstat.numa_local
3991178 +2.2% 4078232 proc-vmstat.pgalloc_normal
4.718e+08 +4.7% 4.939e+08 proc-vmstat.pgfault
3626261 +2.5% 3718014 proc-vmstat.pgfree
918927 +4.7% 962153 proc-vmstat.thp_fault_fallback
77.50 ± 18% -48.0% 40.33 ± 47% interrupts.CPU16.TLB:TLB_shootdowns
3752 ± 41% +98.1% 7431 ± 10% interrupts.CPU19.NMI:Non-maskable_interrupts
3752 ± 41% +98.1% 7431 ± 10% interrupts.CPU19.PMI:Performance_monitoring_interrupts
77.33 ± 19% -40.9% 45.67 ± 16% interrupts.CPU3.TLB:TLB_shootdowns
7322 ± 10% -32.1% 4973 ± 32% interrupts.CPU50.NMI:Non-maskable_interrupts
7322 ± 10% -32.1% 4973 ± 32% interrupts.CPU50.PMI:Performance_monitoring_interrupts
7096 ± 13% -45.7% 3855 ± 35% interrupts.CPU56.NMI:Non-maskable_interrupts
7096 ± 13% -45.7% 3855 ± 35% interrupts.CPU56.PMI:Performance_monitoring_interrupts
86.83 ± 14% -29.6% 61.17 ± 21% interrupts.CPU57.TLB:TLB_shootdowns
6793 ± 22% -42.8% 3888 ± 38% interrupts.CPU68.NMI:Non-maskable_interrupts
6793 ± 22% -42.8% 3888 ± 38% interrupts.CPU68.PMI:Performance_monitoring_interrupts
81.50 ± 17% -39.3% 49.50 ± 31% interrupts.CPU69.TLB:TLB_shootdowns
81.67 ± 18% -45.3% 44.67 ± 33% interrupts.CPU8.TLB:TLB_shootdowns
15.20 +2.6% 15.60 perf-stat.i.MPKI
6.81e+09 +2.7% 6.997e+09 perf-stat.i.branch-instructions
1.934e+08 +5.0% 2.031e+08 perf-stat.i.cache-misses
3.95e+08 +5.2% 4.158e+08 perf-stat.i.cache-references
5.16 -2.5% 5.03 perf-stat.i.cpi
701.20 -4.8% 667.48 perf-stat.i.cycles-between-cache-misses
2602676 ± 4% +6.1% 2760864 perf-stat.i.dTLB-load-misses
6.573e+09 +2.8% 6.757e+09 perf-stat.i.dTLB-loads
61805808 +5.4% 65122898 perf-stat.i.dTLB-store-misses
1.939e+09 +4.4% 2.024e+09 perf-stat.i.dTLB-stores
10469674 ± 3% +5.4% 11033398 ± 3% perf-stat.i.iTLB-load-misses
3790783 +2.5% 3885669 ± 2% perf-stat.i.iTLB-loads
2.59e+10 +2.6% 2.658e+10 perf-stat.i.instructions
0.20 +2.5% 0.20 perf-stat.i.ipc
1632 +5.0% 1714 perf-stat.i.metric.K/sec
163.72 +3.0% 168.68 perf-stat.i.metric.M/sec
2333293 +4.7% 2442886 perf-stat.i.minor-faults
2334698 +4.7% 2444294 perf-stat.i.page-faults
15.25 +2.6% 15.64 perf-stat.overall.MPKI
5.17 -2.5% 5.04 perf-stat.overall.cpi
692.80 -4.8% 659.64 perf-stat.overall.cycles-between-cache-misses
0.19 +2.6% 0.20 perf-stat.overall.ipc
5684913 -2.2% 5557703 perf-stat.overall.path-length
6.777e+09 +2.7% 6.962e+09 perf-stat.ps.branch-instructions
1.924e+08 +5.0% 2.021e+08 perf-stat.ps.cache-misses
3.931e+08 +5.2% 4.137e+08 perf-stat.ps.cache-references
2589917 ± 4% +6.1% 2747324 perf-stat.ps.dTLB-load-misses
6.541e+09 +2.8% 6.724e+09 perf-stat.ps.dTLB-loads
61502658 +5.4% 64802165 perf-stat.ps.dTLB-store-misses
1.929e+09 +4.4% 2.014e+09 perf-stat.ps.dTLB-stores
10420130 ± 3% +5.4% 10981621 ± 3% perf-stat.ps.iTLB-load-misses
3771888 +2.5% 3866318 ± 2% perf-stat.ps.iTLB-loads
2.577e+10 +2.6% 2.645e+10 perf-stat.ps.instructions
2321890 +4.7% 2430892 perf-stat.ps.minor-faults
2323280 +4.7% 2432285 perf-stat.ps.page-faults
5.224e+12 +2.4% 5.347e+12 perf-stat.total.instructions
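As a reading aid (not part of the robot's output), the %change column in the table above can be re-derived from the two value columns. The following is a minimal Python sketch under two assumptions: ordinary metrics report a relative change in percent, while metrics whose names end in "%" (such as fio.latency_10ms%) report an absolute difference in percentage points. The function names are illustrative only. The last two checks are observations that happen to match the numbers in this report (bandwidth is about iops times the 2M block size, and path-length is about total instructions divided by fio.workload); they are not confirmed definitions of how the tool computes these fields.

```python
def rel_change_pct(base, new):
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


def point_change(base, new):
    """Absolute difference, used here for metrics already expressed in percent."""
    return new - base


# Values copied from the comparison table (parent commit, tested commit).
print(f"fio.write_iops         {rel_change_pct(4594, 4810):+.1f}%")
print(f"fio.write_bw_MBps      {rel_change_pct(9188, 9621):+.1f}%")
print(f"fio.write_clat_mean_us {rel_change_pct(10367263, 9896727):+.1f}%")
print(f"fio.latency_10ms%      {point_change(2.15, 71.93):+.1f}")

# Assumption: with bs=2M, bandwidth in MB/s is roughly iops * 2.
bw_from_iops = 4594 * 2

# Assumption: path-length is roughly total instructions per unit of workload.
path_length_estimate = 5.224e12 / 918927

print(f"bandwidth from iops    {bw_from_iops}")
print(f"path-length estimate   {path_length_estimate:.0f}")
```

The estimates agree with the reported values only up to the rounding of the inputs printed in the table, so small differences in the last digits are expected.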
fio.write_bw_MBps
[ASCII plot omitted; column layout lost in extraction. Y-axis: 9100 to 9900. "+" samples fall between about 9100 and 9300, "O" samples between about 9500 and 9900.]
fio.write_iops
[ASCII plot omitted; column layout lost in extraction. Y-axis: 4550 to 4950. "+" samples fall between about 4550 and 4650, "O" samples between about 4750 and 4950.]
fio.write_clat_mean_us
[ASCII plot omitted; column layout lost in extraction. Y-axis: 9.6e+06 to 1.05e+07. "+" samples fall between about 1.02e+07 and 1.05e+07, "O" samples between about 9.6e+06 and 1.01e+07.]
fio.workload
[ASCII plot omitted; column layout lost in extraction. Y-axis: 910000 to 990000. "+" samples fall between about 910000 and 930000, "O" samples between about 950000 and 990000.]
fio.time.minor_page_faults
[ASCII plot omitted; column layout lost in extraction. Y-axis: 4.65e+08 to 5.05e+08. "+" samples fall between about 4.65e+08 and 4.75e+08, "O" samples between about 4.85e+08 and 5.05e+08.]
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
---
0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
https://lists.01.org/hyperkitty/list/[email protected]       Intel Corporation
Thanks,
Oliver Sang