Greetings,

FYI, we noticed a 3241.7% improvement in fio.write_iops due to commit:
commit: a052d3d1b6c77f193f7051cd5d4b08138fd57332 ("btrfs: only reserve the needed data space amount during fallocate")
https://git.kernel.org/cgit/linux/kernel/git/fdmanana/linux.git misc-next
in testcase: fio-basic
on test machine: 96 threads 2 sockets Ice Lake with 256G memory
with the following parameters (an approximate fio invocation is sketched below the test description):
runtime: 300s
disk: 1HDD
fs: btrfs
nr_task: 100%
test_size: 128G
rw: randwrite
bs: 4k
ioengine: falloc
cpufreq_governor: performance
ucode: 0xb000280
test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio
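For reference, the parameters above correspond roughly to the hand-written fio
invocation below; it is an approximation, not the exact job file LKP generates
(the target directory and jobs count are illustrative assumptions). Note that
with ioengine=falloc, fio issues its "writes" as fallocate(2) calls rather than
write(2), which is why a fallocate-side btrfs change moves write IOPS here.

  # Approximate equivalent of this job (illustrative only);
  # /fs/sda1 stands in for the btrfs-formatted 1HDD test disk.
  fio --name=randwrite --directory=/fs/sda1 \
      --ioengine=falloc --rw=randwrite --bs=4k \
      --size=128G --runtime=300 --time_based \
      --numjobs=96    # nr_task=100% of the 96 hardware threads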
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# If you come across any failure that blocks the test,
# please remove ~/.lkp and the /lkp directory to run from a clean state.
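As context for the numbers below: per its subject, the tested commit makes
btrfs reserve data space only for the parts of a fallocate range that are not
yet allocated, instead of reserving for the whole range up front and releasing
the excess afterwards. A minimal sketch of such a partially allocated range,
using util-linux fallocate(1) on a hypothetical btrfs mount at /mnt/btrfs:

  fallocate --offset 0 --length 64M /mnt/btrfs/demo    # allocate the first 64M
  filefrag -v /mnt/btrfs/demo                          # inspect the unwritten extents
  fallocate --offset 0 --length 128M /mnt/btrfs/demo   # only the second 64M is new:
                                                       # with the patch, btrfs reserves
                                                       # data space just for that hole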
=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase/ucode:
4k/gcc-9/performance/1HDD/btrfs/falloc/x86_64-rhel-8.3/100%/debian-10.4-x86_64-20200603.cgz/300s/randwrite/lkp-icl-2sp1/128G/fio-basic/0xb000280
commit:
3d83c164a0 ("btrfs: move common inode creation code into btrfs_create_new_inode()")
a052d3d1b6 ("btrfs: only reserve the needed data space amount during fallocate")
3d83c164a02f65c3 a052d3d1b6c77f193f7051cd5d4
---------------- ---------------------------
%stddev %change %stddev
\ | \
99.68 -99.6 0.04 ± 49% fio.latency_100us%
0.07 ± 65% +0.1 0.16 ± 16% fio.latency_10us%
0.02 ± 30% +0.1 0.11 ± 36% fio.latency_20us%
0.04 ± 28% -0.0 0.01 fio.latency_250us%
0.01 ± 29% +69.5 69.56 ± 3% fio.latency_2us%
0.02 ± 40% +0.2 0.21 ± 64% fio.latency_4us%
28.60 ± 2% -95.5% 1.28 ± 10% fio.time.elapsed_time
28.60 ± 2% -95.5% 1.28 ± 10% fio.time.elapsed_time.max
16835 ± 3% -96.6% 571.33 ± 13% fio.time.involuntary_context_switches
16349 ± 2% -10.9% 14569 fio.time.minor_page_faults
9381 -46.1% 5060 ± 10% fio.time.percent_of_cpu_this_job_got
2658 ± 2% -98.8% 32.01 fio.time.system_time
25.08 ± 5% +28.4% 32.19 ± 4% fio.time.user_time
3616 ± 2% -76.6% 846.00 ± 5% fio.time.voluntary_context_switches
4651 ± 2% +3241.7% 155452 ± 12% fio.write_bw_MBps
85333 ± 2% -98.5% 1314 fio.write_clat_90%_us
87210 -98.4% 1397 fio.write_clat_95%_us
90453 -98.2% 1664 ± 8% fio.write_clat_99%_us
80075 ± 2% -98.5% 1198 fio.write_clat_mean_us
44839 ± 22% -53.9% 20667 ± 10% fio.write_clat_stddev
1190883 ± 2% +3241.7% 39795840 ± 12% fio.write_iops
207365 ± 19% -42.9% 118345 ± 4% numa-numastat.node1.numa_hit
530.96 -87.7% 65.31 ±223% pmeter.Average_Active_Power
80.02 ± 4% -38.0% 49.62 ± 4% uptime.boot
12.79 ± 14% +545.5% 82.54 ± 2% iostat.cpu.idle
86.29 ± 2% -88.9% 9.55 ± 10% iostat.cpu.system
0.91 ± 4% +771.1% 7.90 ± 13% iostat.cpu.user
7.25 ± 28% +64.1 71.35 ± 8% mpstat.cpu.all.idle%
0.75 ± 5% +0.3 1.05 ± 24% mpstat.cpu.all.irq%
0.01 ± 40% +0.0 0.04 ± 36% mpstat.cpu.all.soft%
91.06 ± 2% -77.0 14.08 ± 21% mpstat.cpu.all.sys%
0.93 ± 3% +12.6 13.48 ± 23% mpstat.cpu.all.usr%
12.17 ± 16% +574.0% 82.00 ± 2% vmstat.cpu.id
85.50 -89.1% 9.33 ± 13% vmstat.cpu.sy
1186 ± 2% -100.0% 0.00 vmstat.io.bo
83.83 ± 2% -80.1% 16.67 ± 33% vmstat.procs.r
2723 ± 2% +180.4% 7637 ± 6% vmstat.system.cs
184148 -25.1% 137836 ± 5% vmstat.system.in
10972 ± 8% +33.5% 14652 ± 8% numa-vmstat.node0.nr_kernel_stack
2566 ± 36% +99.6% 5123 ± 13% numa-vmstat.node0.nr_page_table_pages
305.83 ± 56% -93.8% 18.83 ±218% numa-vmstat.node1.nr_inactive_file
1137 ± 74% -58.5% 472.67 ± 86% numa-vmstat.node1.nr_page_table_pages
7671 ± 23% -35.3% 4967 ± 13% numa-vmstat.node1.nr_slab_reclaimable
27154 ± 12% -20.8% 21509 ± 9% numa-vmstat.node1.nr_slab_unreclaimable
305.83 ± 56% -93.8% 18.83 ±218% numa-vmstat.node1.nr_zone_inactive_file
44723 ± 18% -50.4% 22163 ± 46% numa-meminfo.node0.AnonHugePages
10975 ± 8% +33.7% 14674 ± 8% numa-meminfo.node0.KernelStack
10286 ± 36% +99.5% 20522 ± 13% numa-meminfo.node0.PageTables
1921 ± 25% -46.9% 1021 ± 36% numa-meminfo.node1.Active
1226 ± 56% -93.8% 75.67 ±218% numa-meminfo.node1.Inactive(file)
30687 ± 23% -35.2% 19870 ± 13% numa-meminfo.node1.KReclaimable
4537 ± 74% -58.3% 1893 ± 85% numa-meminfo.node1.PageTables
30687 ± 23% -35.2% 19870 ± 13% numa-meminfo.node1.SReclaimable
108615 ± 12% -20.8% 86038 ± 9% numa-meminfo.node1.SUnreclaim
139303 ± 13% -24.0% 105909 ± 9% numa-meminfo.node1.Slab
3647 -37.6% 2275 ± 12% meminfo.Active
3419 -39.9% 2056 ± 13% meminfo.Active(anon)
58547 ± 3% -41.6% 34178 ± 4% meminfo.AnonHugePages
330276 +34.8% 445049 ± 4% meminfo.AnonPages
3798478 ± 2% -65.4% 1314197 ± 16% meminfo.Committed_AS
366737 +27.7% 468352 ± 4% meminfo.Inactive
364992 +28.2% 467743 ± 4% meminfo.Inactive(anon)
20396 +11.8% 22795 ± 2% meminfo.KernelStack
55354 -21.6% 43382 meminfo.Mapped
14784 ± 3% +52.0% 22466 ± 9% meminfo.PageTables
38638 -34.7% 25212 ± 2% meminfo.Shmem
2939 ± 2% -77.0% 676.67 ± 13% turbostat.Avg_MHz
91.92 ± 2% -66.3 25.58 ± 13% turbostat.Busy%
3196 -16.1% 2681 turbostat.Bzy_MHz
6.41 ± 41% +39.4 45.81 ± 34% turbostat.C1E%
1.70 ±128% +26.1 27.81 ± 49% turbostat.C6%
6.70 ± 37% +694.4% 53.20 ± 29% turbostat.CPU%c1
1.39 ±147% +1428.2% 21.22 ± 67% turbostat.CPU%c6
59.33 -10.4% 53.17 ± 3% turbostat.CoreTmp
5954076 ± 3% -89.3% 638209 ± 16% turbostat.IRQ
59.67 -10.6% 53.33 ± 2% turbostat.PkgTmp
347.98 -15.4% 294.40 ± 4% turbostat.PkgWatt
14.31 ± 29% -5.8 8.52 ±142% perf-profile.calltrace.cycles-pp._dl_catch_error
10.95 ± 75% -5.4 5.56 ±141% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe._dl_catch_error
10.95 ± 75% -5.4 5.56 ±141% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe._dl_catch_error
5.22 ±100% -5.2 0.00 perf-profile.calltrace.cycles-pp.mmput.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64
5.22 ±100% -5.2 0.00 perf-profile.calltrace.cycles-pp.exit_mmap.mmput.do_exit.do_group_exit.__x64_sys_exit_group
7.00 ±111% -4.6 2.44 ±147% perf-profile.calltrace.cycles-pp.unmap_vmas.exit_mmap.mmput.do_exit.do_group_exit
7.00 ±111% -4.6 2.44 ±147% perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.exit_mmap.mmput.do_exit
5.48 ±113% -3.0 2.44 ±147% perf-profile.calltrace.cycles-pp.zap_pte_range.unmap_page_range.unmap_vmas.exit_mmap.mmput
14.31 ± 29% -6.4 7.96 ±144% perf-profile.children.cycles-pp._dl_catch_error
4.89 ±103% -4.9 0.00 perf-profile.children.cycles-pp.release_pages
7.00 ±111% -4.6 2.44 ±147% perf-profile.children.cycles-pp.unmap_vmas
7.00 ±111% -4.6 2.44 ±147% perf-profile.children.cycles-pp.unmap_page_range
5.98 ±103% -4.3 1.67 ±223% perf-profile.children.cycles-pp.walk_component
5.48 ±113% -3.0 2.44 ±147% perf-profile.children.cycles-pp.zap_pte_range
5.82 ±110% -0.1 5.68 ±162% perf-profile.children.cycles-pp.format_decode
5.82 ±110% -2.2 3.60 ±144% perf-profile.self.cycles-pp.format_decode
852.67 -39.6% 514.67 ± 13% proc-vmstat.nr_active_anon
82485 +35.0% 111385 ± 4% proc-vmstat.nr_anon_pages
5711 ± 73% -99.2% 44.17 ± 63% proc-vmstat.nr_dirtied
91158 +28.7% 117287 ± 4% proc-vmstat.nr_inactive_anon
20403 +11.5% 22754 ± 2% proc-vmstat.nr_kernel_stack
14082 -18.9% 11426 proc-vmstat.nr_mapped
3707 ± 4% +51.3% 5608 ± 9% proc-vmstat.nr_page_table_pages
9647 -32.2% 6544 ± 2% proc-vmstat.nr_shmem
27727 -4.2% 26564 proc-vmstat.nr_slab_reclaimable
852.67 -39.6% 514.67 ± 13% proc-vmstat.nr_zone_active_anon
91158 +28.7% 117287 ± 4% proc-vmstat.nr_zone_inactive_anon
380393 -16.7% 316892 proc-vmstat.numa_hit
293361 ± 2% -21.7% 229776 proc-vmstat.numa_local
380455 -16.7% 316873 proc-vmstat.pgalloc_normal
258327 -29.5% 182208 ± 2% proc-vmstat.pgfault
278534 ± 3% -42.0% 161619 ± 3% proc-vmstat.pgfree
12395 ± 3% -37.0% 7812 ± 6% proc-vmstat.pgreuse
0.39 ± 6% +1.1 1.44 ± 35% perf-stat.i.branch-miss-rate%
11226008 ± 7% +440.8% 60709258 ± 21% perf-stat.i.branch-misses
42.60 -25.4 17.23 ± 22% perf-stat.i.cache-miss-rate%
29893686 -48.5% 15393933 ± 18% perf-stat.i.cache-misses
2142 +161.1% 5594 ± 17% perf-stat.i.context-switches
20.93 -90.3% 2.04 ± 20% perf-stat.i.cpi
96050 +2.4% 98325 ± 3% perf-stat.i.cpu-clock
2.914e+11 -79.6% 5.937e+10 ± 46% perf-stat.i.cpu-cycles
161.58 ± 2% +172.4% 440.20 ± 25% perf-stat.i.cpu-migrations
9574 ± 2% -63.2% 3520 ± 38% perf-stat.i.cycles-between-cache-misses
0.00 ± 84% +0.0 0.05 ± 35% perf-stat.i.dTLB-load-miss-rate%
107745 ± 44% +1472.2% 1693986 ± 34% perf-stat.i.dTLB-load-misses
0.00 ± 28% +0.0 0.03 ± 38% perf-stat.i.dTLB-store-miss-rate%
48471 ± 12% +1086.3% 575011 ± 20% perf-stat.i.dTLB-store-misses
0.06 ± 7% +967.1% 0.62 ± 24% perf-stat.i.ipc
169.22 ± 3% +2000.8% 3555 ± 39% perf-stat.i.major-faults
3.04 -79.9% 0.61 ± 47% perf-stat.i.metric.GHz
4997 ± 3% +804.7% 45213 ± 28% perf-stat.i.minor-faults
95.51 -33.3 62.24 ± 11% perf-stat.i.node-load-miss-rate%
5112839 -76.9% 1180256 ± 48% perf-stat.i.node-load-misses
157058 ± 2% +184.7% 447159 ± 27% perf-stat.i.node-loads
69.69 -41.4 28.28 ± 33% perf-stat.i.node-store-miss-rate%
6517770 -76.5% 1530670 ± 32% perf-stat.i.node-store-misses
5166 ± 3% +843.9% 48766 ± 28% perf-stat.i.page-faults
96050 +2.4% 98326 ± 3% perf-stat.i.task-clock
42.43 ± 2% -24.8 17.59 ± 23% perf-stat.overall.cache-miss-rate%
21.12 -92.9% 1.49 ± 41% perf-stat.overall.cpi
9748 -60.1% 3886 ± 41% perf-stat.overall.cycles-between-cache-misses
0.00 ± 46% +0.0 0.03 ± 89% perf-stat.overall.dTLB-load-miss-rate%
0.05 +1526.5% 0.77 ± 31% perf-stat.overall.ipc
97.02 -27.6 69.44 ± 19% perf-stat.overall.node-load-miss-rate%
70.99 -40.6 30.34 ± 34% perf-stat.overall.node-store-miss-rate%
12059 -73.1% 3245 ± 81% perf-stat.overall.path-length
10857952 ± 7% +231.5% 35995103 ± 25% perf-stat.ps.branch-misses
28909823 -68.0% 9254837 ± 27% perf-stat.ps.cache-misses
68161639 ± 2% -22.7% 52690454 ± 13% perf-stat.ps.cache-references
2071 +54.7% 3204 ± 8% perf-stat.ps.context-switches
92860 -37.4% 58162 ± 16% perf-stat.ps.cpu-clock
2.818e+11 -86.4% 3.828e+10 ± 58% perf-stat.ps.cpu-cycles
156.29 ± 2% +61.4% 252.29 ± 15% perf-stat.ps.cpu-migrations
104205 ± 44% +820.5% 959180 ± 20% perf-stat.ps.dTLB-load-misses
46858 ± 12% +603.2% 329524 ± 12% perf-stat.ps.dTLB-store-misses
162.82 ± 3% +1085.3% 1929 ± 25% perf-stat.ps.major-faults
4827 ± 3% +425.3% 25360 ± 15% perf-stat.ps.minor-faults
4943794 -84.6% 760468 ± 60% perf-stat.ps.node-load-misses
151872 ± 2% +65.9% 251935 ± 14% perf-stat.ps.node-loads
6302973 -84.9% 950283 ± 44% perf-stat.ps.node-store-misses
2576891 ± 4% -19.4% 2077892 ± 9% perf-stat.ps.node-stores
4990 ± 3% +446.8% 27289 ± 15% perf-stat.ps.page-faults
92860 -37.4% 58162 ± 16% perf-stat.ps.task-clock
4.046e+11 -73.1% 1.089e+11 ± 81% perf-stat.total.instructions
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://01.org/lkp