Greetings,
FYI, we noticed a 27.8% improvement in fio.write_iops due to commit:
commit: 7d28631786b2333c5d48ad25172eb159aaa2945f ("mpage: stop using bdev_{read,write}_page")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: fio-basic
on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz (Cascade Lake) with 512G memory
with following parameters:
disk: 2pmem
fs: ext2
runtime: 200s
nr_task: 50%
time_based: tb
rw: randwrite
bs: 2M
ioengine: io_uring
test_size: 200G
cpufreq_governor: performance
test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio
Details are as follows:
=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase/time_based:
2M/gcc-11/performance/2pmem/ext2/io_uring/x86_64-rhel-8.3/50%/debian-11.1-x86_64-20220510.cgz/200s/randwrite/lkp-csl-2sp7/200G/fio-basic/tb
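For reference, the parameter matrix above corresponds roughly to a fio job of the following shape. This is a sketch only: the mount point, job-section name, and the numjobs derivation (50% of 96 CPU threads) are assumptions, and the actual lkp-generated job file will differ.

```ini
[global]
ioengine=io_uring
rw=randwrite
bs=2M
size=200G
runtime=200s
time_based
directory=/fs/pmem0    ; assumption: ext2 mounted from one of the pmem devices

[randwrite]
numjobs=48             ; assumption: nr_task=50% of 96 hardware threads
```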
commit:
9e5fa0ae52 ("mm: refactor va_remove_mappings")
7d28631786 ("mpage: stop using bdev_{read,write}_page")
9e5fa0ae52fc67de 7d28631786b2333c5d48ad25172
---------------- ---------------------------
%stddev %change %stddev
\ | \
14.89 ? 10% -10.8 4.13 ? 14% fio.latency_1000ms%
0.09 ? 9% +0.0 0.13 ? 8% fio.latency_10ms%
7.55 ? 12% -4.0 3.56 ? 25% fio.latency_2000ms%
10.25 ? 11% +3.9 14.13 ? 17% fio.latency_250ms%
13.67 ? 17% +19.5 33.22 ? 27% fio.latency_500ms%
0.33 ? 31% +0.3 0.61 ? 31% fio.latency_50ms%
1.14 ? 3% -0.4 0.71 ? 17% fio.latency_>=2000ms%
2.053e+09 +27.5% 2.618e+09 ? 3% fio.time.file_system_outputs
8052 ? 5% +53.4% 12353 ? 12% fio.time.involuntary_context_switches
296137 ? 3% +14.2% 338218 ? 5% fio.time.minor_page_faults
1041 ? 4% +23.9% 1289 ? 13% fio.time.percent_of_cpu_this_job_got
1978 ? 4% +22.3% 2419 ? 14% fio.time.system_time
125.49 ? 8% +45.3% 182.35 ? 10% fio.time.user_time
4440324 ? 8% -46.7% 2366863 ? 11% fio.time.voluntary_context_switches
501829 +27.6% 640275 ? 3% fio.workload
4986 +27.8% 6371 ? 3% fio.write_bw_MBps
9.731e+08 -28.7% 6.935e+08 ? 5% fio.write_clat_90%_us
1.075e+09 -12.9% 9.367e+08 ? 8% fio.write_clat_95%_us
2.266e+09 ? 4% -24.4% 1.714e+09 ? 7% fio.write_clat_99%_us
6.123e+08 -21.6% 4.802e+08 ? 4% fio.write_clat_mean_us
2493 +27.8% 3185 ? 3% fio.write_iops
232414 ? 7% +15.7% 268867 ? 7% fio.write_slat_mean_us
41428534 -13.3% 35901780 cpuidle..usage
1137 ? 4% -21.9% 888.17 ? 16% meminfo.Mlocked
297.67 ?122% -83.0% 50.67 ?141% numa-meminfo.node1.Mlocked
11.50 ? 4% +2.3 13.82 ? 12% mpstat.cpu.all.sys%
0.67 ? 8% +0.3 0.96 ? 9% mpstat.cpu.all.usr%
49.17 ? 27% +1.3e+07% 6178756 ? 4% vmstat.io.bo
78001 ? 9% -56.3% 34101 ? 14% vmstat.system.cs
37.37 -6.3% 35.01 ? 3% iostat.cpu.iowait
12.25 ? 4% +18.8% 14.55 ? 11% iostat.cpu.system
0.66 ? 8% +42.9% 0.95 ? 9% iostat.cpu.user
2931 ? 18% +55.5% 4558 ? 23% sched_debug.cpu.avg_idle.min
51959 ? 15% -55.0% 23376 ? 13% sched_debug.cpu.nr_switches.avg
591108 ? 22% -75.2% 146571 ? 30% sched_debug.cpu.nr_switches.max
90813 ? 17% -75.3% 22412 ? 18% sched_debug.cpu.nr_switches.stddev
401.00 ? 3% +18.0% 473.17 ? 10% turbostat.Avg_MHz
14.51 ? 3% +2.5 16.98 ? 10% turbostat.Busy%
6339607 ? 12% -75.7% 1541772 ? 22% turbostat.POLL
195.79 +5.1% 205.76 ? 2% turbostat.PkgWatt
45.88 +5.0% 48.19 turbostat.RAMWatt
39919870 ? 4% +30.5% 52080482 ? 3% numa-numastat.node0.local_node
93131711 ? 3% +22.7% 1.143e+08 ? 8% numa-numastat.node0.numa_foreign
39955233 ? 4% +30.5% 52138480 ? 3% numa-numastat.node0.numa_hit
1.284e+08 ? 2% +29.2% 1.658e+08 ? 4% numa-numastat.node1.local_node
1.284e+08 ? 2% +29.2% 1.658e+08 ? 4% numa-numastat.node1.numa_hit
93183098 ? 3% +22.6% 1.143e+08 ? 8% numa-numastat.node1.numa_miss
93196371 ? 3% +22.7% 1.143e+08 ? 8% numa-numastat.node1.other_node
37631959 ? 4% +31.9% 49655093 ? 3% numa-vmstat.node0.nr_dirtied
35783356 ? 4% +33.8% 47880969 ? 3% numa-vmstat.node0.nr_written
93131711 ? 3% +22.7% 1.143e+08 ? 8% numa-vmstat.node0.numa_foreign
39955168 ? 4% +30.5% 52138531 ? 3% numa-vmstat.node0.numa_hit
39919805 ? 4% +30.5% 52080533 ? 3% numa-vmstat.node0.numa_local
2.19e+08 +26.7% 2.776e+08 ? 4% numa-vmstat.node1.nr_dirtied
2.081e+08 +28.4% 2.673e+08 ? 4% numa-vmstat.node1.nr_written
1.284e+08 ? 2% +29.2% 1.658e+08 ? 4% numa-vmstat.node1.numa_hit
1.284e+08 ? 2% +29.2% 1.658e+08 ? 4% numa-vmstat.node1.numa_local
93183098 ? 3% +22.6% 1.143e+08 ? 8% numa-vmstat.node1.numa_miss
93196371 ? 3% +22.7% 1.143e+08 ? 8% numa-vmstat.node1.numa_other
2.567e+08 +27.5% 3.272e+08 ? 3% proc-vmstat.nr_dirtied
13024849 -3.1% 12619595 proc-vmstat.nr_dirty
28301145 -2.1% 27702967 proc-vmstat.nr_file_pages
47040923 +1.3% 47658720 proc-vmstat.nr_free_pages
27529221 -2.2% 26931136 proc-vmstat.nr_inactive_file
810128 -1.9% 795051 proc-vmstat.nr_slab_reclaimable
2.441e+08 +29.1% 3.151e+08 ? 4% proc-vmstat.nr_written
27529221 -2.2% 26931135 proc-vmstat.nr_zone_inactive_file
13024848 -3.1% 12621112 proc-vmstat.nr_zone_write_pending
93131711 ? 3% +22.7% 1.143e+08 ? 8% proc-vmstat.numa_foreign
251090 ? 3% +14.8% 288318 ? 6% proc-vmstat.numa_hint_faults
232531 ? 3% +14.7% 266672 ? 7% proc-vmstat.numa_hint_faults_local
1.684e+08 +29.5% 2.18e+08 ? 4% proc-vmstat.numa_hit
13951 ? 3% +15.9% 16170 ? 6% proc-vmstat.numa_huge_pte_updates
1.683e+08 +29.5% 2.179e+08 ? 4% proc-vmstat.numa_local
93183098 ? 3% +22.6% 1.143e+08 ? 8% proc-vmstat.numa_miss
93241897 ? 3% +22.6% 1.144e+08 ? 8% proc-vmstat.numa_other
7429923 ? 3% +16.1% 8622634 ? 6% proc-vmstat.numa_pte_updates
2.629e+08 +26.9% 3.336e+08 ? 3% proc-vmstat.pgalloc_normal
975147 +5.0% 1024241 proc-vmstat.pgfault
2.367e+08 +29.9% 3.075e+08 ? 4% proc-vmstat.pgfree
10020 ? 28% +1.3e+07% 1.26e+09 ? 4% proc-vmstat.pgpgout
93.33 ? 20% +1.3e+06% 1198216 ? 10% proc-vmstat.pgrotated
21.90 -9.9% 19.74 perf-stat.i.MPKI
3.348e+09 +21.5% 4.068e+09 ? 5% perf-stat.i.branch-instructions
0.36 -0.0 0.35 perf-stat.i.branch-miss-rate%
12061197 +12.4% 13558900 ? 4% perf-stat.i.branch-misses
74.33 +5.2 79.58 perf-stat.i.cache-miss-rate%
2.787e+08 +18.8% 3.312e+08 ? 4% perf-stat.i.cache-misses
3.866e+08 +8.5% 4.194e+08 ? 5% perf-stat.i.cache-references
79217 ? 9% -56.7% 34320 ? 14% perf-stat.i.context-switches
3.787e+10 ? 3% +18.1% 4.472e+10 ? 10% perf-stat.i.cpu-cycles
211.78 +23.5% 261.65 ? 8% perf-stat.i.cpu-migrations
4.812e+09 +22.2% 5.879e+09 ? 4% perf-stat.i.dTLB-loads
2.91e+09 +21.9% 3.547e+09 ? 4% perf-stat.i.dTLB-stores
1.77e+10 +22.5% 2.168e+10 ? 4% perf-stat.i.instructions
15015 +24.9% 18748 ? 4% perf-stat.i.instructions-per-iTLB-miss
0.39 ? 3% +18.1% 0.47 ? 10% perf-stat.i.metric.GHz
869.18 +25.6% 1091 ? 6% perf-stat.i.metric.K/sec
119.36 +21.4% 144.94 ? 4% perf-stat.i.metric.M/sec
3926 +6.2% 4169 ? 2% perf-stat.i.minor-faults
35.58 -6.1 29.50 ? 13% perf-stat.i.node-load-miss-rate%
27642154 +46.7% 40562253 ? 12% perf-stat.i.node-loads
47.43 -6.3 41.18 ? 7% perf-stat.i.node-store-miss-rate%
19442397 +36.2% 26480148 ? 9% perf-stat.i.node-stores
3927 +6.2% 4170 ? 2% perf-stat.i.page-faults
21.85 -11.5% 19.35 perf-stat.overall.MPKI
0.36 -0.0 0.33 perf-stat.overall.branch-miss-rate%
72.09 +6.9 78.95 perf-stat.overall.cache-miss-rate%
15115 ? 2% +24.8% 18861 ? 4% perf-stat.overall.instructions-per-iTLB-miss
36.17 -6.9 29.29 ? 12% perf-stat.overall.node-load-miss-rate%
48.39 -6.8 41.56 ? 7% perf-stat.overall.node-store-miss-rate%
7111992 -4.1% 6818682 perf-stat.overall.path-length
3.334e+09 +21.5% 4.051e+09 ? 5% perf-stat.ps.branch-instructions
12009193 +12.4% 13497817 ? 4% perf-stat.ps.branch-misses
2.776e+08 +18.8% 3.297e+08 ? 4% perf-stat.ps.cache-misses
3.85e+08 +8.5% 4.177e+08 ? 5% perf-stat.ps.cache-references
78798 ? 9% -56.6% 34220 ? 14% perf-stat.ps.context-switches
3.782e+10 ? 3% +18.2% 4.47e+10 ? 10% perf-stat.ps.cpu-cycles
211.41 +23.5% 261.13 ? 8% perf-stat.ps.cpu-migrations
4.791e+09 +22.2% 5.853e+09 ? 4% perf-stat.ps.dTLB-loads
2.897e+09 +21.9% 3.531e+09 ? 4% perf-stat.ps.dTLB-stores
1.762e+10 +22.5% 2.159e+10 ? 4% perf-stat.ps.instructions
3921 +6.2% 4164 ? 2% perf-stat.ps.minor-faults
27527246 +46.7% 40382491 ? 12% perf-stat.ps.node-loads
19376729 +36.1% 26378339 ? 9% perf-stat.ps.node-stores
3922 +6.2% 4165 ? 2% perf-stat.ps.page-faults
3.569e+12 +22.4% 4.367e+12 ? 4% perf-stat.total.instructions
19.48 ? 13% -19.5 0.00 perf-profile.calltrace.cycles-pp.bdev_write_page.__mpage_writepage.write_cache_pages.mpage_writepages.do_writepages
16.17 ? 13% -16.2 0.00 perf-profile.calltrace.cycles-pp.pmem_rw_page.bdev_write_page.__mpage_writepage.write_cache_pages.mpage_writepages
8.50 ? 14% -8.5 0.00 perf-profile.calltrace.cycles-pp.folio_end_writeback.pmem_rw_page.bdev_write_page.__mpage_writepage.write_cache_pages
7.60 ? 18% -7.6 0.00 perf-profile.calltrace.cycles-pp.pmem_do_write.pmem_rw_page.bdev_write_page.__mpage_writepage.write_cache_pages
7.52 ? 18% -7.5 0.00 perf-profile.calltrace.cycles-pp.__memcpy_flushcache.pmem_do_write.pmem_rw_page.bdev_write_page.__mpage_writepage
5.79 ? 19% -5.8 0.00 perf-profile.calltrace.cycles-pp.__folio_end_writeback.folio_end_writeback.pmem_rw_page.bdev_write_page.__mpage_writepage
4.38 ? 13% -1.1 3.32 ? 9% perf-profile.calltrace.cycles-pp.balance_dirty_pages_ratelimited_flags.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.io_write
0.76 ? 12% +0.2 0.98 ? 8% perf-profile.calltrace.cycles-pp.__mod_lruvec_page_state.folio_account_dirtied.__folio_mark_dirty.mark_buffer_dirty.__block_commit_write
1.34 ? 14% +0.3 1.67 ? 8% perf-profile.calltrace.cycles-pp.fault_in_readable.fault_in_iov_iter_readable.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
0.27 ?100% +0.4 0.64 ? 9% perf-profile.calltrace.cycles-pp.__mod_memcg_lruvec_state.__mod_lruvec_page_state.folio_account_dirtied.__folio_mark_dirty.mark_buffer_dirty
0.39 ? 71% +0.5 0.88 ? 14% perf-profile.calltrace.cycles-pp.free_buffer_head.try_to_free_buffers.mapping_evict_folio.invalidate_mapping_pagevec.generic_fadvise
0.50 ? 46% +0.5 0.99 ? 11% perf-profile.calltrace.cycles-pp.drop_buffers.try_to_free_buffers.mapping_evict_folio.invalidate_mapping_pagevec.generic_fadvise
0.39 ? 71% +0.5 0.88 ? 17% perf-profile.calltrace.cycles-pp.find_lock_entries.invalidate_mapping_pagevec.generic_fadvise.ksys_fadvise64_64.__x64_sys_fadvise64
0.26 ?100% +0.5 0.79 ? 17% perf-profile.calltrace.cycles-pp.__free_one_page.free_pcppages_bulk.free_unref_page_list.release_pages.__pagevec_release
0.09 ?223% +0.6 0.69 ? 14% perf-profile.calltrace.cycles-pp.__mod_lruvec_page_state.filemap_unaccount_folio.__filemap_remove_folio.__remove_mapping.remove_mapping
0.10 ?223% +0.6 0.72 ? 15% perf-profile.calltrace.cycles-pp.filemap_unaccount_folio.__filemap_remove_folio.__remove_mapping.remove_mapping.invalidate_mapping_pagevec
0.46 ? 71% +0.6 1.09 ? 18% perf-profile.calltrace.cycles-pp.free_pcppages_bulk.free_unref_page_list.release_pages.__pagevec_release.invalidate_mapping_pagevec
0.78 ? 46% +0.7 1.44 ? 13% perf-profile.calltrace.cycles-pp.__filemap_remove_folio.__remove_mapping.remove_mapping.invalidate_mapping_pagevec.generic_fadvise
0.76 ? 46% +0.7 1.46 ? 17% perf-profile.calltrace.cycles-pp.free_unref_page_list.release_pages.__pagevec_release.invalidate_mapping_pagevec.generic_fadvise
2.19 ? 12% +0.8 3.02 ? 13% perf-profile.calltrace.cycles-pp.get_io_u
1.01 ? 46% +0.8 1.84 ? 13% perf-profile.calltrace.cycles-pp.__remove_mapping.remove_mapping.invalidate_mapping_pagevec.generic_fadvise.ksys_fadvise64_64
1.03 ? 46% +0.8 1.86 ? 13% perf-profile.calltrace.cycles-pp.remove_mapping.invalidate_mapping_pagevec.generic_fadvise.ksys_fadvise64_64.__x64_sys_fadvise64
1.16 ? 25% +0.9 2.02 ? 12% perf-profile.calltrace.cycles-pp.try_to_free_buffers.mapping_evict_folio.invalidate_mapping_pagevec.generic_fadvise.ksys_fadvise64_64
1.19 ? 25% +0.9 2.06 ? 12% perf-profile.calltrace.cycles-pp.mapping_evict_folio.invalidate_mapping_pagevec.generic_fadvise.ksys_fadvise64_64.__x64_sys_fadvise64
0.16 ?223% +1.1 1.22 ? 52% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.release_pages.__pagevec_release.invalidate_mapping_pagevec
0.16 ?223% +1.1 1.23 ? 52% perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.release_pages.__pagevec_release.invalidate_mapping_pagevec.generic_fadvise
1.66 ? 29% +1.2 2.87 ? 17% perf-profile.calltrace.cycles-pp.rmqueue_bulk.rmqueue.get_page_from_freelist.__alloc_pages.folio_alloc
0.89 ? 61% +1.2 2.11 ? 24% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.rmqueue_bulk.rmqueue.get_page_from_freelist
2.03 ? 25% +1.2 3.26 ? 15% perf-profile.calltrace.cycles-pp.rmqueue.get_page_from_freelist.__alloc_pages.folio_alloc.__filemap_get_folio
0.90 ? 61% +1.2 2.13 ? 23% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.rmqueue_bulk.rmqueue.get_page_from_freelist.__alloc_pages
3.04 ? 19% +1.3 4.33 ? 11% perf-profile.calltrace.cycles-pp.folio_alloc.__filemap_get_folio.pagecache_get_page.block_write_begin.ext2_write_begin
2.92 ? 20% +1.3 4.22 ? 11% perf-profile.calltrace.cycles-pp.__alloc_pages.folio_alloc.__filemap_get_folio.pagecache_get_page.block_write_begin
2.73 ? 21% +1.3 4.03 ? 12% perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages.folio_alloc.__filemap_get_folio.pagecache_get_page
0.00 +1.4 1.38 ? 24% perf-profile.calltrace.cycles-pp.__folio_start_writeback.__mpage_writepage.write_cache_pages.mpage_writepages.do_writepages
7.60 ? 16% +1.9 9.53 ? 9% perf-profile.calltrace.cycles-pp.pagecache_get_page.block_write_begin.ext2_write_begin.generic_perform_write.__generic_file_write_iter
7.57 ? 16% +1.9 9.50 ? 10% perf-profile.calltrace.cycles-pp.__filemap_get_folio.pagecache_get_page.block_write_begin.ext2_write_begin.generic_perform_write
1.89 ? 36% +2.1 3.94 ? 22% perf-profile.calltrace.cycles-pp.release_pages.__pagevec_release.invalidate_mapping_pagevec.generic_fadvise.ksys_fadvise64_64
1.90 ? 36% +2.1 3.95 ? 22% perf-profile.calltrace.cycles-pp.__pagevec_release.invalidate_mapping_pagevec.generic_fadvise.ksys_fadvise64_64.__x64_sys_fadvise64
0.00 +2.2 2.20 ? 17% perf-profile.calltrace.cycles-pp.__folio_end_writeback.folio_end_writeback.mpage_end_io.__submit_bio.__submit_bio_noacct
0.00 +2.5 2.50 ? 18% perf-profile.calltrace.cycles-pp.folio_end_writeback.mpage_end_io.__submit_bio.__submit_bio_noacct.__mpage_writepage
0.00 +2.9 2.94 ? 18% perf-profile.calltrace.cycles-pp.mpage_end_io.__submit_bio.__submit_bio_noacct.__mpage_writepage.write_cache_pages
4.79 ? 30% +4.2 8.94 ? 16% perf-profile.calltrace.cycles-pp.invalidate_mapping_pagevec.generic_fadvise.ksys_fadvise64_64.__x64_sys_fadvise64.do_syscall_64
0.00 +9.7 9.74 ? 13% perf-profile.calltrace.cycles-pp.__memcpy_flushcache.pmem_do_write.pmem_submit_bio.__submit_bio.__submit_bio_noacct
0.00 +9.8 9.84 ? 13% perf-profile.calltrace.cycles-pp.pmem_do_write.pmem_submit_bio.__submit_bio.__submit_bio_noacct.__mpage_writepage
0.00 +9.9 9.94 ? 13% perf-profile.calltrace.cycles-pp.pmem_submit_bio.__submit_bio.__submit_bio_noacct.__mpage_writepage.write_cache_pages
0.00 +12.9 12.90 ? 14% perf-profile.calltrace.cycles-pp.__submit_bio_noacct.__mpage_writepage.write_cache_pages.mpage_writepages.do_writepages
0.00 +12.9 12.90 ? 14% perf-profile.calltrace.cycles-pp.__submit_bio.__submit_bio_noacct.__mpage_writepage.write_cache_pages.mpage_writepages
19.50 ? 13% -19.5 0.00 perf-profile.children.cycles-pp.bdev_write_page
16.17 ? 13% -16.2 0.00 perf-profile.children.cycles-pp.pmem_rw_page
8.51 ? 14% -6.0 2.51 ? 17% perf-profile.children.cycles-pp.folio_end_writeback
5.82 ? 19% -3.6 2.24 ? 18% perf-profile.children.cycles-pp.__folio_end_writeback
2.97 ? 26% -1.4 1.55 ? 18% perf-profile.children.cycles-pp.__folio_start_writeback
4.40 ? 13% -1.1 3.33 ? 9% perf-profile.children.cycles-pp.balance_dirty_pages_ratelimited_flags
0.31 ? 19% -0.3 0.04 ? 72% perf-profile.children.cycles-pp.fprop_reflect_period_percpu
0.42 ? 19% -0.2 0.22 ? 23% perf-profile.children.cycles-pp.__fprop_add_percpu
0.32 ? 23% -0.2 0.15 ? 17% perf-profile.children.cycles-pp.__list_add_valid
0.41 ? 16% -0.1 0.32 ? 9% perf-profile.children.cycles-pp.__irq_exit_rcu
0.30 ? 12% -0.1 0.22 ? 20% perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
0.06 ? 49% +0.0 0.10 ? 16% perf-profile.children.cycles-pp.uncharge_folio
0.10 ? 14% +0.0 0.15 ? 20% perf-profile.children.cycles-pp.xas_find_marked
0.05 ? 71% +0.1 0.10 ? 18% perf-profile.children.cycles-pp.xas_find
0.00 +0.1 0.06 ? 17% perf-profile.children.cycles-pp.folio_mapping
0.13 ? 19% +0.1 0.20 ? 7% perf-profile.children.cycles-pp.node_page_state
0.00 +0.1 0.07 ? 16% perf-profile.children.cycles-pp.__bio_try_merge_page
0.08 ? 51% +0.1 0.16 ? 17% perf-profile.children.cycles-pp.page_counter_uncharge
0.16 ? 27% +0.1 0.25 ? 9% perf-profile.children.cycles-pp.__slab_free
0.37 ? 17% +0.1 0.46 ? 12% perf-profile.children.cycles-pp.__count_memcg_events
0.12 ? 53% +0.1 0.22 ? 16% perf-profile.children.cycles-pp.uncharge_batch
0.00 +0.1 0.12 ? 17% perf-profile.children.cycles-pp.bio_add_page
0.24 ? 15% +0.1 0.36 ? 18% perf-profile.children.cycles-pp.__xa_clear_mark
0.20 ? 36% +0.1 0.34 ? 15% perf-profile.children.cycles-pp.__mem_cgroup_uncharge_list
0.53 ? 8% +0.2 0.72 ? 8% perf-profile.children.cycles-pp.__mod_node_page_state
0.63 ? 9% +0.2 0.85 ? 8% perf-profile.children.cycles-pp.__mod_lruvec_state
0.33 ? 25% +0.2 0.57 ? 12% perf-profile.children.cycles-pp.kmem_cache_free
0.64 ? 18% +0.3 0.90 ? 21% perf-profile.children.cycles-pp.filemap_get_folios_tag
1.36 ? 15% +0.3 1.69 ? 7% perf-profile.children.cycles-pp.fault_in_readable
0.46 ? 24% +0.3 0.81 ? 17% perf-profile.children.cycles-pp.__free_one_page
0.38 ? 36% +0.3 0.73 ? 14% perf-profile.children.cycles-pp.filemap_unaccount_folio
0.51 ? 27% +0.4 0.88 ? 14% perf-profile.children.cycles-pp.free_buffer_head
0.50 ? 27% +0.4 0.89 ? 17% perf-profile.children.cycles-pp.find_lock_entries
0.01 ?223% +0.4 0.40 ? 23% perf-profile.children.cycles-pp.page_endio
0.55 ? 25% +0.4 1.00 ? 11% perf-profile.children.cycles-pp.drop_buffers
1.53 ? 14% +0.5 1.99 ? 9% perf-profile.children.cycles-pp.cgroup_rstat_updated
0.59 ? 27% +0.5 1.11 ? 18% perf-profile.children.cycles-pp.free_pcppages_bulk
1.98 ? 14% +0.6 2.56 ? 9% perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
0.84 ? 28% +0.6 1.45 ? 13% perf-profile.children.cycles-pp.__filemap_remove_folio
1.52 ? 23% +0.6 2.14 ? 21% perf-profile.children.cycles-pp.folio_batch_move_lru
0.83 ? 27% +0.6 1.47 ? 17% perf-profile.children.cycles-pp.free_unref_page_list
1.10 ? 27% +0.8 1.85 ? 13% perf-profile.children.cycles-pp.__remove_mapping
1.11 ? 27% +0.8 1.87 ? 13% perf-profile.children.cycles-pp.remove_mapping
2.20 ? 12% +0.8 3.03 ? 13% perf-profile.children.cycles-pp.get_io_u
1.16 ? 26% +0.9 2.02 ? 12% perf-profile.children.cycles-pp.try_to_free_buffers
1.19 ? 25% +0.9 2.06 ? 12% perf-profile.children.cycles-pp.mapping_evict_folio
2.87 ? 12% +0.9 3.77 ? 9% perf-profile.children.cycles-pp.__mod_lruvec_page_state
1.69 ? 28% +1.2 2.90 ? 17% perf-profile.children.cycles-pp.rmqueue_bulk
2.08 ? 24% +1.2 3.30 ? 15% perf-profile.children.cycles-pp.rmqueue
2.98 ? 20% +1.3 4.27 ? 11% perf-profile.children.cycles-pp.__alloc_pages
2.78 ? 21% +1.3 4.08 ? 12% perf-profile.children.cycles-pp.get_page_from_freelist
3.04 ? 19% +1.3 4.34 ? 11% perf-profile.children.cycles-pp.folio_alloc
0.92 ? 63% +1.4 2.30 ? 46% perf-profile.children.cycles-pp.folio_lruvec_lock_irqsave
7.62 ? 16% +1.9 9.56 ? 9% perf-profile.children.cycles-pp.__filemap_get_folio
7.64 ? 16% +1.9 9.58 ? 9% perf-profile.children.cycles-pp.pagecache_get_page
1.99 ? 34% +2.1 4.06 ? 22% perf-profile.children.cycles-pp.__pagevec_release
2.09 ? 32% +2.1 4.17 ? 21% perf-profile.children.cycles-pp.release_pages
7.53 ? 18% +2.2 9.77 ? 13% perf-profile.children.cycles-pp.__memcpy_flushcache
7.60 ? 18% +2.3 9.86 ? 13% perf-profile.children.cycles-pp.pmem_do_write
0.00 +2.9 2.94 ? 18% perf-profile.children.cycles-pp.mpage_end_io
4.79 ? 30% +4.2 8.94 ? 16% perf-profile.children.cycles-pp.invalidate_mapping_pagevec
0.00 +10.0 9.96 ? 13% perf-profile.children.cycles-pp.pmem_submit_bio
0.00 +12.9 12.92 ? 14% perf-profile.children.cycles-pp.__submit_bio_noacct
0.00 +12.9 12.92 ? 14% perf-profile.children.cycles-pp.__submit_bio
2.63 ? 12% -2.4 0.26 ? 17% perf-profile.self.cycles-pp.folio_end_writeback
3.43 ? 6% -1.4 2.01 ? 12% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
1.53 ? 24% -1.1 0.42 ? 11% perf-profile.self.cycles-pp.balance_dirty_pages_ratelimited_flags
1.34 ? 17% -0.9 0.41 ? 19% perf-profile.self.cycles-pp.__folio_end_writeback
0.81 ? 13% -0.5 0.35 ? 19% perf-profile.self.cycles-pp.__folio_start_writeback
0.30 ? 19% -0.3 0.03 ?102% perf-profile.self.cycles-pp.fprop_reflect_period_percpu
0.30 ? 25% -0.2 0.14 ? 17% perf-profile.self.cycles-pp.__list_add_valid
0.24 ? 18% -0.1 0.13 ? 6% perf-profile.self.cycles-pp.folio_account_dirtied
0.12 ? 13% -0.0 0.09 ? 15% perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
0.11 ? 14% +0.0 0.14 ? 11% perf-profile.self.cycles-pp.__mod_zone_page_state
0.06 ? 46% +0.0 0.10 ? 12% perf-profile.self.cycles-pp.free_unref_page_commit
0.06 ? 46% +0.0 0.09 ? 15% perf-profile.self.cycles-pp.__filemap_remove_folio
0.06 ? 50% +0.0 0.10 ? 18% perf-profile.self.cycles-pp.uncharge_folio
0.01 ?223% +0.1 0.06 ? 15% perf-profile.self.cycles-pp.try_to_free_buffers
0.10 ? 18% +0.1 0.15 ? 21% perf-profile.self.cycles-pp.xas_find_marked
0.12 ? 31% +0.1 0.18 ? 16% perf-profile.self.cycles-pp.free_unref_page_list
0.12 ? 19% +0.1 0.18 ? 6% perf-profile.self.cycles-pp.node_page_state
0.00 +0.1 0.07 ? 13% perf-profile.self.cycles-pp.__bio_try_merge_page
0.10 ? 45% +0.1 0.16 ? 13% perf-profile.self.cycles-pp.__remove_mapping
0.07 ? 51% +0.1 0.14 ? 20% perf-profile.self.cycles-pp.page_counter_uncharge
0.16 ? 27% +0.1 0.24 ? 11% perf-profile.self.cycles-pp.__slab_free
0.00 +0.1 0.10 ? 27% perf-profile.self.cycles-pp.pmem_submit_bio
0.16 ? 28% +0.1 0.29 ? 10% perf-profile.self.cycles-pp.kmem_cache_free
0.38 ? 13% +0.2 0.54 ? 12% perf-profile.self.cycles-pp.release_pages
0.52 ? 9% +0.2 0.70 ? 8% perf-profile.self.cycles-pp.__mod_node_page_state
0.77 ? 13% +0.2 0.98 ? 9% perf-profile.self.cycles-pp.__mod_memcg_lruvec_state
0.53 ? 18% +0.2 0.74 ? 21% perf-profile.self.cycles-pp.filemap_get_folios_tag
0.73 ? 14% +0.3 1.00 ? 11% perf-profile.self.cycles-pp.__mod_lruvec_page_state
0.41 ? 24% +0.3 0.72 ? 17% perf-profile.self.cycles-pp.__free_one_page
1.33 ? 14% +0.3 1.66 ? 7% perf-profile.self.cycles-pp.fault_in_readable
0.45 ? 26% +0.3 0.79 ? 17% perf-profile.self.cycles-pp.find_lock_entries
0.92 ? 16% +0.4 1.28 ? 13% perf-profile.self.cycles-pp.cgroup_rstat_updated
0.94 ? 11% +0.4 1.31 ? 11% perf-profile.self.cycles-pp.xas_load
0.01 ?223% +0.4 0.39 ? 24% perf-profile.self.cycles-pp.page_endio
0.54 ? 25% +0.4 0.99 ? 11% perf-profile.self.cycles-pp.drop_buffers
0.50 ? 19% +0.5 1.02 ? 19% perf-profile.self.cycles-pp.__mpage_writepage
2.18 ? 12% +0.8 3.00 ? 14% perf-profile.self.cycles-pp.get_io_u
7.47 ? 18% +2.2 9.66 ? 13% perf-profile.self.cycles-pp.__memcpy_flushcache
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# if you come across any failure that blocks the test,
# please remove the ~/.lkp and /lkp directories to run from a clean state.
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests
I think this simply accounts for the I/Os that were previously
skipped when using the bdev_read/write path.
On (23/03/15 08:58), Christoph Hellwig wrote:
>
> I think this simply accounts for the I/Os that were previously
> skipped when using the bdev_read/write path.
Oh, that would explain it. Otherwise I was slightly surprised (in a good
way) and puzzled.
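The accounting difference discussed above can be sketched as a toy model. This is pure illustration, not kernel code; the class and counter names are invented. The point it demonstrates: the old bdev_write_page() path called the driver's ->rw_page() directly and never built a bio, so block-layer write accounting (the source of counters like vmstat.io.bo and pgpgout) never saw those writes, whereas routing every write through bio submission makes them all visible.

```python
# Toy model of the accounting change: writes submitted through the bio
# path are counted; the removed rw_page fast path bypassed the counter.

class BlockLayer:
    def __init__(self):
        self.sectors_written = 0           # stands in for vmstat.io.bo / pgpgout

    def submit_bio(self, driver, page):
        self.sectors_written += 8          # one 4 KiB page = 8 sectors, accounted here
        driver.do_write(page)

class PmemDriver:
    def __init__(self):
        self.pages = 0

    def do_write(self, page):
        self.pages += 1                    # the data still reaches the device

    def rw_page(self, page):
        self.do_write(page)                # old fast path: no block-layer accounting

def writeback(blk, drv, npages, use_fast_path):
    for p in range(npages):
        if use_fast_path:
            drv.rw_page(p)                 # before the commit
        else:
            blk.submit_bio(drv, p)         # after the commit

blk_old, drv_old = BlockLayer(), PmemDriver()
writeback(blk_old, drv_old, 1000, use_fast_path=True)

blk_new, drv_new = BlockLayer(), PmemDriver()
writeback(blk_new, drv_new, 1000, use_fast_path=False)

print(blk_old.sectors_written)             # 0: writes invisible to accounting
print(blk_new.sectors_written)             # 8000: every write is counted
```

In both runs the driver writes the same 1000 pages; only the accounted totals differ, which matches the report's vmstat.io.bo and proc-vmstat.pgpgout jumping from near zero to large values while the actual I/O throughput improved for unrelated reasons.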