Greetings,
FYI, we noticed a -10.2% regression of phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s due to commit:
commit: 584b0180f0f4d67d7145950fe68c625f06c88b10 ("io_uring: move read/write file prep state into actual opcode handler")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: phoronix-test-suite
on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 512G memory
with following parameters:
test: fio-1.14.1
option_a: Sequential Write
option_b: IO_uring
option_c: Yes
option_d: Yes
option_e: 1MB
option_f: Default Test Directory
cpufreq_governor: performance
ucode: 0x500320a
test-description: The Phoronix Test Suite is the most comprehensive testing and benchmarking platform available, providing an extensible framework to which new tests can be easily added.
test-url: http://www.phoronix-test-suite.com/
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <[email protected]>
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# If you come across any failure that blocks the test,
# please remove the ~/.lkp and /lkp dirs to run from a clean state.
=========================================================================================
compiler/cpufreq_governor/kconfig/option_a/option_b/option_c/option_d/option_e/option_f/rootfs/tbox_group/test/testcase/ucode:
gcc-11/performance/x86_64-rhel-8.3/Sequential Write/IO_uring/Yes/Yes/1MB/Default Test Directory/debian-x86_64-phoronix/lkp-csl-2sp7/fio-1.14.1/phoronix-test-suite/0x500320a
commit:
a3e4bc23d5 ("io_uring: defer splice/tee file validity check until command issue")
584b0180f0 ("io_uring: move read/write file prep state into actual opcode handler")
a3e4bc23d5470b2b 584b0180f0f4d67d7145950fe68
---------------- ---------------------------
%stddev %change %stddev
\ | \
1081 -10.2% 971.00 phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.iops
1084 -10.2% 974.67 phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s
118.42 +132.0% 274.70 ± 55% phoronix-test-suite.time.elapsed_time
118.42 +132.0% 274.70 ± 55% phoronix-test-suite.time.elapsed_time.max
1317 ± 19% +921.5% 13458 ± 53% phoronix-test-suite.time.involuntary_context_switches
185595 +23.8% 229715 ± 17% phoronix-test-suite.time.minor_page_faults
68.33 +2031.5% 1456 ± 3% phoronix-test-suite.time.percent_of_cpu_this_job_got
58.62 +6771.2% 4028 ± 58% phoronix-test-suite.time.system_time
244.97 +10.1% 269.72 pmeter.Average_Active_Power
1655992 ± 78% +1356.2% 24114501 ± 50% numa-numastat.node1.local_node
1662758 ± 78% +1374.2% 24512157 ± 50% numa-numastat.node1.numa_hit
958669 +10.8% 1062574 ± 7% meminfo.Active
843569 +12.0% 945049 ± 8% meminfo.Active(anon)
61229 ± 3% -60.2% 24352 ± 21% meminfo.Writeback
96.73 -14.0 82.71 mpstat.cpu.all.idle%
1.75 ± 15% -0.3 1.43 ± 16% mpstat.cpu.all.irq%
0.80 +14.4 15.19 ± 3% mpstat.cpu.all.sys%
0.33 -0.0 0.28 ± 11% mpstat.cpu.all.usr%
258678 ± 7% -37.0% 162850 ± 17% numa-meminfo.node0.Dirty
56023 ± 12% -85.5% 8096 ± 50% numa-meminfo.node0.Writeback
2978 ± 44% +350.6% 13421 ± 82% numa-meminfo.node1.Active(anon)
1.83 ± 73% +8.1e+06% 148245 ± 22% numa-meminfo.node1.Dirty
96.33 -14.4% 82.50 vmstat.cpu.id
746.67 ± 2% -40.8% 441.83 ± 53% vmstat.io.bi
1.00 +1350.0% 14.50 ± 15% vmstat.procs.r
187622 ± 3% +2.6% 192531 vmstat.system.in
79.33 ± 9% +490.3% 468.33 ± 3% turbostat.Avg_MHz
4.77 ± 15% +13.3 18.11 ± 5% turbostat.Busy%
1688 ± 7% +53.6% 2593 ± 2% turbostat.Bzy_MHz
13881106 ± 34% +181.8% 39121066 ± 64% turbostat.C1E
22867090 ± 3% +134.6% 53644530 ± 54% turbostat.IRQ
49.00 +5.4% 51.67 ± 2% turbostat.PkgTmp
120.78 +16.5% 140.68 turbostat.PkgWatt
64647 ± 7% -36.9% 40763 ± 17% numa-vmstat.node0.nr_dirty
13891 ± 12% -85.1% 2064 ± 50% numa-vmstat.node0.nr_writeback
78537 ± 6% -45.5% 42771 ± 17% numa-vmstat.node0.nr_zone_write_pending
744.50 ± 44% +350.7% 3355 ± 82% numa-vmstat.node1.nr_active_anon
6.00 ± 52% +3.6e+08% 21761586 ± 53% numa-vmstat.node1.nr_dirtied
0.00 +3.7e+106% 37065 ± 22% numa-vmstat.node1.nr_dirty
5.50 ± 48% +3.9e+08% 21697670 ± 53% numa-vmstat.node1.nr_written
744.50 ± 44% +350.7% 3355 ± 82% numa-vmstat.node1.nr_zone_active_anon
0.00 +4e+106% 40201 ± 22% numa-vmstat.node1.nr_zone_write_pending
1662586 ± 78% +1374.3% 24512265 ± 50% numa-vmstat.node1.numa_hit
1655820 ± 78% +1356.4% 24114609 ± 50% numa-vmstat.node1.numa_local
17009 ± 36% +3730.9% 651618 ± 64% sched_debug.cfs_rq:/.min_vruntime.avg
33445 ± 30% +2211.0% 772918 ± 59% sched_debug.cfs_rq:/.min_vruntime.max
11326 ± 45% +4758.2% 550285 ± 70% sched_debug.cfs_rq:/.min_vruntime.min
3907 ± 21% +2044.3% 83779 ± 38% sched_debug.cfs_rq:/.min_vruntime.stddev
91.33 ± 33% +55.1% 141.65 ± 37% sched_debug.cfs_rq:/.runnable_avg.avg
7037 ± 70% +823.4% 64981 ± 67% sched_debug.cfs_rq:/.spread0.max
-15244 +934.3% -157672 sched_debug.cfs_rq:/.spread0.min
3930 ± 21% +2031.5% 83782 ± 38% sched_debug.cfs_rq:/.spread0.stddev
90.43 ± 34% +53.1% 138.43 ± 37% sched_debug.cfs_rq:/.util_avg.avg
135.84 ± 21% +529.3% 854.87 ± 67% sched_debug.cpu.curr->pid.avg
1000 ± 10% +376.5% 4767 ± 59% sched_debug.cpu.nr_switches.min
210892 +12.0% 236262 ± 8% proc-vmstat.nr_active_anon
18617 +7.7% 20053 proc-vmstat.nr_kernel_stack
53538 +3.3% 55319 proc-vmstat.nr_slab_unreclaimable
15260 ± 4% -60.1% 6094 ± 21% proc-vmstat.nr_writeback
210892 +12.0% 236262 ± 8% proc-vmstat.nr_zone_active_anon
9868 ± 31% +227.2% 32291 ± 60% proc-vmstat.numa_hint_faults
9609 ± 28% +156.6% 24657 ± 62% proc-vmstat.numa_hint_faults_local
416.00 ± 8% +259.4% 1495 ± 54% proc-vmstat.numa_huge_pte_updates
259.00 ±156% +40441.1% 105001 ± 46% proc-vmstat.numa_pages_migrated
230799 ± 7% +252.9% 814389 ± 53% proc-vmstat.numa_pte_updates
292996 +7.3% 314293 proc-vmstat.pgactivate
867707 +73.7% 1507465 ± 39% proc-vmstat.pgfault
259.00 ±156% +40441.1% 105001 ± 46% proc-vmstat.pgmigrate_success
311.33 ± 7% +2155.2% 7021 ± 65% proc-vmstat.pgrotated
30.29 ± 42% -73.8% 7.93 ±110% perf-stat.i.MPKI
7.08e+08 +283.6% 2.716e+09 perf-stat.i.branch-instructions
2.91 ± 44% -1.9 1.04 ± 83% perf-stat.i.branch-miss-rate%
34699379 -10.6% 31027580 ± 3% perf-stat.i.cache-misses
6.802e+09 ± 10% +554.9% 4.455e+10 ± 3% perf-stat.i.cpu-cycles
33.73 ± 25% +1103.3% 405.85 ± 5% perf-stat.i.cpu-migrations
1102 ± 33% +132.4% 2562 ± 22% perf-stat.i.cycles-between-cache-misses
0.22 ± 50% -0.2 0.07 ±137% perf-stat.i.dTLB-load-miss-rate%
9.67e+08 ± 4% +270.6% 3.584e+09 ± 2% perf-stat.i.dTLB-loads
4.994e+08 ± 2% -6.6% 4.664e+08 ± 2% perf-stat.i.dTLB-stores
3.503e+09 +284.6% 1.347e+10 perf-stat.i.instructions
3126 ± 5% +327.7% 13371 ± 7% perf-stat.i.instructions-per-iTLB-miss
0.52 ± 9% -38.4% 0.32 ± 8% perf-stat.i.ipc
70851 ± 10% +554.6% 463814 ± 3% perf-stat.i.metric.GHz
23543856 +202.1% 71118618 perf-stat.i.metric.M/sec
24.45 ± 5% +17.6 42.05 ± 5% perf-stat.i.node-load-miss-rate%
154141 ± 4% +1027.3% 1737599 ± 11% perf-stat.i.node-load-misses
6780078 -41.7% 3950788 ± 5% perf-stat.i.node-loads
19.06 ± 16% +32.5 51.53 ± 7% perf-stat.i.node-store-miss-rate%
54242 ± 17% +4415.1% 2449115 ± 15% perf-stat.i.node-store-misses
5188725 -41.8% 3018300 ± 12% perf-stat.i.node-stores
20.79 ± 28% -81.7% 3.80 ± 31% perf-stat.overall.MPKI
2.06 ± 28% -1.8 0.31 ± 39% perf-stat.overall.branch-miss-rate%
1.94 ± 10% +70.2% 3.31 ± 2% perf-stat.overall.cpi
195.90 ± 8% +632.9% 1435 perf-stat.overall.cycles-between-cache-misses
0.10 ± 48% -0.1 0.01 ±116% perf-stat.overall.dTLB-load-miss-rate%
3154 ± 6% +306.3% 12813 ± 8% perf-stat.overall.instructions-per-iTLB-miss
0.52 ± 11% -41.9% 0.30 ± 2% perf-stat.overall.ipc
2.23 ± 4% +28.3 30.55 ± 10% perf-stat.overall.node-load-miss-rate%
1.04 ± 17% +43.8 44.81 ± 15% perf-stat.overall.node-store-miss-rate%
7.02e+08 +284.8% 2.702e+09 perf-stat.ps.branch-instructions
34381260 -10.2% 30861100 ± 3% perf-stat.ps.cache-misses
6.746e+09 ± 10% +556.9% 4.431e+10 ± 3% perf-stat.ps.cpu-cycles
33.43 ± 25% +1107.4% 403.66 ± 5% perf-stat.ps.cpu-migrations
9.587e+08 ± 4% +271.8% 3.565e+09 ± 2% perf-stat.ps.dTLB-loads
4.951e+08 ± 2% -6.3% 4.64e+08 ± 2% perf-stat.ps.dTLB-stores
3.473e+09 +285.9% 1.34e+10 perf-stat.ps.instructions
152903 ± 4% +1030.3% 1728344 ± 11% perf-stat.ps.node-load-misses
6717519 -41.5% 3929544 ± 5% perf-stat.ps.node-loads
53840 ± 17% +4424.7% 2436140 ± 15% perf-stat.ps.node-store-misses
5140848 -41.6% 3001971 ± 12% perf-stat.ps.node-stores
4.129e+11 +801.9% 3.724e+12 ± 56% perf-stat.total.instructions
61.17 ± 3% -49.8 11.37 ± 16% perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
60.62 ± 3% -49.4 11.26 ± 16% perf-profile.calltrace.cycles-pp.cpu_startup_entry.secondary_startup_64_no_verify
60.58 ± 3% -49.3 11.26 ± 16% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.secondary_startup_64_no_verify
59.76 ± 3% -48.6 11.19 ± 16% perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.secondary_startup_64_no_verify
55.62 ± 2% -44.8 10.84 ± 17% perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.secondary_startup_64_no_verify
55.13 ± 3% -44.4 10.73 ± 17% perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
41.00 ± 4% -31.5 9.54 ± 20% perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
40.57 ± 3% -31.0 9.53 ± 20% perf-profile.calltrace.cycles-pp.mwait_idle_with_hints.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
13.53 ± 4% -12.4 1.10 ± 9% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
14.04 ± 21% -12.1 1.96 ± 4% perf-profile.calltrace.cycles-pp.generic_perform_write.ext4_buffered_write_iter.io_write.io_issue_sqe.io_wq_submit_work
13.80 ± 10% -12.1 1.73 ± 9% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork
10.98 ± 5% -10.0 0.98 ± 9% perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
11.01 ± 9% -9.6 1.41 ± 11% perf-profile.calltrace.cycles-pp.worker_thread.kthread.ret_from_fork
10.97 ± 9% -9.6 1.40 ± 11% perf-profile.calltrace.cycles-pp.process_one_work.worker_thread.kthread.ret_from_fork
10.78 ± 9% -9.4 1.38 ± 11% perf-profile.calltrace.cycles-pp.loop_process_work.process_one_work.worker_thread.kthread.ret_from_fork
10.60 ± 10% -9.2 1.35 ± 11% perf-profile.calltrace.cycles-pp.lo_write_simple.loop_process_work.process_one_work.worker_thread.kthread
10.32 ± 9% -9.0 1.31 ± 12% perf-profile.calltrace.cycles-pp.do_iter_write.lo_write_simple.loop_process_work.process_one_work.worker_thread
10.00 ± 9% -8.7 1.28 ± 12% perf-profile.calltrace.cycles-pp.do_iter_readv_writev.do_iter_write.lo_write_simple.loop_process_work.process_one_work
9.93 ± 9% -8.7 1.28 ± 12% perf-profile.calltrace.cycles-pp.generic_file_write_iter.do_iter_readv_writev.do_iter_write.lo_write_simple.loop_process_work
9.59 ± 9% -8.4 1.24 ± 13% perf-profile.calltrace.cycles-pp.__generic_file_write_iter.generic_file_write_iter.do_iter_readv_writev.do_iter_write.lo_write_simple
9.41 ± 9% -8.2 1.22 ± 13% perf-profile.calltrace.cycles-pp.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.do_iter_readv_writev.do_iter_write
9.02 ± 8% -8.1 0.95 ± 7% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
9.02 ± 8% -8.1 0.95 ± 7% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
8.21 ± 8% -7.3 0.88 ± 7% perf-profile.calltrace.cycles-pp.__x64_sys_fadvise64.do_syscall_64.entry_SYSCALL_64_after_hwframe
8.21 ± 8% -7.3 0.88 ± 7% perf-profile.calltrace.cycles-pp.ksys_fadvise64_64.__x64_sys_fadvise64.do_syscall_64.entry_SYSCALL_64_after_hwframe
8.21 ± 8% -7.3 0.88 ± 7% perf-profile.calltrace.cycles-pp.generic_fadvise.ksys_fadvise64_64.__x64_sys_fadvise64.do_syscall_64.entry_SYSCALL_64_after_hwframe
8.16 ± 9% -7.1 1.07 ± 14% perf-profile.calltrace.cycles-pp.copy_page_from_iter_atomic.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.do_iter_readv_writev
7.36 ± 6% -6.7 0.62 ± 9% perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
7.17 ± 7% -6.6 0.62 ± 9% perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
5.18 ± 21% -4.5 0.72 ± 6% perf-profile.calltrace.cycles-pp.copy_page_from_iter_atomic.generic_perform_write.ext4_buffered_write_iter.io_write.io_issue_sqe
5.08 ± 21% -4.4 0.70 ± 5% perf-profile.calltrace.cycles-pp.copyin.copy_page_from_iter_atomic.generic_perform_write.ext4_buffered_write_iter.io_write
5.05 ± 21% -4.4 0.70 ± 6% perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyin.copy_page_from_iter_atomic.generic_perform_write.ext4_buffered_write_iter
4.69 ± 8% -4.3 0.36 ± 70% perf-profile.calltrace.cycles-pp.invalidate_mapping_pagevec.generic_fadvise.ksys_fadvise64_64.__x64_sys_fadvise64.do_syscall_64
5.02 ± 21% -4.3 0.75 ± 4% perf-profile.calltrace.cycles-pp.ext4_da_write_begin.generic_perform_write.ext4_buffered_write_iter.io_write.io_issue_sqe
0.00 +0.6 0.61 ± 2% perf-profile.calltrace.cycles-pp.rwsem_spin_on_owner.rwsem_down_write_slowpath.ext4_buffered_write_iter.io_write.io_issue_sqe
0.00 +2.0 2.03 ± 2% perf-profile.calltrace.cycles-pp.rwsem_spin_on_owner.rwsem_optimistic_spin.rwsem_down_write_slowpath.ext4_buffered_write_iter.io_write
28.19 ± 7% +59.2 87.44 ± 2% perf-profile.calltrace.cycles-pp.ret_from_fork
14.40 ± 21% +71.3 85.71 ± 2% perf-profile.calltrace.cycles-pp.io_wqe_worker.ret_from_fork
14.25 ± 21% +71.4 85.64 ± 2% perf-profile.calltrace.cycles-pp.io_worker_handle_work.io_wqe_worker.ret_from_fork
14.23 ± 21% +71.4 85.63 ± 2% perf-profile.calltrace.cycles-pp.io_issue_sqe.io_wq_submit_work.io_worker_handle_work.io_wqe_worker.ret_from_fork
14.23 ± 21% +71.4 85.64 ± 2% perf-profile.calltrace.cycles-pp.io_wq_submit_work.io_worker_handle_work.io_wqe_worker.ret_from_fork
14.22 ± 21% +71.4 85.63 ± 2% perf-profile.calltrace.cycles-pp.io_write.io_issue_sqe.io_wq_submit_work.io_worker_handle_work.io_wqe_worker
14.10 ± 21% +71.5 85.62 ± 2% perf-profile.calltrace.cycles-pp.ext4_buffered_write_iter.io_write.io_issue_sqe.io_wq_submit_work.io_worker_handle_work
0.00 +80.9 80.92 ± 2% perf-profile.calltrace.cycles-pp.osq_lock.rwsem_optimistic_spin.rwsem_down_write_slowpath.ext4_buffered_write_iter.io_write
0.00 +83.0 82.99 ± 2% perf-profile.calltrace.cycles-pp.rwsem_optimistic_spin.rwsem_down_write_slowpath.ext4_buffered_write_iter.io_write.io_issue_sqe
0.00 +83.6 83.63 ± 2% perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.ext4_buffered_write_iter.io_write.io_issue_sqe.io_wq_submit_work
61.17 ± 3% -49.8 11.37 ± 16% perf-profile.children.cycles-pp.secondary_startup_64_no_verify
61.17 ± 3% -49.8 11.37 ± 16% perf-profile.children.cycles-pp.cpu_startup_entry
61.17 ± 3% -49.8 11.37 ± 16% perf-profile.children.cycles-pp.do_idle
60.34 ± 3% -49.0 11.30 ± 16% perf-profile.children.cycles-pp.cpuidle_idle_call
56.14 ± 3% -45.2 10.95 ± 17% perf-profile.children.cycles-pp.cpuidle_enter
56.08 ± 3% -45.1 10.94 ± 17% perf-profile.children.cycles-pp.cpuidle_enter_state
41.25 ± 4% -31.6 9.64 ± 20% perf-profile.children.cycles-pp.intel_idle
41.16 ± 4% -31.5 9.64 ± 20% perf-profile.children.cycles-pp.mwait_idle_with_hints
23.64 ± 10% -20.4 3.22 ± 5% perf-profile.children.cycles-pp.generic_perform_write
13.80 ± 10% -12.1 1.73 ± 9% perf-profile.children.cycles-pp.kthread
13.48 ± 8% -11.8 1.72 ± 9% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
13.41 ± 5% -11.6 1.80 ± 9% perf-profile.children.cycles-pp.copy_page_from_iter_atomic
11.71 ± 6% -10.2 1.55 ± 9% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
11.01 ± 9% -9.6 1.41 ± 11% perf-profile.children.cycles-pp.worker_thread
10.97 ± 9% -9.6 1.40 ± 11% perf-profile.children.cycles-pp.process_one_work
10.78 ± 9% -9.4 1.38 ± 11% perf-profile.children.cycles-pp.loop_process_work
10.61 ± 10% -9.3 1.35 ± 12% perf-profile.children.cycles-pp.lo_write_simple
10.19 ± 7% -9.1 1.12 ± 5% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
10.18 ± 7% -9.1 1.12 ± 5% perf-profile.children.cycles-pp.do_syscall_64
10.32 ± 9% -9.0 1.32 ± 12% perf-profile.children.cycles-pp.do_iter_write
10.08 ± 9% -8.8 1.32 ± 12% perf-profile.children.cycles-pp.generic_file_write_iter
10.00 ± 9% -8.7 1.28 ± 12% perf-profile.children.cycles-pp.do_iter_readv_writev
9.75 ± 9% -8.5 1.28 ± 13% perf-profile.children.cycles-pp.__generic_file_write_iter
8.21 ± 8% -7.3 0.88 ± 7% perf-profile.children.cycles-pp.__x64_sys_fadvise64
8.21 ± 8% -7.3 0.88 ± 7% perf-profile.children.cycles-pp.ksys_fadvise64_64
8.21 ± 8% -7.3 0.88 ± 7% perf-profile.children.cycles-pp.generic_fadvise
7.93 ± 6% -6.8 1.16 ± 10% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
7.76 ± 7% -6.6 1.15 ± 10% perf-profile.children.cycles-pp.hrtimer_interrupt
5.14 ± 21% -4.4 0.72 ± 5% perf-profile.children.cycles-pp.copyin
5.13 ± 21% -4.4 0.72 ± 5% perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
5.02 ± 21% -4.3 0.75 ± 4% perf-profile.children.cycles-pp.ext4_da_write_begin
4.70 ± 8% -4.2 0.52 ± 9% perf-profile.children.cycles-pp.invalidate_mapping_pagevec
4.60 ± 10% -3.8 0.80 ± 15% perf-profile.children.cycles-pp.__hrtimer_run_queues
3.98 ± 9% -3.5 0.44 ± 7% perf-profile.children.cycles-pp.__softirqentry_text_start
3.54 ± 13% -3.2 0.30 ± 13% perf-profile.children.cycles-pp.menu_select
3.71 ± 15% -3.2 0.54 ± 3% perf-profile.children.cycles-pp.pagecache_get_page
3.51 ± 9% -3.2 0.36 ± 10% perf-profile.children.cycles-pp.__filemap_fdatawrite_range
3.51 ± 9% -3.2 0.36 ± 10% perf-profile.children.cycles-pp.filemap_fdatawrite_wbc
3.51 ± 9% -3.1 0.37 ± 6% perf-profile.children.cycles-pp.do_writepages
3.51 ± 9% -3.1 0.37 ± 6% perf-profile.children.cycles-pp.ext4_writepages
3.50 ± 9% -3.1 0.37 ± 6% perf-profile.children.cycles-pp.mpage_prepare_extent_to_map
3.67 ± 16% -3.1 0.54 ± 3% perf-profile.children.cycles-pp.__filemap_get_folio
2.87 ± 10% -2.6 0.29 ± 11% perf-profile.children.cycles-pp.mpage_process_page_bufs
3.08 ± 11% -2.4 0.66 ± 15% perf-profile.children.cycles-pp.tick_sched_timer
2.71 ± 52% -2.4 0.30 ± 23% perf-profile.children.cycles-pp.ktime_get
2.70 ± 10% -2.4 0.30 ± 5% perf-profile.children.cycles-pp.smpboot_thread_fn
2.66 ± 21% -2.3 0.33 ± 3% perf-profile.children.cycles-pp.generic_write_end
2.61 ± 11% -2.3 0.29 ± 6% perf-profile.children.cycles-pp.run_ksoftirqd
2.57 ± 11% -2.3 0.28 ± 7% perf-profile.children.cycles-pp.blk_complete_reqs
2.56 ± 11% -2.3 0.28 ± 7% perf-profile.children.cycles-pp.blk_mq_end_request
2.56 ± 11% -2.3 0.28 ± 7% perf-profile.children.cycles-pp.blk_update_request
2.54 ± 11% -2.3 0.28 ± 7% perf-profile.children.cycles-pp.ext4_end_bio
2.54 ± 11% -2.3 0.28 ± 7% perf-profile.children.cycles-pp.ext4_finish_bio
2.45 ± 9% -2.2 0.24 ± 9% perf-profile.children.cycles-pp.mpage_submit_page
2.48 ± 21% -2.2 0.30 ± 2% perf-profile.children.cycles-pp.__block_commit_write
2.40 ± 15% -1.8 0.58 ± 14% perf-profile.children.cycles-pp.tick_sched_handle
2.06 ± 34% -1.8 0.24 ± 15% perf-profile.children.cycles-pp.clockevents_program_event
2.26 ± 13% -1.7 0.57 ± 13% perf-profile.children.cycles-pp.update_process_times
1.80 ± 9% -1.6 0.17 ± 8% perf-profile.children.cycles-pp.ext4_bio_write_page
1.78 ± 11% -1.6 0.18 ± 7% perf-profile.children.cycles-pp.folio_end_writeback
1.86 ± 21% -1.6 0.28 ± 6% perf-profile.children.cycles-pp.filemap_add_folio
1.78 ± 14% -1.6 0.20 ± 13% perf-profile.children.cycles-pp.__irq_exit_rcu
1.80 ± 22% -1.5 0.26 ± 6% perf-profile.children.cycles-pp.ext4_block_write_begin
1.69 ± 12% -1.5 0.20 ± 17% perf-profile.children.cycles-pp.mapping_evict_folio
1.63 ± 22% -1.5 0.14 ± 12% perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
1.68 ± 12% -1.5 0.20 ± 17% perf-profile.children.cycles-pp.filemap_release_folio
1.55 ± 10% -1.4 0.16 ± 7% perf-profile.children.cycles-pp.__folio_end_writeback
1.50 ± 6% -1.3 0.15 ± 5% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
1.41 ± 11% -1.3 0.14 ± 8% perf-profile.children.cycles-pp.remove_mapping
1.38 ± 11% -1.2 0.14 ± 7% perf-profile.children.cycles-pp.__remove_mapping
1.28 ± 14% -1.1 0.14 ± 10% perf-profile.children.cycles-pp.release_pages
1.24 ± 27% -1.1 0.11 ± 17% perf-profile.children.cycles-pp.tick_nohz_next_event
1.26 ± 21% -1.1 0.19 ± 5% perf-profile.children.cycles-pp.__filemap_add_folio
1.26 ± 9% -1.1 0.19 ± 4% perf-profile.children.cycles-pp.__schedule
1.19 ± 14% -1.1 0.13 ± 13% perf-profile.children.cycles-pp.__pagevec_release
1.15 ± 15% -1.0 0.13 ± 25% perf-profile.children.cycles-pp.try_to_free_buffers
1.11 ± 16% -1.0 0.10 ± 15% perf-profile.children.cycles-pp.__folio_start_writeback
1.17 ± 24% -1.0 0.18 ± 8% perf-profile.children.cycles-pp.create_empty_buffers
1.35 ± 8% -1.0 0.40 ± 13% perf-profile.children.cycles-pp.perf_tp_event
1.42 ± 13% -0.9 0.47 ± 13% perf-profile.children.cycles-pp.scheduler_tick
1.05 ± 7% -0.9 0.12 ± 12% perf-profile.children.cycles-pp.__mod_lruvec_page_state
1.31 ± 8% -0.9 0.38 ± 13% perf-profile.children.cycles-pp.perf_event_output_forward
1.31 ± 8% -0.9 0.39 ± 13% perf-profile.children.cycles-pp.__perf_event_overflow
1.06 ± 6% -0.9 0.14 ± 9% perf-profile.children.cycles-pp.xas_load
1.17 ± 8% -0.8 0.34 ± 14% perf-profile.children.cycles-pp.perf_prepare_sample
0.96 ± 22% -0.8 0.14 ± 6% perf-profile.children.cycles-pp.mark_buffer_dirty
0.94 ± 21% -0.8 0.15 ± 6% perf-profile.children.cycles-pp.folio_alloc
1.12 ± 8% -0.8 0.32 ± 15% perf-profile.children.cycles-pp.perf_callchain
1.11 ± 8% -0.8 0.32 ± 15% perf-profile.children.cycles-pp.get_perf_callchain
0.92 ± 19% -0.8 0.15 ± 5% perf-profile.children.cycles-pp.__alloc_pages
0.90 ± 18% -0.8 0.14 ± 9% perf-profile.children.cycles-pp.fault_in_iov_iter_readable
0.86 ± 10% -0.8 0.09 ± 11% perf-profile.children.cycles-pp._raw_spin_lock
0.89 ± 27% -0.7 0.14 ± 7% perf-profile.children.cycles-pp.alloc_page_buffers
0.80 ± 10% -0.7 0.07 ± 15% perf-profile.children.cycles-pp.irq_work_run_list
0.81 ± 17% -0.7 0.07 ± 21% perf-profile.children.cycles-pp.irq_enter_rcu
0.86 ± 18% -0.7 0.13 ± 9% perf-profile.children.cycles-pp.fault_in_readable
0.88 ± 8% -0.7 0.16 ± 4% perf-profile.children.cycles-pp.schedule
0.84 ± 29% -0.7 0.14 ± 9% perf-profile.children.cycles-pp.kmem_cache_alloc
0.78 ± 9% -0.7 0.07 ± 16% perf-profile.children.cycles-pp.asm_sysvec_irq_work
0.78 ± 9% -0.7 0.07 ± 16% perf-profile.children.cycles-pp.sysvec_irq_work
0.78 ± 9% -0.7 0.07 ± 16% perf-profile.children.cycles-pp.__sysvec_irq_work
0.78 ± 9% -0.7 0.07 ± 16% perf-profile.children.cycles-pp.irq_work_single
0.78 ± 9% -0.7 0.07 ± 16% perf-profile.children.cycles-pp.irq_work_run
0.78 ± 9% -0.7 0.07 ± 16% perf-profile.children.cycles-pp._printk
0.78 ± 9% -0.7 0.07 ± 16% perf-profile.children.cycles-pp.vprintk_emit
0.78 ± 9% -0.7 0.07 ± 16% perf-profile.children.cycles-pp.console_unlock
0.78 ± 9% -0.7 0.07 ± 16% perf-profile.children.cycles-pp.call_console_drivers
0.77 ± 19% -0.7 0.06 ± 47% perf-profile.children.cycles-pp.tick_irq_enter
0.84 ± 29% -0.7 0.14 ± 9% perf-profile.children.cycles-pp.alloc_buffer_head
0.80 ± 16% -0.7 0.10 ± 10% perf-profile.children.cycles-pp.shmem_write_begin
0.78 ± 24% -0.7 0.09 ± 30% perf-profile.children.cycles-pp.kmem_cache_free
0.75 ± 10% -0.7 0.06 ± 11% perf-profile.children.cycles-pp.serial8250_console_write
0.75 ± 10% -0.7 0.06 ± 11% perf-profile.children.cycles-pp.uart_console_write
0.77 ± 14% -0.7 0.09 ± 8% perf-profile.children.cycles-pp.lapic_next_deadline
0.76 ± 9% -0.7 0.08 ± 12% perf-profile.children.cycles-pp.__filemap_remove_folio
0.76 ± 17% -0.7 0.10 ± 10% perf-profile.children.cycles-pp.shmem_getpage_gfp
0.72 ± 55% -0.7 0.06 ± 47% perf-profile.children.cycles-pp.tick_nohz_irq_exit
0.72 ± 9% -0.7 0.06 ± 11% perf-profile.children.cycles-pp.wait_for_xmitr
0.72 ± 10% -0.7 0.06 ± 11% perf-profile.children.cycles-pp.serial8250_console_putchar
0.73 ± 23% -0.6 0.08 ± 29% perf-profile.children.cycles-pp.free_buffer_head
0.70 ± 19% -0.6 0.06 ± 19% perf-profile.children.cycles-pp.rebalance_domains
0.69 ± 14% -0.6 0.06 ± 17% perf-profile.children.cycles-pp.perf_mux_hrtimer_handler
0.73 ± 21% -0.6 0.12 ± 6% perf-profile.children.cycles-pp.get_page_from_freelist
0.71 ± 8% -0.6 0.09 ± 5% perf-profile.children.cycles-pp.native_irq_return_iret
0.64 ± 21% -0.6 0.06 ± 14% perf-profile.children.cycles-pp.free_unref_page_list
0.67 ± 23% -0.6 0.10 ± 10% perf-profile.children.cycles-pp.__folio_mark_dirty
0.59 ± 8% -0.6 0.02 ± 99% perf-profile.children.cycles-pp.io_serial_in
0.62 ± 15% -0.6 0.06 ± 11% perf-profile.children.cycles-pp.folio_clear_dirty_for_io
0.62 ± 14% -0.6 0.06 ± 11% perf-profile.children.cycles-pp.sched_clock_cpu
0.60 ± 20% -0.5 0.09 ± 7% perf-profile.children.cycles-pp.folio_add_lru
0.54 ± 16% -0.5 0.06 ± 9% perf-profile.children.cycles-pp.native_sched_clock
0.50 ± 7% -0.5 0.04 ± 71% perf-profile.children.cycles-pp.read_tsc
0.50 ± 10% -0.4 0.06 ± 6% perf-profile.children.cycles-pp.__might_resched
0.55 ± 38% -0.4 0.11 ± 25% perf-profile.children.cycles-pp.start_kernel
0.50 ± 17% -0.4 0.07 ± 10% perf-profile.children.cycles-pp.xas_store
0.49 ± 11% -0.4 0.06 ± 11% perf-profile.children.cycles-pp.__mod_lruvec_state
0.51 ± 19% -0.4 0.08 ± 8% perf-profile.children.cycles-pp.__pagevec_lru_add
0.54 ± 18% -0.4 0.11 ± 11% perf-profile.children.cycles-pp.load_balance
0.47 ± 49% -0.4 0.05 ± 46% perf-profile.children.cycles-pp.ktime_get_update_offsets_now
0.50 ± 21% -0.4 0.08 ± 5% perf-profile.children.cycles-pp.rmqueue
0.46 ± 14% -0.4 0.04 ± 44% perf-profile.children.cycles-pp.irqtime_account_irq
0.47 ± 22% -0.4 0.06 ± 7% perf-profile.children.cycles-pp.ext4_da_get_block_prep
0.43 ± 34% -0.4 0.03 ±102% perf-profile.children.cycles-pp.memcg_slab_free_hook
0.46 ± 8% -0.4 0.06 ± 17% perf-profile.children.cycles-pp.jbd2_journal_try_to_free_buffers
0.47 ± 12% -0.4 0.08 ± 13% perf-profile.children.cycles-pp.perf_callchain_user
0.44 ± 7% -0.4 0.06 ± 8% perf-profile.children.cycles-pp.ksys_read
0.62 ± 7% -0.4 0.24 ± 18% perf-profile.children.cycles-pp.perf_callchain_kernel
0.44 ± 7% -0.4 0.06 ± 8% perf-profile.children.cycles-pp.vfs_read
0.43 ± 8% -0.4 0.06 ± 13% perf-profile.children.cycles-pp.jbd2_journal_grab_journal_head
0.39 ± 7% -0.4 0.03 ±100% perf-profile.children.cycles-pp.free_pcppages_bulk
0.38 ± 8% -0.4 0.02 ± 99% perf-profile.children.cycles-pp._raw_spin_lock_irq
0.40 ± 13% -0.4 0.05 ± 7% perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
0.43 ± 9% -0.3 0.08 ± 8% perf-profile.children.cycles-pp.try_to_wake_up
0.38 ± 10% -0.3 0.04 ± 44% perf-profile.children.cycles-pp.__mod_node_page_state
0.41 ± 13% -0.3 0.07 ± 15% perf-profile.children.cycles-pp.__get_user_nocheck_8
0.40 ± 20% -0.3 0.08 ± 11% perf-profile.children.cycles-pp.find_busiest_group
0.65 ± 9% -0.3 0.33 ± 17% perf-profile.children.cycles-pp.update_curr
0.51 ± 7% -0.3 0.19 ± 17% perf-profile.children.cycles-pp.unwind_next_frame
0.38 ± 21% -0.3 0.06 ± 9% perf-profile.children.cycles-pp.__mem_cgroup_charge
0.37 ± 20% -0.3 0.06 ± 9% perf-profile.children.cycles-pp.__pagevec_lru_add_fn
0.63 ± 9% -0.3 0.32 ± 17% perf-profile.children.cycles-pp.perf_trace_sched_stat_runtime
0.38 ± 17% -0.3 0.08 ± 12% perf-profile.children.cycles-pp.update_sd_lb_stats
0.38 ± 6% -0.3 0.08 ± 8% perf-profile.children.cycles-pp.__libc_start_main
0.32 ± 12% -0.3 0.02 ± 99% perf-profile.children.cycles-pp.read
0.32 ± 21% -0.3 0.04 ± 45% perf-profile.children.cycles-pp.rmqueue_bulk
0.31 ± 42% -0.3 0.04 ± 71% perf-profile.children.cycles-pp.memcg_slab_post_alloc_hook
0.26 ± 25% -0.2 0.02 ± 99% perf-profile.children.cycles-pp.folio_account_dirtied
0.29 ± 20% -0.2 0.07 ± 14% perf-profile.children.cycles-pp.update_sg_lb_stats
0.27 ± 7% -0.2 0.08 ± 7% perf-profile.children.cycles-pp.asm_exc_page_fault
0.24 ± 10% -0.2 0.08 ± 22% perf-profile.children.cycles-pp.__unwind_start
0.18 ± 16% -0.2 0.03 ±100% perf-profile.children.cycles-pp.__orc_find
0.16 ± 16% -0.1 0.04 ± 71% perf-profile.children.cycles-pp.ksys_write
0.16 ± 16% -0.1 0.04 ± 71% perf-profile.children.cycles-pp.vfs_write
0.16 ± 16% -0.1 0.04 ± 71% perf-profile.children.cycles-pp.new_sync_write
0.15 ± 17% -0.1 0.02 ± 99% perf-profile.children.cycles-pp.__libc_write
0.17 ± 18% -0.1 0.07 ± 6% perf-profile.children.cycles-pp.schedule_timeout
0.15 ± 24% -0.1 0.09 ± 13% perf-profile.children.cycles-pp.pick_next_task_fair
0.00 +2.6 2.64 ± 2% perf-profile.children.cycles-pp.rwsem_spin_on_owner
28.20 ± 7% +59.2 87.44 ± 2% perf-profile.children.cycles-pp.ret_from_fork
14.40 ± 21% +71.3 85.71 ± 2% perf-profile.children.cycles-pp.io_wqe_worker
14.25 ± 21% +71.4 85.64 ± 2% perf-profile.children.cycles-pp.io_worker_handle_work
14.23 ± 21% +71.4 85.63 ± 2% perf-profile.children.cycles-pp.io_issue_sqe
14.23 ± 21% +71.4 85.64 ± 2% perf-profile.children.cycles-pp.io_wq_submit_work
14.22 ± 21% +71.4 85.63 ± 2% perf-profile.children.cycles-pp.io_write
14.10 ± 21% +71.5 85.62 ± 2% perf-profile.children.cycles-pp.ext4_buffered_write_iter
0.00 +80.9 80.95 ± 2% perf-profile.children.cycles-pp.osq_lock
0.00 +83.0 83.00 ± 2% perf-profile.children.cycles-pp.rwsem_optimistic_spin
0.00 +83.6 83.63 ± 2% perf-profile.children.cycles-pp.rwsem_down_write_slowpath
40.90 ± 3% -31.3 9.63 ± 20% perf-profile.self.cycles-pp.mwait_idle_with_hints
8.22 ± 9% -7.1 1.08 ± 14% perf-profile.self.cycles-pp.copy_page_from_iter_atomic
5.02 ± 20% -4.3 0.71 ± 5% perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
2.31 ± 61% -2.0 0.26 ± 25% perf-profile.self.cycles-pp.ktime_get
1.89 ± 12% -1.7 0.17 ± 8% perf-profile.self.cycles-pp.cpuidle_enter_state
1.64 ± 15% -1.5 0.14 ± 20% perf-profile.self.cycles-pp.menu_select
1.42 ± 20% -1.3 0.15 ± 7% perf-profile.self.cycles-pp.__block_commit_write
1.26 ± 7% -1.1 0.14 ± 6% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.86 ± 5% -0.7 0.12 ± 13% perf-profile.self.cycles-pp.xas_load
0.83 ± 17% -0.7 0.12 ± 10% perf-profile.self.cycles-pp.fault_in_readable
0.77 ± 14% -0.7 0.09 ± 8% perf-profile.self.cycles-pp.lapic_next_deadline
0.69 ± 9% -0.6 0.08 ± 16% perf-profile.self.cycles-pp._raw_spin_lock
0.70 ± 8% -0.6 0.09 ± 5% perf-profile.self.cycles-pp.native_irq_return_iret
0.59 ± 8% -0.6 0.02 ± 99% perf-profile.self.cycles-pp.io_serial_in
0.50 ± 6% -0.5 0.03 ±100% perf-profile.self.cycles-pp.read_tsc
0.51 ± 16% -0.5 0.06 ± 9% perf-profile.self.cycles-pp.native_sched_clock
0.46 ± 10% -0.4 0.06 ± 8% perf-profile.self.cycles-pp.__might_resched
0.43 ± 11% -0.4 0.03 ± 70% perf-profile.self.cycles-pp.ext4_bio_write_page
0.42 ± 8% -0.4 0.05 ± 8% perf-profile.self.cycles-pp.jbd2_journal_grab_journal_head
0.38 ± 11% -0.3 0.03 ± 70% perf-profile.self.cycles-pp.__mod_node_page_state
0.36 ± 15% -0.3 0.02 ± 99% perf-profile.self.cycles-pp.release_pages
0.37 ± 62% -0.3 0.05 ± 45% perf-profile.self.cycles-pp.ktime_get_update_offsets_now
0.21 ± 19% -0.2 0.06 ± 8% perf-profile.self.cycles-pp.update_sg_lb_stats
0.18 ± 16% -0.2 0.03 ±100% perf-profile.self.cycles-pp.__orc_find
0.19 ± 8% -0.1 0.08 ± 15% perf-profile.self.cycles-pp.unwind_next_frame
0.00 +2.6 2.62 ± 2% perf-profile.self.cycles-pp.rwsem_spin_on_owner
0.00 +80.4 80.42 ± 2% perf-profile.self.cycles-pp.osq_lock
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://01.org/lkp
On 5/27/22 3:24 AM, kernel test robot wrote:
> [quoted report trimmed]
I'm a bit skeptical of this, but I'd like to try and run the test case.
Since it's just a fio test case, why can't I find it somewhere? It seems
very convoluted to have to set up lkp-tests just for this. Besides, I
tried, but it doesn't work on aarch64...
--
Jens Axboe
Hi Jens Axboe,
On Fri, May 27, 2022 at 07:50:27AM -0600, Jens Axboe wrote:
> On 5/27/22 3:24 AM, kernel test robot wrote:
> > [quoted report trimmed]
>
> I'm a bit skeptical of this, but I'd like to try and run the test case.
> Since it's just a fio test case, why can't I find it somewhere? It seems
> very convoluted to have to set up lkp-tests just for this. Besides, I
> tried, but it doesn't work on aarch64...
We just follow the docs on http://www.phoronix-test-suite.com/ to run tests in the PTS
framework, so you don't need to care about lkp-tests.
For this fio test, the parameters we used are just:
test: fio-1.14.1
option_a: Sequential Write
option_b: IO_uring
option_c: Yes
option_d: Yes
option_e: 1MB
option_f: Default Test Directory
And yes, we mostly focus on x86_64 and don't support running lkp-tests on
aarch64...
If you have ideas for other tests we could run, could you let us know?
It would be a great pleasure to run more tests and check whether we can support them.
Thanks a lot!
On 5/27/2022 9:50 PM, Jens Axboe wrote:
> On 5/27/22 3:24 AM, kernel test robot wrote:
>> [quoted report trimmed]
>
> I'm a bit skeptical of this, but I'd like to try and run the test case.
> Since it's just a fio test case, why can't I find it somewhere? It seems
> very convoluted to have to set up lkp-tests just for this. Besides, I
> tried, but it doesn't work on aarch64...
>
We re-ran the test and still get exactly the same result. We noticed the
following info in the perf profile:
14.40 ± 21% +71.3 85.71 ± 2% perf-profile.calltrace.cycles-pp.io_wqe_worker.ret_from_fork
14.25 ± 21% +71.4 85.64 ± 2% perf-profile.calltrace.cycles-pp.io_worker_handle_work.io_wqe_worker.ret_from_fork
14.23 ± 21% +71.4 85.63 ± 2% perf-profile.calltrace.cycles-pp.io_issue_sqe.io_wq_submit_work.io_worker_handle_work.io_wqe_worker.ret_from_fork
14.23 ± 21% +71.4 85.64 ± 2% perf-profile.calltrace.cycles-pp.io_wq_submit_work.io_worker_handle_work.io_wqe_worker.ret_from_fork
14.22 ± 21% +71.4 85.63 ± 2% perf-profile.calltrace.cycles-pp.io_write.io_issue_sqe.io_wq_submit_work.io_worker_handle_work.io_wqe_worker
14.10 ± 21% +71.5 85.62 ± 2% perf-profile.calltrace.cycles-pp.ext4_buffered_write_iter.io_write.io_issue_sqe.io_wq_submit_work.io_worker_handle_work
0.00 +80.9 80.92 ± 2% perf-profile.calltrace.cycles-pp.osq_lock.rwsem_optimistic_spin.rwsem_down_write_slowpath.ext4_buffered_write_iter.io_write
0.00 +83.0 82.99 ± 2% perf-profile.calltrace.cycles-pp.rwsem_optimistic_spin.rwsem_down_write_slowpath.ext4_buffered_write_iter.io_write.io_issue_sqe
0.00 +83.6 83.63 ± 2% perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.ext4_buffered_write_iter.io_write.io_issue_sqe.io_wq_submit_work
The above operations take more time with the patch applied. It looks like
inode lock contention rises a lot with the patch.
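For reference, ext4 buffered writes serialize on the inode rwsem, which is
where rwsem_down_write_slowpath()/osq_lock() enter the picture. A simplified
sketch of ext4_buffered_write_iter(), abridged from fs/ext4/file.c of roughly
this era (exact helpers and signatures differ across versions):

/*
 * Abridged sketch, not the exact upstream source. Every buffered write
 * takes the inode rwsem exclusively, so concurrent writers to the same
 * file contend on this lock.
 */
static ssize_t ext4_buffered_write_iter(struct kiocb *iocb,
					struct iov_iter *from)
{
	struct inode *inode = file_inode(iocb->ki_filp);
	ssize_t ret;

	if (iocb->ki_flags & IOCB_NOWAIT)
		return -EOPNOTSUPP;

	inode_lock(inode);	/* exclusive inode rwsem: writers serialize here */
	ret = ext4_write_checks(iocb, from);
	if (ret > 0)
		ret = generic_perform_write(iocb, from);
	inode_unlock(inode);

	if (ret > 0) {
		iocb->ki_pos += ret;
		ret = generic_write_sync(iocb, ret);
	}
	return ret;
}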
Frankly, we can't connect this behavior to the patch; we just list it here
for your information. Thanks.
Regards
Yin, Fengwei
Hi Jens,
On 5/27/2022 9:50 PM, Jens Axboe wrote:
> I'm a bit skeptical of this, but I'd like to try and run the test case.
> Since it's just a fio test case, why can't I find it somewhere? It seems
> very convoluted to have to set up lkp-tests just for this. Besides, I
> tried, but it doesn't work on aarch64...
I rechecked this regression report. The regression can be reproduced if
the following config file is used with fio (tag: fio-3.25):
[global]
rw=write
ioengine=io_uring
iodepth=64
size=1g
direct=1
buffered=1
startdelay=5
force_async=4
ramp_time=5
runtime=20
time_based
clat_percentiles=0
disable_lat=1
disable_clat=1
disable_slat=1
filename=test_fiofile
[test]
name=test
bs=1M
stonewall
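For context, force_async=4 is what pushes these writes down the async path:
as far as I understand, fio implements force_async=N by flagging every Nth
sqe with IOSQE_ASYNC, which skips the non-blocking issue attempt and punts
the request straight to an io-wq worker. A minimal liburing sketch of such a
submission (a hypothetical helper, illustrative only, not fio's actual code):

#include <liburing.h>

/*
 * Submit one write with IOSQE_ASYNC set, mimicking what fio's
 * force_async option is assumed to do. Illustrative sketch only.
 */
static int submit_async_write(struct io_uring *ring, int fd,
			      const void *buf, unsigned len, __u64 off)
{
	struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

	if (!sqe)
		return -1;
	io_uring_prep_write(sqe, fd, buf, len, off);
	sqe->flags |= IOSQE_ASYNC;	/* force io-wq offload */
	return io_uring_submit(ring);
}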
Just FYI, a small change to commit: 584b0180f0f4d67d7145950fe68c625f06c88b10:
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 969f65de9972..616d857f8fc6 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3181,8 +3181,13 @@ static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe)
{
struct kiocb *kiocb = &req->rw.kiocb;
unsigned ioprio;
+ struct file *file = req->file;
int ret;
+ if (likely(file && (file->f_mode & FMODE_WRITE)))
+ if (!io_req_ffs_set(req))
+ req->flags |= io_file_get_flags(file) << REQ_F_SUPPORT_NOWAIT_BIT;
+
kiocb->ki_pos = READ_ONCE(sqe->off);
ioprio = READ_ONCE(sqe->ioprio);
could make the regression go away. No idea how req->flags impacts the write performance. Thanks.
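For reference, a rough sketch of what io_file_get_flags() reports, paraphrased
from fs/io_uring.c around v5.18 (not line-exact). The call site shifts the
result by REQ_F_SUPPORT_NOWAIT_BIT, so the bits land in req->flags as
REQ_F_SUPPORT_NOWAIT and REQ_F_ISREG:

static unsigned int io_file_get_flags(struct file *file)
{
	umode_t mode = file_inode(file)->i_mode;
	unsigned int res = 0;

	if (S_ISREG(mode))
		res |= FFS_ISREG;	/* regular file */
	if (__io_file_supports_nowait(file, mode))
		res |= FFS_NOWAIT;	/* non-blocking issue is possible */
	return res;
}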
Regards
Yin, Fengwei
On 7/12/22 2:06 AM, Yin Fengwei wrote:
> Hi Jens,
>
> On 5/27/2022 9:50 PM, Jens Axboe wrote:
>> I'm a bit skeptical of this, but I'd like to try and run the test case.
>> Since it's just a fio test case, why can't I find it somewhere? It seems
>> very convoluted to have to set up lkp-tests just for this. Besides, I
>> tried, but it doesn't work on aarch64...
> I rechecked this regression report. The regression can be reproduced if
> the following config file is used with fio (tag: fio-3.25):
>
> [fio config trimmed]
>
> Just FYI, a small change to commit: 584b0180f0f4d67d7145950fe68c625f06c88b10:
>
> [debugging patch trimmed]
>
> could make the regression go away. No idea how req->flags impacts the
> write performance. Thanks.
I can't really explain that either, at least not immediately. I tried
running with and without that patch, and don't see any difference here.
In terms of making this more obvious, does the below also fix it for
you?
And what filesystem is this being run on?
diff --git a/fs/io_uring.c b/fs/io_uring.c
index a01ea49f3017..797fad99780d 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -4269,9 +4269,6 @@ static int io_rw_init_file(struct io_kiocb *req, fmode_t mode)
if (unlikely(!file || !(file->f_mode & mode)))
return -EBADF;
- if (!io_req_ffs_set(req))
- req->flags |= io_file_get_flags(file) << REQ_F_SUPPORT_NOWAIT_BIT;
-
kiocb->ki_flags = iocb_flags(file);
ret = kiocb_set_rw_flags(kiocb, req->rw.flags);
if (unlikely(ret))
@@ -8309,7 +8306,13 @@ static bool io_assign_file(struct io_kiocb *req, unsigned int issue_flags)
else
req->file = io_file_get_normal(req, req->cqe.fd);
- return !!req->file;
+ if (unlikely(!req->file))
+ return false;
+
+ if (!io_req_ffs_set(req))
+ req->flags |= io_file_get_flags(file) << REQ_F_SUPPORT_NOWAIT_BIT;
+
+ return true;
}
static int io_issue_sqe(struct io_kiocb *req, unsigned int issue_flags)
--
Jens Axboe
On 7/15/2022 11:58 PM, Jens Axboe wrote:
> I can't really explain that either, at least not immediately. I tried
> running with and without that patch, and don't see any difference here.
> In terms of making this more obvious, does the below also fix it for
> you?
I will try the fix and let you know the result.
>
> And what filesystem is this being run on?
I am using ext4, and LKP is also using ext4. Thanks.
Regards
Yin, Fengwei
On 7/17/22 6:58 PM, Yin Fengwei wrote:
>
>
> On 7/15/2022 11:58 PM, Jens Axboe wrote:
>> I can't really explain that either, at least not immediately. I tried
>> running with and without that patch, and don't see any difference here.
>> In terms of making this more obvious, does the below also fix it for
>> you?
> I will try the fix and let you know the result.
>
>>
>> And what filesystem is this being run on?
> I am using ext4, and LKP is also using ext4. Thanks.
Thanks, I'll try ext4 as well (was on XFS).
--
Jens Axboe
Hi Jens,
On 7/15/2022 11:58 PM, Jens Axboe wrote:
> In terms of making this more obvious, does the below also fix it for
> you?
The regression is still there after applying the change you posted.
Your change can't be applied on top of v5.18 (the latest commit on the master branch),
so I changed it a little bit to make it apply:
diff --git a/fs/io_uring.c b/fs/io_uring.c
index e0823f58f7959..0bf7f3d18d46e 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3762,9 +3762,6 @@ static int io_rw_init_file(struct io_kiocb *req, fmode_t mode)
if (unlikely(!file || !(file->f_mode & mode)))
return -EBADF;
- if (!io_req_ffs_set(req))
- req->flags |= io_file_get_flags(file) << REQ_F_SUPPORT_NOWAIT_BIT;
-
kiocb->ki_flags = iocb_flags(file);
ret = kiocb_set_rw_flags(kiocb, req->rw.flags);
if (unlikely(ret))
@@ -7114,8 +7111,13 @@ static bool io_assign_file(struct io_kiocb *req, unsigned int issue_flags)
req->file = io_file_get_fixed(req, req->fd, issue_flags);
else
req->file = io_file_get_normal(req, req->fd);
- if (req->file)
+ if (req->file) {
+ if (!io_req_ffs_set(req))
+ req->flags |= io_file_get_flags(req->file) <<
+ REQ_F_SUPPORT_NOWAIT_BIT;
+
return true;
+ }
req_set_fail(req);
req->result = -EBADF;
Regards
Yin, Fengwei
On 7/17/22 9:30 PM, Yin Fengwei wrote:
> Hi Jens,
>
> On 7/15/2022 11:58 PM, Jens Axboe wrote:
>> In terms of making this more obvious, does the below also fix it for
>> you?
>
> The regression is still there after applying the change you posted.
Still don't see the regression here, using ext4. I get about 1020-1045
IOPS with or without the patch you sent.
This is running it in a vm, and the storage device is nvme. What is
hosting your ext4 fs?
--
Jens Axboe
On 7/19/2022 12:27 AM, Jens Axboe wrote:
> On 7/17/22 9:30 PM, Yin Fengwei wrote:
>> Hi Jens,
>>
>> On 7/15/2022 11:58 PM, Jens Axboe wrote:
>>> In terms of making this more obvious, does the below also fix it for
>>> you?
>>
>> The regression is still there after applying the change you posted.
>
> Still don't see the regression here, using ext4. I get about 1020-1045
> IOPS with or without the patch you sent.
>
> This is running it in a vm, and the storage device is nvme. What is
> hosting your ext4 fs?
>
My local test system is also a VM, with a SATA disk; the LKP test platform
is a native machine with a SATA disk.
I can reproduce the regression in both environments. I will try using
NVMe to host my local VM disk and check whether we see something
different.
Regards
Yin, Fengwei
Hi Jens,
On 7/19/2022 12:27 AM, Jens Axboe wrote:
> On 7/17/22 9:30 PM, Yin Fengwei wrote:
>> Hi Jens,
>>
>> On 7/15/2022 11:58 PM, Jens Axboe wrote:
>>> In terms of making this more obvious, does the below also fix it for
>>> you?
>>
>> The regression is still there after applying the change you posted.
>
> Still don't see the regression here, using ext4. I get about 1020-1045
> IOPS with or without the patch you sent.
>
> This is running it in a vm, and the storage device is nvme. What is
> hosting your ext4 fs?
I just did more testing with the VM. The regression can't be reproduced with the latest
code (I tried the tag v5.19-rc7), whether the underlying storage is SATA
or NVMe.
But the regression, and the effect of my debugging patch, can be reproduced
on both SATA and NVMe if commit 584b0180f0f4d6 is used as the base
(584b0180f0f4d6 vs. 584b0180f0f4d6 with my debugging patch).
Here is the test result I got:
NVME as host storage:
5.19.0-rc7:
write: IOPS=933, BW=937MiB/s (982MB/s)(18.3GiB/20020msec); 0 zone resets
write: IOPS=993, BW=996MiB/s (1045MB/s)(19.5GiB/20020msec); 0 zone resets
write: IOPS=1005, BW=1009MiB/s (1058MB/s)(19.7GiB/20020msec); 0 zone resets
write: IOPS=985, BW=989MiB/s (1037MB/s)(19.3GiB/20020msec); 0 zone resets
write: IOPS=1020, BW=1024MiB/s (1073MB/s)(20.0GiB/20020msec); 0 zone resets
5.19.0-rc7 with my debugging patch:
write: IOPS=988, BW=992MiB/s (1040MB/s)(19.7GiB/20384msec); 0 zone resets
write: IOPS=995, BW=998MiB/s (1047MB/s)(20.1GiB/20574msec); 0 zone resets
write: IOPS=996, BW=1000MiB/s (1048MB/s)(19.5GiB/20020msec); 0 zone resets
write: IOPS=995, BW=998MiB/s (1047MB/s)(19.5GiB/20020msec); 0 zone resets
write: IOPS=1006, BW=1009MiB/s (1058MB/s)(19.7GiB/20019msec); 0 zone resets
584b0180f0:
write: IOPS=1004, BW=1008MiB/s (1057MB/s)(19.7GiB/20020msec); 0 zone resets
write: IOPS=968, BW=971MiB/s (1018MB/s)(19.4GiB/20468msec); 0 zone resets
write: IOPS=982, BW=986MiB/s (1033MB/s)(19.3GiB/20020msec); 0 zone resets
write: IOPS=1000, BW=1004MiB/s (1053MB/s)(20.1GiB/20461msec); 0 zone resets
write: IOPS=903, BW=906MiB/s (950MB/s)(18.1GiB/20419msec); 0 zone resets
584b0180f0 with my debugging the patch:
write: IOPS=1073, BW=1076MiB/s (1129MB/s)(21.1GiB/20036msec); 0 zone resets
write: IOPS=1131, BW=1135MiB/s (1190MB/s)(22.2GiB/20022msec); 0 zone resets
write: IOPS=1122, BW=1126MiB/s (1180MB/s)(22.1GiB/20071msec); 0 zone resets
write: IOPS=1071, BW=1075MiB/s (1127MB/s)(21.1GiB/20071msec); 0 zone resets
write: IOPS=1049, BW=1053MiB/s (1104MB/s)(21.1GiB/20482msec); 0 zone resets
SATA disk as host storage:
5.19.0-rc7:
write: IOPS=624, BW=627MiB/s (658MB/s)(12.3GiB/20023msec); 0 zone resets
write: IOPS=655, BW=658MiB/s (690MB/s)(12.9GiB/20021msec); 0 zone resets
write: IOPS=596, BW=600MiB/s (629MB/s)(12.1GiB/20586msec); 0 zone resets
write: IOPS=647, BW=650MiB/s (682MB/s)(12.7GiB/20020msec); 0 zone resets
write: IOPS=591, BW=594MiB/s (623MB/s)(12.1GiB/20787msec); 0 zone resets
5.19.0-rc7 with my debugging patch:
write: IOPS=633, BW=637MiB/s (668MB/s)(12.6GiB/20201msec); 0 zone resets
write: IOPS=614, BW=617MiB/s (647MB/s)(13.1GiB/21667msec); 0 zone resets
write: IOPS=653, BW=657MiB/s (689MB/s)(12.8GiB/20020msec); 0 zone resets
write: IOPS=618, BW=622MiB/s (652MB/s)(12.2GiB/20033msec); 0 zone resets
write: IOPS=604, BW=608MiB/s (638MB/s)(12.1GiB/20314msec); 0 zone resets
584b0180f0:
write: IOPS=635, BW=638MiB/s (669MB/s)(12.5GiB/20020msec); 0 zone resets
write: IOPS=649, BW=652MiB/s (684MB/s)(12.8GiB/20066msec); 0 zone resets
write: IOPS=639, BW=642MiB/s (674MB/s)(13.1GiB/20818msec); 0 zone resets
584b0180f0 with my debugging patch:
write: IOPS=850, BW=853MiB/s (895MB/s)(17.1GiB/20474msec); 0 zone resets
write: IOPS=738, BW=742MiB/s (778MB/s)(15.1GiB/20787msec); 0 zone resets
write: IOPS=751, BW=755MiB/s (792MB/s)(15.1GiB/20432msec); 0 zone resets
Regards
Yin, Fengwei
On 7/18/22 8:16 PM, Yin Fengwei wrote:
> Hi Jens,
>
> On 7/19/2022 12:27 AM, Jens Axboe wrote:
>> On 7/17/22 9:30 PM, Yin Fengwei wrote:
>>> Hi Jens,
>>>
>>> On 7/15/2022 11:58 PM, Jens Axboe wrote:
>>>> In terms of making this more obvious, does the below also fix it for
>>>> you?
>>>
>>> The regression is still there after applying the change you posted.
>>
>> Still don't see the regression here, using ext4. I get about 1020-1045
>> IOPS with or without the patch you sent.
>>
>> This is running it in a vm, and the storage device is nvme. What is
>> hosting your ext4 fs?
> I just did more testing with the VM. The regression can't be reproduced with the latest
> code (I tried the tag v5.19-rc7), whether the underlying storage is SATA
> or NVMe.
>
> But the regression, and the effect of my debugging patch, can be reproduced
> on both SATA and NVMe if commit 584b0180f0f4d6 is used as the base
> (584b0180f0f4d6 vs. 584b0180f0f4d6 with my debugging patch).
>
>
> Here is the test result I got:
> NVME as host storage:
> 5.19.0-rc7:
> write: IOPS=933, BW=937MiB/s (982MB/s)(18.3GiB/20020msec); 0 zone resets
> write: IOPS=993, BW=996MiB/s (1045MB/s)(19.5GiB/20020msec); 0 zone resets
> write: IOPS=1005, BW=1009MiB/s (1058MB/s)(19.7GiB/20020msec); 0 zone resets
> write: IOPS=985, BW=989MiB/s (1037MB/s)(19.3GiB/20020msec); 0 zone resets
> write: IOPS=1020, BW=1024MiB/s (1073MB/s)(20.0GiB/20020msec); 0 zone resets
>
> 5.19.0-rc7 with my debugging patch:
> write: IOPS=988, BW=992MiB/s (1040MB/s)(19.7GiB/20384msec); 0 zone resets
> write: IOPS=995, BW=998MiB/s (1047MB/s)(20.1GiB/20574msec); 0 zone resets
> write: IOPS=996, BW=1000MiB/s (1048MB/s)(19.5GiB/20020msec); 0 zone resets
> write: IOPS=995, BW=998MiB/s (1047MB/s)(19.5GiB/20020msec); 0 zone resets
> write: IOPS=1006, BW=1009MiB/s (1058MB/s)(19.7GiB/20019msec); 0 zone resets
These two basically look identical, which may be why I get the same with
and without your patch. I don't think it makes a difference for this.
Curious how it came about?
> 584b0180f0:
> write: IOPS=1004, BW=1008MiB/s (1057MB/s)(19.7GiB/20020msec); 0 zone resets
> write: IOPS=968, BW=971MiB/s (1018MB/s)(19.4GiB/20468msec); 0 zone resets
> write: IOPS=982, BW=986MiB/s (1033MB/s)(19.3GiB/20020msec); 0 zone resets
> write: IOPS=1000, BW=1004MiB/s (1053MB/s)(20.1GiB/20461msec); 0 zone resets
> write: IOPS=903, BW=906MiB/s (950MB/s)(18.1GiB/20419msec); 0 zone resets
>
> 584b0180f0 with my debugging the patch:
> write: IOPS=1073, BW=1076MiB/s (1129MB/s)(21.1GiB/20036msec); 0 zone resets
> write: IOPS=1131, BW=1135MiB/s (1190MB/s)(22.2GiB/20022msec); 0 zone resets
> write: IOPS=1122, BW=1126MiB/s (1180MB/s)(22.1GiB/20071msec); 0 zone resets
> write: IOPS=1071, BW=1075MiB/s (1127MB/s)(21.1GiB/20071msec); 0 zone resets
> write: IOPS=1049, BW=1053MiB/s (1104MB/s)(21.1GiB/20482msec); 0 zone resets
Last one looks like it may be faster indeed. I do wonder if this is
something else, though. There's no reason why -rc7 with that same patch
applied should be any different than 584b0180f0 with it.
>
>
> SATA disk as host storage:
> 5.19.0-rc7:
> write: IOPS=624, BW=627MiB/s (658MB/s)(12.3GiB/20023msec); 0 zone resets
> write: IOPS=655, BW=658MiB/s (690MB/s)(12.9GiB/20021msec); 0 zone resets
> write: IOPS=596, BW=600MiB/s (629MB/s)(12.1GiB/20586msec); 0 zone resets
> write: IOPS=647, BW=650MiB/s (682MB/s)(12.7GiB/20020msec); 0 zone resets
> write: IOPS=591, BW=594MiB/s (623MB/s)(12.1GiB/20787msec); 0 zone resets
>
> 5.19.0-rc7 with my debugging patch:
> write: IOPS=633, BW=637MiB/s (668MB/s)(12.6GiB/20201msec); 0 zone resets
> write: IOPS=614, BW=617MiB/s (647MB/s)(13.1GiB/21667msec); 0 zone resets
> write: IOPS=653, BW=657MiB/s (689MB/s)(12.8GiB/20020msec); 0 zone resets
> write: IOPS=618, BW=622MiB/s (652MB/s)(12.2GiB/20033msec); 0 zone resets
> write: IOPS=604, BW=608MiB/s (638MB/s)(12.1GiB/20314msec); 0 zone resets
These again are probably the same, within variance.
> 584b0180f0:
> write: IOPS=635, BW=638MiB/s (669MB/s)(12.5GiB/20020msec); 0 zone resets
> write: IOPS=649, BW=652MiB/s (684MB/s)(12.8GiB/20066msec); 0 zone resets
> write: IOPS=639, BW=642MiB/s (674MB/s)(13.1GiB/20818msec); 0 zone resets
>
> 584b0180f0 with my debugging patch:
> write: IOPS=850, BW=853MiB/s (895MB/s)(17.1GiB/20474msec); 0 zone resets
> write: IOPS=738, BW=742MiB/s (778MB/s)(15.1GiB/20787msec); 0 zone resets
> write: IOPS=751, BW=755MiB/s (792MB/s)(15.1GiB/20432msec); 0 zone resets
But this one looks like a clear difference.
I'll poke at this tomorrow.
--
Jens Axboe
Hi Jens,
On 7/19/2022 10:29 AM, Jens Axboe wrote:
> I'll poke at this tomorrow.
Just FYI. Another finding (test is based on commit 584b0180f0):
If the code block is put in a different function, the fio performance result is
different:
Patch1:
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 616d857f8fc6..b0578a3d063a 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3184,10 +3184,6 @@ static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe)
struct file *file = req->file;
int ret;
- if (likely(file && (file->f_mode & FMODE_WRITE)))
- if (!io_req_ffs_set(req))
- req->flags |= io_file_get_flags(file) << REQ_F_SUPPORT_NOWAIT_BIT;
-
kiocb->ki_pos = READ_ONCE(sqe->off);
ioprio = READ_ONCE(sqe->ioprio);
@@ -7852,6 +7848,10 @@ static int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
return 0;
}
+ if (likely(req->file))
+ if (!io_req_ffs_set(req))
+ req->flags |= io_file_get_flags(req->file) << REQ_F_SUPPORT_NOWAIT_BIT;
+
io_queue_sqe(req);
return 0;
Patch2:
diff --git a/fs/io_uring.c b/fs/io_uring.c
index b0578a3d063a..af705e7ba8d3 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -7639,6 +7639,11 @@ static void io_queue_sqe_fallback(struct io_kiocb *req)
static inline void io_queue_sqe(struct io_kiocb *req)
__must_hold(&req->ctx->uring_lock)
{
+
+ if (likely(req->file))
+ if (!io_req_ffs_set(req))
+ req->flags |= io_file_get_flags(req->file) << REQ_F_SUPPORT_NOWAIT_BIT;
+
if (likely(!(req->flags & (REQ_F_FORCE_ASYNC | REQ_F_FAIL))))
__io_queue_sqe(req);
else
@@ -7848,10 +7853,6 @@ static int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
return 0;
}
- if (likely(req->file))
- if (!io_req_ffs_set(req))
- req->flags |= io_file_get_flags(req->file) << REQ_F_SUPPORT_NOWAIT_BIT;
-
io_queue_sqe(req);
return 0;
}
Patch3:
diff --git a/fs/io_uring.c b/fs/io_uring.c
index af705e7ba8d3..5771d6d0ad8a 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -7598,6 +7598,10 @@ static inline void __io_queue_sqe(struct io_kiocb *req)
struct io_kiocb *linked_timeout;
int ret;
+ if (likely(req->file))
+ if (!io_req_ffs_set(req))
+ req->flags |= io_file_get_flags(req->file) << REQ_F_SUPPORT_NOWAIT_BIT;
+
ret = io_issue_sqe(req, IO_URING_F_NONBLOCK|IO_URING_F_COMPLETE_DEFER);
if (req->flags & REQ_F_COMPLETE_INLINE) {
@@ -7640,10 +7644,6 @@ static inline void io_queue_sqe(struct io_kiocb *req)
__must_hold(&req->ctx->uring_lock)
{
- if (likely(req->file))
- if (!io_req_ffs_set(req))
- req->flags |= io_file_get_flags(req->file) << REQ_F_SUPPORT_NOWAIT_BIT;
-
if (likely(!(req->flags & (REQ_F_FORCE_ASYNC | REQ_F_FAIL))))
__io_queue_sqe(req);
else
The test results (confirmed on both my own test environment and LKP):
Patch1 and Patch2 show no regression; Patch3 shows the regression.
Regards
Yin, Fengwei
On 7/19/22 2:58 AM, Yin Fengwei wrote:
> Hi Jens,
>
> On 7/19/2022 10:29 AM, Jens Axboe wrote:
>> I'll poke at this tomorrow.
>
> Just FYI, another finding (the test is based on commit 584b0180f0):
> If the code block is moved to a different function, the fio performance results
> differ:
I think this turned out to be a little bit of a goose chase. What's
happening here is that later kernels defer the file assignment, which
means it isn't set if a request is queued with IOSQE_ASYNC. That in
turn, for writes, means that we don't hash it on io-wq insertion, and
then it doesn't get serialized with other writes to that file.
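To illustrate, here is a minimal sketch of the io-wq hashing decision involved
(simplified from the io_prep_async_work() context quoted later in this thread;
not the exact upstream code):

	/*
	 * Sketch: on io-wq insertion, writes to a regular file are hashed
	 * on the inode so they execute serially rather than concurrently.
	 */
	if (req->flags & REQ_F_ISREG) {
		if (def->hash_reg_file || (ctx->flags & IORING_SETUP_IOPOLL))
			io_wq_hash_work(&req->work, file_inode(req->file));
	}
	/*
	 * If req->file was not yet assigned when the file flags were
	 * computed, REQ_F_ISREG never gets set, the branch above is
	 * skipped, and buffered writes to the same file are issued
	 * concurrently instead of being serialized.
	 */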
I'll come up with a patch for this that you can test.
--
Jens Axboe
On 7/20/22 11:24 AM, Jens Axboe wrote:
> On 7/19/22 2:58 AM, Yin Fengwei wrote:
>> Hi Jens,
>>
>> On 7/19/2022 10:29 AM, Jens Axboe wrote:
>>> I'll poke at this tomorrow.
>>
>> Just FYI, another finding (the test is based on commit 584b0180f0):
>> If the code block is moved to a different function, the fio performance results
>> differ:
>
> I think this turned out to be a little bit of a goose chase. What's
> happening here is that later kernels defer the file assignment, which
> means it isn't set if a request is queued with IOSQE_ASYNC. That in
> turn, for writes, means that we don't hash it on io-wq insertion, and
> then it doesn't get serialized with other writes to that file.
>
> I'll come up with a patch for this that you can test.
Can you try this? It's against 5.19-rc7.
diff --git a/fs/io_uring.c b/fs/io_uring.c
index a01ea49f3017..34758e95990a 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2015,6 +2015,64 @@ static inline void io_arm_ltimeout(struct io_kiocb *req)
__io_arm_ltimeout(req);
}
+static bool io_bdev_nowait(struct block_device *bdev)
+{
+ return !bdev || blk_queue_nowait(bdev_get_queue(bdev));
+}
+
+/*
+ * If we tracked the file through the SCM inflight mechanism, we could support
+ * any file. For now, just ensure that anything potentially problematic is done
+ * inline.
+ */
+static bool __io_file_supports_nowait(struct file *file, umode_t mode)
+{
+ if (S_ISBLK(mode)) {
+ if (IS_ENABLED(CONFIG_BLOCK) &&
+ io_bdev_nowait(I_BDEV(file->f_mapping->host)))
+ return true;
+ return false;
+ }
+ if (S_ISSOCK(mode))
+ return true;
+ if (S_ISREG(mode)) {
+ if (IS_ENABLED(CONFIG_BLOCK) &&
+ io_bdev_nowait(file->f_inode->i_sb->s_bdev) &&
+ file->f_op != &io_uring_fops)
+ return true;
+ return false;
+ }
+
+ /* any ->read/write should understand O_NONBLOCK */
+ if (file->f_flags & O_NONBLOCK)
+ return true;
+ return file->f_mode & FMODE_NOWAIT;
+}
+
+static inline bool io_file_supports_nowait(struct io_kiocb *req)
+{
+ return req->flags & REQ_F_SUPPORT_NOWAIT;
+}
+
+/*
+ * If we tracked the file through the SCM inflight mechanism, we could support
+ * any file. For now, just ensure that anything potentially problematic is done
+ * inline.
+ */
+static unsigned int io_file_get_flags(struct file *file)
+{
+ umode_t mode = file_inode(file)->i_mode;
+ unsigned int res = 0;
+
+ if (S_ISREG(mode))
+ res |= FFS_ISREG;
+ if (__io_file_supports_nowait(file, mode))
+ res |= FFS_NOWAIT;
+ if (io_file_need_scm(file))
+ res |= FFS_SCM;
+ return res;
+}
+
static void io_prep_async_work(struct io_kiocb *req)
{
const struct io_op_def *def = &io_op_defs[req->opcode];
@@ -2031,6 +2089,9 @@ static void io_prep_async_work(struct io_kiocb *req)
if (req->flags & REQ_F_FORCE_ASYNC)
req->work.flags |= IO_WQ_WORK_CONCURRENT;
+ if (req->file && !io_req_ffs_set(req))
+ req->flags |= io_file_get_flags(req->file) << REQ_F_SUPPORT_NOWAIT_BIT;
+
if (req->flags & REQ_F_ISREG) {
if (def->hash_reg_file || (ctx->flags & IORING_SETUP_IOPOLL))
io_wq_hash_work(&req->work, file_inode(req->file));
@@ -3556,64 +3617,6 @@ static void io_iopoll_req_issued(struct io_kiocb *req, unsigned int issue_flags)
}
}
-static bool io_bdev_nowait(struct block_device *bdev)
-{
- return !bdev || blk_queue_nowait(bdev_get_queue(bdev));
-}
-
-/*
- * If we tracked the file through the SCM inflight mechanism, we could support
- * any file. For now, just ensure that anything potentially problematic is done
- * inline.
- */
-static bool __io_file_supports_nowait(struct file *file, umode_t mode)
-{
- if (S_ISBLK(mode)) {
- if (IS_ENABLED(CONFIG_BLOCK) &&
- io_bdev_nowait(I_BDEV(file->f_mapping->host)))
- return true;
- return false;
- }
- if (S_ISSOCK(mode))
- return true;
- if (S_ISREG(mode)) {
- if (IS_ENABLED(CONFIG_BLOCK) &&
- io_bdev_nowait(file->f_inode->i_sb->s_bdev) &&
- file->f_op != &io_uring_fops)
- return true;
- return false;
- }
-
- /* any ->read/write should understand O_NONBLOCK */
- if (file->f_flags & O_NONBLOCK)
- return true;
- return file->f_mode & FMODE_NOWAIT;
-}
-
-/*
- * If we tracked the file through the SCM inflight mechanism, we could support
- * any file. For now, just ensure that anything potentially problematic is done
- * inline.
- */
-static unsigned int io_file_get_flags(struct file *file)
-{
- umode_t mode = file_inode(file)->i_mode;
- unsigned int res = 0;
-
- if (S_ISREG(mode))
- res |= FFS_ISREG;
- if (__io_file_supports_nowait(file, mode))
- res |= FFS_NOWAIT;
- if (io_file_need_scm(file))
- res |= FFS_SCM;
- return res;
-}
-
-static inline bool io_file_supports_nowait(struct io_kiocb *req)
-{
- return req->flags & REQ_F_SUPPORT_NOWAIT;
-}
-
static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe)
{
struct kiocb *kiocb = &req->rw.kiocb;
--
Jens Axboe
On 7/21/2022 2:13 AM, Jens Axboe wrote:
> Can you try this? It's against 5.19-rc7.
Sure. I will try it and share the test results. Thanks.
Regards
Yin, Fengwei
Hi Jens,
On 7/21/2022 2:13 AM, Jens Axboe wrote:
> Can you try this? It's against 5.19-rc7.
>
>
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index a01ea49f3017..34758e95990a 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -2015,6 +2015,64 @@ static inline void io_arm_ltimeout(struct io_kiocb *req)
> __io_arm_ltimeout(req);
> }
>
> +static bool io_bdev_nowait(struct block_device *bdev)
> +{
> + return !bdev || blk_queue_nowait(bdev_get_queue(bdev));
> +}
> +
> +/*
> + * If we tracked the file through the SCM inflight mechanism, we could support
> + * any file. For now, just ensure that anything potentially problematic is done
> + * inline.
> + */
> +static bool __io_file_supports_nowait(struct file *file, umode_t mode)
> +{
> + if (S_ISBLK(mode)) {
> + if (IS_ENABLED(CONFIG_BLOCK) &&
> + io_bdev_nowait(I_BDEV(file->f_mapping->host)))
> + return true;
> + return false;
> + }
> + if (S_ISSOCK(mode))
> + return true;
> + if (S_ISREG(mode)) {
> + if (IS_ENABLED(CONFIG_BLOCK) &&
> + io_bdev_nowait(file->f_inode->i_sb->s_bdev) &&
> + file->f_op != &io_uring_fops)
> + return true;
> + return false;
> + }
> +
> + /* any ->read/write should understand O_NONBLOCK */
> + if (file->f_flags & O_NONBLOCK)
> + return true;
> + return file->f_mode & FMODE_NOWAIT;
> +}
> +
> +static inline bool io_file_supports_nowait(struct io_kiocb *req)
> +{
> + return req->flags & REQ_F_SUPPORT_NOWAIT;
> +}
> +
> +/*
> + * If we tracked the file through the SCM inflight mechanism, we could support
> + * any file. For now, just ensure that anything potentially problematic is done
> + * inline.
> + */
> +static unsigned int io_file_get_flags(struct file *file)
> +{
> + umode_t mode = file_inode(file)->i_mode;
> + unsigned int res = 0;
> +
> + if (S_ISREG(mode))
> + res |= FFS_ISREG;
> + if (__io_file_supports_nowait(file, mode))
> + res |= FFS_NOWAIT;
> + if (io_file_need_scm(file))
> + res |= FFS_SCM;
> + return res;
> +}
> +
> static void io_prep_async_work(struct io_kiocb *req)
> {
> const struct io_op_def *def = &io_op_defs[req->opcode];
> @@ -2031,6 +2089,9 @@ static void io_prep_async_work(struct io_kiocb *req)
> if (req->flags & REQ_F_FORCE_ASYNC)
> req->work.flags |= IO_WQ_WORK_CONCURRENT;
>
> + if (req->file && !io_req_ffs_set(req))
> + req->flags |= io_file_get_flags(req->file) << REQ_F_SUPPORT_NOWAIT_BIT;
> +
> if (req->flags & REQ_F_ISREG) {
> if (def->hash_reg_file || (ctx->flags & IORING_SETUP_IOPOLL))
> io_wq_hash_work(&req->work, file_inode(req->file));
> @@ -3556,64 +3617,6 @@ static void io_iopoll_req_issued(struct io_kiocb *req, unsigned int issue_flags)
> }
> }
>
> -static bool io_bdev_nowait(struct block_device *bdev)
> -{
> - return !bdev || blk_queue_nowait(bdev_get_queue(bdev));
> -}
> -
> -/*
> - * If we tracked the file through the SCM inflight mechanism, we could support
> - * any file. For now, just ensure that anything potentially problematic is done
> - * inline.
> - */
> -static bool __io_file_supports_nowait(struct file *file, umode_t mode)
> -{
> - if (S_ISBLK(mode)) {
> - if (IS_ENABLED(CONFIG_BLOCK) &&
> - io_bdev_nowait(I_BDEV(file->f_mapping->host)))
> - return true;
> - return false;
> - }
> - if (S_ISSOCK(mode))
> - return true;
> - if (S_ISREG(mode)) {
> - if (IS_ENABLED(CONFIG_BLOCK) &&
> - io_bdev_nowait(file->f_inode->i_sb->s_bdev) &&
> - file->f_op != &io_uring_fops)
> - return true;
> - return false;
> - }
> -
> - /* any ->read/write should understand O_NONBLOCK */
> - if (file->f_flags & O_NONBLOCK)
> - return true;
> - return file->f_mode & FMODE_NOWAIT;
> -}
> -
> -/*
> - * If we tracked the file through the SCM inflight mechanism, we could support
> - * any file. For now, just ensure that anything potentially problematic is done
> - * inline.
> - */
> -static unsigned int io_file_get_flags(struct file *file)
> -{
> - umode_t mode = file_inode(file)->i_mode;
> - unsigned int res = 0;
> -
> - if (S_ISREG(mode))
> - res |= FFS_ISREG;
> - if (__io_file_supports_nowait(file, mode))
> - res |= FFS_NOWAIT;
> - if (io_file_need_scm(file))
> - res |= FFS_SCM;
> - return res;
> -}
> -
> -static inline bool io_file_supports_nowait(struct io_kiocb *req)
> -{
> - return req->flags & REQ_F_SUPPORT_NOWAIT;
> -}
> -
> static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe)
> {
> struct kiocb *kiocb = &req->rw.kiocb;
>
> -- Jens Axboe
This change makes the regression go away. The test results are as follows:
28d3a5662d44077aa6eb42bfcfa is your patch
584b0180f0f4d67d v5.19-rc7 28d3a5662d44077aa6eb42bfcfa
---------------- --------------------------- ---------------------------
fail:runs %reproduction fail:runs %reproduction fail:runs
| | | | |
503:3 9297% 782:3 178% 509:3 dmesg.timestamp:last
3:3 0% 3:3 0% 3:3 pmeter.pmeter.fail
:3 100% 3:3 100% 3:3 kmsg.I/O_error,dev_loop#,sector#op#:(READ)flags#phys_seg#prio_class
:3 3755% 112:3 4016% 120:3 kmsg.timestamp:I/O_error,dev_loop#,sector#op#:(READ)flags#phys_seg#prio_class
465:3 9221% 742:3 235% 473:3 kmsg.timestamp:last
%stddev %change %stddev %change %stddev
\ | \ | \
972.00 -0.3% 968.67 +11.4% 1082 phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.iops
975.00 -0.3% 972.33 +11.5% 1086 phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s
Compared with 584b0180f0f4d67d and v5.19-rc7, your patch recovers the 11% regression.
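(For reference: on the iops column, (1082 - 972) / 972 ≈ +11.3%; the small gap to the +11.4% shown above comes from rounding in the displayed values.)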
Thanks.
Regards
Yin, Fengwei
Hi Jens,
On 7/21/2022 1:24 AM, Jens Axboe wrote:
> I think this turned out to be a little bit of a goose chase. What's
> happening here is that later kernels defer the file assignment, which
> means it isn't set if a request is queued with IOSQE_ASYNC. That in
> turn, for writes, means that we don't hash it on io-wq insertion, and
> then it doesn't get serialized with other writes to that file.
Thanks a lot for the detailed explanation of the behavior.
Regards
Yin, Fengwei