Greetings,
FYI, we noticed a 15.8% improvement in fxmark.ssd_btrfs_dbench_client_4_bufferedio.works/sec due to commit:
commit: 259c4b96d78dda8477a3ac21d6b3cf0eb9f75c8b ("btrfs: stop doing unnecessary log updates during a rename")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: fxmark
on test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz with 128G memory
with following parameters:
disk: 1SSD
media: ssd
test: dbench_client
fstype: btrfs
directio: bufferedio
cpufreq_governor: performance
ucode: 0xd000331
test-description: FxMark is a filesystem benchmark that tests multicore scalability.
test-url: https://github.com/sslab-gatech/fxmark
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# if you come across any failure that blocks the test,
# please remove the ~/.lkp and /lkp directories to run from a clean state.
=========================================================================================
compiler/cpufreq_governor/directio/disk/fstype/kconfig/media/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/bufferedio/1SSD/btrfs/x86_64-rhel-8.3/ssd/debian-10.4-x86_64-20200603.cgz/lkp-icl-2sp5/dbench_client/fxmark/0xd000331
commit:
88d2beec7e ("btrfs: avoid logging all directory changes during renames")
259c4b96d7 ("btrfs: stop doing unnecessary log updates during a rename")
88d2beec7e53fc50 259c4b96d78dda8477a3ac21d6b
---------------- ---------------------------
%stddev %change %stddev
\ | \
14.14 ± 2% +14.9% 16.24 fxmark.ssd_btrfs_dbench_client_4_bufferedio.user_sec
5.94 ± 2% +14.8% 6.83 fxmark.ssd_btrfs_dbench_client_4_bufferedio.user_util
1150 ± 3% +15.8% 1333 fxmark.ssd_btrfs_dbench_client_4_bufferedio.works/sec
199.66 ± 20% -21.6% 156.57 ± 24% fxmark.ssd_btrfs_dbench_client_54_bufferedio.iowait_sec
6.20 ± 20% -21.5% 4.86 ± 24% fxmark.ssd_btrfs_dbench_client_54_bufferedio.iowait_util
637309 +18.7% 756452 fxmark.time.involuntary_context_switches
172.66 +6.7% 184.16 ± 2% fxmark.time.user_time
4.16 +5.6% 4.40 iostat.cpu.user
9072 ± 12% -21.3% 7143 ± 11% meminfo.Writeback
5816 ± 14% -22.5% 4507 ± 9% numa-meminfo.node0.Writeback
5017 ± 11% +49.5% 7500 ± 10% perf-stat.i.cpu-migrations
5007 ± 11% +49.5% 7484 ± 10% perf-stat.ps.cpu-migrations
392206 ± 2% -4.7% 373592 ± 3% proc-vmstat.nr_active_file
1234447 -2.5% 1203100 proc-vmstat.nr_file_pages
197139 -4.7% 187818 ± 2% proc-vmstat.nr_inactive_file
392206 ± 2% -4.7% 373592 ± 3% proc-vmstat.nr_zone_active_file
197139 -4.7% 187818 ± 2% proc-vmstat.nr_zone_inactive_file
9.28 ± 11% -2.6 6.73 ± 8% perf-profile.calltrace.cycles-pp.fsync
9.26 ± 11% -2.5 6.71 ± 8% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.fsync
9.26 ± 11% -2.5 6.71 ± 8% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.fsync
9.25 ± 11% -2.5 6.70 ± 8% perf-profile.calltrace.cycles-pp.do_fsync.__x64_sys_fsync.do_syscall_64.entry_SYSCALL_64_after_hwframe.fsync
9.24 ± 11% -2.5 6.70 ± 8% perf-profile.calltrace.cycles-pp.btrfs_sync_file.do_fsync.__x64_sys_fsync.do_syscall_64.entry_SYSCALL_64_after_hwframe
9.25 ± 11% -2.5 6.70 ± 8% perf-profile.calltrace.cycles-pp.__x64_sys_fsync.do_syscall_64.entry_SYSCALL_64_after_hwframe.fsync
5.05 ± 18% -2.0 3.08 ± 14% perf-profile.calltrace.cycles-pp.btrfs_sync_log.btrfs_sync_file.do_fsync.__x64_sys_fsync.do_syscall_64
3.80 ± 24% -2.0 1.85 ± 21% perf-profile.calltrace.cycles-pp.wait_log_commit.btrfs_sync_log.btrfs_sync_file.do_fsync.__x64_sys_fsync
3.53 ± 26% -1.9 1.61 ± 24% perf-profile.calltrace.cycles-pp.__mutex_lock.wait_log_commit.btrfs_sync_log.btrfs_sync_file.do_fsync
3.28 ± 27% -1.9 1.41 ± 25% perf-profile.calltrace.cycles-pp.osq_lock.__mutex_lock.wait_log_commit.btrfs_sync_log.btrfs_sync_file
2.28 ± 17% -1.3 1.03 ± 20% perf-profile.calltrace.cycles-pp.__mutex_lock.join_running_log_trans.btrfs_del_inode_ref_in_log.__btrfs_unlink_inode.btrfs_unlink_inode
2.31 ± 17% -1.3 1.06 ± 19% perf-profile.calltrace.cycles-pp.join_running_log_trans.btrfs_del_inode_ref_in_log.__btrfs_unlink_inode.btrfs_unlink_inode.btrfs_unlink
1.79 ± 19% -1.2 0.61 ± 48% perf-profile.calltrace.cycles-pp.osq_lock.__mutex_lock.join_running_log_trans.btrfs_del_inode_ref_in_log.__btrfs_unlink_inode
1.42 ± 11% -0.6 0.83 ± 9% perf-profile.calltrace.cycles-pp.__btrfs_unlink_inode.btrfs_rename.vfs_rename.do_renameat2.__x64_sys_rename
2.98 ± 5% -0.6 2.39 ± 7% perf-profile.calltrace.cycles-pp.btrfs_log_inode_parent.btrfs_log_dentry_safe.btrfs_sync_file.do_fsync.__x64_sys_fsync
2.98 ± 5% -0.6 2.39 ± 7% perf-profile.calltrace.cycles-pp.btrfs_log_dentry_safe.btrfs_sync_file.do_fsync.__x64_sys_fsync.do_syscall_64
1.91 ± 4% -0.2 1.69 ± 4% perf-profile.calltrace.cycles-pp.btrfs_log_inode.btrfs_log_inode_parent.btrfs_log_dentry_safe.btrfs_sync_file.do_fsync
0.74 ± 5% -0.1 0.62 ± 8% perf-profile.calltrace.cycles-pp.log_one_extent.btrfs_log_changed_extents.btrfs_log_inode.btrfs_log_inode_parent.btrfs_log_dentry_safe
0.75 ± 5% -0.1 0.63 ± 8% perf-profile.calltrace.cycles-pp.btrfs_log_changed_extents.btrfs_log_inode.btrfs_log_inode_parent.btrfs_log_dentry_safe.btrfs_sync_file
0.59 ± 3% +0.0 0.62 ± 2% perf-profile.calltrace.cycles-pp.copyin.copy_page_from_iter_atomic.btrfs_copy_from_user.btrfs_buffered_write.btrfs_file_write_iter
1.08 ± 2% +0.5 1.56 ± 20% perf-profile.calltrace.cycles-pp._raw_spin_lock_irq.rwsem_down_read_slowpath.__btrfs_tree_read_lock.btrfs_read_lock_root_node.btrfs_search_slot
0.97 ± 2% +0.5 1.45 ± 21% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irq.rwsem_down_read_slowpath.__btrfs_tree_read_lock.btrfs_read_lock_root_node
5.12 ± 8% +0.7 5.80 ± 4% perf-profile.calltrace.cycles-pp.btrfs_work_helper.process_one_work.worker_thread.kthread.ret_from_fork
5.60 ± 5% +0.7 6.32 ± 3% perf-profile.calltrace.cycles-pp.process_one_work.worker_thread.kthread.ret_from_fork
5.91 ± 5% +0.7 6.66 ± 3% perf-profile.calltrace.cycles-pp.worker_thread.kthread.ret_from_fork
6.73 ± 7% +0.8 7.48 ± 4% perf-profile.calltrace.cycles-pp.btrfs_read_lock_root_node.btrfs_search_slot.btrfs_real_readdir.iterate_dir.__x64_sys_getdents64
6.00 ± 5% +0.8 6.75 ± 3% perf-profile.calltrace.cycles-pp.ret_from_fork
6.00 ± 5% +0.8 6.75 ± 3% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork
6.38 ± 7% +0.8 7.15 ± 4% perf-profile.calltrace.cycles-pp.__btrfs_tree_read_lock.btrfs_read_lock_root_node.btrfs_search_slot.btrfs_real_readdir.iterate_dir
9.32 ± 4% +1.0 10.36 perf-profile.calltrace.cycles-pp.btrfs_search_slot.btrfs_real_readdir.iterate_dir.__x64_sys_getdents64.do_syscall_64
11.74 ± 3% +1.3 13.00 ± 2% perf-profile.calltrace.cycles-pp.btrfs_real_readdir.iterate_dir.__x64_sys_getdents64.do_syscall_64.entry_SYSCALL_64_after_hwframe
12.33 ± 3% +1.3 13.63 ± 2% perf-profile.calltrace.cycles-pp.iterate_dir.__x64_sys_getdents64.do_syscall_64.entry_SYSCALL_64_after_hwframe.telldir
12.39 ± 3% +1.3 13.70 ± 2% perf-profile.calltrace.cycles-pp.__x64_sys_getdents64.do_syscall_64.entry_SYSCALL_64_after_hwframe.telldir
12.45 ± 3% +1.3 13.78 ± 2% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.telldir
12.49 ± 3% +1.3 13.82 ± 2% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.telldir
12.73 ± 3% +1.3 14.06 ± 2% perf-profile.calltrace.cycles-pp.telldir
11.29 ± 15% -4.4 6.87 ± 10% perf-profile.children.cycles-pp.osq_lock
6.87 ± 19% -3.5 3.36 ± 21% perf-profile.children.cycles-pp.__mutex_lock
9.28 ± 11% -2.5 6.74 ± 8% perf-profile.children.cycles-pp.fsync
9.25 ± 11% -2.5 6.70 ± 8% perf-profile.children.cycles-pp.btrfs_sync_file
9.25 ± 11% -2.5 6.70 ± 8% perf-profile.children.cycles-pp.do_fsync
9.25 ± 11% -2.5 6.70 ± 8% perf-profile.children.cycles-pp.__x64_sys_fsync
4.90 ± 19% -2.4 2.49 ± 19% perf-profile.children.cycles-pp.btrfs_del_inode_ref_in_log
5.05 ± 18% -2.0 3.08 ± 14% perf-profile.children.cycles-pp.btrfs_sync_log
3.80 ± 24% -2.0 1.85 ± 21% perf-profile.children.cycles-pp.wait_log_commit
2.52 ± 16% -1.4 1.08 ± 19% perf-profile.children.cycles-pp.join_running_log_trans
2.98 ± 5% -0.6 2.39 ± 7% perf-profile.children.cycles-pp.btrfs_log_dentry_safe
1.11 ± 21% -0.6 0.54 ± 26% perf-profile.children.cycles-pp.btrfs_del_dir_entries_in_log
0.97 ± 22% -0.5 0.44 ± 29% perf-profile.children.cycles-pp.btrfs_lookup_dir_index_item
0.93 ± 8% -0.2 0.74 ± 15% perf-profile.children.cycles-pp.mutex_spin_on_owner
0.74 ± 5% -0.1 0.62 ± 8% perf-profile.children.cycles-pp.log_one_extent
0.75 ± 5% -0.1 0.63 ± 8% perf-profile.children.cycles-pp.btrfs_log_changed_extents
0.32 ± 34% -0.1 0.21 ± 8% perf-profile.children.cycles-pp.btrfs_block_rsv_add
0.13 ± 18% -0.1 0.05 ± 72% perf-profile.children.cycles-pp.btrfs_check_node
0.06 ± 7% +0.0 0.08 ± 4% perf-profile.children.cycles-pp.__mod_lruvec_state
0.13 ± 7% +0.0 0.16 ± 3% perf-profile.children.cycles-pp.perf_event_pid_type
0.04 ± 45% +0.0 0.07 ± 15% perf-profile.children.cycles-pp.raw_spin_rq_lock_nested
0.12 ± 4% +0.0 0.14 ± 5% perf-profile.children.cycles-pp.mod_objcg_state
0.15 ± 6% +0.0 0.18 ± 7% perf-profile.children.cycles-pp.__task_pid_nr_ns
0.14 ± 4% +0.0 0.17 ± 7% perf-profile.children.cycles-pp.__mod_lruvec_page_state
0.08 ± 9% +0.0 0.10 ± 7% perf-profile.children.cycles-pp.check_extent_data_item
0.04 ± 71% +0.0 0.07 ± 10% perf-profile.children.cycles-pp.kblockd_mod_delayed_work_on
0.04 ± 71% +0.0 0.07 ± 10% perf-profile.children.cycles-pp.mod_delayed_work_on
0.03 ± 70% +0.0 0.07 ± 12% perf-profile.children.cycles-pp.cpumask_next_and
0.50 ± 4% +0.0 0.54 ± 3% perf-profile.children.cycles-pp.btrfs_comp_cpu_keys
0.06 ± 14% +0.1 0.13 ± 13% perf-profile.children.cycles-pp.write_all_supers
1.58 ± 3% +0.1 1.69 ± 5% perf-profile.children.cycles-pp.read_block_for_search
0.65 ± 6% +0.1 0.77 ± 7% perf-profile.children.cycles-pp.ktime_get
0.46 ± 3% +0.2 0.65 ± 9% perf-profile.children.cycles-pp.find_busiest_group
0.45 ± 3% +0.2 0.64 ± 8% perf-profile.children.cycles-pp.update_sd_lb_stats
0.54 ± 3% +0.2 0.75 ± 8% perf-profile.children.cycles-pp.load_balance
0.64 ± 3% +0.2 0.88 ± 7% perf-profile.children.cycles-pp.newidle_balance
1.28 ± 3% +0.3 1.58 ± 3% perf-profile.children.cycles-pp.pick_next_task_fair
5.12 ± 8% +0.7 5.80 ± 4% perf-profile.children.cycles-pp.btrfs_work_helper
5.60 ± 5% +0.7 6.32 ± 3% perf-profile.children.cycles-pp.process_one_work
5.91 ± 5% +0.7 6.66 ± 3% perf-profile.children.cycles-pp.worker_thread
6.00 ± 5% +0.8 6.76 ± 3% perf-profile.children.cycles-pp.ret_from_fork
6.00 ± 5% +0.8 6.75 ± 3% perf-profile.children.cycles-pp.kthread
2.84 ± 3% +0.9 3.78 ± 15% perf-profile.children.cycles-pp._raw_spin_lock_irq
11.78 ± 3% +1.3 13.04 ± 2% perf-profile.children.cycles-pp.btrfs_real_readdir
12.34 ± 3% +1.3 13.63 ± 2% perf-profile.children.cycles-pp.iterate_dir
12.39 ± 3% +1.3 13.70 ± 2% perf-profile.children.cycles-pp.__x64_sys_getdents64
12.76 ± 3% +1.3 14.11 ± 2% perf-profile.children.cycles-pp.telldir
11.23 ± 15% -4.4 6.82 ± 10% perf-profile.self.cycles-pp.osq_lock
0.92 ± 9% -0.2 0.73 ± 14% perf-profile.self.cycles-pp.mutex_spin_on_owner
0.13 ± 5% +0.0 0.15 ± 7% perf-profile.self.cycles-pp.__task_pid_nr_ns
0.47 ± 5% +0.0 0.51 ± 3% perf-profile.self.cycles-pp.btrfs_comp_cpu_keys
0.34 ± 4% +0.1 0.48 ± 8% perf-profile.self.cycles-pp.update_sd_lb_stats
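For readers recomputing the %change column, here is a minimal sketch. It assumes that the fxmark and system-metric rows report a relative change between the two commits, while the perf-profile rows (which carry no % sign) appear to report an absolute difference in percentage points of cycles. The helper names are hypothetical and not part of lkp-tests. Because the table prints rounded means, recomputed values can differ from the printed ones in the last digit.

```python
def relative_change(base: float, new: float) -> float:
    """Relative change in percent, as assumed for rows such as works/sec."""
    return (new - base) / base * 100.0


def point_delta(base: float, new: float) -> float:
    """Absolute difference in percentage points, as assumed for perf-profile rows."""
    return new - base


# works/sec row: 1150 (base commit 88d2beec7e) vs 1333 (commit 259c4b96d7)
works_change = relative_change(1150, 1333)

# perf-profile.calltrace.cycles-pp.fsync row: 9.28 vs 6.73
fsync_delta = point_delta(9.28, 6.73)

print(f"works/sec: {works_change:+.1f}%")
print(f"fsync cycles: {fsync_delta:+.1f} points")
```

Both recomputed values land close to the reported +15.8% and -2.6, which is consistent with the two interpretations assumed above.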
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://01.org/lkp