2021-04-13 07:46:13

by kernel test robot

[permalink] [raw]
Subject: [btrfs] ab12313a9f: fsmark.files_per_sec 21.3% improvement



Greeting,

FYI, we noticed a 21.3% improvement of fsmark.files_per_sec due to commit:


commit: ab12313a9f56b939529abc80ac26bedefb3d5b62 ("btrfs: avoid logging new ancestor inodes when logging new inode")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


in testcase: fsmark
on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 512G memory
with following parameters:

iterations: 1x
nr_threads: 32t
disk: 1SSD
fs: btrfs
filesize: 8K
test_size: 400M
sync_method: fsyncBeforeClose
nr_directories: 16d
nr_files_per_directory: 256fpd
cpufreq_governor: performance
ucode: 0x5003006

test-description: The fsmark is a file system benchmark to test synchronous write workloads, for example, mail servers workload.
test-url: https://sourceforge.net/projects/fsmark/





Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml
bin/lkp run compatible-job.yaml

=========================================================================================
compiler/cpufreq_governor/disk/filesize/fs/iterations/kconfig/nr_directories/nr_files_per_directory/nr_threads/rootfs/sync_method/tbox_group/test_size/testcase/ucode:
gcc-9/performance/1SSD/8K/btrfs/1x/x86_64-rhel-8.3/16d/256fpd/32t/debian-10.4-x86_64-20200603.cgz/fsyncBeforeClose/lkp-csl-2sp7/400M/fsmark/0x5003006

commit:
e593e54ed1 ("btrfs: stop setting nbytes when filling inode item for logging")
ab12313a9f ("btrfs: avoid logging new ancestor inodes when logging new inode")

e593e54ed1f643f5 ab12313a9f56b939529abc80ac2
---------------- ---------------------------
%stddev %change %stddev
\ | \
13864 ? 4% +21.3% 16811 fsmark.files_per_sec
1582810 ? 7% -12.2% 1390437 fsmark.time.file_system_outputs
2918 ? 17% +41.7% 4134 ? 2% fsmark.time.involuntary_context_switches
18.72 ? 6% -20.2% 14.94 ? 7% fsmark.time.system_time
252201 ? 25% -41.0% 148775 ? 42% numa-meminfo.node1.Inactive
137.18 ? 6% -12.7% 119.72 ? 10% sched_debug.cfs_rq:/.util_est_enqueued.stddev
4.671e+08 ? 6% -44.7% 2.583e+08 ? 46% cpuidle.C1E.time
1469142 ? 9% -28.9% 1044121 ? 16% cpuidle.C1E.usage
87889 ? 28% -51.9% 42244 ? 9% meminfo.Active
86382 ? 28% -52.8% 40750 ? 10% meminfo.Active(file)
5.12 ? 84% +119.1% 11.21 ? 5% perf-sched.wait_time.max.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
2.57 ? 2% +9.7% 2.82 ? 6% perf-sched.wait_time.max.ms.smpboot_thread_fn.kthread.ret_from_fork
17799 ? 7% +21.1% 21549 slabinfo.kmalloc-rcl-64.active_objs
18199 ? 5% +19.7% 21784 slabinfo.kmalloc-rcl-64.num_objs
22713 ? 29% -54.9% 10248 ? 10% proc-vmstat.nr_active_file
197974 ? 7% -11.1% 175943 proc-vmstat.nr_dirtied
197671 ? 8% -10.9% 176107 proc-vmstat.nr_written
22713 ? 29% -54.9% 10248 ? 10% proc-vmstat.nr_zone_active_file
469007 ? 3% -4.8% 446448 proc-vmstat.pgalloc_normal
801996 ? 8% -10.6% 716871 proc-vmstat.pgpgout
7210 ? 9% +15.3% 8314 ? 11% perf-stat.i.minor-faults
564621 ? 18% +29.9% 733408 ? 10% perf-stat.i.node-stores
7211 ? 9% +15.3% 8315 ? 11% perf-stat.i.page-faults
5901 ? 6% +11.4% 6574 ? 8% perf-stat.ps.minor-faults
460541 ? 16% +26.1% 580842 ? 11% perf-stat.ps.node-stores
5902 ? 6% +11.4% 6574 ? 8% perf-stat.ps.page-faults
5.975e+10 ? 16% -17.4% 4.934e+10 ? 11% perf-stat.total.instructions
46.25 ? 27% -22.2 24.02 ? 46% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
10.15 ? 93% -10.2 0.00 perf-profile.calltrace.cycles-pp.proc_pid_readdir.iterate_dir.__x64_sys_getdents64.do_syscall_64.entry_SYSCALL_64_after_hwframe
10.15 ? 93% -9.2 0.93 ?223% perf-profile.calltrace.cycles-pp.__x64_sys_getdents64.do_syscall_64.entry_SYSCALL_64_after_hwframe
10.15 ? 93% -9.2 0.93 ?223% perf-profile.calltrace.cycles-pp.iterate_dir.__x64_sys_getdents64.do_syscall_64.entry_SYSCALL_64_after_hwframe
8.37 ? 93% -7.3 1.11 ?223% perf-profile.calltrace.cycles-pp.do_group_exit.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe
8.37 ? 93% -7.3 1.11 ?223% perf-profile.calltrace.cycles-pp.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe
8.37 ? 93% -7.3 1.11 ?223% perf-profile.calltrace.cycles-pp.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe
5.84 ? 73% -5.8 0.00 perf-profile.calltrace.cycles-pp.next_tgid.proc_pid_readdir.iterate_dir.__x64_sys_getdents64.do_syscall_64
6.89 ? 80% -5.1 1.75 ?148% perf-profile.calltrace.cycles-pp.mmput.do_exit.do_group_exit.get_signal.arch_do_signal_or_restart
6.89 ? 80% -5.1 1.75 ?148% perf-profile.calltrace.cycles-pp.exit_mmap.mmput.do_exit.do_group_exit.get_signal
6.88 ?109% -4.7 2.21 ?144% perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
6.88 ?109% -4.7 2.21 ?144% perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
4.65 ?102% -4.6 0.00 perf-profile.calltrace.cycles-pp.pid_nr_ns.next_tgid.proc_pid_readdir.iterate_dir.__x64_sys_getdents64
4.52 ?102% -4.5 0.00 perf-profile.calltrace.cycles-pp.__schedule.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group
4.52 ?102% -4.5 0.00 perf-profile.calltrace.cycles-pp.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64
4.52 ?102% -4.5 0.00 perf-profile.calltrace.cycles-pp.load_balance.newidle_balance.pick_next_task_fair.__schedule.do_task_dead
4.52 ?102% -4.5 0.00 perf-profile.calltrace.cycles-pp.pick_next_task_fair.__schedule.do_task_dead.do_exit.do_group_exit
4.52 ?102% -4.5 0.00 perf-profile.calltrace.cycles-pp.newidle_balance.pick_next_task_fair.__schedule.do_task_dead.do_exit
4.52 ?102% -3.9 0.64 ?223% perf-profile.calltrace.cycles-pp.update_sd_lb_stats.find_busiest_group.load_balance.newidle_balance.pick_next_task_fair
4.52 ?102% -3.9 0.64 ?223% perf-profile.calltrace.cycles-pp.find_busiest_group.load_balance.newidle_balance.pick_next_task_fair.__schedule
5.61 ? 74% -3.9 1.75 ?148% perf-profile.calltrace.cycles-pp.unmap_vmas.exit_mmap.mmput.do_exit.do_group_exit
5.61 ? 74% -3.9 1.75 ?148% perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.exit_mmap.mmput.do_exit
5.61 ? 74% -3.9 1.75 ?148% perf-profile.calltrace.cycles-pp.zap_pte_range.unmap_page_range.unmap_vmas.exit_mmap.mmput
10.15 ? 93% -10.2 0.00 perf-profile.children.cycles-pp.proc_pid_readdir
10.15 ? 93% -9.2 0.93 ?223% perf-profile.children.cycles-pp.__x64_sys_getdents64
10.15 ? 93% -9.2 0.93 ?223% perf-profile.children.cycles-pp.iterate_dir
8.37 ? 93% -7.3 1.11 ?223% perf-profile.children.cycles-pp.__x64_sys_exit_group
5.84 ? 73% -5.8 0.00 perf-profile.children.cycles-pp.next_tgid
8.17 ? 78% -5.3 2.86 ?170% perf-profile.children.cycles-pp.mmput
8.17 ? 78% -5.3 2.86 ?170% perf-profile.children.cycles-pp.exit_mmap
4.65 ?102% -4.6 0.00 perf-profile.children.cycles-pp.pid_nr_ns
4.52 ?102% -4.5 0.00 perf-profile.children.cycles-pp.do_task_dead
4.52 ?102% -3.9 0.64 ?223% perf-profile.children.cycles-pp.update_sd_lb_stats
4.52 ?102% -3.9 0.64 ?223% perf-profile.children.cycles-pp.load_balance
4.52 ?102% -3.9 0.64 ?223% perf-profile.children.cycles-pp.find_busiest_group
4.52 ?102% -3.9 0.64 ?223% perf-profile.children.cycles-pp.pick_next_task_fair
4.52 ?102% -3.9 0.64 ?223% perf-profile.children.cycles-pp.newidle_balance
5.61 ? 74% -3.9 1.75 ?148% perf-profile.children.cycles-pp.unmap_vmas
5.61 ? 74% -3.9 1.75 ?148% perf-profile.children.cycles-pp.unmap_page_range
5.61 ? 74% -3.9 1.75 ?148% perf-profile.children.cycles-pp.zap_pte_range
4.08 ?100% -3.2 0.93 ?223% perf-profile.children.cycles-pp.filename_lookup
4.08 ?100% -3.2 0.93 ?223% perf-profile.children.cycles-pp.path_lookupat
4.08 ?100% -3.2 0.93 ?223% perf-profile.children.cycles-pp.seq_printf
4.08 ?100% -3.2 0.93 ?223% perf-profile.children.cycles-pp.seq_vprintf
3.99 ?101% -2.6 1.39 ?223% perf-profile.children.cycles-pp.__lookup_slow
4.52 ?102% -0.9 3.66 ?150% perf-profile.children.cycles-pp.__schedule
4.52 ?102% -0.8 3.77 ?148% perf-profile.children.cycles-pp._raw_spin_lock
4.65 ?102% -4.6 0.00 perf-profile.self.cycles-pp.pid_nr_ns
4.32 ?104% -3.2 1.11 ?223% perf-profile.self.cycles-pp.zap_pte_range
77407 -7.1% 71910 interrupts.CAL:Function_call_interrupts
12882 ? 18% -18.6% 10485 ? 2% interrupts.CPU0.LOC:Local_timer_interrupts
12863 ? 18% -18.7% 10458 ? 2% interrupts.CPU1.LOC:Local_timer_interrupts
12845 ? 18% -18.4% 10483 ? 2% interrupts.CPU10.LOC:Local_timer_interrupts
12846 ? 18% -18.8% 10436 ? 2% interrupts.CPU11.LOC:Local_timer_interrupts
12800 ? 18% -19.1% 10352 ? 3% interrupts.CPU12.LOC:Local_timer_interrupts
12832 ? 18% -18.7% 10437 ? 2% interrupts.CPU13.LOC:Local_timer_interrupts
12829 ? 18% -18.7% 10424 ? 2% interrupts.CPU14.LOC:Local_timer_interrupts
12812 ? 19% -18.6% 10431 ? 2% interrupts.CPU15.LOC:Local_timer_interrupts
12783 ? 19% -18.5% 10421 ? 2% interrupts.CPU16.LOC:Local_timer_interrupts
12811 ? 18% -18.7% 10418 ? 2% interrupts.CPU17.LOC:Local_timer_interrupts
12820 ? 19% -18.6% 10434 ? 2% interrupts.CPU18.LOC:Local_timer_interrupts
12772 ? 19% -17.6% 10522 ? 2% interrupts.CPU19.LOC:Local_timer_interrupts
12859 ? 18% -18.6% 10462 ? 2% interrupts.CPU2.LOC:Local_timer_interrupts
12831 ? 19% -18.9% 10405 ? 2% interrupts.CPU20.LOC:Local_timer_interrupts
12822 ? 18% -19.2% 10366 ? 3% interrupts.CPU21.LOC:Local_timer_interrupts
939.67 ? 19% -27.4% 682.67 ? 10% interrupts.CPU22.CAL:Function_call_interrupts
12829 ? 18% -18.7% 10424 ? 2% interrupts.CPU22.LOC:Local_timer_interrupts
12948 ? 20% -19.5% 10423 ? 2% interrupts.CPU23.LOC:Local_timer_interrupts
12829 ? 18% -20.3% 10221 ? 6% interrupts.CPU24.LOC:Local_timer_interrupts
12798 ? 18% -19.8% 10261 ? 6% interrupts.CPU25.LOC:Local_timer_interrupts
12833 ? 19% -19.9% 10278 ? 5% interrupts.CPU26.LOC:Local_timer_interrupts
12965 ? 18% -21.0% 10248 ? 5% interrupts.CPU27.LOC:Local_timer_interrupts
12817 ? 19% -19.1% 10367 ? 6% interrupts.CPU28.LOC:Local_timer_interrupts
12777 ? 19% -19.7% 10259 ? 5% interrupts.CPU29.LOC:Local_timer_interrupts
12839 ? 18% -18.4% 10474 ? 2% interrupts.CPU3.LOC:Local_timer_interrupts
12807 ? 19% -19.8% 10267 ? 5% interrupts.CPU30.LOC:Local_timer_interrupts
12818 ? 19% -20.1% 10246 ? 5% interrupts.CPU31.LOC:Local_timer_interrupts
12794 ? 19% -19.9% 10246 ? 5% interrupts.CPU32.LOC:Local_timer_interrupts
12800 ? 19% -20.0% 10234 ? 5% interrupts.CPU33.LOC:Local_timer_interrupts
12831 ? 19% -20.2% 10239 ? 5% interrupts.CPU34.LOC:Local_timer_interrupts
12831 ? 18% -20.2% 10238 ? 5% interrupts.CPU35.LOC:Local_timer_interrupts
12740 ? 19% -19.7% 10229 ? 5% interrupts.CPU36.LOC:Local_timer_interrupts
12796 ? 18% -19.9% 10252 ? 5% interrupts.CPU37.LOC:Local_timer_interrupts
12831 ? 19% -20.6% 10182 ? 5% interrupts.CPU38.LOC:Local_timer_interrupts
12829 ? 19% -20.4% 10215 ? 5% interrupts.CPU39.LOC:Local_timer_interrupts
12879 ? 18% -18.9% 10447 ? 2% interrupts.CPU4.LOC:Local_timer_interrupts
12832 ? 19% -19.6% 10316 ? 6% interrupts.CPU40.LOC:Local_timer_interrupts
12773 ? 19% -20.0% 10215 ? 5% interrupts.CPU41.LOC:Local_timer_interrupts
12805 ? 18% -20.3% 10200 ? 5% interrupts.CPU42.LOC:Local_timer_interrupts
12735 ? 19% -19.7% 10221 ? 5% interrupts.CPU43.LOC:Local_timer_interrupts
12834 ? 18% -20.3% 10235 ? 5% interrupts.CPU44.LOC:Local_timer_interrupts
12819 ? 19% -20.6% 10183 ? 5% interrupts.CPU45.LOC:Local_timer_interrupts
12826 ? 19% -20.3% 10228 ? 5% interrupts.CPU46.LOC:Local_timer_interrupts
12813 ? 18% -20.0% 10245 ? 5% interrupts.CPU47.LOC:Local_timer_interrupts
12818 ? 18% -18.5% 10452 ? 2% interrupts.CPU48.LOC:Local_timer_interrupts
12803 ? 18% -17.8% 10520 ? 2% interrupts.CPU49.LOC:Local_timer_interrupts
12893 ? 18% -18.9% 10461 ? 2% interrupts.CPU5.LOC:Local_timer_interrupts
12805 ? 18% -18.4% 10455 ? 2% interrupts.CPU50.LOC:Local_timer_interrupts
12836 ? 18% -18.4% 10475 ? 2% interrupts.CPU51.LOC:Local_timer_interrupts
12830 ? 18% -18.3% 10477 ? 2% interrupts.CPU52.LOC:Local_timer_interrupts
12838 ? 18% -18.5% 10462 ? 2% interrupts.CPU53.LOC:Local_timer_interrupts
12827 ? 18% -19.0% 10392 ? 3% interrupts.CPU54.LOC:Local_timer_interrupts
12830 ? 18% -18.0% 10524 ? 2% interrupts.CPU55.LOC:Local_timer_interrupts
12825 ? 18% -18.5% 10454 ? 2% interrupts.CPU56.LOC:Local_timer_interrupts
12826 ? 18% -18.6% 10445 ? 2% interrupts.CPU57.LOC:Local_timer_interrupts
12836 ? 18% -18.6% 10453 ? 2% interrupts.CPU58.LOC:Local_timer_interrupts
12841 ? 18% -18.6% 10455 ? 2% interrupts.CPU59.LOC:Local_timer_interrupts
12845 ? 18% -18.5% 10472 ? 2% interrupts.CPU6.LOC:Local_timer_interrupts
12805 ? 18% -18.2% 10473 ? 2% interrupts.CPU60.LOC:Local_timer_interrupts
12842 ? 18% -17.8% 10562 ? 2% interrupts.CPU61.LOC:Local_timer_interrupts
12843 ? 18% -18.3% 10497 ? 2% interrupts.CPU62.LOC:Local_timer_interrupts
12846 ? 18% -18.4% 10485 ? 2% interrupts.CPU63.LOC:Local_timer_interrupts
12846 ? 18% -18.6% 10457 ? 2% interrupts.CPU64.LOC:Local_timer_interrupts
12842 ? 18% -18.3% 10491 ? 2% interrupts.CPU65.LOC:Local_timer_interrupts
12844 ? 18% -18.5% 10462 ? 2% interrupts.CPU66.LOC:Local_timer_interrupts
12832 ? 18% -18.3% 10481 ? 2% interrupts.CPU67.LOC:Local_timer_interrupts
12837 ? 18% -18.6% 10446 ? 2% interrupts.CPU68.LOC:Local_timer_interrupts
12806 ? 18% -18.3% 10468 ? 2% interrupts.CPU69.LOC:Local_timer_interrupts
12842 ? 18% -18.6% 10449 ? 2% interrupts.CPU7.LOC:Local_timer_interrupts
12839 ? 18% -17.6% 10575 ? 4% interrupts.CPU70.LOC:Local_timer_interrupts
12816 ? 18% -18.3% 10468 ? 2% interrupts.CPU71.LOC:Local_timer_interrupts
12830 ? 18% -20.2% 10233 ? 5% interrupts.CPU72.LOC:Local_timer_interrupts
12777 ? 18% -19.8% 10243 ? 5% interrupts.CPU73.LOC:Local_timer_interrupts
12839 ? 18% -20.2% 10248 ? 5% interrupts.CPU74.LOC:Local_timer_interrupts
12930 ? 18% -20.7% 10247 ? 5% interrupts.CPU75.LOC:Local_timer_interrupts
12853 ? 18% -20.2% 10254 ? 5% interrupts.CPU76.LOC:Local_timer_interrupts
12882 ? 18% -20.4% 10249 ? 5% interrupts.CPU77.LOC:Local_timer_interrupts
12841 ? 18% -20.2% 10246 ? 5% interrupts.CPU78.LOC:Local_timer_interrupts
12839 ? 18% -20.7% 10179 ? 6% interrupts.CPU79.LOC:Local_timer_interrupts
12852 ? 18% -18.7% 10451 ? 2% interrupts.CPU8.LOC:Local_timer_interrupts
12840 ? 18% -20.1% 10262 ? 5% interrupts.CPU80.LOC:Local_timer_interrupts
12836 ? 18% -20.1% 10250 ? 5% interrupts.CPU81.LOC:Local_timer_interrupts
12824 ? 18% -20.1% 10252 ? 5% interrupts.CPU82.LOC:Local_timer_interrupts
12974 ? 18% -21.0% 10255 ? 5% interrupts.CPU83.LOC:Local_timer_interrupts
12835 ? 19% -20.0% 10273 ? 5% interrupts.CPU84.LOC:Local_timer_interrupts
12834 ? 19% -20.1% 10257 ? 5% interrupts.CPU85.LOC:Local_timer_interrupts
12840 ? 18% -20.1% 10255 ? 5% interrupts.CPU86.LOC:Local_timer_interrupts
12829 ? 18% -20.2% 10242 ? 5% interrupts.CPU87.LOC:Local_timer_interrupts
12847 ? 18% -20.2% 10248 ? 5% interrupts.CPU88.LOC:Local_timer_interrupts
12847 ? 18% -20.1% 10271 ? 5% interrupts.CPU89.LOC:Local_timer_interrupts
12844 ? 18% -18.3% 10491 ? 2% interrupts.CPU9.LOC:Local_timer_interrupts
12843 ? 18% -20.1% 10264 ? 5% interrupts.CPU90.LOC:Local_timer_interrupts
12831 ? 18% -20.1% 10256 ? 5% interrupts.CPU91.LOC:Local_timer_interrupts
12843 ? 18% -20.1% 10265 ? 5% interrupts.CPU92.LOC:Local_timer_interrupts
12828 ? 19% -20.0% 10264 ? 5% interrupts.CPU93.LOC:Local_timer_interrupts
12822 ? 19% -20.0% 10253 ? 5% interrupts.CPU94.LOC:Local_timer_interrupts
12858 ? 18% -20.0% 10282 ? 5% interrupts.CPU95.LOC:Local_timer_interrupts
1231989 ? 18% -19.3% 993963 ? 3% interrupts.LOC:Local_timer_interrupts



fsmark.files_per_sec

17500 +-------------------------------------------------------------------+
17000 |-O OO O O O O O |
| O O O O O O O O O O |
16500 |-+ O O O O O O O O O O |
16000 |-+ O |
| O |
15500 |-+ |
15000 |-+ + + |
14500 |-+ .+ + :: +. + : + |
| + : :: : : + +.+ + + :: + : + |
14000 |:+: : : : : : +.+ :+ : :: : : : : + + |
13500 |:+: : : : + : + : : : : :: : + : + |
| + +.: + :: : : : : + +. :+ : + |
13000 |-+ + + +.: +: + + |
12500 +-------------------------------------------------------------------+


[*] bisect-good sample
[O] bisect-bad sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/[email protected] Intel Corporation

Thanks,
Oliver Sang


Attachments:
(No filename) (20.57 kB)
config-5.11.0-rc7-00079-gab12313a9f56 (175.07 kB)
job-script (8.49 kB)
job.yaml (5.60 kB)
reproduce (910.00 B)
Download all attachments