2021-06-03 01:38:39

by kernel test robot

Subject: [fanotify] a8b98c808e: stress-ng.fanotify.ops_per_sec 32.2% improvement



Greetings,

FYI, we noticed a 32.2% improvement of stress-ng.fanotify.ops_per_sec due to commit:


commit: a8b98c808eab3ec8f1b5a64be967b0f4af4cae43 ("fanotify: fix permission model of unprivileged group")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


in testcase: stress-ng
on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 512G memory
with the following parameters:

nr_threads: 10%
disk: 1HDD
testtime: 60s
fs: ext4
class: filesystem
test: fanotify
cpufreq_governor: performance
ucode: 0x5003006






Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
bin/lkp run generated-yaml-file

=========================================================================================
class/compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime/ucode:
filesystem/gcc-9/performance/1HDD/ext4/x86_64-rhel-8.3/10%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp7/fanotify/stress-ng/60s/0x5003006

commit:
b577750e41 ("MAINTAINERS: Add Matthew Bobrowski as a reviewer")
a8b98c808e ("fanotify: fix permission model of unprivileged group")

b577750e4157050e a8b98c808eab3ec8f1b5a64be96
---------------- ---------------------------
%stddev %change %stddev
\ | \
86763162 +32.2% 1.147e+08 stress-ng.fanotify.ops
1445480 +32.2% 1910539 stress-ng.fanotify.ops_per_sec
28274341 +14.5% 32376584 ± 4% stress-ng.time.file_system_outputs
14639 ± 2% +28.3% 18779 ± 2% stress-ng.time.involuntary_context_switches
1603 +2.5% 1644 stress-ng.time.percent_of_cpu_this_job_got
989.77 +2.2% 1011 stress-ng.time.system_time
7.76 +24.2% 9.64 ± 3% stress-ng.time.user_time
158230 ± 3% -59.1% 64735 ± 62% stress-ng.time.voluntary_context_switches
17.33 +4.0% 18.02 iostat.cpu.system
0.01 ± 12% -48.7% 0.01 ± 73% perf-sched.sch_delay.max.ms.do_syslog.part.0.kmsg_read.vfs_read
187.32 +2.5% 191.94 turbostat.PkgWatt
173408 ± 6% +9.3% 189469 ± 4% interrupts.NMI:Non-maskable_interrupts
173408 ± 6% +9.3% 189469 ± 4% interrupts.PMI:Performance_monitoring_interrupts
37466 ± 28% -77.2% 8527 ± 41% numa-meminfo.node0.Dirty
34905 ± 29% -78.6% 7455 ± 42% numa-meminfo.node0.Inactive(file)
0.02 ± 12% -0.0 0.02 ± 14% mpstat.cpu.all.iowait%
0.70 ± 5% +0.2 0.88 ± 3% mpstat.cpu.all.soft%
0.18 +0.0 0.21 ± 3% mpstat.cpu.all.usr%
82.00 -1.2% 81.00 vmstat.cpu.id
6603 -19.9% 5287 ± 19% vmstat.memory.buff
7288 ± 2% -36.6% 4620 ± 26% vmstat.system.cs
3265335 ± 7% +19.3% 3896092 ± 5% numa-numastat.node0.local_node
3346695 ± 7% +17.3% 3926041 ± 6% numa-numastat.node0.numa_hit
81364 ± 14% -63.2% 29955 ± 53% numa-numastat.node0.other_node
3548555 ± 6% +23.1% 4367587 ± 3% numa-numastat.node1.local_node
3553822 ± 6% +24.5% 4424271 ± 4% numa-numastat.node1.numa_hit
5274 ±218% +974.8% 56690 ± 28% numa-numastat.node1.other_node
9280 ± 28% -77.4% 2097 ± 43% numa-vmstat.node0.nr_dirty
8654 ± 29% -78.8% 1836 ± 43% numa-vmstat.node0.nr_inactive_file
8654 ± 29% -78.8% 1837 ± 43% numa-vmstat.node0.nr_zone_inactive_file
9286 ± 29% -76.7% 2163 ± 43% numa-vmstat.node0.nr_zone_write_pending
84889 ± 13% -61.7% 32478 ± 42% numa-vmstat.node0.numa_other
13513 ± 88% +387.9% 65928 ± 20% numa-vmstat.node1.numa_other
9173 ± 6% -11.5% 8118 ± 5% softirqs.CPU18.SCHED
9438 ± 6% -15.9% 7938 ± 12% softirqs.CPU38.SCHED
5148 ± 24% +123.7% 11518 ± 35% softirqs.CPU42.RCU
3794 ± 10% +86.4% 7075 ± 47% softirqs.CPU87.RCU
6193 ± 33% +58.8% 9833 ± 26% softirqs.CPU93.RCU
805948 ± 2% +15.6% 931978 softirqs.RCU
6787 -19.3% 5480 ± 19% meminfo.Active(file)
6597 -20.0% 5276 ± 19% meminfo.Buffers
61060 ± 5% -71.6% 17318 ± 62% meminfo.Dirty
358019 -14.2% 307304 ± 5% meminfo.Inactive
57195 ± 5% -72.4% 15781 ± 64% meminfo.Inactive(file)
135110 ± 2% -17.9% 110864 ± 3% meminfo.KReclaimable
135110 ± 2% -17.9% 110864 ± 3% meminfo.SReclaimable
1697 -19.3% 1370 ± 18% proc-vmstat.nr_active_file
3536717 +14.5% 4048711 ± 4% proc-vmstat.nr_dirtied
15154 ± 5% -71.4% 4341 ± 63% proc-vmstat.nr_dirty
286580 -3.7% 275906 proc-vmstat.nr_file_pages
14194 ± 5% -72.2% 3952 ± 64% proc-vmstat.nr_inactive_file
33734 ± 2% -18.0% 27679 ± 3% proc-vmstat.nr_slab_reclaimable
1697 -19.3% 1370 ± 18% proc-vmstat.nr_zone_active_file
14194 ± 5% -72.2% 3952 ± 64% proc-vmstat.nr_zone_inactive_file
15107 ± 5% -70.5% 4456 ± 62% proc-vmstat.nr_zone_write_pending
6916501 +20.9% 8365370 ± 2% proc-vmstat.numa_hit
6829859 +21.2% 8278720 ± 2% proc-vmstat.numa_local
6112 ± 5% +8.8% 6650 ± 3% proc-vmstat.pgactivate
10102561 +28.1% 12945356 ± 2% proc-vmstat.pgalloc_normal
9959455 +28.5% 12802727 ± 3% proc-vmstat.pgfree
47144 ± 7% -39.7% 28440 ± 6% slabinfo.buffer_head.active_objs
1218 ± 7% -39.1% 741.67 ± 6% slabinfo.buffer_head.active_slabs
47555 ± 7% -39.2% 28931 ± 6% slabinfo.buffer_head.num_objs
1218 ± 7% -39.1% 741.67 ± 6% slabinfo.buffer_head.num_slabs
129606 -14.5% 110804 ± 2% slabinfo.dentry.active_objs
3127 -14.5% 2674 ± 2% slabinfo.dentry.active_slabs
131378 -14.5% 112353 ± 2% slabinfo.dentry.num_objs
3127 -14.5% 2674 ± 2% slabinfo.dentry.num_slabs
106711 ± 8% -36.6% 67653 ± 6% slabinfo.ext4_extent_status.active_objs
1048 ± 8% -36.6% 664.50 ± 6% slabinfo.ext4_extent_status.active_slabs
106945 ± 8% -36.6% 67834 ± 6% slabinfo.ext4_extent_status.num_objs
1048 ± 8% -36.6% 664.50 ± 6% slabinfo.ext4_extent_status.num_slabs
24635 ± 3% -45.2% 13505 ± 13% slabinfo.ext4_inode_cache.active_objs
947.00 ± 4% -48.5% 487.33 ± 14% slabinfo.ext4_inode_cache.active_slabs
26527 ± 4% -48.5% 13658 ± 14% slabinfo.ext4_inode_cache.num_objs
947.00 ± 4% -48.5% 487.33 ± 14% slabinfo.ext4_inode_cache.num_slabs
114430 ± 5% +13.1% 129428 ± 3% slabinfo.filp.active_objs
3583 ± 5% +13.1% 4053 ± 3% slabinfo.filp.active_slabs
114684 ± 5% +13.1% 129710 ± 3% slabinfo.filp.num_objs
3583 ± 5% +13.1% 4053 ± 3% slabinfo.filp.num_slabs
2133 ± 4% -27.7% 1541 ± 22% slabinfo.jbd2_journal_head.active_objs
2133 ± 4% -27.7% 1541 ± 22% slabinfo.jbd2_journal_head.num_objs
1700 ± 5% -10.4% 1523 ± 5% slabinfo.khugepaged_mm_slot.active_objs
1700 ± 5% -10.4% 1523 ± 5% slabinfo.khugepaged_mm_slot.num_objs
991.83 ± 3% +11.9% 1109 ± 4% slabinfo.kmalloc-256.active_slabs
31752 ± 3% +11.9% 35520 ± 4% slabinfo.kmalloc-256.num_objs
991.83 ± 3% +11.9% 1109 ± 4% slabinfo.kmalloc-256.num_slabs
9850 ± 3% -30.4% 6852 ± 13% slabinfo.kmalloc-rcl-512.active_objs
9933 ± 3% -30.3% 6925 ± 13% slabinfo.kmalloc-rcl-512.num_objs
89858 ± 3% -15.7% 75727 ± 4% slabinfo.vmap_area.active_objs
1422 ± 3% -16.3% 1190 ± 4% slabinfo.vmap_area.active_slabs
91033 ± 3% -16.3% 76222 ± 4% slabinfo.vmap_area.num_objs
1422 ± 3% -16.3% 1190 ± 4% slabinfo.vmap_area.num_slabs
12.02 ± 8% -10.3% 10.78 ± 4% perf-stat.i.MPKI
3.726e+09 +16.7% 4.347e+09 ± 2% perf-stat.i.branch-instructions
27267896 +7.6% 29328635 perf-stat.i.branch-misses
32.59 ± 3% +2.7 35.27 ± 2% perf-stat.i.cache-miss-rate%
72069672 ± 2% +16.9% 84243539 ± 3% perf-stat.i.cache-misses
2.17e+08 ± 2% +7.9% 2.342e+08 ± 2% perf-stat.i.cache-references
7282 -38.1% 4505 ± 27% perf-stat.i.context-switches
2.65 -11.3% 2.35 ± 2% perf-stat.i.cpi
4.891e+10 +4.4% 5.105e+10 perf-stat.i.cpu-cycles
194.63 +9.8% 213.71 perf-stat.i.cpu-migrations
737.10 ± 2% -9.4% 667.84 perf-stat.i.cycles-between-cache-misses
749259 ± 18% -35.6% 482663 ± 18% perf-stat.i.dTLB-load-misses
5.399e+09 +17.9% 6.367e+09 ± 2% perf-stat.i.dTLB-loads
2.864e+09 +20.2% 3.443e+09 ± 2% perf-stat.i.dTLB-stores
1.832e+10 +17.2% 2.146e+10 ± 2% perf-stat.i.instructions
1325 +11.6% 1479 perf-stat.i.instructions-per-iTLB-miss
0.38 +11.6% 0.43 ± 2% perf-stat.i.ipc
0.51 +4.4% 0.53 perf-stat.i.metric.GHz
520.49 +14.9% 598.21 ± 2% perf-stat.i.metric.K/sec
127.15 +17.9% 149.92 ± 2% perf-stat.i.metric.M/sec
26413268 +18.1% 31202417 ± 3% perf-stat.i.node-load-misses
81.23 +2.4 83.61 perf-stat.i.node-store-miss-rate%
13620737 +22.0% 16617089 ± 4% perf-stat.i.node-store-misses
11.85 ± 2% -7.9% 10.92 ± 3% perf-stat.overall.MPKI
0.73 -0.1 0.68 ± 2% perf-stat.overall.branch-miss-rate%
33.22 ± 3% +2.8 35.97 ± 2% perf-stat.overall.cache-miss-rate%
2.67 -10.9% 2.38 ± 2% perf-stat.overall.cpi
678.95 -10.7% 606.59 ± 2% perf-stat.overall.cycles-between-cache-misses
0.01 ± 19% -0.0 0.01 ± 19% perf-stat.overall.dTLB-load-miss-rate%
0.01 ± 6% -0.0 0.01 ± 9% perf-stat.overall.dTLB-store-miss-rate%
1237 +12.9% 1398 perf-stat.overall.instructions-per-iTLB-miss
0.37 +12.2% 0.42 ± 2% perf-stat.overall.ipc
88.23 +1.4 89.62 perf-stat.overall.node-load-miss-rate%
82.89 +2.8 85.72 perf-stat.overall.node-store-miss-rate%
3.668e+09 +16.6% 4.278e+09 ± 2% perf-stat.ps.branch-instructions
26840788 +7.5% 28863296 perf-stat.ps.branch-misses
70936457 ± 2% +16.9% 82908277 ± 3% perf-stat.ps.cache-misses
2.137e+08 ± 2% +7.9% 2.305e+08 ± 2% perf-stat.ps.cache-references
7174 ± 2% -38.0% 4446 ± 28% perf-stat.ps.context-switches
4.814e+10 +4.4% 5.024e+10 perf-stat.ps.cpu-cycles
191.65 +9.9% 210.65 perf-stat.ps.cpu-migrations
737105 ± 18% -35.7% 474008 ± 18% perf-stat.ps.dTLB-load-misses
5.315e+09 +17.9% 6.267e+09 ± 2% perf-stat.ps.dTLB-loads
2.819e+09 +20.2% 3.389e+09 ± 2% perf-stat.ps.dTLB-stores
14569596 ± 2% +3.7% 15104767 perf-stat.ps.iTLB-load-misses
1.803e+10 +17.1% 2.112e+10 ± 2% perf-stat.ps.instructions
25998185 +18.1% 30704499 ± 3% perf-stat.ps.node-load-misses
13406477 +22.0% 16351829 ± 4% perf-stat.ps.node-store-misses
1.156e+12 +15.4% 1.334e+12 ± 2% perf-stat.total.instructions
9.17 ± 10% -4.6 4.62 ± 33% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.fsnotify_add_event.fanotify_handle_event.fsnotify
10.31 ± 9% -4.1 6.20 ± 19% perf-profile.calltrace.cycles-pp._raw_spin_lock.fsnotify_add_event.fanotify_handle_event.fsnotify.__fsnotify_parent
7.46 ± 8% -2.2 5.29 ± 14% perf-profile.calltrace.cycles-pp.fsnotify_add_event.fanotify_handle_event.fsnotify.__fsnotify_parent.do_sys_openat2
1.04 ± 6% -0.7 0.32 ±101% perf-profile.calltrace.cycles-pp.fanotify_merge.fsnotify_add_event.fanotify_handle_event.fsnotify.__fsnotify_parent
0.68 ± 7% +0.2 0.92 ± 9% perf-profile.calltrace.cycles-pp.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.47 ± 44% +0.3 0.73 ± 8% perf-profile.calltrace.cycles-pp.filp_close.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.70 ± 8% +0.3 0.97 ± 9% perf-profile.calltrace.cycles-pp.dput.path_put.fanotify_free_event.fanotify_read.vfs_read
0.71 ± 8% +0.3 0.99 ± 9% perf-profile.calltrace.cycles-pp.path_put.fanotify_free_event.fanotify_read.vfs_read.ksys_read
0.52 ± 45% +0.3 0.85 ± 10% perf-profile.calltrace.cycles-pp.lockref_put_return.dput.path_put.fanotify_free_event.fanotify_read
0.73 ± 8% +0.4 1.09 ± 8% perf-profile.calltrace.cycles-pp.ext4_file_open.do_dentry_open.dentry_open.copy_event_to_user.fanotify_read
0.51 ± 44% +0.4 0.89 ± 8% perf-profile.calltrace.cycles-pp.fscrypt_file_open.ext4_file_open.do_dentry_open.dentry_open.copy_event_to_user
1.13 ± 11% +0.4 1.52 ± 9% perf-profile.calltrace.cycles-pp._raw_spin_lock.fanotify_read.vfs_read.ksys_read.do_syscall_64
0.18 ±141% +0.4 0.60 ± 6% perf-profile.calltrace.cycles-pp.kmem_cache_free.fanotify_handle_event.fsnotify.__fsnotify_parent.__fput
0.17 ±141% +0.4 0.61 ± 10% perf-profile.calltrace.cycles-pp.fsnotify_destroy_event.fanotify_read.vfs_read.ksys_read.do_syscall_64
0.26 ±100% +0.5 0.72 ± 6% perf-profile.calltrace.cycles-pp.put_pid.fanotify_free_event.fanotify_read.vfs_read.ksys_read
1.25 ± 7% +0.5 1.76 ± 8% perf-profile.calltrace.cycles-pp.fanotify_free_event.fanotify_read.vfs_read.ksys_read.do_syscall_64
1.21 ± 12% +0.5 1.72 ± 10% perf-profile.calltrace.cycles-pp.kmem_cache_alloc.__alloc_file.alloc_empty_file.dentry_open.copy_event_to_user
1.66 ± 10% +0.6 2.26 ± 9% perf-profile.calltrace.cycles-pp.kmem_cache_free.fanotify_read.vfs_read.ksys_read.do_syscall_64
0.00 +0.9 0.92 ± 11% perf-profile.calltrace.cycles-pp._raw_spin_lock.alloc_fd.copy_event_to_user.fanotify_read.vfs_read
0.00 +1.1 1.07 ± 10% perf-profile.calltrace.cycles-pp.alloc_fd.copy_event_to_user.fanotify_read.vfs_read.ksys_read
4.45 ± 6% -4.2 0.28 ± 12% perf-profile.children.cycles-pp.ns_capable_common
4.37 ± 6% -4.1 0.28 ± 12% perf-profile.children.cycles-pp.security_capable
10.27 ± 9% -4.1 6.18 ± 21% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
4.32 ± 7% -4.1 0.27 ± 13% perf-profile.children.cycles-pp.apparmor_capable
15.78 ± 8% -3.8 11.97 ± 14% perf-profile.children.cycles-pp.fsnotify_add_event
12.40 ± 9% -2.9 9.46 ± 14% perf-profile.children.cycles-pp._raw_spin_lock
1.49 ± 7% -0.5 0.96 ± 14% perf-profile.children.cycles-pp.fanotify_merge
0.61 ± 12% -0.4 0.24 ± 40% perf-profile.children.cycles-pp.__x64_sys_ioctl
0.60 ± 12% -0.4 0.24 ± 40% perf-profile.children.cycles-pp.do_vfs_ioctl
0.60 ± 13% -0.4 0.24 ± 39% perf-profile.children.cycles-pp.fanotify_ioctl
0.10 ± 6% +0.0 0.12 ± 8% perf-profile.children.cycles-pp.ext4_map_blocks
0.07 ± 14% +0.0 0.09 ± 9% perf-profile.children.cycles-pp.__fsnotify_alloc_group
0.12 ± 12% +0.0 0.15 ± 5% perf-profile.children.cycles-pp.ext4_ext_map_blocks
0.06 ± 7% +0.0 0.09 ± 7% perf-profile.children.cycles-pp.copy_user_generic_unrolled
0.04 ± 71% +0.0 0.07 ± 7% perf-profile.children.cycles-pp.__check_block_validity
0.03 ± 70% +0.0 0.07 ± 11% perf-profile.children.cycles-pp.alloc_page_buffers
0.06 ± 11% +0.0 0.10 ± 5% perf-profile.children.cycles-pp.rcu_segcblist_enqueue
0.14 ± 9% +0.0 0.18 ± 7% perf-profile.children.cycles-pp._copy_to_user
0.21 ± 12% +0.0 0.25 ± 7% perf-profile.children.cycles-pp.ext4_da_map_blocks
0.13 ± 6% +0.0 0.17 ± 8% perf-profile.children.cycles-pp.ext4_getblk
0.12 ± 10% +0.0 0.17 ± 9% perf-profile.children.cycles-pp.close_fd
0.21 ± 12% +0.0 0.25 ± 7% perf-profile.children.cycles-pp.ext4_da_get_block_prep
0.14 ± 10% +0.1 0.19 ± 9% perf-profile.children.cycles-pp.task_work_add
0.14 ± 12% +0.1 0.20 ± 9% perf-profile.children.cycles-pp.errseq_sample
0.19 ± 11% +0.1 0.25 ± 6% perf-profile.children.cycles-pp.syscall_return_via_sysret
0.19 ± 14% +0.1 0.26 ± 9% perf-profile.children.cycles-pp.page_counter_try_charge
0.12 ± 15% +0.1 0.19 ± 9% perf-profile.children.cycles-pp.fsnotify_peek_first_event
0.31 ± 7% +0.1 0.38 ± 7% perf-profile.children.cycles-pp.__list_add_valid
0.22 ± 14% +0.1 0.29 ± 9% perf-profile.children.cycles-pp.obj_cgroup_charge_pages
0.17 ± 10% +0.1 0.25 ± 8% perf-profile.children.cycles-pp.fanotify_insert_event
0.26 ± 7% +0.1 0.35 ± 9% perf-profile.children.cycles-pp.locks_remove_posix
0.21 ± 10% +0.1 0.29 ± 10% perf-profile.children.cycles-pp.fput_many
0.19 ± 11% +0.1 0.28 ± 7% perf-profile.children.cycles-pp.call_rcu
0.24 ± 13% +0.1 0.33 ± 9% perf-profile.children.cycles-pp.page_counter_cancel
0.22 ± 6% +0.1 0.33 ± 8% perf-profile.children.cycles-pp.pid_vnr
0.46 ± 10% +0.1 0.59 ± 8% perf-profile.children.cycles-pp.obj_cgroup_charge
0.33 ± 18% +0.1 0.46 ± 10% perf-profile.children.cycles-pp.page_counter_uncharge
0.38 ± 10% +0.1 0.52 ± 10% perf-profile.children.cycles-pp.obj_cgroup_uncharge_pages
0.68 ± 8% +0.2 0.85 ± 9% perf-profile.children.cycles-pp.fsnotify_destroy_event
0.55 ± 6% +0.2 0.74 ± 8% perf-profile.children.cycles-pp.filp_close
0.36 ± 7% +0.2 0.56 ± 8% perf-profile.children.cycles-pp.lockref_get_not_zero
0.38 ± 8% +0.2 0.60 ± 7% perf-profile.children.cycles-pp.dget_parent
0.68 ± 7% +0.2 0.92 ± 9% perf-profile.children.cycles-pp.__x64_sys_close
0.68 ± 10% +0.2 0.93 ± 11% perf-profile.children.cycles-pp.drain_obj_stock
1.29 ± 9% +0.3 1.57 ± 7% perf-profile.children.cycles-pp.irq_exit_rcu
0.67 ± 8% +0.3 0.98 ± 7% perf-profile.children.cycles-pp.fscrypt_file_open
0.86 ± 8% +0.4 1.22 ± 7% perf-profile.children.cycles-pp.ext4_file_open
1.21 ± 9% +0.4 1.64 ± 10% perf-profile.children.cycles-pp.refill_obj_stock
1.94 ± 10% +0.5 2.46 ± 9% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
1.29 ± 16% +0.6 1.86 ± 10% perf-profile.children.cycles-pp.get_obj_cgroup_from_current
0.33 ± 12% +0.8 1.12 ± 10% perf-profile.children.cycles-pp.alloc_fd
5.78 ± 9% +1.8 7.57 ± 9% perf-profile.children.cycles-pp.kmem_cache_free
10.17 ± 9% -4.1 6.09 ± 21% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
4.21 ± 7% -3.9 0.26 ± 11% perf-profile.self.cycles-pp.apparmor_capable
1.47 ± 7% -0.5 0.94 ± 14% perf-profile.self.cycles-pp.fanotify_merge
0.56 ± 12% -0.3 0.22 ± 40% perf-profile.self.cycles-pp.fanotify_ioctl
0.07 ± 6% +0.0 0.09 ± 7% perf-profile.self.cycles-pp.fd_install
0.06 ± 7% +0.0 0.09 ± 13% perf-profile.self.cycles-pp.copy_user_generic_unrolled
0.04 ± 45% +0.0 0.07 ± 10% perf-profile.self.cycles-pp.dentry_open
0.06 ± 11% +0.0 0.10 ± 5% perf-profile.self.cycles-pp.rcu_segcblist_enqueue
0.12 ± 11% +0.0 0.16 ± 8% perf-profile.self.cycles-pp.call_rcu
0.11 ± 15% +0.0 0.15 ± 12% perf-profile.self.cycles-pp.fsnotify_remove_queued_event
0.08 ± 8% +0.0 0.12 ± 12% perf-profile.self.cycles-pp.alloc_fd
0.13 ± 8% +0.0 0.18 ± 8% perf-profile.self.cycles-pp.task_work_add
0.14 ± 14% +0.1 0.19 ± 9% perf-profile.self.cycles-pp.ext4_file_open
0.18 ± 15% +0.1 0.23 ± 10% perf-profile.self.cycles-pp.page_counter_try_charge
0.14 ± 9% +0.1 0.19 ± 8% perf-profile.self.cycles-pp.errseq_sample
0.19 ± 13% +0.1 0.24 ± 7% perf-profile.self.cycles-pp.syscall_return_via_sysret
0.23 ± 7% +0.1 0.29 ± 7% perf-profile.self.cycles-pp.dput
0.11 ± 14% +0.1 0.18 ± 7% perf-profile.self.cycles-pp.fsnotify_peek_first_event
0.31 ± 8% +0.1 0.38 ± 7% perf-profile.self.cycles-pp.__list_add_valid
0.26 ± 7% +0.1 0.33 ± 10% perf-profile.self.cycles-pp.locks_remove_posix
0.23 ± 10% +0.1 0.30 ± 9% perf-profile.self.cycles-pp.ext4_release_file
0.17 ± 10% +0.1 0.25 ± 9% perf-profile.self.cycles-pp.fanotify_insert_event
0.23 ± 13% +0.1 0.33 ± 10% perf-profile.self.cycles-pp.page_counter_cancel
0.21 ± 8% +0.1 0.32 ± 7% perf-profile.self.cycles-pp.pid_vnr
0.66 ± 7% +0.2 0.82 ± 9% perf-profile.self.cycles-pp.fsnotify_destroy_event
0.52 ± 9% +0.2 0.68 ± 9% perf-profile.self.cycles-pp.refill_obj_stock
0.33 ± 9% +0.2 0.52 ± 6% perf-profile.self.cycles-pp.lockref_get_not_zero
0.76 ± 7% +0.2 0.97 ± 7% perf-profile.self.cycles-pp.do_dentry_open
1.52 ± 11% +0.5 2.00 ± 9% perf-profile.self.cycles-pp.kmem_cache_alloc
1.71 ± 10% +0.5 2.19 ± 8% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
1.22 ± 16% +0.5 1.74 ± 10% perf-profile.self.cycles-pp.get_obj_cgroup_from_current
2.34 ± 10% +1.1 3.48 ± 8% perf-profile.self.cycles-pp._raw_spin_lock



stress-ng.time.involuntary_context_switches

20000 +-------------------------------------------------------------------+
| OO O |
19000 |-O O O O O O O O O O O |
| O O O OO O O O O O O OO O |
| O O O |
18000 |-+ O O O |
| |
17000 |-+ |
| |
16000 |-+ |
| ++ + |
|: + : : :: + |
15000 |:+ + + +. .+. : : : : .+.+. +.+ .+ + |
| +.++.+ +.+ + ++.+. .+ +.+.+.+ +.+.+.++ + + .+ +.|
14000 +-------------------------------------------------------------------+


stress-ng.fanotify.ops

1.25e+08 +----------------------------------------------------------------+
| O |
1.2e+08 |-+ O O O O O O O O O O |
1.15e+08 |-O O OO O O O O O O O O O O O O O |
| O O O OO |
1.1e+08 |-+O |
1.05e+08 |-+ |
| |
1e+08 |-+ |
9.5e+07 |-+ |
| + |
9e+07 |-+ + : + + .+. + |
8.5e+07 |.+ .+. + : .+. +.+. .+.+ + : + : + +.+ + +|
| ++.+ ++.+.+ +.++ + +.++ +.+.+ + : + : + |
8e+07 +----------------------------------------------------------------+


stress-ng.fanotify.ops_per_sec

2.1e+06 +-----------------------------------------------------------------+
| O |
2e+06 |-+ OO O O |
| O O OO O O O O O O O OO O O O O O OO O |
1.9e+06 |-+ O O O O O O O |
1.8e+06 |-+O |
| |
1.7e+06 |-+ |
| |
1.6e+06 |-+ + |
1.5e+06 |-+ :: |
| .+. .+ : +. +. .+ + +. .+.+. +.|
1.4e+06 |.+ .+ ++.+.+ :.+.+ +.+.++. .+ +.+. + + + +. + ++ + |
| + + + + + + |
1.3e+06 +-----------------------------------------------------------------+


[*] bisect-good sample
[O] bisect-bad sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


---
0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
https://lists.01.org/hyperkitty/list/[email protected]       Intel Corporation

Thanks,
Oliver Sang


Attachments:
config-5.13.0-rc1-00002-ga8b98c808eab (176.78 kB)
job-script (8.59 kB)
job.yaml (5.96 kB)
reproduce (553.00 B)

2021-06-03 06:59:18

by Amir Goldstein

Subject: Re: [fanotify] a8b98c808e: stress-ng.fanotify.ops_per_sec 32.2% improvement

On Thu, Jun 3, 2021 at 4:36 AM kernel test robot <[email protected]> wrote:
>
>
>
> Greetings,
>
> FYI, we noticed a 32.2% improvement of stress-ng.fanotify.ops_per_sec due to commit:
>
>
> commit: a8b98c808eab3ec8f1b5a64be967b0f4af4cae43 ("fanotify: fix permission model of unprivileged group")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
>

I guess now we know what caused the reported regression:
https://lore.kernel.org/lkml/[email protected]/

I didn't know that capable() is so significant.
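
To illustrate the pattern the fix follows - do the privilege check once
when the group is set up and cache the result, instead of calling
capable() for every event - here is a minimal userspace sketch. All
names here are hypothetical; this is not the actual fanotify code:

/* Illustrative sketch only: evaluate the expensive privilege check
 * once at init time, then test a cached flag in the per-event path. */
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

struct event_group {
        bool privileged;                /* cached at init time */
};

/* stand-in for the per-event capable() -> security_capable() call */
static bool expensive_priv_check(void)
{
        return geteuid() == 0;
}

static void group_init(struct event_group *g)
{
        /* before the fix, a check like this ran for every event */
        g->privileged = expensive_priv_check();
}

static void handle_event(const struct event_group *g, unsigned int info)
{
        /* after the fix: only a flag test in the hot path */
        if (!g->privileged)
                info = 0;       /* e.g. suppress extra info for unpriv groups */
        printf("event info: %u\n", info);
}

int main(void)
{
        struct event_group g;

        group_init(&g);
        for (unsigned int i = 1; i <= 3; i++)
                handle_event(&g, i);
        return 0;
}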

FWIW, here is a link to the test code:
https://github.com/ColinIanKing/stress-ng/blob/master/stress-fanotify.c#L474

It creates events in a loop in a child process while the parent process
reads the generated events in a loop (on two different fanotify groups).
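
Roughly that shape, reduced to a standalone sketch (not the stress-ng
code itself: a single group, error handling trimmed, and a
FAN_CLASS_NOTIF group like this one still needs CAP_SYS_ADMIN):

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/fanotify.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
        int fan = fanotify_init(FAN_CLASS_NOTIF | FAN_NONBLOCK, O_RDONLY);
        if (fan < 0) {
                perror("fanotify_init");
                return 1;
        }
        /* watch open/close events on files in the current directory */
        if (fanotify_mark(fan, FAN_MARK_ADD,
                          FAN_OPEN | FAN_CLOSE | FAN_EVENT_ON_CHILD,
                          AT_FDCWD, ".") < 0) {
                perror("fanotify_mark");
                return 1;
        }

        pid_t pid = fork();
        if (pid == 0) {                 /* child: generate events */
                for (int i = 0; i < 1000; i++)
                        close(open("fan-testfile", O_CREAT | O_RDWR, 0600));
                _exit(0);
        }

        for (int done = 0; !done; ) {   /* parent: drain the event queue */
                char buf[4096];
                ssize_t len = read(fan, buf, sizeof(buf));
                if (len <= 0) {
                        /* queue empty: stop once the child has exited */
                        done = waitpid(pid, NULL, WNOHANG) == pid;
                        continue;
                }
                struct fanotify_event_metadata *m = (void *)buf;
                for (; FAN_EVENT_OK(m, len); m = FAN_EVENT_NEXT(m, len))
                        if (m->fd >= 0)
                                close(m->fd);   /* event fds must be closed */
        }
        unlink("fan-testfile");
        close(fan);
        return 0;
}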

Thanks,
Amir.

2021-06-03 08:47:48

by Jan Kara

Subject: Re: [fanotify] a8b98c808e: stress-ng.fanotify.ops_per_sec 32.2% improvement

On Thu 03-06-21 09:57:15, Amir Goldstein wrote:
> On Thu, Jun 3, 2021 at 4:36 AM kernel test robot <[email protected]> wrote:
> >
> >
> >
> > Greetings,
> >
> > FYI, we noticed a 32.2% improvement of stress-ng.fanotify.ops_per_sec due to commit:
> >
> >
> > commit: a8b98c808eab3ec8f1b5a64be967b0f4af4cae43 ("fanotify: fix permission model of unprivileged group")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> >
> >
>
> I guess now we know what caused the reported regression:
> https://lore.kernel.org/lkml/[email protected]/
>
> I didn't know that capable() is so significant.

Yeah, I wouldn't guess either. Interesting.

Honza

--
Jan Kara <[email protected]>
SUSE Labs, CR

2021-06-03 21:58:02

by Matt Bobrowski

Subject: Re: [fanotify] a8b98c808e: stress-ng.fanotify.ops_per_sec 32.2% improvement

On Thu, Jun 03, 2021 at 10:43:24AM +0200, Jan Kara wrote:
> On Thu 03-06-21 09:57:15, Amir Goldstein wrote:
> > On Thu, Jun 3, 2021 at 4:36 AM kernel test robot <[email protected]> wrote:
> > > Greetings,
> > >
> > > FYI, we noticed a 32.2% improvement of stress-ng.fanotify.ops_per_sec due to commit:
> > >
> > >
> > > commit: a8b98c808eab3ec8f1b5a64be967b0f4af4cae43 ("fanotify: fix permission model of unprivileged group")
> > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > >
> > >
> >
> > I guess now we know what caused the reported regression:
> > https://lore.kernel.org/lkml/[email protected]/
> >
> > I didn't know that capable() is so significant.
>
> Yeah, I wouldn't guess either. Interesting.

Indeed, interesting! :)

While on the topic of stress-ng, it reminds me to set this up on my server
so we can run such regression tests before merging fanotify changes into
master.

/M