Date: 2021-05-11 06:53:17
From: kernel test robot
Subject: [fanotify] 7cea2a3c50: stress-ng.fanotify.ops_per_sec -23.4% regression



Greetings,

FYI, we noticed a -23.4% regression of stress-ng.fanotify.ops_per_sec due to commit:


commit: 7cea2a3c505e87a9d6afc78be4a7f7be636a73a7 ("fanotify: support limited functionality for unprivileged users")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


in testcase: stress-ng
on test machine: 96 threads, 2 sockets, Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz, 192G memory
with the following parameters:

nr_threads: 10%
disk: 1HDD
testtime: 60s
fs: ext4
class: os
test: fanotify
cpufreq_governor: performance
ucode: 0x5003006




If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <[email protected]>


Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
bin/lkp run generated-yaml-file
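
As a quicker, coarser cross-check outside the LKP harness, the fanotify stressor can also be invoked directly with stress-ng. This is only a sketch of a roughly equivalent run (it assumes stress-ng is installed and approximates nr_threads=10% of 96 CPUs as 10 workers; the exact worker count for the job comes from job.yaml):

stress-ng --fanotify 10 --timeout 60s --metrics-brief   # --metrics-brief prints ops and ops/sec per stressor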

=========================================================================================
class/compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime/ucode:
os/gcc-9/performance/1HDD/ext4/x86_64-rhel-8.3/10%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp5/fanotify/stress-ng/60s/0x5003006

commit:
5b8fea65d1 ("fanotify: configurable limits via sysfs")
7cea2a3c50 ("fanotify: support limited functionality for unprivileged users")

5b8fea65d197f408 7cea2a3c505e87a9d6afc78be4a
---------------- ---------------------------
%stddev %change %stddev
\ | \
1.214e+08 ? 3% -23.4% 93048328 ? 4% stress-ng.fanotify.ops
2023656 ? 3% -23.4% 1550295 ? 4% stress-ng.fanotify.ops_per_sec
33565053 -10.9% 29899893 ? 3% stress-ng.time.file_system_outputs
19487 ? 3% -19.0% 15787 ? 3% stress-ng.time.involuntary_context_switches
10868 -1.0% 10755 stress-ng.time.minor_page_faults
1648 -1.9% 1618 stress-ng.time.percent_of_cpu_this_job_got
1014 -1.6% 998.53 stress-ng.time.system_time
38378 ? 97% +280.5% 146045 ? 7% stress-ng.time.voluntary_context_switches
18.19 -3.9% 17.48 ? 2% iostat.cpu.system
0.96 ? 6% -0.2 0.72 ? 8% mpstat.cpu.all.soft%
0.21 -0.0 0.18 ? 6% mpstat.cpu.all.usr%
191.47 -2.3% 187.06 turbostat.PkgWatt
91.25 -1.4% 89.97 turbostat.RAMWatt
3494 ? 39% +55.6% 5438 vmstat.memory.buff
3805 ? 30% +81.6% 6909 ? 3% vmstat.system.cs
509.50 +23.4% 628.50 ? 18% interrupts.CPU53.CAL:Function_call_interrupts
3119 ? 44% -62.8% 1159 ? 38% interrupts.CPU62.NMI:Non-maskable_interrupts
3119 ? 44% -62.8% 1159 ? 38% interrupts.CPU62.PMI:Performance_monitoring_interrupts
4580 ?106% +357.3% 20942 ? 38% numa-meminfo.node0.Dirty
4290 ?109% +363.2% 19873 ? 36% numa-meminfo.node0.Inactive(file)
5634 ? 74% +364.1% 26149 ? 25% numa-meminfo.node1.Dirty
4728 ? 75% +406.7% 23958 ? 25% numa-meminfo.node1.Inactive(file)
52712 ? 12% +28.6% 67787 ? 8% numa-meminfo.node1.KReclaimable
52712 ? 12% +28.6% 67787 ? 8% numa-meminfo.node1.SReclaimable
15433 ? 7% +15.2% 17772 ? 2% meminfo.Active
3658 ? 37% +52.2% 5568 meminfo.Active(file)
3490 ? 39% +55.6% 5431 meminfo.Buffers
10124 ? 75% +366.6% 47241 ? 15% meminfo.Dirty
314459 ? 2% +12.9% 354928 meminfo.Inactive
8907 ? 77% +393.4% 43949 ? 15% meminfo.Inactive(file)
107422 +16.4% 124988 ? 2% meminfo.KReclaimable
107422 +16.4% 124988 ? 2% meminfo.SReclaimable
1110 ?111% +368.9% 5205 ? 38% numa-vmstat.node0.nr_dirty
1046 ?114% +372.2% 4940 ? 36% numa-vmstat.node0.nr_inactive_file
1048 ?113% +371.4% 4941 ? 36% numa-vmstat.node0.nr_zone_inactive_file
1138 ?110% +360.2% 5241 ? 38% numa-vmstat.node0.nr_zone_write_pending
1385 ? 78% +368.3% 6486 ? 25% numa-vmstat.node1.nr_dirty
1155 ? 81% +413.4% 5933 ? 25% numa-vmstat.node1.nr_inactive_file
13198 ? 12% +28.1% 16909 ? 8% numa-vmstat.node1.nr_slab_reclaimable
1154 ? 81% +413.8% 5933 ? 25% numa-vmstat.node1.nr_zone_inactive_file
1409 ? 78% +362.8% 6523 ? 25% numa-vmstat.node1.nr_zone_write_pending
8001 ? 4% +18.4% 9476 ? 6% softirqs.CPU3.SCHED
7965 ? 6% +16.4% 9273 ? 7% softirqs.CPU54.SCHED
7857 ? 6% +18.4% 9306 ? 6% softirqs.CPU58.SCHED
7971 ? 5% +14.0% 9085 ? 5% softirqs.CPU62.SCHED
7773 ? 5% +16.7% 9072 ? 7% softirqs.CPU64.SCHED
9103 ? 18% -33.4% 6060 ? 26% softirqs.CPU71.RCU
7936 ? 7% +17.7% 9341 ? 7% softirqs.CPU71.SCHED
8074 ? 35% -50.5% 3997 ? 17% softirqs.CPU77.RCU
7813 ? 6% +15.7% 9036 ? 7% softirqs.CPU8.SCHED
934447 ? 2% -10.4% 836914 ? 2% softirqs.RCU
914.67 ? 37% +52.2% 1392 proc-vmstat.nr_active_file
4196621 -10.9% 3740233 ? 3% proc-vmstat.nr_dirtied
2530 ? 75% +364.4% 11749 ? 15% proc-vmstat.nr_dirty
273393 +3.5% 282850 proc-vmstat.nr_file_pages
2239 ? 76% +388.3% 10932 ? 15% proc-vmstat.nr_inactive_file
26824 +16.3% 31185 ? 3% proc-vmstat.nr_slab_reclaimable
1550 ? 27% +249.9% 5424 ? 40% proc-vmstat.nr_written
914.67 ? 37% +52.2% 1392 proc-vmstat.nr_zone_active_file
2239 ? 76% +388.3% 10932 ? 15% proc-vmstat.nr_zone_inactive_file
2547 ? 77% +363.4% 11805 ? 15% proc-vmstat.nr_zone_write_pending
8695504 -15.1% 7378817 ? 3% proc-vmstat.numa_hit
8608986 -15.3% 7292304 ? 3% proc-vmstat.numa_local
13559136 -18.5% 11047582 ? 4% proc-vmstat.pgalloc_normal
13417362 -18.7% 10904217 ? 4% proc-vmstat.pgfree
27913 ? 8% +37.7% 38440 ? 10% slabinfo.buffer_head.active_objs
726.67 ? 7% +37.7% 1000 ? 9% slabinfo.buffer_head.active_slabs
28355 ? 7% +37.7% 39043 ? 9% slabinfo.buffer_head.num_objs
726.67 ? 7% +37.7% 1000 ? 9% slabinfo.buffer_head.num_slabs
107206 +11.6% 119642 ? 2% slabinfo.dentry.active_objs
2593 +11.1% 2881 ? 2% slabinfo.dentry.active_slabs
108962 +11.1% 121015 ? 2% slabinfo.dentry.num_objs
2593 +11.1% 2881 ? 2% slabinfo.dentry.num_slabs
66688 ? 9% +31.8% 87926 ? 8% slabinfo.ext4_extent_status.active_objs
655.00 ? 9% +31.8% 863.17 ? 8% slabinfo.ext4_extent_status.active_slabs
66862 ? 9% +31.8% 88097 ? 8% slabinfo.ext4_extent_status.num_objs
655.00 ? 9% +31.8% 863.17 ? 8% slabinfo.ext4_extent_status.num_slabs
11791 ? 11% +79.3% 21139 ? 8% slabinfo.ext4_inode_cache.active_objs
440.17 ? 11% +85.2% 815.00 ? 9% slabinfo.ext4_inode_cache.active_slabs
11889 ? 11% +85.2% 22019 ? 9% slabinfo.ext4_inode_cache.num_objs
440.17 ? 11% +85.2% 815.00 ? 9% slabinfo.ext4_inode_cache.num_slabs
1428 ? 25% +41.9% 2027 ? 5% slabinfo.jbd2_journal_head.active_objs
1428 ? 25% +41.9% 2027 ? 5% slabinfo.jbd2_journal_head.num_objs
6756 ? 15% +41.1% 9536 ? 3% slabinfo.kmalloc-rcl-512.active_objs
6792 ? 15% +42.1% 9649 ? 3% slabinfo.kmalloc-rcl-512.num_objs
72015 ? 5% +15.2% 82951 ? 4% slabinfo.vmap_area.active_objs
1129 ? 5% +16.2% 1312 ? 4% slabinfo.vmap_area.active_slabs
72353 ? 5% +16.2% 84046 ? 4% slabinfo.vmap_area.num_objs
1129 ? 5% +16.2% 1312 ? 4% slabinfo.vmap_area.num_slabs
4.537e+09 -13.0% 3.948e+09 ? 4% perf-stat.i.branch-instructions
34409487 -12.1% 30235678 ? 2% perf-stat.i.branch-misses
34.06 ? 2% -2.4 31.67 ? 2% perf-stat.i.cache-miss-rate%
85290547 ? 3% -12.7% 74474324 ? 5% perf-stat.i.cache-misses
3674 ? 32% +88.6% 6931 ? 3% perf-stat.i.context-switches
2.35 +12.2% 2.64 ? 4% perf-stat.i.cpi
5.242e+10 -3.5% 5.057e+10 perf-stat.i.cpu-cycles
224.46 -9.4% 203.26 ? 2% perf-stat.i.cpu-migrations
656.95 ? 2% +12.4% 738.24 ? 5% perf-stat.i.cycles-between-cache-misses
300745 ? 11% +53.4% 461361 ? 19% perf-stat.i.dTLB-load-misses
6.438e+09 -13.9% 5.543e+09 ? 4% perf-stat.i.dTLB-loads
161739 ? 7% +14.6% 185423 ? 6% perf-stat.i.dTLB-store-misses
3.536e+09 -15.4% 2.991e+09 ? 4% perf-stat.i.dTLB-stores
88.81 -1.9 86.94 perf-stat.i.iTLB-load-miss-rate%
20384992 ? 2% -14.5% 17430423 ? 4% perf-stat.i.iTLB-load-misses
2.218e+10 -13.3% 1.923e+10 ? 4% perf-stat.i.instructions
0.43 -9.5% 0.39 ? 3% perf-stat.i.ipc
0.55 -3.5% 0.53 perf-stat.i.metric.GHz
612.45 ? 2% -11.7% 540.91 perf-stat.i.metric.K/sec
153.67 -13.9% 132.37 ? 4% perf-stat.i.metric.M/sec
88.26 -1.5 86.78 perf-stat.i.node-load-miss-rate%
31264969 ? 3% -14.5% 26719326 ? 3% perf-stat.i.node-load-misses
82.85 -2.5 80.30 perf-stat.i.node-store-miss-rate%
16368326 ? 3% -16.1% 13725763 ? 2% perf-stat.i.node-store-misses
11.08 ? 2% +7.9% 11.95 ? 5% perf-stat.overall.MPKI
34.71 ? 2% -2.3 32.43 ? 2% perf-stat.overall.cache-miss-rate%
2.36 +11.4% 2.63 ? 2% perf-stat.overall.cpi
0.00 ? 12% +0.0 0.01 ? 25% perf-stat.overall.dTLB-load-miss-rate%
0.00 ? 8% +0.0 0.01 ? 9% perf-stat.overall.dTLB-store-miss-rate%
90.08 -1.6 88.49 perf-stat.overall.iTLB-load-miss-rate%
0.42 -10.2% 0.38 ? 2% perf-stat.overall.ipc
84.44 -2.2 82.27 perf-stat.overall.node-store-miss-rate%
4.465e+09 -13.0% 3.886e+09 ? 4% perf-stat.ps.branch-instructions
33856926 -12.1% 29757385 ? 2% perf-stat.ps.branch-misses
83927962 ? 3% -12.7% 73296150 ? 5% perf-stat.ps.cache-misses
3620 ? 31% +88.4% 6819 ? 4% perf-stat.ps.context-switches
5.158e+10 -3.5% 4.977e+10 perf-stat.ps.cpu-cycles
221.03 -9.5% 200.02 ? 2% perf-stat.ps.cpu-migrations
295389 ? 11% +53.7% 454026 ? 19% perf-stat.ps.dTLB-load-misses
6.336e+09 -13.9% 5.456e+09 ? 4% perf-stat.ps.dTLB-loads
159067 ? 7% +14.7% 182479 ? 6% perf-stat.ps.dTLB-store-misses
3.48e+09 -15.4% 2.944e+09 ? 4% perf-stat.ps.dTLB-stores
20061765 ? 2% -14.5% 17156375 ? 4% perf-stat.ps.iTLB-load-misses
2.183e+10 -13.3% 1.893e+10 ? 4% perf-stat.ps.instructions
30762758 ? 3% -14.5% 26295053 ? 3% perf-stat.ps.node-load-misses
16104244 ? 3% -16.1% 13507363 ? 2% perf-stat.ps.node-store-misses
1.377e+12 ? 2% -12.2% 1.21e+12 ? 2% perf-stat.total.instructions
2.87 ? 11% -0.7 2.18 ? 11% perf-profile.calltrace.cycles-pp.ret_from_fork
2.87 ? 11% -0.7 2.18 ? 11% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork
2.85 ? 11% -0.7 2.16 ? 11% perf-profile.calltrace.cycles-pp.smpboot_thread_fn.kthread.ret_from_fork
2.84 ? 11% -0.7 2.15 ? 11% perf-profile.calltrace.cycles-pp.__softirqentry_text_start.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork
2.84 ? 11% -0.7 2.15 ? 11% perf-profile.calltrace.cycles-pp.rcu_core.__softirqentry_text_start.run_ksoftirqd.smpboot_thread_fn.kthread
2.84 ? 11% -0.7 2.15 ? 11% perf-profile.calltrace.cycles-pp.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork
2.83 ? 11% -0.7 2.15 ? 11% perf-profile.calltrace.cycles-pp.rcu_do_batch.rcu_core.__softirqentry_text_start.run_ksoftirqd.smpboot_thread_fn
2.50 ? 10% -0.6 1.88 ? 11% perf-profile.calltrace.cycles-pp.kmem_cache_free.rcu_do_batch.rcu_core.__softirqentry_text_start.run_ksoftirqd
2.26 ? 10% -0.6 1.69 ? 14% perf-profile.calltrace.cycles-pp.kmem_cache_free.fanotify_read.vfs_read.ksys_read.do_syscall_64
1.79 ? 12% -0.5 1.30 ? 14% perf-profile.calltrace.cycles-pp.kmem_cache_alloc.__alloc_file.alloc_empty_file.dentry_open.copy_event_to_user
1.74 ? 9% -0.5 1.26 ? 13% perf-profile.calltrace.cycles-pp.fanotify_free_event.fanotify_read.vfs_read.ksys_read.do_syscall_64
1.58 ? 12% -0.4 1.18 ? 11% perf-profile.calltrace.cycles-pp._raw_spin_lock.fanotify_read.vfs_read.ksys_read.do_syscall_64
1.43 ? 8% -0.4 1.03 ? 29% perf-profile.calltrace.cycles-pp.lockref_get.fanotify_alloc_event.fanotify_handle_event.fsnotify.__fsnotify_parent
2.19 ? 7% -0.4 1.83 ? 6% perf-profile.calltrace.cycles-pp.fanotify_alloc_event.fanotify_handle_event.fsnotify.__fsnotify_parent.__fput
0.71 ? 11% -0.3 0.37 ? 70% perf-profile.calltrace.cycles-pp.put_pid.fanotify_free_event.fanotify_read.vfs_read.ksys_read
2.14 ? 7% -0.3 1.80 ? 7% perf-profile.calltrace.cycles-pp.kmem_cache_alloc.fanotify_alloc_event.fanotify_handle_event.fsnotify.__fsnotify_parent
1.17 ? 10% -0.3 0.85 ? 12% perf-profile.calltrace.cycles-pp.ext4_file_open.do_dentry_open.dentry_open.copy_event_to_user.fanotify_read
2.37 ? 8% -0.3 2.07 ? 6% perf-profile.calltrace.cycles-pp.fanotify_alloc_event.fanotify_handle_event.fsnotify.__fsnotify_parent.do_sys_openat2
0.98 ? 9% -0.3 0.72 ? 14% perf-profile.calltrace.cycles-pp.path_put.fanotify_free_event.fanotify_read.vfs_read.ksys_read
0.96 ? 9% -0.3 0.71 ? 14% perf-profile.calltrace.cycles-pp.dput.path_put.fanotify_free_event.fanotify_read.vfs_read
0.90 ? 9% -0.3 0.65 ? 13% perf-profile.calltrace.cycles-pp.fscrypt_file_open.ext4_file_open.do_dentry_open.dentry_open.copy_event_to_user
0.84 ? 8% -0.2 0.60 ? 14% perf-profile.calltrace.cycles-pp.lockref_put_return.dput.path_put.fanotify_free_event.fanotify_read
0.80 ? 7% -0.2 0.60 ? 9% perf-profile.calltrace.cycles-pp.refill_obj_stock.kmem_cache_free.rcu_do_batch.rcu_core.__softirqentry_text_start
0.87 ? 15% -0.2 0.68 ? 10% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.fanotify_read.vfs_read.ksys_read
0.89 ? 7% -0.1 0.76 ? 8% perf-profile.calltrace.cycles-pp.fanotify_alloc_event.fanotify_handle_event.fsnotify.__fsnotify_parent.vfs_write
0.72 ? 9% -0.1 0.60 ? 5% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.__wake_up_common_lock.fsnotify_add_event.fanotify_handle_event.fsnotify
0.75 ? 6% -0.1 0.63 ? 8% perf-profile.calltrace.cycles-pp.fanotify_alloc_event.fanotify_handle_event.fsnotify.__fsnotify_parent.vfs_read
0.11 ?223% +0.7 0.83 ? 24% perf-profile.calltrace.cycles-pp.fanotify_merge.fsnotify_add_event.fanotify_handle_event.fsnotify.__fsnotify_parent
4.92 ? 15% +1.7 6.61 ? 16% perf-profile.calltrace.cycles-pp.fsnotify_add_event.fanotify_handle_event.fsnotify.__fsnotify_parent.do_sys_openat2
5.69 ? 22% +3.1 8.80 ? 23% perf-profile.calltrace.cycles-pp._raw_spin_lock.fsnotify_add_event.fanotify_handle_event.fsnotify.__fsnotify_parent
4.00 ? 37% +3.6 7.60 ? 26% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.fsnotify_add_event.fanotify_handle_event.fsnotify
0.00 +4.0 4.02 ? 12% perf-profile.calltrace.cycles-pp.apparmor_capable.security_capable.ns_capable_common.copy_event_to_user.fanotify_read
0.00 +4.1 4.07 ? 12% perf-profile.calltrace.cycles-pp.security_capable.ns_capable_common.copy_event_to_user.fanotify_read.vfs_read
0.00 +4.1 4.14 ? 12% perf-profile.calltrace.cycles-pp.ns_capable_common.copy_event_to_user.fanotify_read.vfs_read.ksys_read
7.67 ? 9% -1.7 5.97 ? 11% perf-profile.children.cycles-pp.kmem_cache_free
5.74 ? 7% -1.2 4.59 ? 9% perf-profile.children.cycles-pp.kmem_cache_alloc
4.27 ? 11% -0.9 3.36 ? 11% perf-profile.children.cycles-pp.rcu_core
6.23 ? 7% -0.9 5.32 ? 7% perf-profile.children.cycles-pp.fanotify_alloc_event
4.26 ? 11% -0.9 3.35 ? 11% perf-profile.children.cycles-pp.rcu_do_batch
4.38 ? 10% -0.9 3.48 ? 11% perf-profile.children.cycles-pp.__softirqentry_text_start
1.09 ? 13% -0.7 0.35 ? 10% perf-profile.children.cycles-pp.alloc_fd
2.87 ? 11% -0.7 2.18 ? 11% perf-profile.children.cycles-pp.ret_from_fork
2.87 ? 11% -0.7 2.18 ? 11% perf-profile.children.cycles-pp.kthread
2.85 ? 11% -0.7 2.16 ? 11% perf-profile.children.cycles-pp.smpboot_thread_fn
2.84 ? 11% -0.7 2.15 ? 11% perf-profile.children.cycles-pp.run_ksoftirqd
1.94 ? 8% -0.5 1.42 ? 10% perf-profile.children.cycles-pp.get_obj_cgroup_from_current
2.52 ? 9% -0.5 2.01 ? 6% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
3.38 ? 9% -0.5 2.88 ? 7% perf-profile.children.cycles-pp.__wake_up_common_lock
1.64 ? 8% -0.4 1.28 ? 10% perf-profile.children.cycles-pp.refill_obj_stock
1.30 ? 11% -0.3 0.98 ? 12% perf-profile.children.cycles-pp.ext4_file_open
1.53 ? 8% -0.3 1.23 ? 11% perf-profile.children.cycles-pp.path_put
0.97 ? 9% -0.3 0.72 ? 12% perf-profile.children.cycles-pp.fscrypt_file_open
0.92 ? 7% -0.2 0.71 ? 10% perf-profile.children.cycles-pp.drain_obj_stock
0.58 ? 9% -0.2 0.39 ? 10% perf-profile.children.cycles-pp.dget_parent
0.55 ? 9% -0.2 0.36 ? 9% perf-profile.children.cycles-pp.lockref_get_not_zero
0.34 ? 25% -0.1 0.22 ? 12% perf-profile.children.cycles-pp.pid_vnr
0.20 ? 15% -0.1 0.14 ? 15% perf-profile.children.cycles-pp.fsnotify_peek_first_event
0.25 ? 13% -0.1 0.20 ? 15% perf-profile.children.cycles-pp.call_rcu
0.26 ? 9% -0.1 0.20 ? 9% perf-profile.children.cycles-pp.page_counter_try_charge
0.22 ? 12% -0.1 0.17 ? 15% perf-profile.children.cycles-pp.kmem_cache_alloc_trace
0.24 ? 12% -0.1 0.19 ? 8% perf-profile.children.cycles-pp.fanotify_insert_event
0.19 ? 10% -0.1 0.13 ? 17% perf-profile.children.cycles-pp.close_fd
0.23 ? 8% -0.0 0.19 ? 13% perf-profile.children.cycles-pp.syscall_return_via_sysret
0.15 ? 10% -0.0 0.11 ? 18% perf-profile.children.cycles-pp.pick_file
0.13 ? 14% -0.0 0.10 ? 18% perf-profile.children.cycles-pp.propagate_protected_usage
0.08 ? 10% -0.0 0.05 ? 46% perf-profile.children.cycles-pp.fanotify_free_group_priv
0.16 ? 9% -0.0 0.13 ? 9% perf-profile.children.cycles-pp.rcu_all_qs
0.09 ? 13% -0.0 0.06 ? 11% perf-profile.children.cycles-pp.__fsnotify_alloc_group
0.07 ? 11% -0.0 0.04 ? 45% perf-profile.children.cycles-pp.alloc_page_buffers
0.11 ? 11% -0.0 0.09 ? 7% perf-profile.children.cycles-pp.try_to_free_buffers
0.10 ? 8% -0.0 0.08 ? 12% perf-profile.children.cycles-pp.create_empty_buffers
0.07 ? 7% -0.0 0.04 ? 45% perf-profile.children.cycles-pp.alloc_buffer_head
0.08 ? 14% -0.0 0.06 ? 11% perf-profile.children.cycles-pp.__srcu_read_lock
0.09 ? 11% -0.0 0.07 ? 17% perf-profile.children.cycles-pp.rcu_segcblist_enqueue
0.19 ? 60% +0.3 0.48 ? 30% perf-profile.children.cycles-pp.__x64_sys_ioctl
0.19 ? 61% +0.3 0.48 ? 30% perf-profile.children.cycles-pp.do_vfs_ioctl
0.19 ? 61% +0.3 0.48 ? 30% perf-profile.children.cycles-pp.fanotify_ioctl
0.81 ? 17% +0.5 1.28 ? 15% perf-profile.children.cycles-pp.fanotify_merge
5.66 ? 22% +3.0 8.69 ? 23% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
0.23 ? 9% +4.0 4.21 ? 12% perf-profile.children.cycles-pp.apparmor_capable
0.24 ? 9% +4.0 4.26 ? 12% perf-profile.children.cycles-pp.security_capable
0.24 ? 8% +4.1 4.33 ? 11% perf-profile.children.cycles-pp.ns_capable_common
4.85 ? 9% -1.1 3.72 ? 11% perf-profile.self.cycles-pp.kmem_cache_free
3.49 ? 9% -1.1 2.43 ? 9% perf-profile.self.cycles-pp._raw_spin_lock
1.81 ? 8% -0.5 1.32 ? 9% perf-profile.self.cycles-pp.get_obj_cgroup_from_current
2.04 ? 6% -0.5 1.57 ? 11% perf-profile.self.cycles-pp.kmem_cache_alloc
2.24 ? 9% -0.5 1.78 ? 6% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.97 ? 9% -0.2 0.73 ? 11% perf-profile.self.cycles-pp.do_dentry_open
0.51 ? 9% -0.2 0.33 ? 10% perf-profile.self.cycles-pp.lockref_get_not_zero
0.68 ? 9% -0.1 0.54 ? 9% perf-profile.self.cycles-pp.refill_obj_stock
0.84 ? 9% -0.1 0.70 ? 7% perf-profile.self.cycles-pp.__wake_up_common
0.55 ? 6% -0.1 0.41 ? 11% perf-profile.self.cycles-pp.drain_obj_stock
0.33 ? 25% -0.1 0.21 ? 12% perf-profile.self.cycles-pp.pid_vnr
0.30 ? 9% -0.1 0.23 ? 11% perf-profile.self.cycles-pp.dput
0.33 ? 11% -0.1 0.26 ? 12% perf-profile.self.cycles-pp.page_counter_cancel
0.19 ? 15% -0.1 0.13 ? 12% perf-profile.self.cycles-pp.fsnotify_peek_first_event
0.24 ? 11% -0.1 0.19 ? 8% perf-profile.self.cycles-pp.fanotify_insert_event
0.13 ? 14% -0.1 0.08 ? 16% perf-profile.self.cycles-pp.alloc_fd
0.23 ? 8% -0.0 0.19 ? 13% perf-profile.self.cycles-pp.syscall_return_via_sysret
0.10 ? 15% -0.0 0.07 ? 7% perf-profile.self.cycles-pp.fanotify_read
0.14 ? 9% -0.0 0.11 ? 12% perf-profile.self.cycles-pp.exit_to_user_mode_prepare
0.09 ? 11% -0.0 0.07 ? 17% perf-profile.self.cycles-pp.rcu_segcblist_enqueue
0.12 ? 16% +0.1 0.18 ? 16% perf-profile.self.cycles-pp.___slab_alloc
0.00 +0.1 0.07 ? 14% perf-profile.self.cycles-pp.ns_capable_common
0.00 +0.2 0.18 ? 24% perf-profile.self.cycles-pp.try_to_wake_up
0.17 ? 63% +0.3 0.46 ? 30% perf-profile.self.cycles-pp.fanotify_ioctl
0.79 ? 17% +0.5 1.27 ? 15% perf-profile.self.cycles-pp.fanotify_merge
5.56 ? 22% +3.0 8.59 ? 23% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
0.22 ? 9% +3.9 4.08 ? 11% perf-profile.self.cycles-pp.apparmor_capable



stress-ng.fanotify.ops_per_sec

2.2e+06 +-----------------------------------------------------------------+
| +. : + .+. .+.+.+.+.+.+..+. |
2.1e+06 |-+ + : +. .+ +..+.+ +.+ +.+.+..+ |
2e+06 |-+ + : +.+.+ + + |
| + +.+ |
1.9e+06 |-+ |
1.8e+06 |-+ |
| |
1.7e+06 |-+ |
1.6e+06 |-+ O |
| O O O |
1.5e+06 |-+ O O O O O O O |
1.4e+06 |-+ O O O O O O O O O O O O O O |
| O O O O O |
1.3e+06 +-----------------------------------------------------------------+


[*] bisect-good sample
[O] bisect-bad sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


---
0DAY/LKP+ Test Infrastructure, Open Source Technology Center, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]

Thanks,
Oliver Sang


Attachments:
config-5.12.0-rc3-00007-g7cea2a3c505e (175.54 kB)
job-script (8.41 kB)
job.yaml (5.83 kB)
reproduce (553.00 B)

Date: 2021-05-11 11:14:40
From: Amir Goldstein
Subject: Re: [fanotify] 7cea2a3c50: stress-ng.fanotify.ops_per_sec -23.4% regression

On Tue, May 11, 2021 at 9:51 AM kernel test robot <[email protected]> wrote:
>
>
>
> Greetings,
>
> FYI, we noticed a -23.4% regression of stress-ng.fanotify.ops_per_sec due to commit:
>
>
> commit: 7cea2a3c505e87a9d6afc78be4a7f7be636a73a7 ("fanotify: support limited functionality for unprivileged users")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>

I am not sure how this is possible.
This commit only changes permissions to perform the operation.
Is the test not being run as root?

Thanks,
Amir.
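
The profile in the report does hint at where the cost could come from: ns_capable_common() and apparmor_capable() show up under copy_event_to_user(), i.e. a capability check performed for every event read. The fragment below is only an illustrative sketch of that pattern, not the actual fanotify code (the helper name and the pid-hiding detail are assumptions); the point is that capable() can fall through to security_capable() and an LSM hook such as apparmor_capable(), so paying that cost per event adds up in a tight fanotify read loop:

/* Kernel-context sketch only; mirrors the shape suggested by the profile. */
static void fill_event_metadata(struct fanotify_event_metadata *metadata,
				struct fanotify_event *event)
{
	metadata->pid = pid_vnr(event->pid);

	/*
	 * Hypothetical per-event privilege check: hide the pid from
	 * listeners without CAP_SYS_ADMIN.  capable() is not a cheap
	 * test here -- it goes through security_capable() and whatever
	 * LSM is active (apparmor_capable() in this setup).
	 */
	if (!capable(CAP_SYS_ADMIN))
		metadata->pid = 0;
}

If something along these lines is what the commit effectively does, checking the privilege once at group-init time and caching the result in the group would avoid the per-read LSM call.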

> [...]

Date: 2021-05-11 12:48:07
From: Jan Kara
Subject: Re: [fanotify] 7cea2a3c50: stress-ng.fanotify.ops_per_sec -23.4% regression

On Tue 11-05-21 14:13:38, Amir Goldstein wrote:
> On Tue, May 11, 2021 at 9:51 AM kernel test robot <[email protected]> wrote:
> >
> >
> >
> > Greetings,
> >
> > FYI, we noticed a -23.4% regression of stress-ng.fanotify.ops_per_sec due to commit:
> >
> >
> > commit: 7cea2a3c505e87a9d6afc78be4a7f7be636a73a7 ("fanotify: support limited functionality for unprivileged users")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> >
>
> I am not sure how this is possible.
> This commit only changes permissions to perform the operation.
> Is the test not being run as root?

Yeah, I've stared at it for some time and didn't come up with a sensible
explanation either. I guess we need to try to reproduce and see...

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR