2024-05-29 05:52:18

by kernel test robot

Subject: [linus:master] [vfs] 681ce86235: filebench.sum_operations/s -7.4% regression


hi, Yafang Shao,

we captured this filebench regression after this patch was merged into mainline.

we noticed that the merged commit differs from the original version posted in
https://lore.kernel.org/all/[email protected]/

but we confirmed that the original version shows a similar regression. details are
below [1] FYI.



Hello,

kernel test robot noticed a -7.4% regression of filebench.sum_operations/s on:


commit: 681ce8623567ba7e7333908e9826b77145312dda ("vfs: Delete the associated dentry when deleting a file")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


[the regression is still present on linus/master 2bfcfd584ff5ccc8bb7acde19b42570414bf880b]


testcase: filebench
test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory
parameters:

disk: 1HDD
fs: ext4
test: webproxy.f
cpufreq_governor: performance




If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags:
| Reported-by: kernel test robot <[email protected]>
| Closes: https://lore.kernel.org/oe-lkp/[email protected]


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240529/[email protected]

=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/rootfs/tbox_group/test/testcase:
gcc-13/performance/1HDD/ext4/x86_64-rhel-8.3/debian-12-x86_64-20240206.cgz/lkp-icl-2sp6/webproxy.f/filebench

commit:
29c73fc794 ("Merge tag 'perf-tools-for-v6.10-1-2024-05-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools")
681ce86235 ("vfs: Delete the associated dentry when deleting a file")

29c73fc794c83505 681ce8623567ba7e7333908e982
---------------- ---------------------------
%stddev %change %stddev
\ | \
31537383 ± 2% -75.1% 7846497 ± 4% cpuidle..usage
27.21 +1.4% 27.59 iostat.cpu.system
3830823 ± 5% -16.2% 3208825 ± 4% numa-numastat.node1.local_node
3916065 ± 5% -16.3% 3277633 ± 3% numa-numastat.node1.numa_hit
455641 -74.2% 117514 ± 4% vmstat.system.cs
90146 -34.9% 58712 vmstat.system.in
0.14 -0.0 0.12 ± 2% mpstat.cpu.all.irq%
0.07 -0.0 0.04 ± 2% mpstat.cpu.all.soft%
0.56 -0.2 0.36 ± 2% mpstat.cpu.all.usr%
2038 ± 6% -25.8% 1511 ± 3% perf-c2c.DRAM.local
20304 ± 14% -62.9% 7523 ± 2% perf-c2c.DRAM.remote
18850 ± 16% -71.0% 5470 ± 2% perf-c2c.HITM.local
13220 ± 15% -68.1% 4218 ± 3% perf-c2c.HITM.remote
32070 ± 15% -69.8% 9688 ± 2% perf-c2c.HITM.total
191435 ± 7% +37.3% 262935 ± 11% sched_debug.cfs_rq:/.avg_vruntime.stddev
191435 ± 7% +37.3% 262935 ± 11% sched_debug.cfs_rq:/.min_vruntime.stddev
285707 -72.1% 79601 ± 11% sched_debug.cpu.nr_switches.avg
344088 ± 2% -69.8% 103953 ± 9% sched_debug.cpu.nr_switches.max
206926 ± 8% -73.0% 55912 ± 15% sched_debug.cpu.nr_switches.min
26177 ± 10% -63.9% 9453 ± 10% sched_debug.cpu.nr_switches.stddev
5.00 ± 9% +21.2% 6.06 ± 6% sched_debug.cpu.nr_uninterruptible.stddev
497115 ± 40% -44.8% 274644 ± 44% numa-meminfo.node0.AnonPages
2037838 ± 26% -78.4% 440153 ± 49% numa-meminfo.node1.Active
2001934 ± 26% -79.8% 405182 ± 52% numa-meminfo.node1.Active(anon)
527723 ± 38% +42.4% 751463 ± 16% numa-meminfo.node1.AnonPages
3853109 ± 35% -85.5% 559704 ± 33% numa-meminfo.node1.FilePages
93331 ± 18% -58.7% 38529 ± 22% numa-meminfo.node1.Mapped
5189577 ± 27% -61.5% 1999161 ± 13% numa-meminfo.node1.MemUsed
2014284 ± 26% -78.2% 439808 ± 51% numa-meminfo.node1.Shmem
123485 ± 41% -45.0% 67888 ± 44% numa-vmstat.node0.nr_anon_pages
500704 ± 26% -79.8% 101309 ± 52% numa-vmstat.node1.nr_active_anon
131174 ± 38% +42.6% 187092 ± 16% numa-vmstat.node1.nr_anon_pages
963502 ± 35% -85.5% 139952 ± 33% numa-vmstat.node1.nr_file_pages
23724 ± 18% -59.2% 9690 ± 22% numa-vmstat.node1.nr_mapped
503779 ± 26% -78.2% 109954 ± 51% numa-vmstat.node1.nr_shmem
500704 ± 26% -79.8% 101309 ± 52% numa-vmstat.node1.nr_zone_active_anon
3915420 ± 5% -16.3% 3276906 ± 3% numa-vmstat.node1.numa_hit
3830177 ± 5% -16.2% 3208097 ± 4% numa-vmstat.node1.numa_local
2431824 -65.5% 839190 ± 4% meminfo.Active
2357128 -67.3% 770208 ± 4% meminfo.Active(anon)
74695 ± 3% -7.6% 68981 ± 2% meminfo.Active(file)
5620559 -27.6% 4067556 meminfo.Cached
3838924 -40.4% 2286726 meminfo.Committed_AS
25660 ± 19% +25.8% 32289 ± 5% meminfo.Inactive(file)
141631 ± 5% -32.4% 95728 ± 4% meminfo.Mapped
8334057 -18.6% 6783406 meminfo.Memused
2390655 -64.9% 837973 ± 4% meminfo.Shmem
9824314 -15.2% 8328190 meminfo.max_used_kB
1893 -7.4% 1752 filebench.sum_bytes_mb/s
45921381 -7.4% 42512980 filebench.sum_operations
765287 -7.4% 708444 filebench.sum_operations/s
201392 -7.4% 186432 filebench.sum_reads/s
0.04 +263.5% 0.14 filebench.sum_time_ms/op
40278 -7.4% 37286 filebench.sum_writes/s
48591837 -7.4% 44996528 filebench.time.file_system_outputs
6443 ± 3% -88.7% 729.10 ± 4% filebench.time.involuntary_context_switches
3556 +1.4% 3605 filebench.time.percent_of_cpu_this_job_got
5677 +2.1% 5798 filebench.time.system_time
99.20 -41.4% 58.09 ± 2% filebench.time.user_time
37526666 -74.5% 9587296 ± 4% filebench.time.voluntary_context_switches
589410 -67.3% 192526 ± 4% proc-vmstat.nr_active_anon
18674 ± 3% -7.6% 17253 ± 2% proc-vmstat.nr_active_file
6075100 -7.4% 5625692 proc-vmstat.nr_dirtied
3065571 +1.3% 3104313 proc-vmstat.nr_dirty_background_threshold
6138638 +1.3% 6216217 proc-vmstat.nr_dirty_threshold
1407207 -27.6% 1019126 proc-vmstat.nr_file_pages
30829764 +1.3% 31217496 proc-vmstat.nr_free_pages
262267 +3.4% 271067 proc-vmstat.nr_inactive_anon
6406 ± 19% +26.1% 8076 ± 5% proc-vmstat.nr_inactive_file
35842 ± 5% -32.2% 24284 ± 4% proc-vmstat.nr_mapped
597809 -65.0% 209518 ± 4% proc-vmstat.nr_shmem
32422 -3.3% 31365 proc-vmstat.nr_slab_reclaimable
589410 -67.3% 192526 ± 4% proc-vmstat.nr_zone_active_anon
18674 ± 3% -7.6% 17253 ± 2% proc-vmstat.nr_zone_active_file
262267 +3.4% 271067 proc-vmstat.nr_zone_inactive_anon
6406 ± 19% +26.1% 8076 ± 5% proc-vmstat.nr_zone_inactive_file
100195 ± 10% -54.0% 46112 ± 10% proc-vmstat.numa_hint_faults
48654 ± 9% -50.1% 24286 ± 13% proc-vmstat.numa_hint_faults_local
7506558 -12.4% 6577262 proc-vmstat.numa_hit
7373151 -12.6% 6444638 proc-vmstat.numa_local
803560 ± 4% -6.4% 752097 ± 5% proc-vmstat.numa_pte_updates
4259084 -3.8% 4098506 proc-vmstat.pgactivate
7959837 -11.3% 7064279 proc-vmstat.pgalloc_normal
870736 -9.4% 789267 proc-vmstat.pgfault
7181295 -5.6% 6775640 proc-vmstat.pgfree
1.96 ± 2% -36.9% 1.23 perf-stat.i.MPKI
3.723e+09 +69.5% 6.309e+09 perf-stat.i.branch-instructions
2.70 -0.0 2.66 perf-stat.i.branch-miss-rate%
16048312 -38.4% 9889213 perf-stat.i.branch-misses
16.44 -2.0 14.42 perf-stat.i.cache-miss-rate%
43146188 -47.3% 22744395 ± 2% perf-stat.i.cache-misses
1.141e+08 -39.8% 68732731 perf-stat.i.cache-references
465903 -75.4% 114745 ± 4% perf-stat.i.context-switches
4.11 -36.6% 2.61 perf-stat.i.cpi
1.22e+11 -5.2% 1.157e+11 perf-stat.i.cpu-cycles
236.15 -18.3% 192.90 perf-stat.i.cpu-migrations
1997 ± 2% +40.1% 2798 perf-stat.i.cycles-between-cache-misses
1.644e+10 +90.3% 3.13e+10 perf-stat.i.instructions
0.38 +14.7% 0.43 perf-stat.i.ipc
3.63 -75.7% 0.88 ± 4% perf-stat.i.metric.K/sec
4592 ± 2% -11.6% 4057 perf-stat.i.minor-faults
4592 ± 2% -11.6% 4057 perf-stat.i.page-faults
2.62 -72.3% 0.73 ± 2% perf-stat.overall.MPKI
0.43 -0.3 0.16 perf-stat.overall.branch-miss-rate%
37.79 -4.6 33.22 perf-stat.overall.cache-miss-rate%
7.41 -50.1% 3.70 perf-stat.overall.cpi
2827 +80.1% 5092 ± 2% perf-stat.overall.cycles-between-cache-misses
0.13 +100.5% 0.27 perf-stat.overall.ipc
3.693e+09 +77.2% 6.544e+09 perf-stat.ps.branch-instructions
15913729 -36.1% 10173711 perf-stat.ps.branch-misses
42783592 -44.9% 23577137 ± 2% perf-stat.ps.cache-misses
1.132e+08 -37.3% 70963587 perf-stat.ps.cache-references
461953 -74.2% 118964 ± 4% perf-stat.ps.context-switches
234.44 -17.3% 193.77 perf-stat.ps.cpu-migrations
1.632e+10 +99.0% 3.246e+10 perf-stat.ps.instructions
4555 ± 2% -10.9% 4060 perf-stat.ps.minor-faults
4555 ± 2% -10.9% 4060 perf-stat.ps.page-faults
2.659e+12 +99.2% 5.299e+12 perf-stat.total.instructions



[1]

for the original patch at
https://lore.kernel.org/all/[email protected]/

we applied it on top of
3c999d1ae3 ("Merge tag 'wq-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq")

and observed a similar regression:


=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/rootfs/tbox_group/test/testcase:
gcc-13/performance/1HDD/ext4/x86_64-rhel-8.3/debian-12-x86_64-20240206.cgz/lkp-icl-2sp6/webproxy.f/filebench

commit:
3c999d1ae3 ("Merge tag 'wq-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq")
3681ce3644 ("vfs: Delete the associated dentry when deleting a file")

3c999d1ae3c75991 3681ce364442ce2ec7c7fbe90ad
---------------- ---------------------------
%stddev %change %stddev
\ | \
31.06 +2.8% 31.94 boot-time.boot
30573542 ± 2% -77.0% 7043084 ± 5% cpuidle..usage
27.25 +1.3% 27.61 iostat.cpu.system
0.14 -0.0 0.12 mpstat.cpu.all.irq%
0.07 -0.0 0.04 mpstat.cpu.all.soft%
0.56 -0.2 0.34 ± 2% mpstat.cpu.all.usr%
0.29 ±100% -77.4% 0.07 ± 28% vmstat.procs.b
448491 ± 2% -76.3% 106251 ± 5% vmstat.system.cs
90174 -36.5% 57279 vmstat.system.in
3460368 ± 4% -10.3% 3103696 ± 4% numa-numastat.node0.local_node
3522472 ± 4% -9.2% 3197492 ± 3% numa-numastat.node0.numa_hit
3928489 ± 4% -17.7% 3232163 ± 3% numa-numastat.node1.local_node
3998985 ± 3% -18.2% 3270980 ± 3% numa-numastat.node1.numa_hit
1968 ± 5% -23.2% 1511 perf-c2c.DRAM.local
16452 ± 22% -54.2% 7541 ± 4% perf-c2c.DRAM.remote
14780 ± 26% -64.0% 5321 ± 4% perf-c2c.HITM.local
10689 ± 24% -60.1% 4262 ± 5% perf-c2c.HITM.remote
25469 ± 25% -62.4% 9584 ± 4% perf-c2c.HITM.total
196899 ± 10% +31.1% 258125 ± 11% sched_debug.cfs_rq:/.avg_vruntime.stddev
196899 ± 10% +31.1% 258125 ± 11% sched_debug.cfs_rq:/.min_vruntime.stddev
299051 ± 12% -76.0% 71664 ± 15% sched_debug.cpu.nr_switches.avg
355466 ± 11% -73.4% 94490 ± 14% sched_debug.cpu.nr_switches.max
219349 ± 12% -76.6% 51435 ± 12% sched_debug.cpu.nr_switches.min
25523 ± 11% -67.4% 8322 ± 18% sched_debug.cpu.nr_switches.stddev
36526 ± 4% -16.4% 30519 ± 6% numa-meminfo.node0.Active(file)
897165 ± 14% -26.7% 657740 ± 9% numa-meminfo.node0.AnonPages.max
23571 ± 10% -14.5% 20159 ± 10% numa-meminfo.node0.Dirty
2134726 ± 10% -76.9% 493176 ± 35% numa-meminfo.node1.Active
2096208 ± 11% -78.3% 455673 ± 38% numa-meminfo.node1.Active(anon)
965352 ± 13% +23.5% 1192437 ± 2% numa-meminfo.node1.AnonPages.max
18386 ± 17% +23.8% 22761 ± 4% numa-meminfo.node1.Inactive(file)
2108104 ± 11% -76.7% 492042 ± 37% numa-meminfo.node1.Shmem
2395006 ± 2% -67.4% 779863 ± 3% meminfo.Active
2319964 ± 2% -69.3% 711848 ± 4% meminfo.Active(anon)
75041 ± 2% -9.4% 68015 ± 2% meminfo.Active(file)
5583921 -28.3% 4002297 meminfo.Cached
3802632 -41.5% 2224370 meminfo.Committed_AS
28940 ± 5% +13.6% 32890 ± 4% meminfo.Inactive(file)
134576 ± 6% -31.2% 92641 ± 3% meminfo.Mapped
8310087 -19.2% 6718172 meminfo.Memused
2354275 ± 2% -67.1% 775732 ± 4% meminfo.Shmem
9807659 -15.7% 8271698 meminfo.max_used_kB
1903 -9.3% 1725 filebench.sum_bytes_mb/s
46168615 -9.4% 41846487 filebench.sum_operations
769403 -9.4% 697355 filebench.sum_operations/s
202475 -9.4% 183514 filebench.sum_reads/s
0.04 +268.3% 0.14 filebench.sum_time_ms/op
40495 -9.4% 36703 filebench.sum_writes/s
48846906 -9.3% 44298468 filebench.time.file_system_outputs
6633 -89.4% 701.33 ± 6% filebench.time.involuntary_context_switches
3561 +1.3% 3607 filebench.time.percent_of_cpu_this_job_got
5686 +2.1% 5804 filebench.time.system_time
98.62 -44.2% 55.04 ± 2% filebench.time.user_time
36939924 ± 2% -76.6% 8653175 ± 5% filebench.time.voluntary_context_switches
9134 ± 4% -16.5% 7628 ± 6% numa-vmstat.node0.nr_active_file
3141362 ± 3% -11.5% 2780445 ± 4% numa-vmstat.node0.nr_dirtied
9134 ± 4% -16.5% 7628 ± 6% numa-vmstat.node0.nr_zone_active_file
3522377 ± 4% -9.2% 3197360 ± 3% numa-vmstat.node0.numa_hit
3460272 ± 4% -10.3% 3103565 ± 4% numa-vmstat.node0.numa_local
524285 ± 11% -78.3% 113936 ± 38% numa-vmstat.node1.nr_active_anon
4630 ± 17% +22.6% 5674 ± 4% numa-vmstat.node1.nr_inactive_file
527242 ± 11% -76.7% 123018 ± 37% numa-vmstat.node1.nr_shmem
524285 ± 11% -78.3% 113936 ± 38% numa-vmstat.node1.nr_zone_active_anon
4630 ± 17% +22.5% 5674 ± 4% numa-vmstat.node1.nr_zone_inactive_file
3998675 ± 3% -18.2% 3270307 ± 3% numa-vmstat.node1.numa_hit
3928179 ± 4% -17.7% 3231491 ± 3% numa-vmstat.node1.numa_local
1.82 ± 18% -0.5 1.28 ± 16% perf-profile.calltrace.cycles-pp.mmap_region.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64
1.58 ± 8% -0.5 1.13 ± 18% perf-profile.calltrace.cycles-pp.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.53 ± 9% -0.4 1.13 ± 18% perf-profile.calltrace.cycles-pp.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.05 ± 7% -0.3 1.76 ± 11% perf-profile.calltrace.cycles-pp.update_sd_lb_stats.sched_balance_find_src_group.sched_balance_rq.sched_balance_newidle.balance_fair
3.80 ± 5% -0.3 3.52 ± 5% perf-profile.calltrace.cycles-pp.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
2.11 ± 11% +0.3 2.39 ± 4% perf-profile.calltrace.cycles-pp.sched_setaffinity.evlist_cpu_iterator__next.read_counters.process_interval.dispatch_events
3.55 ± 10% +0.3 3.86 ± 4% perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault
2.63 ± 9% +0.4 3.03 ± 6% perf-profile.calltrace.cycles-pp.evlist_cpu_iterator__next.read_counters.process_interval.dispatch_events.cmd_stat
3.47 ± 2% -0.6 2.86 ± 10% perf-profile.children.cycles-pp.vm_mmap_pgoff
3.30 ± 2% -0.6 2.73 ± 10% perf-profile.children.cycles-pp.do_mmap
3.02 ± 6% -0.6 2.46 ± 11% perf-profile.children.cycles-pp.mmap_region
2.34 ± 8% -0.6 1.80 ± 12% perf-profile.children.cycles-pp.ksys_mmap_pgoff
3.80 ± 5% -0.3 3.52 ± 5% perf-profile.children.cycles-pp.smpboot_thread_fn
0.29 ± 2% -0.1 0.17 ± 71% perf-profile.children.cycles-pp.acpi_evaluate_dsm
0.29 ± 2% -0.1 0.17 ± 71% perf-profile.children.cycles-pp.acpi_evaluate_object
0.29 ± 2% -0.1 0.17 ± 71% perf-profile.children.cycles-pp.acpi_nfit_ctl
0.29 ± 2% -0.1 0.17 ± 71% perf-profile.children.cycles-pp.acpi_nfit_query_poison
0.29 ± 2% -0.1 0.17 ± 71% perf-profile.children.cycles-pp.acpi_nfit_scrub
0.16 ± 36% -0.1 0.05 ± 44% perf-profile.children.cycles-pp._find_first_bit
0.10 ± 44% -0.1 0.03 ±100% perf-profile.children.cycles-pp.mtree_load
0.30 ± 25% +0.2 0.47 ± 13% perf-profile.children.cycles-pp.__update_blocked_fair
0.23 ± 55% -0.2 0.08 ± 70% perf-profile.self.cycles-pp.malloc
0.16 ± 40% -0.1 0.05 ± 44% perf-profile.self.cycles-pp._find_first_bit
0.19 ± 30% -0.1 0.09 ± 84% perf-profile.self.cycles-pp.d_alloc_parallel
0.86 ± 17% +0.3 1.12 ± 10% perf-profile.self.cycles-pp.menu_select
580113 ± 2% -69.3% 177978 ± 4% proc-vmstat.nr_active_anon
18761 ± 2% -9.3% 17008 ± 2% proc-vmstat.nr_active_file
6107024 -9.3% 5538417 proc-vmstat.nr_dirtied
3066271 +1.3% 3105966 proc-vmstat.nr_dirty_background_threshold
6140041 +1.3% 6219526 proc-vmstat.nr_dirty_threshold
1398313 -28.3% 1002802 proc-vmstat.nr_file_pages
30835864 +1.3% 31234154 proc-vmstat.nr_free_pages
262597 +2.8% 269986 proc-vmstat.nr_inactive_anon
7233 ± 5% +13.4% 8201 ± 4% proc-vmstat.nr_inactive_file
34066 ± 6% -31.1% 23487 ± 3% proc-vmstat.nr_mapped
588705 ± 2% -67.0% 193984 ± 4% proc-vmstat.nr_shmem
32476 -3.6% 31292 proc-vmstat.nr_slab_reclaimable
580113 ± 2% -69.3% 177978 ± 4% proc-vmstat.nr_zone_active_anon
18761 ± 2% -9.3% 17008 ± 2% proc-vmstat.nr_zone_active_file
262597 +2.8% 269986 proc-vmstat.nr_zone_inactive_anon
7233 ± 5% +13.4% 8201 ± 4% proc-vmstat.nr_zone_inactive_file
148417 ± 19% -82.3% 26235 ± 17% proc-vmstat.numa_hint_faults
76831 ± 23% -84.5% 11912 ± 33% proc-vmstat.numa_hint_faults_local
7524741 -14.0% 6471471 proc-vmstat.numa_hit
7392150 -14.2% 6338859 proc-vmstat.numa_local
826291 ± 4% -12.3% 724471 ± 4% proc-vmstat.numa_pte_updates
4284054 -6.1% 4024194 proc-vmstat.pgactivate
7979760 -12.9% 6948927 proc-vmstat.pgalloc_normal
917223 ± 2% -16.2% 768255 proc-vmstat.pgfault
7212208 -7.4% 6679624 proc-vmstat.pgfree
1.97 -39.5% 1.19 ± 2% perf-stat.i.MPKI
3.749e+09 +65.2% 6.195e+09 perf-stat.i.branch-instructions
2.69 -0.0 2.65 perf-stat.i.branch-miss-rate%
15906654 -39.7% 9595633 perf-stat.i.branch-misses
16.53 -2.2 14.37 perf-stat.i.cache-miss-rate%
43138175 -48.8% 22080984 ± 2% perf-stat.i.cache-misses
1.137e+08 -41.1% 67035007 ± 2% perf-stat.i.cache-references
458704 ± 2% -77.4% 103593 ± 6% perf-stat.i.context-switches
4.04 -39.3% 2.45 perf-stat.i.cpi
1.221e+11 -5.4% 1.155e+11 perf-stat.i.cpu-cycles
238.75 -19.5% 192.29 perf-stat.i.cpu-migrations
1960 +45.0% 2843 ± 2% perf-stat.i.cycles-between-cache-misses
1.678e+10 +103.7% 3.419e+10 perf-stat.i.instructions
0.39 +17.5% 0.46 perf-stat.i.ipc
3.58 ± 2% -77.8% 0.80 ± 6% perf-stat.i.metric.K/sec
4918 ± 3% -19.9% 3940 perf-stat.i.minor-faults
4918 ± 3% -19.9% 3940 perf-stat.i.page-faults
2.57 -74.9% 0.65 ± 2% perf-stat.overall.MPKI
0.42 -0.3 0.15 perf-stat.overall.branch-miss-rate%
37.92 -4.8 33.07 perf-stat.overall.cache-miss-rate%
7.27 -53.5% 3.38 perf-stat.overall.cpi
2830 +85.1% 5239 ± 2% perf-stat.overall.cycles-between-cache-misses
0.14 +115.0% 0.30 perf-stat.overall.ipc
3.72e+09 +72.8% 6.428e+09 perf-stat.ps.branch-instructions
15775403 -37.5% 9854215 perf-stat.ps.branch-misses
42773264 -46.5% 22897139 ± 2% perf-stat.ps.cache-misses
1.128e+08 -38.6% 69221969 ± 2% perf-stat.ps.cache-references
454754 ± 2% -76.4% 107434 ± 6% perf-stat.ps.context-switches
237.02 -18.5% 193.17 perf-stat.ps.cpu-migrations
1.666e+10 +113.0% 3.548e+10 perf-stat.ps.instructions
4878 ± 3% -19.3% 3937 perf-stat.ps.minor-faults
4878 ± 3% -19.3% 3937 perf-stat.ps.page-faults
2.715e+12 +113.5% 5.796e+12 perf-stat.total.instructions


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



2024-05-29 16:41:30

by Linus Torvalds

Subject: Re: [linus:master] [vfs] 681ce86235: filebench.sum_operations/s -7.4% regression

On Tue, 28 May 2024 at 22:52, kernel test robot <[email protected]> wrote:
>
> kernel test robot noticed a -7.4% regression of filebench.sum_operations/s on:
>
> commit: 681ce8623567 ("vfs: Delete the associated dentry when deleting a file")

Well, there we are. I guess I'm reverting this, and we're back to the
drawing board for some of the other alternatives to fixing Yafang's
issue.

Al, did you decide on what approach you'd prefer?

Linus

2024-05-30 07:04:47

by Yafang Shao

Subject: Re: [linus:master] [vfs] 681ce86235: filebench.sum_operations/s -7.4% regression

On Wed, May 29, 2024 at 1:52 PM kernel test robot <[email protected]> wrote:
>
>
> hi, Yafang Shao,
>
> we captured this filebench regression after this patch was merged into mainline.
>
> [...]
>
> kernel test robot noticed a -7.4% regression of filebench.sum_operations/s on:
>
> commit: 681ce8623567ba7e7333908e9826b77145312dda ("vfs: Delete the associated dentry when deleting a file")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
> [... full report and comparison tables trimmed; identical to the original message above ...]
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki

Thanks for your report. I will try to reproduce it on my test machine.

--
Regards
Yafang

2024-05-30 07:38:05

by Yafang Shao

Subject: Re: [linus:master] [vfs] 681ce86235: filebench.sum_operations/s -7.4% regression

On Thu, May 30, 2024 at 12:38 AM Linus Torvalds
<[email protected]> wrote:
>
> On Tue, 28 May 2024 at 22:52, kernel test robot <[email protected]> wrote:
> >
> > kernel test robot noticed a -7.4% regression of filebench.sum_operations/s on:
> >
> > commit: 681ce8623567 ("vfs: Delete the associated dentry when deleting a file")
>
> Well, there we are. I guess I'm reverting this, and we're back to the
> drawing board for some of the other alternatives to fixing Yafang's
> issue.

Hi Linus,

I just checked the test case webproxy.f[0], which triggered the regression.

This test case follows a deletefile-createfile pattern, as shown below:

- flowop deletefile name=deletefile1, filesetname=bigfileset
- flowop createfile name=createfile1, filesetname=bigfileset, fd=1

It seems that this pattern is causing the regression. As we discussed
earlier, my patch might negatively impact this delete-create pattern.
The question is whether this scenario is something we need to address.
Perhaps it only occurs in this specific benchmark and doesn't
represent a real-world workload.

[0] https://github.com/filebench/filebench/blob/master/workloads/webproxy.f
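
For clarity, that pattern boils down to an unlink() immediately followed by an
open(O_CREAT) on the same name. Below is a minimal C sketch of such a loop
(illustrative only, not the LKP reproducer; the path and iteration count are
arbitrary choices):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define ITERATIONS 100000	/* arbitrary; large enough to dominate noise */

int main(void)
{
	const char *path = "/tmp/dcache-churn.tmp";	/* arbitrary path */
	char buf[4096] = { 0 };

	for (int i = 0; i < ITERATIONS; i++) {
		/* "flowop deletefile": removes the file and, with the
		 * patch applied, its dentry as well.  The very first
		 * pass fails with ENOENT, which is harmless here. */
		unlink(path);

		/* "flowop createfile" plus a small write: with the dentry
		 * gone, the create has to allocate and instantiate a
		 * fresh dentry instead of reusing the cached one. */
		int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
		if (fd < 0) {
			perror("open");
			return 1;
		}
		if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
			perror("write");
			close(fd);
			return 1;
		}
		close(fd);
	}

	unlink(path);
	return 0;
}

Running a loop like this on ext4 before and after the commit should exercise
the same delete-create path that the benchmark measures.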

>
> Al, did you decide on what approach you'd prefer?
>
> Linus



--
Regards
Yafang