2022-07-21 15:31:51

by kernel test robot

Subject: [xfs] 345a4666a7: vm-scalability.throughput -91.7% regression


(just an FYI on the possible performance impact of disabling large folios;
our config, as attached, uses the default of N for XFS_LARGE_FOLIOS)
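
The disabling patch itself is not included in this thread, so the following is
only a minimal, hypothetical sketch of how a Kconfig gate such as
XFS_LARGE_FOLIOS could work, assuming it simply makes the
mapping_set_large_folios() opt-in (normally done when XFS sets up a new
inode's address space) conditional on that symbol. The helper name
xfs_set_large_folios() is illustrative and not taken from the patch:

#include <linux/fs.h>
#include <linux/pagemap.h>

#ifdef CONFIG_XFS_LARGE_FOLIOS
/* Developer config (Y): opt the mapping in to multi-page (large) folios. */
static inline void xfs_set_large_folios(struct inode *inode)
{
        mapping_set_large_folios(inode->i_mapping);
}
#else
/* Default (N): the mapping stays on order-0 pages only. */
static inline void xfs_set_large_folios(struct inode *inode)
{
}
#endif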


Greetings,

FYI, we noticed a -91.7% regression of vm-scalability.throughput due to commit:


commit: 345a4666a721a81c343186768cdd95817767195f ("xfs: disable large folios except for developers")
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git xfs-5.20-merge

in testcase: vm-scalability
on test machine: 192 threads 4 sockets Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
with following parameters:

runtime: 300s
size: 128G
test: truncate-seq
cpufreq_governor: performance
ucode: 0x500320a

test-description: The motivation behind this suite is to exercise functions and regions of the mm/ subsystem of the Linux kernel which are of interest to us.
test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/



Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file

# if you come across any failure that blocks the test,
# please remove ~/.lkp and the /lkp directory to run from a clean state.

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase/ucode:
gcc-11/performance/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/300s/128G/lkp-csl-2ap4/truncate-seq/vm-scalability/0x500320a

commit:
v5.19-rc6
345a4666a7 ("xfs: disable large folios except for developers")

v5.19-rc6 345a4666a721a81c343186768cd
---------------- ---------------------------
%stddev %change %stddev
\ | \
2.273e+08 ? 2% -91.7% 18799157 ? 3% vm-scalability.median
3232073 -39.4% 1959951 ? 2% vm-scalability.median_fault
2.273e+08 ? 2% -91.7% 18799157 ? 3% vm-scalability.throughput
3232073 -39.4% 1959951 ? 2% vm-scalability.throughput_fault
310.90 +7.8% 335.09 ? 2% vm-scalability.time.elapsed_time
310.90 +7.8% 335.09 ? 2% vm-scalability.time.elapsed_time.max
2569 ? 5% +86.0% 4779 ? 5% vm-scalability.time.involuntary_context_switches
351753 -39.4% 213094 vm-scalability.time.minor_page_faults
306.24 +7.8% 330.10 ? 2% vm-scalability.time.system_time
7830 ? 2% -39.3% 4753 vm-scalability.time.voluntary_context_switches
9.886e+08 -40.0% 5.931e+08 vm-scalability.workload
0.06 ? 13% +76.5% 0.10 ? 11% turbostat.IPC
2721 -4.6% 2595 vmstat.system.cs
150937 ? 53% +5164.6% 7946242 ?216% numa-numastat.node0.local_node
228628 ? 30% +3401.4% 8005103 ?214% numa-numastat.node0.numa_hit
553341 ?217% +12187.5% 67991788 ? 71% numa-numastat.node2.numa_foreign
403930 ?141% +9282.5% 37898905 ? 24% numa-numastat.node3.numa_miss
474222 ?119% +7889.6% 37888462 ? 24% numa-numastat.node3.other_node
99252826 ? 2% -96.7% 3233417 meminfo.Active
99240827 ? 2% -96.8% 3221251 meminfo.Active(file)
1884024 ? 19% +4912.8% 94442149 ? 3% meminfo.Inactive
1506415 ? 24% +6144.4% 94066043 ? 3% meminfo.Inactive(file)
136540 +169.9% 368573 meminfo.KReclaimable
136540 +169.9% 368573 meminfo.SReclaimable
441347 +50.7% 665187 meminfo.Slab
24271284 ? 38% -97.3% 646595 ? 56% numa-meminfo.node0.Active
24268014 ? 38% -97.3% 643929 ? 57% numa-meminfo.node0.Active(file)
716994 ? 34% +2567.7% 19127375 ? 53% numa-meminfo.node0.Inactive
495364 ? 51% +3725.8% 18951464 ? 53% numa-meminfo.node0.Inactive(file)
21285416 ? 49% -97.5% 527742 ? 96% numa-meminfo.node1.Active
21282960 ? 49% -97.5% 526173 ? 96% numa-meminfo.node1.Active(file)
419121 ? 45% +3642.1% 15683937 ? 95% numa-meminfo.node1.Inactive
377081 ? 46% +4040.9% 15614731 ? 96% numa-meminfo.node1.Inactive(file)
16252967 ? 97% -92.5% 1212745 ? 29% numa-meminfo.node2.Active
16250538 ? 97% -92.6% 1210353 ? 29% numa-meminfo.node2.Active(file)
116291 ?107% +29797.7% 34768495 ? 28% numa-meminfo.node2.Inactive
81355 ?153% +42539.4% 34689522 ? 27% numa-meminfo.node2.Inactive(file)
11693 ? 21% +950.0% 122780 ? 12% numa-meminfo.node2.KReclaimable
11693 ? 21% +950.0% 122780 ? 12% numa-meminfo.node2.SReclaimable
75095 ? 13% +159.7% 195036 ? 7% numa-meminfo.node2.Slab
37318573 ? 23% -97.7% 850340 ? 41% numa-meminfo.node3.Active
37314728 ? 23% -97.7% 844809 ? 41% numa-meminfo.node3.Active(file)
620310 ? 37% +3927.7% 24984387 ? 40% numa-meminfo.node3.Inactive
541538 ? 35% +4503.9% 24932103 ? 40% numa-meminfo.node3.Inactive(file)
12955 ? 20% +575.7% 87541 ? 51% numa-meminfo.node3.KReclaimable
12955 ? 20% +575.7% 87541 ? 51% numa-meminfo.node3.SReclaimable
82461 ? 9% +89.0% 155868 ? 33% numa-meminfo.node3.Slab
4540165 ? 24% +497.1% 27111048 ? 40% proc-vmstat.compact_free_scanned
90267 ? 37% +2424.2% 2278537 ? 55% proc-vmstat.compact_isolated
24802210 ? 2% -96.8% 805889 proc-vmstat.nr_active_file
376124 ? 24% +6156.3% 23531515 ? 3% proc-vmstat.nr_inactive_file
7170 -1.2% 7085 proc-vmstat.nr_mapped
34134 +170.0% 92179 proc-vmstat.nr_slab_reclaimable
76199 -2.7% 74152 proc-vmstat.nr_slab_unreclaimable
24802217 ? 2% -96.8% 805906 proc-vmstat.nr_zone_active_file
376115 ? 24% +6156.5% 23531520 ? 3% proc-vmstat.nr_zone_inactive_file
3089882 ? 4% +3252.9% 1.036e+08 ? 6% proc-vmstat.numa_foreign
2180041 ? 6% +2016.9% 46149032 ? 14% proc-vmstat.numa_hit
1919079 ? 7% +2290.0% 45865798 ? 14% proc-vmstat.numa_local
3088671 ? 4% +3258.9% 1.037e+08 ? 6% proc-vmstat.numa_miss
3349532 ? 4% +3002.3% 1.039e+08 ? 6% proc-vmstat.numa_other
2.471e+08 -98.1% 4641169 proc-vmstat.pgactivate
1758828 ? 15% -46.5% 940301 ? 20% proc-vmstat.pgalloc_dma32
2.467e+08 -39.7% 1.487e+08 proc-vmstat.pgalloc_normal
27876754 ? 22% -100.0% 0.00 proc-vmstat.pgdeactivate
1440433 -5.7% 1359036 proc-vmstat.pgfault
2.486e+08 -39.3% 1.508e+08 proc-vmstat.pgfree
11.00 ? 27% +600.0% 77.00 ? 35% proc-vmstat.pgmigrate_fail
2920 -5.3% 2765 proc-vmstat.pgpgout
27876754 ? 22% -100.0% 0.00 proc-vmstat.pgrefill
6070874 ? 38% -97.4% 160830 ? 57% numa-vmstat.node0.nr_active_file
123992 ? 51% +3718.3% 4734421 ? 53% numa-vmstat.node0.nr_inactive_file
6070886 ? 38% -97.4% 160834 ? 57% numa-vmstat.node0.nr_zone_active_file
123989 ? 51% +3718.4% 4734421 ? 53% numa-vmstat.node0.nr_zone_inactive_file
228527 ? 30% +3403.0% 8005405 ?214% numa-vmstat.node0.numa_hit
150837 ? 53% +5168.3% 7946544 ?216% numa-vmstat.node0.numa_local
5330549 ? 49% -97.5% 131406 ? 96% numa-vmstat.node1.nr_active_file
93632 ? 47% +4065.6% 3900358 ? 96% numa-vmstat.node1.nr_inactive_file
5330543 ? 49% -97.5% 131408 ? 96% numa-vmstat.node1.nr_zone_active_file
93628 ? 47% +4065.8% 3900358 ? 96% numa-vmstat.node1.nr_zone_inactive_file
16.83 ? 81% -100.0% 0.00 numa-vmstat.node1.workingset_nodes
4047166 ? 97% -92.5% 302328 ? 29% numa-vmstat.node2.nr_active_file
20939 ?154% +41281.7% 8665114 ? 27% numa-vmstat.node2.nr_inactive_file
2923 ? 21% +949.4% 30675 ? 12% numa-vmstat.node2.nr_slab_reclaimable
4047161 ? 97% -92.5% 302336 ? 29% numa-vmstat.node2.nr_zone_active_file
20941 ?154% +41278.7% 8665116 ? 27% numa-vmstat.node2.nr_zone_inactive_file
553341 ?217% +12187.5% 67991788 ? 71% numa-vmstat.node2.numa_foreign
9337600 ? 23% -97.7% 211052 ? 41% numa-vmstat.node3.nr_active_file
135432 ? 34% +4498.1% 6227247 ? 40% numa-vmstat.node3.nr_inactive_file
3239 ? 20% +575.3% 21873 ? 51% numa-vmstat.node3.nr_slab_reclaimable
9337600 ? 23% -97.7% 211056 ? 41% numa-vmstat.node3.nr_zone_active_file
135424 ? 34% +4498.3% 6227248 ? 40% numa-vmstat.node3.nr_zone_inactive_file
403930 ?141% +9282.5% 37898905 ? 24% numa-vmstat.node3.numa_miss
474222 ?119% +7889.6% 37888462 ? 24% numa-vmstat.node3.numa_other
113.00 ? 77% -100.0% 0.00 numa-vmstat.node3.workingset_nodes
4.27e+08 ? 7% +83.1% 7.82e+08 ? 5% perf-stat.i.branch-instructions
5.25 ? 11% +10.2 15.49 ? 37% perf-stat.i.cache-miss-rate%
4060515 ? 26% +103.0% 8244161 ? 15% perf-stat.i.cache-misses
2618 -4.1% 2512 perf-stat.i.context-switches
4.61 ? 18% -39.2% 2.80 ? 16% perf-stat.i.cpi
206.54 -1.2% 204.10 perf-stat.i.cpu-migrations
2910 ? 15% -57.3% 1243 ? 8% perf-stat.i.cycles-between-cache-misses
5.726e+08 ? 6% +77.3% 1.015e+09 ? 4% perf-stat.i.dTLB-loads
2.989e+08 ? 5% +86.4% 5.571e+08 ? 4% perf-stat.i.dTLB-stores
2.107e+09 ? 6% +82.1% 3.837e+09 ? 5% perf-stat.i.instructions
803.34 ? 5% +98.5% 1594 ? 11% perf-stat.i.instructions-per-iTLB-miss
0.23 ? 13% +73.1% 0.40 ? 14% perf-stat.i.ipc
2.20 ? 8% -21.0% 1.74 ? 4% perf-stat.i.major-faults
6.76 ? 6% +81.3% 12.26 ? 4% perf-stat.i.metric.M/sec
4128 -12.9% 3595 perf-stat.i.minor-faults
873102 ? 3% -28.8% 621748 ? 10% perf-stat.i.node-load-misses
165173 ? 13% +676.5% 1282502 ? 4% perf-stat.i.node-store-misses
41508 ? 10% +1074.0% 487294 ? 18% perf-stat.i.node-stores
4130 -12.9% 3596 perf-stat.i.page-faults
5.28 ? 11% +9.1 14.41 ? 35% perf-stat.overall.cache-miss-rate%
4.32 ? 18% -41.5% 2.53 ? 15% perf-stat.overall.cpi
2284 ? 11% -48.4% 1179 ? 8% perf-stat.overall.cycles-between-cache-misses
809.97 ? 5% +97.0% 1595 ? 11% perf-stat.overall.instructions-per-iTLB-miss
0.24 ? 14% +70.3% 0.40 ? 14% perf-stat.overall.ipc
79.66 ? 4% -7.1 72.54 ? 6% perf-stat.overall.node-store-miss-rate%
663.37 ? 5% +227.9% 2174 ? 5% perf-stat.overall.path-length
4.256e+08 ? 7% +83.2% 7.798e+08 ? 5% perf-stat.ps.branch-instructions
4047234 ? 26% +103.1% 8220321 ? 15% perf-stat.ps.cache-misses
2609 -4.0% 2504 perf-stat.ps.context-switches
5.708e+08 ? 6% +77.4% 1.012e+09 ? 4% perf-stat.ps.dTLB-loads
2.979e+08 ? 5% +86.5% 5.556e+08 ? 4% perf-stat.ps.dTLB-stores
2.1e+09 ? 6% +82.2% 3.826e+09 ? 5% perf-stat.ps.instructions
2.20 ? 8% -20.9% 1.74 ? 3% perf-stat.ps.major-faults
4110 -12.9% 3579 perf-stat.ps.minor-faults
869912 ? 3% -28.7% 619922 ? 10% perf-stat.ps.node-load-misses
164601 ? 13% +676.7% 1278429 ? 4% perf-stat.ps.node-store-misses
41365 ? 10% +1074.4% 485782 ? 18% perf-stat.ps.node-stores
4112 -12.9% 3581 perf-stat.ps.page-faults
6.558e+11 ? 5% +96.7% 1.29e+12 ? 5% perf-stat.total.instructions
26.00 ? 5% -26.0 0.00 perf-profile.calltrace.cycles-pp.page_cache_ra_order.filemap_get_pages.filemap_read.xfs_file_buffered_read.xfs_file_read_iter
25.90 ? 5% -25.9 0.00 perf-profile.calltrace.cycles-pp.read_pages.page_cache_ra_order.filemap_get_pages.filemap_read.xfs_file_buffered_read
25.90 ? 5% -25.9 0.00 perf-profile.calltrace.cycles-pp.iomap_readahead.read_pages.page_cache_ra_order.filemap_get_pages.filemap_read
25.89 ? 5% -25.9 0.00 perf-profile.calltrace.cycles-pp.iomap_readpage_iter.iomap_readahead.read_pages.page_cache_ra_order.filemap_get_pages
25.78 ? 5% -25.8 0.00 perf-profile.calltrace.cycles-pp.zero_user_segments.iomap_readpage_iter.iomap_readahead.read_pages.page_cache_ra_order
24.63 ? 5% -21.3 3.32 ? 15% perf-profile.calltrace.cycles-pp.memset_erms.zero_user_segments.iomap_readpage_iter.iomap_readahead.read_pages
0.84 ? 10% -0.3 0.53 ? 45% perf-profile.calltrace.cycles-pp.worker_thread.kthread.ret_from_fork
0.83 ? 9% -0.3 0.52 ? 45% perf-profile.calltrace.cycles-pp.process_one_work.worker_thread.kthread.ret_from_fork
0.78 ? 9% -0.3 0.50 ? 45% perf-profile.calltrace.cycles-pp.drm_fb_helper_damage_work.process_one_work.worker_thread.kthread.ret_from_fork
0.78 ? 8% -0.3 0.50 ? 45% perf-profile.calltrace.cycles-pp.drm_fb_helper_damage_blit_real.drm_fb_helper_damage_work.process_one_work.worker_thread.kthread
1.02 ? 17% -0.3 0.75 ? 14% perf-profile.calltrace.cycles-pp.ret_from_fork
1.02 ? 17% -0.3 0.75 ? 14% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork
0.75 ? 7% -0.3 0.49 ? 45% perf-profile.calltrace.cycles-pp.memcpy_toio.drm_fb_helper_damage_blit_real.drm_fb_helper_damage_work.process_one_work.worker_thread
0.00 +0.7 0.74 ? 27% perf-profile.calltrace.cycles-pp.__pagevec_lru_add.folio_add_lru.filemap_add_folio.page_cache_ra_unbounded.filemap_get_pages
0.00 +0.8 0.80 ? 22% perf-profile.calltrace.cycles-pp.rmqueue.get_page_from_freelist.__alloc_pages.folio_alloc.page_cache_ra_unbounded
0.00 +0.9 0.88 ? 27% perf-profile.calltrace.cycles-pp.folio_add_lru.filemap_add_folio.page_cache_ra_unbounded.filemap_get_pages.filemap_read
0.00 +1.2 1.15 ? 30% perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages.folio_alloc.page_cache_ra_unbounded.filemap_get_pages
0.00 +1.4 1.36 ? 27% perf-profile.calltrace.cycles-pp.__alloc_pages.folio_alloc.page_cache_ra_unbounded.filemap_get_pages.filemap_read
0.00 +1.5 1.50 ? 25% perf-profile.calltrace.cycles-pp.folio_alloc.page_cache_ra_unbounded.filemap_get_pages.filemap_read.xfs_file_buffered_read
0.00 +1.6 1.58 ? 17% perf-profile.calltrace.cycles-pp.__filemap_add_folio.filemap_add_folio.page_cache_ra_unbounded.filemap_get_pages.filemap_read
0.94 ? 8% +2.0 2.96 ? 16% perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyout.copy_page_to_iter.filemap_read.xfs_file_buffered_read
0.96 ? 8% +2.0 3.00 ? 16% perf-profile.calltrace.cycles-pp.copyout.copy_page_to_iter.filemap_read.xfs_file_buffered_read.xfs_file_read_iter
1.06 ? 8% +2.2 3.25 ? 16% perf-profile.calltrace.cycles-pp.copy_page_to_iter.filemap_read.xfs_file_buffered_read.xfs_file_read_iter.new_sync_read
0.00 +2.5 2.49 ? 20% perf-profile.calltrace.cycles-pp.filemap_add_folio.page_cache_ra_unbounded.filemap_get_pages.filemap_read.xfs_file_buffered_read
0.00 +3.4 3.43 ? 15% perf-profile.calltrace.cycles-pp.zero_user_segments.iomap_readpage_iter.iomap_readahead.read_pages.page_cache_ra_unbounded
0.00 +19.5 19.50 ? 17% perf-profile.calltrace.cycles-pp.iomap_readpage_iter.iomap_readahead.read_pages.page_cache_ra_unbounded.filemap_get_pages
0.00 +19.9 19.91 ? 17% perf-profile.calltrace.cycles-pp.iomap_readahead.read_pages.page_cache_ra_unbounded.filemap_get_pages.filemap_read
0.00 +19.9 19.92 ? 17% perf-profile.calltrace.cycles-pp.read_pages.page_cache_ra_unbounded.filemap_get_pages.filemap_read.xfs_file_buffered_read
0.00 +24.1 24.14 ? 17% perf-profile.calltrace.cycles-pp.page_cache_ra_unbounded.filemap_get_pages.filemap_read.xfs_file_buffered_read.xfs_file_read_iter
26.00 ? 5% -26.0 0.00 perf-profile.children.cycles-pp.page_cache_ra_order
25.77 ? 5% -22.4 3.36 ? 15% perf-profile.children.cycles-pp.memset_erms
25.78 ? 5% -22.4 3.43 ? 15% perf-profile.children.cycles-pp.zero_user_segments
25.89 ? 5% -6.4 19.51 ? 17% perf-profile.children.cycles-pp.iomap_readpage_iter
25.90 ? 5% -6.0 19.92 ? 17% perf-profile.children.cycles-pp.iomap_readahead
25.90 ? 5% -6.0 19.92 ? 17% perf-profile.children.cycles-pp.read_pages
1.02 ? 17% -0.3 0.75 ? 14% perf-profile.children.cycles-pp.ret_from_fork
1.02 ? 17% -0.3 0.75 ? 14% perf-profile.children.cycles-pp.kthread
0.84 ? 10% -0.2 0.61 ? 10% perf-profile.children.cycles-pp.worker_thread
0.83 ? 9% -0.2 0.61 ? 10% perf-profile.children.cycles-pp.process_one_work
0.78 ? 9% -0.2 0.58 ? 10% perf-profile.children.cycles-pp.drm_fb_helper_damage_work
0.78 ? 8% -0.2 0.58 ? 10% perf-profile.children.cycles-pp.drm_fb_helper_damage_blit_real
0.78 ? 8% -0.2 0.58 ? 10% perf-profile.children.cycles-pp.memcpy_toio
0.49 ? 12% -0.1 0.40 ? 11% perf-profile.children.cycles-pp._raw_spin_lock
0.46 ? 5% -0.1 0.39 ? 11% perf-profile.children.cycles-pp.get_next_timer_interrupt
0.11 ? 9% -0.0 0.06 ? 47% perf-profile.children.cycles-pp.bit_putcs
0.10 ? 17% -0.0 0.05 ? 46% perf-profile.children.cycles-pp.drm_fbdev_fb_imageblit
0.12 ? 11% -0.0 0.08 ? 16% perf-profile.children.cycles-pp.con_scroll
0.12 ? 11% -0.0 0.08 ? 16% perf-profile.children.cycles-pp.lf
0.12 ? 11% -0.0 0.08 ? 16% perf-profile.children.cycles-pp.fbcon_scroll
0.12 ? 9% -0.0 0.08 ? 17% perf-profile.children.cycles-pp.vt_console_print
0.09 ? 21% -0.0 0.05 ? 46% perf-profile.children.cycles-pp.fast_imageblit
0.12 ? 10% -0.0 0.08 ? 16% perf-profile.children.cycles-pp.fbcon_putcs
0.12 ? 10% -0.0 0.08 ? 16% perf-profile.children.cycles-pp.fbcon_redraw
0.09 ? 21% -0.0 0.05 ? 46% perf-profile.children.cycles-pp.sys_imageblit
0.14 ? 6% -0.0 0.11 ? 13% perf-profile.children.cycles-pp.wait_for_xmitr
0.05 ? 49% +0.1 0.12 ? 5% perf-profile.children.cycles-pp.__might_resched
0.00 +0.1 0.07 ? 14% perf-profile.children.cycles-pp.folio_mapping
0.00 +0.1 0.08 ? 18% perf-profile.children.cycles-pp.xas_alloc
0.00 +0.1 0.08 ? 18% perf-profile.children.cycles-pp.kmem_cache_alloc_lru
0.00 +0.1 0.08 ? 13% perf-profile.children.cycles-pp.__might_sleep
0.00 +0.1 0.09 ? 17% perf-profile.children.cycles-pp.pagevec_lru_move_fn
0.00 +0.1 0.10 ? 19% perf-profile.children.cycles-pp.folio_unlock
0.00 +0.1 0.10 ? 11% perf-profile.children.cycles-pp.__list_del_entry_valid
0.00 +0.1 0.10 ? 22% perf-profile.children.cycles-pp.alloc_pages
0.00 +0.1 0.11 ? 20% perf-profile.children.cycles-pp.release_pages
0.00 +0.1 0.11 ? 18% perf-profile.children.cycles-pp.xas_start
0.00 +0.1 0.12 ? 13% perf-profile.children.cycles-pp.__mod_node_page_state
0.00 +0.1 0.12 ? 18% perf-profile.children.cycles-pp.try_charge_memcg
0.00 +0.1 0.12 ? 16% perf-profile.children.cycles-pp.xas_create
0.00 +0.1 0.12 ? 21% perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
0.00 +0.1 0.13 ? 32% perf-profile.children.cycles-pp.xas_find_conflict
0.00 +0.1 0.14 ? 28% perf-profile.children.cycles-pp.get_mem_cgroup_from_mm
0.00 +0.2 0.15 ? 17% perf-profile.children.cycles-pp.xa_get_order
0.00 +0.2 0.15 ? 15% perf-profile.children.cycles-pp.__mod_lruvec_state
0.00 +0.2 0.23 ? 24% perf-profile.children.cycles-pp.__mod_lruvec_page_state
0.00 +0.2 0.24 ? 23% perf-profile.children.cycles-pp.filemap_get_read_batch
0.00 +0.2 0.24 ? 20% perf-profile.children.cycles-pp.xas_store
0.00 +0.3 0.25 ? 11% perf-profile.children.cycles-pp.charge_memcg
0.00 +0.3 0.29 ? 17% perf-profile.children.cycles-pp.xa_load
0.00 +0.4 0.39 ? 19% perf-profile.children.cycles-pp.xas_load
0.00 +0.4 0.44 ? 16% perf-profile.children.cycles-pp.__mem_cgroup_charge
0.00 +0.5 0.54 ? 18% perf-profile.children.cycles-pp.__pagevec_lru_add_fn
0.00 +0.6 0.57 ? 22% perf-profile.children.cycles-pp.rmqueue_bulk
0.00 +0.6 0.62 ? 21% perf-profile.children.cycles-pp.folio_mark_accessed
0.00 +0.7 0.75 ? 27% perf-profile.children.cycles-pp.__pagevec_lru_add
0.05 ? 72% +0.8 0.81 ? 22% perf-profile.children.cycles-pp.rmqueue
0.00 +0.9 0.88 ? 27% perf-profile.children.cycles-pp.folio_add_lru
0.08 ? 25% +1.1 1.18 ? 30% perf-profile.children.cycles-pp.get_page_from_freelist
0.10 ? 25% +1.3 1.41 ? 27% perf-profile.children.cycles-pp.__alloc_pages
0.06 ? 51% +1.4 1.50 ? 25% perf-profile.children.cycles-pp.folio_alloc
0.00 +1.6 1.60 ? 17% perf-profile.children.cycles-pp.__filemap_add_folio
0.96 ? 9% +2.0 3.00 ? 16% perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
0.97 ? 8% +2.0 3.00 ? 16% perf-profile.children.cycles-pp.copyout
1.07 ? 8% +2.2 3.26 ? 16% perf-profile.children.cycles-pp.copy_page_to_iter
0.00 +2.5 2.50 ? 20% perf-profile.children.cycles-pp.filemap_add_folio
0.00 +24.1 24.14 ? 17% perf-profile.children.cycles-pp.page_cache_ra_unbounded
25.37 ? 5% -22.0 3.32 ? 14% perf-profile.self.cycles-pp.memset_erms
0.77 ? 9% -0.2 0.58 ? 9% perf-profile.self.cycles-pp.memcpy_toio
0.09 ? 21% -0.0 0.05 ? 46% perf-profile.self.cycles-pp.fast_imageblit
0.10 ? 14% -0.0 0.06 ? 13% perf-profile.self.cycles-pp.push_and_clear_regs
0.05 ? 49% +0.1 0.12 ? 8% perf-profile.self.cycles-pp.__might_resched
0.00 +0.1 0.07 ? 14% perf-profile.self.cycles-pp.folio_mapping
0.00 +0.1 0.07 ? 15% perf-profile.self.cycles-pp.__might_sleep
0.00 +0.1 0.08 ? 20% perf-profile.self.cycles-pp.charge_memcg
0.00 +0.1 0.09 ? 20% perf-profile.self.cycles-pp.xas_start
0.00 +0.1 0.09 ? 18% perf-profile.self.cycles-pp.folio_unlock
0.00 +0.1 0.09 ? 25% perf-profile.self.cycles-pp.__mod_memcg_lruvec_state
0.00 +0.1 0.10 ? 21% perf-profile.self.cycles-pp.release_pages
0.04 ? 71% +0.1 0.13 ? 28% perf-profile.self.cycles-pp.copy_page_to_iter
0.00 +0.1 0.10 ? 7% perf-profile.self.cycles-pp.__list_del_entry_valid
0.00 +0.1 0.10 ? 27% perf-profile.self.cycles-pp.xas_store
0.00 +0.1 0.10 ? 23% perf-profile.self.cycles-pp.try_charge_memcg
0.00 +0.1 0.11 ? 24% perf-profile.self.cycles-pp.folio_add_lru
0.00 +0.1 0.11 ? 14% perf-profile.self.cycles-pp.__mod_node_page_state
0.00 +0.1 0.14 ? 14% perf-profile.self.cycles-pp.__alloc_pages
0.00 +0.1 0.14 ? 29% perf-profile.self.cycles-pp.get_mem_cgroup_from_mm
0.00 +0.2 0.16 ? 22% perf-profile.self.cycles-pp.rmqueue
0.00 +0.2 0.17 ? 11% perf-profile.self.cycles-pp.iomap_readahead
0.00 +0.2 0.18 ? 21% perf-profile.self.cycles-pp.filemap_read
0.00 +0.2 0.19 ? 18% perf-profile.self.cycles-pp.filemap_get_read_batch
0.00 +0.3 0.27 ? 16% perf-profile.self.cycles-pp.__filemap_add_folio
0.00 +0.3 0.31 ? 61% perf-profile.self.cycles-pp.get_page_from_freelist
0.00 +0.3 0.31 ? 18% perf-profile.self.cycles-pp.xas_load
0.00 +0.3 0.33 ? 22% perf-profile.self.cycles-pp.__pagevec_lru_add_fn
0.00 +0.5 0.46 ? 18% perf-profile.self.cycles-pp.folio_mark_accessed
0.00 +0.5 0.48 ? 23% perf-profile.self.cycles-pp.rmqueue_bulk
0.96 ? 9% +2.0 2.96 ? 16% perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
0.00 +15.8 15.83 ? 17% perf-profile.self.cycles-pp.iomap_readpage_iter




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://01.org/lkp



Attachments:
(No filename) (25.94 kB)
config-5.19.0-rc6-00001-g345a4666a721 (166.56 kB)
job-script (7.94 kB)
job.yaml (5.44 kB)
reproduce (49.96 kB)

2022-07-21 22:14:11

by Dave Chinner

Subject: Re: [xfs] 345a4666a7: vm-scalability.throughput -91.7% regression

On Thu, Jul 21, 2022 at 11:08:38PM +0800, kernel test robot wrote:
>
> (just FYI for the possible performance impact of disabling large folios,
> our config, as attached, set default N to XFS_LARGE_FOLIOS)
>
>
> Greeting,
>
> FYI, we noticed a -91.7% regression of vm-scalability.throughput due to commit:
>
>
> commit: 345a4666a721a81c343186768cdd95817767195f ("xfs: disable large folios except for developers")

Say what? I've never seen that change go past on a public list...

> https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git xfs-5.20-merge

Oh, it's in a developer's working tree, not something that has been
proposed for review let alone been merged.

So why is this report being sent to lkml, linux-xfs, etc as if it
was a change merged into an upstream tree rather than just the
developer who owns the tree the commit is in?

-Dave.
--
Dave Chinner
[email protected]

2022-07-21 22:35:30

by Darrick J. Wong

Subject: Re: [xfs] 345a4666a7: vm-scalability.throughput -91.7% regression

On Fri, Jul 22, 2022 at 07:33:37AM +1000, Dave Chinner wrote:
> On Thu, Jul 21, 2022 at 11:08:38PM +0800, kernel test robot wrote:
> >
> > (just FYI for the possible performance impact of disabling large folios,
> > our config, as attached, set default N to XFS_LARGE_FOLIOS)
> >
> >
> > Greeting,
> >
> > FYI, we noticed a -91.7% regression of vm-scalability.throughput due to commit:
> >
> >
> > commit: 345a4666a721a81c343186768cdd95817767195f ("xfs: disable large folios except for developers")
>
> Say what? I've never seen that change go past on a public list...
>
> > https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git xfs-5.20-merge
>
> Oh, it's in a developer's working tree, not something that has been
> proposed for review let alone been merged.

Correct, djwong-dev has a patch so that I can disable multipage folios
so that I could get other QA work done while willy and I try to sort out
the generic/522 corruption problems.
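
For anyone correlating this with the profile in the report (page_cache_ra_order
disappearing in favour of page_cache_ra_unbounded): once a mapping no longer
advertises large-folio support, readahead drops back to order-0 pages. A
simplified paraphrase of the 5.19-era logic in mm/readahead.c, not verbatim
kernel code and with details elided:

/*
 * Rough sketch of page_cache_ra_order(); the helpers referenced here are
 * internal to mm/readahead.c.
 */
static void page_cache_ra_order_sketch(struct readahead_control *ractl,
                                       struct file_ra_state *ra,
                                       unsigned int new_order)
{
        /*
         * The mapping never called mapping_set_large_folios() (upstream also
         * requires a minimum readahead window): fall back to order-0 pages,
         * which is the page_cache_ra_unbounded() path seen in the profile.
         */
        if (!mapping_large_folio_support(ractl->mapping)) {
                do_page_cache_ra(ractl, ra->size, ra->async_size);
                return;
        }

        /* Otherwise allocate progressively larger folios for the window. */
}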

> So why is this report being sent to lkml, linux-xfs, etc as if it
> was a change merged into an upstream tree rather than just the
> developer who owns the tree the commit is in?

I was wondering that myself.

--D

> -Dave.
> --
> Dave Chinner
> [email protected]

2022-07-22 02:14:35

by kernel test robot

Subject: Re: [xfs] 345a4666a7: vm-scalability.throughput -91.7% regression

Hi Darrick, Hi Dave, and all,

sorry that this report was annoying, per Darrick's and Dave's comments below.
We will investigate this case and refine our report process.


On Thu, Jul 21, 2022 at 02:38:51PM -0700, Darrick J. Wong wrote:
> On Fri, Jul 22, 2022 at 07:33:37AM +1000, Dave Chinner wrote:
> > On Thu, Jul 21, 2022 at 11:08:38PM +0800, kernel test robot wrote:
> > >
> > > (just FYI for the possible performance impact of disabling large folios,
> > > our config, as attached, set default N to XFS_LARGE_FOLIOS)
> > >
> > >
> > > Greeting,
> > >
> > > FYI, we noticed a -91.7% regression of vm-scalability.throughput due to commit:
> > >
> > >
> > > commit: 345a4666a721a81c343186768cdd95817767195f ("xfs: disable large folios except for developers")
> >
> > Say what? I've never seen that change go past on a public list...
> >
> > > https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git xfs-5.20-merge
> >
> > Oh, it's in a developer's working tree, not something that has been
> > proposed for review let alone been merged.
>
> Correct, djwong-dev has a patch so that I can disable multipage folios
> so that I could get other QA work done while willy and I try to sort out
> the generic/522 corruption problems.
>
> > So why is this report being sent to lkml, linux-xfs, etc as if it
> > was a change merged into an upstream tree rather than just the
> > developer who owns the tree the commit is in?
>
> I was wondering that myself.
>
> --D
>
> > -Dave.
> > --
> > Dave Chinner
> > [email protected]

2022-07-22 02:44:22

by Darrick J. Wong

Subject: Re: [xfs] 345a4666a7: vm-scalability.throughput -91.7% regression

On Fri, Jul 22, 2022 at 10:10:02AM +0800, Oliver Sang wrote:
> Hi Darrick, Hi Dave, and all,
>
> sorry for this report is annoying according to Darrick and Dave's comments
> below.
> we will investigate this case and refine our report process.

FWIW, you can still send /me/ reports about the xfs development patches
I post to djwong/xfs-linux.git, but it's not necessary to cc linux-xfs
with that, since most of those patches are still under development
and/or working their way through patch review.

--D

>
> On Thu, Jul 21, 2022 at 02:38:51PM -0700, Darrick J. Wong wrote:
> > On Fri, Jul 22, 2022 at 07:33:37AM +1000, Dave Chinner wrote:
> > > On Thu, Jul 21, 2022 at 11:08:38PM +0800, kernel test robot wrote:
> > > >
> > > > (just FYI for the possible performance impact of disabling large folios,
> > > > our config, as attached, set default N to XFS_LARGE_FOLIOS)
> > > >
> > > >
> > > > Greeting,
> > > >
> > > > FYI, we noticed a -91.7% regression of vm-scalability.throughput due to commit:
> > > >
> > > >
> > > > commit: 345a4666a721a81c343186768cdd95817767195f ("xfs: disable large folios except for developers")
> > >
> > > Say what? I've never seen that change go past on a public list...
> > >
> > > > https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git xfs-5.20-merge
> > >
> > > Oh, it's in a developer's working tree, not something that has been
> > > proposed for review let alone been merged.
> >
> > Correct, djwong-dev has a patch so that I can disable multipage folios
> > so that I could get other QA work done while willy and I try to sort out
> > the generic/522 corruption problems.
> >
> > > So why is this report being sent to lkml, linux-xfs, etc as if it
> > > was a change merged into an upstream tree rather than just the
> > > developer who owns the tree the commit is in?
> >
> > I was wondering that myself.
> >
> > --D
> >
> > > -Dave.
> > > --
> > > Dave Chinner
> > > [email protected]