Greetings,
FYI, we noticed a -8.1% regression of phoronix-test-suite.fio.SequentialRead.LinuxAIO.Yes.Yes.4KB.DefaultTestDirectory.mb_s due to commit:
commit: 8b157c14b505f861cf8da783ff89f679a0e50abe ("[PATCH -next] mm/filemap: fix that first page is not mark accessed in filemap_read()")
url: https://github.com/intel-lab-lkp/linux/commits/Yu-Kuai/mm-filemap-fix-that-first-page-is-not-mark-accessed-in-filemap_read/20220602-161035
base: https://git.kernel.org/cgit/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/linux-fsdevel/[email protected]
in testcase: phoronix-test-suite
on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 512G memory
with the following parameters:
test: fio-1.14.1
option_a: Sequential Read
option_b: Linux AIO
option_c: Yes
option_d: Yes
option_e: 4KB
option_f: Default Test Directory
cpufreq_governor: performance
ucode: 0x500320a
test-description: The Phoronix Test Suite is the most comprehensive testing and benchmarking platform available that provides an extensible framework for which new tests can be easily added.
test-url: http://www.phoronix-test-suite.com/
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <[email protected]>
Details are below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# if you come across any failure that blocks the test,
# please remove the ~/.lkp and /lkp directories to run from a clean state.
=========================================================================================
compiler/cpufreq_governor/kconfig/option_a/option_b/option_c/option_d/option_e/option_f/rootfs/tbox_group/test/testcase/ucode:
gcc-11/performance/x86_64-rhel-8.3/Sequential Read/Linux AIO/Yes/Yes/4KB/Default Test Directory/debian-x86_64-phoronix/lkp-csl-2sp7/fio-1.14.1/phoronix-test-suite/0x500320a
commit:
2408f14000 ("Merge branch 'mm-nonmm-unstable' into mm-everything")
8b157c14b5 ("mm/filemap: fix that first page is not mark accessed in filemap_read()")
2408f140000f9597 8b157c14b505f861cf8da783ff8
---------------- ---------------------------
%stddev %change %stddev
\ | \
481388 -8.1% 442333 phoronix-test-suite.fio.SequentialRead.LinuxAIO.Yes.Yes.4KB.DefaultTestDirectory.iops
1880 -8.1% 1727 phoronix-test-suite.fio.SequentialRead.LinuxAIO.Yes.Yes.4KB.DefaultTestDirectory.mb_s
2.894e+08 -8.1% 2.659e+08 phoronix-test-suite.time.file_system_inputs
0.11 ± 22% -0.0 0.08 mpstat.cpu.all.soft%
292.39 ± 35% -35.3% 189.30 ± 8% sched_debug.cpu.clock_task.stddev
933030 +47.4% 1374932 ± 2% numa-meminfo.node0.Active
92985 ± 16% +478.0% 537464 ± 6% numa-meminfo.node0.Active(file)
23246 ± 16% +475.4% 133769 ± 6% numa-vmstat.node0.nr_active_file
23246 ± 16% +475.4% 133769 ± 6% numa-vmstat.node0.nr_zone_active_file
1181131 -8.1% 1085364 vmstat.io.bi
20529 -7.4% 19019 vmstat.system.cs
954480 +45.1% 1384840 ± 3% meminfo.Active
112134 +386.0% 544959 ± 7% meminfo.Active(file)
2756213 -13.9% 2371792 meminfo.Inactive
1492877 -25.8% 1108430 meminfo.Inactive(file)
84.17 ± 10% -11.7% 74.33 turbostat.Avg_MHz
4.72 ± 18% -0.9 3.84 turbostat.Busy%
854421 ±133% -82.0% 154039 ± 20% turbostat.C1
0.49 ±155% -0.4 0.06 ± 11% turbostat.C1%
28033 +386.2% 136307 ± 7% proc-vmstat.nr_active_file
373247 -25.8% 277108 proc-vmstat.nr_inactive_file
28033 +386.2% 136308 ± 7% proc-vmstat.nr_zone_active_file
373247 -25.8% 277108 proc-vmstat.nr_zone_inactive_file
40703167 ± 2% -8.5% 37255189 proc-vmstat.numa_hit
40122593 -7.5% 37096628 proc-vmstat.numa_local
316253 +10501.8% 33528470 proc-vmstat.pgactivate
40072448 -7.3% 37140540 proc-vmstat.pgalloc_normal
39689252 -7.5% 36696525 proc-vmstat.pgfree
1.447e+08 -8.1% 1.33e+08 proc-vmstat.pgpgin
22.95 ± 52% -53.5% 10.67 perf-stat.i.MPKI
1.088e+09 -3.3% 1.052e+09 perf-stat.i.branch-instructions
14531811 ± 29% -30.9% 10047658 perf-stat.i.branch-misses
31350962 -9.2% 28459348 perf-stat.i.cache-misses
86567058 ± 24% -29.3% 61243543 perf-stat.i.cache-references
21004 -7.6% 19398 perf-stat.i.context-switches
7.243e+09 ± 11% -13.5% 6.262e+09 perf-stat.i.cpu-cycles
0.14 ± 95% -0.1 0.01 ± 10% perf-stat.i.dTLB-load-miss-rate%
1307140 ± 15% +17.6% 1537276 perf-stat.i.iTLB-loads
5.234e+09 -2.9% 5.084e+09 perf-stat.i.instructions
2655 ± 5% -10.9% 2366 ± 3% perf-stat.i.instructions-per-iTLB-miss
75383 ± 11% -13.5% 65208 perf-stat.i.metric.GHz
6029414 -6.2% 5655914 perf-stat.i.node-loads
20.94 ± 15% +3.7 24.66 ± 3% perf-stat.i.node-store-miss-rate%
82166 ± 23% +29.0% 106019 ± 2% perf-stat.i.node-store-misses
6382540 -9.0% 5805257 perf-stat.i.node-stores
16.54 ± 24% -27.2% 12.04 perf-stat.overall.MPKI
2862 ± 5% -11.1% 2544 ± 3% perf-stat.overall.instructions-per-iTLB-miss
5.63 ± 15% +1.0 6.67 perf-stat.overall.node-load-miss-rate%
1.27 ± 23% +0.5 1.79 perf-stat.overall.node-store-miss-rate%
1.078e+09 -3.3% 1.043e+09 perf-stat.ps.branch-instructions
14418791 ± 29% -30.9% 9965662 perf-stat.ps.branch-misses
31056696 -9.2% 28199667 perf-stat.ps.cache-misses
85785810 ± 24% -29.3% 60689278 perf-stat.ps.cache-references
20807 -7.6% 19221 perf-stat.ps.context-switches
7.181e+09 ± 11% -13.5% 6.209e+09 perf-stat.ps.cpu-cycles
1296058 ± 15% +17.6% 1524338 perf-stat.ps.iTLB-loads
5.189e+09 -2.9% 5.04e+09 perf-stat.ps.instructions
5972497 -6.2% 5604175 perf-stat.ps.node-loads
81503 ± 23% +29.0% 105130 ± 2% perf-stat.ps.node-store-misses
6322173 -9.0% 5752078 perf-stat.ps.node-stores
6.205e+11 -2.6% 6.041e+11 perf-stat.total.instructions
7.61 ± 14% -1.6 6.00 ± 13% perf-profile.calltrace.cycles-pp.filemap_get_pages.filemap_read.aio_read.io_submit_one.__x64_sys_io_submit
4.09 ± 14% -0.8 3.27 ± 11% perf-profile.calltrace.cycles-pp.invalidate_mapping_pagevec.generic_fadvise.ksys_fadvise64_64.__x64_sys_fadvise64.do_syscall_64
4.10 ± 14% -0.8 3.28 ± 11% perf-profile.calltrace.cycles-pp.__x64_sys_fadvise64.do_syscall_64.entry_SYSCALL_64_after_hwframe
4.10 ± 14% -0.8 3.28 ± 11% perf-profile.calltrace.cycles-pp.ksys_fadvise64_64.__x64_sys_fadvise64.do_syscall_64.entry_SYSCALL_64_after_hwframe
4.10 ± 14% -0.8 3.28 ± 11% perf-profile.calltrace.cycles-pp.generic_fadvise.ksys_fadvise64_64.__x64_sys_fadvise64.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.46 ± 15% -0.5 2.00 ± 11% perf-profile.calltrace.cycles-pp.__x64_sys_io_getevents.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.35 ± 16% -0.4 1.90 ± 11% perf-profile.calltrace.cycles-pp.do_io_getevents.__x64_sys_io_getevents.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.68 ± 16% -0.4 1.30 ± 15% perf-profile.calltrace.cycles-pp.read_pages.page_cache_ra_unbounded.filemap_get_pages.filemap_read.aio_read
1.75 ± 14% -0.4 1.38 ± 10% perf-profile.calltrace.cycles-pp.release_pages.__pagevec_release.invalidate_mapping_pagevec.generic_fadvise.ksys_fadvise64_64
1.77 ± 14% -0.4 1.40 ± 10% perf-profile.calltrace.cycles-pp.__pagevec_release.invalidate_mapping_pagevec.generic_fadvise.ksys_fadvise64_64.__x64_sys_fadvise64
0.89 ± 18% -0.3 0.59 ± 46% perf-profile.calltrace.cycles-pp.filemap_get_read_batch.filemap_get_pages.filemap_read.aio_read.io_submit_one
0.85 ± 18% -0.3 0.57 ± 46% perf-profile.calltrace.cycles-pp.ext4_mpage_readpages.read_pages.page_cache_ra_unbounded.filemap_get_pages.filemap_read
1.49 ± 14% -0.3 1.22 ± 12% perf-profile.calltrace.cycles-pp.folio_alloc.page_cache_ra_unbounded.filemap_get_pages.filemap_read.aio_read
1.32 ± 13% -0.2 1.08 ± 12% perf-profile.calltrace.cycles-pp.__alloc_pages.folio_alloc.page_cache_ra_unbounded.filemap_get_pages.filemap_read
0.98 ± 13% -0.2 0.76 ± 11% perf-profile.calltrace.cycles-pp.free_unref_page_list.release_pages.__pagevec_release.invalidate_mapping_pagevec.generic_fadvise
1.05 ± 12% -0.2 0.84 ± 14% perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages.folio_alloc.page_cache_ra_unbounded.filemap_get_pages
0.90 ± 10% -0.2 0.72 ± 14% perf-profile.calltrace.cycles-pp.rmqueue.get_page_from_freelist.__alloc_pages.folio_alloc.page_cache_ra_unbounded
0.75 ± 15% -0.2 0.59 ± 10% perf-profile.calltrace.cycles-pp.free_unref_page_commit.free_unref_page_list.release_pages.__pagevec_release.invalidate_mapping_pagevec
1.53 ± 17% +0.4 1.95 ± 9% perf-profile.calltrace.cycles-pp.schedule.worker_thread.kthread.ret_from_fork
1.53 ± 17% +0.4 1.95 ± 9% perf-profile.calltrace.cycles-pp.__schedule.schedule.worker_thread.kthread.ret_from_fork
0.00 +1.2 1.17 ± 18% perf-profile.calltrace.cycles-pp.pagevec_lru_move_fn.folio_mark_accessed.filemap_read.aio_read.io_submit_one
0.31 ±101% +2.2 2.47 ± 17% perf-profile.calltrace.cycles-pp.folio_mark_accessed.filemap_read.aio_read.io_submit_one.__x64_sys_io_submit
7.61 ± 14% -1.6 6.00 ± 13% perf-profile.children.cycles-pp.filemap_get_pages
4.10 ± 14% -0.8 3.28 ± 11% perf-profile.children.cycles-pp.__x64_sys_fadvise64
4.10 ± 14% -0.8 3.28 ± 11% perf-profile.children.cycles-pp.ksys_fadvise64_64
4.10 ± 14% -0.8 3.28 ± 11% perf-profile.children.cycles-pp.generic_fadvise
4.10 ± 14% -0.8 3.28 ± 11% perf-profile.children.cycles-pp.invalidate_mapping_pagevec
2.47 ± 15% -0.5 2.00 ± 11% perf-profile.children.cycles-pp.__x64_sys_io_getevents
2.36 ± 16% -0.5 1.90 ± 11% perf-profile.children.cycles-pp.do_io_getevents
1.68 ± 16% -0.4 1.30 ± 15% perf-profile.children.cycles-pp.read_pages
1.77 ± 14% -0.4 1.40 ± 10% perf-profile.children.cycles-pp.__pagevec_release
1.49 ± 14% -0.3 1.22 ± 12% perf-profile.children.cycles-pp.folio_alloc
1.40 ± 12% -0.3 1.14 ± 12% perf-profile.children.cycles-pp.__alloc_pages
1.16 ± 15% -0.3 0.90 ± 13% perf-profile.children.cycles-pp.lookup_ioctx
0.90 ± 18% -0.2 0.67 ± 17% perf-profile.children.cycles-pp.filemap_get_read_batch
1.00 ± 12% -0.2 0.78 ± 12% perf-profile.children.cycles-pp.free_unref_page_list
1.08 ± 11% -0.2 0.86 ± 14% perf-profile.children.cycles-pp.get_page_from_freelist
0.85 ± 18% -0.2 0.65 ± 15% perf-profile.children.cycles-pp.ext4_mpage_readpages
0.88 ± 16% -0.2 0.70 ± 14% perf-profile.children.cycles-pp.__might_resched
0.93 ± 10% -0.2 0.75 ± 14% perf-profile.children.cycles-pp.rmqueue
0.78 ± 15% -0.2 0.61 ± 10% perf-profile.children.cycles-pp.free_unref_page_commit
0.61 ± 16% -0.1 0.48 ± 12% perf-profile.children.cycles-pp.free_pcppages_bulk
0.27 ± 15% -0.1 0.20 ± 11% perf-profile.children.cycles-pp.hrtimer_next_event_without
0.16 ± 22% -0.1 0.11 ± 19% perf-profile.children.cycles-pp.hrtimer_update_next_event
0.08 ± 20% -0.0 0.04 ± 47% perf-profile.children.cycles-pp.mem_cgroup_charge_statistics
0.08 ± 9% -0.0 0.05 ± 47% perf-profile.children.cycles-pp.tick_program_event
1.46 ± 13% +0.3 1.76 ± 8% perf-profile.children.cycles-pp.load_balance
0.00 +0.4 0.43 ± 16% perf-profile.children.cycles-pp.workingset_age_nonresident
0.00 +0.7 0.65 ± 17% perf-profile.children.cycles-pp.workingset_activation
0.00 +0.7 0.67 ± 17% perf-profile.children.cycles-pp.__folio_activate
0.00 +1.2 1.18 ± 18% perf-profile.children.cycles-pp.pagevec_lru_move_fn
0.57 ± 17% +1.9 2.51 ± 17% perf-profile.children.cycles-pp.folio_mark_accessed
4.33 ± 17% -0.9 3.45 ± 13% perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
1.36 ± 12% -0.5 0.84 ± 36% perf-profile.self.cycles-pp.menu_select
0.64 ± 17% -0.2 0.45 ± 18% perf-profile.self.cycles-pp.filemap_get_read_batch
0.86 ± 16% -0.2 0.67 ± 13% perf-profile.self.cycles-pp.__might_resched
0.46 ± 19% -0.1 0.32 ± 18% perf-profile.self.cycles-pp.__get_user_4
0.34 ± 10% -0.1 0.24 ± 3% perf-profile.self.cycles-pp.copy_page_to_iter
0.14 ± 17% -0.0 0.09 ± 32% perf-profile.self.cycles-pp.aio_prep_rw
0.11 ± 14% -0.0 0.07 ± 23% perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
0.08 ± 12% -0.0 0.04 ± 73% perf-profile.self.cycles-pp.tick_program_event
0.14 ± 9% -0.0 0.10 ± 11% perf-profile.self.cycles-pp.atime_needs_update
0.00 +0.2 0.22 ± 26% perf-profile.self.cycles-pp.workingset_activation
0.00 +0.3 0.29 ± 19% perf-profile.self.cycles-pp.pagevec_lru_move_fn
0.00 +0.4 0.35 ± 16% perf-profile.self.cycles-pp.__folio_activate
0.00 +0.4 0.43 ± 16% perf-profile.self.cycles-pp.workingset_age_nonresident
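
Note on reading the change column in the table above: a minimal sketch, assuming that a value with a trailing "%" is the relative change of the patched commit against the base commit, and that a value without "%" (percentage-type rows such as turbostat.Busy% and the perf-profile rows) is an absolute difference in percentage points. This reading is inferred from the numbers in the table, not taken from LKP documentation, and the helper names percent_change and point_delta are hypothetical.

```python
def percent_change(base: float, patched: float) -> float:
    """Relative change of the patched value against the base value, in percent."""
    return (patched - base) / base * 100.0


def point_delta(base_pct: float, patched_pct: float) -> float:
    """Absolute difference between two percentages, in percentage points."""
    return patched_pct - base_pct


# (base, patched) pairs copied from the table above.
relative_rows = {
    "fio SequentialRead iops": (481388, 442333),
    "fio SequentialRead mb_s": (1880, 1727),
    "meminfo.Active(file)": (112134, 544959),
    "proc-vmstat.pgactivate": (316253, 33528470),
}

# perf-profile rows are already percentages of sampled cycles.
point_rows = {
    "children.filemap_get_pages": (7.61, 6.00),
    "children.folio_mark_accessed": (0.57, 2.51),
}

for name, (base, patched) in relative_rows.items():
    print(f"{name}: {percent_change(base, patched):+.1f}%")

for name, (base, patched) in point_rows.items():
    print(f"{name}: {point_delta(base, patched):+.1f}")
```

Under these assumptions the script yields the same figures as the change column for the selected rows, including the -8.1% throughput drop and the roughly hundredfold increase in pgactivate.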
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://01.org/lkp