Hello,
kernel test robot noticed a 6.4% improvement of fsmark.files_per_sec on:
commit: 63dfa1004322d596417f23da43cdc43cf6298c71 ("nvme: move NVME_QUIRK_DEALLOCATE_ZEROES out of nvme_config_discard")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
testcase: fsmark
test machine: 96 threads 2 sockets Intel(R) Xeon(R) Platinum 8260L CPU @ 2.40GHz (Cascade Lake) with 128G memory
parameters:
iterations: 8
disk: 1SSD
nr_threads: 16
fs: ext4
filesize: 8K
test_size: 75G
sync_method: fsyncBeforeClose
nr_directories: 16d
nr_files_per_directory: 256fpd
cpufreq_governor: performance
Details are as follows:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240315/[email protected]
=========================================================================================
compiler/cpufreq_governor/disk/filesize/fs/iterations/kconfig/nr_directories/nr_files_per_directory/nr_threads/rootfs/sync_method/tbox_group/test_size/testcase:
gcc-12/performance/1SSD/8K/ext4/8/x86_64-rhel-8.3/16d/256fpd/16/debian-12-x86_64-20240206.cgz/fsyncBeforeClose/lkp-csl-2sp3/75G/fsmark
commit:
152694c829 ("nvme: set max_hw_sectors unconditionally")
63dfa10043 ("nvme: move NVME_QUIRK_DEALLOCATE_ZEROES out of nvme_config_discard")
152694c82950a093 63dfa1004322d596417f23da43c
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
    492322 ±  8%     +15.1%     566574 ±  2%  meminfo.Active(anon)
    501325 ±  8%     +15.0%     576573 ±  2%  meminfo.Shmem
    458144 ± 18%     +22.6%     561659 ±  2%  numa-meminfo.node1.Active(anon)
    462634 ± 18%     +22.6%     567357 ±  2%  numa-meminfo.node1.Shmem
    114517 ± 18%     +22.6%     140395 ±  2%  numa-vmstat.node1.nr_active_anon
    115654 ± 18%     +22.6%     141838 ±  2%  numa-vmstat.node1.nr_shmem
    114517 ± 18%     +22.6%     140395 ±  2%  numa-vmstat.node1.nr_zone_active_anon
    396.50          +745.6%       3353 ±181%  vmstat.memory.buff
    201414            +6.0%     213473        vmstat.system.cs
     57760            +5.4%      60879        vmstat.system.in
     22022 ±  2%      +6.4%      23432        fsmark.files_per_sec
    502.56            -5.9%     472.94        fsmark.time.elapsed_time
    502.56            -5.9%     472.94        fsmark.time.elapsed_time.max
    243.62 ±  2%      +5.0%     255.75        fsmark.time.percent_of_cpu_this_job_got
    123079 ±  8%     +15.1%     141624 ±  2%  proc-vmstat.nr_active_anon
      8462            +2.1%       8637        proc-vmstat.nr_mapped
    125342 ±  8%     +15.0%     144138 ±  2%  proc-vmstat.nr_shmem
    123079 ±  8%     +15.1%     141624 ±  2%  proc-vmstat.nr_zone_active_anon
    140970 ±  7%     +14.1%     160889 ±  2%  proc-vmstat.pgactivate
 3.617e+08            -3.7%  3.483e+08        proc-vmstat.pgpgout
      2.10 ±  9%      -0.2        1.85 ±  3%  perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
      2.08 ±  9%      -0.2        1.84 ±  3%  perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
      0.99 ± 20%      +0.4        1.37 ± 11%  perf-profile.calltrace.cycles-pp.jbd2__journal_start.ext4_do_writepages.ext4_writepages.do_writepages.filemap_fdatawrite_wbc
      0.50 ± 60%      +0.4        0.89 ± 14%  perf-profile.calltrace.cycles-pp.add_transaction_credits.start_this_handle.jbd2__journal_start.ext4_do_writepages.ext4_writepages
      2.50 ± 10%      -0.3        2.20 ±  3%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
      2.48 ± 10%      -0.3        2.19 ±  3%  perf-profile.children.cycles-pp.hrtimer_interrupt
      0.24 ±  6%      +0.0        0.27 ±  6%  perf-profile.children.cycles-pp.ext4_dirty_inode
      0.19 ± 11%      +0.1        0.24 ±  6%  perf-profile.children.cycles-pp.ext4_block_bitmap_csum_set
 1.107e+09            +6.6%   1.18e+09        perf-stat.i.branch-instructions
    202521            +6.1%     214902        perf-stat.i.context-switches
 1.322e+10 ±  2%      +6.7%   1.41e+10        perf-stat.i.cpu-cycles
  5.46e+09            +6.6%  5.818e+09        perf-stat.i.instructions
      2.11            +6.2%       2.24        perf-stat.i.metric.K/sec
 1.105e+09            +6.6%  1.178e+09        perf-stat.ps.branch-instructions
    202013            +6.1%     214333        perf-stat.ps.context-switches
 1.319e+10 ±  2%      +6.7%  1.407e+10        perf-stat.ps.cpu-cycles
 5.448e+09            +6.5%  5.805e+09        perf-stat.ps.instructions
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
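
For context on what the patch under test moves around: below is a minimal
before/after sketch, reconstructed from the commit title alone. The function
signature and the caller-side helper name are assumptions for illustration,
not the literal diff.

    /* Before (sketch): the write-zeroes quirk was handled inside the
     * discard configuration, even though it is not a discard limit.
     */
    static void nvme_config_discard(struct gendisk *disk, struct nvme_ns *ns)
    {
            /* ... set up DSM deallocate (discard) limits ... */

            if (ns->ctrl->quirks & NVME_QUIRK_DEALLOCATE_ZEROES)
                    blk_queue_max_write_zeroes_sectors(disk->queue, UINT_MAX);
    }

    /* After (sketch): the quirk is checked by the caller, next to the
     * regular write-zeroes limit, so nvme_config_discard() handles only
     * discard.  nvme_config_write_zeroes() is an assumed helper name.
     */
    nvme_config_discard(disk, ns);
    if (ns->ctrl->quirks & NVME_QUIRK_DEALLOCATE_ZEROES)
            blk_queue_max_write_zeroes_sectors(disk->queue, UINT_MAX);
    else
            nvme_config_write_zeroes(disk, ns);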
On Fri, Mar 15, 2024 at 04:21:13PM +0800, kernel test robot wrote:
>
>
> Hello,
>
> kernel test robot noticed a 6.4% improvement of fsmark.files_per_sec on:
>
>
> commit: 63dfa1004322d596417f23da43cdc43cf6298c71 ("nvme: move NVME_QUIRK_DEALLOCATE_ZEROES out of nvme_config_discard")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
> testcase: fsmark
> test machine: 96 threads 2 sockets Intel(R) Xeon(R) Platinum 8260L CPU @ 2.40GHz (Cascade Lake) with 128G memory
> parameters:
That is kinda odd and unexpected. Is this system using one of the old
Intel SSDs that this quirk is actually used for?
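
(For reference, the quirk is attached per PCI ID in drivers/nvme/host/pci.c,
roughly as sketched below; the exact device IDs and model comments here are
from memory and may differ from the current table.)

    static const struct pci_device_id nvme_id_table[] = {
            { PCI_VDEVICE(INTEL, 0x0953),   /* Intel 750, DC P3500/P3600/P3700 */
                    .driver_data = NVME_QUIRK_STRIPE_SIZE |
                                   NVME_QUIRK_DEALLOCATE_ZEROES, },
            /* ... other old Intel data-center parts carry the same quirk ... */
            { 0, }
    };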
Hi Christoph,
On Sun, Mar 17, 2024 at 09:36:45PM +0100, Christoph Hellwig wrote:
> On Fri, Mar 15, 2024 at 04:21:13PM +0800, kernel test robot wrote:
> >
> >
> > Hello,
> >
> > kernel test robot noticed a 6.4% improvement of fsmark.files_per_sec on:
> >
> >
> > commit: 63dfa1004322d596417f23da43cdc43cf6298c71 ("nvme: move NVME_QUIRK_DEALLOCATE_ZEROES out of nvme_config_discard")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> >
> > testcase: fsmark
> > test machine: 96 threads 2 sockets Intel(R) Xeon(R) Platinum 8260L CPU @ 2.40GHz (Cascade Lake) with 128G memory
> > parameters:
>
> That is kinda odd and unexpected. Is this system using one of the old
> Intel SSDs that this quirk is actually used for?
>
Yeah, the NVMe SSD on this system is quite old (DC P3608 series):
https://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/ssd-dc-p3608-spec.pdf
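
That would make the quirk live on this box: NVME_QUIRK_DEALLOCATE_ZEROES marks
drives whose deallocated LBAs are guaranteed to read back as zeroes, so the
driver can service REQ_OP_WRITE_ZEROES as a DSM deallocate instead of a real
Write Zeroes command. A condensed sketch of that dispatch, assuming it still
matches drivers/nvme/host/core.c:

    static inline blk_status_t nvme_setup_write_zeroes(struct nvme_ns *ns,
                    struct request *req, struct nvme_command *cmnd)
    {
            /* Deallocated blocks read back as zeroes on quirked drives,
             * so a write-zeroes request can be served by a deallocate. */
            if (ns->ctrl->quirks & NVME_QUIRK_DEALLOCATE_ZEROES)
                    return nvme_setup_discard(ns, req, cmnd);

            /* ... otherwise build an actual Write Zeroes command ... */
            return BLK_STS_OK;
    }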