2021-05-19 21:16:56

by kernel test robot

[permalink] [raw]
Subject: [sched,fair] 0c2de3f054: aim7.jobs-per-min -4.1% regression



Greeting,

FYI, we noticed a -4.1% regression of aim7.jobs-per-min due to commit:


commit: 0c2de3f054a59f15e01804b75a04355c48de628c ("sched,fair: Alternative sched_slice()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


in testcase: aim7
on test machine: 144 threads 4 sockets Intel(R) Xeon(R) Gold 5318H CPU @ 2.50GHz with 128G memory
with following parameters:

disk: 1BRD_48G
fs: xfs
test: sync_disk_rw
load: 600
cpufreq_governor: performance
ucode: 0x700001e

test-description: AIM7 is a traditional UNIX system level benchmark suite which is used to test and measure the performance of multiuser system.
test-url: https://sourceforge.net/projects/aimbench/files/aim-suite7/



If you fix the issue, kindly add following tag
Reported-by: kernel test robot <[email protected]>


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
bin/lkp run generated-yaml-file

=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/load/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/1BRD_48G/xfs/x86_64-rhel-8.3/600/debian-10.4-x86_64-20200603.cgz/lkp-cpl-4sp1/sync_disk_rw/aim7/0x700001e

commit:
d27e9ae2f2 ("sched: Move /proc/sched_debug to debugfs")
0c2de3f054 ("sched,fair: Alternative sched_slice()")

d27e9ae2f244805b 0c2de3f054a59f15e01804b75a0
---------------- ---------------------------
%stddev %change %stddev
\ | \
19673 -4.1% 18863 aim7.jobs-per-min
183.04 +4.3% 190.90 aim7.time.elapsed_time
183.04 +4.3% 190.90 aim7.time.elapsed_time.max
346160 +1.9% 352778 aim7.time.involuntary_context_switches
96177 ? 3% +7.9% 103802 ? 3% aim7.time.minor_page_faults
10424 +5.5% 10997 aim7.time.system_time
1.054e+08 +1.5% 1.069e+08 aim7.time.voluntary_context_switches
2.70 ? 17% -0.5 2.21 turbostat.C1%
1423343 ? 2% +4.1% 1481526 proc-vmstat.pgfault
49105 +2.8% 50472 proc-vmstat.pgreuse
780983 -4.1% 748697 vmstat.io.bo
1172066 -3.2% 1134470 vmstat.system.cs
0.11 ? 69% -62.8% 0.04 ? 84% perf-sched.sch_delay.avg.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
0.04 ? 5% -38.2% 0.02 ? 11% perf-sched.sch_delay.avg.ms.__xfs_log_force_lsn.xfs_log_force_lsn.xfs_file_fsync.xfs_file_buffered_write
0.17 ? 91% -71.5% 0.05 ? 42% perf-sched.sch_delay.avg.ms.preempt_schedule_common.__cond_resched.down_write.xfs_ilock.xfs_vn_update_time
0.02 ? 24% +307.1% 0.08 ? 41% perf-sched.sch_delay.avg.ms.schedule_timeout.__down.down.xlog_write_iclog
0.07 ? 12% +138.2% 0.17 ? 10% perf-sched.sch_delay.avg.ms.xlog_wait_on_iclog.__xfs_log_force_lsn.xfs_file_fsync.xfs_file_buffered_write
0.01 ? 63% +114.6% 0.01 ? 39% perf-sched.sch_delay.max.ms.preempt_schedule_common.__cond_resched.down_write.xfs_ilock.xfs_trans_alloc_inode
4.05 ? 58% -53.2% 1.90 ? 49% perf-sched.sch_delay.max.ms.preempt_schedule_common.__cond_resched.down_write.xfs_ilock.xfs_vn_update_time
2.52 ? 21% -49.5% 1.27 ? 40% perf-sched.sch_delay.max.ms.preempt_schedule_common.__cond_resched.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin
13.41 ? 59% -57.8% 5.66 ? 47% perf-sched.sch_delay.max.ms.preempt_schedule_common.__cond_resched.submit_bio_checks.submit_bio_noacct.submit_bio
1.50 ? 16% -23.9% 1.14 ? 17% perf-sched.sch_delay.max.ms.rwsem_down_read_slowpath.xfs_log_commit_cil.__xfs_trans_commit.xfs_create
1.89 ? 70% -89.7% 0.20 ?204% perf-sched.sch_delay.max.ms.rwsem_down_read_slowpath.xfs_log_commit_cil.__xfs_trans_commit.xfs_inactive_truncate
1.82 ? 2% -30.1% 1.27 ? 3% perf-sched.wait_and_delay.avg.ms.__xfs_log_force_lsn.xfs_log_force_lsn.xfs_file_fsync.xfs_file_buffered_write
4.15 ? 7% +117.8% 9.03 ?102% perf-sched.wait_and_delay.avg.ms.rwsem_down_read_slowpath.xfs_log_commit_cil.__xfs_trans_commit.xfs_trans_roll
0.46 ? 6% +216.3% 1.46 ? 6% perf-sched.wait_and_delay.avg.ms.xlog_wait_on_iclog.__xfs_log_force_lsn.xfs_file_fsync.xfs_file_buffered_write
31541 ? 9% +102.0% 63701 ? 9% perf-sched.wait_and_delay.count.__xfs_log_force_lsn.xfs_log_force_lsn.xfs_file_fsync.xfs_file_buffered_write
6913 ? 3% -6.2% 6485 ? 3% perf-sched.wait_and_delay.count.rwsem_down_read_slowpath.xfs_log_commit_cil.__xfs_trans_commit.xfs_iomap_write_unwritten
62109 ? 4% -11.3% 55110 ? 3% perf-sched.wait_and_delay.count.rwsem_down_read_slowpath.xfs_log_commit_cil.__xfs_trans_commit.xfs_vn_update_time
320.33 ? 14% +22.3% 391.67 ? 8% perf-sched.wait_and_delay.count.schedule_timeout.__down.down.xlog_write_iclog
31320 ? 9% +102.9% 63533 ? 9% perf-sched.wait_and_delay.count.xlog_wait_on_iclog.__xfs_log_force_lsn.xfs_file_fsync.xfs_file_buffered_write
9.12 ? 44% +41.2% 12.88 ? 41% perf-sched.wait_and_delay.max.ms.__xfs_log_force_lsn.xfs_log_force_lsn.xfs_file_fsync.xfs_file_buffered_write
1.78 ? 2% -30.0% 1.24 ? 3% perf-sched.wait_time.avg.ms.__xfs_log_force_lsn.xfs_log_force_lsn.xfs_file_fsync.xfs_file_buffered_write
1.70 ? 33% -52.4% 0.81 ? 35% perf-sched.wait_time.avg.ms.rwsem_down_read_slowpath.xfs_log_commit_cil.__xfs_trans_commit.xfs_inactive_truncate
3.98 ? 8% +122.7% 8.86 ?104% perf-sched.wait_time.avg.ms.rwsem_down_read_slowpath.xfs_log_commit_cil.__xfs_trans_commit.xfs_trans_roll
0.39 ? 5% +230.3% 1.29 ? 5% perf-sched.wait_time.avg.ms.xlog_wait_on_iclog.__xfs_log_force_lsn.xfs_file_fsync.xfs_file_buffered_write
5.59 ? 17% -34.1% 3.69 ? 27% perf-sched.wait_time.max.ms.rwsem_down_read_slowpath.xfs_log_commit_cil.__xfs_trans_commit.xfs_inactive_ifree
8084230 +8.7% 8786389 interrupts.CAL:Function_call_interrupts
3123 ? 3% +7.7% 3363 ? 2% interrupts.CPU107.RES:Rescheduling_interrupts
55371 ? 2% +9.4% 60573 interrupts.CPU108.CAL:Function_call_interrupts
56137 +9.8% 61648 interrupts.CPU131.CAL:Function_call_interrupts
55080 +10.7% 60959 interrupts.CPU133.CAL:Function_call_interrupts
55405 +10.3% 61104 interrupts.CPU134.CAL:Function_call_interrupts
55596 +10.0% 61169 interrupts.CPU137.CAL:Function_call_interrupts
55720 +9.7% 61135 interrupts.CPU138.CAL:Function_call_interrupts
55560 +10.6% 61429 ? 2% interrupts.CPU141.CAL:Function_call_interrupts
54288 +9.2% 59296 ? 2% interrupts.CPU143.CAL:Function_call_interrupts
3006 ? 5% +13.0% 3398 ? 8% interrupts.CPU46.RES:Rescheduling_interrupts
3247 ? 7% +17.0% 3799 ? 9% interrupts.CPU47.RES:Rescheduling_interrupts
2856 ? 6% +10.5% 3155 ? 4% interrupts.CPU5.RES:Rescheduling_interrupts
55880 +9.9% 61426 interrupts.CPU57.CAL:Function_call_interrupts
55280 +10.2% 60913 interrupts.CPU62.CAL:Function_call_interrupts
55713 +9.7% 61098 interrupts.CPU66.CAL:Function_call_interrupts
55510 +10.2% 61168 interrupts.CPU67.CAL:Function_call_interrupts
56179 +9.7% 61639 interrupts.CPU68.CAL:Function_call_interrupts
2865 ? 5% +12.1% 3211 ? 5% interrupts.CPU7.RES:Rescheduling_interrupts
55420 +9.6% 60762 interrupts.CPU71.CAL:Function_call_interrupts
2885 ? 4% +9.7% 3165 ? 4% interrupts.CPU72.RES:Rescheduling_interrupts
2914 ? 5% +20.3% 3505 ? 20% interrupts.CPU79.RES:Rescheduling_interrupts
2930 ? 3% +14.9% 3366 ? 15% interrupts.CPU83.RES:Rescheduling_interrupts
3083 ? 5% +16.3% 3586 ? 15% interrupts.CPU84.RES:Rescheduling_interrupts
3272 ? 20% +25.2% 4098 ? 13% interrupts.CPU87.RES:Rescheduling_interrupts
3263 ? 21% +25.1% 4083 ? 11% interrupts.CPU89.RES:Rescheduling_interrupts
2830 ? 5% +11.2% 3146 ? 4% interrupts.CPU9.RES:Rescheduling_interrupts
3.484e+09 -2.3% 3.405e+09 perf-stat.i.branch-instructions
1.946e+08 -3.4% 1.881e+08 perf-stat.i.cache-references
1185508 -3.2% 1147741 perf-stat.i.context-switches
12.98 +2.8% 13.35 perf-stat.i.cpi
142012 -6.4% 132897 perf-stat.i.cpu-migrations
2588 +4.4% 2702 perf-stat.i.cycles-between-cache-misses
4.305e+09 -2.4% 4.203e+09 perf-stat.i.dTLB-loads
1.901e+09 -3.5% 1.835e+09 perf-stat.i.dTLB-stores
1.613e+10 -2.4% 1.575e+10 perf-stat.i.instructions
68.61 -2.5% 66.87 perf-stat.i.metric.M/sec
32587633 -3.4% 31495220 perf-stat.i.node-load-misses
15327679 -3.6% 14770505 perf-stat.i.node-store-misses
1269337 -4.6% 1210407 perf-stat.i.node-stores
12.98 +2.9% 13.36 perf-stat.overall.cpi
2636 +4.3% 2751 perf-stat.overall.cycles-between-cache-misses
0.08 -2.8% 0.07 perf-stat.overall.ipc
3.467e+09 -2.3% 3.389e+09 perf-stat.ps.branch-instructions
1.936e+08 -3.3% 1.871e+08 perf-stat.ps.cache-references
1179250 -3.2% 1141949 perf-stat.ps.context-switches
141265 -6.4% 132221 perf-stat.ps.cpu-migrations
4.283e+09 -2.4% 4.182e+09 perf-stat.ps.dTLB-loads
1.892e+09 -3.5% 1.826e+09 perf-stat.ps.dTLB-stores
1.605e+10 -2.3% 1.568e+10 perf-stat.ps.instructions
32418796 -3.3% 31338247 perf-stat.ps.node-load-misses
15248979 -3.6% 14697229 perf-stat.ps.node-store-misses
1263227 -4.6% 1204731 perf-stat.ps.node-stores
2.953e+12 +1.7% 3.004e+12 perf-stat.total.instructions
27.41 -0.5 26.95 perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
27.21 -0.4 26.77 perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
27.21 -0.4 26.76 perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
27.21 -0.4 26.77 perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
25.95 -0.4 25.52 perf-profile.calltrace.cycles-pp.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
25.92 -0.4 25.49 perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary
0.57 -0.0 0.55 ? 2% perf-profile.calltrace.cycles-pp.complete.process_one_work.worker_thread.kthread.ret_from_fork
0.56 -0.0 0.54 perf-profile.calltrace.cycles-pp.swake_up_locked.complete.process_one_work.worker_thread.kthread
0.75 ? 47% +1.7 2.48 ? 13% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.__xfs_log_force_lsn.xfs_file_fsync.xfs_file_buffered_write
0.76 ? 47% +1.7 2.48 ? 13% perf-profile.calltrace.cycles-pp._raw_spin_lock.__xfs_log_force_lsn.xfs_file_fsync.xfs_file_buffered_write.new_sync_write
1.05 ? 46% +2.0 3.00 ? 13% perf-profile.calltrace.cycles-pp.__xfs_log_force_lsn.xfs_file_fsync.xfs_file_buffered_write.new_sync_write.vfs_write
27.41 -0.5 26.95 perf-profile.children.cycles-pp.secondary_startup_64_no_verify
27.41 -0.5 26.95 perf-profile.children.cycles-pp.cpu_startup_entry
27.41 -0.5 26.95 perf-profile.children.cycles-pp.do_idle
26.15 -0.5 25.70 perf-profile.children.cycles-pp.cpuidle_enter
26.14 -0.5 25.69 perf-profile.children.cycles-pp.cpuidle_enter_state
27.21 -0.4 26.77 perf-profile.children.cycles-pp.start_secondary
23.16 -0.4 22.73 perf-profile.children.cycles-pp._raw_spin_lock_irqsave
20.99 -0.4 20.64 perf-profile.children.cycles-pp.remove_wait_queue
0.53 ? 3% -0.0 0.48 perf-profile.children.cycles-pp.iomap_file_buffered_write
0.52 ? 4% -0.0 0.48 ? 2% perf-profile.children.cycles-pp.iomap_apply
0.45 ? 3% -0.0 0.41 perf-profile.children.cycles-pp.iomap_write_actor
0.56 -0.0 0.54 ? 2% perf-profile.children.cycles-pp.swake_up_locked
1.14 -0.0 1.11 perf-profile.children.cycles-pp.__schedule
0.58 -0.0 0.55 perf-profile.children.cycles-pp.complete
0.51 -0.0 0.49 perf-profile.children.cycles-pp.wait_for_completion
0.41 -0.0 0.39 ? 2% perf-profile.children.cycles-pp.schedule_timeout
0.33 -0.0 0.31 perf-profile.children.cycles-pp.dequeue_entity
64.75 +0.6 65.40 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
39.65 +1.1 40.72 perf-profile.children.cycles-pp._raw_spin_lock
21.05 ? 19% +4.4 25.44 ? 4% perf-profile.children.cycles-pp.__xfs_log_force_lsn
0.52 -0.0 0.50 perf-profile.self.cycles-pp._raw_spin_lock
64.40 +0.6 65.02 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath



aim7.jobs-per-min

20200 +-------------------------------------------------------------------+
| |
20000 |-+ + +. +. .+. .+ |
19800 |-+.+ .+ +. :+ + +. .+ + +.+ +.+. + .|
|.+ + +.+ : : + + + + + + : + |
19600 |-+ + : : + + + .+ : + : |
| + +.+.++ + + + : |
19400 |-+ + |
| |
19200 |-+ O O O |
19000 |-+ O O O O O |
| O O O O O O O O |
18800 |-+ O O O O O |
| O O O O |
18600 +-------------------------------------------------------------------+




1.9 +---------------------------------------------------------------------+
| +.+.+. + + .+. .+ + .+ |
1.8 |:+ + .+.+. + + .+ .+.+. .+. .+. + :.+.+.+ + + + :.+ + +|
1.7 |:+ + +.+.+ + + + +.+ + + + + + |
| |
1.6 |-+ |
1.5 |-+ |
| |
1.4 |-+ |
1.3 |-+ O O |
| O O O O O O O OO O |
1.2 |-O O O O O O O O O O |
1.1 |-+ O |
| O O |
1 +---------------------------------------------------------------------+




1.9 +---------------------------------------------------------------------+
| +.+ + +. + .+.+. .+. + : + +.+. .|
1.8 |++ +.+.+.+.+. + +.++.+.+. .+. .+. + + + + : + :+ + |
1.7 |-+ + + +.+ + + + |
| |
1.6 |-+ |
| |
1.5 |-+ |
| |
1.4 |-+ |
1.3 |-+ O O O |
| O O O O O O O OO O |
1.2 |-+ O O O O O O O O O |
| O |
1.1 +---------------------------------------------------------------------+




1.6 +---------------------------------------------------------------------+
| |
1.4 |-+ O O O O OO O O O |
| O O O O O O O O O O O O |
1.2 |-+ O O O |
| |
1 |-+ O |
| |
0.8 |-+ |
| |
0.6 |-+ |
| |
0.4 |.+.+.+.++.+.+.+.+.+.+.+.++.+.+.+.+.+.+.+.+.++.+.+.+.+.+.+.+.++.+.+.+.|
| |
0.2 +---------------------------------------------------------------------+




1.8 +---------------------------------------------------------------------+
| |
1.6 |-+ O O O O |
| O O OO O O O O O O O O O O O O |
1.4 |-+ O O O |
| O |
1.2 |-+ O |
| |
1 |-+ |
| |
0.8 |-+ |
| |
0.6 |-+ |
|. .+.+.++. .+.+.+. .+. +. .+. .+.+. .+.+.++.+.+.+.+.+.+. +.+.+. |
0.4 +---------------------------------------------------------------------+


[*] bisect-good sample
[O] bisect-bad sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/[email protected] Intel Corporation

Thanks,
Oliver Sang


Attachments:
(No filename) (23.08 kB)
config-5.12.0-rc2-00061-g0c2de3f054a5 (175.54 kB)
job-script (8.25 kB)
job.yaml (5.71 kB)
reproduce (748.00 B)
Download all attachments