Greetings,
FYI, we noticed a 4.5% improvement in stress-ng.clock.ops_per_sec due to commit:
commit: df29d3cd5ad4d400767caa199ec7c0ecbab10fc8 ("clocksource: Limit number of CPUs checked for clock synchronization")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
in testcase: stress-ng
on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 512G memory
with the following parameters:
nr_threads: 100%
disk: 1HDD
testtime: 60s
class: interrupt
test: clock
cpufreq_governor: performance
ucode: 0x5003006
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
bin/lkp run generated-yaml-file
=========================================================================================
class/compiler/cpufreq_governor/disk/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime/ucode:
interrupt/gcc-9/performance/1HDD/x86_64-rhel-8.3/100%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp7/clock/stress-ng/60s/0x5003006
commit:
b509a98006 ("clocksource: Check per-CPU clock synchronization when marked unstable")
df29d3cd5a ("clocksource: Limit number of CPUs checked for clock synchronization")
b509a9800648b24a df29d3cd5ad4d400767caa199ec
---------------- ---------------------------
%stddev %change %stddev
\ | \
114360 +4.5% 119554 stress-ng.clock.ops_per_sec
27349 ± 89% -90.9% 2477 ± 4% stress-ng.time.file_system_inputs
11510358 +4.6% 12034522 stress-ng.time.voluntary_context_switches
1.807e+08 +2.4% 1.851e+08 interrupts.CAL:Function_call_interrupts
0.00 ± 53% -0.0 0.00 ± 46% mpstat.cpu.all.iowait%
2024 ± 7% -32.1% 1375 ± 7% slabinfo.dmaengine-unmap-16.active_objs
2024 ± 7% -32.1% 1375 ± 7% slabinfo.dmaengine-unmap-16.num_objs
43640 ± 13% -23.3% 33491 ± 7% meminfo.Active
10027 ± 82% -96.9% 306.00 ± 29% meminfo.Active(file)
752.67 ± 98% -98.7% 9.67 ± 30% meminfo.Buffers
2506 ± 82% -97.0% 76.33 ± 29% proc-vmstat.nr_active_file
2506 ± 82% -97.0% 76.33 ± 29% proc-vmstat.nr_zone_active_file
14176 ± 82% -90.7% 1323 ± 13% proc-vmstat.pgpgin
217.33 ± 82% -90.9% 19.83 ± 14% vmstat.io.bi
753.17 ± 98% -98.7% 9.67 ± 30% vmstat.memory.buff
356471 +4.0% 370646 vmstat.system.cs
1.02 ± 30% +128.4% 2.34 ± 28% perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
3.74 ± 33% +201.5% 11.28 ± 26% perf-sched.wait_and_delay.max.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
1.02 ± 30% +128.8% 2.33 ± 28% perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
3.74 ± 33% +201.1% 11.25 ± 26% perf-sched.wait_time.max.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
6.048e+09 +2.2% 6.18e+09 perf-stat.i.branch-instructions
69934017 +2.7% 71856066 perf-stat.i.branch-misses
26.91 -0.6 26.30 perf-stat.i.cache-miss-rate%
53135032 -2.4% 51884388 perf-stat.i.cache-misses
366423 +4.5% 382812 perf-stat.i.context-switches
9.45 -2.6% 9.20 perf-stat.i.cpi
4686 +2.5% 4804 perf-stat.i.cycles-between-cache-misses
6.913e+09 +2.9% 7.11e+09 perf-stat.i.dTLB-loads
2.371e+09 +3.8% 2.461e+09 perf-stat.i.dTLB-stores
2.612e+10 +2.7% 2.683e+10 perf-stat.i.instructions
161.64 +2.7% 166.00 perf-stat.i.metric.M/sec
22851223 +3.2% 23572316 perf-stat.i.node-load-misses
9794183 +3.9% 10176496 perf-stat.i.node-store-misses
7.45 -2.7% 7.24 perf-stat.overall.MPKI
27.37 -0.6 26.74 perf-stat.overall.cache-miss-rate%
9.74 -2.7% 9.48 perf-stat.overall.cpi
1240 ± 3% +5.9% 1313 ± 2% perf-stat.overall.instructions-per-iTLB-miss
0.10 +2.7% 0.11 perf-stat.overall.ipc
5.959e+09 +2.2% 6.088e+09 perf-stat.ps.branch-instructions
68700080 +2.8% 70597822 perf-stat.ps.branch-misses
52442864 -2.4% 51206685 perf-stat.ps.cache-misses
361424 +4.5% 377658 perf-stat.ps.context-switches
6.812e+09 +2.9% 7.007e+09 perf-stat.ps.dTLB-loads
2.337e+09 +3.8% 2.426e+09 perf-stat.ps.dTLB-stores
2.573e+10 +2.7% 2.643e+10 perf-stat.ps.instructions
22537158 +3.2% 23252556 perf-stat.ps.node-load-misses
9660122 +3.9% 10039431 perf-stat.ps.node-store-misses
1.628e+12 +2.8% 1.673e+12 perf-stat.total.instructions
29.42 -0.9 28.53 perf-profile.calltrace.cycles-pp.release_posix_timer.__x64_sys_timer_delete.do_syscall_64.entry_SYSCALL_64_after_hwframe
29.55 -0.9 28.68 perf-profile.calltrace.cycles-pp.__x64_sys_timer_delete.do_syscall_64.entry_SYSCALL_64_after_hwframe
10.29 -0.5 9.83 ± 2% perf-profile.calltrace.cycles-pp._raw_spin_unlock_irqrestore.release_posix_timer.__x64_sys_timer_delete.do_syscall_64.entry_SYSCALL_64_after_hwframe
20.82 -0.4 20.38 perf-profile.calltrace.cycles-pp._raw_spin_lock.do_timer_create.__x64_sys_timer_create.do_syscall_64.entry_SYSCALL_64_after_hwframe
18.82 -0.4 18.39 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.release_posix_timer.__x64_sys_timer_delete.do_syscall_64.entry_SYSCALL_64_after_hwframe
21.26 -0.4 20.84 perf-profile.calltrace.cycles-pp.do_timer_create.__x64_sys_timer_create.do_syscall_64.entry_SYSCALL_64_after_hwframe
21.29 -0.4 20.87 perf-profile.calltrace.cycles-pp.__x64_sys_timer_create.do_syscall_64.entry_SYSCALL_64_after_hwframe
18.73 -0.4 18.31 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.release_posix_timer.__x64_sys_timer_delete.do_syscall_64
20.59 -0.4 20.18 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.do_timer_create.__x64_sys_timer_create.do_syscall_64
9.40 -0.4 9.02 perf-profile.calltrace.cycles-pp.sysvec_call_function.asm_sysvec_call_function._raw_spin_unlock_irqrestore.release_posix_timer.__x64_sys_timer_delete
9.56 -0.4 9.18 perf-profile.calltrace.cycles-pp.asm_sysvec_call_function._raw_spin_unlock_irqrestore.release_posix_timer.__x64_sys_timer_delete.do_syscall_64
9.37 -0.4 8.98 perf-profile.calltrace.cycles-pp.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function._raw_spin_unlock_irqrestore.release_posix_timer
11.25 -0.3 10.94 perf-profile.calltrace.cycles-pp.flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function._raw_spin_unlock_irqrestore
1.33 ± 2% -0.1 1.24 perf-profile.calltrace.cycles-pp.ktime_get_real_ts64.posix_get_realtime_timespec.__x64_sys_clock_gettime.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.19 -0.1 1.11 perf-profile.calltrace.cycles-pp.ktime_get_ts64.posix_get_monotonic_timespec.__x64_sys_clock_gettime.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.21 -0.1 1.13 perf-profile.calltrace.cycles-pp.posix_get_monotonic_timespec.__x64_sys_clock_gettime.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.86 -0.0 0.82 perf-profile.calltrace.cycles-pp.ktime_get_with_offset.posix_get_boottime_timespec.__x64_sys_clock_gettime.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.88 -0.0 0.84 perf-profile.calltrace.cycles-pp.posix_get_boottime_timespec.__x64_sys_clock_gettime.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.71 -0.0 0.67 ± 2% perf-profile.calltrace.cycles-pp.asm_sysvec_call_function.ktime_get_with_offset.posix_get_tai_timespec.__x64_sys_clock_gettime.do_syscall_64
0.69 ± 2% -0.0 0.65 perf-profile.calltrace.cycles-pp.flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.ktime_get_with_offset
0.69 -0.0 0.66 ± 2% perf-profile.calltrace.cycles-pp.sysvec_call_function.asm_sysvec_call_function.ktime_get_with_offset.posix_get_tai_timespec.__x64_sys_clock_gettime
0.63 +0.0 0.66 perf-profile.calltrace.cycles-pp.common_timer_get.do_timer_gettime.__x64_sys_timer_gettime.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.95 +0.0 0.99 perf-profile.calltrace.cycles-pp.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.llist_add_batch.smp_call_function_many_cond
0.94 +0.0 0.99 perf-profile.calltrace.cycles-pp.flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.llist_add_batch
0.97 +0.0 1.02 perf-profile.calltrace.cycles-pp.asm_sysvec_call_function.llist_add_batch.smp_call_function_many_cond.on_each_cpu_cond_mask.clock_was_set
0.95 +0.0 1.00 perf-profile.calltrace.cycles-pp.sysvec_call_function.asm_sysvec_call_function.llist_add_batch.smp_call_function_many_cond.on_each_cpu_cond_mask
1.14 +0.1 1.19 perf-profile.calltrace.cycles-pp.default_send_IPI_mask_sequence_phys.smp_call_function_many_cond.on_each_cpu_cond_mask.clock_was_set.timekeeping_inject_offset
1.19 +0.1 1.24 perf-profile.calltrace.cycles-pp.do_timer_gettime.__x64_sys_timer_gettime.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.23 +0.1 1.29 perf-profile.calltrace.cycles-pp.__x64_sys_timer_gettime.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.74 +0.1 2.81 perf-profile.calltrace.cycles-pp.ktime_get.clockevents_program_event.retrigger_next_event.flush_smp_call_function_queue.__sysvec_call_function
0.62 ± 3% +0.1 0.69 ± 4% perf-profile.calltrace.cycles-pp.do_open.path_openat.do_filp_open.do_sys_openat2.do_sys_open
0.85 ± 2% +0.1 0.94 ± 3% perf-profile.calltrace.cycles-pp.do_filp_open.do_sys_openat2.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.85 ± 2% +0.1 0.94 ± 3% perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_sys_openat2.do_sys_open.do_syscall_64
0.90 ± 3% +0.1 1.00 ± 3% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.do_adjtimex.__do_sys_clock_adjtime.do_syscall_64
0.94 ± 2% +0.1 1.03 ± 3% perf-profile.calltrace.cycles-pp.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.92 ± 2% +0.1 1.02 ± 3% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.do_adjtimex.__do_sys_clock_adjtime.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.94 ± 2% +0.1 1.03 ± 3% perf-profile.calltrace.cycles-pp.do_sys_openat2.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.49 +0.1 2.58 perf-profile.calltrace.cycles-pp.llist_add_batch.smp_call_function_many_cond.on_each_cpu_cond_mask.clock_was_set.timekeeping_inject_offset
19.32 +0.2 19.54 perf-profile.calltrace.cycles-pp.timekeeping_inject_offset.do_adjtimex.__do_sys_clock_adjtime.do_syscall_64.entry_SYSCALL_64_after_hwframe
21.84 +0.3 22.18 perf-profile.calltrace.cycles-pp.do_adjtimex.__do_sys_clock_adjtime.do_syscall_64.entry_SYSCALL_64_after_hwframe
22.02 +0.3 22.36 perf-profile.calltrace.cycles-pp.__do_sys_clock_adjtime.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.69 ± 3% +0.4 3.14 ± 4% perf-profile.calltrace.cycles-pp.flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.osq_lock
2.71 ± 3% +0.4 3.15 ± 4% perf-profile.calltrace.cycles-pp.sysvec_call_function.asm_sysvec_call_function.osq_lock.__mutex_lock.i40e_ptp_gettimex
2.69 ± 3% +0.4 3.14 ± 4% perf-profile.calltrace.cycles-pp.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.osq_lock.__mutex_lock
2.75 ± 3% +0.5 3.20 ± 4% perf-profile.calltrace.cycles-pp.asm_sysvec_call_function.osq_lock.__mutex_lock.i40e_ptp_gettimex.pc_clock_gettime
3.47 +0.5 3.92 ± 6% perf-profile.calltrace.cycles-pp.clockevents_program_event.retrigger_next_event.flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function
0.09 ±223% +0.5 0.55 ± 5% perf-profile.calltrace.cycles-pp.do_dentry_open.do_open.path_openat.do_filp_open.do_sys_openat2
14.21 +0.7 14.88 ± 2% perf-profile.calltrace.cycles-pp.__x64_sys_clock_gettime.do_syscall_64.entry_SYSCALL_64_after_hwframe
5.68 ± 2% +0.9 6.60 ± 3% perf-profile.calltrace.cycles-pp.osq_lock.__mutex_lock.i40e_ptp_gettimex.pc_clock_gettime.__x64_sys_clock_gettime
6.09 ± 2% +1.0 7.05 ± 3% perf-profile.calltrace.cycles-pp.__mutex_lock.i40e_ptp_gettimex.pc_clock_gettime.__x64_sys_clock_gettime.do_syscall_64
6.52 ± 2% +1.0 7.50 ± 3% perf-profile.calltrace.cycles-pp.i40e_ptp_gettimex.pc_clock_gettime.__x64_sys_clock_gettime.do_syscall_64.entry_SYSCALL_64_after_hwframe
6.59 ± 2% +1.0 7.57 ± 3% perf-profile.calltrace.cycles-pp.pc_clock_gettime.__x64_sys_clock_gettime.do_syscall_64.entry_SYSCALL_64_after_hwframe
29.42 -0.9 28.53 perf-profile.children.cycles-pp.release_posix_timer
29.55 -0.9 28.68 perf-profile.children.cycles-pp.__x64_sys_timer_delete
41.94 -0.6 41.29 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
21.27 -0.4 20.84 perf-profile.children.cycles-pp.do_timer_create
21.29 -0.4 20.87 perf-profile.children.cycles-pp.__x64_sys_timer_create
21.39 -0.4 20.99 perf-profile.children.cycles-pp._raw_spin_lock
15.00 -0.4 14.64 perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
2.38 -0.1 2.27 perf-profile.children.cycles-pp.ktime_get_with_offset
1.90 ± 2% -0.1 1.79 perf-profile.children.cycles-pp.ktime_get_real_ts64
1.21 -0.1 1.13 perf-profile.children.cycles-pp.posix_get_monotonic_timespec
1.20 -0.1 1.12 perf-profile.children.cycles-pp.ktime_get_ts64
0.88 -0.0 0.84 perf-profile.children.cycles-pp.posix_get_boottime_timespec
0.33 -0.0 0.31 perf-profile.children.cycles-pp.poll_idle
0.37 +0.0 0.39 perf-profile.children.cycles-pp.syscall_exit_to_user_mode
0.28 +0.0 0.29 perf-profile.children.cycles-pp.hrtimer_update_next_event
0.21 ± 2% +0.0 0.23 ± 3% perf-profile.children.cycles-pp.syscall_enter_from_user_mode
0.30 +0.0 0.33 perf-profile.children.cycles-pp._copy_to_user
0.17 ± 3% +0.0 0.19 perf-profile.children.cycles-pp.__lock_timer
0.31 ± 2% +0.0 0.33 ± 3% perf-profile.children.cycles-pp.mutex_spin_on_owner
0.63 +0.0 0.65 perf-profile.children.cycles-pp.__default_send_IPI_dest_field
0.04 ± 45% +0.0 0.07 ± 10% perf-profile.children.cycles-pp.__legitimize_path
0.63 +0.0 0.66 perf-profile.children.cycles-pp.common_timer_get
1.12 +0.0 1.17 perf-profile.children.cycles-pp.llist_reverse_order
1.36 +0.1 1.41 perf-profile.children.cycles-pp.lapic_next_deadline
1.16 +0.1 1.21 perf-profile.children.cycles-pp.default_send_IPI_mask_sequence_phys
0.49 ± 3% +0.1 0.55 ± 5% perf-profile.children.cycles-pp.do_dentry_open
1.19 +0.1 1.24 perf-profile.children.cycles-pp.do_timer_gettime
1.24 +0.1 1.29 perf-profile.children.cycles-pp.__x64_sys_timer_gettime
0.62 ± 3% +0.1 0.69 ± 4% perf-profile.children.cycles-pp.do_open
0.85 ± 2% +0.1 0.94 ± 3% perf-profile.children.cycles-pp.do_filp_open
0.85 ± 2% +0.1 0.94 ± 3% perf-profile.children.cycles-pp.path_openat
0.94 ± 2% +0.1 1.04 ± 3% perf-profile.children.cycles-pp.do_sys_open
0.94 ± 2% +0.1 1.03 ± 3% perf-profile.children.cycles-pp.do_sys_openat2
2.50 +0.1 2.61 perf-profile.children.cycles-pp.llist_add_batch
19.33 +0.2 19.54 perf-profile.children.cycles-pp.timekeeping_inject_offset
8.95 +0.2 9.17 perf-profile.children.cycles-pp.ktime_get
7.83 +0.2 8.08 perf-profile.children.cycles-pp.clockevents_program_event
21.85 +0.3 22.18 perf-profile.children.cycles-pp.do_adjtimex
22.02 +0.4 22.37 perf-profile.children.cycles-pp.__do_sys_clock_adjtime
14.21 +0.7 14.88 ± 2% perf-profile.children.cycles-pp.__x64_sys_clock_gettime
5.71 ± 2% +0.9 6.64 ± 3% perf-profile.children.cycles-pp.osq_lock
6.09 ± 2% +1.0 7.05 ± 3% perf-profile.children.cycles-pp.__mutex_lock
6.52 ± 2% +1.0 7.50 ± 3% perf-profile.children.cycles-pp.i40e_ptp_gettimex
6.59 ± 2% +1.0 7.57 ± 3% perf-profile.children.cycles-pp.pc_clock_gettime
33.43 -0.6 32.85 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
1.02 ± 2% -0.1 0.96 perf-profile.self.cycles-pp.ktime_get_real_ts64
0.65 -0.0 0.60 perf-profile.self.cycles-pp.ktime_get_ts64
0.62 -0.0 0.60 perf-profile.self.cycles-pp.ktime_get_coarse_real_ts64
0.63 +0.0 0.65 perf-profile.self.cycles-pp.__default_send_IPI_dest_field
0.59 +0.0 0.62 perf-profile.self.cycles-pp.flush_smp_call_function_queue
1.12 +0.0 1.17 perf-profile.self.cycles-pp.llist_reverse_order
1.36 +0.0 1.41 perf-profile.self.cycles-pp.lapic_next_deadline
1.49 +0.1 1.54 perf-profile.self.cycles-pp.llist_add_batch
8.74 +0.2 8.95 perf-profile.self.cycles-pp.ktime_get
2.84 ± 2% +0.5 3.32 ± 4% perf-profile.self.cycles-pp.osq_lock
[Plot: stress-ng.clock.ops_per_sec per sample, y-axis from 106000 to 126000. The [O] samples sit at roughly 119000 to 124000, above the other series at roughly 109000 to 114000.]
[*] bisect-good sample
[O] bisect-bad sample
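As a reading aid for the comparison table above, here is a minimal Python sketch of how the middle column appears to be derived. It assumes that count-like metrics report a relative change in percent, while metrics that are already percentages (cycles-pp, cache-miss-rate%) report a plain difference in percentage points. The function names are hypothetical and are not part of lkp-tests.

```python
def percent_change(base: float, new: float) -> float:
    """Relative change from the parent commit to the tested commit, in percent."""
    return (new - base) / base * 100.0


def point_delta(base: float, new: float) -> float:
    """Plain difference, assumed to apply to metrics that are already percentages."""
    return new - base


# stress-ng.clock.ops_per_sec: 114360 (b509a98006) vs 119554 (df29d3cd5a)
print(f"{percent_change(114360, 119554):+.1f}%")  # +4.5%

# perf-profile.children.cycles-pp.release_posix_timer: 29.42 vs 28.53
print(f"{point_delta(29.42, 28.53):+.1f}")  # -0.9
```

Both printed values match the corresponding rows of the table, which is consistent with the assumption but does not prove that the report generator uses exactly this formula.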
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
---
0DAY/LKP+ Test Infrastructure, Open Source Technology Center, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]
Thanks,
Oliver Sang