Date: 2024-03-15 03:17:53
From: Oliver Sang

Subject: [linus:master] [af_unix] d9f21b3613: stress-ng.sockfd.ops_per_sec 9.1% improvement



Hello,

kernel test robot noticed a 9.1% improvement of stress-ng.sockfd.ops_per_sec on:


commit: d9f21b3613337b55cc9d4a6ead484dca68475143 ("af_unix: Try to run GC async.")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: stress-ng
test machine: 224 threads, 2 sockets, Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory
parameters:

nr_threads: 100%
testtime: 60s
test: sockfd
cpufreq_governor: performance
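
To approximate this job locally (the exact job file and kernel config are in
the reproduce materials linked below), something along these lines should
match the parameters above, assuming a reasonably recent stress-ng:

        stress-ng --sockfd 0 --timeout 60s --metrics-brief

Here --sockfd 0 starts one sockfd worker per online CPU (nr_threads: 100%),
and --metrics-brief prints the bogo-ops totals and ops/s that the
stress-ng.sockfd.* metrics below are derived from.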






Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240315/[email protected]

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-spr-r02/sockfd/stress-ng/60s

commit:
8b90a9f819 ("af_unix: Run GC on only one CPU.")
d9f21b3613 ("af_unix: Try to run GC async.")

8b90a9f819dc2a06 d9f21b3613337b55cc9d4a6ead4
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
     25305 ±  4%      +9.7%      27753 ±  2%  perf-c2c.HITM.total
     64392            +1.8%      65544        vmstat.system.cs
   1926720            +1.4%    1954260        proc-vmstat.numa_hit
   1694682            +1.5%    1719926        proc-vmstat.numa_local
   3151070            +3.4%    3257664        proc-vmstat.pgalloc_normal
      0.28 ±  8%     -15.0%       0.24 ±  9%  sched_debug.cfs_rq:/.h_nr_running.stddev
    259.21 ±  7%     -12.9%     225.86 ±  6%  sched_debug.cfs_rq:/.runnable_avg.stddev
     23.78 ± 13%     -20.9%      18.80 ± 27%  sched_debug.cpu.clock.stddev
  50265901            +9.1%   54861338        stress-ng.sockfd.ops
    837446            +9.1%     913917        stress-ng.sockfd.ops_per_sec
   2293458            -2.8%    2230066        stress-ng.time.involuntary_context_switches
   1581490            +8.1%    1709261        stress-ng.time.voluntary_context_switches
  26480342            +4.2%   27595498        perf-stat.i.cache-misses
  90320805            +3.9%   93807170        perf-stat.i.cache-references
      9.86            -1.7%       9.70        perf-stat.i.cpi
     25274            -5.1%      23975        perf-stat.i.cycles-between-cache-misses
 6.498e+10            +1.1%  6.571e+10        perf-stat.i.instructions
      0.11            +1.7%       0.11        perf-stat.i.ipc
     10.00            -1.7%       9.83        perf-stat.overall.cpi
     24733            -4.7%      23575        perf-stat.overall.cycles-between-cache-misses
      0.10            +1.7%       0.10        perf-stat.overall.ipc
 1.438e+10            +1.3%  1.458e+10        perf-stat.ps.branch-instructions
  24920120            +4.9%   26142747        perf-stat.ps.cache-misses
  86987270            +4.5%   90934893        perf-stat.ps.cache-references
 6.162e+10            +1.7%  6.268e+10        perf-stat.ps.instructions
 3.698e+12            +2.2%  3.781e+12        perf-stat.total.instructions
     66.00 ± 70%     -49.5       16.45 ±223%  perf-profile.calltrace.cycles-pp.stress_sockfd
     33.12 ± 70%     -24.9        8.24 ±223%  perf-profile.calltrace.cycles-pp.sendmsg.stress_sockfd
     33.08 ± 70%     -24.9        8.23 ±223%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.sendmsg.stress_sockfd
     33.08 ± 70%     -24.9        8.23 ±223%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.sendmsg.stress_sockfd
     33.05 ± 70%     -24.8        8.22 ±223%  perf-profile.calltrace.cycles-pp.__sys_sendmsg.do_syscall_64.entry_SYSCALL_64_after_hwframe.sendmsg.stress_sockfd
     33.04 ± 70%     -24.8        8.22 ±223%  perf-profile.calltrace.cycles-pp.___sys_sendmsg.__sys_sendmsg.do_syscall_64.entry_SYSCALL_64_after_hwframe.sendmsg
     32.99 ± 70%     -24.8        8.20 ±223%  perf-profile.calltrace.cycles-pp.____sys_sendmsg.___sys_sendmsg.__sys_sendmsg.do_syscall_64.entry_SYSCALL_64_after_hwframe
     32.95 ± 70%     -24.8        8.19 ±223%  perf-profile.calltrace.cycles-pp.unix_stream_sendmsg.____sys_sendmsg.___sys_sendmsg.__sys_sendmsg.do_syscall_64
     32.67 ± 70%     -24.5        8.16 ±223%  perf-profile.calltrace.cycles-pp.recvmsg.stress_sockfd
     32.65 ± 70%     -24.5        8.15 ±223%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.recvmsg.stress_sockfd
     32.65 ± 70%     -24.5        8.15 ±223%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.recvmsg.stress_sockfd
     32.64 ± 70%     -24.5        8.14 ±223%  perf-profile.calltrace.cycles-pp.__sys_recvmsg.do_syscall_64.entry_SYSCALL_64_after_hwframe.recvmsg.stress_sockfd
     32.63 ± 70%     -24.5        8.14 ±223%  perf-profile.calltrace.cycles-pp.___sys_recvmsg.__sys_recvmsg.do_syscall_64.entry_SYSCALL_64_after_hwframe.recvmsg
     32.60 ± 70%     -24.5        8.14 ±223%  perf-profile.calltrace.cycles-pp.____sys_recvmsg.___sys_recvmsg.__sys_recvmsg.do_syscall_64.entry_SYSCALL_64_after_hwframe
     32.60 ± 70%     -24.5        8.14 ±223%  perf-profile.calltrace.cycles-pp.sock_recvmsg.____sys_recvmsg.___sys_recvmsg.__sys_recvmsg.do_syscall_64
     32.59 ± 70%     -24.5        8.13 ±223%  perf-profile.calltrace.cycles-pp.unix_stream_recvmsg.sock_recvmsg.____sys_recvmsg.___sys_recvmsg.__sys_recvmsg
     32.58 ± 70%     -24.5        8.13 ±223%  perf-profile.calltrace.cycles-pp.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg.____sys_recvmsg.___sys_recvmsg
     32.51 ± 70%     -24.4        8.10 ±223%  perf-profile.calltrace.cycles-pp.unix_scm_to_skb.unix_stream_sendmsg.____sys_sendmsg.___sys_sendmsg.__sys_sendmsg
     32.51 ± 70%     -24.4        8.10 ±223%  perf-profile.calltrace.cycles-pp.unix_attach_fds.unix_scm_to_skb.unix_stream_sendmsg.____sys_sendmsg.___sys_sendmsg
     32.44 ± 70%     -24.4        8.07 ±223%  perf-profile.calltrace.cycles-pp.unix_inflight.unix_attach_fds.unix_scm_to_skb.unix_stream_sendmsg.____sys_sendmsg
     32.43 ± 70%     -24.4        8.07 ±223%  perf-profile.calltrace.cycles-pp._raw_spin_lock.unix_inflight.unix_attach_fds.unix_scm_to_skb.unix_stream_sendmsg
     32.37 ± 70%     -24.3        8.06 ±223%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.unix_inflight.unix_attach_fds.unix_scm_to_skb
     32.31 ± 70%     -24.2        8.06 ±223%  perf-profile.calltrace.cycles-pp.unix_detach_fds.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg.____sys_recvmsg
     32.30 ± 70%     -24.2        8.06 ±223%  perf-profile.calltrace.cycles-pp.unix_notinflight.unix_detach_fds.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg
     32.30 ± 70%     -24.2        8.06 ±223%  perf-profile.calltrace.cycles-pp._raw_spin_lock.unix_notinflight.unix_detach_fds.unix_stream_read_generic.unix_stream_recvmsg
     32.23 ± 70%     -24.2        8.04 ±223%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.unix_notinflight.unix_detach_fds.unix_stream_read_generic
     66.37 ± 70%     -49.8       16.57 ±223%  perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     66.36 ± 70%     -49.8       16.56 ±223%  perf-profile.children.cycles-pp.do_syscall_64
     66.00 ± 70%     -49.5       16.45 ±223%  perf-profile.children.cycles-pp.stress_sockfd
     64.86 ± 70%     -48.7       16.17 ±223%  perf-profile.children.cycles-pp._raw_spin_lock
     64.64 ± 70%     -48.5       16.11 ±223%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
     33.13 ± 70%     -24.9        8.24 ±223%  perf-profile.children.cycles-pp.sendmsg
     33.06 ± 70%     -24.8        8.22 ±223%  perf-profile.children.cycles-pp.__sys_sendmsg
     33.04 ± 70%     -24.8        8.22 ±223%  perf-profile.children.cycles-pp.___sys_sendmsg
     32.99 ± 70%     -24.8        8.20 ±223%  perf-profile.children.cycles-pp.____sys_sendmsg
     32.95 ± 70%     -24.8        8.19 ±223%  perf-profile.children.cycles-pp.unix_stream_sendmsg
     32.68 ± 70%     -24.5        8.16 ±223%  perf-profile.children.cycles-pp.recvmsg
     32.64 ± 70%     -24.5        8.15 ±223%  perf-profile.children.cycles-pp.__sys_recvmsg
     32.63 ± 70%     -24.5        8.14 ±223%  perf-profile.children.cycles-pp.___sys_recvmsg
     32.61 ± 70%     -24.5        8.14 ±223%  perf-profile.children.cycles-pp.____sys_recvmsg
     32.60 ± 70%     -24.5        8.14 ±223%  perf-profile.children.cycles-pp.sock_recvmsg
     32.59 ± 70%     -24.5        8.13 ±223%  perf-profile.children.cycles-pp.unix_stream_read_generic
     32.59 ± 70%     -24.5        8.13 ±223%  perf-profile.children.cycles-pp.unix_stream_recvmsg
     32.51 ± 70%     -24.4        8.10 ±223%  perf-profile.children.cycles-pp.unix_scm_to_skb
     32.51 ± 70%     -24.4        8.10 ±223%  perf-profile.children.cycles-pp.unix_attach_fds
     32.44 ± 70%     -24.4        8.07 ±223%  perf-profile.children.cycles-pp.unix_inflight
     32.31 ± 70%     -24.2        8.06 ±223%  perf-profile.children.cycles-pp.unix_detach_fds
     32.30 ± 70%     -24.2        8.06 ±223%  perf-profile.children.cycles-pp.unix_notinflight
     64.36 ± 70%     -48.3       16.04 ±223%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
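
The dominant frames above are the SCM_RIGHTS file-descriptor-passing path:
unix_attach_fds()/unix_inflight() on the send side and
unix_detach_fds()/unix_notinflight() on the receive side both serialize on
the af_unix GC spinlock, which is what the time in
native_queued_spin_lock_slowpath reflects. As a minimal illustrative sketch
(not the stress-ng source, just the userspace pattern that drives this
path), assuming a Linux/glibc environment:

/* One SCM_RIGHTS round trip over an AF_UNIX socket pair.  The sockfd
 * stressor runs this pattern in a tight loop in many workers at once,
 * so every sendmsg()/recvmsg() pair hits the GC lock twice.
 */
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/uio.h>

int main(void)
{
	int sv[2];
	int fd = open("/dev/null", O_RDONLY);	/* any fd works */
	char data = 'x';
	struct iovec iov = { .iov_base = &data, .iov_len = 1 };
	union {					/* aligned cmsg buffer */
		struct cmsghdr hdr;
		char buf[CMSG_SPACE(sizeof(int))];
	} u;
	struct msghdr msg = {
		.msg_iov = &iov, .msg_iovlen = 1,
		.msg_control = u.buf, .msg_controllen = sizeof(u.buf),
	};
	struct cmsghdr *cm;

	if (fd < 0 || socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
		perror("setup");
		return 1;
	}

	/* Send side: attaching the fd goes through unix_attach_fds()
	 * -> unix_inflight(), which takes the GC spinlock.
	 */
	cm = CMSG_FIRSTHDR(&msg);
	cm->cmsg_level = SOL_SOCKET;
	cm->cmsg_type = SCM_RIGHTS;
	cm->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cm), &fd, sizeof(int));
	if (sendmsg(sv[0], &msg, 0) < 0) {
		perror("sendmsg");
		return 1;
	}

	/* Receive side: detaching the fd goes through unix_detach_fds()
	 * -> unix_notinflight(), taking the same lock again.
	 */
	msg.msg_controllen = sizeof(u.buf);
	if (recvmsg(sv[1], &msg, 0) < 0) {
		perror("recvmsg");
		return 1;
	}
	cm = CMSG_FIRSTHDR(&msg);
	if (cm && cm->cmsg_level == SOL_SOCKET && cm->cmsg_type == SCM_RIGHTS) {
		int newfd;

		memcpy(&newfd, CMSG_DATA(cm), sizeof(int));
		close(newfd);
	}
	close(fd);
	close(sv[0]);
	close(sv[1]);
	return 0;
}

With one such loop per CPU, both sides contend on the same global lock, so
moving GC work out of this path, as the commit under test does, is
consistent with the ops_per_sec gain reported above.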




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki