Greeting,
FYI, we noticed a -6.1% regression of netperf.Throughput_Mbps due to commit:
commit: a337531b942bd8a03e7052444d7e36972aac2d92 ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB")
https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git master
in testcase: netperf
on test machine: 16 threads Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 8G memory
with following parameters:
ip: ipv4
runtime: 900s
nr_threads: 200%
cluster: cs-localhost
test: TCP_STREAM
ucode: 0x7000013
cpufreq_governor: performance
test-description: Netperf is a benchmark that can be used to measure various aspects of networking performance.
test-url: http://www.netperf.org/netperf/
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase/ucode:
cs-localhost/gcc-7/performance/ipv4/x86_64-rhel-7.2/200%/debian-x86_64-2018-04-03.cgz/900s/lkp-bdw-de1/TCP_STREAM/netperf/0x7000013
commit:
3ff6cde846 ("hns3: Another build fix.")
a337531b94 ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB")
3ff6cde846857d45 a337531b942bd8a03e7052444d
---------------- --------------------------
fail:runs %reproduction fail:runs
:4 50% 2:4 dmesg.WARNING:at#for_ip_interrupt_entry/0x
%stddev %change %stddev
2497 -6.1% 2345 netperf.Throughput_Mbps
79924 -6.1% 75061 netperf.Throughput_total_Mbps
186513 +11.3% 207590 netperf.time.involuntary_context_switches
5.488e+08 -6.1% 5.154e+08 netperf.workload
1172 ± 34% -37.6% 731.75 ± 5% cpuidle.C1E.usage
1137 ± 34% -40.0% 682.25 ± 8% turbostat.C1E
2775 ± 11% +17.5% 3261 ± 9% sched_debug.cpu.nr_switches.stddev
0.01 ± 17% +28.2% 0.01 ± 10% sched_debug.rt_rq:/.rt_time.avg
0.14 ± 17% +28.2% 0.18 ± 10% sched_debug.rt_rq:/.rt_time.max
0.03 ± 17% +28.2% 0.04 ± 10% sched_debug.rt_rq:/.rt_time.stddev
66336 +0.9% 66948 proc-vmstat.nr_anon_pages
2.755e+08 -6.1% 2.588e+08 proc-vmstat.numa_hit
2.755e+08 -6.1% 2.588e+08 proc-vmstat.numa_local
2.197e+09 -6.1% 2.064e+09 proc-vmstat.pgalloc_normal
2.197e+09 -6.1% 2.064e+09 proc-vmstat.pgfree
5.903e+11 -7.9% 5.438e+11 perf-stat.branch-instructions
2.68 -0.0 2.64 perf-stat.branch-miss-rate%
1.582e+10 -9.2% 1.436e+10 perf-stat.branch-misses
6.26e+11 -4.7% 5.964e+11 perf-stat.cache-misses
6.26e+11 -4.7% 5.964e+11 perf-stat.cache-references
11.69 +8.6% 12.69 perf-stat.cpi
123723 +2.1% 126291 perf-stat.cpu-migrations
0.09 ± 2% +0.0 0.09 perf-stat.dTLB-load-miss-rate%
1.475e+12 -7.1% 1.37e+12 perf-stat.dTLB-loads
1.094e+12 -6.9% 1.018e+12 perf-stat.dTLB-stores
2.912e+08 ± 5% -13.0% 2.533e+08 perf-stat.iTLB-loads
3.019e+12 -7.9% 2.781e+12 perf-stat.instructions
0.09 -7.9% 0.08 perf-stat.ipc
5500 -1.9% 5394 perf-stat.path-length
0.53 ± 2% -0.2 0.38 ± 57% perf-profile.calltrace.cycles-pp.ip_output.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames
0.63 ± 2% -0.1 0.58 ± 4% perf-profile.calltrace.cycles-pp.syscall_return_via_sysret
0.73 ± 3% +0.1 0.78 ± 2% perf-profile.calltrace.cycles-pp.tcp_clean_rtx_queue.tcp_ack.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv
0.96 +0.1 1.03 perf-profile.calltrace.cycles-pp.tcp_ack.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv.ip_local_deliver_finish
98.02 +0.1 98.13 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
97.88 +0.1 98.00 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.70 ± 3% -0.1 0.64 ± 4% perf-profile.children.cycles-pp.syscall_return_via_sysret
0.26 ± 5% -0.0 0.21 ± 6% perf-profile.children.cycles-pp._raw_spin_lock_bh
0.28 ± 5% -0.0 0.24 ± 6% perf-profile.children.cycles-pp.lock_sock_nested
0.46 ± 4% -0.0 0.43 ± 2% perf-profile.children.cycles-pp.nf_hook_slow
0.21 ± 8% -0.0 0.18 ± 5% perf-profile.children.cycles-pp.tcp_rcv_space_adjust
0.08 ± 5% -0.0 0.06 perf-profile.children.cycles-pp.entry_SYSCALL_64_stage2
0.08 ± 6% -0.0 0.06 ± 6% perf-profile.children.cycles-pp.ip_finish_output
0.17 ± 6% +0.0 0.20 ± 5% perf-profile.children.cycles-pp.tcp_event_new_data_sent
0.24 ± 4% +0.0 0.27 ± 2% perf-profile.children.cycles-pp.mod_timer
0.15 ± 2% +0.0 0.18 ± 2% perf-profile.children.cycles-pp.__might_sleep
0.80 ± 3% +0.0 0.84 ± 2% perf-profile.children.cycles-pp.tcp_clean_rtx_queue
0.30 ± 3% +0.1 0.36 ± 4% perf-profile.children.cycles-pp.__might_fault
1.61 ± 4% +0.1 1.69 perf-profile.children.cycles-pp.__release_sock
1.06 ± 2% +0.1 1.14 perf-profile.children.cycles-pp.tcp_ack
98.24 +0.1 98.36 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
98.09 +0.1 98.23 perf-profile.children.cycles-pp.do_syscall_64
70.28 +0.6 70.86 perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
1.56 -0.1 1.48 ± 3% perf-profile.self.cycles-pp.copy_page_to_iter
0.70 ± 3% -0.1 0.64 ± 4% perf-profile.self.cycles-pp.syscall_return_via_sysret
1.37 ± 2% -0.1 1.32 ± 2% perf-profile.self.cycles-pp.__free_pages_ok
0.55 ± 3% -0.0 0.50 ± 3% perf-profile.self.cycles-pp.__alloc_skb
0.44 ± 3% -0.0 0.40 ± 5% perf-profile.self.cycles-pp.tcp_recvmsg
0.16 ± 9% -0.0 0.14 ± 5% perf-profile.self.cycles-pp.sock_has_perm
0.08 ± 6% -0.0 0.06 perf-profile.self.cycles-pp.entry_SYSCALL_64_stage2
0.10 ± 4% +0.0 0.12 ± 6% perf-profile.self.cycles-pp.tcp_clean_rtx_queue
0.14 ± 6% +0.0 0.17 ± 4% perf-profile.self.cycles-pp.__might_sleep
69.25 +0.5 69.77 perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
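The %change column above is consistent with a relative change computed against the base commit. The following is a minimal sketch under that assumption; the helper name `pct_change` is hypothetical and not part of lkp-tests, and the numbers are copied from the table rows.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


# Values copied from the comparison table: (base commit, patched commit).
rows = {
    "netperf.Throughput_Mbps": (2497, 2345),
    "netperf.Throughput_total_Mbps": (79924, 75061),
    "netperf.time.involuntary_context_switches": (186513, 207590),
}

for metric, (base, new) in rows.items():
    # Signed, one decimal place, matching the table's formatting.
    print(f"{pct_change(base, new):+.1f}%  {metric}")
```

Rounded to one decimal place, these three rows come out at -6.1%, -6.1% and +11.3%, matching the reported values.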
[ASCII plots omitted: netperf.Throughput_Mbps, netperf.Throughput_total_Mbps, netperf.workload]
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
On Tue, Oct 9, 2018 at 7:01 PM kernel test robot <[email protected]> wrote:
>
> Greeting,
>
> FYI, we noticed a -6.1% regression of netperf.Throughput_Mbps due to commit:
>
>
> commit: a337531b942bd8a03e7052444d7e36972aac2d92 ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB")
> https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git master
>
> in testcase: netperf
> on test machine: 16 threads Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 8G memory
> with following parameters:
>
> ip: ipv4
> runtime: 900s
> nr_threads: 200%
> cluster: cs-localhost
> test: TCP_STREAM
> ucode: 0x7000013
> cpufreq_governor: performance
>
> test-description: Netperf is a benchmark that can be used to measure various aspects of networking performance.
> test-url: http://www.netperf.org/netperf/
>
>
This should have been fixed by:
041a14d2671573611ffd6412bc16e2f64469f7fb ("tcp: start receiver buffer autotuning sooner")
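
The commit named in the report raises the initial TCP receive buffer to 128KB. One rough way to see which default a running kernel uses is to read the middle field of `net.ipv4.tcp_rmem` (min, default, max, in bytes). The following is a minimal sketch, assuming a Linux host that exposes `/proc/sys/net/ipv4/tcp_rmem`; the helper name `parse_tcp_rmem` is illustrative only.

```python
from pathlib import Path
from typing import Tuple

# Standard procfs location of the net.ipv4.tcp_rmem sysctl on Linux.
TCP_RMEM_PATH = Path("/proc/sys/net/ipv4/tcp_rmem")


def parse_tcp_rmem(text: str) -> Tuple[int, int, int]:
    """Split the sysctl value into (min, default, max) byte counts."""
    minimum, default, maximum = (int(field) for field in text.split())
    return minimum, default, maximum


if __name__ == "__main__":
    if TCP_RMEM_PATH.exists():
        _, default, _ = parse_tcp_rmem(TCP_RMEM_PATH.read_text())
        print(f"default rmem: {default} bytes ({default / 1024:.0f} KB)")
    else:
        print("tcp_rmem is not available on this system")
```

This only reports the configured default buffer size. It does not show whether the autotuning fix is present in the running kernel.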