Received: from mail-ig0-f181.google.com ([209.85.213.181]:63624
    "EHLO mail-ig0-f181.google.com" rhost-flags-OK-OK-OK-OK)
    by vger.kernel.org with ESMTP id S933853AbbBDN3w (ORCPT );
    Wed, 4 Feb 2015 08:29:52 -0500
Message-ID: <1423056591.907.130.camel@edumazet-glaptop2.roam.corp.google.com> (sfid-20150204_143000_468702_B2AAE062)
Subject: Re: Throughput regression with `tcp: refine TSO autosizing`
From: Eric Dumazet
To: Michal Kazior
Cc: Neal Cardwell, linux-wireless, Network Development, eyalpe@dev.mellanox.co.il
Date: Wed, 04 Feb 2015 05:29:51 -0800
In-Reply-To: <1423055810.907.125.camel@edumazet-glaptop2.roam.corp.google.com>
References: <1422537297.21689.15.camel@edumazet-glaptop2.roam.corp.google.com>
    <1422628835.21689.95.camel@edumazet-glaptop2.roam.corp.google.com>
    <1422903136.21689.114.camel@edumazet-glaptop2.roam.corp.google.com>
    <1422926330.21689.138.camel@edumazet-glaptop2.roam.corp.google.com>
    <1422973660.907.10.camel@edumazet-glaptop2.roam.corp.google.com>
    <1423051045.907.108.camel@edumazet-glaptop2.roam.corp.google.com>
    <1423053531.907.115.camel@edumazet-glaptop2.roam.corp.google.com>
    <1423055810.907.125.camel@edumazet-glaptop2.roam.corp.google.com>
Content-Type: text/plain; charset="UTF-8"
Mime-Version: 1.0
Sender: linux-wireless-owner@vger.kernel.org

On Wed, 2015-02-04 at 05:16 -0800, Eric Dumazet wrote:
> OK guys
>
> Using a mlx4 testbed I can reproduce the problem by pushing coalescing
> settings and disabling SG (thus disabling GSO)
>
> ethtool -K eth0 sg off
> Actual changes:
> scatter-gather: off
> tx-scatter-gather: off
> generic-segmentation-offload: off [requested on]
>
> ethtool -C eth0 tx-usecs 1024 tx-frames 64
>
> Meaning that NIC waits one ms before sending the TX IRQ,
> and can accumulate 64 frames before forcing the interrupt.
>
> We probably have a bug in cwnd expansion logic :
>
> lpaa23:~# DUMP_TCP_INFO=1 ./netperf -H 10.246.7.152 -Cc
> MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.246.7.152 () port 0 AF_INET
> rto=201000 ato=0 pmtu=1500 rcv_ssthresh=29200 rtt=230 rttvar=30 snd_ssthresh=41 cwnd=59 reordering=3 total_retrans=1 ca_state=0 pacing_rate=5943.1 Mbits
> Recv   Send    Send                          Utilization       Service Demand
> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> Size   Size    Size     Time     Throughput  local    remote   local   remote
> bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
>
>  87380  16384  16384    10.00       530.39   0.40     0.32     2.965   2.398
>
>
> -> final cwnd=59 which is not enough to avoid the 1ms delay between each
> burst.
>
> So sender sends ~60 packets, then has to wait 1ms (to get NIC TX IRQ)
> before sending the following burst.
>
> I am CCing Neal, he probably can help to root cause the problem.

Arg, this was with net-next, i.e. not including our recent stretch ack fixes.

Using David Miller's 'net' tree, cwnd seems OK.

Speed is low because the 64 queued frames exceed tcp_limit_output_bytes

lpaa23:~# cat /proc/sys/net/ipv4/tcp_limit_output_bytes
131072
lpaa23:~# DUMP_TCP_INFO=1 ./netperf -H 10.246.7.152 -Cc
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.246.7.152 () port 0 AF_INET
rto=201000 ato=0 pmtu=1500 rcv_ssthresh=29200 rtt=166 rttvar=16 snd_ssthresh=26 cwnd=59 reordering=3 total_retrans=0 ca_state=0 pacing_rate=8203.52 Mbits
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  16384  16384    10.00       569.96   0.52     0.38     3.588   2.625

lpaa23:~# echo 262144 >/proc/sys/net/ipv4/tcp_limit_output_bytes
lpaa23:~# DUMP_TCP_INFO=1 ./netperf -H 10.246.7.152 -Cc
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.246.7.152 () port 0 AF_INET
rto=201000 ato=0 pmtu=1500 rcv_ssthresh=29200 rtt=98 rttvar=18 snd_ssthresh=312 cwnd=313 reordering=3 total_retrans=23 ca_state=0
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  16384  16384    10.00      8518.40   2.60     1.57     1.200   0.727
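
A rough back-of-the-envelope check of the numbers above, as a sketch only. It assumes the limit is charged per queued frame at the skb's memory footprint rather than its payload, and it uses a hypothetical footprint of 2304 bytes per 1500-byte frame with SG off. That figure and the helper name frames_under_limit are illustrative assumptions, not values measured on the testbed.

```shell
#!/bin/sh
# Estimate how many full-size frames fit under tcp_limit_output_bytes.
# ASSUMPTION: each queued frame is charged TRUESIZE bytes; 2304 is a
# hypothetical per-frame footprint, the real value depends on driver and kernel.

frames_under_limit() {
    # $1 = tcp_limit_output_bytes, $2 = assumed bytes charged per frame
    echo $(( $1 / $2 ))
}

TRUESIZE=2304   # assumed, not measured
TX_FRAMES=64    # from "ethtool -C eth0 tx-usecs 1024 tx-frames 64"

# Compare both limit values used in the runs above against tx-frames.
for limit in 131072 262144; do
    n=$(frames_under_limit "$limit" "$TRUESIZE")
    if [ "$n" -lt "$TX_FRAMES" ]; then
        echo "limit=$limit: ~$n frames, below tx-frames=$TX_FRAMES"
    else
        echo "limit=$limit: ~$n frames, at or above tx-frames=$TX_FRAMES"
    fi
done
```

Under that assumption, 131072 allows about 56 frames in flight toward the NIC, fewer than the 64 needed to force the TX interrupt early, so the sender would stall until the 1 ms timer fires. 262144 allows about 113, which is consistent with the jump to 8518.40 Mbit/s in the last run.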