Subject: Re: debugging TCP stalls on high-speed wifi
To: Johannes Berg, Eric Dumazet, Neal Cardwell
Cc: Toke Høiland-Jørgensen, linux-wireless@vger.kernel.org, Netdev
References:
 <14cedbb9300f887fecc399ebcdb70c153955f876.camel@sipsolutions.net>
 <99748db5-7898-534b-d407-ed819f07f939@gmail.com>
From: Ben Greear
Organization: Candela Technologies
Message-ID: <04dc171a-7385-6544-6cc6-141aae9f2782@candelatech.com>
Date: Thu, 12 Dec 2019 13:29:07 -0800
X-Mailing-List: linux-wireless@vger.kernel.org

On 12/12/19 1:11 PM, Johannes Berg wrote:
> Hi Eric,
>
> Thanks for looking :)
>
>>> I'm not sure how to do headers-only, but I guess -s100 will work.
>>>
>>> https://johannes.sipsolutions.net/files/he-tcp.pcap.xz
>>>
>>
>> Lack of GRO on receiver is probably what is killing performance,
>> both for receiver (generating gazillions of acks) and sender
>> (to process all these acks)
> Yes, I'm aware of this, to some extent. And I'm not saying we should see
> even close to 1800 Mbps like we have with UDP...
>
> Mind you, the biggest thing that kills performance with many ACKs isn't
> the load on the system - the sender system is only moderately loaded at
> ~20-25% of a single core with TSO, and around double that without TSO.
> The thing that kills performance is eating up all the medium time with
> small non-aggregated packets, due to the half-duplex nature of WiFi.
> I know you know, but in case somebody else is reading along :-)
>
> But unless somehow you think processing the (many) ACKs on the sender
> will cause it to stop transmitting, or something like that, I don't
> think I should be seeing what I described earlier: we sometimes (have
> to?) reclaim the entire transmit queue before TCP starts pushing data
> again. That's less than 2MB split across at least two TCP streams, I
> don't see why we should have to get to 0 (which takes about 7ms) until
> more packets come in from TCP?
>
> Or put another way - if I free say 400kB worth of SKBs, what could be
> the reason we don't see more packets be sent out of the TCP stack within
> the few ms or so? I guess I have to correlate this somehow with the ACKs
> so I know how much data is outstanding for ACKs. (*)
>
> The sk_pacing_shift is set to 7, btw, which should give us 8ms of
> outstanding data. For now in this setup that's enough(**), and indeed
> bumping the limit up (setting sk_pacing_shift to say 5) doesn't change
> anything. So I think this part we actually solved - I get basically the
> same performance and behaviour with two streams (needed due to GBit LAN
> on the other side) as with 20 streams.
>
>
>> I had a plan about enabling compressing ACK as I did for SACK
>> in commit
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5d9f4262b7ea41ca9981cc790e37cca6e37c789e
>>
>> But I have not done it yet.
>> It is a pity because this would tremendously help wifi I am sure.
>
> Nice :-)
>
> But that is something the *receiver* would have to do.
>
> The dirty secret here is that we're getting close to 1700 Mbps TCP with
> Windows in place of Linux in the setup, with the same receiver on the
> other end (which is actually a single Linux machine with two GBit
> network connections to the AP). So if we had this I'm sure it'd increase
> performance, but it still wouldn't explain why we're so much slower than
> Windows :-)
>
> Now, I'm certainly not saying that TCP behaviour is the only reason for
> the difference, we already found an issue for example where due to a
> small Windows driver bug some packet extension was always used, and the
> AP is also buggy in that it needs the extension but didn't request it
> ... so the two bugs cancelled each other out and things worked well, but
> our Linux driver believed the AP ... :) Certainly there can be more
> things like that still, I just started on the TCP side and ran into the
> queueing behaviour that I cannot explain.
>
>
> In any case, I'll try to dig deeper into the TCP stack to understand the
> reason for this transmit behaviour.
>
> Thanks,
> johannes
>
>
> (*) Hmm. Now I have another idea. Maybe we have some kind of problem
> with the medium access configuration, and we transmit all this data
> without the AP having a chance to send back all the ACKs? Too bad I
> can't put an air sniffer into the setup - it's a conductive setup.

Could you put a splitter/combiner in the path?

If the problem is just delayed ACKs coming back, which would slow down a
single stream, then multiple streams would tend to work around it. I would
expect a similar speedup from multiple streams if some TCP socket were
blocked waiting for ACKs, too.

Even if you can't sniff the air, you could sniff the wire or just look at
packet in/out counts. A huge number of ACKs would show up in the raw packet
counters.

I'm not sure it matters these days, but this patch greatly helped TCP
throughput on ath10k for a while, and we are still using it. Maybe your
sk_pacing change already tweaked the same logic:

https://github.com/greearb/linux-ct-5.4/commit/65651d4269eb2b0d4b4952483c56316a7fbe2f48

if [ -w /proc/sys/net/ipv4/tcp_tsq_limit_output_interval ]
then
    # This helps TCP tx throughput when using ath10k. Setting > 1 likely
    # increases latency in some cases, but on average, seems a win for us.
    TCP_TSQ=200
    echo -n "Setting TCP-tsq limit to $TCP_TSQ....................................... "
    echo "$TCP_TSQ" > /proc/sys/net/ipv4/tcp_tsq_limit_output_interval
    echo "DONE"
fi

Thanks,
Ben

-- 
Ben Greear
Candela Technologies Inc  http://www.candelatech.com
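
As a rough sketch of the "look at packet in/out counts" suggestion above: the functions below read the standard rx_packets and tx_packets counters under /sys/class/net/<iface>/statistics/ and print how much each moved over an interval. The function names and the wlan0 interface name in the usage line are made up for illustration, and this only counts packets; it does not tell ACKs apart from other traffic.

```sh
#!/bin/sh
# Illustrative helpers (hypothetical names) for comparing packet in/out counts.

# Print "<rx_packets> <tx_packets>" from a statistics directory,
# normally /sys/class/net/<iface>/statistics.
read_pkt_counters() {
    stats_dir=$1
    rx=$(cat "$stats_dir/rx_packets") || return 1
    tx=$(cat "$stats_dir/tx_packets") || return 1
    echo "$rx $tx"
}

# Print "<rx_delta> <tx_delta>" between two samples.
# Arguments: rx_before tx_before rx_after tx_after
pkt_delta() {
    echo "$(( $3 - $1 )) $(( $4 - $2 ))"
}

# Sample one interface twice, <seconds> apart (default 1), and print
# how many packets were received and sent in between.
sample_iface() {
    iface=$1
    seconds=${2:-1}
    dir="/sys/class/net/$iface/statistics"
    before=$(read_pkt_counters "$dir") || return 1
    sleep "$seconds"
    after=$(read_pkt_counters "$dir") || return 1
    # Intentional word splitting: before/after each hold two numbers.
    set -- $before $after
    pkt_delta "$1" "$2" "$3" "$4"
}
```

For example, running `sample_iface wlan0 5` on the sender during a test prints the received and transmitted packet counts for that 5 second window. On a sender that is mostly pushing bulk TCP data, a received count that is large compared with the transmitted count would be consistent with a flood of ACKs coming back.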