Message-ID: <49cd2d6c7bf597c224edb8806cd56c126b5901b4.camel@sipsolutions.net>
Subject: Re: debugging TCP stalls on high-speed wifi
From: Johannes Berg
To: Ben Greear, Eric Dumazet, Neal Cardwell
Cc: Toke Høiland-Jørgensen, linux-wireless@vger.kernel.org, Netdev
Date: Thu, 12 Dec 2019 22:46:27 +0100
In-Reply-To: <04dc171a-7385-6544-6cc6-141aae9f2782@candelatech.com>
References: <14cedbb9300f887fecc399ebcdb70c153955f876.camel@sipsolutions.net> <99748db5-7898-534b-d407-ed819f07f939@gmail.com> <04dc171a-7385-6544-6cc6-141aae9f2782@candelatech.com>
X-Mailing-List: linux-wireless@vger.kernel.org

On Thu, 2019-12-12 at 13:29 -0800, Ben Greear wrote:
> >
> > (*) Hmm. Now I have another idea.
> > Maybe we have some kind of problem
> > with the medium access configuration, and we transmit all this data
> > without the AP having a chance to send back all the ACKs? Too bad I
> > can't put an air sniffer into the setup - it's a conductive setup.
>
> splitter/combiner?

I guess. I haven't looked at it, it's halfway around the world or
something :)

> If it is just delayed acks coming back, which would slow down a stream, then
> multiple streams would tend to work around that problem?

Only a bit, because it allows somewhat more outstanding data. But each
stream estimates the throughput lower in its congestion control
algorithm, so it would have a smaller window size?

What I was thinking is that if we have some kind of skew in the system
and always/frequently/sometimes make our transmissions have priority
over the AP transmissions, then we'd not get ACKs back, and that might
cause what I see - the queue drains entirely and *then* we get an ACK
back...

That's not a _bad_ theory and I'll have to find a good way to test it,
but I'm not entirely convinced that's the problem.

Oh, actually, I guess I know it's *not* the problem because otherwise
the ss output would show we're blocked on congestion window far more
than it looks like now? I think?

> I would actually expect similar speedup with multiple streams if some TCP socket
> was blocked on waiting for ACKs too.
>
> Even if you can't sniff the air, you could sniff the wire or just look at packet
> in/out counts. If you have a huge number of ACKs, that would show up in raw pkt
> counters.

I know I have a huge number of ACKs, but I also know that's not the
(only) problem. My question/observation was related to the timing of
them.

> I'm not sure it matters these days, but this patch greatly helped TCP throughput on
> ath10k for a while, and we are still using it.
> Maybe your sk_pacing change already
> tweaked the same logic:
>
> https://github.com/greearb/linux-ct-5.4/commit/65651d4269eb2b0d4b4952483c56316a7fbe2f48

Yes, you should be able to drop that patch - look at it, it just
multiplies the thing there that you have with "sk->sk_pacing_shift",
instead we currently by default set sk->sk_pacing_shift to 7 instead of
10 or something, so that'd be equivalent to setting your sysctl to 8.

> TCP_TSQ=200

Setting it to 200 is way excessive. In particular since you already get
the *8 from the default mac80211 behaviour, so now you effectively have
*1600, which means instead of 1ms you can have 1.6s worth of TCP data on
the queues ... way too much :)

johannes
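
The arithmetic behind the *8 and *1600 figures can be sketched roughly as
follows. This is only an illustration built on the round numbers quoted in
the message: a baseline of about 1ms of queued TCP data at a pacing shift
of 10, a mac80211 default shift of 7, and a multiplier of 200 from the
TCP_TSQ setting. The function and parameter names are made up for this
example and are not kernel APIs.

```python
def queue_budget_ms(base_ms, default_shift, actual_shift, multiplier=1):
    """Approximate time's worth of TCP data allowed on the queues.

    base_ms       -- budget at the default shift (the message uses ~1ms)
    default_shift -- baseline pacing shift (10 in the message)
    actual_shift  -- shift actually in effect (7 for mac80211 per the message)
    multiplier    -- extra factor from a sysctl-style knob (hypothetical name)
    """
    # Each step the shift is lowered doubles the allowed amount of data.
    shift_factor = 2 ** (default_shift - actual_shift)
    return base_ms * shift_factor * multiplier


# mac80211 default alone: shift 7 instead of 10 gives a factor of 8.
mac80211_only = queue_budget_ms(1.0, 10, 7)

# Adding a multiplier of 200 on top gives a factor of 1600.
with_tsq_200 = queue_budget_ms(1.0, 10, 7, multiplier=200)

print(mac80211_only)          # 8.0 (ms)
print(with_tsq_200 / 1000.0)  # 1.6 (s)
```

Under these assumptions, the combined factor is what turns a budget of
about 1ms into about 1.6s of data sitting on the queues.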