Received: by 2002:a25:8b91:0:0:0:0:0 with SMTP id j17csp1892124ybl; Thu, 5 Dec 2019 08:36:20 -0800 (PST) X-Google-Smtp-Source: APXvYqwrgIUMV6XABii2AbwapmgLM+ejJifL2jrg7E8+QFqWJzxwh5wjTktdY6e6Y/PbqnW6h+iN X-Received: by 2002:a9d:1425:: with SMTP id h34mr7253730oth.166.1575563780740; Thu, 05 Dec 2019 08:36:20 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1575563780; cv=none; d=google.com; s=arc-20160816; b=YT3XrGCe/qb6rwvv8+zIaj2N/p21Rub9kYVFbdi+lyWpWnozvU/oVYS8qNdVKgB6sP SY8+8e2W6vGqo1yoL/nVZ2vXKYUFOwa4hOpiAPlrEpuUxL55xXg4bnlJyl2BPnMdBVgo 1owLdosqOrCo3MhnkWeDD2WcptKwIjPrjpn/Jp+sXXFgY4JvLDnl822QaxiKb6oYQQAL 2OpHHkd67uW3a5dYZn3BBAIn5zKGSiaQ7idbvGmKsiH+OLC/p67WRU78uR75tMo9S471 YekIdmgtWur1SvDBl1JhahLS98RwiGEWsYjXviKBNRCAFbSaSZhk6LhO2CWugBuc7Xk9 XD6Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :user-agent:date:cc:to:from:subject:message-id; bh=qql/qnazZK3qfGcXjWpokjShmDvrVlxPM8fW1O13HtY=; b=QgfXVzaQZ772o9tz9ZJB0VZyMEJBoYHGA/cHw7lKjk+kd7n8fDqi0HKyXOINh5H635 UYWK/wCeXYPMu7P/Pt2d1nJv+Z7IA7vXAaQe+fY7/uuO2pMqNd4IrvyfCXaHpZiUQ/vh l8CYcv6NCV6IQBSHGirQLJJxzIcGTqBDamqdZuoUuceDdWzJnqOQucfhry0See/FWEmV EaAolwFjv2aKplh/b7zOyNDdsHDeueMZBvbLSYxh9x8dZzcV4/x6EwH4wB1VsDaHbnRF J0sZIK0kZ41NHacvkbQZWxZFHIMXr5HGMoET6Lj0r46UVUz0UE+McYq/7in/SzFimSkI 72XQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-wireless-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-wireless-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id g17si5379123otg.160.2019.12.05.08.35.55; Thu, 05 Dec 2019 08:36:20 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-wireless-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-wireless-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-wireless-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729099AbfLEQeU (ORCPT + 99 others); Thu, 5 Dec 2019 11:34:20 -0500 Received: from s3.sipsolutions.net ([144.76.43.62]:57918 "EHLO sipsolutions.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726028AbfLEQeU (ORCPT ); Thu, 5 Dec 2019 11:34:20 -0500 Received: by sipsolutions.net with esmtpsa (TLS1.3:ECDHE_SECP256R1__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim 4.92.3) (envelope-from ) id 1icu54-004ABQ-5d; Thu, 05 Dec 2019 17:34:18 +0100 Message-ID: Subject: debugging TXQs being empty From: Johannes Berg To: Toke =?ISO-8859-1?Q?H=F8iland-J=F8rgensen?= Cc: linux-wireless@vger.kernel.org Date: Thu, 05 Dec 2019 17:34:16 +0100 Content-Type: text/plain; charset="UTF-8" User-Agent: Evolution 3.34.2 (3.34.2-1.fc31) MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: linux-wireless-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-wireless@vger.kernel.org Hi Toke, all, I'm debugging some throughput issues and wondered if you had a hint. This is at HE rates 2x2 80 MHz, so you'd expect ~1Gbps or a bit more, I'm getting ~900 Mbps. Just to set the stage. What I think is (part of) the problem is that I see in the logs that our hardware queues become empty every once a while. This seems to be when/because ieee80211_tx_dequeue() returns NULL, and we hit the skb = ieee80211_tx_dequeue(hw, txq); if (!skb) { if (txq->sta) IWL_DEBUG_TX(mvm, "TXQ of sta %pM tid %d is now empty\n", txq->sta->addr, txq->tid); printout, e.g. iwlwifi 0000:00:14.3: I iwl_mvm_mac_itxq_xmit TXQ of sta 0c:9d:92:03:12:44 tid 0 is now empty This isn't always bad, but in most cases I see it happen the hardware queue actually is rather shallow at the time, say only 57 packets in some instance. Then we can basically send all the packets in the queue in one or two aggregations (see I here an example with 57 packets in the queue, ieee80211_tx_dequeue() returns NULL, and we then send an A-MPDU with 38 followed by one with 19 packets, making the HW queue empty.) This is with 10 simultaneous TCP streams, so there *shouldn't* be any issues with that, I did indeed try to lower the pacing shift and it had no effect. I couldn't try with just one or two streams (actually one stream is not enough because the AP has only GBit LAN ... so in the ideal case wireless is faster than ethernet!!) - somehow the test hangs then, but I'll get back to that later. Anyhow, do you have any tips on debugging this? This is still without AQL code. The AQM stats for the AP look fine, basically everything is 0 except for "new-flows", "tx-bytes" and "tx-packets". One thing that does seem odd is that the new-flows counter is increasing this rapidly - shouldn't we expect it to be like 10 new flows for 10 TCP sessions? I see this counter increase by the thousands per second. I don't see any calls to __ieee80211_stop_queue() either, as expected (per trace-cmd). CPU load is not an issue AFAICT, even with all the debugging being written into the syslog (or journal or something) that's the only thing that takes noticable CPU time - ~50% for systemd-journal and ~20% for rsyslogd, <10% for the throughput testing program and that's about it. The system has 4 threads and seems mostly idle. All this seems to mean that the TCP stack isn't feeding us fast enough, but is that really possible? Any other ideas? Thanks, johannes