Return-path:
Received: from mail.atheros.com ([12.19.149.2]:60175 "EHLO mail.atheros.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750786Ab1B1FO0 (ORCPT ); Mon, 28 Feb 2011 00:14:26 -0500
Received: from mail.atheros.com ([10.10.20.105]) by sidewinder.atheros.com for ; Sun, 27 Feb 2011 21:14:04 -0800
Date: Mon, 28 Feb 2011 10:44:39 +0530
From: Senthil Balasubramanian
To: Denis 'GNUtoo' Carikli
CC: "linux-wireless@vger.kernel.org"
Subject: Re: ath9k stopped queue bug
Message-ID: <20110228051439.GB6441@ksenthil-lnx.users.atheros.com>
References: <1298390485.2575.6.camel@gnutoo-laptop>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
In-Reply-To: <1298390485.2575.6.camel@gnutoo-laptop>
Sender: linux-wireless-owner@vger.kernel.org
List-ID:

On Tue, Feb 22, 2011 at 09:31:24PM +0530, Denis 'GNUtoo' Carikli wrote:
> Hi,
>
> When I transfer large files at high speed (rsync to my x86 router,
> locally, not through the Internet) I get:
> ping: sendmsg: No buffer space available
>
> And I can't send anymore data.
>
> /sys/kernel/debug/ieee80211/phy*/queues is
> 00: 0x00000000/0
> 01: 0x00000000/0
> 02: 0x00000000/0
> 03: 0x00000000/0
> In normal conditions.
>
> But when I can't send anymore data I've that:
> 00: 0x00000000/0
> 01: 0x00000000/0
> 02: 0x00000001/0
> 03: 0x00000000/0
> or that:
> 00: 0x00000000/0
> 01: 0x00000000/0
> 02: 0x00000001/333
> 03: 0x00000000/0

As Johannes pointed out, this is a driver issue, and it is already
addressed in wireless-testing (commit
92460412367c00e97f99babdb898d0930ce604fc). I have ported this commit to
the 2.6.37 kernel for your reference, and I believe we should push this
patch to the stable kernel as well, since running with NM can trigger
this issue.

diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c
index da5c645..f6a2a19 100644
--- a/drivers/net/wireless/ath/ath9k/main.c
+++ b/drivers/net/wireless/ath/ath9k/main.c
@@ -292,6 +292,7 @@ int ath_set_channel(struct ath_softc *sc, struct ieee80211_hw *hw,
 	}
 
 ps_restore:
+	ieee80211_wake_queues(hw);
 	spin_unlock_bh(&sc->sc_pcu_lock);
 
 	ath9k_ps_restore(sc);
diff --git a/drivers/net/wireless/ath/ath9k/xmit.c b/drivers/net/wireless/ath/ath9k/xmit.c
index 07b7804..3751e92 100644
--- a/drivers/net/wireless/ath/ath9k/xmit.c
+++ b/drivers/net/wireless/ath/ath9k/xmit.c
@@ -1205,8 +1205,17 @@ bool ath_drain_all_txq(struct ath_softc *sc, bool retry_tx)
 		ath_err(common, "Failed to stop TX DMA!\n");
 
 	for (i = 0; i < ATH9K_NUM_TX_QUEUES; i++) {
-		if (ATH_TXQ_SETUP(sc, i))
-			ath_draintxq(sc, &sc->tx.txq[i], retry_tx);
+		if (!ATH_TXQ_SETUP(sc, i))
+			continue;
+
+		/*
+		 * The caller will resume queues with ieee80211_wake_queues.
+		 * Mark the queue as not stopped to prevent ath_tx_complete
+		 * from waking the queue too early.
+		 */
+		txq = &sc->tx.txq[i];
+		txq->stopped = false;
+		ath_draintxq(sc, txq, retry_tx);
 	}
 
 	return !npend;
@@ -1860,6 +1869,11 @@ static void ath_tx_complete(struct ath_softc *sc, struct sk_buff *skb,
 		spin_lock_bh(&txq->axq_lock);
 		if (WARN_ON(--txq->pending_frames < 0))
 			txq->pending_frames = 0;
+		if (txq->stopped &&
+		    txq->pending_frames < ATH_MAX_QDEPTH) {
+			if (ath_mac80211_start_queue(sc, q))
+				txq->stopped = 0;
+		}
 		spin_unlock_bh(&txq->axq_lock);
 	}
 
@@ -1971,19 +1985,6 @@ static void ath_tx_rc_status(struct ath_buf *bf, struct ath_tx_status *ts,
 	tx_info->status.rates[tx_rateindex].count = ts->ts_longretry + 1;
 }
 
-static void ath_wake_mac80211_queue(struct ath_softc *sc, int qnum)
-{
-	struct ath_txq *txq;
-
-	txq = sc->tx.txq_map[qnum];
-	spin_lock_bh(&txq->axq_lock);
-	if (txq->stopped && txq->pending_frames < ATH_MAX_QDEPTH) {
-		if (ath_mac80211_start_queue(sc, qnum))
-			txq->stopped = 0;
-	}
-	spin_unlock_bh(&txq->axq_lock);
-}
-
 static void ath_tx_processq(struct ath_softc *sc, struct ath_txq *txq)
 {
 	struct ath_hw *ah = sc->sc_ah;
@@ -2081,9 +2082,6 @@ static void ath_tx_processq(struct ath_softc *sc, struct ath_txq *txq)
 		else
 			ath_tx_complete_buf(sc, bf, txq, &bf_head, &ts, txok, 0);
 
-		if (txq == sc->tx.txq_map[qnum])
-			ath_wake_mac80211_queue(sc, qnum);
-
 		spin_lock_bh(&txq->axq_lock);
 		if (sc->sc_flags & SC_OP_TXAGGR)
 			ath_txq_schedule(sc, txq);
@@ -2205,9 +2203,6 @@ void ath_tx_edma_tasklet(struct ath_softc *sc)
 			ath_tx_complete_buf(sc, bf, txq, &bf_head,
 					    &txs, txok, 0);
 
-		if (txq == sc->tx.txq_map[qnum])
-			ath_wake_mac80211_queue(sc, qnum);
-
 		spin_lock_bh(&txq->axq_lock);
 		if (!list_empty(&txq->txq_fifo_pending)) {
 			INIT_LIST_HEAD(&bf_head);

>
>
> Here's my irc conversation in #linux-wireless on Freenode about that
> issue:
>
> Feb 22 16:28:38 hi,
> Feb 22 16:29:23 when I rsync to my router at high speed
> over wifi, huge amount of data, I've that:
> Feb 22 16:29:24 ping: sendmsg: No buffer space available
> Feb 22 16:29:27 and wifi breaks
> Feb 22 16:29:30 I've to reconnect
> Feb 22 16:29:40 should I try setting a lower MTU?
> Feb 22 16:29:43 what should I try?
> Feb 22 16:29:53 and why isn't there any more buffer
> space?
> Feb 22 16:31:57 sounds like a queue management bug
> Feb 22 16:32:06 with packets stuck somewhere
> Feb 22 16:32:09 what driver?
> Feb 22 16:34:04 * an-t (~ant@srv1.gnpx.net) has joined #linux-wireless
> Feb 22 16:34:42 ath9k
> Feb 22 16:34:51 on 2.6.37-020637-generic
> Feb 22 16:34:57 I think that's mainline
> Feb 22 16:35:01 let me check
> Feb 22 16:35:02 hm, dunno
> Feb 22 16:35:09 there were some queue mgmt things there
> Feb 22 16:35:12 don't really konw
> Feb 22 16:35:49 GNUtoo|laptop: Probably useful to share your
> driver DDoS on linux-wireless; some idea of how many files & what size.
> Feb 22 16:36:10 basically what I do is that:
> Feb 22 16:36:20 I use openembedded to cross-compile
> files
> Feb 22 16:36:25 and sync the result with my router
> Feb 22 16:36:26 * Blues-Man
> (~bluesman@host137-190-dynamic.43-79-r.retail.telecomitalia.it) has
> joined #linux-wireless
> Feb 22 16:36:35 that is an x86 computer with ath9k and
> hostapd
> Feb 22 16:36:52
> cd /home/gnutoo/embedded/oe/oetmps/eee701/deploy/glibc
> Feb 22 16:36:56 rsync -av -e "ssh -l gnutoo -p 222" *
> router:/var/www/gnutoo.homelinux.org/openembedded/eee701
> Feb 22 16:37:05 is the script I use to sync it
> Feb 22 16:37:37 I bet when this happens you never get a ping
> pcket through
> Feb 22 16:38:10 and /sys/kernel/debug/ieee80211/phy*/queues is
> non-zero
> Feb 22 16:38:18 the info from that file would be useful
> Feb 22 16:38:37 ok I was pastebining the file sizes
> Feb 22 16:38:41 as there are a lot of files....
> Feb 22 16:39:06 ok I'll try to reproduce
> Feb 22 16:39:13 tough that will disconnect me from irc
> Feb 22 16:40:54 I bet it'll be 0x0001/n
> Feb 22 16:40:56 n > 0
> Feb 22 16:42:16 ping also increase during the huge
> transfer
> Feb 22 16:42:30 that's "bufferbloat" but expected now
> Feb 22 16:42:35 ok
> Feb 22 16:42:49 I learned what bufferbloat was not so
> long ago
> Feb 22 16:44:27 * Topic for #linux-wireless is: User-level discussions
> about wireless LANs on Linux | compat-wireless-2.6 only available for
> kernels >= 2.6.27, work is underway to enable older kernels now that we
> don't use multiqueue on mac80211
> Feb 22 16:44:27 * Topic for #linux-wireless set by linville at Wed Jul
> 8 21:06:20 2009
> Feb 22 16:44:30 it starts with
> Feb 22 16:44:32 02: 0x00000001/0
> Feb 22 16:44:41 and then increase to
> Feb 22 16:44:47 02: 0x00000001/333
> Feb 22 16:44:50 the reset is 0
> Feb 22 16:45:02 yeah
> Feb 22 16:45:06 as expected
> Feb 22 16:45:08 ok
> Feb 22 16:45:12 what's that exactly?
> Feb 22 16:45:19 the reason why the queue is stopped
> Feb 22 16:45:23 and the number of packets in the queue
> Feb 22 16:45:24 oh nice
> Feb 22 16:45:32 0x000 == not stopped
> Feb 22 16:45:35 ok
> Feb 22 16:45:39 and what's the reason?
> Feb 22 16:45:41 /0 = no packets
> Feb 22 16:45:54 BIT(0) == driver asked for queue to be stopped
> Feb 22 16:46:03 (IEEE80211_QUEUE_STOP_REASON_DRIVER)
> Feb 22 16:46:08 ok
> Feb 22 16:46:11 (net/mac80211/ieee80211_i.h)
> Feb 22 16:46:15 so driver's fault
> Feb 22 16:46:28 dmesg shows nothing tough
> Feb 22 16:46:37 only normal stuff
> Feb 22 16:46:38 yeah not surprising either
> Feb 22 16:46:45 ah debugfs?
> Feb 22 16:46:47 queue start/stop happens often enough, no
> logging for it
> Feb 22 16:46:50 or something like that should be used
> Feb 22 16:46:50 ok
> Feb 22 16:49:45 what should I do now?
> Feb 22 16:50:10 report a bug on ath9k
>
> Denis.
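
For anyone reading the queues file while debugging this, the format explained in the IRC log (queue index, stop-reason bitmask, pending packet count) can be decoded with a few lines of C. This is only a minimal sketch: it assumes every line looks like the samples quoted above ("02: 0x00000001/333"), and the names struct queue_state, parse_queue_line, stopped_by_driver and STOP_REASON_DRIVER_BIT are hypothetical helpers made up for illustration, not mac80211 or ath9k symbols.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Bit 0 of the reason mask means the driver asked for the queue to be
 * stopped (IEEE80211_QUEUE_STOP_REASON_DRIVER according to the log above).
 * The macro name here is an assumption chosen for this example.
 */
#define STOP_REASON_DRIVER_BIT (1UL << 0)

/* Hypothetical container for one parsed line; not a kernel structure. */
struct queue_state {
	unsigned int index;	/* queue number, e.g. 2 */
	unsigned long reasons;	/* stop-reason bitmask, 0 means not stopped */
	unsigned int pending;	/* packets waiting in the queue */
};

/* Parse one line such as "02: 0x00000001/333". Returns true on success. */
static bool parse_queue_line(const char *line, struct queue_state *out)
{
	return sscanf(line, "%u: 0x%lx/%u",
		      &out->index, &out->reasons, &out->pending) == 3;
}

/* True when the driver stop-reason bit is set for this queue. */
static bool stopped_by_driver(const struct queue_state *q)
{
	return (q->reasons & STOP_REASON_DRIVER_BIT) != 0;
}
```

Feeding it the line "02: 0x00000001/333" from the report gives index 2, a reason mask with only the driver bit set, and 333 pending packets, which matches the stuck state described: the driver stopped the queue and never woke it again.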