Return-path:
Received: from youngberry.canonical.com ([91.189.89.112]:34517 "EHLO
 youngberry.canonical.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S933268Ab2JZOXu (ORCPT ); Fri, 26 Oct 2012 10:23:50 -0400
From: Seth Forshee
To: linux-wireless@vger.kernel.org
Cc: "John W. Linville", Arend van Spriel, "Franky (Zhenhui) Lin",
 Brett Rudley, Roland Vossen, Kan Yan, brcm80211-dev-list@broadcom.com,
 Seth Forshee
Subject: [PATCH 00/18] brcmsmac: Tx rework and expanded debug/trace support
Date: Fri, 26 Oct 2012 09:23:15 -0500
Message-Id: <1351261413-20821-1-git-send-email-seth.forshee@canonical.com> (sfid-20121026_162354_545445_35E07215)
Sender: linux-wireless-owner@vger.kernel.org
List-ID:

I've been looking into the issues with brcmsmac performance reported at
[0] and [1]. I started out looking into the tx queueing based on the "No
where to go" messages in the logs. This code has a number of
shortcomings:

 - The amount of buffering is excessive. The tx queue will buffer up to
   228 packets, and each of the tx DMA rings will queue up to 256 more.

 - There's no flow control. If the queue fills up, packets begin to get
   dropped, as evidenced by the "No where to go" messages.

 - Without flow control the tx queue probably helps avoid dropping
   packets for short bursts due to the sheer number of packets that will
   be buffered, but if flow control is added the only remaining benefit
   that I can see is that it accumulates packets for aggregation. The tx
   queue is far more complex than needed for supporting aggregation,
   however.

As a result I worked up the following patches to add flow control and
remove the tx queue.

These patches change the tx handler to directly hand off packets to the
DMA code. The convoluted priority->precedence->fifo mapping is converted
to a simple one-to-one mapping of the mac80211 queues to fifos.
Non-aggregate frames are immediately inserted into the DMA ring.

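To illustrate the direct hand-off described above, here is a minimal
sketch in plain C. It is not the driver code: the names (tx_ring,
tx_frame, tx_complete, queue_to_fifo), the headroom value and the
queue-to-fifo ordering are all assumptions made for the example. Only
the 64-entry ring size is taken from this series.

```c
#include <stdbool.h>

/* All identifiers below are hypothetical, chosen for this sketch only. */
enum { NUM_AC = 4 };         /* mac80211 exposes four access-category queues */
enum { RING_ENTRIES = 64 };  /* tx descriptors per ring, as in this series */
enum { STOP_HEADROOM = 2 };  /* assumed value: slots kept free for late frames */

struct tx_ring {
	unsigned int used;   /* descriptors currently holding frames */
	bool stopped;        /* true while the matching queue is stopped */
};

struct tx_state {
	struct tx_ring ring[NUM_AC];
};

/* One queue per fifo; the ordering shown here is an assumption. */
static const unsigned int queue_to_fifo[NUM_AC] = { 3, 2, 1, 0 };

/*
 * Hand a frame straight to the ring for its queue. Returns false when the
 * frame has to be dropped because the ring is completely full.
 */
bool tx_frame(struct tx_state *st, unsigned int queue)
{
	struct tx_ring *r;

	if (queue >= NUM_AC)
		return false;

	r = &st->ring[queue_to_fifo[queue]];
	if (r->used >= RING_ENTRIES)
		return false;

	r->used++;

	/* Stop early so a frame already in flight from the stack still fits. */
	if (RING_ENTRIES - r->used <= STOP_HEADROOM)
		r->stopped = true;

	return true;
}

/* Called when the hardware has finished with one descriptor. */
void tx_complete(struct tx_state *st, unsigned int fifo)
{
	struct tx_ring *r;

	if (fifo >= NUM_AC)
		return;

	r = &st->ring[fifo];
	if (r->used == 0)
		return;

	r->used--;

	/* Wake the queue once more than the headroom is free again. */
	if (r->stopped && RING_ENTRIES - r->used > STOP_HEADROOM)
		r->stopped = false;
}
```

In a real driver the stopped flag would presumably correspond to
stopping and waking the mac80211 queue; here it is only a boolean so
that the thresholds can be checked in isolation.
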
Handling of aggregate frames is not as simple, as some of the tx header
fixups can only happen once we have all the frames for an AMPDU. To
support this without resyncing buffers after they've been added to the
DMA ring I've added the concept of AMPDU sessions. An AMPDU session
simply queues up the frames for a single AMPDU until we are ready to
insert them into the tx ring. There is one session per DMA ring, and
descriptors are reserved in the corresponding ring for all frames queued
in the AMPDU session. This also has the benefit of allowing
non-aggregate frames to be sent without affecting aggregation and
without mapping these frames to a different fifo.

The patches also add flow control to stop incoming tx packets when the
DMA ring is full. In practice I found that we will sometimes receive a
single frame from mac80211 after stopping the queues, so some headroom
is reserved when stopping the queues. I also reduced the number of tx
descriptors per ring to 64 and fixed a bug that prevented having
differing non-zero numbers of tx and rx descriptors for a given ring.

When working on this I made extensive use of ftrace for debug and
verification. I'm including patches I wrote which expand the trace
support and introduce debug macros which can log messages both to dmesg
and the trace buffer. iwlwifi has similar trace support which we've
enabled in Ubuntu, making it easier to collect debug information from
users experiencing wireless problems.

With these changes I'm no longer seeing dropped frames when the tx
queues are full. Anecdotally I'd also say that my testing with iperf
using TCP seems to show more consistent data rates, resulting in a
higher average data rate (sometimes significantly so), but I don't have
sufficient amounts of data to be sure this is the case.

I'm still observing a few problems when testing with iperf, however. The
first is "Pkt tx suppressed, illegal channel" messages.
There are also large drops in the data rate reported when using TCP,
sometimes even resulting in iperf reporting that no data was transferred
for several seconds. Finally, when using iperf with UDP the number of
dropped frames periodically spikes to high levels. I'm not sure yet, but
it looks like the second and third problems may coincide with scanning.

I also continue to see flush timeouts, but the frequency seems to be
reduced with these changes. Likely this is related to the much smaller
number of packets that will be queued internally for tx.

Thanks,
Seth

[0] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1046507
[1] http://www.spinics.net/lists/linux-wireless/msg96287.html

Seth Forshee (18):
  brcmsmac: Rework tx code to avoid internal buffering of packets
  brcmsmac: Use correct descriptor count when calculating next rx descriptor
  brcmsmac: Reduce number of entries in tx DMA rings
  brcm80211: Allow trace support to be enabled separately from debug
  brcm80211: Add macro for checking if debug log levels are enabled
  brcm80211: Convert log message levels to debug levels
  brcmsmac: Add module parameter for setting the debug level
  brcmsmac: Add support for writing debug messages to the trace buffer
  brcmsmac: Use debug macros for general error and debug statements
  brcmsmac: Add BRCMS_DBG_MAC80211 debug macro
  brcmsmac: Add RX and TX debug macros
  brcmsmac: Add INT debug macro
  brcmsmac: Add DMA debug macro
  brcmsmac: Add HT debug macro
  brcmsmac: Improve tx trace and debug support
  brcmsmac: Add tracepoint for macintstatus
  brcmsmac: Add tracepoint for AMPDU session information
  brcmsmac: Remove some noisy but uninformative debug messages

 drivers/net/wireless/brcm80211/Kconfig            |   12 +
 drivers/net/wireless/brcm80211/brcmsmac/Makefile  |    3 +-
 drivers/net/wireless/brcm80211/brcmsmac/ampdu.c   |  726 ++++++------
 drivers/net/wireless/brcm80211/brcmsmac/ampdu.h   |   29 +-
 drivers/net/wireless/brcm80211/brcmsmac/antsel.c  |    4 +-
 .../net/wireless/brcm80211/brcmsmac/brcms_debug.c |   44 +
 .../net/wireless/brcm80211/brcmsmac/brcms_debug.h |   46 +
 .../brcm80211/brcmsmac/brcms_trace_events.h       |  175 ++-
 drivers/net/wireless/brcm80211/brcmsmac/channel.c |   10 +-
 drivers/net/wireless/brcm80211/brcmsmac/dma.c     |  343 ++++--
 drivers/net/wireless/brcm80211/brcmsmac/dma.h     |   11 +-
 .../net/wireless/brcm80211/brcmsmac/mac80211_if.c |  123 ++-
 drivers/net/wireless/brcm80211/brcmsmac/main.c    | 1157 ++++++--------------
 drivers/net/wireless/brcm80211/brcmsmac/main.h    |   48 +-
 drivers/net/wireless/brcm80211/brcmsmac/stf.c     |    8 +-
 drivers/net/wireless/brcm80211/brcmsmac/types.h   |    5 +-
 drivers/net/wireless/brcm80211/include/defs.h     |   11 +-
 17 files changed, 1302 insertions(+), 1453 deletions(-)
 create mode 100644 drivers/net/wireless/brcm80211/brcmsmac/brcms_debug.c
 create mode 100644 drivers/net/wireless/brcm80211/brcmsmac/brcms_debug.h
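
For readers who want a concrete picture of the AMPDU session idea from
the cover letter, the following is a rough sketch in plain C of how
descriptor reservation could work. It is not taken from the patches:
every name (dma_ring, ampdu_session, session_add, session_flush,
ring_post) and the session size limit are hypothetical, and frames are
represented by plain integers instead of real buffers.

```c
#include <stdbool.h>

/* All identifiers and limits below are hypothetical. */
enum { DMA_RING_SIZE = 64 };       /* tx descriptors per ring, as in this series */
enum { SESSION_MAX_FRAMES = 16 };  /* assumed cap on frames per AMPDU */

struct dma_ring {
	unsigned int in_ring;   /* descriptors occupied by posted frames */
	unsigned int reserved;  /* descriptors promised to the open session */
};

struct ampdu_session {
	struct dma_ring *ring;           /* one session per DMA ring */
	int frames[SESSION_MAX_FRAMES];  /* stand-in for the queued frames */
	unsigned int count;
};

/* Descriptors that are neither posted nor reserved. */
unsigned int ring_free(const struct dma_ring *r)
{
	return DMA_RING_SIZE - r->in_ring - r->reserved;
}

/* Post a non-aggregate frame immediately; it cannot use reserved slots. */
bool ring_post(struct dma_ring *r)
{
	if (ring_free(r) == 0)
		return false;
	r->in_ring++;
	return true;
}

/* Queue a frame in the session and reserve a descriptor for it. */
bool session_add(struct ampdu_session *s, int frame_id)
{
	if (s->count == SESSION_MAX_FRAMES || ring_free(s->ring) == 0)
		return false;
	s->frames[s->count++] = frame_id;
	s->ring->reserved++;
	return true;
}

/*
 * Move every queued frame into the ring. Header fixups that need the whole
 * AMPDU would run here, before the frames are posted. The reservation
 * guarantees that the ring has room. Returns the number of frames posted.
 */
unsigned int session_flush(struct ampdu_session *s)
{
	unsigned int n = s->count;

	s->ring->reserved -= n;
	s->ring->in_ring += n;
	s->count = 0;
	return n;
}
```

Because session frames only hold a reservation until the flush, a
non-aggregate frame posted in between goes onto the same ring without
disturbing the pending aggregate.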