2013-09-16 12:49:38

by Michal Kazior

[permalink] [raw]
Subject: [RFC 0/4] ath10k: improve RX performance

Hi,

This patchset gives clear RX performance
improvement on AP135. Throughput is at least
doubled (300mbps -> 600mbps, UDP RX, 2x2).

Patches depend on my RFC patch "mac80211: support
reporting A-MSDU subframes individually".


Michal.


Michal Kazior (4):
ath10k: report A-MSDU subframes individually
ath10k: document decap modes
ath10k: cleanup RX decap handling
ath10k: align RX frames properly

drivers/net/wireless/ath/ath10k/htt.h | 1 +
drivers/net/wireless/ath/ath10k/htt_rx.c | 251 ++++++++++++-----------------
drivers/net/wireless/ath/ath10k/hw.h | 6 +-
drivers/net/wireless/ath/ath10k/rx_desc.h | 24 ++-
drivers/net/wireless/ath/ath10k/txrx.c | 5 +
5 files changed, 135 insertions(+), 152 deletions(-)

--
1.7.9.5



2013-09-24 07:19:51

by Michal Kazior

[permalink] [raw]
Subject: Re: [RFC 0/4] ath10k: improve RX performance

On 24 September 2013 08:42, Kalle Valo <[email protected]> wrote:
> Michal Kazior <[email protected]> writes:
>
>> This patchset gives clear RX performance
>> improvement on AP135. Throughput is at least
>> doubled (300mbps -> 600mbps, UDP RX, 2x2).
>>
>> Patches depend on my RFC patch "mac80211: support
>> reporting A-MSDU subframes individually".
>
> Preferably I can apply these changes only after the mac80211 patch goes
> to net-next (to avoid unnecessary rebasing). Is there a way to
> workaround the need for the mac80211 patch in ath10k so that we could
> apply these patches ASAP? And once the mac80211 patch is commited we
> could just remove the workaround in ath10k.

The workaround could be to clear the retransmission flag when
reporting A-MSDU frames individually.


Michał.

2013-09-16 12:49:42

by Michal Kazior

[permalink] [raw]
Subject: [RFC 3/4] ath10k: cleanup RX decap handling

Native Wifi frames are always decapped as non-QoS
data frames meaning mac80211 couldn't set sk_buff
priority properly. The patch fixes this by using
the original 802.11 header.

Signed-off-by: Michal Kazior <[email protected]>
---
drivers/net/wireless/ath/ath10k/htt_rx.c | 70 +++++++++++++++++-------------
1 file changed, 41 insertions(+), 29 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/htt_rx.c b/drivers/net/wireless/ath/ath10k/htt_rx.c
index dad584f..083dd9a 100644
--- a/drivers/net/wireless/ath/ath10k/htt_rx.c
+++ b/drivers/net/wireless/ath/ath10k/htt_rx.c
@@ -618,7 +618,7 @@ static void ath10k_htt_rx_amsdu(struct ath10k_htt *htt,
enum rx_msdu_decap_format fmt;
enum htt_rx_mpdu_encrypt_type enctype;
struct ieee80211_hdr *hdr;
- u8 hdr_buf[64];
+ u8 hdr_buf[64], addr[ETH_ALEN], *qos;
unsigned int hdr_len;

rxd = (void *)skb->data - sizeof(*rxd);
@@ -652,6 +652,19 @@ static void ath10k_htt_rx_amsdu(struct ath10k_htt *htt,
skb_trim(skb, skb->len - FCS_LEN);
break;
case RX_MSDU_DECAP_NATIVE_WIFI:
+ hdr = (struct ieee80211_hdr *)skb->data;
+ hdr_len = ieee80211_hdrlen(hdr->frame_control);
+ memcpy(addr, ieee80211_get_DA(hdr), ETH_ALEN);
+ skb_pull(skb, hdr_len);
+
+ hdr = (struct ieee80211_hdr *)hdr_buf;
+ hdr_len = ieee80211_hdrlen(hdr->frame_control);
+ memcpy(skb_push(skb, hdr_len), hdr, hdr_len);
+
+ hdr = (struct ieee80211_hdr *)skb->data;
+ qos = ieee80211_get_qos_ctl(hdr);
+ qos[0] &= ~IEEE80211_QOS_CTL_A_MSDU_PRESENT;
+ memcpy(ieee80211_get_DA(hdr), addr, ETH_ALEN);
break;
case RX_MSDU_DECAP_ETHERNET2_DIX:
decap_len = 0;
@@ -687,6 +700,9 @@ static void ath10k_htt_rx_msdu(struct ath10k_htt *htt, struct htt_rx_info *info)
struct ieee80211_hdr *hdr;
enum rx_msdu_decap_format fmt;
enum htt_rx_mpdu_encrypt_type enctype;
+ u8 addr[ETH_ALEN];
+ int hdr_len;
+ void *rfc1042;

/* This shouldn't happen. If it does than it may be a FW bug. */
if (skb->next) {
@@ -700,48 +716,44 @@ static void ath10k_htt_rx_msdu(struct ath10k_htt *htt, struct htt_rx_info *info)
RX_MSDU_START_INFO1_DECAP_FORMAT);
enctype = MS(__le32_to_cpu(rxd->mpdu_start.info0),
RX_MPDU_START_INFO0_ENCRYPT_TYPE);
- hdr = (void *)skb->data - RX_HTT_HDR_STATUS_LEN;
+ hdr = (struct ieee80211_hdr *)rxd->rx_hdr_status;
+ hdr_len = ieee80211_hdrlen(hdr->frame_control);

skb->ip_summed = ath10k_htt_rx_get_csum_state(skb);

switch (fmt) {
case RX_MSDU_DECAP_RAW:
- /* remove trailing FCS */
- skb_trim(skb, skb->len - 4);
+ skb_trim(skb, skb->len - FCS_LEN);
break;
case RX_MSDU_DECAP_NATIVE_WIFI:
- /* nothing to do here */
+ hdr = (struct ieee80211_hdr *)skb->data;
+ hdr_len = ieee80211_hdrlen(hdr->frame_control);
+ memcpy(addr, ieee80211_get_DA(hdr), ETH_ALEN);
+ skb_pull(skb, hdr_len);
+
+ hdr = (struct ieee80211_hdr *)rxd->rx_hdr_status;
+ hdr_len = ieee80211_hdrlen(hdr->frame_control);
+ memcpy(skb_push(skb, hdr_len), hdr, hdr_len);
+
+ hdr = (struct ieee80211_hdr *)skb->data;
+ memcpy(ieee80211_get_DA(hdr), addr, ETH_ALEN);
break;
case RX_MSDU_DECAP_ETHERNET2_DIX:
- /* macaddr[6] + macaddr[6] + ethertype[2] */
- skb_pull(skb, 6 + 6 + 2);
+ rfc1042 = hdr;
+ rfc1042 += roundup(hdr_len, 4);
+ rfc1042 += roundup(ath10k_htt_rx_crypto_param_len(enctype), 4);
+
+ skb_pull(skb, sizeof(struct ethhdr));
+ memcpy(skb_push(skb, sizeof(struct rfc1042_hdr)),
+ rfc1042, sizeof(struct rfc1042_hdr));
+ memcpy(skb_push(skb, hdr_len), hdr, hdr_len);
break;
case RX_MSDU_DECAP_8023_SNAP_LLC:
- /* macaddr[6] + macaddr[6] + len[2] */
- /* we don't need this for non-A-MSDU */
- skb_pull(skb, 6 + 6 + 2);
+ skb_pull(skb, sizeof(struct amsdu_subframe_hdr));
+ memcpy(skb_push(skb, hdr_len), hdr, hdr_len);
break;
}

- if (fmt == RX_MSDU_DECAP_ETHERNET2_DIX) {
- void *llc;
- int llclen;
-
- llclen = 8;
- llc = hdr;
- llc += roundup(ieee80211_hdrlen(hdr->frame_control), 4);
- llc += roundup(ath10k_htt_rx_crypto_param_len(enctype), 4);
-
- skb_push(skb, llclen);
- memcpy(skb->data, llc, llclen);
- }
-
- if (fmt >= RX_MSDU_DECAP_ETHERNET2_DIX) {
- int len = ieee80211_hdrlen(hdr->frame_control);
- skb_push(skb, len);
- memcpy(skb->data, hdr, len);
- }
-
info->skb = skb;
info->encrypt_type = enctype;

--
1.7.9.5


2013-09-17 05:19:38

by Michal Kazior

[permalink] [raw]
Subject: Re: [RFC 3/4] ath10k: cleanup RX decap handling

On 16 September 2013 23:30, Kalle Valo <[email protected]> wrote:
> Michal Kazior <[email protected]> writes:
>
>> Native Wifi frames are always decapped as non-QoS
>> data frames meaning mac80211 couldn't set sk_buff
>> priority properly. The patch fixes this by using
>> the original 802.11 header.
>>
>> Signed-off-by: Michal Kazior <[email protected]>
>
> The patch title doesn't seem to match with the content. Unless I'm
> mistaken it looks like you are adding native wifi frame format support
> and doing some cleanup at the same time. They should be in separate
> patches.

You're right. I'll split it up.

Nwifi was supported, however QoS Data frames were reported as Data
frames though.


>> @@ -652,6 +652,19 @@ static void ath10k_htt_rx_amsdu(struct ath10k_htt *htt,
>> skb_trim(skb, skb->len - FCS_LEN);
>> break;
>> case RX_MSDU_DECAP_NATIVE_WIFI:
>> + hdr = (struct ieee80211_hdr *)skb->data;
>> + hdr_len = ieee80211_hdrlen(hdr->frame_control);
>> + memcpy(addr, ieee80211_get_DA(hdr), ETH_ALEN);
>> + skb_pull(skb, hdr_len);
>> +
>> + hdr = (struct ieee80211_hdr *)hdr_buf;
>> + hdr_len = ieee80211_hdrlen(hdr->frame_control);
>> + memcpy(skb_push(skb, hdr_len), hdr, hdr_len);
>> +
>> + hdr = (struct ieee80211_hdr *)skb->data;
>> + qos = ieee80211_get_qos_ctl(hdr);
>> + qos[0] &= ~IEEE80211_QOS_CTL_A_MSDU_PRESENT;
>> + memcpy(ieee80211_get_DA(hdr), addr, ETH_ALEN);
>> break;
>
> It would be good to have small comments what each part is doing("remove
> the native wifi header", "copy the real 802.11 header", "copy correct
> destination address" etc), makes it a lot faster to skim the code. Also
> document why IEEE80211_QOS_CTL_A_MSDU_PRESENT needs to be cleared.

Okay.


>> case RX_MSDU_DECAP_RAW:
>> - /* remove trailing FCS */
>> - skb_trim(skb, skb->len - 4);
>> + skb_trim(skb, skb->len - FCS_LEN);
>> break;
>
> Please keep the comment still

Why? The point of the comment was to explain the literal "4". Using
define makes the comment redundant.


>
>> case RX_MSDU_DECAP_NATIVE_WIFI:
>> - /* nothing to do here */
>> + hdr = (struct ieee80211_hdr *)skb->data;
>> + hdr_len = ieee80211_hdrlen(hdr->frame_control);
>> + memcpy(addr, ieee80211_get_DA(hdr), ETH_ALEN);
>> + skb_pull(skb, hdr_len);
>> +
>> + hdr = (struct ieee80211_hdr *)rxd->rx_hdr_status;
>> + hdr_len = ieee80211_hdrlen(hdr->frame_control);
>> + memcpy(skb_push(skb, hdr_len), hdr, hdr_len);
>> +
>> + hdr = (struct ieee80211_hdr *)skb->data;
>> + memcpy(ieee80211_get_DA(hdr), addr, ETH_ALEN);
>> break;
>
> Again add small comments describing how we are modifying the headers.

Okay.


>> case RX_MSDU_DECAP_ETHERNET2_DIX:
>> - /* macaddr[6] + macaddr[6] + ethertype[2] */
>> - skb_pull(skb, 6 + 6 + 2);
>> + rfc1042 = hdr;
>> + rfc1042 += roundup(hdr_len, 4);
>> + rfc1042 += roundup(ath10k_htt_rx_crypto_param_len(enctype), 4);
>> +
>> + skb_pull(skb, sizeof(struct ethhdr));
>> + memcpy(skb_push(skb, sizeof(struct rfc1042_hdr)),
>> + rfc1042, sizeof(struct rfc1042_hdr));
>> + memcpy(skb_push(skb, hdr_len), hdr, hdr_len);
>> break;
>
> Ditto.

Comment was supposed to explain where those numbers come from. Using
structures explains it now.


>> case RX_MSDU_DECAP_8023_SNAP_LLC:
>> - /* macaddr[6] + macaddr[6] + len[2] */
>> - /* we don't need this for non-A-MSDU */
>> - skb_pull(skb, 6 + 6 + 2);
>> + skb_pull(skb, sizeof(struct amsdu_subframe_hdr));
>> + memcpy(skb_push(skb, hdr_len), hdr, hdr_len);
>> break;
>
> And here.

Ditto.


Michał.

2013-09-16 12:49:40

by Michal Kazior

[permalink] [raw]
Subject: [RFC 2/4] ath10k: document decap modes

Clarify how each decap mode works in one place.

Signed-off-by: Michal Kazior <[email protected]>
---
drivers/net/wireless/ath/ath10k/rx_desc.h | 24 ++++++++++++++++++++++--
1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/rx_desc.h b/drivers/net/wireless/ath/ath10k/rx_desc.h
index bfec6c8..1c584c4 100644
--- a/drivers/net/wireless/ath/ath10k/rx_desc.h
+++ b/drivers/net/wireless/ath/ath10k/rx_desc.h
@@ -422,10 +422,30 @@ struct rx_mpdu_end {
#define RX_MSDU_START_INFO1_IP_FRAG (1 << 14)
#define RX_MSDU_START_INFO1_TCP_ONLY_ACK (1 << 15)

+/* The decapped header (rx_hdr_status) contains the following:
+ * a) 802.11 header
+ * [padding to 4 bytes]
+ * b) HW crypto parameter
+ * - 0 bytes for no security
+ * - 4 bytes for WEP
+ * - 8 bytes for TKIP, AES
+ * [padding to 4 bytes]
+ * c) A-MSDU subframe header (14 bytes) if appliable
+ * d) LLC/SNAP (RFC1042, 8 bytes)
+ *
+ * In case of A-MSDU only first frame in sequence contains (a) and (b). */
enum rx_msdu_decap_format {
- RX_MSDU_DECAP_RAW = 0,
- RX_MSDU_DECAP_NATIVE_WIFI = 1,
+ RX_MSDU_DECAP_RAW = 0,
+
+ /* Note: QoS frames are reported as non-QoS. The rx_hdr_status in
+ * htt_rx_desc contains the original decapped 802.11 header. */
+ RX_MSDU_DECAP_NATIVE_WIFI = 1,
+
+ /* Payload contains an ethernet header (struct ethhdr). */
RX_MSDU_DECAP_ETHERNET2_DIX = 2,
+
+ /* Payload contains two 48-bit addresses and 2-byte length (14 bytes
+ * total), followed by an RFC1042 header (8 bytes). */
RX_MSDU_DECAP_8023_SNAP_LLC = 3
};

--
1.7.9.5


2013-09-16 12:49:43

by Michal Kazior

[permalink] [raw]
Subject: [RFC 4/4] ath10k: align RX frames properly

Ethernet-like decapping mode leaves IP protocol
frame not aligned to 4-byte boundaries. This leads
to re-aligning in mac80211 which in turn leads to
CPU cache thrashing on lower end host machines and
very poor performance.

Since HW doesn't allow to change payload offset
properly the solution is to force HW to decap in
Native Wifi mode which always has 24-bytes long
802.11 header (even for QoS frames). This means IP
frame is properly aligned in this decap mode.

Signed-off-by: Michal Kazior <[email protected]>
---
drivers/net/wireless/ath/ath10k/hw.h | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath10k/hw.h b/drivers/net/wireless/ath/ath10k/hw.h
index 643f0c9..8c1be768 100644
--- a/drivers/net/wireless/ath/ath10k/hw.h
+++ b/drivers/net/wireless/ath/ath10k/hw.h
@@ -74,7 +74,11 @@ enum ath10k_mcast2ucast_mode {
#define TARGET_RX_CHAIN_MASK (BIT(0) | BIT(1) | BIT(2))
#define TARGET_RX_TIMEOUT_LO_PRI 100
#define TARGET_RX_TIMEOUT_HI_PRI 40
-#define TARGET_RX_DECAP_MODE ATH10K_HW_TXRX_ETHERNET
+
+/* Native Wifi decap mode is used to align IP frames to 4-byte boundaries and
+ * avoid a very expensive re-alignment in mac80211. */
+#define TARGET_RX_DECAP_MODE ATH10K_HW_TXRX_NATIVE_WIFI
+
#define TARGET_SCAN_MAX_PENDING_REQS 4
#define TARGET_BMISS_OFFLOAD_MAX_VDEV 3
#define TARGET_ROAM_OFFLOAD_MAX_VDEV 3
--
1.7.9.5


2013-09-24 06:42:30

by Kalle Valo

[permalink] [raw]
Subject: Re: [RFC 0/4] ath10k: improve RX performance

Michal Kazior <[email protected]> writes:

> This patchset gives clear RX performance
> improvement on AP135. Throughput is at least
> doubled (300mbps -> 600mbps, UDP RX, 2x2).
>
> Patches depend on my RFC patch "mac80211: support
> reporting A-MSDU subframes individually".

Preferably I can apply these changes only after the mac80211 patch goes
to net-next (to avoid unnecessary rebasing). Is there a way to
workaround the need for the mac80211 patch in ath10k so that we could
apply these patches ASAP? And once the mac80211 patch is commited we
could just remove the workaround in ath10k.

--
Kalle Valo

2013-09-24 06:47:47

by Kalle Valo

[permalink] [raw]
Subject: Re: [RFC 3/4] ath10k: cleanup RX decap handling

Michal Kazior <[email protected]> writes:

> On 16 September 2013 23:30, Kalle Valo <[email protected]> wrote:
>> Michal Kazior <[email protected]> writes:
>>
>>> Native Wifi frames are always decapped as non-QoS
>>> data frames meaning mac80211 couldn't set sk_buff
>>> priority properly. The patch fixes this by using
>>> the original 802.11 header.
>>>
>>> Signed-off-by: Michal Kazior <[email protected]>
>>
>> The patch title doesn't seem to match with the content. Unless I'm
>> mistaken it looks like you are adding native wifi frame format support
>> and doing some cleanup at the same time. They should be in separate
>> patches.
>
> You're right. I'll split it up.
>
> Nwifi was supported, however QoS Data frames were reported as Data
> frames though.

Oh, ok. It would be good to document that in the commit log as well.

>>> case RX_MSDU_DECAP_RAW:
>>> - /* remove trailing FCS */
>>> - skb_trim(skb, skb->len - 4);
>>> + skb_trim(skb, skb->len - FCS_LEN);
>>> break;
>>
>> Please keep the comment still
>
> Why? The point of the comment was to explain the literal "4". Using
> define makes the comment redundant.

I know it's redundant, but this is just to improve readability.

>>> case RX_MSDU_DECAP_ETHERNET2_DIX:
>>> - /* macaddr[6] + macaddr[6] + ethertype[2] */
>>> - skb_pull(skb, 6 + 6 + 2);
>>> + rfc1042 = hdr;
>>> + rfc1042 += roundup(hdr_len, 4);
>>> + rfc1042 += roundup(ath10k_htt_rx_crypto_param_len(enctype), 4);
>>> +
>>> + skb_pull(skb, sizeof(struct ethhdr));
>>> + memcpy(skb_push(skb, sizeof(struct rfc1042_hdr)),
>>> + rfc1042, sizeof(struct rfc1042_hdr));
>>> + memcpy(skb_push(skb, hdr_len), hdr, hdr_len);
>>> break;
>>
>> Ditto.
>
> Comment was supposed to explain where those numbers come from. Using
> structures explains it now.

Sure, the structures are very good here. But like above, having small
comments improve readability. Think of it as a "title" or something like
that.

>>> case RX_MSDU_DECAP_8023_SNAP_LLC:
>>> - /* macaddr[6] + macaddr[6] + len[2] */
>>> - /* we don't need this for non-A-MSDU */
>>> - skb_pull(skb, 6 + 6 + 2);
>>> + skb_pull(skb, sizeof(struct amsdu_subframe_hdr));
>>> + memcpy(skb_push(skb, hdr_len), hdr, hdr_len);
>>> break;
>>
>> And here.
>
> Ditto.

Same here :)

--
Kalle Valo

2013-09-16 21:30:52

by Kalle Valo

[permalink] [raw]
Subject: Re: [RFC 3/4] ath10k: cleanup RX decap handling

Michal Kazior <[email protected]> writes:

> Native Wifi frames are always decapped as non-QoS
> data frames meaning mac80211 couldn't set sk_buff
> priority properly. The patch fixes this by using
> the original 802.11 header.
>
> Signed-off-by: Michal Kazior <[email protected]>

The patch title doesn't seem to match with the content. Unless I'm
mistaken it looks like you are adding native wifi frame format support
and doing some cleanup at the same time. They should be in separate
patches.

> @@ -652,6 +652,19 @@ static void ath10k_htt_rx_amsdu(struct ath10k_htt *htt,
> skb_trim(skb, skb->len - FCS_LEN);
> break;
> case RX_MSDU_DECAP_NATIVE_WIFI:
> + hdr = (struct ieee80211_hdr *)skb->data;
> + hdr_len = ieee80211_hdrlen(hdr->frame_control);
> + memcpy(addr, ieee80211_get_DA(hdr), ETH_ALEN);
> + skb_pull(skb, hdr_len);
> +
> + hdr = (struct ieee80211_hdr *)hdr_buf;
> + hdr_len = ieee80211_hdrlen(hdr->frame_control);
> + memcpy(skb_push(skb, hdr_len), hdr, hdr_len);
> +
> + hdr = (struct ieee80211_hdr *)skb->data;
> + qos = ieee80211_get_qos_ctl(hdr);
> + qos[0] &= ~IEEE80211_QOS_CTL_A_MSDU_PRESENT;
> + memcpy(ieee80211_get_DA(hdr), addr, ETH_ALEN);
> break;

It would be good to have small comments what each part is doing("remove
the native wifi header", "copy the real 802.11 header", "copy correct
destination address" etc), makes it a lot faster to skim the code. Also
document why IEEE80211_QOS_CTL_A_MSDU_PRESENT needs to be cleared.

> case RX_MSDU_DECAP_RAW:
> - /* remove trailing FCS */
> - skb_trim(skb, skb->len - 4);
> + skb_trim(skb, skb->len - FCS_LEN);
> break;

Please keep the comment still

> case RX_MSDU_DECAP_NATIVE_WIFI:
> - /* nothing to do here */
> + hdr = (struct ieee80211_hdr *)skb->data;
> + hdr_len = ieee80211_hdrlen(hdr->frame_control);
> + memcpy(addr, ieee80211_get_DA(hdr), ETH_ALEN);
> + skb_pull(skb, hdr_len);
> +
> + hdr = (struct ieee80211_hdr *)rxd->rx_hdr_status;
> + hdr_len = ieee80211_hdrlen(hdr->frame_control);
> + memcpy(skb_push(skb, hdr_len), hdr, hdr_len);
> +
> + hdr = (struct ieee80211_hdr *)skb->data;
> + memcpy(ieee80211_get_DA(hdr), addr, ETH_ALEN);
> break;

Again add small comments describing how we are modifying the headers.

> case RX_MSDU_DECAP_ETHERNET2_DIX:
> - /* macaddr[6] + macaddr[6] + ethertype[2] */
> - skb_pull(skb, 6 + 6 + 2);
> + rfc1042 = hdr;
> + rfc1042 += roundup(hdr_len, 4);
> + rfc1042 += roundup(ath10k_htt_rx_crypto_param_len(enctype), 4);
> +
> + skb_pull(skb, sizeof(struct ethhdr));
> + memcpy(skb_push(skb, sizeof(struct rfc1042_hdr)),
> + rfc1042, sizeof(struct rfc1042_hdr));
> + memcpy(skb_push(skb, hdr_len), hdr, hdr_len);
> break;

Ditto.

> case RX_MSDU_DECAP_8023_SNAP_LLC:
> - /* macaddr[6] + macaddr[6] + len[2] */
> - /* we don't need this for non-A-MSDU */
> - skb_pull(skb, 6 + 6 + 2);
> + skb_pull(skb, sizeof(struct amsdu_subframe_hdr));
> + memcpy(skb_push(skb, hdr_len), hdr, hdr_len);
> break;

And here.

--
Kalle Valo

2013-09-16 12:49:41

by Michal Kazior

[permalink] [raw]
Subject: [RFC 1/4] ath10k: report A-MSDU subframes individually

HW reports each A-MSDU subframe as a separate
sk_buff. It is impossible to configure it to
behave differently.

Until now ath10k was reconstructing A-MSDUs from
subframes which involved a lot of memory
operations. This proved to be a significant
contributor to degraded RX performance.

This at least doubles RX performance on AP135.

Signed-off-by: Michal Kazior <[email protected]>
---
drivers/net/wireless/ath/ath10k/htt.h | 1 +
drivers/net/wireless/ath/ath10k/htt_rx.c | 189 ++++++++++--------------------
drivers/net/wireless/ath/ath10k/txrx.c | 5 +
3 files changed, 71 insertions(+), 124 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/htt.h b/drivers/net/wireless/ath/ath10k/htt.h
index e090902..9bcff3e 100644
--- a/drivers/net/wireless/ath/ath10k/htt.h
+++ b/drivers/net/wireless/ath/ath10k/htt.h
@@ -1181,6 +1181,7 @@ struct htt_rx_info {
u32 info2;
} rate;
bool fcs_err;
+ bool amsdu_more;
};

struct ath10k_htt {
diff --git a/drivers/net/wireless/ath/ath10k/htt_rx.c b/drivers/net/wireless/ath/ath10k/htt_rx.c
index a39fbf4..dad584f 100644
--- a/drivers/net/wireless/ath/ath10k/htt_rx.c
+++ b/drivers/net/wireless/ath/ath10k/htt_rx.c
@@ -41,6 +41,10 @@
/* when under memory pressure rx ring refill may fail and needs a retry */
#define HTT_RX_RING_REFILL_RETRY_MS 50

+
+static int ath10k_htt_rx_get_csum_state(struct sk_buff *skb);
+
+
static int ath10k_htt_rx_ring_size(struct ath10k_htt *htt)
{
int size;
@@ -591,136 +595,92 @@ static bool ath10k_htt_rx_hdr_is_amsdu(struct ieee80211_hdr *hdr)
return false;
}

-static int ath10k_htt_rx_amsdu(struct ath10k_htt *htt,
- struct htt_rx_info *info)
+struct rfc1042_hdr {
+ u8 llc_dsap;
+ u8 llc_ssap;
+ u8 llc_ctrl;
+ u8 snap_oui[3];
+ __be16 snap_type;
+} __packed;
+
+struct amsdu_subframe_hdr {
+ u8 dst[ETH_ALEN];
+ u8 src[ETH_ALEN];
+ __be16 len;
+} __packed;
+
+static void ath10k_htt_rx_amsdu(struct ath10k_htt *htt,
+ struct htt_rx_info *info)
{
struct htt_rx_desc *rxd;
- struct sk_buff *amsdu;
struct sk_buff *first;
- struct ieee80211_hdr *hdr;
struct sk_buff *skb = info->skb;
enum rx_msdu_decap_format fmt;
enum htt_rx_mpdu_encrypt_type enctype;
+ struct ieee80211_hdr *hdr;
+ u8 hdr_buf[64];
unsigned int hdr_len;
- int crypto_len;

rxd = (void *)skb->data - sizeof(*rxd);
- fmt = MS(__le32_to_cpu(rxd->msdu_start.info1),
- RX_MSDU_START_INFO1_DECAP_FORMAT);
enctype = MS(__le32_to_cpu(rxd->mpdu_start.info0),
RX_MPDU_START_INFO0_ENCRYPT_TYPE);

- /* FIXME: No idea what assumptions are safe here. Need logs */
- if ((fmt == RX_MSDU_DECAP_RAW && skb->next)) {
- ath10k_htt_rx_free_msdu_chain(skb->next);
- skb->next = NULL;
- return -ENOTSUPP;
- }
-
- /* A-MSDU max is a little less than 8K */
- amsdu = dev_alloc_skb(8*1024);
- if (!amsdu) {
- ath10k_warn("A-MSDU allocation failed\n");
- ath10k_htt_rx_free_msdu_chain(skb->next);
- skb->next = NULL;
- return -ENOMEM;
- }
-
- if (fmt >= RX_MSDU_DECAP_NATIVE_WIFI) {
- int hdrlen;
-
- hdr = (void *)rxd->rx_hdr_status;
- hdrlen = ieee80211_hdrlen(hdr->frame_control);
- memcpy(skb_put(amsdu, hdrlen), hdr, hdrlen);
- }
+ hdr = (void *)rxd->rx_hdr_status;
+ hdr_len = ieee80211_hdrlen(hdr->frame_control);
+ memcpy(hdr_buf, hdr, hdr_len);

first = skb;
while (skb) {
void *decap_hdr;
- int decap_len = 0;
+ int decap_len;

rxd = (void *)skb->data - sizeof(*rxd);
fmt = MS(__le32_to_cpu(rxd->msdu_start.info1),
- RX_MSDU_START_INFO1_DECAP_FORMAT);
+ RX_MSDU_START_INFO1_DECAP_FORMAT);
decap_hdr = (void *)rxd->rx_hdr_status;

- if (skb == first) {
- /* We receive linked A-MSDU subframe skbuffs. The
- * first one contains the original 802.11 header (and
- * possible crypto param) in the RX descriptor. The
- * A-MSDU subframe header follows that. Each part is
- * aligned to 4 byte boundary. */
-
- hdr = (void *)amsdu->data;
- hdr_len = ieee80211_hdrlen(hdr->frame_control);
- crypto_len = ath10k_htt_rx_crypto_param_len(enctype);
-
- decap_hdr += roundup(hdr_len, 4);
- decap_hdr += roundup(crypto_len, 4);
- }
-
- /* When fmt == RX_MSDU_DECAP_8023_SNAP_LLC:
- *
- * SNAP 802.3 consists of:
- * [dst:6][src:6][len:2][dsap:1][ssap:1][ctl:1][snap:5]
- * [data][fcs:4].
- *
- * Since this overlaps with A-MSDU header (da, sa, len)
- * there's nothing extra to do. */
-
- if (fmt == RX_MSDU_DECAP_ETHERNET2_DIX) {
- /* Ethernet2 decap inserts ethernet header in place of
- * A-MSDU subframe header. */
- skb_pull(skb, 6 + 6 + 2);
-
- /* A-MSDU subframe header length */
- decap_len += 6 + 6 + 2;
-
- /* Ethernet2 decap also strips the LLC/SNAP so we need
- * to re-insert it. The LLC/SNAP follows A-MSDU
- * subframe header. */
- /* FIXME: Not all LLCs are 8 bytes long */
- decap_len += 8;
-
- memcpy(skb_put(amsdu, decap_len), decap_hdr, decap_len);
- }
-
- if (fmt == RX_MSDU_DECAP_NATIVE_WIFI) {
- /* Native Wifi decap inserts regular 802.11 header
- * in place of A-MSDU subframe header. */
- hdr = (struct ieee80211_hdr *)skb->data;
- skb_pull(skb, ieee80211_hdrlen(hdr->frame_control));
+ skb->ip_summed = ath10k_htt_rx_get_csum_state(skb);

- /* A-MSDU subframe header length */
- decap_len += 6 + 6 + 2;
-
- memcpy(skb_put(amsdu, decap_len), decap_hdr, decap_len);
+ /* First frame in an A-MSDU chain has more decapped data. */
+ if (skb == first) {
+ decap_hdr += round_up(ieee80211_hdrlen(hdr->frame_control), 4);
+ decap_hdr += round_up(ath10k_htt_rx_crypto_param_len(enctype), 4);
}

- if (fmt == RX_MSDU_DECAP_RAW)
- skb_trim(skb, skb->len - 4); /* remove FCS */
-
- memcpy(skb_put(amsdu, skb->len), skb->data, skb->len);
-
- /* A-MSDU subframes are padded to 4bytes
- * but relative to first subframe, not the whole MPDU */
- if (skb->next && ((decap_len + skb->len) & 3)) {
- int padlen = 4 - ((decap_len + skb->len) & 3);
- memset(skb_put(amsdu, padlen), 0, padlen);
+ switch (fmt) {
+ case RX_MSDU_DECAP_RAW:
+ skb_trim(skb, skb->len - FCS_LEN);
+ break;
+ case RX_MSDU_DECAP_NATIVE_WIFI:
+ break;
+ case RX_MSDU_DECAP_ETHERNET2_DIX:
+ decap_len = 0;
+ decap_len += sizeof(struct rfc1042_hdr);
+ decap_len += sizeof(struct amsdu_subframe_hdr);
+
+ skb_pull(skb, sizeof(struct ethhdr));
+ memcpy(skb_push(skb, decap_len), decap_hdr, decap_len);
+ memcpy(skb_push(skb, hdr_len), hdr_buf, hdr_len);
+ break;
+ case RX_MSDU_DECAP_8023_SNAP_LLC:
+ memcpy(skb_push(skb, hdr_len), hdr_buf, hdr_len);
+ break;
}

+ info->skb = skb;
+ info->amsdu_more = (skb->next != NULL);
+ info->encrypt_type = enctype;
skb = skb->next;
- }
+ info->skb->next = NULL;

- info->skb = amsdu;
- info->encrypt_type = enctype;
-
- ath10k_htt_rx_free_msdu_chain(first);
+ ath10k_process_rx(htt->ar, info);
+ }

- return 0;
+ /* FIXME: It might be nice to re-assemble the A-MSDU when there's a
+ * monitor interface active for sniffing purposes. */
}

-static int ath10k_htt_rx_msdu(struct ath10k_htt *htt, struct htt_rx_info *info)
+static void ath10k_htt_rx_msdu(struct ath10k_htt *htt, struct htt_rx_info *info)
{
struct sk_buff *skb = info->skb;
struct htt_rx_desc *rxd;
@@ -742,6 +702,8 @@ static int ath10k_htt_rx_msdu(struct ath10k_htt *htt, struct htt_rx_info *info)
RX_MPDU_START_INFO0_ENCRYPT_TYPE);
hdr = (void *)skb->data - RX_HTT_HDR_STATUS_LEN;

+ skb->ip_summed = ath10k_htt_rx_get_csum_state(skb);
+
switch (fmt) {
case RX_MSDU_DECAP_RAW:
/* remove trailing FCS */
@@ -782,7 +744,8 @@ static int ath10k_htt_rx_msdu(struct ath10k_htt *htt, struct htt_rx_info *info)

info->skb = skb;
info->encrypt_type = enctype;
- return 0;
+
+ ath10k_process_rx(htt->ar, info);
}

static bool ath10k_htt_rx_has_decrypt_err(struct sk_buff *skb)
@@ -854,8 +817,6 @@ static void ath10k_htt_rx_handler(struct ath10k_htt *htt,
int fw_desc_len;
u8 *fw_desc;
int i, j;
- int ret;
- int ip_summed;

memset(&info, 0, sizeof(info));

@@ -930,11 +891,6 @@ static void ath10k_htt_rx_handler(struct ath10k_htt *htt,
continue;
}

- /* The skb is not yet processed and it may be
- * reallocated. Since the offload is in the original
- * skb extract the checksum now and assign it later */
- ip_summed = ath10k_htt_rx_get_csum_state(msdu_head);
-
info.skb = msdu_head;
info.fcs_err = ath10k_htt_rx_has_fcs_err(msdu_head);
info.signal = ATH10K_DEFAULT_NOISE_FLOOR;
@@ -947,24 +903,9 @@ static void ath10k_htt_rx_handler(struct ath10k_htt *htt,
hdr = ath10k_htt_rx_skb_get_hdr(msdu_head);

if (ath10k_htt_rx_hdr_is_amsdu(hdr))
- ret = ath10k_htt_rx_amsdu(htt, &info);
+ ath10k_htt_rx_amsdu(htt, &info);
else
- ret = ath10k_htt_rx_msdu(htt, &info);
-
- if (ret && !info.fcs_err) {
- ath10k_warn("error processing msdus %d\n", ret);
- dev_kfree_skb_any(info.skb);
- continue;
- }
-
- if (ath10k_htt_rx_hdr_is_amsdu((void *)info.skb->data))
- ath10k_dbg(ATH10K_DBG_HTT, "htt mpdu is amsdu\n");
-
- info.skb->ip_summed = ip_summed;
-
- ath10k_dbg_dump(ATH10K_DBG_HTT_DUMP, NULL, "htt mpdu: ",
- info.skb->data, info.skb->len);
- ath10k_process_rx(htt->ar, &info);
+ ath10k_htt_rx_msdu(htt, &info);
}
}

diff --git a/drivers/net/wireless/ath/ath10k/txrx.c b/drivers/net/wireless/ath/ath10k/txrx.c
index 68b6fae..ed54bce 100644
--- a/drivers/net/wireless/ath/ath10k/txrx.c
+++ b/drivers/net/wireless/ath/ath10k/txrx.c
@@ -261,6 +261,9 @@ void ath10k_process_rx(struct ath10k *ar, struct htt_rx_info *info)
if (info->fcs_err)
status->flag |= RX_FLAG_FAILED_FCS_CRC;

+ if (info->amsdu_more)
+ status->flag |= RX_FLAG_AMSDU_MORE;
+
status->signal = info->signal;

spin_lock_bh(&ar->data_lock);
@@ -293,6 +296,8 @@ void ath10k_process_rx(struct ath10k *ar, struct htt_rx_info *info)
status->vht_nss,
status->freq,
status->band);
+ ath10k_dbg_dump(ATH10K_DBG_HTT_DUMP, NULL, "rx skb: ",
+ info->skb->data, info->skb->len);

ieee80211_rx(ar->hw, info->skb);
}
--
1.7.9.5


2013-09-24 07:29:20

by Kalle Valo

[permalink] [raw]
Subject: Re: [RFC 0/4] ath10k: improve RX performance

Michal Kazior <[email protected]> writes:

> On 24 September 2013 08:42, Kalle Valo <[email protected]> wrote:
>> Michal Kazior <[email protected]> writes:
>>
>>> This patchset gives clear RX performance
>>> improvement on AP135. Throughput is at least
>>> doubled (300mbps -> 600mbps, UDP RX, 2x2).
>>>
>>> Patches depend on my RFC patch "mac80211: support
>>> reporting A-MSDU subframes individually".
>>
>> Preferably I can apply these changes only after the mac80211 patch goes
>> to net-next (to avoid unnecessary rebasing). Is there a way to
>> workaround the need for the mac80211 patch in ath10k so that we could
>> apply these patches ASAP? And once the mac80211 patch is commited we
>> could just remove the workaround in ath10k.
>
> The workaround could be to clear the retransmission flag when
> reporting A-MSDU frames individually.

That sounds very good. Can you do that and resend the patchset, please?

--
Kalle Valo