hello!
it seems the atheros hardware needs a padding to 4 byte boundaries after the
801.11 header when it sends QoS frames and it will add such a padding
automatically when it receives QoS data frames.
in my case this made the ath5k driver associate with an QoS enabled AP (same
type of card with the madwifi driver in default config) fine, but i could not
get any ARP requests thru - since the padding was missing there were 2 bytes
lost in the packet received on the AP side, so no further communication...
this need for padding is also reflected in the madwifi code and this comment:
/*
* Indicate we need the 802.11 header padded to a
* 32-bit boundary for 4-address and QoS frames.
*/
i created the following simple hack, to see if the problem really is the
missing padding. it just adds the padding all the time, so it will only work
with QoS enabled APs, but it makes me able to communicate with my QoS AP.
i didn't see how mac80211 would allow to solve this elegantly at this time.
are there any other drivers with a similar kind of padding requirement? how
could that be implemented cleanly?
luis, does this hack fix your problem with DHCP too?
bruno
diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
index 064924c..bb11e63 100644
--- a/net/mac80211/rx.c
+++ b/net/mac80211/rx.c
@@ -1019,6 +1019,8 @@ ieee80211_rx_h_data(struct ieee80211_txrx_data *rx)
return TXRX_DROP;
hdrlen = ieee80211_get_hdrlen(fc);
+ if ((fc & IEEE80211_FCTL_FTYPE) == IEEE80211_FTYPE_DATA)
+ hdrlen = hdrlen + 2;
/* convert IEEE 802.11 header + possible LLC headers into Ethernet
* header
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index 957ec3c..6be9b95 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -1405,7 +1405,7 @@ int ieee80211_subif_start_xmit(struct sk_buff *skb,
if (sta) {
if (sta->flags & WLAN_STA_WME) {
fc |= IEEE80211_STYPE_QOS_DATA;
- hdrlen += 2;
+ hdrlen += 4; //2;
}
sta_info_put(sta);
}
what do you think about the following solution?
Atheros hardware requires the header to be padded to 4 byte (32 bit)
boundaries. It expects the payload data to start at multiples of 4 byte for TX
frames and will add the padding for received frames. For most headers there is
no need for padding, since they are multiples of 4 already - except QoS and 4
address headers.
This adds a new hw flag IEEE80211_HW_HEADER_PADDING to tell mac80211 that such
a padding is needed.
---
drivers/net/wireless/ath5k/base.c | 1 +
include/net/mac80211.h | 8 ++++++++
net/mac80211/rx.c | 20 ++++++++++++++++++++
net/mac80211/tx.c | 5 +++++
4 files changed, 34 insertions(+), 0 deletions(-)
diff --git a/drivers/net/wireless/ath5k/base.c
b/drivers/net/wireless/ath5k/base.c
index 08530d5..b705f68 100644
--- a/drivers/net/wireless/ath5k/base.c
+++ b/drivers/net/wireless/ath5k/base.c
@@ -2382,6 +2382,7 @@ static int __devinit ath_pci_probe(struct pci_dev *pdev,
hw->extra_tx_headroom = 2;
hw->channel_change_time = 5000;
hw->max_rssi = 127; /* FIXME: get a real value for this. */
+ hw->flags |= IEEE80211_HW_HEADER_PADDING;
sc = hw->priv;
sc->hw = hw;
sc->pdev = pdev;
diff --git a/include/net/mac80211.h b/include/net/mac80211.h
index 2b1bffb..18c30e6 100644
--- a/include/net/mac80211.h
+++ b/include/net/mac80211.h
@@ -662,12 +662,20 @@ enum sta_notify_cmd {
* @IEEE80211_HW_DEFAULT_REG_DOMAIN_CONFIGURED:
* Channels are already configured to the default regulatory domain
* specified in the device's EEPROM
+ *
+ * @IEEE80211_HW_HEADER_PADDING:
+ * Some hardware requires the header to be padded to 4 byte (32 bit)
+ * boundaries. It expects the payload data to start at multiples
+ * of 4 byte for TX frames and will add the padding for received
+ * frames. For most headers there is no need for padding, since they
+ * are multiples of 4 already - except QoS and 4 address headers.
*/
enum ieee80211_hw_flags {
IEEE80211_HW_HOST_GEN_BEACON_TEMPLATE = 1<<0,
IEEE80211_HW_RX_INCLUDES_FCS = 1<<1,
IEEE80211_HW_HOST_BROADCAST_PS_BUFFERING = 1<<2,
IEEE80211_HW_DEFAULT_REG_DOMAIN_CONFIGURED = 1<<3,
+ IEEE80211_HW_HEADER_PADDING = 1<<4,
};
/**
diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
index 064924c..2ad94d1 100644
--- a/net/mac80211/rx.c
+++ b/net/mac80211/rx.c
@@ -419,6 +419,25 @@ ieee80211_rx_h_check(struct ieee80211_txrx_data *rx)
static ieee80211_txrx_result
+ieee80211_rx_h_remove_padding(struct ieee80211_txrx_data *rx)
+{
+ struct ieee80211_hw *hw = local_to_hw(rx->local);
+ int hdrlen;
+ int pad;
+
+ if (!(hw->flags & IEEE80211_HW_HEADER_PADDING))
+ return TXRX_CONTINUE;
+
+ hdrlen = ieee80211_get_hdrlen(rx->fc);
+ if ((pad = hdrlen % 4)) {
+ memmove(rx->skb->data + pad, rx->skb->data, hdrlen);
+ skb_pull(rx->skb, pad);
+ }
+
+ return TXRX_CONTINUE;
+}
+
+static ieee80211_txrx_result
ieee80211_rx_h_decrypt(struct ieee80211_txrx_data *rx)
{
struct ieee80211_hdr *hdr = (struct ieee80211_hdr *) rx->skb->data;
@@ -1320,6 +1339,7 @@ ieee80211_rx_handler ieee80211_rx_handlers[] =
ieee80211_rx_h_if_stats,
ieee80211_rx_h_passive_scan,
ieee80211_rx_h_check,
+ ieee80211_rx_h_remove_padding,
ieee80211_rx_h_decrypt,
ieee80211_rx_h_sta_process,
ieee80211_rx_h_defragment,
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index 957ec3c..ce4c6ce 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -1334,6 +1334,7 @@ int ieee80211_subif_start_xmit(struct sk_buff *skb,
struct net_device *dev)
{
struct ieee80211_local *local = wdev_priv(dev->ieee80211_ptr);
+ struct ieee80211_hw *hw = local_to_hw(local);
struct ieee80211_tx_packet_data *pkt_data;
struct ieee80211_sub_if_data *sdata;
int ret = 1, head_need;
@@ -1410,6 +1411,10 @@ int ieee80211_subif_start_xmit(struct sk_buff *skb,
sta_info_put(sta);
}
+ if (hw->flags & IEEE80211_HW_HEADER_PADDING) {
+ hdrlen = roundup(hdrlen, 4);
+ }
+
hdr.frame_control = cpu_to_le16(fc);
hdr.duration_id = 0;
hdr.seq_ctrl = 0;
--
1.5.3.4
On Mon, 2007-10-15 at 16:34 +0900, bruno randolf wrote:
> what do you think about the following solution?
I don't like it. If we wanted to implement this in mac80211 we should do
a zero-copy solution with pointers to header/data portion which is very
hard right now because mac80211 has much assumptions that it's all
contiguous. Secondly, this adds overhead for all hardware to mac80211
for something that can be *trivially* solved in the driver, and only a
single driver so far requires it. I think the driver is a *much* better
place to do this.
> Atheros hardware requires the header to be padded to 4 byte (32 bit)
> boundaries. It expects the payload data to start at multiples of 4 byte for TX
> frames and will add the padding for received frames. For most headers there is
> no need for padding, since they are multiples of 4 already - except QoS and 4
> address headers.
>
> This adds a new hw flag IEEE80211_HW_HEADER_PADDING to tell mac80211 that such
> a padding is needed.
[patch snipped]
johannes
ok then, lets do it in the driver...
Author: Bruno Randolf <[email protected]>
Date: Mon Oct 15 19:46:59 2007 +0900
handle padding between header and data
Atheros hardware requires the header to be padded to 4 byte (32 bit)
boundaries. It expects the payload data to start at multiples of 4 byte for TX
frames and will add the padding for received frames. For most headers there is
no need for padding, since they are multiples of 4 already - except QoS and 4
address headers.
Changes-licensed-under: 3-clause-BSD
Signed-off-by: Bruno Randolf <[email protected]>
diff --git a/drivers/net/wireless/ath5k/base.c
b/drivers/net/wireless/ath5k/base.c
index 0997577..0a7cb41 100644
--- a/drivers/net/wireless/ath5k/base.c
+++ b/drivers/net/wireless/ath5k/base.c
@@ -375,6 +375,8 @@ static void ath_tasklet_rx(unsigned long data)
u16 len;
u8 stat;
int ret;
+ int hdrlen;
+ int pad;
spin_lock(&sc->rxbuflock);
do {
@@ -449,13 +451,20 @@ accept:
PCI_DMA_FROMDEVICE);
bf->skb = NULL;
- if (unlikely((ieee80211_get_hdrlen_from_skb(skb) & 3) &&
- net_ratelimit()))
- printk(KERN_DEBUG "rx len is not %%4: %u\n",
- ieee80211_get_hdrlen_from_skb(skb));
-
skb_put(skb, len);
+ /*
+ * the hardware adds a padding to 4 byte boundaries between
+ * the header and the payload data if the header length is
+ * not multiples of 4 - remove it
+ */
+ hdrlen = ieee80211_get_hdrlen_from_skb(skb);
+ if (hdrlen & 3) {
+ pad = hdrlen % 4;
+ memmove(skb->data + pad, skb->data, hdrlen);
+ skb_pull(skb, pad);
+ }
+
if (sc->opmode == IEEE80211_IF_TYPE_MNTR)
rxs.mactime = ath_extend_tsf(sc->ah,
ds->ds_rxstat.rs_tstamp);
@@ -1153,7 +1162,7 @@ static int ath_tx_bf(struct ath_softc *sc, struct
ath_buf *bf,
struct ath_txq *txq = sc->txq;
struct ath_desc *ds = bf->desc;
struct sk_buff *skb = bf->skb;
- unsigned int hdrpad, pktlen, flags, keyidx = AR5K_TXKEYIX_INVALID;
+ unsigned int pktlen, flags, keyidx = AR5K_TXKEYIX_INVALID;
int ret;
flags = AR5K_TXDESC_INTREQ | AR5K_TXDESC_CLRDMASK;
@@ -1165,12 +1174,7 @@ static int ath_tx_bf(struct ath_softc *sc, struct
ath_buf *bf,
if (ctl->flags & IEEE80211_TXCTL_NO_ACK)
flags |= AR5K_TXDESC_NOACK;
- if ((ieee80211_get_hdrlen_from_skb(skb) & 3) && net_ratelimit())
- printk(KERN_DEBUG "tx len is not %%4: %u\n",
- ieee80211_get_hdrlen_from_skb(skb));
-
- hdrpad = 0;
- pktlen = skb->len - hdrpad + FCS_LEN;
+ pktlen = skb->len + FCS_LEN;
if (!(ctl->flags & IEEE80211_TXCTL_DO_NOT_ENCRYPT)) {
keyidx = ctl->key_idx;
@@ -1214,12 +1218,32 @@ static int ath_tx(struct ieee80211_hw *hw, struct
sk_buff *skb,
struct ath_softc *sc = hw->priv;
struct ath_buf *bf;
unsigned long flags;
+ int hdrlen;
+ int pad;
ath_dump_skb(skb, "t");
if (sc->opmode == IEEE80211_IF_TYPE_MNTR)
DPRINTF(sc, ATH_DEBUG_XMIT, "tx in monitor (scan?)\n");
+ /*
+ * the hardware expects the header padded to 4 byte boundaries
+ * if this is not the case we add the padding after the header
+ */
+ hdrlen = ieee80211_get_hdrlen_from_skb(skb);
+ if (hdrlen & 3) {
+ pad = hdrlen % 4;
+ if (skb_headroom(skb) < pad) {
+ if (net_ratelimit())
+ printk(KERN_ERR "ath: tx hdrlen not %%4: %d "
+ "not enough headroom to pad %d\n",
+ hdrlen, pad);
+ return -1;
+ }
+ skb_push(skb, pad);
+ memmove(skb->data, skb->data+pad, hdrlen);
+ }
+
sc->led_txrate = ctl->tx_rate;
spin_lock_irqsave(&sc->txbuflock, flags);
On Mon, 2007-10-15 at 12:41 +0900, bruno randolf wrote:
> but isn't doing a memmove() quite inefficient? especially since we are dealing
> with QoS packets that might be important.
It's only ~20 bytes that are moved, should be within one cache line and
we touch the packet data anyway. I don't think it's a problem.
> another solution i thought of was signalling the mac80211 layer that we need
> padding which could then just adjust it's headerlen. but then different
> drivers might need different padding in different places (i don't know?).
> what do you think of this approach?
That could be worthwhile, though the headerlen calculation is called
very very often and adding another branch into it could very well impact
performance worse than doing the memmove.
johannes
On Fri, 2007-10-12 at 19:46 +0900, bruno randolf wrote:
> i didn't see how mac80211 would allow to solve this elegantly at this time.
> are there any other drivers with a similar kind of padding requirement? how
> could that be implemented cleanly?
b43 also has padding added on RX which we simply remove with a memmove()
johannes
On Fri, 2007-10-12 at 13:58 +0200, Johannes Berg wrote:
> b43 also has padding added on RX which we simply remove with a memmove()
Umm, no, this is wrong, the padding is in another place.
johannes
On Mon, 2007-10-15 at 18:00 +0900, bruno randolf wrote:
> On Monday 15 October 2007 17:47:16 Johannes Berg wrote:
> > > another solution i thought of was signalling the mac80211 layer that we
> > > need padding which could then just adjust it's headerlen. but then
> > > different drivers might need different padding in different places (i
> > > don't know?). what do you think of this approach?
> >
> > That could be worthwhile, though the headerlen calculation is called
> > very very often and adding another branch into it could very well impact
> > performance worse than doing the memmove.
>
> now you confuse me. this is exactly what i attempted with the previous patch,
> which you didn't like...
... for the reason that you put it into mac80211 rather than the atheros
driver that requires this workaround.
johannes
On Monday 15 October 2007 17:47:16 Johannes Berg wrote:
> > another solution i thought of was signalling the mac80211 layer that we
> > need padding which could then just adjust it's headerlen. but then
> > different drivers might need different padding in different places (i
> > don't know?). what do you think of this approach?
>
> That could be worthwhile, though the headerlen calculation is called
> very very often and adding another branch into it could very well impact
> performance worse than doing the memmove.
now you confuse me. this is exactly what i attempted with the previous patch,
which you didn't like...
unfortunately it's not possible to implement it there as a zero copy solution,
because ieee80211_rx_h_remove_qos_control messes with the header itself.
bruno
On Friday 12 October 2007 22:09:38 Johannes Berg wrote:
> On Fri, 2007-10-12 at 13:58 +0200, Johannes Berg wrote:
> > b43 also has padding added on RX which we simply remove with a memmove()
>
> Umm, no, this is wrong, the padding is in another place.
nevermind. i thought of this as an option as well.
but isn't doing a memmove() quite inefficient? especially since we are dealing
with QoS packets that might be important.
another solution i thought of was signalling the mac80211 layer that we need
padding which could then just adjust it's headerlen. but then different
drivers might need different padding in different places (i don't know?).
what do you think of this approach?
bruno