2011-02-04 06:24:07

by Vivek Natarajan

[permalink] [raw]
Subject: [PATCH 1/2] ath9k: Drain txq before sending a nullfunc frame.

In a highly noisy environment, a data frame which is queued before
a nullfunc frame on a different hw queue may be sent over the air
after the tx completion of nullfunc frame. This causes the station
to enter power save and the AP to see the station as awake and
continues to send the data traffic. Fix this by draining all tx
queues before we send the null function frame with PM bit set.

Signed-off-by: Vivek Natarajan <[email protected]>
---
drivers/net/wireless/ath/ath9k/xmit.c | 10 +++++++++-
1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/xmit.c b/drivers/net/wireless/ath/ath9k/xmit.c
index 68a1c76..0cb6017 100644
--- a/drivers/net/wireless/ath/ath9k/xmit.c
+++ b/drivers/net/wireless/ath/ath9k/xmit.c
@@ -1710,7 +1710,6 @@ static struct ath_buf *ath_tx_setup_buffer(struct ieee80211_hw *hw,
bf->bf_buf_addr,
txq->axq_qnum);

-
return bf;
}

@@ -1813,6 +1812,15 @@ int ath_tx_start(struct ieee80211_hw *hw, struct sk_buff *skb,
if (unlikely(!bf))
return -ENOMEM;

+ if (ieee80211_is_nullfunc(hdr->frame_control) &&
+ ieee80211_has_pm(hdr->frame_control)) {
+ /* Drain all the pending frames before we send a nullfunc frame
+ * to avoid any power save state mismatch between the station
+ * and the AP.
+ */
+ ath_drain_all_txq(sc, false);
+ }
+
q = skb_get_queue_mapping(skb);
spin_lock_bh(&txq->axq_lock);
if (txq == sc->tx.txq_map[q] &&
--
1.6.3.3



2011-02-04 06:24:14

by Vivek Natarajan

[permalink] [raw]
Subject: [PATCH 2/2] mac80211: Fix a race on enabling power save.

There is a race of queuing a data frame before the tx completion
of nullfunc frame for enabling power save. This has caused a power
save state mismatch between the station and the AP. This patch
addresses this issue.

Signed-off-by: Vivek Natarajan <[email protected]>
---
net/mac80211/ieee80211_i.h | 1 +
net/mac80211/mlme.c | 8 ++++++--
net/mac80211/tx.c | 6 ++++++
3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h
index 533fd32..6ad97f6 100644
--- a/net/mac80211/ieee80211_i.h
+++ b/net/mac80211/ieee80211_i.h
@@ -346,6 +346,7 @@ enum ieee80211_sta_flags {
IEEE80211_STA_UAPSD_ENABLED = BIT(7),
IEEE80211_STA_NULLFUNC_ACKED = BIT(8),
IEEE80211_STA_RESET_SIGNAL_AVE = BIT(9),
+ IEEE80211_STA_PS_PENDING = BIT(10),
};

struct ieee80211_if_managed {
diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c
index e059b3a..45f736e 100644
--- a/net/mac80211/mlme.c
+++ b/net/mac80211/mlme.c
@@ -727,13 +727,17 @@ void ieee80211_dynamic_ps_enable_work(struct work_struct *work)
return;

if ((local->hw.flags & IEEE80211_HW_PS_NULLFUNC_STACK) &&
- (!(ifmgd->flags & IEEE80211_STA_NULLFUNC_ACKED)))
+ (!(ifmgd->flags & IEEE80211_STA_NULLFUNC_ACKED))) {
+ ifmgd->flags |= IEEE80211_STA_PS_PENDING;
ieee80211_send_nullfunc(local, sdata, 1);
+ }

if (!((local->hw.flags & IEEE80211_HW_REPORTS_TX_ACK_STATUS) &&
(local->hw.flags & IEEE80211_HW_PS_NULLFUNC_STACK)) ||
- (ifmgd->flags & IEEE80211_STA_NULLFUNC_ACKED)) {
+ ((ifmgd->flags & IEEE80211_STA_NULLFUNC_ACKED) &&
+ ifmgd->flags & IEEE80211_STA_PS_PENDING)) {
ifmgd->flags &= ~IEEE80211_STA_NULLFUNC_ACKED;
+ ifmgd->flags &= ~IEEE80211_STA_PS_PENDING;
local->hw.conf.flags |= IEEE80211_CONF_PS;
ieee80211_hw_config(local, IEEE80211_CONF_CHANGE_PS);
}
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index 8fbbc7a..25f576a 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -1574,6 +1574,7 @@ static void ieee80211_xmit(struct ieee80211_sub_if_data *sdata,
struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb);
struct ieee80211_hdr *hdr = (struct ieee80211_hdr *) skb->data;
struct ieee80211_sub_if_data *tmp_sdata;
+ struct ieee80211_if_managed *ifmgd;
int headroom;
bool may_encrypt;

@@ -1648,6 +1649,11 @@ static void ieee80211_xmit(struct ieee80211_sub_if_data *sdata,
return;
}

+ ifmgd = &sdata->u.mgd;
+ if (!(ieee80211_is_nullfunc(hdr->frame_control)) &&
+ (ifmgd->flags & IEEE80211_STA_PS_PENDING))
+ ifmgd->flags &= ~IEEE80211_STA_PS_PENDING;
+
ieee80211_set_qos_hdr(local, skb);
ieee80211_tx(sdata, skb, false);
rcu_read_unlock();
--
1.6.3.3


2011-02-04 07:37:58

by Luis R. Rodriguez

[permalink] [raw]
Subject: Re: [PATCH 2/2] mac80211: Fix a race on enabling power save.

On Thu, Feb 3, 2011 at 10:24 PM, Vivek Natarajan <[email protected]> wrote:
> There is a race of queuing a data frame before the tx completion
> of nullfunc frame for enabling power save. This has caused a power
> save state mismatch between the station and the AP. This patch
> addresses this issue.
>
> Signed-off-by: Vivek Natarajan <[email protected]>

Not stable fix? What't the effect of not having this patch? The
description could use some love to explain better what is observed in
terms of metrics or loss.

Luis

2011-02-04 11:07:25

by Sujith

[permalink] [raw]
Subject: Re: [PATCH 1/2] ath9k: Drain txq before sending a nullfunc frame.

Vivek Natarajan wrote:
> This is true for retry_tx set as false(for ath_draintxq) but if the
> retry is set, I suppose the frame will be in the hw queue till all the
> hw retries(20) and sw retries(10) (200 retries in total)are over. This
> will take a lot of time for completion if the channel is busy.

Time for completion ? We are _draining_ the pending frames, they are never
sent out ...

> I am thinking of calling ath_drain_all_txq() in flush() and the only
> query is should the retry_tx be set.

Not sure.

Sujith

2011-02-04 09:26:27

by Vivek Natarajan

[permalink] [raw]
Subject: Re: [PATCH 2/2] mac80211: Fix a race on enabling power save.

On Fri, Feb 4, 2011 at 1:07 PM, Luis R. Rodriguez <[email protected]> wrote:
> On Thu, Feb 3, 2011 at 10:24 PM, Vivek Natarajan <[email protected]> wrote:
>> There is a race of queuing a data frame before the tx completion
>> of nullfunc frame for enabling power save. This has caused a power
>> save state mismatch between the station and the AP. This patch
>> addresses this issue.
>>
>> Signed-off-by: Vivek Natarajan <[email protected]>
>
> Not stable fix? What't the effect of not having this patch? The
> description could use some love to explain better what is observed in
> terms of metrics or loss.

The issue is similar to the other patch I posted for ath9k. As there
is a power save state mismatch, AP continues to send the data traffic
whereas the station might be in sleep mode. I see a IxChariot
disconnect issue while testing continuous Rx UDP traffic. I will add
these details in the next version after the review of these two PS
related patches.

VIvek.

2011-02-09 19:06:52

by Kalle Valo

[permalink] [raw]
Subject: Re: [PATCH 1/2] ath9k: Drain txq before sending a nullfunc frame.

Vivek Natarajan <[email protected]> writes:

> We were considering yet another approach too:
> In this case, if a nullfunc frame for PS comes to the driver when
> there are pending frames in the hw queue(the frames queued before
> 100ms and still pending because of highly noisy environment), silently
> drop the frame so that mac80211 will try once again after 100ms to go
> to PS. The issue that I face here is, ath9k does not know whether this
> frame is actually for PS or for fake sleep before scanning.

There's also a third option. Keep sending the data frames after the
nullfunc frame but make sure that pm bit is set. But that would need
hardware support and someway to notify the driver about the pm state.

--
Kalle Valo

2011-02-04 10:47:50

by Vivek Natarajan

[permalink] [raw]
Subject: Re: [PATCH 1/2] ath9k: Drain txq before sending a nullfunc frame.

On Fri, Feb 4, 2011 at 3:53 PM, Sujith <[email protected]> wrote:
> Vivek Natarajan wrote:
>> So, will it be appropriate to add stop and wake queue while calling
>> ath_drain_all_txq?
>
> Am not sure. I happened to look at that comment and thought it a bit
> odd that the caller is expected to call wake_queues().
>
>> Currently, flush() is called only when mac80211 is moving to off
>> channel. Even if we call flush before sending nullfunc frame for power
>> save, a question arises if it should be done with drop = true or
>> false. ?If set as false, there is a possibility that the frames might
>> be still around in the hw when nullfunc is sent and the same issue
>> might pop up.
>
> "Still around in the hw" ?
>
> ath_drain_all_txq() will stop HW DMA first and then remove all the
> pending frames in all HW queues and add them to the free buffer list.
>

This is true for retry_tx set as false(for ath_draintxq) but if the
retry is set, I suppose the frame will be in the hw queue till all the
hw retries(20) and sw retries(10) (200 retries in total)are over. This
will take a lot of time for completion if the channel is busy.

> When flush() is called, the driver would empty its HW queues and then
> mac80211 would proceed to send the nullfunc frame. Isn't this the fix
> that you require ?

I am thinking of calling ath_drain_all_txq() in flush() and the only
query is should the retry_tx be set.

>> We were considering yet another approach too:
>> In this case, if a nullfunc frame for PS comes to the driver when
>> there are pending frames in the hw queue(the frames queued before
>> 100ms and still pending because of highly noisy environment), silently
>> drop the frame so that mac80211 will try once again after 100ms to go
>> to PS. The issue that I face here is, ath9k does not know whether this
>> frame is actually for PS or for fake sleep before scanning.
>
> Well, silently dropping frames is generally frowned upon. :)

return -ENOMEM is what I meant though mac80211 does not check the
return status here.

Vivek.

2011-02-04 09:17:29

by Vivek Natarajan

[permalink] [raw]
Subject: Re: [PATCH 1/2] ath9k: Drain txq before sending a nullfunc frame.

On Fri, Feb 4, 2011 at 1:58 PM, Sujith <[email protected]> wrote:
> Vivek Natarajan wrote:
>> In a highly noisy environment, a data frame which is queued before
>> a nullfunc frame on a different hw queue may be sent over the air
>> after the tx completion of nullfunc frame. This causes the station
>> to enter power save and the AP to see the station as awake and
>> continues to send the data traffic. Fix this by draining all tx
>> queues before we send the null function frame with PM bit set.
>>
>> Signed-off-by: Vivek Natarajan <[email protected]>
>> ---
>> ?drivers/net/wireless/ath/ath9k/xmit.c | ? 10 +++++++++-
>> ?1 files changed, 9 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/net/wireless/ath/ath9k/xmit.c b/drivers/net/wireless/ath/ath9k/xmit.c
>> index 68a1c76..0cb6017 100644
>> --- a/drivers/net/wireless/ath/ath9k/xmit.c
>> +++ b/drivers/net/wireless/ath/ath9k/xmit.c
>> @@ -1710,7 +1710,6 @@ static struct ath_buf *ath_tx_setup_buffer(struct ieee80211_hw *hw,
>> ? ? ? ? ? ? ? ? ? ? ? ? ? bf->bf_buf_addr,
>> ? ? ? ? ? ? ? ? ? ? ? ? ? txq->axq_qnum);
>>
>> -
>> ? ? ? return bf;
>> ?}
>>
>> @@ -1813,6 +1812,15 @@ int ath_tx_start(struct ieee80211_hw *hw, struct sk_buff *skb,
>> ? ? ? if (unlikely(!bf))
>> ? ? ? ? ? ? ? return -ENOMEM;
>>
>> + ? ? if (ieee80211_is_nullfunc(hdr->frame_control) &&
>> + ? ? ? ? ieee80211_has_pm(hdr->frame_control)) {
>> + ? ? ? ? ? ? /* Drain all the pending frames before we send a nullfunc frame
>> + ? ? ? ? ? ? ?* to avoid any power save state mismatch between the station
>> + ? ? ? ? ? ? ?* and the AP.
>> + ? ? ? ? ? ? ?*/
>> + ? ? ? ? ? ? ath_drain_all_txq(sc, false);
>> + ? ? }
>> +
>
> There is a comment about the caller waking queues up properly in ath_drain_all_txq(),
> I don't think that requirement is handled properly.

So, will it be appropriate to add stop and wake queue while calling
ath_drain_all_txq?

> Also, mac80211's flush() callback exists precisely for this purpose.

Currently, flush() is called only when mac80211 is moving to off
channel. Even if we call flush before sending nullfunc frame for power
save, a question arises if it should be done with drop = true or
false. If set as false, there is a possibility that the frames might
be still around in the hw when nullfunc is sent and the same issue
might pop up.
Drop is set as false for scanning. iwlwifi waits for 2sec for all the
packets to complete but for ath9k, since it is asynchronous, I am not
quite sure of how to wait for tx_completion of all pending packets.

> I think it would be better to use that rather than adding a check for every packet.

We were considering yet another approach too:
In this case, if a nullfunc frame for PS comes to the driver when
there are pending frames in the hw queue(the frames queued before
100ms and still pending because of highly noisy environment), silently
drop the frame so that mac80211 will try once again after 100ms to go
to PS. The issue that I face here is, ath9k does not know whether this
frame is actually for PS or for fake sleep before scanning.

Vivek.

2011-02-04 07:36:49

by Luis R. Rodriguez

[permalink] [raw]
Subject: Re: [PATCH 1/2] ath9k: Drain txq before sending a nullfunc frame.

Adding Greg for real this time.

Luis

On Thu, Feb 3, 2011 at 11:36 PM, Luis R. Rodriguez <[email protected]> wrote:
> On Thu, Feb 3, 2011 at 10:23 PM, Vivek Natarajan <[email protected]> wrote:
>> In a highly noisy environment, a data frame which is queued before
>> a nullfunc frame on a different hw queue may be sent over the air
>> after the tx completion of nullfunc frame. This causes the station
>> to enter power save and the AP to see the station as awake and
>> continues to send the data traffic. Fix this by draining all tx
>> queues before we send the null function frame with PM bit set.
>>
>> Signed-off-by: Vivek Natarajan <[email protected]>
>
> Hm nice, this is a good example of one of those random not-so-critical
> but still nice fixes which I wish would go to stable. John, Greg, any
> qualms for such things to go to stable if they apply? It was my
> impression we could send it, but I want to verify with both you guys
> so neither we or John get bashed if we try to send it as a stable fix.
>
> Vivek, is this applicable to stable kernels or does the hunk not even apply?
>
>  Luis
>
>> ---
>>  drivers/net/wireless/ath/ath9k/xmit.c |   10 +++++++++-
>>  1 files changed, 9 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/net/wireless/ath/ath9k/xmit.c b/drivers/net/wireless/ath/ath9k/xmit.c
>> index 68a1c76..0cb6017 100644
>> --- a/drivers/net/wireless/ath/ath9k/xmit.c
>> +++ b/drivers/net/wireless/ath/ath9k/xmit.c
>> @@ -1710,7 +1710,6 @@ static struct ath_buf *ath_tx_setup_buffer(struct ieee80211_hw *hw,
>>                            bf->bf_buf_addr,
>>                            txq->axq_qnum);
>>
>> -
>>        return bf;
>>  }
>>
>> @@ -1813,6 +1812,15 @@ int ath_tx_start(struct ieee80211_hw *hw, struct sk_buff *skb,
>>        if (unlikely(!bf))
>>                return -ENOMEM;
>>
>> +       if (ieee80211_is_nullfunc(hdr->frame_control) &&
>> +           ieee80211_has_pm(hdr->frame_control)) {
>> +               /* Drain all the pending frames before we send a nullfunc frame
>> +                * to avoid any power save state mismatch between the station
>> +                * and the AP.
>> +                */
>> +               ath_drain_all_txq(sc, false);
>> +       }
>> +
>>        q = skb_get_queue_mapping(skb);
>>        spin_lock_bh(&txq->axq_lock);
>>        if (txq == sc->tx.txq_map[q] &&
>> --
>> 1.6.3.3
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
>> the body of a message to [email protected]
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>

2011-02-04 14:30:22

by John W. Linville

[permalink] [raw]
Subject: Re: [PATCH 1/2] ath9k: Drain txq before sending a nullfunc frame.

On Thu, Feb 03, 2011 at 11:36:27PM -0800, Luis R. Rodriguez wrote:
> Adding Greg for real this time.
>
> Luis
>
> On Thu, Feb 3, 2011 at 11:36 PM, Luis R. Rodriguez <[email protected]> wrote:
> > On Thu, Feb 3, 2011 at 10:23 PM, Vivek Natarajan <[email protected]> wrote:
> >> In a highly noisy environment, a data frame which is queued before
> >> a nullfunc frame on a different hw queue may be sent over the air
> >> after the tx completion of nullfunc frame. This causes the station
> >> to enter power save and the AP to see the station as awake and
> >> continues to send the data traffic. Fix this by draining all tx
> >> queues before we send the null function frame with PM bit set.
> >>
> >> Signed-off-by: Vivek Natarajan <[email protected]>
> >
> > Hm nice, this is a good example of one of those random not-so-critical
> > but still nice fixes which I wish would go to stable. John, Greg, any
> > qualms for such things to go to stable if they apply? It was my
> > impression we could send it, but I want to verify with both you guys
> > so neither we or John get bashed if we try to send it as a stable fix.

I think the feeling is that it is silly to mark something for stable if
it is not important enough to be sent as a fix for the current release.

The standard for being sent for the current release seems to
vary a bit between different maintainers, the particular release
(i.e. how grumpy is Linus), and the point of the release cycle (i.e.
loser at -rc1, tighter at -rc6). But for the most part, to go to the
current release it needs to fix a bug rather than adding a feature.
The strictest definitions of "bug" include crashes, data corrupters,
and some (or possibly all) regressions. As usual, judging bug
"worthiness" is a bit of a judgment call.

All that said, I think the question is what grey area exists for
something that arrives late in the current release cycle but that
might still be viewed as a (minor) bug fix?

John
--
John W. Linville Someday the world will need a hero, and you
[email protected] might be all we have. Be ready.

2011-02-04 07:36:21

by Luis R. Rodriguez

[permalink] [raw]
Subject: Re: [PATCH 1/2] ath9k: Drain txq before sending a nullfunc frame.

On Thu, Feb 3, 2011 at 10:23 PM, Vivek Natarajan <[email protected]> wrote:
> In a highly noisy environment, a data frame which is queued before
> a nullfunc frame on a different hw queue may be sent over the air
> after the tx completion of nullfunc frame. This causes the station
> to enter power save and the AP to see the station as awake and
> continues to send the data traffic. Fix this by draining all tx
> queues before we send the null function frame with PM bit set.
>
> Signed-off-by: Vivek Natarajan <[email protected]>

Hm nice, this is a good example of one of those random not-so-critical
but still nice fixes which I wish would go to stable. John, Greg, any
qualms for such things to go to stable if they apply? It was my
impression we could send it, but I want to verify with both you guys
so neither we or John get bashed if we try to send it as a stable fix.

Vivek, is this applicable to stable kernels or does the hunk not even apply?

Luis

> ---
>  drivers/net/wireless/ath/ath9k/xmit.c |   10 +++++++++-
>  1 files changed, 9 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/net/wireless/ath/ath9k/xmit.c b/drivers/net/wireless/ath/ath9k/xmit.c
> index 68a1c76..0cb6017 100644
> --- a/drivers/net/wireless/ath/ath9k/xmit.c
> +++ b/drivers/net/wireless/ath/ath9k/xmit.c
> @@ -1710,7 +1710,6 @@ static struct ath_buf *ath_tx_setup_buffer(struct ieee80211_hw *hw,
>                            bf->bf_buf_addr,
>                            txq->axq_qnum);
>
> -
>        return bf;
>  }
>
> @@ -1813,6 +1812,15 @@ int ath_tx_start(struct ieee80211_hw *hw, struct sk_buff *skb,
>        if (unlikely(!bf))
>                return -ENOMEM;
>
> +       if (ieee80211_is_nullfunc(hdr->frame_control) &&
> +           ieee80211_has_pm(hdr->frame_control)) {
> +               /* Drain all the pending frames before we send a nullfunc frame
> +                * to avoid any power save state mismatch between the station
> +                * and the AP.
> +                */
> +               ath_drain_all_txq(sc, false);
> +       }
> +
>        q = skb_get_queue_mapping(skb);
>        spin_lock_bh(&txq->axq_lock);
>        if (txq == sc->tx.txq_map[q] &&
> --
> 1.6.3.3
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
> the body of a message to [email protected]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

2011-02-04 10:31:35

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH 2/2] mac80211: Fix a race on enabling power save.

On Fri, 2011-02-04 at 11:54 +0530, Vivek Natarajan wrote:

> @@ -1648,6 +1649,11 @@ static void ieee80211_xmit(struct ieee80211_sub_if_data *sdata,
> return;
> }
>
> + ifmgd = &sdata->u.mgd;
> + if (!(ieee80211_is_nullfunc(hdr->frame_control)) &&
> + (ifmgd->flags & IEEE80211_STA_PS_PENDING))
> + ifmgd->flags &= ~IEEE80211_STA_PS_PENDING;
> +

Without even trying to understand the logic of the patch, this is
clearly broken since it assumes that all interfaces are of managed type.

johannes


2011-02-04 14:56:02

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH 1/2] ath9k: Drain txq before sending a nullfunc frame.

On Fri, Feb 04, 2011 at 09:21:30AM -0500, John W. Linville wrote:
> On Thu, Feb 03, 2011 at 11:36:27PM -0800, Luis R. Rodriguez wrote:
> > Adding Greg for real this time.
> >
> > Luis
> >
> > On Thu, Feb 3, 2011 at 11:36 PM, Luis R. Rodriguez <[email protected]> wrote:
> > > On Thu, Feb 3, 2011 at 10:23 PM, Vivek Natarajan <[email protected]> wrote:
> > >> In a highly noisy environment, a data frame which is queued before
> > >> a nullfunc frame on a different hw queue may be sent over the air
> > >> after the tx completion of nullfunc frame. This causes the station
> > >> to enter power save and the AP to see the station as awake and
> > >> continues to send the data traffic. Fix this by draining all tx
> > >> queues before we send the null function frame with PM bit set.
> > >>
> > >> Signed-off-by: Vivek Natarajan <[email protected]>
> > >
> > > Hm nice, this is a good example of one of those random not-so-critical
> > > but still nice fixes which I wish would go to stable. John, Greg, any
> > > qualms for such things to go to stable if they apply? It was my
> > > impression we could send it, but I want to verify with both you guys
> > > so neither we or John get bashed if we try to send it as a stable fix.
>
> I think the feeling is that it is silly to mark something for stable if
> it is not important enough to be sent as a fix for the current release.

I totally agree.

thanks,

greg k-h

2011-02-04 08:28:53

by Sujith

[permalink] [raw]
Subject: [PATCH 1/2] ath9k: Drain txq before sending a nullfunc frame.

Vivek Natarajan wrote:
> In a highly noisy environment, a data frame which is queued before
> a nullfunc frame on a different hw queue may be sent over the air
> after the tx completion of nullfunc frame. This causes the station
> to enter power save and the AP to see the station as awake and
> continues to send the data traffic. Fix this by draining all tx
> queues before we send the null function frame with PM bit set.
>
> Signed-off-by: Vivek Natarajan <[email protected]>
> ---
> drivers/net/wireless/ath/ath9k/xmit.c | 10 +++++++++-
> 1 files changed, 9 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/net/wireless/ath/ath9k/xmit.c b/drivers/net/wireless/ath/ath9k/xmit.c
> index 68a1c76..0cb6017 100644
> --- a/drivers/net/wireless/ath/ath9k/xmit.c
> +++ b/drivers/net/wireless/ath/ath9k/xmit.c
> @@ -1710,7 +1710,6 @@ static struct ath_buf *ath_tx_setup_buffer(struct ieee80211_hw *hw,
> bf->bf_buf_addr,
> txq->axq_qnum);
>
> -
> return bf;
> }
>
> @@ -1813,6 +1812,15 @@ int ath_tx_start(struct ieee80211_hw *hw, struct sk_buff *skb,
> if (unlikely(!bf))
> return -ENOMEM;
>
> + if (ieee80211_is_nullfunc(hdr->frame_control) &&
> + ieee80211_has_pm(hdr->frame_control)) {
> + /* Drain all the pending frames before we send a nullfunc frame
> + * to avoid any power save state mismatch between the station
> + * and the AP.
> + */
> + ath_drain_all_txq(sc, false);
> + }
> +

There is a comment about the caller waking queues up properly in ath_drain_all_txq(),
I don't think that requirement is handled properly.

Also, mac80211's flush() callback exists precisely for this purpose.
I think it would be better to use that rather than adding a check for every packet.

Sujith

2011-02-04 10:23:37

by Sujith

[permalink] [raw]
Subject: Re: [PATCH 1/2] ath9k: Drain txq before sending a nullfunc frame.

Vivek Natarajan wrote:
> So, will it be appropriate to add stop and wake queue while calling
> ath_drain_all_txq?

Am not sure. I happened to look at that comment and thought it a bit
odd that the caller is expected to call wake_queues().

> Currently, flush() is called only when mac80211 is moving to off
> channel. Even if we call flush before sending nullfunc frame for power
> save, a question arises if it should be done with drop = true or
> false. If set as false, there is a possibility that the frames might
> be still around in the hw when nullfunc is sent and the same issue
> might pop up.

"Still around in the hw" ?

ath_drain_all_txq() will stop HW DMA first and then remove all the
pending frames in all HW queues and add them to the free buffer list.

When flush() is called, the driver would empty its HW queues and then
mac80211 would proceed to send the nullfunc frame. Isn't this the fix
that you require ?

> Drop is set as false for scanning. iwlwifi waits for 2sec for all the
> packets to complete but for ath9k, since it is asynchronous, I am not
> quite sure of how to wait for tx_completion of all pending packets.

Not sure I understand. Why should we wait for completion of pending packets
when we have just drained them ?

> We were considering yet another approach too:
> In this case, if a nullfunc frame for PS comes to the driver when
> there are pending frames in the hw queue(the frames queued before
> 100ms and still pending because of highly noisy environment), silently
> drop the frame so that mac80211 will try once again after 100ms to go
> to PS. The issue that I face here is, ath9k does not know whether this
> frame is actually for PS or for fake sleep before scanning.

Well, silently dropping frames is generally frowned upon. :)

Sujith