2011-04-01 11:45:31

by Vivek Natarajan

[permalink] [raw]
Subject: [RFC PATCH 1/2] mac80211: Check for queued frames before entering power save.

In a highly noisy environment, since the driver could not send
the packets out in 100ms, some applications stall and eventually
fail as mac80211 stops the netdevice queues for flush when power
save is triggered. Most of the WMM testcases fail in the Wifi
testing because of this.
Increasing the dynamic ps timeout to 200-300ms helps but in noisy
channel conditions even if there is a continuous tx traffic, mac80211
tries to go into PS. The new implementation checks for any frames
queued in the driver tx queues and based on that mac80211 decides
to go for power save. This also prevents redundant stopping of
netdevice queues which helps more applications to run properly.

Signed-off-by: Vivek Natarajan <[email protected]>
---
include/net/mac80211.h | 4 ++++
net/mac80211/driver-ops.h | 13 +++++++++++++
net/mac80211/driver-trace.h | 20 ++++++++++++++++++++
net/mac80211/mlme.c | 17 +++++++++--------
4 files changed, 46 insertions(+), 8 deletions(-)

diff --git a/include/net/mac80211.h b/include/net/mac80211.h
index cefe1b3..dca4d21 100644
--- a/include/net/mac80211.h
+++ b/include/net/mac80211.h
@@ -1808,6 +1808,9 @@ enum ieee80211_ampdu_mlme_action {
* @set_ringparam: Set tx and rx ring sizes.
*
* @get_ringparam: Get tx and rx ring current and maximum sizes.
+ *
+ * @tx_frames_pending: Check if there is any pending frame in the hardware
+ * queues before entering power save.
*/
struct ieee80211_ops {
void (*tx)(struct ieee80211_hw *hw, struct sk_buff *skb);
@@ -1895,6 +1898,7 @@ struct ieee80211_ops {
int (*set_ringparam)(struct ieee80211_hw *hw, u32 tx, u32 rx);
void (*get_ringparam)(struct ieee80211_hw *hw,
u32 *tx, u32 *tx_max, u32 *rx, u32 *rx_max);
+ bool (*tx_frames_pending)(struct ieee80211_hw *hw);
};

/**
diff --git a/net/mac80211/driver-ops.h b/net/mac80211/driver-ops.h
index 9c0d62b..00a0685 100644
--- a/net/mac80211/driver-ops.h
+++ b/net/mac80211/driver-ops.h
@@ -552,4 +552,17 @@ static inline void drv_get_ringparam(struct ieee80211_local *local,
trace_drv_return_void(local);
}

+static inline bool drv_tx_frames_pending(struct ieee80211_local *local)
+{
+ bool ret = false;
+
+ might_sleep();
+
+ trace_drv_tx_frames_pending(local);
+ if (local->ops->tx_frames_pending)
+ ret = local->ops->tx_frames_pending(&local->hw);
+ trace_drv_return_bool(local, ret);
+
+ return ret;
+}
#endif /* __MAC80211_DRIVER_OPS */
diff --git a/net/mac80211/driver-trace.h b/net/mac80211/driver-trace.h
index 45aab80..c8c934d 100644
--- a/net/mac80211/driver-trace.h
+++ b/net/mac80211/driver-trace.h
@@ -74,6 +74,21 @@ TRACE_EVENT(drv_return_int,
TP_printk(LOCAL_PR_FMT " - %d", LOCAL_PR_ARG, __entry->ret)
);

+TRACE_EVENT(drv_return_bool,
+ TP_PROTO(struct ieee80211_local *local, bool ret),
+ TP_ARGS(local, ret),
+ TP_STRUCT__entry(
+ LOCAL_ENTRY
+ __field(bool, ret)
+ ),
+ TP_fast_assign(
+ LOCAL_ASSIGN;
+ __entry->ret = ret;
+ ),
+ TP_printk(LOCAL_PR_FMT " - %s", LOCAL_PR_ARG, (__entry->ret) ?
+ "true" : "false")
+);
+
TRACE_EVENT(drv_return_u64,
TP_PROTO(struct ieee80211_local *local, u64 ret),
TP_ARGS(local, ret),
@@ -964,6 +979,11 @@ TRACE_EVENT(drv_get_ringparam,
)
);

+DEFINE_EVENT(local_only_evt, drv_tx_frames_pending,
+ TP_PROTO(struct ieee80211_local *local),
+ TP_ARGS(local)
+);
+
DEFINE_EVENT(local_only_evt, drv_offchannel_tx_cancel_wait,
TP_PROTO(struct ieee80211_local *local),
TP_ARGS(local)
diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c
index 64d92d5..fd49280 100644
--- a/net/mac80211/mlme.c
+++ b/net/mac80211/mlme.c
@@ -770,15 +770,16 @@ void ieee80211_dynamic_ps_enable_work(struct work_struct *work)
if ((local->hw.flags & IEEE80211_HW_PS_NULLFUNC_STACK) &&
(!(ifmgd->flags & IEEE80211_STA_NULLFUNC_ACKED))) {
netif_tx_stop_all_queues(sdata->dev);
- /*
- * Flush all the frames queued in the driver before
- * going to power save
- */
- drv_flush(local, false);
- ieee80211_send_nullfunc(local, sdata, 1);

- /* Flush once again to get the tx status of nullfunc frame */
- drv_flush(local, false);
+ if (drv_tx_frames_pending(local))
+ mod_timer(&local->dynamic_ps_timer, jiffies +
+ msecs_to_jiffies(
+ local->hw.conf.dynamic_ps_timeout));
+ else {
+ ieee80211_send_nullfunc(local, sdata, 1);
+ /* Flush to get the tx status of nullfunc frame */
+ drv_flush(local, false);
+ }
}

if (!((local->hw.flags & IEEE80211_HW_REPORTS_TX_ACK_STATUS) &&
--
1.6.3.3



2011-04-07 12:04:34

by Vivek Natarajan

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] mac80211: Check for queued frames before entering power save.

On Thu, Apr 7, 2011 at 5:21 PM, Johannes Berg <[email protected]> wrote:
> On Thu, 2011-04-07 at 17:17 +0530, Vivek Natarajan wrote:
>
>> > I'm not sure I understand. Where does the 100ms come from? This code
>> > flushes the TX queues, which can take as much time as it wants, no? How
>> > does it break applications?
>>
>> 100ms is the time after which the dynamic ps timer stops the netif
>> queues and flushes all the frames in the driver. Actually two factors
>> caused the application to timeout:
>> ?One is the delay in the flush() for which the netif queues are
>> stopped and eventually the application could not send out the frames.
>> The second is the dropping of frames in the flush() callback when it
>> reaches the timeout period.
>
> Is that an ath9k specific thing? mac80211 never invokes flush() with
> drop=true, at this point. I thought we needed it and added it, but I
> guess we don't really need it.

Currently, ath9k gives a timeout of 200 ms for the pending frames to
complete. After this timeout, the pending frames will be dropped even
if drop is set as false in mac80211.

Vivek.

2011-04-07 11:47:59

by Vivek Natarajan

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] mac80211: Check for queued frames before entering power save.

On Thu, Apr 7, 2011 at 4:03 PM, Johannes Berg <[email protected]> wrote:
> On Fri, 2011-04-01 at 17:42 +0530, Vivek Natarajan wrote:
>
>> Ideally the driver should enter power save only when there is no tx
>> frame. When there are about 10 APs in the environment the tx rate of
>> the driver drops and the application slows down since it has not yet
>> received ACKs for the frames already queued in the hardware. Since
>> this ACK may take more than 100ms, stopping the dev queues for
>> entering PS at this stage breaks applications, WMM test case in my
>> testing. If there are tx_frames already pending in the queue,
>> postponing the PS logic helps to avoid redundant queue stops. Since, I
>> could not find any other way in mac80211 to see if the driver has not
>> completed the transmission of any frame, a new API to check for
>> pending frames is used. When power save is enabled by default and in a
>> noisy environment, this API certainly helps in improving the average
>> throughput. Any other idea?
>
> I'm not sure I understand. Where does the 100ms come from? This code
> flushes the TX queues, which can take as much time as it wants, no? How
> does it break applications?

100ms is the time after which the dynamic ps timer stops the netif
queues and flushes all the frames in the driver. Actually two factors
caused the application to timeout:
One is the delay in the flush() for which the netif queues are
stopped and eventually the application could not send out the frames.
The second is the dropping of frames in the flush() callback when it
reaches the timeout period.

> I'll agree that entering powersave is pointless if that means you won't
> be able to transmit all frames. It seems to me that maybe instead we
> should give flush() a timeout, and if it can't complete in that time, we
> can postpone the powersave then?

This seems better as it addresses both the issues I listed above. The
flush() timeout should not be high enough for the application to
timeout and also flush() should not drop any frame. Maybe it can
return a status if the pending frames are not completed in that
timeout so that PS can be deferred. I will check this out.

Thanks,
Vivek.

2011-04-07 11:51:23

by Johannes Berg

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] mac80211: Check for queued frames before entering power save.

On Thu, 2011-04-07 at 17:17 +0530, Vivek Natarajan wrote:

> > I'm not sure I understand. Where does the 100ms come from? This code
> > flushes the TX queues, which can take as much time as it wants, no? How
> > does it break applications?
>
> 100ms is the time after which the dynamic ps timer stops the netif
> queues and flushes all the frames in the driver. Actually two factors
> caused the application to timeout:
> One is the delay in the flush() for which the netif queues are
> stopped and eventually the application could not send out the frames.
> The second is the dropping of frames in the flush() callback when it
> reaches the timeout period.

Is that an ath9k specific thing? mac80211 never invokes flush() with
drop=true, at this point. I thought we needed it and added it, but I
guess we don't really need it.

johannes



2011-04-01 12:12:10

by Vivek Natarajan

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] mac80211: Check for queued frames before entering power save.

On Fri, Apr 1, 2011 at 5:21 PM, Johannes Berg <[email protected]> wrote:
> On Fri, 2011-04-01 at 17:15 +0530, Vivek Natarajan wrote:
>> In a highly noisy environment, since the driver could not send
>> the packets out in 100ms, some applications stall and eventually
>> fail as mac80211 stops the netdevice queues for flush when power
>> save is triggered. Most of the WMM testcases fail in the Wifi
>> testing because of this.
>> Increasing the dynamic ps timeout to 200-300ms helps but in noisy
>> channel conditions even if there is a continuous tx traffic, mac80211
>> tries to go into PS. The new implementation checks for any frames
>> queued in the driver tx queues and based on that mac80211 decides
>> to go for power save. This also prevents redundant stopping of
>> netdevice queues which helps more applications to run properly.
>
> I think as an API change this needs more justification. Why does this
> need a new callback, for example?

Ideally the driver should enter power save only when there is no tx
frame. When there are about 10 APs in the environment the tx rate of
the driver drops and the application slows down since it has not yet
received ACKs for the frames already queued in the hardware. Since
this ACK may take more than 100ms, stopping the dev queues for
entering PS at this stage breaks applications, WMM test case in my
testing. If there are tx_frames already pending in the queue,
postponing the PS logic helps to avoid redundant queue stops. Since, I
could not find any other way in mac80211 to see if the driver has not
completed the transmission of any frame, a new API to check for
pending frames is used. When power save is enabled by default and in a
noisy environment, this API certainly helps in improving the average
throughput. Any other idea?

Vivek.

2011-04-01 11:45:35

by Vivek Natarajan

[permalink] [raw]
Subject: [RFC PATCH 2/2] ath9k: Implement dev_tx_frames_pending callback.

This function returns true if there is atleast one frame
in any one of the tx queues.

Signed-off-by: Vivek Natarajan <[email protected]>
---
drivers/net/wireless/ath/ath9k/main.c | 16 ++++++++++++++++
1 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c
index 6a41302..4506968 100644
--- a/drivers/net/wireless/ath/ath9k/main.c
+++ b/drivers/net/wireless/ath/ath9k/main.c
@@ -2186,6 +2186,21 @@ out:
ath9k_ps_restore(sc);
}

+static bool ath9k_tx_frames_pending(struct ieee80211_hw *hw)
+{
+ struct ath_softc *sc = hw->priv;
+ int i;
+
+ for (i = 0; i < ATH9K_NUM_TX_QUEUES; i++) {
+ if (!ATH_TXQ_SETUP(sc, i))
+ continue;
+
+ if (ath9k_has_pending_frames(sc, &sc->tx.txq[i]))
+ return true;
+ }
+ return false;
+}
+
struct ieee80211_ops ath9k_ops = {
.tx = ath9k_tx,
.start = ath9k_start,
@@ -2208,4 +2223,5 @@ struct ieee80211_ops ath9k_ops = {
.rfkill_poll = ath9k_rfkill_poll_state,
.set_coverage_class = ath9k_set_coverage_class,
.flush = ath9k_flush,
+ .tx_frames_pending = ath9k_tx_frames_pending,
};
--
1.6.3.3


2011-04-07 10:33:56

by Johannes Berg

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] mac80211: Check for queued frames before entering power save.

On Fri, 2011-04-01 at 17:42 +0530, Vivek Natarajan wrote:

> Ideally the driver should enter power save only when there is no tx
> frame. When there are about 10 APs in the environment the tx rate of
> the driver drops and the application slows down since it has not yet
> received ACKs for the frames already queued in the hardware. Since
> this ACK may take more than 100ms, stopping the dev queues for
> entering PS at this stage breaks applications, WMM test case in my
> testing. If there are tx_frames already pending in the queue,
> postponing the PS logic helps to avoid redundant queue stops. Since, I
> could not find any other way in mac80211 to see if the driver has not
> completed the transmission of any frame, a new API to check for
> pending frames is used. When power save is enabled by default and in a
> noisy environment, this API certainly helps in improving the average
> throughput. Any other idea?

I'm not sure I understand. Where does the 100ms come from? This code
flushes the TX queues, which can take as much time as it wants, no? How
does it break applications?

I'll agree that entering powersave is pointless if that means you won't
be able to transmit all frames. It seems to me that maybe instead we
should give flush() a timeout, and if it can't complete in that time, we
can postpone the powersave then?

johannes


2011-04-11 14:17:43

by Vivek Natarajan

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] mac80211: Check for queued frames before entering power save.

On Thu, Apr 7, 2011 at 5:17 PM, Vivek Natarajan <[email protected]> wrote:
> On Thu, Apr 7, 2011 at 4:03 PM, Johannes Berg <[email protected]> wrote:
>> I'll agree that entering powersave is pointless if that means you won't
>> be able to transmit all frames. It seems to me that maybe instead we
>> should give flush() a timeout, and if it can't complete in that time, we
>> can postpone the powersave then?
>
> This seems better as it addresses both the issues I listed above. The
> flush() timeout should not be high enough for the application to
> timeout and also flush() should not drop any frame. Maybe it can
> return a status if the pending frames are not completed in that
> timeout so that PS can be deferred. I will check this out.

I tried the following as explained above:

@@ -765,11 +766,15 @@ void ieee80211_dynamic_ps_enable_work(struct
work_struct *work)
* Flush all the frames queued in the driver before
* going to power save
*/
- drv_flush(local, false);
- ieee80211_send_nullfunc(local, sdata, 1);
+ if (drv_flush(local, false))
//FLUSH 1
+ mod_timer(&local->dynamic_ps_timer, jiffies +
+ msecs_to_jiffies(conf->dynamic_ps_timeout));
+ else {
+ ieee80211_send_nullfunc(local, sdata, 1);

- /* Flush once again to get the tx status of nullfunc frame */
- drv_flush(local, false);
+ /* Flush to get the tx status of nullfunc frame */
+ drv_flush(local, false); //FLUSH 2
+ }
}

if (!((local->hw.flags & IEEE80211_HW_REPORTS_TX_ACK_STATUS) &&

When there are pending frames in FLUSH 1, ps_timer is started which
will execute in 100ms. If FLUSH 2 also takes 200ms(flush timeout in
ath9k) to complete, dynamic ps timer will be triggered at once and the
netif queues continue to be stopped. Hence flush timeout should not be
more than the dynamic ps timeout for this case. So, postponing PS
based on flush is not working. Can we have the earlier implementation
of new callback to check for pending frames before stopping the queues
for PS?

Vivek.

2011-04-01 11:49:51

by Johannes Berg

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] mac80211: Check for queued frames before entering power save.

On Fri, 2011-04-01 at 17:15 +0530, Vivek Natarajan wrote:
> In a highly noisy environment, since the driver could not send
> the packets out in 100ms, some applications stall and eventually
> fail as mac80211 stops the netdevice queues for flush when power
> save is triggered. Most of the WMM testcases fail in the Wifi
> testing because of this.
> Increasing the dynamic ps timeout to 200-300ms helps but in noisy
> channel conditions even if there is a continuous tx traffic, mac80211
> tries to go into PS. The new implementation checks for any frames
> queued in the driver tx queues and based on that mac80211 decides
> to go for power save. This also prevents redundant stopping of
> netdevice queues which helps more applications to run properly.

I think as an API change this needs more justification. Why does this
need a new callback, for example?

johannes