This is an updated version of my previous patch series to move TXQ
scheduling into mac80211, allowing more drivers to take advantage of
airtime fairness scheduling. This version keeps compatibility with the
old wake_tx_queue API, so drivers can be ported to the new API one at a
time. For now, only ath9k is changed.
There's also an update to the scheduling to use a sequence number to
keep track of scheduling rounds, so drivers have a clean way to loop
through all queued TXQs exactly once without having to keep track of the
first TXQ seen in a round.
Finally, the airtime fairness scheduler also incorporates weights into
the scheduling, which allows userspace to implement policies for how
airtime is divided between stations. I have a patch series for hostapd
to implement configurable policies, that I will post separately once the
kernel side is in place.
---
Toke Høiland-Jørgensen (4):
mac80211: Add TXQ scheduling API
mac80211: Add airtime accounting and scheduling to TXQs
cfg80211: Add airtime statistics and settings
ath9k: Switch to mac80211 TXQ scheduling and airtime APIs
drivers/net/wireless/ath/ath9k/ath9k.h | 14 --
drivers/net/wireless/ath/ath9k/debug.c | 3
drivers/net/wireless/ath/ath9k/debug.h | 8 -
drivers/net/wireless/ath/ath9k/debug_sta.c | 54 ------
drivers/net/wireless/ath/ath9k/init.c | 3
drivers/net/wireless/ath/ath9k/recv.c | 9 -
drivers/net/wireless/ath/ath9k/xmit.c | 245 ++++++++--------------------
include/net/cfg80211.h | 15 +-
include/net/mac80211.h | 78 ++++++++-
include/uapi/linux/nl80211.h | 17 ++
net/mac80211/agg-tx.c | 2
net/mac80211/cfg.c | 6 +
net/mac80211/debugfs.c | 3
net/mac80211/debugfs_sta.c | 35 ++++
net/mac80211/ieee80211_i.h | 10 +
net/mac80211/main.c | 5 +
net/mac80211/sta_info.c | 42 +++++
net/mac80211/sta_info.h | 13 +
net/mac80211/tx.c | 101 ++++++++++++
net/wireless/core.c | 2
net/wireless/nl80211.c | 50 ++++++
21 files changed, 449 insertions(+), 266 deletions(-)
On 2018-07-11 13:48, Toke Høiland-Jørgensen wrote:
> Rajkumar Manoharan <[email protected]> writes:
>
>> On 2018-07-09 09:37, Toke Høiland-Jørgensen wrote:
>> [...]
>>> +/**
>>> + * ieee80211_schedule_txq - add txq to scheduling loop
>>> + *
>>> + * @hw: pointer as obtained from ieee80211_alloc_hw()
>>> + * @txq: pointer obtained from station or virtual interface
>>> + * @reset_seqno: Whether to reset the internal scheduling sequence
>>> number,
>>> + * allowing this txq to appear again in the current
>>> scheduling
>>> + * round (see doc for ieee80211_next_txq()).
>>> + *
>>> + * Returns %true if the txq was actually added to the scheduling,
>>> + * %false otherwise.
>>> + */
>>> +bool ieee80211_schedule_txq(struct ieee80211_hw *hw,
>>> + struct ieee80211_txq *txq,
>>> + bool reset_seqno);
>>> +
>>> +/**
>>> + * ieee80211_next_txq - get next tx queue to pull packets from
>>> + *
>>> + * @hw: pointer as obtained from ieee80211_alloc_hw()
>>> + * @ac: filter returned txqs with this AC number. Pass -1 for no
>>> filtering.
>>> + * @inc_seqno: Whether to increase the scheduling sequence number.
>>> Setting this
>>> + * to true signifies the start of a new scheduling
>>> round.
>>> Each TXQ
>>> + * will only be returned exactly once in each round
>>> (unless its
>>> + * sequence number is explicitly reset when calling
>>> + * ieee80211_schedule_txq()).
>>> + *
>> Toke,
>>
>> Seems like seqno is internal to mac80211 and meant for active_txq list
>> manipulation. If so, why would drivers have to worry about increment
>> or resetting seqno?
>
> The drivers need to be able to signal when they start a new "scheduling
> round" (which is the parameter to next_txq()), and when a queue should
> be immediately rescheduled (which is the parameter to schedule_txq()).
> See the subsequent patch to ath9k for how this is used; the latter is
> signalled when a TXQ successfully dequeued an aggregate...
>
> Now, you're right that the choice to track this via a sequence number
> is
> an implementation detail internal to mac80211... so maybe the
> parameters
> should be called something different.
>
>> IMHO to avoid over serving same txq, two lists (activeq and waitq) can
>> be used and always add new txq into waitq list. So that driver will
>> not worry about mac80211 txq manipulation. Please correct me If Im
>> wrong.
>>
>> ieee80211_schedule_txq
>> - if schedule_order empty, add txq into waitq list tail.
>>
>> ieee80211_next_txq
>> - if activeq empty,
>> - move waitq list into activeq
>>
>> - if activeq not empty
>> - fetch appropriate txq from activeq
>> - remove txq from activeq list.
>>
>> - If txq found, return txq else return NULL
>
>
> Erm, how would this prevent an infinite loop? With this scheme, at some
> point, ieee80211_next_txq() removes the last txq from activeq, and
> returns that. Then, when it is called again the next time the driver
> loops, it's back to the first case (activeq empty, waitq non-empty); so
> it'll move waitq back as activeq and start over... Only the driver
> really knows when it is starting a logical "loop through all active
> TXQs".
>
Oops.. My bad.. The idea is that ieee80211_next_txq process txq from
activeq only
and keep processed txqs separately. Having single list eventually needs
tracking
mechanism. The point is that once activeq becomes empty, splice waitq
list and
return NULL. So that driver can break from the loop.
ieee80211_next_txq
- if activeq empty,
- move waitq list into activeq
- return NULL
- if activeq not empty
- fetch appropriate txq from activeq
- remove txq from activeq list.
- If txq found, return txq else return NULL
> Also, for airtime fairness, the queues are not scheduled strictly
> round-robin, so the dual-list scheme wouldn't work there either...
>
As you know, ath10k driver will operate in two tx modes (push-only,
push-pull).
These modes will be switched dynamically depends on firmware load or
resource
availability. In push-pull mode, firmware will query N number of frames
for set of
STA, TID. So the driver will directly dequeue N number of frames from
given txq.
In this case txq ordering alone wont help. I am planning to add below
changes in
exiting API and add new API ieee80211_reorder_txq.
ieee80211_txq_get_depth
- return deficit status along with frm_cnt
ieee80211_reorder_txq
- if txq deficit > 0
- return;
- if txq is last
- return
- delete txq from list
- move it to tail
- update deficit by quantum
ath10k_htt_rx_tx_fetch_ind
- get txq deficit status
- if txq deficit > 0
- dequeue skb
- else if deficit < 0
- return NULL
Please share your thoughts.
-Rajkumar
Rajkumar Manoharan <[email protected]> writes:
> On 2018-07-12 16:13, Toke H=C3=B8iland-J=C3=B8rgensen wrote:
>> Rajkumar Manoharan <[email protected]> writes:
>>=20
>>> On 2018-07-11 13:48, Toke H=C3=B8iland-J=C3=B8rgensen wrote:
>>>> Rajkumar Manoharan <[email protected]> writes:
>>>>=20
>>>>> On 2018-07-09 09:37, Toke H=C3=B8iland-J=C3=B8rgensen wrote:
>>>>> [...]
>>>>>> +/**
> [...]
>
>>>> Erm, how would this prevent an infinite loop? With this scheme, at=20
>>>> some
>>>> point, ieee80211_next_txq() removes the last txq from activeq, and
>>>> returns that. Then, when it is called again the next time the driver
>>>> loops, it's back to the first case (activeq empty, waitq non-empty);=20
>>>> so
>>>> it'll move waitq back as activeq and start over... Only the driver
>>>> really knows when it is starting a logical "loop through all active
>>>> TXQs".
>>>>=20
>>> Oops.. My bad.. The idea is that ieee80211_next_txq process txq from
>>> activeq only and keep processed txqs separately. Having single list
>>> eventually needs tracking mechanism. The point is that once activeq
>>> becomes empty, splice waitq list and return NULL. So that driver can
>>> break from the loop.
>>>=20
>>> ieee80211_next_txq
>>> - if activeq empty,
>>> - move waitq list into activeq
>>> - return NULL
>>>=20
>>> - if activeq not empty
>>> - fetch appropriate txq from activeq
>>> - remove txq from activeq list.
>>>=20
>>> - If txq found, return txq else return NULL
>>=20
>> Right, so this would ensure the driver only sees each TXQ once *if it
>> keeps looping*. But it doesn't necessarily; if the hardware queues fill
>> up, for instance, it might abort earlier. In that case it would need to
>> signal mac80211 that it is done for now, which is equivalent to
>> signalling when it starts a scheduler round.
>>=20
> Hmm... I thought driver will call ieee80211_schedule_txq when it runs
> out of hardware descriptor and break the loop. The serving txq will be
> added back to head of activeq list. no?
Yes, and then the next one will be serviced... It's basically:
while (!hwq_is_full()) {
txq =3D next_txq():
build_one_aggr(txq); // may or may not succeed
if (!empty(txq))
schedule_txq(txq);
}
It is not generally predictable how many times this will loop before
exiting...
>>>> Also, for airtime fairness, the queues are not scheduled strictly
>>>> round-robin, so the dual-list scheme wouldn't work there either...
>>>>=20
>>> As you know, ath10k driver will operate in two tx modes (push-only,
>>> push-pull). These modes will be switched dynamically depends on
>>> firmware load or resource availability. In push-pull mode, firmware
>>> will query N number of frames for set of STA, TID.
>>=20
>> Ah, so it will look up the TXQ without looping through the list of
>> pending queues at all? Didn't realise that is what it's doing; yeah,
>> that would be problematic for airtime fairness :)
>>=20
>>> So the driver will directly dequeue N number of frames from given txq.
>>> In this case txq ordering alone wont help. I am planning to add below
>>> changes in exiting API and add new API ieee80211_reorder_txq.
>>>=20
>>> ieee80211_txq_get_depth
>>> - return deficit status along with frm_cnt
>>>=20
>>> ieee80211_reorder_txq
>>> - if txq deficit > 0
>>> - return;
>>> - if txq is last
>>> - return
>>> - delete txq from list
>>> - move it to tail
>>> - update deficit by quantum
>>>=20
>>> ath10k_htt_rx_tx_fetch_ind
>>> - get txq deficit status
>>> - if txq deficit > 0
>>> - dequeue skb
>>> - else if deficit < 0
>>> - return NULL
>>>=20
>>> Please share your thoughts.
>>=20
>> Hmm, not sure exactly how this would work; seems a little complicated?
>> Also, I'd rather if drivers were completely oblivious to the deficit;
>> that is a bit of an implementation detail...
>>=20
> Agree.. Initially I thought of adding deficit check in=20
> ieee80211_tx_dequeue.
> But It will be overhead of taking activeq_lock for every skbs. Perhaps
> it can be renamed as allowed_to_dequeue instead of deficit.
>
>> We could have an ieee80211_txq_may_pull(); or maybe just have
>> ieee80211_tx_dequeue() return NULL if the deficit is negative?
>>=20
> As I said earlier, checking deficit for every skb will be an overhead.
> It should be done once before accessing txq.
Well, it could conceivably be done in a way that doesn't require taking
the activeq_lock. Adding another STOP flag to the TXQ, for instance.
>From an API point of view I think that is more consistent with what we
have already...
>> the reasonable thing for the driver to do, then, would be to ask
>> ieee80211_next_txq() for another TXQ to pull from if the current one
>> doesn't work for whatever reason.
>>=20
>> Would that work for push-pull mode?
>>=20
> Not really. Driver shouldn't send other txq data instead of asked one.
I didn't necessarily mean immediately. As long as it does it eventually.
If a TXQ's deficit runs negative, that TXQ will not be allowed to send
again until its deficit has been restored to positive through enough
cycles of the loop in next_txq().=20
> In MU-MIMO, firmware will query N packets from given set of {STA,TID}.
> So the driver not supposed to send other txq's data.
Hmm, it'll actually be interesting to see how the airtime fairness
scheduler interacts with MU-MIMO. I'm not quite sure that it'll be in a
good way; the DRR scheduler generally only restores one TXQ to positive
deficit at a time, so it may be that MU-MIMO will break completely and
we'll have to come up with another scheduling algorithm.
How does the firmware the airtime of a MU-MIMO transmission to each of
the involved stations?
-Toke
On 2018-07-13 06:39, Toke Høiland-Jørgensen wrote:
> Rajkumar Manoharan <[email protected]> writes:
>
[...]
>> Hmm... I thought driver will call ieee80211_schedule_txq when it runs
>> out of hardware descriptor and break the loop. The serving txq will be
>> added back to head of activeq list. no?
>
> Yes, and then the next one will be serviced... It's basically:
>
> while (!hwq_is_full()) {
> txq = next_txq():
> build_one_aggr(txq); // may or may not succeed
> if (!empty(txq))
> schedule_txq(txq);
> }
>
> It is not generally predictable how many times this will loop before
> exiting...
>
Agree.. It would be better If the driver does not worry about txq
sequence
numbering. Perhaps one more API (ieee80211_first_txq) could solve this.
Will leave it to you.
>>>>
>>>> ieee80211_txq_get_depth
>>>> - return deficit status along with frm_cnt
>>>>
>>>> ieee80211_reorder_txq
>>>> - if txq deficit > 0
>>>> - return;
>>>> - if txq is last
>>>> - return
>>>> - delete txq from list
>>>> - move it to tail
>>>> - update deficit by quantum
>>>>
>>>> ath10k_htt_rx_tx_fetch_ind
>>>> - get txq deficit status
>>>> - if txq deficit > 0
>>>> - dequeue skb
>>>> - else if deficit < 0
>>>> - return NULL
>>>>
>>>> Please share your thoughts.
>>>
>>> Hmm, not sure exactly how this would work; seems a little
>>> complicated?
>>> Also, I'd rather if drivers were completely oblivious to the deficit;
>>> that is a bit of an implementation detail...
>>>
>> Agree.. Initially I thought of adding deficit check in
>> ieee80211_tx_dequeue.
>> But It will be overhead of taking activeq_lock for every skbs. Perhaps
>> it can be renamed as allowed_to_dequeue instead of deficit.
>>
>>> We could have an ieee80211_txq_may_pull(); or maybe just have
>>> ieee80211_tx_dequeue() return NULL if the deficit is negative?
>>>
>> As I said earlier, checking deficit for every skb will be an overhead.
>> It should be done once before accessing txq.
>
> Well, it could conceivably be done in a way that doesn't require taking
> the activeq_lock. Adding another STOP flag to the TXQ, for instance.
> From an API point of view I think that is more consistent with what we
> have already...
>
Make sense. ieee80211_txq_may_pull would be better place to decide
whether
given txq is allowed for transmission. It also makes drivers do not have
to
worry about deficit. Still I may need ieee80211_reorder_txq API after
processing txq. isn't it?
>>> the reasonable thing for the driver to do, then, would be to ask
>>> ieee80211_next_txq() for another TXQ to pull from if the current one
>>> doesn't work for whatever reason.
>>>
>>> Would that work for push-pull mode?
>>>
>> Not really. Driver shouldn't send other txq data instead of asked one.
>
> I didn't necessarily mean immediately. As long as it does it
> eventually.
> If a TXQ's deficit runs negative, that TXQ will not be allowed to send
> again until its deficit has been restored to positive through enough
> cycles of the loop in next_txq().
>
Thats true. Are you suggesting to run the loop until the txq deficit
becomes
positive?
>> In MU-MIMO, firmware will query N packets from given set of {STA,TID}.
>> So the driver not supposed to send other txq's data.
>
> Hmm, it'll actually be interesting to see how the airtime fairness
> scheduler interacts with MU-MIMO. I'm not quite sure that it'll be in a
> good way; the DRR scheduler generally only restores one TXQ to positive
> deficit at a time, so it may be that MU-MIMO will break completely and
> we'll have to come up with another scheduling algorithm.
>
In push-pull method, driver reports to firmware that number of frames
queued for each tid per station by wake_tx_queue. Later firmware will
query
N frames from each TID and after dequeue driver will update remaining
frames
for that tid. In ATF case, when driver is not able to dequeue frames,
driver
will simply update remaining frames. The consecutive fetch_ind get
opportunity
to dequeue the frames. By This way, transmission for serving client will
be paused
for a while and opportunity will be given to others.
-Rajkumar
Rajkumar Manoharan <[email protected]> writes:
> On 2018-07-19 07:18, Toke H=C3=B8iland-J=C3=B8rgensen wrote:
>> Rajkumar Manoharan <[email protected]> writes:
>>=20
>>> On 2018-07-13 06:39, Toke H=C3=B8iland-J=C3=B8rgensen wrote:
>>>> Rajkumar Manoharan <[email protected]> writes:
>
>>>> It is not generally predictable how many times this will loop before
>>>> exiting...
>>>>=20
>>> Agree.. It would be better If the driver does not worry about txq
>>> sequence numbering. Perhaps one more API (ieee80211_first_txq) could
>>> solve this. Will leave it to you.
>>=20
>> That is basically what the second parameter to next_txq() does in the
>> current patchset. It just needs to be renamed...
>>=20
> Agree. As next_txq() increments seqno while starting loop for each AC,
> It seems bit confusing. i.e A single ath_txq_schedule_all call will
> bump schedule_seqno by 4. right?
Hmm, good point. Guess there should be one seqno per ac...
> Let assume below case where CPU1 is dequeuing skb from isr context and
> CPU2 is enqueuing skbs into same txq.
>
> CPU1 CPU2
> ---- ----
> ath_txq_schedule
> -> ieee80211_next_txq(first)
> -> inc_seq
> -> delete from list
> -> txq->seqno =3D local->seqno
> ieee80211_tx/fast_xmit
> -> ieee80211_queue_skb
> ->=20
> ieee80211_schedule_txq(reset_seqno)
> -> list_add
> -> txqi->seqno=
=20
> =3D local->seqno - 1
>
> In above sequence, it is quite possible that the same txq will be
> served repeatedly and other txqs will be starved. am I right? IMHO
> seqno mechanism will not guarantee that txqs will be processed only
> once in an iteration.
Yeah, ieee80211_queue_skb() shouldn't set reset_seqno; was experimenting
with that, and guess I ended up picking the wrong value. Thanks for
pointing that out :)
>>>> Well, it could conceivably be done in a way that doesn't require=20
>>>> taking
>>>> the activeq_lock. Adding another STOP flag to the TXQ, for instance.
>>>> From an API point of view I think that is more consistent with what=20
>>>> we
>>>> have already...
>>>>=20
>>>=20
>>> Make sense. ieee80211_txq_may_pull would be better place to decide
>>> whether given txq is allowed for transmission. It also makes drivers
>>> do not have to worry about deficit. Still I may need
>>> ieee80211_reorder_txq API after processing txq. isn't it?
>>=20
>> The way I was assuming this would work (and what ath9k does), is that a
>> driver only operates on one TXQ at a time; so it can get that txq,
>> process it, and re-schedule it. In which case no other API is needed;
>> the rotating can be done in next_txq(), and schedule_txq() can insert
>> the txq back into the rotation as needed.
>>=20
>> However, it sounds like this is not how ath10k does things? See below.
>>=20
> correct. The current rotation works only in push-only mode. i.e when
> firmware is not deciding txqs and the driver picks priority txq from
> active_txqs list. Unfortunately rotation won't work when the driver
> selects txq other than first in the list. In any case separate API
> needed for rotation when the driver is processing few packets from txq
> instead of all pending frames.
Rotation is not dependent on the TXQ draining completely. Dequeueing a
few packets, then rescheduling the TXQ is fine.
>> And if so, how does that interact with ath10k_mac_tx_push_pending()?
>> That function is basically doing the same thing that I explained above
>> for ath9k; in the previous version of this patch series I modified that
>> to use next_txq(). But is that a different TX path, or am I
>> misunderstanding you?
>>=20
>> If you could point me to which parts of the ath10k code I should be
>> looking at, that would be helpful as well :)
>>=20
> Depends on firmware htt_tx_mode (push/push_pull),
> ath10k_mac_tx_push_pending() downloads all/partial frames to firmware.
> Please take a look at ath10k_mac_tx_can_push() in push_pending(). Let
> me know If you need any other details.
Right, so looking at this, it looks like the driver will loop through
all the available TXQs, trying to dequeue from each of them if
ath10k_mac_tx_can_push() returns true, right? This should actually work
fine with the next_txq() / schedule_txq() API. Without airtime fairness
that will just translate into basically what the driver is doing now,
and I don't think more API functions are needed, as long as that is the
only point in the driver that pushes packets to the device...
With airtime fairness, what is going to happen is that the loop is only
going to get a single TXQ (the first one with a positive deficit), then
exit. Once that station has transmitted something and its deficit runs
negative, it'll get rotated and the next one will get a chance. So I
expect that airtime fairness will probably work, but MU-MIMO will
break...
Don't think we can do MU-MIMO with a DRR airtime scheduler; we'll have
to replace it with something different. But I think the same next_txq()
/ schedule_txq() API will work for that as well...
-Toke
This adds airtime statistics to the cfg80211 station dump, and also adds
a new parameter to set the airtime weight of each station. The latter
allows userspace to implement policies for different stations by varying
their weights.
Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
---
include/net/cfg80211.h | 15 +++++++++++--
include/uapi/linux/nl80211.h | 14 ++++++++++++
net/mac80211/cfg.c | 6 +++++
net/mac80211/main.c | 2 +-
net/mac80211/sta_info.c | 15 +++++++++++++
net/mac80211/sta_info.h | 4 ---
net/wireless/core.c | 2 ++
net/wireless/nl80211.c | 50 ++++++++++++++++++++++++++++++++++++++++++
8 files changed, 101 insertions(+), 7 deletions(-)
diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h
index 9ba1f289c439..1480eccbffe9 100644
--- a/include/net/cfg80211.h
+++ b/include/net/cfg80211.h
@@ -988,6 +988,7 @@ enum station_parameters_apply_mask {
* @support_p2p_ps: information if station supports P2P PS mechanism
* @he_capa: HE capabilities of station
* @he_capa_len: the length of the HE capabilities
+ * @airtime_weight: airtime scheduler weight for this station
*/
struct station_parameters {
const u8 *supported_rates;
@@ -1017,6 +1018,7 @@ struct station_parameters {
int support_p2p_ps;
const struct ieee80211_he_cap_elem *he_capa;
u8 he_capa_len;
+ u16 airtime_weight;
};
/**
@@ -1284,6 +1286,8 @@ struct cfg80211_tid_stats {
* @rx_beacon_signal_avg: signal strength average (in dBm) for beacons received
* from this peer
* @rx_duration: aggregate PPDU duration(usecs) for all the frames from a peer
+ * @tx_duration: aggregate PPDU duration(usecs) for all the frames to a peer
+ * @airtime_weight: current airtime scheduling weight
* @pertid: per-TID statistics, see &struct cfg80211_tid_stats, using the last
* (IEEE80211_NUM_TIDS) index for MSDUs not encapsulated in QoS-MPDUs.
* Note that this doesn't use the @filled bit, but is used if non-NULL.
@@ -1330,10 +1334,12 @@ struct station_info {
u32 expected_throughput;
- u64 rx_beacon;
+ u64 tx_duration;
u64 rx_duration;
+ u64 rx_beacon;
u8 rx_beacon_signal_avg;
struct cfg80211_tid_stats *pertid;
+ u16 airtime_weight;
s8 ack_signal;
s8 avg_ack_signal;
};
@@ -2347,7 +2353,8 @@ enum cfg80211_connect_params_changed {
* @WIPHY_PARAM_DYN_ACK: dynack has been enabled
* @WIPHY_PARAM_TXQ_LIMIT: TXQ packet limit has been changed
* @WIPHY_PARAM_TXQ_MEMORY_LIMIT: TXQ memory limit has been changed
- * @WIPHY_PARAM_TXQ_QUANTUM: TXQ scheduler quantum
+ * @WIPHY_PARAM_TXQ_QUANTUM: TXQ scheduler quantum (bytes)
+ * @WIPHY_PARAM_AIRTIME_QUANTUM: Airtime scheduler quantum (usec)
*/
enum wiphy_params_flags {
WIPHY_PARAM_RETRY_SHORT = 1 << 0,
@@ -2359,8 +2366,11 @@ enum wiphy_params_flags {
WIPHY_PARAM_TXQ_LIMIT = 1 << 6,
WIPHY_PARAM_TXQ_MEMORY_LIMIT = 1 << 7,
WIPHY_PARAM_TXQ_QUANTUM = 1 << 8,
+ WIPHY_PARAM_AIRTIME_QUANTUM = 1 << 9,
};
+#define IEEE80211_DEFAULT_AIRTIME_QUANTUM 300 /* usec */
+
/**
* struct cfg80211_pmksa - PMK Security Association
*
@@ -4104,6 +4114,7 @@ struct wiphy {
u32 txq_limit;
u32 txq_memory_limit;
u32 txq_quantum;
+ u16 airtime_quantum;
char priv[0] __aligned(NETDEV_ALIGN);
};
diff --git a/include/uapi/linux/nl80211.h b/include/uapi/linux/nl80211.h
index 28550d01948f..c3ac69640475 100644
--- a/include/uapi/linux/nl80211.h
+++ b/include/uapi/linux/nl80211.h
@@ -2241,6 +2241,11 @@ enum nl80211_commands {
* association request when used with NL80211_CMD_NEW_STATION). Can be set
* only if %NL80211_STA_FLAG_WME is set.
*
+ * @NL80211_ATTR_AIRTIME_WEIGHT: Station's weight when scheduled by the airtime
+ * scheduler.
+ * @NL80211_ATTR_AIRTIME_QUANTUM: Airtime scheduler quantum (usec). Airtime
+ * share given to each station on each round of the airtime scheduler.
+ *
* @NUM_NL80211_ATTR: total number of nl80211_attrs available
* @NL80211_ATTR_MAX: highest attribute number currently defined
* @__NL80211_ATTR_AFTER_LAST: internal use
@@ -2682,6 +2687,9 @@ enum nl80211_attrs {
NL80211_ATTR_HE_CAPABILITY,
+ NL80211_ATTR_AIRTIME_WEIGHT,
+ NL80211_ATTR_AIRTIME_QUANTUM,
+
/* add attributes here, update the policy in nl80211.c */
__NL80211_ATTR_AFTER_LAST,
@@ -3052,6 +3060,10 @@ enum nl80211_sta_bss_param {
* @NL80211_STA_INFO_ACK_SIGNAL: signal strength of the last ACK frame(u8, dBm)
* @NL80211_STA_INFO_DATA_ACK_SIGNAL_AVG: avg signal strength of (data)
* ACK frame (s8, dBm)
+ * @NL80211_STA_INFO_TX_DURATION: aggregate PPDU duration for all frames
+ * sent to the station (u64, usec)
+ * @NL80211_STA_INFO_AIRTIME_WEIGHT: current airtime weight for station
+ * (u16, usec)
* @__NL80211_STA_INFO_AFTER_LAST: internal
* @NL80211_STA_INFO_MAX: highest possible station info attribute
*/
@@ -3092,6 +3104,8 @@ enum nl80211_sta_info {
NL80211_STA_INFO_PAD,
NL80211_STA_INFO_ACK_SIGNAL,
NL80211_STA_INFO_DATA_ACK_SIGNAL_AVG,
+ NL80211_STA_INFO_TX_DURATION,
+ NL80211_STA_INFO_AIRTIME_WEIGHT,
/* keep last */
__NL80211_STA_INFO_AFTER_LAST,
diff --git a/net/mac80211/cfg.c b/net/mac80211/cfg.c
index 02f3672e7b5e..5467e17fb0b2 100644
--- a/net/mac80211/cfg.c
+++ b/net/mac80211/cfg.c
@@ -1431,6 +1431,9 @@ static int sta_apply_parameters(struct ieee80211_local *local,
if (ieee80211_vif_is_mesh(&sdata->vif))
sta_apply_mesh_params(local, sta, params);
+ if (params->airtime_weight)
+ sta->airtime.weight = params->airtime_weight;
+
/* set the STA state after all sta info from usermode has been set */
if (test_sta_flag(sta, WLAN_STA_TDLS_PEER) ||
set & BIT(NL80211_STA_FLAG_ASSOCIATED)) {
@@ -2386,6 +2389,9 @@ static int ieee80211_set_wiphy_params(struct wiphy *wiphy, u32 changed)
WIPHY_PARAM_TXQ_QUANTUM))
ieee80211_txq_set_params(local);
+ if (changed & WIPHY_PARAM_AIRTIME_QUANTUM)
+ local->airtime_quantum = wiphy->airtime_quantum;
+
return 0;
}
diff --git a/net/mac80211/main.c b/net/mac80211/main.c
index e597b22ce403..e65c2abb2a54 100644
--- a/net/mac80211/main.c
+++ b/net/mac80211/main.c
@@ -637,7 +637,7 @@ struct ieee80211_hw *ieee80211_alloc_hw_nm(size_t priv_data_len,
INIT_LIST_HEAD(&local->active_txqs);
spin_lock_init(&local->active_txq_lock);
local->airtime_flags = AIRTIME_USE_TX | AIRTIME_USE_RX;
- local->airtime_quantum = IEEE80211_AIRTIME_QUANTUM;
+ local->airtime_quantum = IEEE80211_DEFAULT_AIRTIME_QUANTUM;
INIT_LIST_HEAD(&local->chanctx_list);
mutex_init(&local->chanctx_mtx);
diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index 4e1e2628fe7d..f88f02df9e3a 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -2210,6 +2210,21 @@ void sta_set_sinfo(struct sta_info *sta, struct station_info *sinfo,
sinfo->filled |= BIT_ULL(NL80211_STA_INFO_TX_FAILED);
}
+ if (!(sinfo->filled & BIT(NL80211_STA_INFO_RX_DURATION))) {
+ sinfo->rx_duration = sta->airtime.rx_airtime;
+ sinfo->filled |= BIT(NL80211_STA_INFO_RX_DURATION);
+ }
+
+ if (!(sinfo->filled & BIT(NL80211_STA_INFO_TX_DURATION))) {
+ sinfo->tx_duration = sta->airtime.tx_airtime;
+ sinfo->filled |= BIT(NL80211_STA_INFO_TX_DURATION);
+ }
+
+ if (!(sinfo->filled & BIT(NL80211_STA_INFO_AIRTIME_WEIGHT))) {
+ sinfo->airtime_weight = sta->airtime.weight;
+ sinfo->filled |= BIT(NL80211_STA_INFO_AIRTIME_WEIGHT);
+ }
+
sinfo->rx_dropped_misc = sta->rx_stats.dropped;
if (sta->pcpu_rx_stats) {
for_each_possible_cpu(cpu) {
diff --git a/net/mac80211/sta_info.h b/net/mac80211/sta_info.h
index d87a6c31dcb6..d8210ebf0cdd 100644
--- a/net/mac80211/sta_info.h
+++ b/net/mac80211/sta_info.h
@@ -127,10 +127,6 @@ enum ieee80211_agg_stop_reason {
AGG_STOP_DESTROY_STA,
};
-/* How much to increase airtime deficit on each scheduling round, by default
- * (userspace can change this per phy)
- */
-#define IEEE80211_AIRTIME_QUANTUM 300 /* usec */
/* Debugfs flags to enable/disable use of RX/TX airtime in scheduler */
#define AIRTIME_USE_TX BIT(0)
#define AIRTIME_USE_RX BIT(1)
diff --git a/net/wireless/core.c b/net/wireless/core.c
index a88551f3bc43..6896cb6a3dd2 100644
--- a/net/wireless/core.c
+++ b/net/wireless/core.c
@@ -524,6 +524,8 @@ struct wiphy *wiphy_new_nm(const struct cfg80211_ops *ops, int sizeof_priv,
rdev->wiphy.max_sched_scan_plans = 1;
rdev->wiphy.max_sched_scan_plan_interval = U32_MAX;
+ rdev->wiphy.airtime_quantum = IEEE80211_DEFAULT_AIRTIME_QUANTUM;
+
return &rdev->wiphy;
}
EXPORT_SYMBOL(wiphy_new_nm);
diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
index 8db59129c095..293bc183ae91 100644
--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -428,8 +428,12 @@ static const struct nla_policy nl80211_policy[NUM_NL80211_ATTR] = {
[NL80211_ATTR_TXQ_LIMIT] = { .type = NLA_U32 },
[NL80211_ATTR_TXQ_MEMORY_LIMIT] = { .type = NLA_U32 },
[NL80211_ATTR_TXQ_QUANTUM] = { .type = NLA_U32 },
+
[NL80211_ATTR_HE_CAPABILITY] = { .type = NLA_BINARY,
.len = NL80211_HE_MAX_CAPABILITY_LEN },
+
+ [NL80211_ATTR_AIRTIME_WEIGHT] = { .type = NLA_U16 },
+ [NL80211_ATTR_AIRTIME_QUANTUM] = { .type = NLA_U16 },
};
/* policy for the key attributes */
@@ -2088,6 +2092,12 @@ static int nl80211_send_wiphy(struct cfg80211_registered_device *rdev,
goto nla_put_failure;
}
+ if (wiphy_ext_feature_isset(&rdev->wiphy,
+ NL80211_EXT_FEATURE_AIRTIME_FAIRNESS)
+ && nla_put_u16(msg, NL80211_ATTR_AIRTIME_QUANTUM,
+ rdev->wiphy.airtime_quantum))
+ goto nla_put_failure;
+
/* done */
state->split_start = 0;
break;
@@ -2466,6 +2476,7 @@ static int nl80211_set_wiphy(struct sk_buff *skb, struct genl_info *info)
u32 frag_threshold = 0, rts_threshold = 0;
u8 coverage_class = 0;
u32 txq_limit = 0, txq_memory_limit = 0, txq_quantum = 0;
+ u16 airtime_quantum = 0;
ASSERT_RTNL();
@@ -2699,11 +2710,23 @@ static int nl80211_set_wiphy(struct sk_buff *skb, struct genl_info *info)
changed |= WIPHY_PARAM_TXQ_QUANTUM;
}
+ if (info->attrs[NL80211_ATTR_AIRTIME_QUANTUM]) {
+ if (!wiphy_ext_feature_isset(&rdev->wiphy,
+ NL80211_EXT_FEATURE_AIRTIME_FAIRNESS))
+ return -EOPNOTSUPP;
+ airtime_quantum = nla_get_u16(
+ info->attrs[NL80211_ATTR_AIRTIME_QUANTUM]);
+ if (!airtime_quantum)
+ return -EINVAL;
+ changed |= WIPHY_PARAM_AIRTIME_QUANTUM;
+ }
+
if (changed) {
u8 old_retry_short, old_retry_long;
u32 old_frag_threshold, old_rts_threshold;
u8 old_coverage_class;
u32 old_txq_limit, old_txq_memory_limit, old_txq_quantum;
+ u16 old_airtime_quantum;
if (!rdev->ops->set_wiphy_params)
return -EOPNOTSUPP;
@@ -2716,6 +2739,7 @@ static int nl80211_set_wiphy(struct sk_buff *skb, struct genl_info *info)
old_txq_limit = rdev->wiphy.txq_limit;
old_txq_memory_limit = rdev->wiphy.txq_memory_limit;
old_txq_quantum = rdev->wiphy.txq_quantum;
+ old_airtime_quantum = rdev->wiphy.airtime_quantum;
if (changed & WIPHY_PARAM_RETRY_SHORT)
rdev->wiphy.retry_short = retry_short;
@@ -2733,6 +2757,8 @@ static int nl80211_set_wiphy(struct sk_buff *skb, struct genl_info *info)
rdev->wiphy.txq_memory_limit = txq_memory_limit;
if (changed & WIPHY_PARAM_TXQ_QUANTUM)
rdev->wiphy.txq_quantum = txq_quantum;
+ if (changed & WIPHY_PARAM_AIRTIME_QUANTUM)
+ rdev->wiphy.airtime_quantum = airtime_quantum;
result = rdev_set_wiphy_params(rdev, changed);
if (result) {
@@ -2744,6 +2770,7 @@ static int nl80211_set_wiphy(struct sk_buff *skb, struct genl_info *info)
rdev->wiphy.txq_limit = old_txq_limit;
rdev->wiphy.txq_memory_limit = old_txq_memory_limit;
rdev->wiphy.txq_quantum = old_txq_quantum;
+ rdev->wiphy.airtime_quantum = old_airtime_quantum;
return result;
}
}
@@ -4652,6 +4679,11 @@ static int nl80211_send_station(struct sk_buff *msg, u32 cmd, u32 portid,
PUT_SINFO(PLID, plid, u16);
PUT_SINFO(PLINK_STATE, plink_state, u8);
PUT_SINFO_U64(RX_DURATION, rx_duration);
+ PUT_SINFO_U64(TX_DURATION, tx_duration);
+
+ if (wiphy_ext_feature_isset(&rdev->wiphy,
+ NL80211_EXT_FEATURE_AIRTIME_FAIRNESS))
+ PUT_SINFO(AIRTIME_WEIGHT, airtime_weight, u16);
switch (rdev->wiphy.signal_type) {
case CFG80211_SIGNAL_TYPE_MBM:
@@ -5288,6 +5320,17 @@ static int nl80211_set_station(struct sk_buff *skb, struct genl_info *info)
nla_get_u8(info->attrs[NL80211_ATTR_OPMODE_NOTIF]);
}
+ if (info->attrs[NL80211_ATTR_AIRTIME_WEIGHT]) {
+ if (!wiphy_ext_feature_isset(&rdev->wiphy,
+ NL80211_EXT_FEATURE_AIRTIME_FAIRNESS))
+ return -EOPNOTSUPP;
+
+ params.airtime_weight =
+ nla_get_u16(info->attrs[NL80211_ATTR_AIRTIME_WEIGHT]);
+ if (!params.airtime_weight)
+ return -EINVAL;
+ }
+
/* Include parameters for TDLS peer (will check later) */
err = nl80211_set_station_tdls(info, ¶ms);
if (err)
@@ -5426,6 +5469,13 @@ static int nl80211_new_station(struct sk_buff *skb, struct genl_info *info)
return -EINVAL;
}
+ if (info->attrs[NL80211_ATTR_AIRTIME_WEIGHT]) {
+ params.airtime_weight =
+ nla_get_u16(info->attrs[NL80211_ATTR_AIRTIME_WEIGHT]);
+ if (!params.airtime_weight)
+ return -EINVAL;
+ }
+
err = nl80211_parse_sta_channel_info(info, ¶ms);
if (err)
return err;
On 2018-07-09 09:37, Toke Høiland-Jørgensen wrote:
[...]
> +/**
> + * ieee80211_schedule_txq - add txq to scheduling loop
> + *
> + * @hw: pointer as obtained from ieee80211_alloc_hw()
> + * @txq: pointer obtained from station or virtual interface
> + * @reset_seqno: Whether to reset the internal scheduling sequence
> number,
> + * allowing this txq to appear again in the current
> scheduling
> + * round (see doc for ieee80211_next_txq()).
> + *
> + * Returns %true if the txq was actually added to the scheduling,
> + * %false otherwise.
> + */
> +bool ieee80211_schedule_txq(struct ieee80211_hw *hw,
> + struct ieee80211_txq *txq,
> + bool reset_seqno);
> +
> +/**
> + * ieee80211_next_txq - get next tx queue to pull packets from
> + *
> + * @hw: pointer as obtained from ieee80211_alloc_hw()
> + * @ac: filter returned txqs with this AC number. Pass -1 for no
> filtering.
> + * @inc_seqno: Whether to increase the scheduling sequence number.
> Setting this
> + * to true signifies the start of a new scheduling round.
> Each TXQ
> + * will only be returned exactly once in each round
> (unless its
> + * sequence number is explicitly reset when calling
> + * ieee80211_schedule_txq()).
> + *
Toke,
Seems like seqno is internal to mac80211 and meant for active_txq list
manipulation.
If so, why would drivers have to worry about increment or resetting
seqno? IMHO to
avoid over serving same txq, two lists (activeq and waitq) can be used
and always
add new txq into waitq list. So that driver will not worry about
mac80211 txq
manipulation. Please correct me If Im wrong.
ieee80211_schedule_txq
- if schedule_order empty, add txq into waitq list tail.
ieee80211_next_txq
- if activeq empty,
- move waitq list into activeq
- if activeq not empty
- fetch appropriate txq from activeq
- remove txq from activeq list.
- If txq found, return txq else return NULL
-Rajkumar
Rajkumar Manoharan <[email protected]> writes:
> On 2018-07-13 06:39, Toke H=C3=B8iland-J=C3=B8rgensen wrote:
>> Rajkumar Manoharan <[email protected]> writes:
>>=20
> [...]
>
>>> Hmm... I thought driver will call ieee80211_schedule_txq when it runs
>>> out of hardware descriptor and break the loop. The serving txq will be
>>> added back to head of activeq list. no?
>>=20
>> Yes, and then the next one will be serviced... It's basically:
>>=20
>> while (!hwq_is_full()) {
>> txq =3D next_txq():
>> build_one_aggr(txq); // may or may not succeed
>> if (!empty(txq))
>> schedule_txq(txq);
>> }
>>=20
>> It is not generally predictable how many times this will loop before
>> exiting...
>>=20
> Agree.. It would be better If the driver does not worry about txq
> sequence numbering. Perhaps one more API (ieee80211_first_txq) could
> solve this. Will leave it to you.
That is basically what the second parameter to next_txq() does in the
current patchset. It just needs to be renamed...
>>>>>=20
>>>>> ieee80211_txq_get_depth
>>>>> - return deficit status along with frm_cnt
>>>>>=20
>>>>> ieee80211_reorder_txq
>>>>> - if txq deficit > 0
>>>>> - return;
>>>>> - if txq is last
>>>>> - return
>>>>> - delete txq from list
>>>>> - move it to tail
>>>>> - update deficit by quantum
>>>>>=20
>>>>> ath10k_htt_rx_tx_fetch_ind
>>>>> - get txq deficit status
>>>>> - if txq deficit > 0
>>>>> - dequeue skb
>>>>> - else if deficit < 0
>>>>> - return NULL
>>>>>=20
>>>>> Please share your thoughts.
>>>>=20
>>>> Hmm, not sure exactly how this would work; seems a little=20
>>>> complicated?
>>>> Also, I'd rather if drivers were completely oblivious to the deficit;
>>>> that is a bit of an implementation detail...
>>>>=20
>>> Agree.. Initially I thought of adding deficit check in
>>> ieee80211_tx_dequeue.
>>> But It will be overhead of taking activeq_lock for every skbs. Perhaps
>>> it can be renamed as allowed_to_dequeue instead of deficit.
>>>=20
>>>> We could have an ieee80211_txq_may_pull(); or maybe just have
>>>> ieee80211_tx_dequeue() return NULL if the deficit is negative?
>>>>=20
>>> As I said earlier, checking deficit for every skb will be an overhead.
>>> It should be done once before accessing txq.
>>=20
>> Well, it could conceivably be done in a way that doesn't require taking
>> the activeq_lock. Adding another STOP flag to the TXQ, for instance.
>> From an API point of view I think that is more consistent with what we
>> have already...
>>=20
>
> Make sense. ieee80211_txq_may_pull would be better place to decide
> whether given txq is allowed for transmission. It also makes drivers
> do not have to worry about deficit. Still I may need
> ieee80211_reorder_txq API after processing txq. isn't it?
The way I was assuming this would work (and what ath9k does), is that a
driver only operates on one TXQ at a time; so it can get that txq,
process it, and re-schedule it. In which case no other API is needed;
the rotating can be done in next_txq(), and schedule_txq() can insert
the txq back into the rotation as needed.
However, it sounds like this is not how ath10k does things? See below.
>>>> the reasonable thing for the driver to do, then, would be to ask
>>>> ieee80211_next_txq() for another TXQ to pull from if the current one
>>>> doesn't work for whatever reason.
>>>>=20
>>>> Would that work for push-pull mode?
>>>>=20
>>> Not really. Driver shouldn't send other txq data instead of asked one.
>>=20
>> I didn't necessarily mean immediately. As long as it does it=20
>> eventually.
>> If a TXQ's deficit runs negative, that TXQ will not be allowed to send
>> again until its deficit has been restored to positive through enough
>> cycles of the loop in next_txq().
>>=20
>
> Thats true. Are you suggesting to run the loop until the txq deficit
> becomes positive?
Yeah, or rather until the first station with a positive deficit is
found.
>>> In MU-MIMO, firmware will query N packets from given set of {STA,TID}.
>>> So the driver not supposed to send other txq's data.
>>=20
>> Hmm, it'll actually be interesting to see how the airtime fairness
>> scheduler interacts with MU-MIMO. I'm not quite sure that it'll be in a
>> good way; the DRR scheduler generally only restores one TXQ to positive
>> deficit at a time, so it may be that MU-MIMO will break completely and
>> we'll have to come up with another scheduling algorithm.
>>=20
>
> In push-pull method, driver reports to firmware that number of frames
> queued for each tid per station by wake_tx_queue. Later firmware will
> query N frames from each TID and after dequeue driver will update
> remaining frames for that tid. In ATF case, when driver is not able to
> dequeue frames, driver will simply update remaining frames. The
> consecutive fetch_ind get opportunity to dequeue the frames. By This
> way, transmission for serving client will be paused for a while and
> opportunity will be given to others.
This sounds like the driver doesn't do anything other than notify the
firmware that a txq has pending frames, and everything else is handled
by the firmware? Is this the case?
And if so, how does that interact with ath10k_mac_tx_push_pending()?
That function is basically doing the same thing that I explained above
for ath9k; in the previous version of this patch series I modified that
to use next_txq(). But is that a different TX path, or am I
misunderstanding you?
If you could point me to which parts of the ath10k code I should be
looking at, that would be helpful as well :)
-Toke
On 2018-07-21 13:54, Toke Høiland-Jørgensen wrote:
> Rajkumar Manoharan <[email protected]> writes:
>
>> On 2018-07-19 07:18, Toke Høiland-Jørgensen wrote:
>>> Rajkumar Manoharan <[email protected]> writes:
>>>
>>>> On 2018-07-13 06:39, Toke Høiland-Jørgensen wrote:
>>>>> Rajkumar Manoharan <[email protected]> writes:
>>
>>>>> It is not generally predictable how many times this will loop
>>>>> before
>>>>> exiting...
>>>>>
>>>> Agree.. It would be better If the driver does not worry about txq
>>>> sequence numbering. Perhaps one more API (ieee80211_first_txq) could
>>>> solve this. Will leave it to you.
>>>
>>> That is basically what the second parameter to next_txq() does in the
>>> current patchset. It just needs to be renamed...
>>>
>> Agree. As next_txq() increments seqno while starting loop for each AC,
>> It seems bit confusing. i.e A single ath_txq_schedule_all call will
>> bump schedule_seqno by 4. right?
>
> Hmm, good point. Guess there should be one seqno per ac...
>
I would prefer to keep separate list for each AC ;) Prepared one on top
of
your earlier merged changes. Now it needs revisit.
>> Let assume below case where CPU1 is dequeuing skb from isr context and
>> CPU2 is enqueuing skbs into same txq.
>>
>> CPU1 CPU2
>> ---- ----
>> ath_txq_schedule
>> -> ieee80211_next_txq(first)
>> -> inc_seq
>> -> delete from list
>> -> txq->seqno = local->seqno
>> ieee80211_tx/fast_xmit
>> ->
>> ieee80211_queue_skb
>> ->
>> ieee80211_schedule_txq(reset_seqno)
>> -> list_add
>> ->
>> txqi->seqno
>> = local->seqno - 1
>>
>> In above sequence, it is quite possible that the same txq will be
>> served repeatedly and other txqs will be starved. am I right? IMHO
>> seqno mechanism will not guarantee that txqs will be processed only
>> once in an iteration.
>
> Yeah, ieee80211_queue_skb() shouldn't set reset_seqno; was
> experimenting
> with that, and guess I ended up picking the wrong value. Thanks for
> pointing that out :)
>
A simple change in argument may break algo. What would be seqno of first
packet of txq if queue_skb() isn't reset seqno? I am fine in passing
start
of loop as arg to next_ntx(). But prefer to keep loop detecting within
mac80211 itself by tracking txq same as ath10k. So that no change is
needed
in schedule_txq() arg list. thoughts?
>>>>> Well, it could conceivably be done in a way that doesn't require
>>>>> taking
>>>>> the activeq_lock. Adding another STOP flag to the TXQ, for
>>>>> instance.
>>>>> From an API point of view I think that is more consistent with what
>>>>> we
>>>>> have already...
>>>>>
>>>>
>>>> Make sense. ieee80211_txq_may_pull would be better place to decide
>>>> whether given txq is allowed for transmission. It also makes drivers
>>>> do not have to worry about deficit. Still I may need
>>>> ieee80211_reorder_txq API after processing txq. isn't it?
>>>
>>> The way I was assuming this would work (and what ath9k does), is that
>>> a
>>> driver only operates on one TXQ at a time; so it can get that txq,
>>> process it, and re-schedule it. In which case no other API is needed;
>>> the rotating can be done in next_txq(), and schedule_txq() can insert
>>> the txq back into the rotation as needed.
>>>
>>> However, it sounds like this is not how ath10k does things? See
>>> below.
>>>
>> correct. The current rotation works only in push-only mode. i.e when
>> firmware is not deciding txqs and the driver picks priority txq from
>> active_txqs list. Unfortunately rotation won't work when the driver
>> selects txq other than first in the list. In any case separate API
>> needed for rotation when the driver is processing few packets from txq
>> instead of all pending frames.
>
> Rotation is not dependent on the TXQ draining completely. Dequeueing a
> few packets, then rescheduling the TXQ is fine.
>
The problem is that when the driver accesses txq directly, it wont call
next_txq(). So the txq will not dequeued from list and schedule_txq()
wont
be effective. right?
ieee80211_txq_may_pull() - check whether txq is allowed for tx_dequeue()
that helps to keep deficit check in mac80211
itself
ieee80211_reorder_txq() - after dequeuing skb (in direct txq access),
txq will be deleted from list and if there are
pending
frames, it will be requeued.
>>> And if so, how does that interact with ath10k_mac_tx_push_pending()?
>>> That function is basically doing the same thing that I explained
>>> above
>>> for ath9k; in the previous version of this patch series I modified
>>> that
>>> to use next_txq(). But is that a different TX path, or am I
>>> misunderstanding you?
>>>
>>> If you could point me to which parts of the ath10k code I should be
>>> looking at, that would be helpful as well :)
>>>
>> Depends on firmware htt_tx_mode (push/push_pull),
>> ath10k_mac_tx_push_pending() downloads all/partial frames to firmware.
>> Please take a look at ath10k_mac_tx_can_push() in push_pending(). Let
>> me know If you need any other details.
>
> Right, so looking at this, it looks like the driver will loop through
> all the available TXQs, trying to dequeue from each of them if
> ath10k_mac_tx_can_push() returns true, right? This should actually work
> fine with the next_txq() / schedule_txq() API. Without airtime fairness
> that will just translate into basically what the driver is doing now,
> and I don't think more API functions are needed, as long as that is the
> only point in the driver that pushes packets to the device...
>
In push-only mode, this will work with next_txq() / schedule_txq() APIs.
In ath10k, packets are downloaded to firmware in three places.
wake_tx_queue()
tx_push_pending()
htt_rx_tx_fetch_ind() - do txq_lookup() and push_txq()
the above mentioned new APIs are needed to take care of fetch_ind().
> With airtime fairness, what is going to happen is that the loop is only
> going to get a single TXQ (the first one with a positive deficit), then
> exit. Once that station has transmitted something and its deficit runs
> negative, it'll get rotated and the next one will get a chance. So I
> expect that airtime fairness will probably work, but MU-MIMO will
> break...
>
Agree.. In your earlier series, you did similar changes in ath10k
especially
in wake_tx_queue and push_pending().
> Don't think we can do MU-MIMO with a DRR airtime scheduler; we'll have
> to replace it with something different. But I think the same next_txq()
> / schedule_txq() API will work for that as well...
>
As mentioned above, have to take of fetch_ind(). All 10.4 based chips
(QCA99x0,
QCA9984, QCA9888, QCA4019, etc.) operates in push-pull mode, when the
number of
connected station increased. As target does not have enough memory for
buffering
frames for each station, it relied on host memory. ath10k driver can not
identify
that whether the received fetch_ind is for MU-MIMO or regular
transmission.
If we don't handle this case, then ath10k driver can not claim mac80211
ATF support.
Agree that MU-MIMO won't work with DDR scheduler. and it will impact
MU-MIMO performace
in ath10k when mac80211 ATF is claimed by ath10k.
-Rajkumar
On 2018-07-30 15:48, Toke Høiland-Jørgensen wrote:
> Rajkumar Manoharan <[email protected]> writes:
>
> Hmm, the driver really shouldn't need to do any locking apart from
> using
> the next_txq() API. But I think you are right that the seqno mechanism
> doesn't play well with unserialised access to through next_txq() from
> multiple CPUs. I'll see what I can do about that, and also incorporate
> the other changes we've been discussing into a new RFC series.
>
Great.. :)
>> Hmm.. reorder_txq() API may not needed. Calling next_txq() takes care
>> of reordering list though the driver is accessing txqs directly.
>
> Right, as long as the next_txq() path is still called even while
> fetch_ind() is active, that should be fine.
>
I am still wondering how and when to refill deficit in case next_txq()
won't be called. :(
Will post RFC change on top of yours.
-Rajkumar
On 2018-07-12 16:13, Toke Høiland-Jørgensen wrote:
> Rajkumar Manoharan <[email protected]> writes:
>
>> On 2018-07-11 13:48, Toke Høiland-Jørgensen wrote:
>>> Rajkumar Manoharan <[email protected]> writes:
>>>
>>>> On 2018-07-09 09:37, Toke Høiland-Jørgensen wrote:
>>>> [...]
>>>>> +/**
[...]
>>> Erm, how would this prevent an infinite loop? With this scheme, at
>>> some
>>> point, ieee80211_next_txq() removes the last txq from activeq, and
>>> returns that. Then, when it is called again the next time the driver
>>> loops, it's back to the first case (activeq empty, waitq non-empty);
>>> so
>>> it'll move waitq back as activeq and start over... Only the driver
>>> really knows when it is starting a logical "loop through all active
>>> TXQs".
>>>
>> Oops.. My bad.. The idea is that ieee80211_next_txq process txq from
>> activeq only and keep processed txqs separately. Having single list
>> eventually needs tracking mechanism. The point is that once activeq
>> becomes empty, splice waitq list and return NULL. So that driver can
>> break from the loop.
>>
>> ieee80211_next_txq
>> - if activeq empty,
>> - move waitq list into activeq
>> - return NULL
>>
>> - if activeq not empty
>> - fetch appropriate txq from activeq
>> - remove txq from activeq list.
>>
>> - If txq found, return txq else return NULL
>
> Right, so this would ensure the driver only sees each TXQ once *if it
> keeps looping*. But it doesn't necessarily; if the hardware queues fill
> up, for instance, it might abort earlier. In that case it would need to
> signal mac80211 that it is done for now, which is equivalent to
> signalling when it starts a scheduler round.
>
Hmm... I thought driver will call ieee80211_schedule_txq when it runs
out
of hardware descriptor and break the loop. The serving txq will be added
back to head of activeq list. no?
>>> Also, for airtime fairness, the queues are not scheduled strictly
>>> round-robin, so the dual-list scheme wouldn't work there either...
>>>
>> As you know, ath10k driver will operate in two tx modes (push-only,
>> push-pull). These modes will be switched dynamically depends on
>> firmware load or resource availability. In push-pull mode, firmware
>> will query N number of frames for set of STA, TID.
>
> Ah, so it will look up the TXQ without looping through the list of
> pending queues at all? Didn't realise that is what it's doing; yeah,
> that would be problematic for airtime fairness :)
>
>> So the driver will directly dequeue N number of frames from given txq.
>> In this case txq ordering alone wont help. I am planning to add below
>> changes in exiting API and add new API ieee80211_reorder_txq.
>>
>> ieee80211_txq_get_depth
>> - return deficit status along with frm_cnt
>>
>> ieee80211_reorder_txq
>> - if txq deficit > 0
>> - return;
>> - if txq is last
>> - return
>> - delete txq from list
>> - move it to tail
>> - update deficit by quantum
>>
>> ath10k_htt_rx_tx_fetch_ind
>> - get txq deficit status
>> - if txq deficit > 0
>> - dequeue skb
>> - else if deficit < 0
>> - return NULL
>>
>> Please share your thoughts.
>
> Hmm, not sure exactly how this would work; seems a little complicated?
> Also, I'd rather if drivers were completely oblivious to the deficit;
> that is a bit of an implementation detail...
>
Agree.. Initially I thought of adding deficit check in
ieee80211_tx_dequeue.
But It will be overhead of taking activeq_lock for every skbs. Perhaps
it can be renamed as allowed_to_dequeue instead of deficit.
> We could have an ieee80211_txq_may_pull(); or maybe just have
> ieee80211_tx_dequeue() return NULL if the deficit is negative?
>
As I said earlier, checking deficit for every skb will be an overhead.
It should be done once before accessing txq.
> the reasonable thing for the driver to do, then, would be to ask
> ieee80211_next_txq() for another TXQ to pull from if the current one
> doesn't work for whatever reason.
>
> Would that work for push-pull mode?
>
Not really. Driver shouldn't send other txq data instead of asked one.
In MU-MIMO, firmware will query N packets from given set of {STA,TID}.
So the driver not supposed to send other txq's data.
-Rajkumar
Rajkumar Manoharan <[email protected]> writes:
> On 2018-07-11 13:48, Toke H=C3=B8iland-J=C3=B8rgensen wrote:
>> Rajkumar Manoharan <[email protected]> writes:
>>=20
>>> On 2018-07-09 09:37, Toke H=C3=B8iland-J=C3=B8rgensen wrote:
>>> [...]
>>>> +/**
>>>> + * ieee80211_schedule_txq - add txq to scheduling loop
>>>> + *
>>>> + * @hw: pointer as obtained from ieee80211_alloc_hw()
>>>> + * @txq: pointer obtained from station or virtual interface
>>>> + * @reset_seqno: Whether to reset the internal scheduling sequence
>>>> number,
>>>> + * allowing this txq to appear again in the current
>>>> scheduling
>>>> + * round (see doc for ieee80211_next_txq()).
>>>> + *
>>>> + * Returns %true if the txq was actually added to the scheduling,
>>>> + * %false otherwise.
>>>> + */
>>>> +bool ieee80211_schedule_txq(struct ieee80211_hw *hw,
>>>> + struct ieee80211_txq *txq,
>>>> + bool reset_seqno);
>>>> +
>>>> +/**
>>>> + * ieee80211_next_txq - get next tx queue to pull packets from
>>>> + *
>>>> + * @hw: pointer as obtained from ieee80211_alloc_hw()
>>>> + * @ac: filter returned txqs with this AC number. Pass -1 for no
>>>> filtering.
>>>> + * @inc_seqno: Whether to increase the scheduling sequence number.
>>>> Setting this
>>>> + * to true signifies the start of a new scheduling=20
>>>> round.
>>>> Each TXQ
>>>> + * will only be returned exactly once in each round
>>>> (unless its
>>>> + * sequence number is explicitly reset when calling
>>>> + * ieee80211_schedule_txq()).
>>>> + *
>>> Toke,
>>>=20
>>> Seems like seqno is internal to mac80211 and meant for active_txq list
>>> manipulation. If so, why would drivers have to worry about increment
>>> or resetting seqno?
>>=20
>> The drivers need to be able to signal when they start a new "scheduling
>> round" (which is the parameter to next_txq()), and when a queue should
>> be immediately rescheduled (which is the parameter to schedule_txq()).
>> See the subsequent patch to ath9k for how this is used; the latter is
>> signalled when a TXQ successfully dequeued an aggregate...
>>=20
>> Now, you're right that the choice to track this via a sequence number=20
>> is
>> an implementation detail internal to mac80211... so maybe the=20
>> parameters
>> should be called something different.
>>=20
>>> IMHO to avoid over serving same txq, two lists (activeq and waitq) can
>>> be used and always add new txq into waitq list. So that driver will
>>> not worry about mac80211 txq manipulation. Please correct me If Im
>>> wrong.
>>>=20
>>> ieee80211_schedule_txq
>>> - if schedule_order empty, add txq into waitq list tail.
>>>=20
>>> ieee80211_next_txq
>>> - if activeq empty,
>>> - move waitq list into activeq
>>>=20
>>> - if activeq not empty
>>> - fetch appropriate txq from activeq
>>> - remove txq from activeq list.
>>>=20
>>> - If txq found, return txq else return NULL
>>=20
>>=20
>> Erm, how would this prevent an infinite loop? With this scheme, at some
>> point, ieee80211_next_txq() removes the last txq from activeq, and
>> returns that. Then, when it is called again the next time the driver
>> loops, it's back to the first case (activeq empty, waitq non-empty); so
>> it'll move waitq back as activeq and start over... Only the driver
>> really knows when it is starting a logical "loop through all active
>> TXQs".
>>=20
> Oops.. My bad.. The idea is that ieee80211_next_txq process txq from
> activeq only and keep processed txqs separately. Having single list
> eventually needs tracking mechanism. The point is that once activeq
> becomes empty, splice waitq list and return NULL. So that driver can
> break from the loop.
>
> ieee80211_next_txq
> - if activeq empty,
> - move waitq list into activeq
> - return NULL
>
> - if activeq not empty
> - fetch appropriate txq from activeq
> - remove txq from activeq list.
>
> - If txq found, return txq else return NULL
Right, so this would ensure the driver only sees each TXQ once *if it
keeps looping*. But it doesn't necessarily; if the hardware queues fill
up, for instance, it might abort earlier. In that case it would need to
signal mac80211 that it is done for now, which is equivalent to
signalling when it starts a scheduler round.
>> Also, for airtime fairness, the queues are not scheduled strictly
>> round-robin, so the dual-list scheme wouldn't work there either...
>>=20
> As you know, ath10k driver will operate in two tx modes (push-only,
> push-pull). These modes will be switched dynamically depends on
> firmware load or resource availability. In push-pull mode, firmware
> will query N number of frames for set of STA, TID.
Ah, so it will look up the TXQ without looping through the list of
pending queues at all? Didn't realise that is what it's doing; yeah,
that would be problematic for airtime fairness :)
> So the driver will directly dequeue N number of frames from given txq.
> In this case txq ordering alone wont help. I am planning to add below
> changes in exiting API and add new API ieee80211_reorder_txq.
>
> ieee80211_txq_get_depth
> - return deficit status along with frm_cnt
>
> ieee80211_reorder_txq
> - if txq deficit > 0
> - return;
> - if txq is last
> - return
> - delete txq from list
> - move it to tail
> - update deficit by quantum
>
> ath10k_htt_rx_tx_fetch_ind
> - get txq deficit status
> - if txq deficit > 0
> - dequeue skb
> - else if deficit < 0
> - return NULL
>
> Please share your thoughts.
Hmm, not sure exactly how this would work; seems a little complicated?
Also, I'd rather if drivers were completely oblivious to the deficit;
that is a bit of an implementation detail...
We could have an ieee80211_txq_may_pull(); or maybe just have
ieee80211_tx_dequeue() return NULL if the deficit is negative? I think
the reasonable thing for the driver to do, then, would be to ask
ieee80211_next_txq() for another TXQ to pull from if the current one
doesn't work for whatever reason.
Would that work for push-pull mode?
-Toke
Rajkumar Manoharan <[email protected]> writes:
>>> A simple change in argument may break algo. What would be seqno of
>>> first packet of txq if queue_skb() isn't reset seqno?
>>
>> It would be 0, which would be less than the current seqno in all cases
>> except just when the seqno counter wraps.
>>
>
> One point should be mentioned in comment section of next_txq() that
> the driver has to ensure that next_txq() iteration is serialized i.e
> no parallel txq traversal are encouraged. Though txqs_list is maintained
> in mac80211, parallel iteration may change the behavior. For example
> I thought of get rid of txqs_lock in ath10k as txqs_list is moved to
> mac80211.
> But it is still needed. right?
Hmm, the driver really shouldn't need to do any locking apart from using
the next_txq() API. But I think you are right that the seqno mechanism
doesn't play well with unserialised access to through next_txq() from
multiple CPUs. I'll see what I can do about that, and also incorporate
the other changes we've been discussing into a new RFC series.
> Hmm.. reorder_txq() API may not needed. Calling next_txq() takes care
> of reordering list though the driver is accessing txqs directly.
Right, as long as the next_txq() path is still called even while
fetch_ind() is active, that should be fine.
>>> If we don't handle this case, then ath10k driver can not claim
>>> mac80211 ATF support. Agree that MU-MIMO won't work with DDR
>>> scheduler. and it will impact MU-MIMO performace in ath10k when
>>> mac80211 ATF is claimed by ath10k.
>>
>> Yeah, so the question is if this is an acceptable tradeoff? Do you have
>> any idea how often MU-MIMO actually provides a benefit in the real
>> world?
>>
> Hmm.. We have some criteria of ~1.9 gain in Mu-MIMO test cases with 50
> 11ac clients.
Hmm, yeah, that would be a shame to lose; although I do think 50-client
deployments are still relatively rare for many people. So maybe we can
make airtime fairness something that can be switched on and off for
ath10k, depending on whether users think they will be needing MU-MIMO?
Until we come up with a scheduler that can handle it, of course...
-Toke
This adds an API to mac80211 to handle scheduling of TXQs. The API
consists of two new functions: ieee80211_next_txq() and
ieee80211_schedule_txq(). The former returns the next TXQ that should be
scheduled, and is how the driver gets a queue to pull packets from. The
latter is called internally by mac80211 to start scheduling a queue, and
the driver is supposed to call it to re-schedule the TXQ after it is
finished pulling packets from it (unless the queue emptied). Drivers can
optionally filter TXQs on ac to support per-AC hardware queue designs,
and a sequence number mechanism is used to support drivers looping over
all available TXQs exactly once.
Using this API allows drivers to take advantage of mac80211 scheduling
features such as airtime fairness (added in a subsequent commit).
However, usage of the new API is optional, so support can be added to
individual drivers one at a time.
Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
---
include/net/mac80211.h | 50 +++++++++++++++++++++++++++++---
net/mac80211/agg-tx.c | 2 +
net/mac80211/ieee80211_i.h | 7 ++++
net/mac80211/main.c | 3 ++
net/mac80211/sta_info.c | 3 ++
net/mac80211/tx.c | 69 ++++++++++++++++++++++++++++++++++++++++++++
6 files changed, 130 insertions(+), 4 deletions(-)
diff --git a/include/net/mac80211.h b/include/net/mac80211.h
index 5790f55c241d..18e43193b614 100644
--- a/include/net/mac80211.h
+++ b/include/net/mac80211.h
@@ -107,9 +107,16 @@
* The driver is expected to initialize its private per-queue data for stations
* and interfaces in the .add_interface and .sta_add ops.
*
- * The driver can't access the queue directly. To dequeue a frame, it calls
- * ieee80211_tx_dequeue(). Whenever mac80211 adds a new frame to a queue, it
- * calls the .wake_tx_queue driver op.
+ * The driver can't access the queue directly. To dequeue a frame from a
+ * txq, it calls ieee80211_tx_dequeue(). Whenever mac80211 adds a new frame to a
+ * queue, it calls the .wake_tx_queue driver op.
+ *
+ * Drivers can optionally delegate responsibility for scheduling queues to
+ * mac80211, to take advantage of airtime fairness accounting. In this case, to
+ * obtain the next queue to pull frames from, the driver calls
+ * ieee80211_next_txq(). The driver is then expected to re-schedule the txq
+ * using ieee80211_schedule_txq() if it is still active after the driver has
+ * finished pulling packets from it.
*
* For AP powersave TIM handling, the driver only needs to indicate if it has
* buffered packets in the driver specific data structures by calling
@@ -5971,13 +5978,48 @@ void ieee80211_unreserve_tid(struct ieee80211_sta *sta, u8 tid);
* ieee80211_tx_dequeue - dequeue a packet from a software tx queue
*
* @hw: pointer as obtained from ieee80211_alloc_hw()
- * @txq: pointer obtained from station or virtual interface
+ * @txq: pointer obtained from station or virtual interface, or from
+ * ieee80211_next_txq()
*
* Returns the skb if successful, %NULL if no frame was available.
*/
struct sk_buff *ieee80211_tx_dequeue(struct ieee80211_hw *hw,
struct ieee80211_txq *txq);
+/**
+ * ieee80211_schedule_txq - add txq to scheduling loop
+ *
+ * @hw: pointer as obtained from ieee80211_alloc_hw()
+ * @txq: pointer obtained from station or virtual interface
+ * @reset_seqno: Whether to reset the internal scheduling sequence number,
+ * allowing this txq to appear again in the current scheduling
+ * round (see doc for ieee80211_next_txq()).
+ *
+ * Returns %true if the txq was actually added to the scheduling,
+ * %false otherwise.
+ */
+bool ieee80211_schedule_txq(struct ieee80211_hw *hw,
+ struct ieee80211_txq *txq,
+ bool reset_seqno);
+
+/**
+ * ieee80211_next_txq - get next tx queue to pull packets from
+ *
+ * @hw: pointer as obtained from ieee80211_alloc_hw()
+ * @ac: filter returned txqs with this AC number. Pass -1 for no filtering.
+ * @inc_seqno: Whether to increase the scheduling sequence number. Setting this
+ * to true signifies the start of a new scheduling round. Each TXQ
+ * will only be returned exactly once in each round (unless its
+ * sequence number is explicitly reset when calling
+ * ieee80211_schedule_txq()).
+ *
+ * Returns the next txq if successful, %NULL if no queue is eligible. If a txq
+ * is returned, it will have been removed from the scheduler queue and needs to
+ * be re-scheduled with ieee80211_schedule_txq() to continue to be active.
+ */
+struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, s8 ac,
+ bool inc_seqno);
+
/**
* ieee80211_txq_get_depth - get pending frame/byte count of given txq
*
diff --git a/net/mac80211/agg-tx.c b/net/mac80211/agg-tx.c
index 69e831bc317b..0a2e0d64fc11 100644
--- a/net/mac80211/agg-tx.c
+++ b/net/mac80211/agg-tx.c
@@ -227,6 +227,8 @@ ieee80211_agg_start_txq(struct sta_info *sta, int tid, bool enable)
clear_bit(IEEE80211_TXQ_AMPDU, &txqi->flags);
clear_bit(IEEE80211_TXQ_STOP, &txqi->flags);
+ ieee80211_schedule_txq(&sta->sdata->local->hw, txq, true);
+
local_bh_disable();
rcu_read_lock();
drv_wake_tx_queue(sta->sdata->local, txqi);
diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h
index 172aeae21ae9..6bd68639c699 100644
--- a/net/mac80211/ieee80211_i.h
+++ b/net/mac80211/ieee80211_i.h
@@ -835,6 +835,8 @@ struct txq_info {
struct codel_vars def_cvars;
struct codel_stats cstats;
struct sk_buff_head frags;
+ struct list_head schedule_order;
+ u16 schedule_seqno;
unsigned long flags;
/* keep last! */
@@ -1126,6 +1128,11 @@ struct ieee80211_local {
struct codel_vars *cvars;
struct codel_params cparams;
+ /* protects active_txqs and txqi->schedule_order */
+ spinlock_t active_txq_lock;
+ struct list_head active_txqs;
+ u16 schedule_seqno;
+
const struct ieee80211_ops *ops;
/*
diff --git a/net/mac80211/main.c b/net/mac80211/main.c
index 4fb2709cb527..abaca5e1a59f 100644
--- a/net/mac80211/main.c
+++ b/net/mac80211/main.c
@@ -634,6 +634,9 @@ struct ieee80211_hw *ieee80211_alloc_hw_nm(size_t priv_data_len,
spin_lock_init(&local->rx_path_lock);
spin_lock_init(&local->queue_stop_reason_lock);
+ INIT_LIST_HEAD(&local->active_txqs);
+ spin_lock_init(&local->active_txq_lock);
+
INIT_LIST_HEAD(&local->chanctx_list);
mutex_init(&local->chanctx_mtx);
diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index f34202242d24..bb92bf516b6b 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -1244,6 +1244,9 @@ void ieee80211_sta_ps_deliver_wakeup(struct sta_info *sta)
if (!txq_has_queue(sta->sta.txq[i]))
continue;
+ ieee80211_schedule_txq(&local->hw,
+ sta->sta.txq[i],
+ true);
drv_wake_tx_queue(local, to_txq_info(sta->sta.txq[i]));
}
}
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index 5b93bde248fd..24766566a380 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -1437,6 +1437,7 @@ void ieee80211_txq_init(struct ieee80211_sub_if_data *sdata,
codel_vars_init(&txqi->def_cvars);
codel_stats_init(&txqi->cstats);
__skb_queue_head_init(&txqi->frags);
+ INIT_LIST_HEAD(&txqi->schedule_order);
txqi->txq.vif = &sdata->vif;
@@ -1460,6 +1461,9 @@ void ieee80211_txq_purge(struct ieee80211_local *local,
fq_tin_reset(fq, tin, fq_skb_free_func);
ieee80211_purge_tx_queue(&local->hw, &txqi->frags);
+ spin_lock_bh(&local->active_txq_lock);
+ list_del_init(&txqi->schedule_order);
+ spin_unlock_bh(&local->active_txq_lock);
}
void ieee80211_txq_set_params(struct ieee80211_local *local)
@@ -1576,6 +1580,7 @@ static bool ieee80211_queue_skb(struct ieee80211_local *local,
ieee80211_txq_enqueue(local, txqi, skb);
spin_unlock_bh(&fq->lock);
+ ieee80211_schedule_txq(&local->hw, &txqi->txq, true);
drv_wake_tx_queue(local, txqi);
return true;
@@ -3574,6 +3579,70 @@ struct sk_buff *ieee80211_tx_dequeue(struct ieee80211_hw *hw,
}
EXPORT_SYMBOL(ieee80211_tx_dequeue);
+bool ieee80211_schedule_txq(struct ieee80211_hw *hw,
+ struct ieee80211_txq *txq,
+ bool reset_seqno)
+{
+ struct ieee80211_local *local = hw_to_local(hw);
+ struct txq_info *txqi = to_txq_info(txq);
+ bool ret = false;
+
+ spin_lock_bh(&local->active_txq_lock);
+
+ if (list_empty(&txqi->schedule_order)) {
+ list_add_tail(&txqi->schedule_order, &local->active_txqs);
+ if (reset_seqno)
+ txqi->schedule_seqno = local->schedule_seqno - 1;
+ ret = true;
+ }
+
+ spin_unlock_bh(&local->active_txq_lock);
+
+ return ret;
+}
+EXPORT_SYMBOL(ieee80211_schedule_txq);
+
+struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, s8 ac,
+ bool inc_seqno)
+{
+ struct ieee80211_local *local = hw_to_local(hw);
+ struct txq_info *txqi = NULL;
+
+ spin_lock_bh(&local->active_txq_lock);
+
+ if (inc_seqno)
+ local->schedule_seqno++;
+
+ if (list_empty(&local->active_txqs))
+ goto out;
+
+ list_for_each_entry(txqi, &local->active_txqs, schedule_order) {
+ if (ac < 0 || txqi->txq.ac == ac) {
+ /* If seqnos are equal, we've seen this txqi before in
+ * this scheduling round, so abort.
+ */
+ if (txqi->schedule_seqno == local->schedule_seqno)
+ txqi = NULL;
+ else
+ list_del_init(&txqi->schedule_order);
+ goto out;
+ }
+ }
+
+ /* no txq with requested ac found */
+ txqi = NULL;
+
+out:
+ spin_unlock_bh(&local->active_txq_lock);
+
+ if (!txqi)
+ return NULL;
+
+ txqi->schedule_seqno = local->schedule_seqno;
+ return &txqi->txq;
+}
+EXPORT_SYMBOL(ieee80211_next_txq);
+
void __ieee80211_subif_start_xmit(struct sk_buff *skb,
struct net_device *dev,
u32 info_flags)
This moves the ath9k driver to use the mac80211 TXQ scheduling and
airtime accounting APIs, removing the corresponding state tracking
inside the driver.
Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
---
drivers/net/wireless/ath/ath9k/ath9k.h | 14 --
drivers/net/wireless/ath/ath9k/debug.c | 3
drivers/net/wireless/ath/ath9k/debug.h | 8 -
drivers/net/wireless/ath/ath9k/debug_sta.c | 54 ------
drivers/net/wireless/ath/ath9k/init.c | 3
drivers/net/wireless/ath/ath9k/recv.c | 9 -
drivers/net/wireless/ath/ath9k/xmit.c | 245 ++++++++--------------------
7 files changed, 76 insertions(+), 260 deletions(-)
diff --git a/drivers/net/wireless/ath/ath9k/ath9k.h b/drivers/net/wireless/ath/ath9k/ath9k.h
index ef0de4f1312c..f01aedfef24f 100644
--- a/drivers/net/wireless/ath/ath9k/ath9k.h
+++ b/drivers/net/wireless/ath/ath9k/ath9k.h
@@ -112,8 +112,6 @@ int ath_descdma_setup(struct ath_softc *sc, struct ath_descdma *dd,
#define ATH_TXFIFO_DEPTH 8
#define ATH_TX_ERROR 0x01
-#define ATH_AIRTIME_QUANTUM 300 /* usec */
-
/* Stop tx traffic 1ms before the GO goes away */
#define ATH_P2P_PS_STOP_TIME 1000
@@ -246,10 +244,8 @@ struct ath_atx_tid {
s8 bar_index;
bool active;
bool clear_ps_filter;
- bool has_queued;
};
-void __ath_tx_queue_tid(struct ath_softc *sc, struct ath_atx_tid *tid);
void ath_tx_queue_tid(struct ath_softc *sc, struct ath_atx_tid *tid);
struct ath_node {
@@ -263,12 +259,9 @@ struct ath_node {
bool sleeping;
bool no_ps_filter;
- s64 airtime_deficit[IEEE80211_NUM_ACS];
- u32 airtime_rx_start;
#ifdef CONFIG_ATH9K_STATION_STATISTICS
struct ath_rx_rate_stats rx_rate_stats;
- struct ath_airtime_stats airtime_stats;
#endif
u8 key_idx[4];
@@ -986,11 +979,6 @@ void ath_ant_comb_scan(struct ath_softc *sc, struct ath_rx_status *rs);
#define ATH9K_NUM_CHANCTX 2 /* supports 2 operating channels */
-#define AIRTIME_USE_TX BIT(0)
-#define AIRTIME_USE_RX BIT(1)
-#define AIRTIME_USE_NEW_QUEUES BIT(2)
-#define AIRTIME_ACTIVE(flags) (!!(flags & (AIRTIME_USE_TX|AIRTIME_USE_RX)))
-
struct ath_softc {
struct ieee80211_hw *hw;
struct device *dev;
@@ -1034,8 +1022,6 @@ struct ath_softc {
short nbcnvifs;
unsigned long ps_usecount;
- u16 airtime_flags; /* AIRTIME_* */
-
struct ath_rx rx;
struct ath_tx tx;
struct ath_beacon beacon;
diff --git a/drivers/net/wireless/ath/ath9k/debug.c b/drivers/net/wireless/ath/ath9k/debug.c
index f685843a2ff3..ce62da4840e2 100644
--- a/drivers/net/wireless/ath/ath9k/debug.c
+++ b/drivers/net/wireless/ath/ath9k/debug.c
@@ -1456,9 +1456,6 @@ int ath9k_init_debug(struct ath_hw *ah)
#endif
debugfs_create_file("tpc", 0600, sc->debug.debugfs_phy, sc, &fops_tpc);
- debugfs_create_u16("airtime_flags", 0600,
- sc->debug.debugfs_phy, &sc->airtime_flags);
-
debugfs_create_file("nf_override", 0600,
sc->debug.debugfs_phy, sc, &fops_nf_override);
diff --git a/drivers/net/wireless/ath/ath9k/debug.h b/drivers/net/wireless/ath/ath9k/debug.h
index 249f8141cd00..559d9628f280 100644
--- a/drivers/net/wireless/ath/ath9k/debug.h
+++ b/drivers/net/wireless/ath/ath9k/debug.h
@@ -319,20 +319,12 @@ ath9k_debug_sync_cause(struct ath_softc *sc, u32 sync_cause)
void ath_debug_rate_stats(struct ath_softc *sc,
struct ath_rx_status *rs,
struct sk_buff *skb);
-void ath_debug_airtime(struct ath_softc *sc,
- struct ath_node *an,
- u32 rx, u32 tx);
#else
static inline void ath_debug_rate_stats(struct ath_softc *sc,
struct ath_rx_status *rs,
struct sk_buff *skb)
{
}
-static inline void ath_debug_airtime(struct ath_softc *sc,
- struct ath_node *an,
- u32 rx, u32 tx)
-{
-}
#endif /* CONFIG_ATH9K_STATION_STATISTICS */
#endif /* DEBUG_H */
diff --git a/drivers/net/wireless/ath/ath9k/debug_sta.c b/drivers/net/wireless/ath/ath9k/debug_sta.c
index a6f45f1bb5bb..bb6f3250aa30 100644
--- a/drivers/net/wireless/ath/ath9k/debug_sta.c
+++ b/drivers/net/wireless/ath/ath9k/debug_sta.c
@@ -242,59 +242,6 @@ static const struct file_operations fops_node_recv = {
.llseek = default_llseek,
};
-void ath_debug_airtime(struct ath_softc *sc,
- struct ath_node *an,
- u32 rx,
- u32 tx)
-{
- struct ath_airtime_stats *astats = &an->airtime_stats;
-
- astats->rx_airtime += rx;
- astats->tx_airtime += tx;
-}
-
-static ssize_t read_airtime(struct file *file, char __user *user_buf,
- size_t count, loff_t *ppos)
-{
- struct ath_node *an = file->private_data;
- struct ath_airtime_stats *astats;
- static const char *qname[4] = {
- "VO", "VI", "BE", "BK"
- };
- u32 len = 0, size = 256;
- char *buf;
- size_t retval;
- int i;
-
- buf = kzalloc(size, GFP_KERNEL);
- if (buf == NULL)
- return -ENOMEM;
-
- astats = &an->airtime_stats;
-
- len += scnprintf(buf + len, size - len, "RX: %u us\n", astats->rx_airtime);
- len += scnprintf(buf + len, size - len, "TX: %u us\n", astats->tx_airtime);
- len += scnprintf(buf + len, size - len, "Deficit: ");
- for (i = 0; i < 4; i++)
- len += scnprintf(buf+len, size - len, "%s: %lld us ", qname[i], an->airtime_deficit[i]);
- if (len < size)
- buf[len++] = '\n';
-
- retval = simple_read_from_buffer(user_buf, count, ppos, buf, len);
- kfree(buf);
-
- return retval;
-}
-
-
-static const struct file_operations fops_airtime = {
- .read = read_airtime,
- .open = simple_open,
- .owner = THIS_MODULE,
- .llseek = default_llseek,
-};
-
-
void ath9k_sta_add_debugfs(struct ieee80211_hw *hw,
struct ieee80211_vif *vif,
struct ieee80211_sta *sta,
@@ -304,5 +251,4 @@ void ath9k_sta_add_debugfs(struct ieee80211_hw *hw,
debugfs_create_file("node_aggr", 0444, dir, an, &fops_node_aggr);
debugfs_create_file("node_recv", 0444, dir, an, &fops_node_recv);
- debugfs_create_file("airtime", 0444, dir, an, &fops_airtime);
}
diff --git a/drivers/net/wireless/ath/ath9k/init.c b/drivers/net/wireless/ath/ath9k/init.c
index c070a9e51ebf..45a31f8a1514 100644
--- a/drivers/net/wireless/ath/ath9k/init.c
+++ b/drivers/net/wireless/ath/ath9k/init.c
@@ -676,8 +676,6 @@ static int ath9k_init_softc(u16 devid, struct ath_softc *sc,
/* Will be cleared in ath9k_start() */
set_bit(ATH_OP_INVALID, &common->op_flags);
- sc->airtime_flags = (AIRTIME_USE_TX | AIRTIME_USE_RX |
- AIRTIME_USE_NEW_QUEUES);
sc->sc_ah = ah;
sc->dfs_detector = dfs_pattern_detector_init(common, NL80211_DFS_UNSET);
@@ -1013,6 +1011,7 @@ static void ath9k_set_hw_capab(struct ath_softc *sc, struct ieee80211_hw *hw)
SET_IEEE80211_PERM_ADDR(hw, common->macaddr);
wiphy_ext_feature_set(hw->wiphy, NL80211_EXT_FEATURE_CQM_RSSI_LIST);
+ wiphy_ext_feature_set(hw->wiphy, NL80211_EXT_FEATURE_AIRTIME_FAIRNESS);
}
int ath9k_init_device(u16 devid, struct ath_softc *sc,
diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c
index a8ac42c96d71..fef8fbe12f45 100644
--- a/drivers/net/wireless/ath/ath9k/recv.c
+++ b/drivers/net/wireless/ath/ath9k/recv.c
@@ -1054,14 +1054,7 @@ static void ath_rx_count_airtime(struct ath_softc *sc,
len, rxs->rate_idx, is_sp);
}
- if (!!(sc->airtime_flags & AIRTIME_USE_RX)) {
- spin_lock_bh(&acq->lock);
- an->airtime_deficit[acno] -= airtime;
- if (an->airtime_deficit[acno] <= 0)
- __ath_tx_queue_tid(sc, ATH_AN_2_TID(an, tidno));
- spin_unlock_bh(&acq->lock);
- }
- ath_debug_airtime(sc, an, airtime, 0);
+ ieee80211_sta_register_airtime(sta, tidno, 0, airtime);
exit:
rcu_read_unlock();
}
diff --git a/drivers/net/wireless/ath/ath9k/xmit.c b/drivers/net/wireless/ath/ath9k/xmit.c
index 7fdb152be0bb..5b6a1e3c3f24 100644
--- a/drivers/net/wireless/ath/ath9k/xmit.c
+++ b/drivers/net/wireless/ath/ath9k/xmit.c
@@ -112,44 +112,14 @@ void ath_txq_unlock_complete(struct ath_softc *sc, struct ath_txq *txq)
ath_tx_status(hw, skb);
}
-void __ath_tx_queue_tid(struct ath_softc *sc, struct ath_atx_tid *tid)
-{
- struct ath_vif *avp = (struct ath_vif *) tid->an->vif->drv_priv;
- struct ath_chanctx *ctx = avp->chanctx;
- struct ath_acq *acq;
- struct list_head *tid_list;
- u8 acno = TID_TO_WME_AC(tid->tidno);
-
- if (!ctx || !list_empty(&tid->list))
- return;
-
-
- acq = &ctx->acq[acno];
- if ((sc->airtime_flags & AIRTIME_USE_NEW_QUEUES) &&
- tid->an->airtime_deficit[acno] > 0)
- tid_list = &acq->acq_new;
- else
- tid_list = &acq->acq_old;
-
- list_add_tail(&tid->list, tid_list);
-}
-
void ath_tx_queue_tid(struct ath_softc *sc, struct ath_atx_tid *tid)
{
- struct ath_vif *avp = (struct ath_vif *) tid->an->vif->drv_priv;
- struct ath_chanctx *ctx = avp->chanctx;
- struct ath_acq *acq;
+ struct ieee80211_txq *queue =
+ container_of((void *)tid, struct ieee80211_txq, drv_priv);
- if (!ctx || !list_empty(&tid->list))
- return;
-
- acq = &ctx->acq[TID_TO_WME_AC(tid->tidno)];
- spin_lock_bh(&acq->lock);
- __ath_tx_queue_tid(sc, tid);
- spin_unlock_bh(&acq->lock);
+ ieee80211_schedule_txq(sc->hw, queue, true);
}
-
void ath9k_wake_tx_queue(struct ieee80211_hw *hw, struct ieee80211_txq *queue)
{
struct ath_softc *sc = hw->priv;
@@ -162,11 +132,7 @@ void ath9k_wake_tx_queue(struct ieee80211_hw *hw, struct ieee80211_txq *queue)
tid->tidno);
ath_txq_lock(sc, txq);
-
- tid->has_queued = true;
- ath_tx_queue_tid(sc, tid);
ath_txq_schedule(sc, txq);
-
ath_txq_unlock(sc, txq);
}
@@ -216,8 +182,8 @@ ath_get_skb_tid(struct ath_softc *sc, struct ath_node *an, struct sk_buff *skb)
return ATH_AN_2_TID(an, tidno);
}
-static struct sk_buff *
-ath_tid_pull(struct ath_atx_tid *tid)
+static int
+ath_tid_pull(struct ath_atx_tid *tid, struct sk_buff **skbuf)
{
struct ieee80211_txq *txq = container_of((void*)tid, struct ieee80211_txq, drv_priv);
struct ath_softc *sc = tid->an->sc;
@@ -228,20 +194,16 @@ ath_tid_pull(struct ath_atx_tid *tid)
};
struct sk_buff *skb;
struct ath_frame_info *fi;
- int q;
-
- if (!tid->has_queued)
- return NULL;
+ int q, ret;
skb = ieee80211_tx_dequeue(hw, txq);
- if (!skb) {
- tid->has_queued = false;
- return NULL;
- }
+ if (!skb)
+ return -ENOENT;
- if (ath_tx_prepare(hw, skb, &txctl)) {
+ ret = ath_tx_prepare(hw, skb, &txctl);
+ if (ret) {
ieee80211_free_txskb(hw, skb);
- return NULL;
+ return ret;
}
q = skb_get_queue_mapping(skb);
@@ -251,24 +213,19 @@ ath_tid_pull(struct ath_atx_tid *tid)
++tid->txq->pending_frames;
}
- return skb;
-}
-
-
-static bool ath_tid_has_buffered(struct ath_atx_tid *tid)
-{
- return !skb_queue_empty(&tid->retry_q) || tid->has_queued;
+ *skbuf = skb;
+ return 0;
}
-static struct sk_buff *ath_tid_dequeue(struct ath_atx_tid *tid)
+static int ath_tid_dequeue(struct ath_atx_tid *tid,
+ struct sk_buff **skb)
{
- struct sk_buff *skb;
-
- skb = __skb_dequeue(&tid->retry_q);
- if (!skb)
- skb = ath_tid_pull(tid);
+ int ret = 0;
+ *skb = __skb_dequeue(&tid->retry_q);
+ if (!*skb)
+ ret = ath_tid_pull(tid, skb);
- return skb;
+ return ret;
}
static void ath_tx_flush_tid(struct ath_softc *sc, struct ath_atx_tid *tid)
@@ -356,11 +313,12 @@ static void ath_tid_drain(struct ath_softc *sc, struct ath_txq *txq,
struct list_head bf_head;
struct ath_tx_status ts;
struct ath_frame_info *fi;
+ int ret;
memset(&ts, 0, sizeof(ts));
INIT_LIST_HEAD(&bf_head);
- while ((skb = ath_tid_dequeue(tid))) {
+ while ((ret = ath_tid_dequeue(tid, &skb)) == 0) {
fi = get_frame_info(skb);
bf = fi->bf;
@@ -672,7 +630,6 @@ static void ath_tx_complete_aggr(struct ath_softc *sc, struct ath_txq *txq,
skb_queue_splice_tail(&bf_pending, &tid->retry_q);
if (!an->sleeping) {
ath_tx_queue_tid(sc, tid);
-
if (ts->ts_status & (ATH9K_TXERR_FILT | ATH9K_TXERR_XRETRY))
tid->clear_ps_filter = true;
}
@@ -699,11 +656,11 @@ static bool bf_is_ampdu_not_probing(struct ath_buf *bf)
return bf_isampdu(bf) && !(info->flags & IEEE80211_TX_CTL_RATE_CTRL_PROBE);
}
-static void ath_tx_count_airtime(struct ath_softc *sc, struct ath_node *an,
- struct ath_atx_tid *tid, struct ath_buf *bf,
+static void ath_tx_count_airtime(struct ath_softc *sc,
+ struct ieee80211_sta *sta,
+ struct ath_buf *bf,
struct ath_tx_status *ts)
{
- struct ath_txq *txq = tid->txq;
u32 airtime = 0;
int i;
@@ -713,17 +670,7 @@ static void ath_tx_count_airtime(struct ath_softc *sc, struct ath_node *an,
airtime += rate_dur * bf->rates[i].count;
}
- if (sc->airtime_flags & AIRTIME_USE_TX) {
- int q = txq->mac80211_qnum;
- struct ath_acq *acq = &sc->cur_chan->acq[q];
-
- spin_lock_bh(&acq->lock);
- an->airtime_deficit[q] -= airtime;
- if (an->airtime_deficit[q] <= 0)
- __ath_tx_queue_tid(sc, tid);
- spin_unlock_bh(&acq->lock);
- }
- ath_debug_airtime(sc, an, 0, airtime);
+ ieee80211_sta_register_airtime(sta, ts->tid, airtime, 0);
}
static void ath_tx_process_buffer(struct ath_softc *sc, struct ath_txq *txq,
@@ -753,7 +700,7 @@ static void ath_tx_process_buffer(struct ath_softc *sc, struct ath_txq *txq,
if (sta) {
struct ath_node *an = (struct ath_node *)sta->drv_priv;
tid = ath_get_skb_tid(sc, an, bf->bf_mpdu);
- ath_tx_count_airtime(sc, an, tid, bf, ts);
+ ath_tx_count_airtime(sc, sta, bf, ts);
if (ts->ts_status & (ATH9K_TXERR_FILT | ATH9K_TXERR_XRETRY))
tid->clear_ps_filter = true;
}
@@ -937,20 +884,21 @@ static int ath_compute_num_delims(struct ath_softc *sc, struct ath_atx_tid *tid,
return ndelim;
}
-static struct ath_buf *
+static int
ath_tx_get_tid_subframe(struct ath_softc *sc, struct ath_txq *txq,
- struct ath_atx_tid *tid)
+ struct ath_atx_tid *tid, struct ath_buf **buf)
{
struct ieee80211_tx_info *tx_info;
struct ath_frame_info *fi;
- struct sk_buff *skb, *first_skb = NULL;
struct ath_buf *bf;
+ struct sk_buff *skb, *first_skb = NULL;
u16 seqno;
+ int ret;
while (1) {
- skb = ath_tid_dequeue(tid);
- if (!skb)
- break;
+ ret = ath_tid_dequeue(tid, &skb);
+ if (ret < 0)
+ return ret;
fi = get_frame_info(skb);
bf = fi->bf;
@@ -981,7 +929,7 @@ ath_tx_get_tid_subframe(struct ath_softc *sc, struct ath_txq *txq,
if (!(tx_info->flags & IEEE80211_TX_CTL_AMPDU)) {
bf->bf_state.bf_type = 0;
- return bf;
+ break;
}
bf->bf_state.bf_type = BUF_AMPDU | BUF_AGGR;
@@ -1000,7 +948,7 @@ ath_tx_get_tid_subframe(struct ath_softc *sc, struct ath_txq *txq,
first_skb = skb;
continue;
}
- break;
+ return -EINPROGRESS;
}
if (tid->bar_index > ATH_BA_INDEX(tid->seq_start, seqno)) {
@@ -1014,10 +962,11 @@ ath_tx_get_tid_subframe(struct ath_softc *sc, struct ath_txq *txq,
continue;
}
- return bf;
+ break;
}
- return NULL;
+ *buf = bf;
+ return 0;
}
static int
@@ -1027,7 +976,7 @@ ath_tx_form_aggr(struct ath_softc *sc, struct ath_txq *txq,
{
#define PADBYTES(_len) ((4 - ((_len) % 4)) % 4)
struct ath_buf *bf = bf_first, *bf_prev = NULL;
- int nframes = 0, ndelim;
+ int nframes = 0, ndelim, ret;
u16 aggr_limit = 0, al = 0, bpad = 0,
al_delta, h_baw = tid->baw_size / 2;
struct ieee80211_tx_info *tx_info;
@@ -1081,7 +1030,9 @@ ath_tx_form_aggr(struct ath_softc *sc, struct ath_txq *txq,
bf_prev = bf;
- bf = ath_tx_get_tid_subframe(sc, txq, tid);
+ ret = ath_tx_get_tid_subframe(sc, txq, tid, &bf);
+ if (ret < 0)
+ break;
}
goto finish;
stop:
@@ -1478,7 +1429,7 @@ ath_tx_form_burst(struct ath_softc *sc, struct ath_txq *txq,
struct ath_buf *bf_first)
{
struct ath_buf *bf = bf_first, *bf_prev = NULL;
- int nframes = 0;
+ int nframes = 0, ret;
do {
struct ieee80211_tx_info *tx_info;
@@ -1492,8 +1443,8 @@ ath_tx_form_burst(struct ath_softc *sc, struct ath_txq *txq,
if (nframes >= 2)
break;
- bf = ath_tx_get_tid_subframe(sc, txq, tid);
- if (!bf)
+ ret = ath_tx_get_tid_subframe(sc, txq, tid, &bf);
+ if (ret < 0)
break;
tx_info = IEEE80211_SKB_CB(bf->bf_mpdu);
@@ -1506,30 +1457,27 @@ ath_tx_form_burst(struct ath_softc *sc, struct ath_txq *txq,
} while (1);
}
-static bool ath_tx_sched_aggr(struct ath_softc *sc, struct ath_txq *txq,
- struct ath_atx_tid *tid)
+static int ath_tx_sched_aggr(struct ath_softc *sc, struct ath_txq *txq,
+ struct ath_atx_tid *tid)
{
- struct ath_buf *bf;
+ struct ath_buf *bf = NULL;
struct ieee80211_tx_info *tx_info;
struct list_head bf_q;
- int aggr_len = 0;
+ int aggr_len = 0, ret;
bool aggr;
- if (!ath_tid_has_buffered(tid))
- return false;
-
INIT_LIST_HEAD(&bf_q);
- bf = ath_tx_get_tid_subframe(sc, txq, tid);
- if (!bf)
- return false;
+ ret = ath_tx_get_tid_subframe(sc, txq, tid, &bf);
+ if (ret < 0)
+ return ret;
tx_info = IEEE80211_SKB_CB(bf->bf_mpdu);
aggr = !!(tx_info->flags & IEEE80211_TX_CTL_AMPDU);
if ((aggr && txq->axq_ampdu_depth >= ATH_AGGR_MIN_QDEPTH) ||
(!aggr && txq->axq_depth >= ATH_NON_AGGR_MIN_QDEPTH)) {
__skb_queue_tail(&tid->retry_q, bf->bf_mpdu);
- return false;
+ return -EBUSY;
}
ath_set_rates(tid->an->vif, tid->an->sta, bf);
@@ -1539,7 +1487,7 @@ static bool ath_tx_sched_aggr(struct ath_softc *sc, struct ath_txq *txq,
ath_tx_form_burst(sc, txq, tid, &bf_q, bf);
if (list_empty(&bf_q))
- return false;
+ return -EAGAIN;
if (tid->clear_ps_filter || tid->an->no_ps_filter) {
tid->clear_ps_filter = false;
@@ -1548,7 +1496,7 @@ static bool ath_tx_sched_aggr(struct ath_softc *sc, struct ath_txq *txq,
ath_tx_fill_desc(sc, bf, txq, aggr_len);
ath_tx_txqaddbuf(sc, txq, &bf_q, false);
- return true;
+ return 0;
}
int ath_tx_aggr_start(struct ath_softc *sc, struct ieee80211_sta *sta,
@@ -1611,28 +1559,16 @@ void ath_tx_aggr_sleep(struct ieee80211_sta *sta, struct ath_softc *sc,
{
struct ath_common *common = ath9k_hw_common(sc->sc_ah);
struct ath_atx_tid *tid;
- struct ath_txq *txq;
int tidno;
ath_dbg(common, XMIT, "%s called\n", __func__);
for (tidno = 0; tidno < IEEE80211_NUM_TIDS; tidno++) {
tid = ath_node_to_tid(an, tidno);
- txq = tid->txq;
-
- ath_txq_lock(sc, txq);
-
- if (list_empty(&tid->list)) {
- ath_txq_unlock(sc, txq);
- continue;
- }
if (!skb_queue_empty(&tid->retry_q))
ieee80211_sta_set_buffered(sta, tid->tidno, true);
- list_del_init(&tid->list);
-
- ath_txq_unlock(sc, txq);
}
}
@@ -1651,11 +1587,12 @@ void ath_tx_aggr_wakeup(struct ath_softc *sc, struct ath_node *an)
ath_txq_lock(sc, txq);
tid->clear_ps_filter = true;
- if (ath_tid_has_buffered(tid)) {
+ if (!skb_queue_empty(&tid->retry_q)) {
ath_tx_queue_tid(sc, tid);
ath_txq_schedule(sc, txq);
}
ath_txq_unlock_complete(sc, txq);
+
}
}
@@ -1670,9 +1607,9 @@ void ath9k_release_buffered_frames(struct ieee80211_hw *hw,
struct ath_txq *txq = sc->tx.uapsdq;
struct ieee80211_tx_info *info;
struct list_head bf_q;
- struct ath_buf *bf_tail = NULL, *bf;
+ struct ath_buf *bf_tail = NULL, *bf = NULL;
int sent = 0;
- int i;
+ int i, ret;
INIT_LIST_HEAD(&bf_q);
for (i = 0; tids && nframes; i++, tids >>= 1) {
@@ -1685,8 +1622,9 @@ void ath9k_release_buffered_frames(struct ieee80211_hw *hw,
ath_txq_lock(sc, tid->txq);
while (nframes > 0) {
- bf = ath_tx_get_tid_subframe(sc, sc->tx.uapsdq, tid);
- if (!bf)
+ ret = ath_tx_get_tid_subframe(sc, sc->tx.uapsdq,
+ tid, &bf);
+ if (ret < 0)
break;
list_add_tail(&bf->list, &bf_q);
@@ -1950,11 +1888,12 @@ void ath_tx_cleanupq(struct ath_softc *sc, struct ath_txq *txq)
*/
void ath_txq_schedule(struct ath_softc *sc, struct ath_txq *txq)
{
+ struct ieee80211_hw *hw = sc->hw;
struct ath_common *common = ath9k_hw_common(sc->sc_ah);
+ struct ieee80211_txq *queue;
struct ath_atx_tid *tid;
- struct list_head *tid_list;
- struct ath_acq *acq;
- bool active = AIRTIME_ACTIVE(sc->airtime_flags);
+ bool first = true;
+ int ret;
if (txq->mac80211_qnum < 0)
return;
@@ -1964,51 +1903,19 @@ void ath_txq_schedule(struct ath_softc *sc, struct ath_txq *txq)
spin_lock_bh(&sc->chan_lock);
rcu_read_lock();
- acq = &sc->cur_chan->acq[txq->mac80211_qnum];
if (sc->cur_chan->stopped)
goto out;
-begin:
- tid_list = &acq->acq_new;
- if (list_empty(tid_list)) {
- tid_list = &acq->acq_old;
- if (list_empty(tid_list))
- goto out;
- }
- tid = list_first_entry(tid_list, struct ath_atx_tid, list);
-
- if (active && tid->an->airtime_deficit[txq->mac80211_qnum] <= 0) {
- spin_lock_bh(&acq->lock);
- tid->an->airtime_deficit[txq->mac80211_qnum] += ATH_AIRTIME_QUANTUM;
- list_move_tail(&tid->list, &acq->acq_old);
- spin_unlock_bh(&acq->lock);
- goto begin;
- }
-
- if (!ath_tid_has_buffered(tid)) {
- spin_lock_bh(&acq->lock);
- if ((tid_list == &acq->acq_new) && !list_empty(&acq->acq_old))
- list_move_tail(&tid->list, &acq->acq_old);
- else {
- list_del_init(&tid->list);
- }
- spin_unlock_bh(&acq->lock);
- goto begin;
- }
+ while ((queue = ieee80211_next_txq(hw, txq->mac80211_qnum, first))) {
+ tid = (struct ath_atx_tid *)queue->drv_priv;
+ first = false;
+ ret = ath_tx_sched_aggr(sc, txq, tid);
+ ath_dbg(common, QUEUE, "ath_tx_sched_aggr returned %d\n", ret);
- /*
- * If we succeed in scheduling something, immediately restart to make
- * sure we keep the HW busy.
- */
- if(ath_tx_sched_aggr(sc, txq, tid)) {
- if (!active) {
- spin_lock_bh(&acq->lock);
- list_move_tail(&tid->list, &acq->acq_old);
- spin_unlock_bh(&acq->lock);
- }
- goto begin;
+ if (ret != -ENOENT)
+ ieee80211_schedule_txq(hw, queue, (ret == 0));
}
out:
@@ -2863,9 +2770,6 @@ void ath_tx_node_init(struct ath_softc *sc, struct ath_node *an)
struct ath_atx_tid *tid;
int tidno, acno;
- for (acno = 0; acno < IEEE80211_NUM_ACS; acno++)
- an->airtime_deficit[acno] = ATH_AIRTIME_QUANTUM;
-
for (tidno = 0; tidno < IEEE80211_NUM_TIDS; tidno++) {
tid = ath_node_to_tid(an, tidno);
tid->an = an;
@@ -2875,7 +2779,6 @@ void ath_tx_node_init(struct ath_softc *sc, struct ath_node *an)
tid->baw_head = tid->baw_tail = 0;
tid->active = false;
tid->clear_ps_filter = true;
- tid->has_queued = false;
__skb_queue_head_init(&tid->retry_q);
INIT_LIST_HEAD(&tid->list);
acno = TID_TO_WME_AC(tidno);
On 2018-07-24 04:08, Toke Høiland-Jørgensen wrote:
> Rajkumar Manoharan <[email protected]> writes:
>
Sorry for the late reply.
>> I would prefer to keep separate list for each AC ;) Prepared one on
>> top of your earlier merged changes. Now it needs revisit.
>
> Fine with me. Only reason I used a single list + the filtering
> mechanism
> was to keep things compatible with ath10k ;)
>
Thanks :). If so, I will defer the per-ac-list later.
>> A simple change in argument may break algo. What would be seqno of
>> first packet of txq if queue_skb() isn't reset seqno?
>
> It would be 0, which would be less than the current seqno in all cases
> except just when the seqno counter wraps.
>
One point should be mentioned in comment section of next_txq() that
the driver has to ensure that next_txq() iteration is serialized i.e
no parallel txq traversal are encouraged. Though txqs_list is maintained
in mac80211, parallel iteration may change the behavior. For example
I thought of get rid of txqs_lock in ath10k as txqs_list is moved to
mac80211.
But it is still needed. right?
>> I am fine in passing start of loop as arg to next_ntx(). But prefer to
>> keep loop detecting within mac80211 itself by tracking txq same as
>> ath10k. So that no change is needed in schedule_txq() arg list.
>> thoughts?
>
> If we remove the reset argument to schedule_txq we lose the ability for
> the driver to signal that it is OK to dequeue another series of packets
> from the same TXQ in the current scheduling round. I think that things
> would mostly still work, but we may risk the hardware running idle for
> short periods of time (in ath9k at least). I am not quite sure which
> conditions this would happen under, though; and I haven't been able to
> trigger it myself with my test hardware. So one approach would be to
> try
> it out without the parameter and put it back later if it turns out to
> be
> problematic...
>
Agree. Based on experiments, additional arguments can be extended later.
Currently in ath10k, txqs are added back to list when there are pending
frames and are not considered again in the same iteration. It will be
processed in next loop.
>> ieee80211_reorder_txq() - after dequeuing skb (in direct txq access),
>> txq will be deleted from list and if there
>> are
>> pending
>> frames, it will be requeued.
>
> This could work, but reorder_txq() can't do the reordering from the
> middle of the rotation. I.e, it can only reorder if the TXQ being
> passed
> in is currently at the head of the list of active TXQs.
>
> If we do go for this, I think it would make sense to use it everywhere.
> I.e., next_txq() doesn't remove the queue, it just returns what is
> currently at the head; and the driver will have to call reorder() after
> processing a TXQ, instead of schedule_txq().
>
Hmm.. reorder_txq() API may not needed. Calling next_txq() takes care of
reordering list though the driver is accessing txqs directly.
And also as we discussed earlier, I will drop txq_may_pull() API and
move
the deficit check under tx_dequeue().
> Right, so with the current DRR scheduler, the only thing we can do with
> this setup is throttle fetch_ind() if the TXQ isn't eligible for
> scheduling. What happens if we do that? As far as I can tell,
> fetch_ind() is triggered on tx completion from the same TXQ, right? So
> if we throttle that, the driver falls back to the tx_push_pending()
> rotation?
>
The fallback push will be effective only if current queued count is
lesser than
allowed. Otherwise push_pending() won't be effective even after
throttling
fetch_ind(). The same txq will be served in further fetch_ind() when the
driver
reports to firmware about pending counts.
> I think adding the throttling to tx_fetch_ind() would basically amount
> to disabling that mechanism for most transmission scenarios...
>
If you don't throttle fetch_ind(), fairness won't be effective. Also I
am
thinking of reordering push_pending() after processing fetch_ind_q. This
will
give enough chances for fetch_ind().
>> If we don't handle this case, then ath10k driver can not claim
>> mac80211 ATF support. Agree that MU-MIMO won't work with DDR
>> scheduler. and it will impact MU-MIMO performace in ath10k when
>> mac80211 ATF is claimed by ath10k.
>
> Yeah, so the question is if this is an acceptable tradeoff? Do you have
> any idea how often MU-MIMO actually provides a benefit in the real
> world?
>
Hmm.. We have some criteria of ~1.9 gain in Mu-MIMO test cases with 50
11ac clients.
-Rajkumar
Rajkumar Manoharan <[email protected]> writes:
> On 2018-07-21 13:54, Toke H=C3=B8iland-J=C3=B8rgensen wrote:
>> Rajkumar Manoharan <[email protected]> writes:
>>=20
>>> On 2018-07-19 07:18, Toke H=C3=B8iland-J=C3=B8rgensen wrote:
>>>> Rajkumar Manoharan <[email protected]> writes:
>>>>=20
>>>>> On 2018-07-13 06:39, Toke H=C3=B8iland-J=C3=B8rgensen wrote:
>>>>>> Rajkumar Manoharan <[email protected]> writes:
>>>=20
>>>>>> It is not generally predictable how many times this will loop=20
>>>>>> before
>>>>>> exiting...
>>>>>>=20
>>>>> Agree.. It would be better If the driver does not worry about txq
>>>>> sequence numbering. Perhaps one more API (ieee80211_first_txq) could
>>>>> solve this. Will leave it to you.
>>>>=20
>>>> That is basically what the second parameter to next_txq() does in the
>>>> current patchset. It just needs to be renamed...
>>>>=20
>>> Agree. As next_txq() increments seqno while starting loop for each AC,
>>> It seems bit confusing. i.e A single ath_txq_schedule_all call will
>>> bump schedule_seqno by 4. right?
>>=20
>> Hmm, good point. Guess there should be one seqno per ac...
>>=20
> I would prefer to keep separate list for each AC ;) Prepared one on
> top of your earlier merged changes. Now it needs revisit.
Fine with me. Only reason I used a single list + the filtering mechanism
was to keep things compatible with ath10k ;)
>>> Let assume below case where CPU1 is dequeuing skb from isr context and
>>> CPU2 is enqueuing skbs into same txq.
>>>=20
>>> CPU1 CPU2
>>> ---- ----
>>> ath_txq_schedule
>>> -> ieee80211_next_txq(first)
>>> -> inc_seq
>>> -> delete from list
>>> -> txq->seqno =3D local->seqno
>>> ieee80211_tx/fast_xmit
>>> ->=20
>>> ieee80211_queue_skb
>>> ->
>>> ieee80211_schedule_txq(reset_seqno)
>>> -> list_add
>>> ->=20
>>> txqi->seqno
>>> =3D local->seqno - 1
>>>=20
>>> In above sequence, it is quite possible that the same txq will be
>>> served repeatedly and other txqs will be starved. am I right? IMHO
>>> seqno mechanism will not guarantee that txqs will be processed only
>>> once in an iteration.
>>=20
>> Yeah, ieee80211_queue_skb() shouldn't set reset_seqno; was=20
>> experimenting
>> with that, and guess I ended up picking the wrong value. Thanks for
>> pointing that out :)
>>=20
> A simple change in argument may break algo. What would be seqno of
> first packet of txq if queue_skb() isn't reset seqno?
It would be 0, which would be less than the current seqno in all cases
except just when the seqno counter wraps.
> I am fine in passing start of loop as arg to next_ntx(). But prefer to
> keep loop detecting within mac80211 itself by tracking txq same as
> ath10k. So that no change is needed in schedule_txq() arg list.
> thoughts?
If we remove the reset argument to schedule_txq we lose the ability for
the driver to signal that it is OK to dequeue another series of packets
from the same TXQ in the current scheduling round. I think that things
would mostly still work, but we may risk the hardware running idle for
short periods of time (in ath9k at least). I am not quite sure which
conditions this would happen under, though; and I haven't been able to
trigger it myself with my test hardware. So one approach would be to try
it out without the parameter and put it back later if it turns out to be
problematic...
>>>>>> Well, it could conceivably be done in a way that doesn't require
>>>>>> taking
>>>>>> the activeq_lock. Adding another STOP flag to the TXQ, for=20
>>>>>> instance.
>>>>>> From an API point of view I think that is more consistent with what
>>>>>> we
>>>>>> have already...
>>>>>>=20
>>>>>=20
>>>>> Make sense. ieee80211_txq_may_pull would be better place to decide
>>>>> whether given txq is allowed for transmission. It also makes drivers
>>>>> do not have to worry about deficit. Still I may need
>>>>> ieee80211_reorder_txq API after processing txq. isn't it?
>>>>=20
>>>> The way I was assuming this would work (and what ath9k does), is that=
=20
>>>> a
>>>> driver only operates on one TXQ at a time; so it can get that txq,
>>>> process it, and re-schedule it. In which case no other API is needed;
>>>> the rotating can be done in next_txq(), and schedule_txq() can insert
>>>> the txq back into the rotation as needed.
>>>>=20
>>>> However, it sounds like this is not how ath10k does things? See=20
>>>> below.
>>>>=20
>>> correct. The current rotation works only in push-only mode. i.e when
>>> firmware is not deciding txqs and the driver picks priority txq from
>>> active_txqs list. Unfortunately rotation won't work when the driver
>>> selects txq other than first in the list. In any case separate API
>>> needed for rotation when the driver is processing few packets from txq
>>> instead of all pending frames.
>>=20
>> Rotation is not dependent on the TXQ draining completely. Dequeueing a
>> few packets, then rescheduling the TXQ is fine.
>>=20
> The problem is that when the driver accesses txq directly, it wont
> call next_txq(). So the txq will not dequeued from list and
> schedule_txq() wont be effective. right?
>
> ieee80211_txq_may_pull() - check whether txq is allowed for tx_dequeue()
> that helps to keep deficit check in mac80211=20
> itself
>
> ieee80211_reorder_txq() - after dequeuing skb (in direct txq access),
> txq will be deleted from list and if there are=
=20
> pending
> frames, it will be requeued.
This could work, but reorder_txq() can't do the reordering from the
middle of the rotation. I.e, it can only reorder if the TXQ being passed
in is currently at the head of the list of active TXQs.
If we do go for this, I think it would make sense to use it everywhere.
I.e., next_txq() doesn't remove the queue, it just returns what is
currently at the head; and the driver will have to call reorder() after
processing a TXQ, instead of schedule_txq().
>>>> And if so, how does that interact with ath10k_mac_tx_push_pending()?
>>>> That function is basically doing the same thing that I explained=20
>>>> above
>>>> for ath9k; in the previous version of this patch series I modified=20
>>>> that
>>>> to use next_txq(). But is that a different TX path, or am I
>>>> misunderstanding you?
>>>>=20
>>>> If you could point me to which parts of the ath10k code I should be
>>>> looking at, that would be helpful as well :)
>>>>=20
>>> Depends on firmware htt_tx_mode (push/push_pull),
>>> ath10k_mac_tx_push_pending() downloads all/partial frames to firmware.
>>> Please take a look at ath10k_mac_tx_can_push() in push_pending(). Let
>>> me know If you need any other details.
>>=20
>> Right, so looking at this, it looks like the driver will loop through
>> all the available TXQs, trying to dequeue from each of them if
>> ath10k_mac_tx_can_push() returns true, right? This should actually work
>> fine with the next_txq() / schedule_txq() API. Without airtime fairness
>> that will just translate into basically what the driver is doing now,
>> and I don't think more API functions are needed, as long as that is the
>> only point in the driver that pushes packets to the device...
>>=20
> In push-only mode, this will work with next_txq() / schedule_txq() APIs.
> In ath10k, packets are downloaded to firmware in three places.
>
> wake_tx_queue()
> tx_push_pending()
> htt_rx_tx_fetch_ind() - do txq_lookup() and push_txq()
>
> the above mentioned new APIs are needed to take care of fetch_ind().
Ah! This was what I was missing; thanks!
Right, so with the current DRR scheduler, the only thing we can do with
this setup is throttle fetch_ind() if the TXQ isn't eligible for
scheduling. What happens if we do that? As far as I can tell,
fetch_ind() is triggered on tx completion from the same TXQ, right? So
if we throttle that, the driver falls back to the tx_push_pending()
rotation?
I think adding the throttling to tx_fetch_ind() would basically amount
to disabling that mechanism for most transmission scenarios...
>> With airtime fairness, what is going to happen is that the loop is only
>> going to get a single TXQ (the first one with a positive deficit), then
>> exit. Once that station has transmitted something and its deficit runs
>> negative, it'll get rotated and the next one will get a chance. So I
>> expect that airtime fairness will probably work, but MU-MIMO will
>> break...
>>=20
> Agree.. In your earlier series, you did similar changes in ath10k
> especially in wake_tx_queue and push_pending().
>
>> Don't think we can do MU-MIMO with a DRR airtime scheduler; we'll have
>> to replace it with something different. But I think the same next_txq()
>> / schedule_txq() API will work for that as well...
>>=20
> As mentioned above, have to take of fetch_ind(). All 10.4 based chips
> (QCA99x0, QCA9984, QCA9888, QCA4019, etc.) operates in push-pull mode,
> when the number of connected station increased. As target does not
> have enough memory for buffering frames for each station, it relied on
> host memory. ath10k driver can not identify that whether the received
> fetch_ind is for MU-MIMO or regular transmission.
>
> If we don't handle this case, then ath10k driver can not claim
> mac80211 ATF support. Agree that MU-MIMO won't work with DDR
> scheduler. and it will impact MU-MIMO performace in ath10k when
> mac80211 ATF is claimed by ath10k.
Yeah, so the question is if this is an acceptable tradeoff? Do you have
any idea how often MU-MIMO actually provides a benefit in the real
world?
One approach could be to implement the API and driver support with the
current DRR scheduler, and then evolve the scheduler into something that
can handle MU-MIMO afterwards. I have some ideas for this, but not sure
if they will work; and we want to avoid O(n) behaviour in the number of
associated stations.
-Toke
On 2018-07-19 07:18, Toke Høiland-Jørgensen wrote:
> Rajkumar Manoharan <[email protected]> writes:
>
>> On 2018-07-13 06:39, Toke Høiland-Jørgensen wrote:
>>> Rajkumar Manoharan <[email protected]> writes:
>>> It is not generally predictable how many times this will loop before
>>> exiting...
>>>
>> Agree.. It would be better If the driver does not worry about txq
>> sequence numbering. Perhaps one more API (ieee80211_first_txq) could
>> solve this. Will leave it to you.
>
> That is basically what the second parameter to next_txq() does in the
> current patchset. It just needs to be renamed...
>
Agree. As next_txq() increments seqno while starting loop for each AC,
It seems bit confusing. i.e A single ath_txq_schedule_all call will bump
schedule_seqno by 4. right?
Let assume below case where CPU1 is dequeuing skb from isr context and
CPU2 is enqueuing skbs into same txq.
CPU1 CPU2
---- ----
ath_txq_schedule
-> ieee80211_next_txq(first)
-> inc_seq
-> delete from list
-> txq->seqno = local->seqno
ieee80211_tx/fast_xmit
-> ieee80211_queue_skb
->
ieee80211_schedule_txq(reset_seqno)
-> list_add
-> txqi->seqno
= local->seqno - 1
In above sequence, it is quite possible that the same txq
will be served repeatedly and other txqs will be starved. am I right?
IMHO seqno mechanism will not guarantee that txqs will be processed
only once in an iteration.
[...]
>>> Well, it could conceivably be done in a way that doesn't require
>>> taking
>>> the activeq_lock. Adding another STOP flag to the TXQ, for instance.
>>> From an API point of view I think that is more consistent with what
>>> we
>>> have already...
>>>
>>
>> Make sense. ieee80211_txq_may_pull would be better place to decide
>> whether given txq is allowed for transmission. It also makes drivers
>> do not have to worry about deficit. Still I may need
>> ieee80211_reorder_txq API after processing txq. isn't it?
>
> The way I was assuming this would work (and what ath9k does), is that a
> driver only operates on one TXQ at a time; so it can get that txq,
> process it, and re-schedule it. In which case no other API is needed;
> the rotating can be done in next_txq(), and schedule_txq() can insert
> the txq back into the rotation as needed.
>
> However, it sounds like this is not how ath10k does things? See below.
>
correct. The current rotation works only in push-only mode. i.e when
firmware
is not deciding txqs and the driver picks priority txq from active_txqs
list.
Unfortunately rotation won't work when the driver selects txq other than
first
in the list. In any case separate API needed for rotation when the
driver is
processing few packets from txq instead of all pending frames.
>> In push-pull method, driver reports to firmware that number of frames
>> queued for each tid per station by wake_tx_queue. Later firmware will
>> query N frames from each TID and after dequeue driver will update
>> remaining frames for that tid. In ATF case, when driver is not able to
>> dequeue frames, driver will simply update remaining frames. The
>> consecutive fetch_ind get opportunity to dequeue the frames. By This
>> way, transmission for serving client will be paused for a while and
>> opportunity will be given to others.
>
> This sounds like the driver doesn't do anything other than notify the
> firmware that a txq has pending frames, and everything else is handled
> by the firmware? Is this the case?
>
Correct. In push-pull method, DRR algorithm running in firmware and so
firmware decides txq.
> And if so, how does that interact with ath10k_mac_tx_push_pending()?
> That function is basically doing the same thing that I explained above
> for ath9k; in the previous version of this patch series I modified that
> to use next_txq(). But is that a different TX path, or am I
> misunderstanding you?
>
> If you could point me to which parts of the ath10k code I should be
> looking at, that would be helpful as well :)
>
Depends on firmware htt_tx_mode (push/push_pull),
ath10k_mac_tx_push_pending()
downloads all/partial frames to firmware. Please take a look at
ath10k_mac_tx_can_push()
in push_pending(). Let me know If you need any other details.
-Rajkumar
Rajkumar Manoharan <[email protected]> writes:
> On 2018-07-09 09:37, Toke H=C3=B8iland-J=C3=B8rgensen wrote:
> [...]
>> +/**
>> + * ieee80211_schedule_txq - add txq to scheduling loop
>> + *
>> + * @hw: pointer as obtained from ieee80211_alloc_hw()
>> + * @txq: pointer obtained from station or virtual interface
>> + * @reset_seqno: Whether to reset the internal scheduling sequence=20
>> number,
>> + * allowing this txq to appear again in the current=20
>> scheduling
>> + * round (see doc for ieee80211_next_txq()).
>> + *
>> + * Returns %true if the txq was actually added to the scheduling,
>> + * %false otherwise.
>> + */
>> +bool ieee80211_schedule_txq(struct ieee80211_hw *hw,
>> + struct ieee80211_txq *txq,
>> + bool reset_seqno);
>> +
>> +/**
>> + * ieee80211_next_txq - get next tx queue to pull packets from
>> + *
>> + * @hw: pointer as obtained from ieee80211_alloc_hw()
>> + * @ac: filter returned txqs with this AC number. Pass -1 for no=20
>> filtering.
>> + * @inc_seqno: Whether to increase the scheduling sequence number.=20
>> Setting this
>> + * to true signifies the start of a new scheduling round.=20
>> Each TXQ
>> + * will only be returned exactly once in each round=20
>> (unless its
>> + * sequence number is explicitly reset when calling
>> + * ieee80211_schedule_txq()).
>> + *
> Toke,
>
> Seems like seqno is internal to mac80211 and meant for active_txq list
> manipulation. If so, why would drivers have to worry about increment
> or resetting seqno?
The drivers need to be able to signal when they start a new "scheduling
round" (which is the parameter to next_txq()), and when a queue should
be immediately rescheduled (which is the parameter to schedule_txq()).
See the subsequent patch to ath9k for how this is used; the latter is
signalled when a TXQ successfully dequeued an aggregate...
Now, you're right that the choice to track this via a sequence number is
an implementation detail internal to mac80211... so maybe the parameters
should be called something different.
> IMHO to avoid over serving same txq, two lists (activeq and waitq) can
> be used and always add new txq into waitq list. So that driver will
> not worry about mac80211 txq manipulation. Please correct me If Im
> wrong.
>
> ieee80211_schedule_txq
> - if schedule_order empty, add txq into waitq list tail.
>
> ieee80211_next_txq
> - if activeq empty,
> - move waitq list into activeq
>
> - if activeq not empty
> - fetch appropriate txq from activeq
> - remove txq from activeq list.
>
> - If txq found, return txq else return NULL
Erm, how would this prevent an infinite loop? With this scheme, at some
point, ieee80211_next_txq() removes the last txq from activeq, and
returns that. Then, when it is called again the next time the driver
loops, it's back to the first case (activeq empty, waitq non-empty); so
it'll move waitq back as activeq and start over... Only the driver
really knows when it is starting a logical "loop through all active
TXQs".
Also, for airtime fairness, the queues are not scheduled strictly
round-robin, so the dual-list scheme wouldn't work there either...
-Toke
This adds airtime accounting and scheduling to the mac80211 TXQ
scheduler. A new callback, ieee80211_sta_register_airtime(), is added
that drivers can call to report airtime usage for stations, and an
extended feature flag is added that drivers can use to opt into airtime
fairness scheduling.
When the airtime fairness feature is enabled, mac80211 will schedule TXQs
(through ieee80211_next_txq()) in a way that enforces airtime fairness
between active stations. This scheduling works the same way as the ath9k
in-driver airtime fairness scheduling, but also adds weighted fairness
to support airtime policies.
If the extended feature is not set, the scheduler will default to
round-robin scheduling.
Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
---
include/net/mac80211.h | 28 ++++++++++++++++++++
include/uapi/linux/nl80211.h | 3 ++
net/mac80211/debugfs.c | 3 ++
net/mac80211/debugfs_sta.c | 35 +++++++++++++++++++++++++
net/mac80211/ieee80211_i.h | 3 ++
net/mac80211/main.c | 2 +
net/mac80211/sta_info.c | 24 +++++++++++++++++
net/mac80211/sta_info.h | 17 ++++++++++++
net/mac80211/tx.c | 60 ++++++++++++++++++++++++++++++++----------
9 files changed, 161 insertions(+), 14 deletions(-)
diff --git a/include/net/mac80211.h b/include/net/mac80211.h
index 18e43193b614..17759d55b7d4 100644
--- a/include/net/mac80211.h
+++ b/include/net/mac80211.h
@@ -5291,6 +5291,34 @@ void ieee80211_sta_eosp(struct ieee80211_sta *pubsta);
*/
void ieee80211_send_eosp_nullfunc(struct ieee80211_sta *pubsta, int tid);
+/**
+ * ieee80211_sta_register_airtime - register airtime usage for a sta/tid
+ *
+ * Register airtime usage for a given sta on a given tid. The driver can call
+ * this function to notify mac80211 that a station used a certain amount of
+ * airtime. This information will be used by the TXQ scheduler to schedule
+ * stations in a way that ensures airtime fairness.
+ *
+ * The reported airtime should as a minimum include all time that is spent
+ * transmitting to the remote station, including overhead and padding, but not
+ * including time spent waiting for a TXOP. If the time is not reported by the
+ * hardware it can in some cases be calculated from the rate and known frame
+ * composition. When possible, the time should include any failed transmission
+ * attempts.
+ *
+ * The driver can either call this function synchronously for every packet or
+ * aggregate, or asynchronously as airtime usage information becomes available.
+ * TX and RX airtime can be reported together, or separately by setting one of
+ * them to 0.
+ *
+ * @pubsta: the station
+ * @tid: the TID to register airtime for
+ * @tx_airtime: airtime used during TX (in usec)
+ * @rx_airtime: airtime used during RX (in usec)
+ */
+void ieee80211_sta_register_airtime(struct ieee80211_sta *pubsta, u8 tid,
+ u32 tx_airtime, u32 rx_airtime);
+
/**
* ieee80211_iter_keys - iterate keys programmed into the device
* @hw: pointer obtained from ieee80211_alloc_hw()
diff --git a/include/uapi/linux/nl80211.h b/include/uapi/linux/nl80211.h
index 7acc16f34942..28550d01948f 100644
--- a/include/uapi/linux/nl80211.h
+++ b/include/uapi/linux/nl80211.h
@@ -5223,6 +5223,8 @@ enum nl80211_feature_flags {
* @NL80211_EXT_FEATURE_SCAN_MIN_PREQ_CONTENT: Driver/device can omit all data
* except for supported rates from the probe request content if requested
* by the %NL80211_SCAN_FLAG_MIN_PREQ_CONTENT flag.
+ * @NL80211_EXT_FEATURE_AIRTIME_FAIRNESS: Driver supports airtime fairness
+ * scheduling.
*
* @NUM_NL80211_EXT_FEATURES: number of extended features.
* @MAX_NL80211_EXT_FEATURES: highest extended feature index.
@@ -5259,6 +5261,7 @@ enum nl80211_ext_feature_index {
NL80211_EXT_FEATURE_TXQS,
NL80211_EXT_FEATURE_SCAN_RANDOM_SN,
NL80211_EXT_FEATURE_SCAN_MIN_PREQ_CONTENT,
+ NL80211_EXT_FEATURE_AIRTIME_FAIRNESS,
/* add new features before the definition below */
NUM_NL80211_EXT_FEATURES,
diff --git a/net/mac80211/debugfs.c b/net/mac80211/debugfs.c
index b5adf3625d16..08d112dc8bd6 100644
--- a/net/mac80211/debugfs.c
+++ b/net/mac80211/debugfs.c
@@ -379,6 +379,9 @@ void debugfs_hw_add(struct ieee80211_local *local)
if (local->ops->wake_tx_queue)
DEBUGFS_ADD_MODE(aqm, 0600);
+ debugfs_create_u16("airtime_flags", 0600,
+ phyd, &local->airtime_flags);
+
statsd = debugfs_create_dir("statistics", phyd);
/* if the dir failed, don't put all the other things into the root! */
diff --git a/net/mac80211/debugfs_sta.c b/net/mac80211/debugfs_sta.c
index 4105081dc1df..de4067bc11cd 100644
--- a/net/mac80211/debugfs_sta.c
+++ b/net/mac80211/debugfs_sta.c
@@ -192,6 +192,37 @@ static ssize_t sta_aqm_read(struct file *file, char __user *userbuf,
}
STA_OPS(aqm);
+static ssize_t sta_airtime_read(struct file *file, char __user *userbuf,
+ size_t count, loff_t *ppos)
+{
+ struct sta_info *sta = file->private_data;
+ struct ieee80211_local *local = sta->sdata->local;
+ size_t bufsz = 200;
+ char *buf = kzalloc(bufsz, GFP_KERNEL), *p = buf;
+ ssize_t rv;
+
+ if (!buf)
+ return -ENOMEM;
+
+ spin_lock_bh(&local->active_txq_lock);
+
+ p += scnprintf(p, bufsz + buf - p,
+ "RX: %llu us\nTX: %llu us\n"
+ "Deficit: VO: %lld us VI: %lld us BE: %lld us BK: %lld us\n",
+ sta->airtime.rx_airtime,
+ sta->airtime.tx_airtime,
+ sta->airtime.deficit[0],
+ sta->airtime.deficit[1],
+ sta->airtime.deficit[2],
+ sta->airtime.deficit[3]);
+
+ spin_unlock_bh(&local->active_txq_lock);
+ rv = simple_read_from_buffer(userbuf, count, ppos, buf, p - buf);
+ kfree(buf);
+ return rv;
+}
+STA_OPS(airtime);
+
static ssize_t sta_agg_status_read(struct file *file, char __user *userbuf,
size_t count, loff_t *ppos)
{
@@ -546,6 +577,10 @@ void ieee80211_sta_debugfs_add(struct sta_info *sta)
if (local->ops->wake_tx_queue)
DEBUGFS_ADD(aqm);
+ if (wiphy_ext_feature_isset(local->hw.wiphy,
+ NL80211_EXT_FEATURE_AIRTIME_FAIRNESS))
+ DEBUGFS_ADD(airtime);
+
if (sizeof(sta->driver_buffered_tids) == sizeof(u32))
debugfs_create_x32("driver_buffered_tids", 0400,
sta->debugfs_dir,
diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h
index 6bd68639c699..c1f8b9f6128d 100644
--- a/net/mac80211/ieee80211_i.h
+++ b/net/mac80211/ieee80211_i.h
@@ -1133,6 +1133,9 @@ struct ieee80211_local {
struct list_head active_txqs;
u16 schedule_seqno;
+ u16 airtime_flags;
+ u16 airtime_quantum;
+
const struct ieee80211_ops *ops;
/*
diff --git a/net/mac80211/main.c b/net/mac80211/main.c
index abaca5e1a59f..e597b22ce403 100644
--- a/net/mac80211/main.c
+++ b/net/mac80211/main.c
@@ -636,6 +636,8 @@ struct ieee80211_hw *ieee80211_alloc_hw_nm(size_t priv_data_len,
INIT_LIST_HEAD(&local->active_txqs);
spin_lock_init(&local->active_txq_lock);
+ local->airtime_flags = AIRTIME_USE_TX | AIRTIME_USE_RX;
+ local->airtime_quantum = IEEE80211_AIRTIME_QUANTUM;
INIT_LIST_HEAD(&local->chanctx_list);
mutex_init(&local->chanctx_mtx);
diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index bb92bf516b6b..4e1e2628fe7d 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -384,6 +384,7 @@ struct sta_info *sta_info_alloc(struct ieee80211_sub_if_data *sdata,
for (i = 0; i < IEEE80211_NUM_ACS; i++) {
skb_queue_head_init(&sta->ps_tx_buf[i]);
skb_queue_head_init(&sta->tx_filtered[i]);
+ sta->airtime.deficit[i] = local->airtime_quantum;
}
for (i = 0; i < IEEE80211_NUM_TIDS; i++)
@@ -427,6 +428,8 @@ struct sta_info *sta_info_alloc(struct ieee80211_sub_if_data *sdata,
sta->cparams.interval = MS2TIME(100);
sta->cparams.ecn = true;
+ sta->airtime.weight = 1;
+
sta_dbg(sdata, "Allocated STA %pM\n", sta->sta.addr);
return sta;
@@ -1824,6 +1827,27 @@ void ieee80211_sta_set_buffered(struct ieee80211_sta *pubsta,
}
EXPORT_SYMBOL(ieee80211_sta_set_buffered);
+void ieee80211_sta_register_airtime(struct ieee80211_sta *pubsta, u8 tid,
+ u32 tx_airtime, u32 rx_airtime)
+{
+ struct sta_info *sta = container_of(pubsta, struct sta_info, sta);
+ struct ieee80211_local *local = sta->sdata->local;
+ u8 ac = ieee80211_ac_from_tid(tid);
+ u32 airtime = 0;
+
+ if (sta->local->airtime_flags & AIRTIME_USE_TX)
+ airtime += tx_airtime;
+ if (sta->local->airtime_flags & AIRTIME_USE_RX)
+ airtime += rx_airtime;
+
+ spin_lock_bh(&local->active_txq_lock);
+ sta->airtime.tx_airtime += tx_airtime;
+ sta->airtime.rx_airtime += rx_airtime;
+ sta->airtime.deficit[ac] -= airtime;
+ spin_unlock_bh(&local->active_txq_lock);
+}
+EXPORT_SYMBOL(ieee80211_sta_register_airtime);
+
int sta_info_move_state(struct sta_info *sta,
enum ieee80211_sta_state new_state)
{
diff --git a/net/mac80211/sta_info.h b/net/mac80211/sta_info.h
index 9a04327d71d1..d87a6c31dcb6 100644
--- a/net/mac80211/sta_info.h
+++ b/net/mac80211/sta_info.h
@@ -127,6 +127,21 @@ enum ieee80211_agg_stop_reason {
AGG_STOP_DESTROY_STA,
};
+/* How much to increase airtime deficit on each scheduling round, by default
+ * (userspace can change this per phy)
+ */
+#define IEEE80211_AIRTIME_QUANTUM 300 /* usec */
+/* Debugfs flags to enable/disable use of RX/TX airtime in scheduler */
+#define AIRTIME_USE_TX BIT(0)
+#define AIRTIME_USE_RX BIT(1)
+
+struct airtime_info {
+ u64 rx_airtime;
+ u64 tx_airtime;
+ s64 deficit[IEEE80211_NUM_ACS];
+ u16 weight;
+};
+
struct sta_info;
/**
@@ -563,6 +578,8 @@ struct sta_info {
} tx_stats;
u16 tid_seq[IEEE80211_QOS_CTL_TID_MASK + 1];
+ struct airtime_info airtime;
+
/*
* Aggregation information, locked with lock.
*/
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index 24766566a380..202717924ac3 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -3590,7 +3590,21 @@ bool ieee80211_schedule_txq(struct ieee80211_hw *hw,
spin_lock_bh(&local->active_txq_lock);
if (list_empty(&txqi->schedule_order)) {
- list_add_tail(&txqi->schedule_order, &local->active_txqs);
+ /* If airtime accounting is active, always enqueue STAs at the
+ * head of the list to ensure that they only get moved to the
+ * back by the airtime DRR scheduler once they have a negative
+ * deficit. A station that already has a negative deficit will
+ * get immediately moved to the back of the list on the next
+ * call to ieee80211_next_txq().
+ */
+ if (wiphy_ext_feature_isset(local->hw.wiphy,
+ NL80211_EXT_FEATURE_AIRTIME_FAIRNESS)
+ && txqi->txq.sta)
+ list_add(&txqi->schedule_order, &local->active_txqs);
+ else
+ list_add_tail(&txqi->schedule_order,
+ &local->active_txqs);
+
if (reset_seqno)
txqi->schedule_seqno = local->schedule_seqno - 1;
ret = true;
@@ -3602,6 +3616,17 @@ bool ieee80211_schedule_txq(struct ieee80211_hw *hw,
}
EXPORT_SYMBOL(ieee80211_schedule_txq);
+static inline struct txq_info *find_txqi(struct list_head *head, s8 ac)
+{
+ struct txq_info *txqi;
+
+ list_for_each_entry(txqi, head, schedule_order) {
+ if (ac < 0 || txqi->txq.ac == ac)
+ return txqi;
+ }
+ return NULL;
+}
+
struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, s8 ac,
bool inc_seqno)
{
@@ -3613,24 +3638,31 @@ struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, s8 ac,
if (inc_seqno)
local->schedule_seqno++;
- if (list_empty(&local->active_txqs))
+begin:
+ txqi = find_txqi(&local->active_txqs, ac);
+ if (!txqi)
goto out;
- list_for_each_entry(txqi, &local->active_txqs, schedule_order) {
- if (ac < 0 || txqi->txq.ac == ac) {
- /* If seqnos are equal, we've seen this txqi before in
- * this scheduling round, so abort.
- */
- if (txqi->schedule_seqno == local->schedule_seqno)
- txqi = NULL;
- else
- list_del_init(&txqi->schedule_order);
- goto out;
+ if (txqi->txq.sta) {
+ struct sta_info *sta = container_of(txqi->txq.sta,
+ struct sta_info, sta);
+
+ if (sta->airtime.deficit[txqi->txq.ac] < 0) {
+ sta->airtime.deficit[txqi->txq.ac] +=
+ local->airtime_quantum * sta->airtime.weight;
+ list_move_tail(&txqi->schedule_order,
+ &local->active_txqs);
+ goto begin;
}
}
- /* no txq with requested ac found */
- txqi = NULL;
+ /* If seqnos are equal, we've seen this txqi before in this scheduling
+ * round, so abort.
+ */
+ if (txqi->schedule_seqno == local->schedule_seqno)
+ txqi = NULL;
+ else
+ list_del_init(&txqi->schedule_order);
out:
spin_unlock_bh(&local->active_txq_lock);
Johannes Berg <[email protected]> writes:
> On Wed, 2018-08-29 at 11:27 +0200, Toke H=C3=B8iland-J=C3=B8rgensen wrote:
>
>> Hmm, the problem with a higher weight is that weight*quantum becomes the
>> time each station is scheduled, so having a higher value means higher
>> latency. This could be fixed by replacing the station weights with
>> per-station quantums, but I felt that this was exposing an
>> implementation detail in the API; if we ever change the enforcement
>> mechanism to not be quantum-based (as may be necessary for MU-MIMO for
>> instance), we'll have to convert values in the kernel. Whereas weights
>> are a conceptual measure that is not tied to any particular
>> implementation.
>
> Ok, but that's also an effect you should describe in the API for it.
What's the right place to put that? In the netlink header file?
> Perhaps then it should just be fractional? i.e. 8.8 bits or so?, so
> the default would be 1.0 (0x0100) and then you could scale down to 0.5
> (0x0080) etc?
Hmm, that's an interesting idea. I'll have to run some numbers to see
how the precision holds up on various examples; but that would allow us
to get rid of the quantum in the userspace API entirely, which is a good
thing as far as I'm concerned!
>> For the drivers that get airtime as part of TX completion, sure; but as
>> I understand it, at least ath10k gets airtime information in out of band
>> status reports, so there would need to be a callback for that in any
>> case...
>
> Hmm, ok, but perhaps then we should also tie those to the existing
> airtime things?
Eh? Tie what to which existing airtime things?
-Toke
Johannes Berg <[email protected]> writes:
> On Mon, 2018-07-09 at 18:37 +0200, Toke H=C3=B8iland-J=C3=B8rgensen wrote:
>>=20
>> @@ -427,6 +428,8 @@ struct sta_info *sta_info_alloc(struct ieee80211_sub=
_if_data *sdata,
>> sta->cparams.interval =3D MS2TIME(100);
>> sta->cparams.ecn =3D true;
>>=20=20
>> + sta->airtime.weight =3D 1;
>
> Perhaps it might be useful to start with a higher default (even
> something like 1<<8) as that would allow adjusting up/down single
> stations, without having to adjust all stations and listening to new
> additions to adjust them quickly etc?
>
> Theoretically this doesn't really matter, but from a practical POV it
> may be easier to leave them all at the default and just adjust the ones
> that need adjustment for some reason.
Hmm, the problem with a higher weight is that weight*quantum becomes the
time each station is scheduled, so having a higher value means higher
latency. This could be fixed by replacing the station weights with
per-station quantums, but I felt that this was exposing an
implementation detail in the API; if we ever change the enforcement
mechanism to not be quantum-based (as may be necessary for MU-MIMO for
instance), we'll have to convert values in the kernel. Whereas weights
are a conceptual measure that is not tied to any particular
implementation.
>> ieee80211_sta_register_airtime
>
> Do we really need this? We already have at least TX status with
> airtime, for ieee80211_sta_tx_notify() and friends, and the station
> pointer in that context, so couldn't we piggy-back on this? At least
> WMM-AC already requires the driver to provide this.
For the drivers that get airtime as part of TX completion, sure; but as
I understand it, at least ath10k gets airtime information in out of band
status reports, so there would need to be a callback for that in any
case...
-Toke
Johannes Berg <[email protected]> writes:
> On Wed, 2018-08-29 at 11:22 +0200, Toke H=C3=B8iland-J=C3=B8rgensen wrote:
>
>> > We're also working on adding a TXQ for (bufferable) management packets
>> > - I wonder how that should interact here? Always be scheduled first?
>>=20
>> Ah, cool! It may be that it should be given priority, yeah. Presently,
>> the multicast queue just rotates in the RR with the others, but is never
>> throttled as it doesn't have an airtime measure (though perhaps it
>> should)? But that might not be desirable for management frames, as
>> presumable those need to go out as fast as possible?
>
> Good question. I guess the multicast should have an airtime measure,
> but I don't think we want to do accounting on the management. That
> really should be few frames, and we want them out ASAP in most cases.
Yup, makes sense.
>> Hmm, I seem to recall thinking about just putting the call to
>> schedule_txq() into drv_wake_tx_queue; don't remember why I ended up
>> dropping that; but will take another look at whether it makes sense to
>> combine things.
>
> I was thinking the other way around - but that doesn't work since you'd
> loop from the driver to itself. This way might work, I guess, hadn't
> considered that.
>
> Might be better anyway though to make a new inline that does both, since
> the drv_() calls usually really don't do much on their own (other than
> checks, and in one case backward compatibility code, but still)
ACK, I'll look into that.
-Toke
On Wed, 2018-08-29 at 11:22 +0200, Toke Høiland-Jørgensen wrote:
> > We're also working on adding a TXQ for (bufferable) management packets
> > - I wonder how that should interact here? Always be scheduled first?
>
> Ah, cool! It may be that it should be given priority, yeah. Presently,
> the multicast queue just rotates in the RR with the others, but is never
> throttled as it doesn't have an airtime measure (though perhaps it
> should)? But that might not be desirable for management frames, as
> presumable those need to go out as fast as possible?
Good question. I guess the multicast should have an airtime measure, but
I don't think we want to do accounting on the management. That really
should be few frames, and we want them out ASAP in most cases.
> Hmm, I seem to recall thinking about just putting the call to
> schedule_txq() into drv_wake_tx_queue; don't remember why I ended up
> dropping that; but will take another look at whether it makes sense to
> combine things.
I was thinking the other way around - but that doesn't work since you'd
loop from the driver to itself. This way might work, I guess, hadn't
considered that.
Might be better anyway though to make a new inline that does both, since
the drv_() calls usually really don't do much on their own (other than
checks, and in one case backward compatibility code, but still)
johannes
Johannes Berg <[email protected]> writes:
>> > Perhaps then it should just be fractional? i.e. 8.8 bits or so?, so
>> > the default would be 1.0 (0x0100) and then you could scale down to 0.5
>> > (0x0080) etc?
>>
>> Hmm, that's an interesting idea. I'll have to run some numbers to see
>> how the precision holds up on various examples; but that would allow us
>> to get rid of the quantum in the userspace API entirely, which is a good
>> thing as far as I'm concerned!
>
> :-)
>
> We can always go to 32 bits and then we have lots to play with? It's 16
> bits right now so I picked 8.8 for no real reason - in fact that seems
> quite pointless since dividing 300us by 256 gives you only just over 1us
> which is really short.
>
> So perhaps 12.4 (minimum 18 usec) is more appropriate?
>
> But perhaps I'm just completely mistaken and you don't really want to be
> able to scale the quantum down further at all, and so you have to play
> with all the stations anyway?
>
> No idea, really, how you'd use this.
I have a patch set for hostapd that will allow a user to configure
priorities (I can send you the paper if you want details). The obvious
thing is just to prioritise a specific station by mac address ("give my
TV twice the airtime so it can stream even though it's too far away from
AP"). But the more interesting thing is the dynamic prioritisation
(e.g., "these two SSIDs should share equally no matter how many clients
they have"). For that, group weights need to be calculated and turned
into station weights by dividing with the number of active stations per
group. I've been using integer weights and scaling in userspace, but if
we just express it as fractions in the netlink API that need would go
away.
Hmm, I'll think about which mode is easier :)
>
>> > > For the drivers that get airtime as part of TX completion, sure; but as
>> > > I understand it, at least ath10k gets airtime information in out of band
>> > > status reports, so there would need to be a callback for that in any
>> > > case...
>> >
>> > Hmm, ok, but perhaps then we should also tie those to the existing
>> > airtime things?
>>
>> Eh? Tie what to which existing airtime things?
>
> At least WMM-AC: ieee80211_sta_tx_wmm_ac_notify().
Gotcha; will look at that as well.
-Toke
On Wed, 2018-08-29 at 11:27 +0200, Toke Høiland-Jørgensen wrote:
> Hmm, the problem with a higher weight is that weight*quantum becomes the
> time each station is scheduled, so having a higher value means higher
> latency. This could be fixed by replacing the station weights with
> per-station quantums, but I felt that this was exposing an
> implementation detail in the API; if we ever change the enforcement
> mechanism to not be quantum-based (as may be necessary for MU-MIMO for
> instance), we'll have to convert values in the kernel. Whereas weights
> are a conceptual measure that is not tied to any particular
> implementation.
Ok, but that's also an effect you should describe in the API for it.
Perhaps then it should just be fractional? i.e. 8.8 bits or so?, so the
default would be 1.0 (0x0100) and then you could scale down to 0.5
(0x0080) etc?
> For the drivers that get airtime as part of TX completion, sure; but as
> I understand it, at least ath10k gets airtime information in out of band
> status reports, so there would need to be a callback for that in any
> case...
Hmm, ok, but perhaps then we should also tie those to the existing
airtime things?
johannes
Rather belatedly reviewing this too ...
> + * The driver can't access the queue directly. To dequeue a frame from a
> + * txq, it calls ieee80211_tx_dequeue(). Whenever mac80211 adds a new frame to a
> + * queue, it calls the .wake_tx_queue driver op.
> + *
> + * Drivers can optionally delegate responsibility for scheduling queues to
> + * mac80211, to take advantage of airtime fairness accounting. In this case, to
> + * obtain the next queue to pull frames from, the driver calls
> + * ieee80211_next_txq(). The driver is then expected to re-schedule the txq
> + * using ieee80211_schedule_txq() if it is still active after the driver has
> + * finished pulling packets from it.
Maybe you could clarify what "is still active" means here? I'm guessing
it means "tx_dequeue() would return non-NULL" but I doubt you really
want such a strong requirement, perhaps "the last tx_dequeue() returned
non-NULL" is sufficient?
We're also working on adding a TXQ for (bufferable) management packets -
I wonder how that should interact here? Always be scheduled first?
> +/**
> + * ieee80211_schedule_txq - add txq to scheduling loop
> + *
> + * @hw: pointer as obtained from ieee80211_alloc_hw()
> + * @txq: pointer obtained from station or virtual interface
> + * @reset_seqno: Whether to reset the internal scheduling sequence number,
> + * allowing this txq to appear again in the current scheduling
> + * round (see doc for ieee80211_next_txq()).
> + *
> + * Returns %true if the txq was actually added to the scheduling,
> + * %false otherwise.
> + */
> +bool ieee80211_schedule_txq(struct ieee80211_hw *hw,
> + struct ieee80211_txq *txq,
> + bool reset_seqno);
I concur with Rajkumar in a way that "seqno" is a really bad name for
this since it's so loaded in the context of wifi. He didn't say it this
way, but said it was an "internal to mac80211" concept here, and was
perhaps also/more referring to the manipulations by drivers.
Perhaps calling it something like scheduling_round or something would be
better? That's not really what it is, schedule_id? Even schedule_seq[no]
would be clearer.
> +/**
> + * ieee80211_next_txq - get next tx queue to pull packets from
> + *
> + * @hw: pointer as obtained from ieee80211_alloc_hw()
> + * @ac: filter returned txqs with this AC number. Pass -1 for no filtering.
> + * @inc_seqno: Whether to increase the scheduling sequence number. Setting this
> + * to true signifies the start of a new scheduling round. Each TXQ
> + * will only be returned exactly once in each round (unless its
> + * sequence number is explicitly reset when calling
> + * ieee80211_schedule_txq()).
Here you already talk about "scheduling sequence number" after all :)
> + ieee80211_schedule_txq(&local->hw, &txqi->txq, true);
> drv_wake_tx_queue(local, txqi);
These always seem to come paired - perhaps a helper is useful?
Although it looks like there are sometimes different locking contexts,
not sure if that's really necessary though.
johannes
On Wed, 2018-08-29 at 12:33 +0200, Toke Høiland-Jørgensen wrote:
> > > Hmm, the problem with a higher weight is that weight*quantum becomes the
> > > time each station is scheduled, so having a higher value means higher
> > > latency. This could be fixed by replacing the station weights with
> > > per-station quantums, but I felt that this was exposing an
> > > implementation detail in the API; if we ever change the enforcement
> > > mechanism to not be quantum-based (as may be necessary for MU-MIMO for
> > > instance), we'll have to convert values in the kernel. Whereas weights
> > > are a conceptual measure that is not tied to any particular
> > > implementation.
> >
> > Ok, but that's also an effect you should describe in the API for it.
>
> What's the right place to put that? In the netlink header file?
Right.
> > Perhaps then it should just be fractional? i.e. 8.8 bits or so?, so
> > the default would be 1.0 (0x0100) and then you could scale down to 0.5
> > (0x0080) etc?
>
> Hmm, that's an interesting idea. I'll have to run some numbers to see
> how the precision holds up on various examples; but that would allow us
> to get rid of the quantum in the userspace API entirely, which is a good
> thing as far as I'm concerned!
:-)
We can always go to 32 bits and then we have lots to play with? It's 16
bits right now so I picked 8.8 for no real reason - in fact that seems
quite pointless since dividing 300us by 256 gives you only just over 1us
which is really short.
So perhaps 12.4 (minimum 18 usec) is more appropriate?
But perhaps I'm just completely mistaken and you don't really want to be
able to scale the quantum down further at all, and so you have to play
with all the stations anyway?
No idea, really, how you'd use this.
> > > For the drivers that get airtime as part of TX completion, sure; but as
> > > I understand it, at least ath10k gets airtime information in out of band
> > > status reports, so there would need to be a callback for that in any
> > > case...
> >
> > Hmm, ok, but perhaps then we should also tie those to the existing
> > airtime things?
>
> Eh? Tie what to which existing airtime things?
At least WMM-AC: ieee80211_sta_tx_wmm_ac_notify().
johannes
On Mon, 2018-07-09 at 18:37 +0200, Toke Høiland-Jørgensen wrote:
>
> @@ -427,6 +428,8 @@ struct sta_info *sta_info_alloc(struct ieee80211_sub_if_data *sdata,
> sta->cparams.interval = MS2TIME(100);
> sta->cparams.ecn = true;
>
> + sta->airtime.weight = 1;
Perhaps it might be useful to start with a higher default (even
something like 1<<8) as that would allow adjusting up/down single
stations, without having to adjust all stations and listening to new
additions to adjust them quickly etc?
Theoretically this doesn't really matter, but from a practical POV it
may be easier to leave them all at the default and just adjust the ones
that need adjustment for some reason.
> ieee80211_sta_register_airtime
Do we really need this? We already have at least TX status with airtime,
for ieee80211_sta_tx_notify() and friends, and the station pointer in
that context, so couldn't we piggy-back on this? At least WMM-AC already
requires the driver to provide this.
johannes
Johannes Berg <[email protected]> writes:
> Rather belatedly reviewing this too ...
>
>> + * The driver can't access the queue directly. To dequeue a frame from a
>> + * txq, it calls ieee80211_tx_dequeue(). Whenever mac80211 adds a new frame to a
>> + * queue, it calls the .wake_tx_queue driver op.
>> + *
>> + * Drivers can optionally delegate responsibility for scheduling queues to
>> + * mac80211, to take advantage of airtime fairness accounting. In this case, to
>> + * obtain the next queue to pull frames from, the driver calls
>> + * ieee80211_next_txq(). The driver is then expected to re-schedule the txq
>> + * using ieee80211_schedule_txq() if it is still active after the driver has
>> + * finished pulling packets from it.
>
> Maybe you could clarify what "is still active" means here? I'm
> guessing it means "tx_dequeue() would return non-NULL" but I doubt you
> really want such a strong requirement, perhaps "the last tx_dequeue()
> returned non-NULL" is sufficient?
Yeah, the latter should suffice. Really it should be put back "if it is
not known to be empty", but, well, that doesn't read so well...
> We're also working on adding a TXQ for (bufferable) management packets
> - I wonder how that should interact here? Always be scheduled first?
Ah, cool! It may be that it should be given priority, yeah. Presently,
the multicast queue just rotates in the RR with the others, but is never
throttled as it doesn't have an airtime measure (though perhaps it
should)? But that might not be desirable for management frames, as
presumable those need to go out as fast as possible?
>> +/**
>> + * ieee80211_schedule_txq - add txq to scheduling loop
>> + *
>> + * @hw: pointer as obtained from ieee80211_alloc_hw()
>> + * @txq: pointer obtained from station or virtual interface
>> + * @reset_seqno: Whether to reset the internal scheduling sequence number,
>> + * allowing this txq to appear again in the current scheduling
>> + * round (see doc for ieee80211_next_txq()).
>> + *
>> + * Returns %true if the txq was actually added to the scheduling,
>> + * %false otherwise.
>> + */
>> +bool ieee80211_schedule_txq(struct ieee80211_hw *hw,
>> + struct ieee80211_txq *txq,
>> + bool reset_seqno);
>
> I concur with Rajkumar in a way that "seqno" is a really bad name for
> this since it's so loaded in the context of wifi. He didn't say it this
> way, but said it was an "internal to mac80211" concept here, and was
> perhaps also/more referring to the manipulations by drivers.
>
> Perhaps calling it something like scheduling_round or something would be
> better? That's not really what it is, schedule_id? Even schedule_seq[no]
> would be clearer.
Yeah, definitely going to change that :)
>> +/**
>> + * ieee80211_next_txq - get next tx queue to pull packets from
>> + *
>> + * @hw: pointer as obtained from ieee80211_alloc_hw()
>> + * @ac: filter returned txqs with this AC number. Pass -1 for no filtering.
>> + * @inc_seqno: Whether to increase the scheduling sequence number. Setting this
>> + * to true signifies the start of a new scheduling round. Each TXQ
>> + * will only be returned exactly once in each round (unless its
>> + * sequence number is explicitly reset when calling
>> + * ieee80211_schedule_txq()).
>
> Here you already talk about "scheduling sequence number" after all :)
>
>> + ieee80211_schedule_txq(&local->hw, &txqi->txq, true);
>> drv_wake_tx_queue(local, txqi);
>
> These always seem to come paired - perhaps a helper is useful?
>
> Although it looks like there are sometimes different locking contexts,
> not sure if that's really necessary though.
Hmm, I seem to recall thinking about just putting the call to
schedule_txq() into drv_wake_tx_queue; don't remember why I ended up
dropping that; but will take another look at whether it makes sense to
combine things.
-Toke