2018-09-08 03:05:57

by Toke Høiland-Jørgensen

[permalink] [raw]
Subject: [PATCH RFC v3 0/4] Move TXQ scheduling into mac80211

This is an updated version of the patch series to move TXQ scheduling and
airtime fairness scheduling into mac80211. I've only compile tested this
version, but thought it was better to get the conversation moving instead
of being blocked on me.

This addresses most of the comments from the last round. Specifically:

- It folds in Rajkumar's patches to keep per-AC TXQ lists, and to add an
API that ath10k can use to check if a TXQ may transmit according even
when not using get_next_txq(). I changed a few names and descriptions,
but otherwise it's mostly the same. After the discussions we had in the
last series, I *think* it will work this way, but I'm not entirely sure.

- I got rid of any mention of seqno in next_txq() and schedule_txq() - and
removed the third parameter to schedule_txq() entirely, so drivers can no
longer signal that a TXQ should be allowed to re-appear in a scheduling
round. We can add that back if needed.

- Added a helper function to schedule and wake TXQs in a single call, for
internal mac80211 use.

- Changed the default station weight to 256 and got rid of the per-phy
quantum. This makes it possible to lower station weights without having
to change the weights of every other station.

A few things that were discussed in the last round that I did *not* change:

- I did not add any locking around next_txq(); the driver is still supposed
to maintain a lock that prevents two threads from trying to schedule the
same AC at the same time. This is what drivers already do, so I figured it
was easier to just keep it that way rather than do it in mac80211.

- I didn't get rid of the register_airtime() callback. As far as I can tell,
only iwlwifi uses the tx_time field in the struct tx_info. Which means that
we *could* probably use it for this and just make the other drivers set it;
but I'm not sure what effects that would have in relation to WMM-AC for
those drivers, so I chickened out. Will have to try it out, I guess; but it
also depends on whether ath10k needs to be able to report airtime
asynchronously anyway. So I'll hold off on that for a bit more.

-Toke

---

Toke Høiland-Jørgensen (4):
mac80211: Add TXQ scheduling API
mac80211: Add airtime accounting and scheduling to TXQs
cfg80211: Add airtime statistics and settings
ath9k: Switch to mac80211 TXQ scheduling and airtime APIs


drivers/net/wireless/ath/ath9k/ath9k.h | 14 --
drivers/net/wireless/ath/ath9k/debug.c | 3
drivers/net/wireless/ath/ath9k/debug.h | 8 -
drivers/net/wireless/ath/ath9k/debug_sta.c | 54 ------
drivers/net/wireless/ath/ath9k/init.c | 3
drivers/net/wireless/ath/ath9k/recv.c | 9 -
drivers/net/wireless/ath/ath9k/xmit.c | 245 ++++++++--------------------
include/net/cfg80211.h | 12 +
include/net/mac80211.h | 97 +++++++++++
include/uapi/linux/nl80211.h | 15 ++
net/mac80211/agg-tx.c | 2
net/mac80211/cfg.c | 3
net/mac80211/debugfs.c | 3
net/mac80211/debugfs_sta.c | 42 ++++-
net/mac80211/driver-ops.h | 7 +
net/mac80211/ieee80211_i.h | 11 +
net/mac80211/main.c | 5 +
net/mac80211/sta_info.c | 49 +++++-
net/mac80211/sta_info.h | 13 +
net/mac80211/tx.c | 125 ++++++++++++++
net/wireless/nl80211.c | 25 +++
21 files changed, 471 insertions(+), 274 deletions(-)

X-Clacks-Overhead: GNU Terry Pratchett


2018-09-10 16:16:39

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH RFC v3 2/4] mac80211: Add airtime accounting and scheduling to TXQs

On Mon, 2018-09-10 at 13:13 +0200, Toke Høiland-Jørgensen wrote:
>
> Yeah, this is the API that would be used by drivers where the actual
> scheduling of TXQs is done by the hardware (i.e., ath10k in pull/push
> mode, and iwlwifi if I understand you correctly); whereas the next_txq()
> call is for drivers that do the scheduling in software (ath10k without
> pull/push, ath9k and mt76).
>
> Basically, the way I envision this is the drivers doing something like:
>
> function on_hw_notify(hw, txq) {
> if (ieee80211_airtime_may_transmit(txq)) {
> while (hw_has_space() && (pkt = ieee80211_tx_dequeue(hw, txq)))
> push_to_hw(pkt);
>
> ieee80211_schedule_txq(txq);
> }
> }

Right, we could do that in iwlwifi.

> > > +static bool ieee80211_txq_check_deficit(struct ieee80211_local *local,
> > > + struct txq_info *txqi)
> >
> > Maybe this should be called "has_deficit" or so - the polarity isn't
> > immediately clear to me.
>
> Yup, I suck at naming; gotcha ;)

Common problem :-)

> The move to the end for check_deficit() is what replenishes the airtime
> deficit and ensures fairness.

Ok, yeah, I guess I was reading mostly the first part ("does this have a
deficit") vs. the action on it.

> So that can loop several times in a single
> call to the function.

That I don't understand, check_deficit() doesn't contain any loops? The
caller might, via "goto begin", but not sure what you're saying.

> The schedule_round counter is there to prevent the
> driver from looping indefinitely when doing `while (txq =
> ieee80211_next_txq())`. We'd generally want the deficit replenish to
> happen in any case; previously, the schedule_round type check was in the
> driver; moving it here is an attempt at simplifying the logic needed in
> the driver when using the API.

Right.

Anyway, maybe the naming should be different than "has_deficit" since
that would seem to imply some sort of "checker function" only.

Maybe the most readable thing would be to have an enum return value,
then the function name matters less.

> > Manipulating the list twice like this here seems slightly odd ...
> > perhaps move the list manipulation out of txq_check_deficit() entirely?
> > Clearly it's logically fine, but seems harder than necessary to
> > understand.
> >
> > Also, if the check_deficit() function just returns a value, the renaming
> > I suggested is easier to accept :)
>
> Yeah, maybe; it'll result in some duplication between next_txq() and
> txq_may_transmit() (to the point where I'm not sure if the separate
> check_deficit() function is really needed anymore), but it may be that
> that is actually easier to understand?

Well to be honest ... I hadn't even fully read check_deficit(), but the
naming didn't make it very clear what the contract between it and the
caller is. That's all I'm saying. As long as we clear up that contract a
bit, everything is fine with me.

What you're saying is that has_deficit() doesn't really work since it
actually does something in a few rounds there.

I guess I also got thrown off by "check" since that (to me) implies it's
just checked. Perhaps "adjust_deficit" would be better, and then you
could reverse polarity so that returning true indicates that something
was modified?

johannes

2018-09-10 13:11:17

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH RFC v3 2/4] mac80211: Add airtime accounting and scheduling to TXQs

On Sat, 2018-09-08 at 00:22 +0200, Toke Høiland-Jørgensen wrote:
>
> +/**
> + * ieee80211_airtime_may_transmit - check whether TXQ is allowed to transmit
> + *
> + * This function is used to check whether given txq is allowed to transmit by
> + * the airtime scheduler, and can be used by drivers to access the airtime
> + * fairness accounting without going using the scheduling order enfored by
> + * next_txq().
> + *
> + * Returns %true if the airtime scheduler does not think the TXQ should be
> + * throttled. This function can also have the side effect of rotating the TXQ in
> + * the scheduler rotation, which will eventually bring the deficit to positive
> + * and allow the station to transmit again.
> + *
> + * If this function returns %true, the TXQ will have been removed from the
> + * scheduling. It is the driver's responsibility to add it back (by calling
> + * ieee80211_schedule_txq()) if needed.
> + *
> + * @hw: pointer as obtained from ieee80211_alloc_hw()
> + * @txq: pointer obtained from station or virtual interface
> + *
> + */

Spurious newline there at the end.

(As an aside from the review, perhaps this would be what we could use in
iwlwifi, but I'll need to think about _when_ to add it back to the
scheduling later).

> + * @NL80211_EXT_FEATURE_AIRTIME_FAIRNESS: Driver supports airtime fairness
> + * scheduling.
> + *
> * @NUM_NL80211_EXT_FEATURES: number of extended features.
> * @MAX_NL80211_EXT_FEATURES: highest extended feature index.
> */
> @@ -5269,6 +5272,7 @@ enum nl80211_ext_feature_index {
> NL80211_EXT_FEATURE_SCAN_RANDOM_SN,
> NL80211_EXT_FEATURE_SCAN_MIN_PREQ_CONTENT,
> NL80211_EXT_FEATURE_CAN_REPLACE_PTK0,
> + NL80211_EXT_FEATURE_AIRTIME_FAIRNESS,

Why is this necessary in this patch?

I think that this should probably come with more documentation too,
particularly what's expected of the driver here.

I'd prefer you reorder patches 2 and 3, and define everything in
cfg80211 first. That also makes it easier to reason about what happens
when the feature is not supported (since it will not be supported
anywhere). Then this can be included in the cfg80211 patch instead,
which places it better, and the mac80211 bits from the cfg80211 patch
can be included here, which also places those better.

> - txqi->flags & (1<<IEEE80211_TXQ_STOP) ? "STOP" : "RUN",
> - txqi->flags & (1<<IEEE80211_TXQ_AMPDU) ? " AMPDU" : "",
> - txqi->flags & (1<<IEEE80211_TXQ_NO_AMSDU) ? " NO-AMSDU" : "");
> + txqi->flags & (1 << IEEE80211_TXQ_STOP) ? "STOP" : "RUN",
> + txqi->flags & (1 << IEEE80211_TXQ_AMPDU) ? " AMPDU" : "",
> + txqi->flags & (1 << IEEE80211_TXQ_NO_AMSDU) ? " NO-AMSDU" : "");

consider BIT() instead as a cleanup? :)

(or maybe this is just a leftover from merging some other patches?)

> +static bool ieee80211_txq_check_deficit(struct ieee80211_local *local,
> + struct txq_info *txqi)

Maybe this should be called "has_deficit" or so - the polarity isn't
immediately clear to me.

> +{
> + struct sta_info *sta;
> +
> + lockdep_assert_held(&local->active_txq_lock);
> +
> + if (!txqi->txq.sta)
> + return true;
> +
> + sta = container_of(txqi->txq.sta, struct sta_info, sta);
> +
> + if (sta->airtime.deficit[txqi->txq.ac] > 0)
> + return true;
> +
> + if (txqi == list_first_entry_or_null(&local->active_txqs[txqi->txq.ac],

nit: I don't think you need _or_null here, since list_first_entry() will
"return" the head itself if the list is empty, which cannot match txqi?
Doesn't matter that much though, and if you think this is easier to
understand then that's fine.

> + struct txq_info,
> + schedule_order)) {
> + sta->airtime.deficit[txqi->txq.ac] += sta->airtime.weight;
> + list_move_tail(&txqi->schedule_order,
> + &local->active_txqs[txqi->txq.ac]);
> + }
> +
> + return false;
> +}
> +
> struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, s8 ac,
> bool first)
> {
> @@ -3659,6 +3701,7 @@ struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, s8 ac,
> if (first)
> local->schedule_round[ac]++;
>
> + begin:
> if (list_empty(&local->active_txqs[ac]))
> goto out;
>
> @@ -3666,6 +3709,9 @@ struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, s8 ac,
> struct txq_info,
> schedule_order);
>
> + if (!ieee80211_txq_check_deficit(local, txqi))
> + goto begin;

This code isn't so badly indented - you could use a do { } while () loop
instead?

> if (txqi->schedule_round == local->schedule_round[ac])
> txqi = NULL;

and maybe you want this check first, it's much cheaper?

do {
...
} while (txqi->schedule_round != local->schedule_round[ac] &&
!has_deficit())

or so?

That is to say, I'm not sure why you'd want to abort if you find a queue
that has no deficit but is part of the next round? Or is that the abort
condition for having gone around the whole list once? Hmm. But
check_deficit() also moves them to the end, so you don't really get that
anyway?



> @@ -3681,6 +3727,25 @@ struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, s8 ac,
> }
> EXPORT_SYMBOL(ieee80211_next_txq);
>
> +bool ieee80211_txq_may_transmit(struct ieee80211_hw *hw,
> + struct ieee80211_txq *txq)
> +{
> + struct ieee80211_local *local = hw_to_local(hw);
> + struct txq_info *txqi = to_txq_info(txq);
> + bool may_tx = false;
> +
> + spin_lock_bh(&local->active_txq_lock);
> +
> + if (ieee80211_txq_check_deficit(local, txqi)) {
> + may_tx = true;
> + list_del_init(&txqi->schedule_order);

Manipulating the list twice like this here seems slightly odd ...
perhaps move the list manipulation out of txq_check_deficit() entirely?
Clearly it's logically fine, but seems harder than necessary to
understand.

Also, if the check_deficit() function just returns a value, the renaming
I suggested is easier to accept :)

johannes

2018-09-10 16:08:50

by Toke Høiland-Jørgensen

[permalink] [raw]
Subject: Re: [PATCH RFC v3 3/4] cfg80211: Add airtime statistics and settings

Johannes Berg <[email protected]> writes:

> On Sat, 2018-09-08 at 00:22 +0200, Toke H=C3=B8iland-J=C3=B8rgensen wrote:
>> This adds airtime statistics to the cfg80211 station dump, and also adds=
a new
>> parameter to set the airtime weight of each station. The latter allows u=
serspace
>> to implement policies for different stations by varying their weights.
>
> Maybe some documentation on the weights - both for other implementers as
> well as users - would be useful.
>
>> + * @NL80211_STA_INFO_AIRTIME_WEIGHT: current airtime weight for station
>> + * (u16, usec)
>
> how could the weight be in usec? copy/paste issue?

Well, technically it is a usec value (the amount of airtime added to the
deficit each round); but I agree that it shouldn't be documented as such
if we want freedom to change the mechanism later...

-Toke

2018-09-12 21:29:06

by Rajkumar Manoharan

[permalink] [raw]
Subject: Re: [PATCH RFC v3 2/4] mac80211: Add airtime accounting and scheduling to TXQs

On 2018-09-12 04:10, Toke Høiland-Jørgensen wrote:
> Rajkumar Manoharan <[email protected]> writes:
>
>> On 2018-09-10 04:13, Toke Høiland-Jørgensen wrote:
>>> Johannes Berg <[email protected]> writes:
>>>>> - txqi->flags & (1<<IEEE80211_TXQ_STOP) ? "STOP" : "RUN",
>>>>> - txqi->flags & (1<<IEEE80211_TXQ_AMPDU) ? " AMPDU" : "",
>>>>> - txqi->flags & (1<<IEEE80211_TXQ_NO_AMSDU) ? " NO-AMSDU"
>>>>> :
>>>>> "");
>>>>> + txqi->flags & (1 << IEEE80211_TXQ_STOP) ? "STOP" :
>>>>> "RUN",
>>>>> + txqi->flags & (1 << IEEE80211_TXQ_AMPDU) ? " AMPDU" :
>>>>> "",
>>>>> + txqi->flags & (1 << IEEE80211_TXQ_NO_AMSDU) ? "
>>>>> NO-AMSDU"
>>>>> : "");
>>>>
>>>> consider BIT() instead as a cleanup? :)
>>>>
>>>> (or maybe this is just a leftover from merging some other patches?)
>>>
>>> Yeah, this is a merging artifact; Rajkumar's patch added another
>>> flag,
>>> that I removed again. Didn't notice that there was still a whitespace
>>> change in this patch...
>>>
>> I added the flag based on our last discussion. The driver needs to
>> check
>> txq status for each tx_dequeue(). One time txq check is not sufficient
>> as it allows the driver to dequeue all frames from txq.
>>
>> drv_func() {
>> while (ieee80211_airtime_may_transmit(txq) &&
>> hw_has_space() &&
>> (pkt = ieee80211_tx_dequeue(hw, txq)))
>> push_to_hw(pkt);
>> }
>
> Yeah, but with airtime only being recorded on TX completion, the odds
> of
> the value changing within that loop are quite low; so it's not going to
> work, which is why I removed it.
>
> However, after reading Kan's patches I get where you're coming from; a
> check in tx_dequeue() is needed for the BQL-style queue limiting. Will
> try to incorporate a version of that in the next series so you can see
> what I mean when I say it should be orthogonal; and I'm still not sure
> it needs a flag :)
>
Got it.. Will wait for next version.. thanks.

>>>>> +bool ieee80211_txq_may_transmit(struct ieee80211_hw *hw,
>>>>> + struct ieee80211_txq *txq)
>>>>> +{
>>>>> + struct ieee80211_local *local = hw_to_local(hw);
>>>>> + struct txq_info *txqi = to_txq_info(txq);
>>>>> + bool may_tx = false;
>>>>> +
>>>>> + spin_lock_bh(&local->active_txq_lock);
>>>>> +
>>>>> + if (ieee80211_txq_check_deficit(local, txqi)) {
>>>>> + may_tx = true;
>>>>> + list_del_init(&txqi->schedule_order);
>>>>
>>
>> To handle above case, may_transmit should remove the node only
>> when it is in list.
>>
>> if (list_empty(&txqi->schedule_order))
>> list_del_init(&txqi->schedule_order);
>
> I assume you missed a ! in that if, right? :)
>
Oops.. yes it should be ! :)

-Rajkumar

2018-09-10 15:57:20

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH RFC v3 1/4] mac80211: Add TXQ scheduling API

On Mon, 2018-09-10 at 12:57 +0200, Toke Høiland-Jørgensen wrote:
> Johannes Berg <[email protected]> writes:
>
> > On Sat, 2018-09-08 at 00:22 +0200, Toke Høiland-Jørgensen wrote:
> > >
> > > Usage of the new API is optional, so drivers can be ported one at a time.
> >
> > With the 1:1 hardware queue/txq mapping in iwlwifi (we're close to
> > getting that patch in, though now the Jewish holidays mean a delay),
> > I'm not sure we'd be able to do this at all in iwlwifi. So this may
> > not be a case of porting one at a time until we can get rid of it ...
>
> Could you elaborate a bit more on how the hwq/txq stuff works in iwl?

I can try.

> Does the driver just hand off a TXQ to the hardware on wake_txq(), which
> is then scheduled by the hardware after that? Or how does the mapping to
> hwqs work, and how is the hardware notified that there are still packets
> queued / that new packets have arrived for an already mapped txq?

Basically, we use the TXQ driver data to do this. On initialization, we
set all TXQs to have no hardware queue mapped. Then, when the first wake
happens for this TXQ, we allocate a hardware queue for this RA/TID.

>From then on, right now we basically just try to refill the hardware
queue from the TXQ whenever the hardware releases frames from that
queue. If there are none, we fall back to putting them on the hardware
queue from the wake signal.

> > It would be nice to be able to use it, for better queue behaviour, but
> > it would mean much more accounting in iwlwifi? Not even sure off the
> > top of my head how to do that.
>
> I think we'll need to have some kind of fallback airtime estimation in
> mac80211 that calculates airtime from packet size and rate information.
> Not all drivers can get this from the hardware, it seems. See my reply
> to Kan about the BQL-like behaviour as well.
>
> Does iwl get airtime usage information for every packet on tx complete?

Yes, we do.

johannes

2018-09-10 08:11:40

by Kan Yan

[permalink] [raw]
Subject: Re: [PATCH RFC v3 0/4] Move TXQ scheduling into mac80211

Hi Toke,

It is great to see finally there will be a general airtime fairness
support in mac80211. I feel this patch set could still use some
improvements to make it works better with 11ac drivers like ath10k. I
did a version of airtime fairness in the ath10k driver that used in
Google Wifi and I'd like to share a few things I learned from this
experience.

Airtime fairness is a cool feature, but that's not the motivation for
me to implement that. The goal is to solve the bufferbloat problem at
wireless interface. Fq_CoDel has been proved to be very effective in
fixing bufferbloat for wired interface, and it has been integrated
with mac802.11 for more than 2 years. However, it still does not work
well with 11ac drivers like ath10k, which relies heavily on firmware
to do most of transmit scheduling, and hence has deep firmware queue.
FQ_CoDel expects a thin driver layer queue so it can get an accurate
measurement of sojourn time and use the sojourn time to control
latency. However, the queue in ath10k firmware is so deep, that
packets only stay in mac80211 layer queue transiently. The sojourn
time is almost always below target, and hence CoDel doesn't do much to
control the latency.

The key to make Fq_Codel works effectively with 11ac wireless driver
is to limit the queue depth in lower layer. It looks to me this patch
set only enforce fairness among stations (airtime wise), but did not
limit how many packets can be released to firmware/hardware. When a
txq run out of its airtime deficit, the txq is put to the back of the
list but will get a refill of airtime deficit in next round. It
doesn't keep counts of the total pending airtime released to hardware
or firmware. Packets will be released as long as wireless driver keeps
calling the dequeue function. This maybe not an issue for ath9k,
because the number of available tx descriptors is more limited.
However, for 11ac drivers like ath10k, it routinely queues thousands
packets when the bandwidth is oversubscribed in my test.

The version I did is to use pending airtime to restrict the firmware
queue depth. Another difference is it does not use airtime reported
from firmware, which is "past" or spent airtime, but use "future"
airtime, the estimated airtime calculated using last reported tx rate
for the inflight packets pending in the firmware queue. Host driver
keeps counts of total pending airtime for each TxQ (per station, per
AC). It tries to release up to 4-8 ms worth of frames for each TxQ.
The idea is to use airtime accounting to give firmware just enough
frames to keep it busy and form good aggregations, and keeps the rest
of frames in mac80211 layer queues so Fq_Codel can work on it to
control latency.

Here are some of the patches from the version I did in ath10k for
kernel in ChromeOs.
CHROMIUM: net: mac80211: Recording last tx bitrate.
https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/=
588189

CHROMIUM: ath10k: Implementing airtime fairness based TX scheduler.
https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/=
588190/13

CHROMIUM: ath10k: use frame count as defict for fq_codel.
https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/=
588192/13

I had some discussions with Rajkumar on how to adapt some methods from
my patch and make it fit with the new TXQ scheduling API proposed
here. The major issue is the lack of pending airtime accounting. It
could be done in individual wireless drivers, such as in ath10k, or
even in firmware, but implement it in mac80211 could be a better
choice than do it one driver at a time.

Thanks,
Kan

On Fri, Sep 7, 2018 at 3:22 PM, Toke H=C3=B8iland-J=C3=B8rgensen <toke@toke=
.dk> wrote:
> This is an updated version of the patch series to move TXQ scheduling and
> airtime fairness scheduling into mac80211. I've only compile tested this
> version, but thought it was better to get the conversation moving instead
> of being blocked on me.
>
> This addresses most of the comments from the last round. Specifically:
>
> - It folds in Rajkumar's patches to keep per-AC TXQ lists, and to add an
> API that ath10k can use to check if a TXQ may transmit according even
> when not using get_next_txq(). I changed a few names and descriptions,
> but otherwise it's mostly the same. After the discussions we had in the
> last series, I *think* it will work this way, but I'm not entirely sure=
.
>
> - I got rid of any mention of seqno in next_txq() and schedule_txq() - an=
d
> removed the third parameter to schedule_txq() entirely, so drivers can =
no
> longer signal that a TXQ should be allowed to re-appear in a scheduling
> round. We can add that back if needed.
>
> - Added a helper function to schedule and wake TXQs in a single call, for
> internal mac80211 use.
>
> - Changed the default station weight to 256 and got rid of the per-phy
> quantum. This makes it possible to lower station weights without having
> to change the weights of every other station.
>
> A few things that were discussed in the last round that I did *not* chang=
e:
>
> - I did not add any locking around next_txq(); the driver is still suppos=
ed
> to maintain a lock that prevents two threads from trying to schedule th=
e
> same AC at the same time. This is what drivers already do, so I figured=
it
> was easier to just keep it that way rather than do it in mac80211.
>
> - I didn't get rid of the register_airtime() callback. As far as I can te=
ll,
> only iwlwifi uses the tx_time field in the struct tx_info. Which means =
that
> we *could* probably use it for this and just make the other drivers set=
it;
> but I'm not sure what effects that would have in relation to WMM-AC for
> those drivers, so I chickened out. Will have to try it out, I guess; bu=
t it
> also depends on whether ath10k needs to be able to report airtime
> asynchronously anyway. So I'll hold off on that for a bit more.
>
> -Toke
>
> ---
>
> Toke H=C3=B8iland-J=C3=B8rgensen (4):
> mac80211: Add TXQ scheduling API
> mac80211: Add airtime accounting and scheduling to TXQs
> cfg80211: Add airtime statistics and settings
> ath9k: Switch to mac80211 TXQ scheduling and airtime APIs
>
>
> drivers/net/wireless/ath/ath9k/ath9k.h | 14 --
> drivers/net/wireless/ath/ath9k/debug.c | 3
> drivers/net/wireless/ath/ath9k/debug.h | 8 -
> drivers/net/wireless/ath/ath9k/debug_sta.c | 54 ------
> drivers/net/wireless/ath/ath9k/init.c | 3
> drivers/net/wireless/ath/ath9k/recv.c | 9 -
> drivers/net/wireless/ath/ath9k/xmit.c | 245 ++++++++--------------=
------
> include/net/cfg80211.h | 12 +
> include/net/mac80211.h | 97 +++++++++++
> include/uapi/linux/nl80211.h | 15 ++
> net/mac80211/agg-tx.c | 2
> net/mac80211/cfg.c | 3
> net/mac80211/debugfs.c | 3
> net/mac80211/debugfs_sta.c | 42 ++++-
> net/mac80211/driver-ops.h | 7 +
> net/mac80211/ieee80211_i.h | 11 +
> net/mac80211/main.c | 5 +
> net/mac80211/sta_info.c | 49 +++++-
> net/mac80211/sta_info.h | 13 +
> net/mac80211/tx.c | 125 ++++++++++++++
> net/wireless/nl80211.c | 25 +++
> 21 files changed, 471 insertions(+), 274 deletions(-)
>
> X-Clacks-Overhead: GNU Terry Pratchett

2018-09-10 12:41:30

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH RFC v3 1/4] mac80211: Add TXQ scheduling API

On Sat, 2018-09-08 at 00:22 +0200, Toke Høiland-Jørgensen wrote:
>
> Usage of the new API is optional, so drivers can be ported one at a time.

With the 1:1 hardware queue/txq mapping in iwlwifi (we're close to
getting that patch in, though now the Jewish holidays mean a delay), I'm
not sure we'd be able to do this at all in iwlwifi. So this may not be a
case of porting one at a time until we can get rid of it ...

It would be nice to be able to use it, for better queue behaviour, but
it would mean much more accounting in iwlwifi? Not even sure off the top
of my head how to do that.

johannes

2018-09-10 16:11:26

by Toke Høiland-Jørgensen

[permalink] [raw]
Subject: Re: [PATCH RFC v3 0/4] Move TXQ scheduling into mac80211

Johannes Berg <[email protected]> writes:

> On Sat, 2018-09-08 at 00:22 +0200, Toke H=C3=B8iland-J=C3=B8rgensen wrote:
>>=20
>> A few things that were discussed in the last round that I did *not* chan=
ge:
>
> Thanks for this list btw.
>
>> - I did not add any locking around next_txq(); the driver is still suppo=
sed
>> to maintain a lock that prevents two threads from trying to schedule t=
he
>> same AC at the same time. This is what drivers already do, so I figure=
d it
>> was easier to just keep it that way rather than do it in mac80211.
>
> I'll look at this in the code, but from a maintainer perspective I'm
> somewhat worried that this will lead to issues that are really the
> driver's fault, but surface in mac80211. I don't know how easy it
> would be to catch that.

Yeah, I get what you mean. The alternative would be to have a
ieee80211_start_schedule(ac) and ieee80211_end_schedule(ac), which
basically just takes a lock. Would mean we could get rid of the 'first'
parameter for next_txq(), so might not be such a bad idea; and if the
driver has its own locking the extra locking in mac80211 would just be
an always-uncontested spinlock, which shouldn't be much overhead, right?

-Toke

2018-09-08 03:05:59

by Toke Høiland-Jørgensen

[permalink] [raw]
Subject: [PATCH RFC v3 3/4] cfg80211: Add airtime statistics and settings

This adds airtime statistics to the cfg80211 station dump, and also adds a new
parameter to set the airtime weight of each station. The latter allows userspace
to implement policies for different stations by varying their weights.

Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
---
include/net/cfg80211.h | 12 ++++++++++--
include/uapi/linux/nl80211.h | 11 +++++++++++
net/mac80211/cfg.c | 3 +++
net/mac80211/sta_info.c | 15 +++++++++++++++
net/mac80211/sta_info.h | 3 ---
net/wireless/nl80211.c | 25 +++++++++++++++++++++++++
6 files changed, 64 insertions(+), 5 deletions(-)

diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h
index 9f3ed79c39d7..6fa4a97eefee 100644
--- a/include/net/cfg80211.h
+++ b/include/net/cfg80211.h
@@ -990,6 +990,7 @@ enum station_parameters_apply_mask {
* @support_p2p_ps: information if station supports P2P PS mechanism
* @he_capa: HE capabilities of station
* @he_capa_len: the length of the HE capabilities
+ * @airtime_weight: airtime scheduler weight for this station
*/
struct station_parameters {
const u8 *supported_rates;
@@ -1019,6 +1020,7 @@ struct station_parameters {
int support_p2p_ps;
const struct ieee80211_he_cap_elem *he_capa;
u8 he_capa_len;
+ u16 airtime_weight;
};

/**
@@ -1286,6 +1288,8 @@ struct cfg80211_tid_stats {
* @rx_beacon_signal_avg: signal strength average (in dBm) for beacons received
* from this peer
* @rx_duration: aggregate PPDU duration(usecs) for all the frames from a peer
+ * @tx_duration: aggregate PPDU duration(usecs) for all the frames to a peer
+ * @airtime_weight: current airtime scheduling weight
* @pertid: per-TID statistics, see &struct cfg80211_tid_stats, using the last
* (IEEE80211_NUM_TIDS) index for MSDUs not encapsulated in QoS-MPDUs.
* Note that this doesn't use the @filled bit, but is used if non-NULL.
@@ -1332,10 +1336,12 @@ struct station_info {

u32 expected_throughput;

- u64 rx_beacon;
+ u64 tx_duration;
u64 rx_duration;
+ u64 rx_beacon;
u8 rx_beacon_signal_avg;
struct cfg80211_tid_stats *pertid;
+ u16 airtime_weight;
s8 ack_signal;
s8 avg_ack_signal;
};
@@ -2349,7 +2355,7 @@ enum cfg80211_connect_params_changed {
* @WIPHY_PARAM_DYN_ACK: dynack has been enabled
* @WIPHY_PARAM_TXQ_LIMIT: TXQ packet limit has been changed
* @WIPHY_PARAM_TXQ_MEMORY_LIMIT: TXQ memory limit has been changed
- * @WIPHY_PARAM_TXQ_QUANTUM: TXQ scheduler quantum
+ * @WIPHY_PARAM_TXQ_QUANTUM: TXQ scheduler quantum (bytes)
*/
enum wiphy_params_flags {
WIPHY_PARAM_RETRY_SHORT = 1 << 0,
@@ -2363,6 +2369,8 @@ enum wiphy_params_flags {
WIPHY_PARAM_TXQ_QUANTUM = 1 << 8,
};

+#define IEEE80211_DEFAULT_AIRTIME_WEIGHT 256
+
/**
* struct cfg80211_pmksa - PMK Security Association
*
diff --git a/include/uapi/linux/nl80211.h b/include/uapi/linux/nl80211.h
index 6b2b61646b6a..009c2e7b762f 100644
--- a/include/uapi/linux/nl80211.h
+++ b/include/uapi/linux/nl80211.h
@@ -2241,6 +2241,9 @@ enum nl80211_commands {
* association request when used with NL80211_CMD_NEW_STATION). Can be set
* only if %NL80211_STA_FLAG_WME is set.
*
+ * @NL80211_ATTR_AIRTIME_WEIGHT: Station's weight when scheduled by the airtime
+ * scheduler.
+ *
* @NUM_NL80211_ATTR: total number of nl80211_attrs available
* @NL80211_ATTR_MAX: highest attribute number currently defined
* @__NL80211_ATTR_AFTER_LAST: internal use
@@ -2682,6 +2685,8 @@ enum nl80211_attrs {

NL80211_ATTR_HE_CAPABILITY,

+ NL80211_ATTR_AIRTIME_WEIGHT,
+
/* add attributes here, update the policy in nl80211.c */

__NL80211_ATTR_AFTER_LAST,
@@ -3051,6 +3056,10 @@ enum nl80211_sta_bss_param {
* @NL80211_STA_INFO_PAD: attribute used for padding for 64-bit alignment
* @NL80211_STA_INFO_ACK_SIGNAL: signal strength of the last ACK frame(u8, dBm)
* @NL80211_STA_INFO_ACK_SIGNAL_AVG: avg signal strength of ACK frames (s8, dBm)
+ * @NL80211_STA_INFO_TX_DURATION: aggregate PPDU duration for all frames
+ * sent to the station (u64, usec)
+ * @NL80211_STA_INFO_AIRTIME_WEIGHT: current airtime weight for station
+ * (u16, usec)
* @__NL80211_STA_INFO_AFTER_LAST: internal
* @NL80211_STA_INFO_MAX: highest possible station info attribute
*/
@@ -3091,6 +3100,8 @@ enum nl80211_sta_info {
NL80211_STA_INFO_PAD,
NL80211_STA_INFO_ACK_SIGNAL,
NL80211_STA_INFO_ACK_SIGNAL_AVG,
+ NL80211_STA_INFO_TX_DURATION,
+ NL80211_STA_INFO_AIRTIME_WEIGHT,

/* keep last */
__NL80211_STA_INFO_AFTER_LAST,
diff --git a/net/mac80211/cfg.c b/net/mac80211/cfg.c
index 504627e2117f..8bd7c917a7b5 100644
--- a/net/mac80211/cfg.c
+++ b/net/mac80211/cfg.c
@@ -1388,6 +1388,9 @@ static int sta_apply_parameters(struct ieee80211_local *local,
if (ieee80211_vif_is_mesh(&sdata->vif))
sta_apply_mesh_params(local, sta, params);

+ if (params->airtime_weight)
+ sta->airtime.weight = params->airtime_weight;
+
/* set the STA state after all sta info from usermode has been set */
if (test_sta_flag(sta, WLAN_STA_TDLS_PEER) ||
set & BIT(NL80211_STA_FLAG_ASSOCIATED)) {
diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index aba13f88bc86..b09d4f164797 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -2220,6 +2220,21 @@ void sta_set_sinfo(struct sta_info *sta, struct station_info *sinfo,
sinfo->filled |= BIT_ULL(NL80211_STA_INFO_TX_FAILED);
}

+ if (!(sinfo->filled & BIT(NL80211_STA_INFO_RX_DURATION))) {
+ sinfo->rx_duration = sta->airtime.rx_airtime;
+ sinfo->filled |= BIT(NL80211_STA_INFO_RX_DURATION);
+ }
+
+ if (!(sinfo->filled & BIT(NL80211_STA_INFO_TX_DURATION))) {
+ sinfo->tx_duration = sta->airtime.tx_airtime;
+ sinfo->filled |= BIT(NL80211_STA_INFO_TX_DURATION);
+ }
+
+ if (!(sinfo->filled & BIT(NL80211_STA_INFO_AIRTIME_WEIGHT))) {
+ sinfo->airtime_weight = sta->airtime.weight;
+ sinfo->filled |= BIT(NL80211_STA_INFO_AIRTIME_WEIGHT);
+ }
+
sinfo->rx_dropped_misc = sta->rx_stats.dropped;
if (sta->pcpu_rx_stats) {
for_each_possible_cpu(cpu) {
diff --git a/net/mac80211/sta_info.h b/net/mac80211/sta_info.h
index 97fff032eee5..d8210ebf0cdd 100644
--- a/net/mac80211/sta_info.h
+++ b/net/mac80211/sta_info.h
@@ -127,9 +127,6 @@ enum ieee80211_agg_stop_reason {
AGG_STOP_DESTROY_STA,
};

-/* How much to increase airtime deficit on each scheduling round, by default
- * (userspace can change this per phy) */
-#define IEEE80211_DEFAULT_AIRTIME_WEIGHT 256
/* Debugfs flags to enable/disable use of RX/TX airtime in scheduler */
#define AIRTIME_USE_TX BIT(0)
#define AIRTIME_USE_RX BIT(1)
diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
index d5f9b5235cdd..4402d6f0ba03 100644
--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -430,6 +430,8 @@ static const struct nla_policy nl80211_policy[NUM_NL80211_ATTR] = {
[NL80211_ATTR_TXQ_QUANTUM] = { .type = NLA_U32 },
[NL80211_ATTR_HE_CAPABILITY] = { .type = NLA_BINARY,
.len = NL80211_HE_MAX_CAPABILITY_LEN },
+
+ [NL80211_ATTR_AIRTIME_WEIGHT] = { .type = NLA_U16 },
};

/* policy for the key attributes */
@@ -4656,6 +4658,11 @@ static int nl80211_send_station(struct sk_buff *msg, u32 cmd, u32 portid,
PUT_SINFO(PLID, plid, u16);
PUT_SINFO(PLINK_STATE, plink_state, u8);
PUT_SINFO_U64(RX_DURATION, rx_duration);
+ PUT_SINFO_U64(TX_DURATION, tx_duration);
+
+ if (wiphy_ext_feature_isset(&rdev->wiphy,
+ NL80211_EXT_FEATURE_AIRTIME_FAIRNESS))
+ PUT_SINFO(AIRTIME_WEIGHT, airtime_weight, u16);

switch (rdev->wiphy.signal_type) {
case CFG80211_SIGNAL_TYPE_MBM:
@@ -5293,6 +5300,17 @@ static int nl80211_set_station(struct sk_buff *skb, struct genl_info *info)
nla_get_u8(info->attrs[NL80211_ATTR_OPMODE_NOTIF]);
}

+ if (info->attrs[NL80211_ATTR_AIRTIME_WEIGHT]) {
+ if (!wiphy_ext_feature_isset(&rdev->wiphy,
+ NL80211_EXT_FEATURE_AIRTIME_FAIRNESS))
+ return -EOPNOTSUPP;
+
+ params.airtime_weight =
+ nla_get_u16(info->attrs[NL80211_ATTR_AIRTIME_WEIGHT]);
+ if (!params.airtime_weight)
+ return -EINVAL;
+ }
+
/* Include parameters for TDLS peer (will check later) */
err = nl80211_set_station_tdls(info, &params);
if (err)
@@ -5431,6 +5449,13 @@ static int nl80211_new_station(struct sk_buff *skb, struct genl_info *info)
return -EINVAL;
}

+ if (info->attrs[NL80211_ATTR_AIRTIME_WEIGHT]) {
+ params.airtime_weight =
+ nla_get_u16(info->attrs[NL80211_ATTR_AIRTIME_WEIGHT]);
+ if (!params.airtime_weight)
+ return -EINVAL;
+ }
+
err = nl80211_parse_sta_channel_info(info, &params);
if (err)
return err;

2018-09-10 12:44:59

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH RFC v3 0/4] Move TXQ scheduling into mac80211

On Sat, 2018-09-08 at 00:22 +0200, Toke Høiland-Jørgensen wrote:
>
> A few things that were discussed in the last round that I did *not* change:

Thanks for this list btw.

> - I did not add any locking around next_txq(); the driver is still supposed
> to maintain a lock that prevents two threads from trying to schedule the
> same AC at the same time. This is what drivers already do, so I figured it
> was easier to just keep it that way rather than do it in mac80211.

I'll look at this in the code, but from a maintainer perspective I'm
somewhat worried that this will lead to issues that are really the
driver's fault, but surface in mac80211. I don't know how easy it would
be to catch that.

Might be fine, but something to keep in mind.

johannes

2018-09-08 03:05:58

by Toke Høiland-Jørgensen

[permalink] [raw]
Subject: [PATCH RFC v3 2/4] mac80211: Add airtime accounting and scheduling to TXQs

This adds airtime accounting and scheduling to the mac80211 TXQ
scheduler. A new callback, ieee80211_sta_register_airtime(), is added
that drivers can call to report airtime usage for stations.

When airtime information is present, mac80211 will schedule TXQs
(through ieee80211_next_txq()) in a way that enforces airtime fairness
between active stations. This scheduling works the same way as the ath9k
in-driver airtime fairness scheduling. If no airtime usage is reported
by the driver, the scheduler will default to round-robin scheduling.

Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
---
include/net/mac80211.h | 52 ++++++++++++++++++++++++++++++++
include/uapi/linux/nl80211.h | 4 ++
net/mac80211/debugfs.c | 3 ++
net/mac80211/debugfs_sta.c | 42 ++++++++++++++++++++++++--
net/mac80211/ieee80211_i.h | 2 +
net/mac80211/main.c | 1 +
net/mac80211/sta_info.c | 32 +++++++++++++++++++
net/mac80211/sta_info.h | 16 ++++++++++
net/mac80211/tx.c | 69 +++++++++++++++++++++++++++++++++++++++++-
9 files changed, 216 insertions(+), 5 deletions(-)

diff --git a/include/net/mac80211.h b/include/net/mac80211.h
index ff413d9caa50..a744ad4f4f59 100644
--- a/include/net/mac80211.h
+++ b/include/net/mac80211.h
@@ -5350,6 +5350,34 @@ void ieee80211_sta_eosp(struct ieee80211_sta *pubsta);
*/
void ieee80211_send_eosp_nullfunc(struct ieee80211_sta *pubsta, int tid);

+/**
+ * ieee80211_sta_register_airtime - register airtime usage for a sta/tid
+ *
+ * Register airtime usage for a given sta on a given tid. The driver can call
+ * this function to notify mac80211 that a station used a certain amount of
+ * airtime. This information will be used by the TXQ scheduler to schedule
+ * stations in a way that ensures airtime fairness.
+ *
+ * The reported airtime should as a minimum include all time that is spent
+ * transmitting to the remote station, including overhead and padding, but not
+ * including time spent waiting for a TXOP. If the time is not reported by the
+ * hardware it can in some cases be calculated from the rate and known frame
+ * composition. When possible, the time should include any failed transmission
+ * attempts.
+ *
+ * The driver can either call this function synchronously for every packet or
+ * aggregate, or asynchronously as airtime usage information becomes available.
+ * TX and RX airtime can be reported together, or separately by setting one of
+ * them to 0.
+ *
+ * @pubsta: the station
+ * @tid: the TID to register airtime for
+ * @tx_airtime: airtime used during TX (in usec)
+ * @rx_airtime: airtime used during RX (in usec)
+ */
+void ieee80211_sta_register_airtime(struct ieee80211_sta *pubsta, u8 tid,
+ u32 tx_airtime, u32 rx_airtime);
+
/**
* ieee80211_iter_keys - iterate keys programmed into the device
* @hw: pointer obtained from ieee80211_alloc_hw()
@@ -6089,6 +6117,30 @@ bool ieee80211_schedule_txq(struct ieee80211_hw *hw,
struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, s8 ac,
bool first);

+/**
+ * ieee80211_airtime_may_transmit - check whether TXQ is allowed to transmit
+ *
+ * This function is used to check whether given txq is allowed to transmit by
+ * the airtime scheduler, and can be used by drivers to access the airtime
+ * fairness accounting without going using the scheduling order enfored by
+ * next_txq().
+ *
+ * Returns %true if the airtime scheduler does not think the TXQ should be
+ * throttled. This function can also have the side effect of rotating the TXQ in
+ * the scheduler rotation, which will eventually bring the deficit to positive
+ * and allow the station to transmit again.
+ *
+ * If this function returns %true, the TXQ will have been removed from the
+ * scheduling. It is the driver's responsibility to add it back (by calling
+ * ieee80211_schedule_txq()) if needed.
+ *
+ * @hw: pointer as obtained from ieee80211_alloc_hw()
+ * @txq: pointer obtained from station or virtual interface
+ *
+ */
+bool ieee80211_txq_may_transmit(struct ieee80211_hw *hw,
+ struct ieee80211_txq *txq);
+
/**
* ieee80211_txq_get_depth - get pending frame/byte count of given txq
*
diff --git a/include/uapi/linux/nl80211.h b/include/uapi/linux/nl80211.h
index cfc94178d608..6b2b61646b6a 100644
--- a/include/uapi/linux/nl80211.h
+++ b/include/uapi/linux/nl80211.h
@@ -5231,6 +5231,9 @@ enum nl80211_feature_flags {
* if this flag is not set. Ignoring this can leak clear text packets and/or
* freeze the connection.
*
+ * @NL80211_EXT_FEATURE_AIRTIME_FAIRNESS: Driver supports airtime fairness
+ * scheduling.
+ *
* @NUM_NL80211_EXT_FEATURES: number of extended features.
* @MAX_NL80211_EXT_FEATURES: highest extended feature index.
*/
@@ -5269,6 +5272,7 @@ enum nl80211_ext_feature_index {
NL80211_EXT_FEATURE_SCAN_RANDOM_SN,
NL80211_EXT_FEATURE_SCAN_MIN_PREQ_CONTENT,
NL80211_EXT_FEATURE_CAN_REPLACE_PTK0,
+ NL80211_EXT_FEATURE_AIRTIME_FAIRNESS,

/* add new features before the definition below */
NUM_NL80211_EXT_FEATURES,
diff --git a/net/mac80211/debugfs.c b/net/mac80211/debugfs.c
index 3fe541e358f3..81c5fec2eae7 100644
--- a/net/mac80211/debugfs.c
+++ b/net/mac80211/debugfs.c
@@ -383,6 +383,9 @@ void debugfs_hw_add(struct ieee80211_local *local)
if (local->ops->wake_tx_queue)
DEBUGFS_ADD_MODE(aqm, 0600);

+ debugfs_create_u16("airtime_flags", 0600,
+ phyd, &local->airtime_flags);
+
statsd = debugfs_create_dir("statistics", phyd);

/* if the dir failed, don't put all the other things into the root! */
diff --git a/net/mac80211/debugfs_sta.c b/net/mac80211/debugfs_sta.c
index af5185a836e5..b6950d883a3a 100644
--- a/net/mac80211/debugfs_sta.c
+++ b/net/mac80211/debugfs_sta.c
@@ -181,9 +181,9 @@ static ssize_t sta_aqm_read(struct file *file, char __user *userbuf,
txqi->tin.tx_bytes,
txqi->tin.tx_packets,
txqi->flags,
- txqi->flags & (1<<IEEE80211_TXQ_STOP) ? "STOP" : "RUN",
- txqi->flags & (1<<IEEE80211_TXQ_AMPDU) ? " AMPDU" : "",
- txqi->flags & (1<<IEEE80211_TXQ_NO_AMSDU) ? " NO-AMSDU" : "");
+ txqi->flags & (1 << IEEE80211_TXQ_STOP) ? "STOP" : "RUN",
+ txqi->flags & (1 << IEEE80211_TXQ_AMPDU) ? " AMPDU" : "",
+ txqi->flags & (1 << IEEE80211_TXQ_NO_AMSDU) ? " NO-AMSDU" : "");
}

rcu_read_unlock();
@@ -195,6 +195,38 @@ static ssize_t sta_aqm_read(struct file *file, char __user *userbuf,
}
STA_OPS(aqm);

+static ssize_t sta_airtime_read(struct file *file, char __user *userbuf,
+ size_t count, loff_t *ppos)
+{
+ struct sta_info *sta = file->private_data;
+ struct ieee80211_local *local = sta->sdata->local;
+ size_t bufsz = 200;
+ char *buf = kzalloc(bufsz, GFP_KERNEL), *p = buf;
+ ssize_t rv;
+
+ if (!buf)
+ return -ENOMEM;
+
+ spin_lock_bh(&local->active_txq_lock);
+
+ p += scnprintf(p, bufsz + buf - p,
+ "RX: %llu us\nTX: %llu us\nWeight: %u\n"
+ "Deficit: VO: %lld us VI: %lld us BE: %lld us BK: %lld us\n",
+ sta->airtime.rx_airtime,
+ sta->airtime.tx_airtime,
+ sta->airtime.weight,
+ sta->airtime.deficit[0],
+ sta->airtime.deficit[1],
+ sta->airtime.deficit[2],
+ sta->airtime.deficit[3]);
+
+ spin_unlock_bh(&local->active_txq_lock);
+ rv = simple_read_from_buffer(userbuf, count, ppos, buf, p - buf);
+ kfree(buf);
+ return rv;
+}
+STA_OPS(airtime);
+
static ssize_t sta_agg_status_read(struct file *file, char __user *userbuf,
size_t count, loff_t *ppos)
{
@@ -906,6 +938,10 @@ void ieee80211_sta_debugfs_add(struct sta_info *sta)
if (local->ops->wake_tx_queue)
DEBUGFS_ADD(aqm);

+ if (wiphy_ext_feature_isset(local->hw.wiphy,
+ NL80211_EXT_FEATURE_AIRTIME_FAIRNESS))
+ DEBUGFS_ADD(airtime);
+
if (sizeof(sta->driver_buffered_tids) == sizeof(u32))
debugfs_create_x32("driver_buffered_tids", 0400,
sta->debugfs_dir,
diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h
index 75f9c9ab364f..838542b4f583 100644
--- a/net/mac80211/ieee80211_i.h
+++ b/net/mac80211/ieee80211_i.h
@@ -1136,6 +1136,8 @@ struct ieee80211_local {
struct list_head active_txqs[IEEE80211_NUM_ACS];
u16 schedule_round[IEEE80211_NUM_ACS];

+ u16 airtime_flags;
+
const struct ieee80211_ops *ops;

/*
diff --git a/net/mac80211/main.c b/net/mac80211/main.c
index c6d88758cbb7..9e7b33a0d0a6 100644
--- a/net/mac80211/main.c
+++ b/net/mac80211/main.c
@@ -666,6 +666,7 @@ struct ieee80211_hw *ieee80211_alloc_hw_nm(size_t priv_data_len,
for (i = 0; i < IEEE80211_NUM_ACS; i++)
INIT_LIST_HEAD(&local->active_txqs[i]);
spin_lock_init(&local->active_txq_lock);
+ local->airtime_flags = AIRTIME_USE_TX | AIRTIME_USE_RX;

INIT_LIST_HEAD(&local->chanctx_list);
mutex_init(&local->chanctx_mtx);
diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index c2f5cb7df54f..aba13f88bc86 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -387,9 +387,12 @@ struct sta_info *sta_info_alloc(struct ieee80211_sub_if_data *sdata,
if (sta_prepare_rate_control(local, sta, gfp))
goto free_txq;

+ sta->airtime.weight = IEEE80211_DEFAULT_AIRTIME_WEIGHT;
+
for (i = 0; i < IEEE80211_NUM_ACS; i++) {
skb_queue_head_init(&sta->ps_tx_buf[i]);
skb_queue_head_init(&sta->tx_filtered[i]);
+ sta->airtime.deficit[i] = sta->airtime.weight;
}

for (i = 0; i < IEEE80211_NUM_TIDS; i++)
@@ -1826,6 +1829,35 @@ void ieee80211_sta_set_buffered(struct ieee80211_sta *pubsta,
}
EXPORT_SYMBOL(ieee80211_sta_set_buffered);

+void ieee80211_sta_register_airtime(struct ieee80211_sta *pubsta, u8 tid,
+ u32 tx_airtime, u32 rx_airtime)
+{
+ struct sta_info *sta = container_of(pubsta, struct sta_info, sta);
+ struct ieee80211_local *local = sta->sdata->local;
+ struct txq_info *txqi;
+ u8 ac = ieee80211_ac_from_tid(tid);
+ u32 airtime = 0;
+
+ if (sta->local->airtime_flags & AIRTIME_USE_TX)
+ airtime += tx_airtime;
+ if (sta->local->airtime_flags & AIRTIME_USE_RX)
+ airtime += rx_airtime;
+
+ spin_lock_bh(&local->active_txq_lock);
+ sta->airtime.tx_airtime += tx_airtime;
+ sta->airtime.rx_airtime += rx_airtime;
+ sta->airtime.deficit[ac] -= airtime;
+
+ if (sta->airtime.deficit[ac] < 0) {
+ txqi = to_txq_info(pubsta->txq[tid]);
+ if (list_empty(&txqi->schedule_order))
+ list_add_tail(&txqi->schedule_order,
+ &local->active_txqs[ac]);
+ }
+ spin_unlock_bh(&local->active_txq_lock);
+}
+EXPORT_SYMBOL(ieee80211_sta_register_airtime);
+
int sta_info_move_state(struct sta_info *sta,
enum ieee80211_sta_state new_state)
{
diff --git a/net/mac80211/sta_info.h b/net/mac80211/sta_info.h
index 9a04327d71d1..97fff032eee5 100644
--- a/net/mac80211/sta_info.h
+++ b/net/mac80211/sta_info.h
@@ -127,6 +127,20 @@ enum ieee80211_agg_stop_reason {
AGG_STOP_DESTROY_STA,
};

+/* How much to increase airtime deficit on each scheduling round, by default
+ * (userspace can change this per phy) */
+#define IEEE80211_DEFAULT_AIRTIME_WEIGHT 256
+/* Debugfs flags to enable/disable use of RX/TX airtime in scheduler */
+#define AIRTIME_USE_TX BIT(0)
+#define AIRTIME_USE_RX BIT(1)
+
+struct airtime_info {
+ u64 rx_airtime;
+ u64 tx_airtime;
+ s64 deficit[IEEE80211_NUM_ACS];
+ u16 weight;
+};
+
struct sta_info;

/**
@@ -563,6 +577,8 @@ struct sta_info {
} tx_stats;
u16 tid_seq[IEEE80211_QOS_CTL_TID_MASK + 1];

+ struct airtime_info airtime;
+
/*
* Aggregation information, locked with lock.
*/
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index 8bcc447e2cbc..758368d6106e 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -1446,6 +1446,7 @@ void ieee80211_txq_init(struct ieee80211_sub_if_data *sdata,
codel_stats_init(&txqi->cstats);
__skb_queue_head_init(&txqi->frags);
INIT_LIST_HEAD(&txqi->schedule_order);
+ txqi->flags = 0;

txqi->txq.vif = &sdata->vif;

@@ -1486,6 +1487,7 @@ void ieee80211_txq_purge(struct ieee80211_local *local,

fq_tin_reset(fq, tin, fq_skb_free_func);
ieee80211_purge_tx_queue(&local->hw, &txqi->frags);
+ txqi->flags = 0;
spin_lock_bh(&local->active_txq_lock);
list_del_init(&txqi->schedule_order);
spin_unlock_bh(&local->active_txq_lock);
@@ -3637,8 +3639,22 @@ bool ieee80211_schedule_txq(struct ieee80211_hw *hw,
spin_lock_bh(&local->active_txq_lock);

if (list_empty(&txqi->schedule_order)) {
- list_add_tail(&txqi->schedule_order,
- &local->active_txqs[txq->ac]);
+ /* If airtime accounting is active, always enqueue STAs at the
+ * head of the list to ensure that they only get moved to the
+ * back by the airtime DRR scheduler once they have a negative
+ * deficit. A station that already has a negative deficit will
+ * get immediately moved to the back of the list on the next
+ * call to ieee80211_next_txq().
+ */
+ if (wiphy_ext_feature_isset(local->hw.wiphy,
+ NL80211_EXT_FEATURE_AIRTIME_FAIRNESS)
+ && txqi->txq.sta)
+ list_add(&txqi->schedule_order,
+ &local->active_txqs[txq->ac]);
+ else
+ list_add_tail(&txqi->schedule_order,
+ &local->active_txqs[txq->ac]);
+
ret = true;
}

@@ -3648,6 +3664,32 @@ bool ieee80211_schedule_txq(struct ieee80211_hw *hw,
}
EXPORT_SYMBOL(ieee80211_schedule_txq);

+static bool ieee80211_txq_check_deficit(struct ieee80211_local *local,
+ struct txq_info *txqi)
+{
+ struct sta_info *sta;
+
+ lockdep_assert_held(&local->active_txq_lock);
+
+ if (!txqi->txq.sta)
+ return true;
+
+ sta = container_of(txqi->txq.sta, struct sta_info, sta);
+
+ if (sta->airtime.deficit[txqi->txq.ac] > 0)
+ return true;
+
+ if (txqi == list_first_entry_or_null(&local->active_txqs[txqi->txq.ac],
+ struct txq_info,
+ schedule_order)) {
+ sta->airtime.deficit[txqi->txq.ac] += sta->airtime.weight;
+ list_move_tail(&txqi->schedule_order,
+ &local->active_txqs[txqi->txq.ac]);
+ }
+
+ return false;
+}
+
struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, s8 ac,
bool first)
{
@@ -3659,6 +3701,7 @@ struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, s8 ac,
if (first)
local->schedule_round[ac]++;

+ begin:
if (list_empty(&local->active_txqs[ac]))
goto out;

@@ -3666,6 +3709,9 @@ struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, s8 ac,
struct txq_info,
schedule_order);

+ if (!ieee80211_txq_check_deficit(local, txqi))
+ goto begin;
+
if (txqi->schedule_round == local->schedule_round[ac])
txqi = NULL;
else
@@ -3681,6 +3727,25 @@ struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, s8 ac,
}
EXPORT_SYMBOL(ieee80211_next_txq);

+bool ieee80211_txq_may_transmit(struct ieee80211_hw *hw,
+ struct ieee80211_txq *txq)
+{
+ struct ieee80211_local *local = hw_to_local(hw);
+ struct txq_info *txqi = to_txq_info(txq);
+ bool may_tx = false;
+
+ spin_lock_bh(&local->active_txq_lock);
+
+ if (ieee80211_txq_check_deficit(local, txqi)) {
+ may_tx = true;
+ list_del_init(&txqi->schedule_order);
+ }
+
+ spin_unlock_bh(&local->active_txq_lock);
+ return may_tx;
+}
+EXPORT_SYMBOL(ieee80211_txq_may_transmit);
+
void __ieee80211_subif_start_xmit(struct sk_buff *skb,
struct net_device *dev,
u32 info_flags)

2018-09-10 18:05:02

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH RFC v3 1/4] mac80211: Add TXQ scheduling API

On Mon, 2018-09-10 at 15:08 +0200, Toke Høiland-Jørgensen wrote:
> Johannes Berg <[email protected]> writes:
>
> > On Mon, 2018-09-10 at 14:39 +0200, Toke Høiland-Jørgensen wrote:
> >
> > > > From then on, right now we basically just try to refill the hardware
> > > > queue from the TXQ whenever the hardware releases frames from that
> > > > queue. If there are none, we fall back to putting them on the hardware
> > > > queue from the wake signal.
> > >
> > > OK. So if the TXQ is ineligible to dequeue packets, but still have them
> > > queued, there would need to be a signal (such as wake_txq()) when it
> > > becomes eligible again, right?
> >
> > Right. It wouldn't have to be wake_txq, but that seems as appropriate as
> > anything else.
> >
> > If we were to use ieee80211_airtime_may_transmit(), we'd have to have
> > something that tells us "I previously told you it may not transmit, but
> > now I changed my mind" :-)
>
> Yes, exactly. Hmm, I'll see what I can come up with :)

No need to look at this right now - let's get this stuff sorted out
first :)

johannes

2018-09-12 16:14:24

by Toke Høiland-Jørgensen

[permalink] [raw]
Subject: Re: [PATCH RFC v3 2/4] mac80211: Add airtime accounting and scheduling to TXQs

Rajkumar Manoharan <[email protected]> writes:

> On 2018-09-10 04:13, Toke H=C3=B8iland-J=C3=B8rgensen wrote:
>> Johannes Berg <[email protected]> writes:
>>>> - txqi->flags & (1<<IEEE80211_TXQ_STOP) ? "STOP" : "RUN",
>>>> - txqi->flags & (1<<IEEE80211_TXQ_AMPDU) ? " AMPDU" : "",
>>>> - txqi->flags & (1<<IEEE80211_TXQ_NO_AMSDU) ? " NO-AMSDU" :=20
>>>> "");
>>>> + txqi->flags & (1 << IEEE80211_TXQ_STOP) ? "STOP" : "RUN",
>>>> + txqi->flags & (1 << IEEE80211_TXQ_AMPDU) ? " AMPDU" : "",
>>>> + txqi->flags & (1 << IEEE80211_TXQ_NO_AMSDU) ? " NO-AMSDU"=20
>>>> : "");
>>>=20
>>> consider BIT() instead as a cleanup? :)
>>>=20
>>> (or maybe this is just a leftover from merging some other patches?)
>>=20
>> Yeah, this is a merging artifact; Rajkumar's patch added another flag,
>> that I removed again. Didn't notice that there was still a whitespace
>> change in this patch...
>>=20
> I added the flag based on our last discussion. The driver needs to check
> txq status for each tx_dequeue(). One time txq check is not sufficient
> as it allows the driver to dequeue all frames from txq.
>
> drv_func() {
> while (ieee80211_airtime_may_transmit(txq) &&
> hw_has_space() &&
> (pkt =3D ieee80211_tx_dequeue(hw, txq)))
> push_to_hw(pkt);
> }

Yeah, but with airtime only being recorded on TX completion, the odds of
the value changing within that loop are quite low; so it's not going to
work, which is why I removed it.

However, after reading Kan's patches I get where you're coming from; a
check in tx_dequeue() is needed for the BQL-style queue limiting. Will
try to incorporate a version of that in the next series so you can see
what I mean when I say it should be orthogonal; and I'm still not sure
it needs a flag :)

>>>> +bool ieee80211_txq_may_transmit(struct ieee80211_hw *hw,
>>>> + struct ieee80211_txq *txq)
>>>> +{
>>>> + struct ieee80211_local *local =3D hw_to_local(hw);
>>>> + struct txq_info *txqi =3D to_txq_info(txq);
>>>> + bool may_tx =3D false;
>>>> +
>>>> + spin_lock_bh(&local->active_txq_lock);
>>>> +
>>>> + if (ieee80211_txq_check_deficit(local, txqi)) {
>>>> + may_tx =3D true;
>>>> + list_del_init(&txqi->schedule_order);
>>>=20
>
> To handle above case, may_transmit should remove the node only
> when it is in list.
>
> if (list_empty(&txqi->schedule_order))
> list_del_init(&txqi->schedule_order);

I assume you missed a ! in that if, right? :)

> So that it can be used to determine whether txq is running negative.

But still not sure what you mean here?

-Toke

2018-09-10 13:16:20

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH RFC v3 3/4] cfg80211: Add airtime statistics and settings

On Sat, 2018-09-08 at 00:22 +0200, Toke Høiland-Jørgensen wrote:
> This adds airtime statistics to the cfg80211 station dump, and also adds a new
> parameter to set the airtime weight of each station. The latter allows userspace
> to implement policies for different stations by varying their weights.

Maybe some documentation on the weights - both for other implementers as
well as users - would be useful.

> + * @NL80211_STA_INFO_AIRTIME_WEIGHT: current airtime weight for station
> + * (u16, usec)

how could the weight be in usec? copy/paste issue?

johannes

2018-09-08 03:06:00

by Toke Høiland-Jørgensen

[permalink] [raw]
Subject: [PATCH RFC v3 4/4] ath9k: Switch to mac80211 TXQ scheduling and airtime APIs

This moves the ath9k driver to use the mac80211 TXQ scheduling and
airtime accounting APIs, removing the corresponding state tracking
inside the driver.

Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
---
drivers/net/wireless/ath/ath9k/ath9k.h | 14 --
drivers/net/wireless/ath/ath9k/debug.c | 3
drivers/net/wireless/ath/ath9k/debug.h | 8 -
drivers/net/wireless/ath/ath9k/debug_sta.c | 54 ------
drivers/net/wireless/ath/ath9k/init.c | 3
drivers/net/wireless/ath/ath9k/recv.c | 9 -
drivers/net/wireless/ath/ath9k/xmit.c | 245 ++++++++--------------------
7 files changed, 74 insertions(+), 262 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/ath9k.h b/drivers/net/wireless/ath/ath9k/ath9k.h
index 21ba20981a80..90b56766dab1 100644
--- a/drivers/net/wireless/ath/ath9k/ath9k.h
+++ b/drivers/net/wireless/ath/ath9k/ath9k.h
@@ -112,8 +112,6 @@ int ath_descdma_setup(struct ath_softc *sc, struct ath_descdma *dd,
#define ATH_TXFIFO_DEPTH 8
#define ATH_TX_ERROR 0x01

-#define ATH_AIRTIME_QUANTUM 300 /* usec */
-
/* Stop tx traffic 1ms before the GO goes away */
#define ATH_P2P_PS_STOP_TIME 1000

@@ -246,10 +244,8 @@ struct ath_atx_tid {
s8 bar_index;
bool active;
bool clear_ps_filter;
- bool has_queued;
};

-void __ath_tx_queue_tid(struct ath_softc *sc, struct ath_atx_tid *tid);
void ath_tx_queue_tid(struct ath_softc *sc, struct ath_atx_tid *tid);

struct ath_node {
@@ -263,12 +259,9 @@ struct ath_node {

bool sleeping;
bool no_ps_filter;
- s64 airtime_deficit[IEEE80211_NUM_ACS];
- u32 airtime_rx_start;

#ifdef CONFIG_ATH9K_STATION_STATISTICS
struct ath_rx_rate_stats rx_rate_stats;
- struct ath_airtime_stats airtime_stats;
#endif
u8 key_idx[4];

@@ -986,11 +979,6 @@ void ath_ant_comb_scan(struct ath_softc *sc, struct ath_rx_status *rs);

#define ATH9K_NUM_CHANCTX 2 /* supports 2 operating channels */

-#define AIRTIME_USE_TX BIT(0)
-#define AIRTIME_USE_RX BIT(1)
-#define AIRTIME_USE_NEW_QUEUES BIT(2)
-#define AIRTIME_ACTIVE(flags) (!!(flags & (AIRTIME_USE_TX|AIRTIME_USE_RX)))
-
struct ath_softc {
struct ieee80211_hw *hw;
struct device *dev;
@@ -1034,8 +1022,6 @@ struct ath_softc {
short nbcnvifs;
unsigned long ps_usecount;

- u16 airtime_flags; /* AIRTIME_* */
-
struct ath_rx rx;
struct ath_tx tx;
struct ath_beacon beacon;
diff --git a/drivers/net/wireless/ath/ath9k/debug.c b/drivers/net/wireless/ath/ath9k/debug.c
index 0a6eb8a8c1ed..f928d3a07caa 100644
--- a/drivers/net/wireless/ath/ath9k/debug.c
+++ b/drivers/net/wireless/ath/ath9k/debug.c
@@ -1456,9 +1456,6 @@ int ath9k_init_debug(struct ath_hw *ah)
#endif
debugfs_create_file("tpc", 0600, sc->debug.debugfs_phy, sc, &fops_tpc);

- debugfs_create_u16("airtime_flags", 0600,
- sc->debug.debugfs_phy, &sc->airtime_flags);
-
debugfs_create_file("nf_override", 0600,
sc->debug.debugfs_phy, sc, &fops_nf_override);

diff --git a/drivers/net/wireless/ath/ath9k/debug.h b/drivers/net/wireless/ath/ath9k/debug.h
index 249f8141cd00..559d9628f280 100644
--- a/drivers/net/wireless/ath/ath9k/debug.h
+++ b/drivers/net/wireless/ath/ath9k/debug.h
@@ -319,20 +319,12 @@ ath9k_debug_sync_cause(struct ath_softc *sc, u32 sync_cause)
void ath_debug_rate_stats(struct ath_softc *sc,
struct ath_rx_status *rs,
struct sk_buff *skb);
-void ath_debug_airtime(struct ath_softc *sc,
- struct ath_node *an,
- u32 rx, u32 tx);
#else
static inline void ath_debug_rate_stats(struct ath_softc *sc,
struct ath_rx_status *rs,
struct sk_buff *skb)
{
}
-static inline void ath_debug_airtime(struct ath_softc *sc,
- struct ath_node *an,
- u32 rx, u32 tx)
-{
-}
#endif /* CONFIG_ATH9K_STATION_STATISTICS */

#endif /* DEBUG_H */
diff --git a/drivers/net/wireless/ath/ath9k/debug_sta.c b/drivers/net/wireless/ath/ath9k/debug_sta.c
index a6f45f1bb5bb..bb6f3250aa30 100644
--- a/drivers/net/wireless/ath/ath9k/debug_sta.c
+++ b/drivers/net/wireless/ath/ath9k/debug_sta.c
@@ -242,59 +242,6 @@ static const struct file_operations fops_node_recv = {
.llseek = default_llseek,
};

-void ath_debug_airtime(struct ath_softc *sc,
- struct ath_node *an,
- u32 rx,
- u32 tx)
-{
- struct ath_airtime_stats *astats = &an->airtime_stats;
-
- astats->rx_airtime += rx;
- astats->tx_airtime += tx;
-}
-
-static ssize_t read_airtime(struct file *file, char __user *user_buf,
- size_t count, loff_t *ppos)
-{
- struct ath_node *an = file->private_data;
- struct ath_airtime_stats *astats;
- static const char *qname[4] = {
- "VO", "VI", "BE", "BK"
- };
- u32 len = 0, size = 256;
- char *buf;
- size_t retval;
- int i;
-
- buf = kzalloc(size, GFP_KERNEL);
- if (buf == NULL)
- return -ENOMEM;
-
- astats = &an->airtime_stats;
-
- len += scnprintf(buf + len, size - len, "RX: %u us\n", astats->rx_airtime);
- len += scnprintf(buf + len, size - len, "TX: %u us\n", astats->tx_airtime);
- len += scnprintf(buf + len, size - len, "Deficit: ");
- for (i = 0; i < 4; i++)
- len += scnprintf(buf+len, size - len, "%s: %lld us ", qname[i], an->airtime_deficit[i]);
- if (len < size)
- buf[len++] = '\n';
-
- retval = simple_read_from_buffer(user_buf, count, ppos, buf, len);
- kfree(buf);
-
- return retval;
-}
-
-
-static const struct file_operations fops_airtime = {
- .read = read_airtime,
- .open = simple_open,
- .owner = THIS_MODULE,
- .llseek = default_llseek,
-};
-
-
void ath9k_sta_add_debugfs(struct ieee80211_hw *hw,
struct ieee80211_vif *vif,
struct ieee80211_sta *sta,
@@ -304,5 +251,4 @@ void ath9k_sta_add_debugfs(struct ieee80211_hw *hw,

debugfs_create_file("node_aggr", 0444, dir, an, &fops_node_aggr);
debugfs_create_file("node_recv", 0444, dir, an, &fops_node_recv);
- debugfs_create_file("airtime", 0444, dir, an, &fops_airtime);
}
diff --git a/drivers/net/wireless/ath/ath9k/init.c b/drivers/net/wireless/ath/ath9k/init.c
index c070a9e51ebf..45a31f8a1514 100644
--- a/drivers/net/wireless/ath/ath9k/init.c
+++ b/drivers/net/wireless/ath/ath9k/init.c
@@ -676,8 +676,6 @@ static int ath9k_init_softc(u16 devid, struct ath_softc *sc,

/* Will be cleared in ath9k_start() */
set_bit(ATH_OP_INVALID, &common->op_flags);
- sc->airtime_flags = (AIRTIME_USE_TX | AIRTIME_USE_RX |
- AIRTIME_USE_NEW_QUEUES);

sc->sc_ah = ah;
sc->dfs_detector = dfs_pattern_detector_init(common, NL80211_DFS_UNSET);
@@ -1013,6 +1011,7 @@ static void ath9k_set_hw_capab(struct ath_softc *sc, struct ieee80211_hw *hw)
SET_IEEE80211_PERM_ADDR(hw, common->macaddr);

wiphy_ext_feature_set(hw->wiphy, NL80211_EXT_FEATURE_CQM_RSSI_LIST);
+ wiphy_ext_feature_set(hw->wiphy, NL80211_EXT_FEATURE_AIRTIME_FAIRNESS);
}

int ath9k_init_device(u16 devid, struct ath_softc *sc,
diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c
index a8ac42c96d71..fef8fbe12f45 100644
--- a/drivers/net/wireless/ath/ath9k/recv.c
+++ b/drivers/net/wireless/ath/ath9k/recv.c
@@ -1054,14 +1054,7 @@ static void ath_rx_count_airtime(struct ath_softc *sc,
len, rxs->rate_idx, is_sp);
}

- if (!!(sc->airtime_flags & AIRTIME_USE_RX)) {
- spin_lock_bh(&acq->lock);
- an->airtime_deficit[acno] -= airtime;
- if (an->airtime_deficit[acno] <= 0)
- __ath_tx_queue_tid(sc, ATH_AN_2_TID(an, tidno));
- spin_unlock_bh(&acq->lock);
- }
- ath_debug_airtime(sc, an, airtime, 0);
+ ieee80211_sta_register_airtime(sta, tidno, 0, airtime);
exit:
rcu_read_unlock();
}
diff --git a/drivers/net/wireless/ath/ath9k/xmit.c b/drivers/net/wireless/ath/ath9k/xmit.c
index 43b6c8508e49..77a19c8097d0 100644
--- a/drivers/net/wireless/ath/ath9k/xmit.c
+++ b/drivers/net/wireless/ath/ath9k/xmit.c
@@ -113,44 +113,14 @@ void ath_txq_unlock_complete(struct ath_softc *sc, struct ath_txq *txq)
ath_tx_status(hw, skb);
}

-void __ath_tx_queue_tid(struct ath_softc *sc, struct ath_atx_tid *tid)
-{
- struct ath_vif *avp = (struct ath_vif *) tid->an->vif->drv_priv;
- struct ath_chanctx *ctx = avp->chanctx;
- struct ath_acq *acq;
- struct list_head *tid_list;
- u8 acno = TID_TO_WME_AC(tid->tidno);
-
- if (!ctx || !list_empty(&tid->list))
- return;
-
-
- acq = &ctx->acq[acno];
- if ((sc->airtime_flags & AIRTIME_USE_NEW_QUEUES) &&
- tid->an->airtime_deficit[acno] > 0)
- tid_list = &acq->acq_new;
- else
- tid_list = &acq->acq_old;
-
- list_add_tail(&tid->list, tid_list);
-}
-
void ath_tx_queue_tid(struct ath_softc *sc, struct ath_atx_tid *tid)
{
- struct ath_vif *avp = (struct ath_vif *) tid->an->vif->drv_priv;
- struct ath_chanctx *ctx = avp->chanctx;
- struct ath_acq *acq;
+ struct ieee80211_txq *queue = container_of(
+ (void *)tid, struct ieee80211_txq, drv_priv);

- if (!ctx || !list_empty(&tid->list))
- return;
-
- acq = &ctx->acq[TID_TO_WME_AC(tid->tidno)];
- spin_lock_bh(&acq->lock);
- __ath_tx_queue_tid(sc, tid);
- spin_unlock_bh(&acq->lock);
+ ieee80211_schedule_txq(sc->hw, queue);
}

-
void ath9k_wake_tx_queue(struct ieee80211_hw *hw, struct ieee80211_txq *queue)
{
struct ath_softc *sc = hw->priv;
@@ -163,11 +133,7 @@ void ath9k_wake_tx_queue(struct ieee80211_hw *hw, struct ieee80211_txq *queue)
tid->tidno);

ath_txq_lock(sc, txq);
-
- tid->has_queued = true;
- ath_tx_queue_tid(sc, tid);
ath_txq_schedule(sc, txq);
-
ath_txq_unlock(sc, txq);
}

@@ -217,8 +183,8 @@ ath_get_skb_tid(struct ath_softc *sc, struct ath_node *an, struct sk_buff *skb)
return ATH_AN_2_TID(an, tidno);
}

-static struct sk_buff *
-ath_tid_pull(struct ath_atx_tid *tid)
+static int
+ath_tid_pull(struct ath_atx_tid *tid, struct sk_buff **skbuf)
{
struct ieee80211_txq *txq = container_of((void*)tid, struct ieee80211_txq, drv_priv);
struct ath_softc *sc = tid->an->sc;
@@ -229,20 +195,15 @@ ath_tid_pull(struct ath_atx_tid *tid)
};
struct sk_buff *skb;
struct ath_frame_info *fi;
- int q;
-
- if (!tid->has_queued)
- return NULL;
+ int q, ret;

skb = ieee80211_tx_dequeue(hw, txq);
- if (!skb) {
- tid->has_queued = false;
- return NULL;
- }
+ if (!skb)
+ return -ENOENT;

- if (ath_tx_prepare(hw, skb, &txctl)) {
+ if ((ret = ath_tx_prepare(hw, skb, &txctl))) {
ieee80211_free_txskb(hw, skb);
- return NULL;
+ return ret;
}

q = skb_get_queue_mapping(skb);
@@ -252,24 +213,19 @@ ath_tid_pull(struct ath_atx_tid *tid)
++tid->txq->pending_frames;
}

- return skb;
-}
-
-
-static bool ath_tid_has_buffered(struct ath_atx_tid *tid)
-{
- return !skb_queue_empty(&tid->retry_q) || tid->has_queued;
-}
+ *skbuf = skb;
+ return 0;
+ }

-static struct sk_buff *ath_tid_dequeue(struct ath_atx_tid *tid)
+static int ath_tid_dequeue(struct ath_atx_tid *tid,
+ struct sk_buff **skb)
{
- struct sk_buff *skb;
+ int ret = 0;
+ *skb = __skb_dequeue(&tid->retry_q);
+ if (!*skb)
+ ret = ath_tid_pull(tid, skb);

- skb = __skb_dequeue(&tid->retry_q);
- if (!skb)
- skb = ath_tid_pull(tid);
-
- return skb;
+ return ret;
}

static void ath_tx_flush_tid(struct ath_softc *sc, struct ath_atx_tid *tid)
@@ -365,11 +321,12 @@ static void ath_tid_drain(struct ath_softc *sc, struct ath_txq *txq,
struct list_head bf_head;
struct ath_tx_status ts;
struct ath_frame_info *fi;
+ int ret;

memset(&ts, 0, sizeof(ts));
INIT_LIST_HEAD(&bf_head);

- while ((skb = ath_tid_dequeue(tid))) {
+ while ((ret = ath_tid_dequeue(tid, &skb)) == 0) {
fi = get_frame_info(skb);
bf = fi->bf;

@@ -681,7 +638,6 @@ static void ath_tx_complete_aggr(struct ath_softc *sc, struct ath_txq *txq,
skb_queue_splice_tail(&bf_pending, &tid->retry_q);
if (!an->sleeping) {
ath_tx_queue_tid(sc, tid);
-
if (ts->ts_status & (ATH9K_TXERR_FILT | ATH9K_TXERR_XRETRY))
tid->clear_ps_filter = true;
}
@@ -708,11 +664,9 @@ static bool bf_is_ampdu_not_probing(struct ath_buf *bf)
return bf_isampdu(bf) && !(info->flags & IEEE80211_TX_CTL_RATE_CTRL_PROBE);
}

-static void ath_tx_count_airtime(struct ath_softc *sc, struct ath_node *an,
- struct ath_atx_tid *tid, struct ath_buf *bf,
- struct ath_tx_status *ts)
+static void ath_tx_count_airtime(struct ath_softc *sc, struct ieee80211_sta *sta,
+ struct ath_buf *bf, struct ath_tx_status *ts)
{
- struct ath_txq *txq = tid->txq;
u32 airtime = 0;
int i;

@@ -722,17 +676,7 @@ static void ath_tx_count_airtime(struct ath_softc *sc, struct ath_node *an,
airtime += rate_dur * bf->rates[i].count;
}

- if (sc->airtime_flags & AIRTIME_USE_TX) {
- int q = txq->mac80211_qnum;
- struct ath_acq *acq = &sc->cur_chan->acq[q];
-
- spin_lock_bh(&acq->lock);
- an->airtime_deficit[q] -= airtime;
- if (an->airtime_deficit[q] <= 0)
- __ath_tx_queue_tid(sc, tid);
- spin_unlock_bh(&acq->lock);
- }
- ath_debug_airtime(sc, an, 0, airtime);
+ ieee80211_sta_register_airtime(sta, ts->tid, airtime, 0);
}

static void ath_tx_process_buffer(struct ath_softc *sc, struct ath_txq *txq,
@@ -762,7 +706,7 @@ static void ath_tx_process_buffer(struct ath_softc *sc, struct ath_txq *txq,
if (sta) {
struct ath_node *an = (struct ath_node *)sta->drv_priv;
tid = ath_get_skb_tid(sc, an, bf->bf_mpdu);
- ath_tx_count_airtime(sc, an, tid, bf, ts);
+ ath_tx_count_airtime(sc, sta, bf, ts);
if (ts->ts_status & (ATH9K_TXERR_FILT | ATH9K_TXERR_XRETRY))
tid->clear_ps_filter = true;
}
@@ -946,20 +890,21 @@ static int ath_compute_num_delims(struct ath_softc *sc, struct ath_atx_tid *tid,
return ndelim;
}

-static struct ath_buf *
+static int
ath_tx_get_tid_subframe(struct ath_softc *sc, struct ath_txq *txq,
- struct ath_atx_tid *tid)
+ struct ath_atx_tid *tid, struct ath_buf **buf)
{
struct ieee80211_tx_info *tx_info;
struct ath_frame_info *fi;
- struct sk_buff *skb, *first_skb = NULL;
struct ath_buf *bf;
+ struct sk_buff *skb, *first_skb = NULL;
u16 seqno;
+ int ret;

while (1) {
- skb = ath_tid_dequeue(tid);
- if (!skb)
- break;
+ ret = ath_tid_dequeue(tid, &skb);
+ if (ret < 0)
+ return ret;

fi = get_frame_info(skb);
bf = fi->bf;
@@ -991,7 +936,7 @@ ath_tx_get_tid_subframe(struct ath_softc *sc, struct ath_txq *txq,

if (!(tx_info->flags & IEEE80211_TX_CTL_AMPDU)) {
bf->bf_state.bf_type = 0;
- return bf;
+ break;
}

bf->bf_state.bf_type = BUF_AMPDU | BUF_AGGR;
@@ -1010,7 +955,7 @@ ath_tx_get_tid_subframe(struct ath_softc *sc, struct ath_txq *txq,
first_skb = skb;
continue;
}
- break;
+ return -EINPROGRESS;
}

if (tid->bar_index > ATH_BA_INDEX(tid->seq_start, seqno)) {
@@ -1027,10 +972,11 @@ ath_tx_get_tid_subframe(struct ath_softc *sc, struct ath_txq *txq,
if (bf_isampdu(bf))
ath_tx_addto_baw(sc, tid, bf);

- return bf;
+ break;
}

- return NULL;
+ *buf = bf;
+ return 0;
}

static int
@@ -1040,7 +986,7 @@ ath_tx_form_aggr(struct ath_softc *sc, struct ath_txq *txq,
{
#define PADBYTES(_len) ((4 - ((_len) % 4)) % 4)
struct ath_buf *bf = bf_first, *bf_prev = NULL;
- int nframes = 0, ndelim;
+ int nframes = 0, ndelim, ret;
u16 aggr_limit = 0, al = 0, bpad = 0,
al_delta, h_baw = tid->baw_size / 2;
struct ieee80211_tx_info *tx_info;
@@ -1092,7 +1038,9 @@ ath_tx_form_aggr(struct ath_softc *sc, struct ath_txq *txq,

bf_prev = bf;

- bf = ath_tx_get_tid_subframe(sc, txq, tid);
+ ret = ath_tx_get_tid_subframe(sc, txq, tid, &bf);
+ if (ret < 0)
+ break;
}
goto finish;
stop:
@@ -1489,7 +1437,7 @@ ath_tx_form_burst(struct ath_softc *sc, struct ath_txq *txq,
struct ath_buf *bf_first)
{
struct ath_buf *bf = bf_first, *bf_prev = NULL;
- int nframes = 0;
+ int nframes = 0, ret;

do {
struct ieee80211_tx_info *tx_info;
@@ -1503,8 +1451,8 @@ ath_tx_form_burst(struct ath_softc *sc, struct ath_txq *txq,
if (nframes >= 2)
break;

- bf = ath_tx_get_tid_subframe(sc, txq, tid);
- if (!bf)
+ ret = ath_tx_get_tid_subframe(sc, txq, tid, &bf);
+ if (ret < 0)
break;

tx_info = IEEE80211_SKB_CB(bf->bf_mpdu);
@@ -1517,30 +1465,27 @@ ath_tx_form_burst(struct ath_softc *sc, struct ath_txq *txq,
} while (1);
}

-static bool ath_tx_sched_aggr(struct ath_softc *sc, struct ath_txq *txq,
- struct ath_atx_tid *tid)
+static int ath_tx_sched_aggr(struct ath_softc *sc, struct ath_txq *txq,
+ struct ath_atx_tid *tid)
{
- struct ath_buf *bf;
+ struct ath_buf *bf = NULL;
struct ieee80211_tx_info *tx_info;
struct list_head bf_q;
- int aggr_len = 0;
+ int aggr_len = 0, ret;
bool aggr;

- if (!ath_tid_has_buffered(tid))
- return false;
-
INIT_LIST_HEAD(&bf_q);

- bf = ath_tx_get_tid_subframe(sc, txq, tid);
- if (!bf)
- return false;
+ ret = ath_tx_get_tid_subframe(sc, txq, tid, &bf);
+ if (ret < 0)
+ return ret;

tx_info = IEEE80211_SKB_CB(bf->bf_mpdu);
aggr = !!(tx_info->flags & IEEE80211_TX_CTL_AMPDU);
if ((aggr && txq->axq_ampdu_depth >= ATH_AGGR_MIN_QDEPTH) ||
(!aggr && txq->axq_depth >= ATH_NON_AGGR_MIN_QDEPTH)) {
__skb_queue_tail(&tid->retry_q, bf->bf_mpdu);
- return false;
+ return -EBUSY;
}

ath_set_rates(tid->an->vif, tid->an->sta, bf);
@@ -1550,7 +1495,7 @@ static bool ath_tx_sched_aggr(struct ath_softc *sc, struct ath_txq *txq,
ath_tx_form_burst(sc, txq, tid, &bf_q, bf);

if (list_empty(&bf_q))
- return false;
+ return -EAGAIN;

if (tid->clear_ps_filter || tid->an->no_ps_filter) {
tid->clear_ps_filter = false;
@@ -1559,7 +1504,7 @@ static bool ath_tx_sched_aggr(struct ath_softc *sc, struct ath_txq *txq,

ath_tx_fill_desc(sc, bf, txq, aggr_len);
ath_tx_txqaddbuf(sc, txq, &bf_q, false);
- return true;
+ return 0;
}

int ath_tx_aggr_start(struct ath_softc *sc, struct ieee80211_sta *sta,
@@ -1622,28 +1567,16 @@ void ath_tx_aggr_sleep(struct ieee80211_sta *sta, struct ath_softc *sc,
{
struct ath_common *common = ath9k_hw_common(sc->sc_ah);
struct ath_atx_tid *tid;
- struct ath_txq *txq;
int tidno;

ath_dbg(common, XMIT, "%s called\n", __func__);

for (tidno = 0; tidno < IEEE80211_NUM_TIDS; tidno++) {
tid = ath_node_to_tid(an, tidno);
- txq = tid->txq;
-
- ath_txq_lock(sc, txq);
-
- if (list_empty(&tid->list)) {
- ath_txq_unlock(sc, txq);
- continue;
- }

if (!skb_queue_empty(&tid->retry_q))
ieee80211_sta_set_buffered(sta, tid->tidno, true);

- list_del_init(&tid->list);
-
- ath_txq_unlock(sc, txq);
}
}

@@ -1662,11 +1595,12 @@ void ath_tx_aggr_wakeup(struct ath_softc *sc, struct ath_node *an)

ath_txq_lock(sc, txq);
tid->clear_ps_filter = true;
- if (ath_tid_has_buffered(tid)) {
+ if (!skb_queue_empty(&tid->retry_q)) {
ath_tx_queue_tid(sc, tid);
ath_txq_schedule(sc, txq);
}
ath_txq_unlock_complete(sc, txq);
+
}
}

@@ -1697,9 +1631,9 @@ void ath9k_release_buffered_frames(struct ieee80211_hw *hw,
struct ath_txq *txq = sc->tx.uapsdq;
struct ieee80211_tx_info *info;
struct list_head bf_q;
- struct ath_buf *bf_tail = NULL, *bf;
+ struct ath_buf *bf_tail = NULL, *bf = NULL;
int sent = 0;
- int i;
+ int i, ret;

INIT_LIST_HEAD(&bf_q);
for (i = 0; tids && nframes; i++, tids >>= 1) {
@@ -1712,8 +1646,8 @@ void ath9k_release_buffered_frames(struct ieee80211_hw *hw,

ath_txq_lock(sc, tid->txq);
while (nframes > 0) {
- bf = ath_tx_get_tid_subframe(sc, sc->tx.uapsdq, tid);
- if (!bf)
+ ret = ath_tx_get_tid_subframe(sc, sc->tx.uapsdq, tid, &bf);
+ if (ret < 0)
break;

ath9k_set_moredata(sc, bf, true);
@@ -1979,11 +1913,12 @@ void ath_tx_cleanupq(struct ath_softc *sc, struct ath_txq *txq)
*/
void ath_txq_schedule(struct ath_softc *sc, struct ath_txq *txq)
{
+ struct ieee80211_hw *hw = sc->hw;
struct ath_common *common = ath9k_hw_common(sc->sc_ah);
+ struct ieee80211_txq *queue;
struct ath_atx_tid *tid;
- struct list_head *tid_list;
- struct ath_acq *acq;
- bool active = AIRTIME_ACTIVE(sc->airtime_flags);
+ bool first = true;
+ int ret;

if (txq->mac80211_qnum < 0)
return;
@@ -1993,51 +1928,19 @@ void ath_txq_schedule(struct ath_softc *sc, struct ath_txq *txq)

spin_lock_bh(&sc->chan_lock);
rcu_read_lock();
- acq = &sc->cur_chan->acq[txq->mac80211_qnum];

if (sc->cur_chan->stopped)
goto out;

-begin:
- tid_list = &acq->acq_new;
- if (list_empty(tid_list)) {
- tid_list = &acq->acq_old;
- if (list_empty(tid_list))
- goto out;
- }
- tid = list_first_entry(tid_list, struct ath_atx_tid, list);
-
- if (active && tid->an->airtime_deficit[txq->mac80211_qnum] <= 0) {
- spin_lock_bh(&acq->lock);
- tid->an->airtime_deficit[txq->mac80211_qnum] += ATH_AIRTIME_QUANTUM;
- list_move_tail(&tid->list, &acq->acq_old);
- spin_unlock_bh(&acq->lock);
- goto begin;
- }
-
- if (!ath_tid_has_buffered(tid)) {
- spin_lock_bh(&acq->lock);
- if ((tid_list == &acq->acq_new) && !list_empty(&acq->acq_old))
- list_move_tail(&tid->list, &acq->acq_old);
- else {
- list_del_init(&tid->list);
- }
- spin_unlock_bh(&acq->lock);
- goto begin;
- }
+ while ((queue = ieee80211_next_txq(hw, txq->mac80211_qnum, first))) {
+ tid = (struct ath_atx_tid *) queue->drv_priv;
+ first = false;

+ ret = ath_tx_sched_aggr(sc, txq, tid);
+ ath_dbg(common, QUEUE, "ath_tx_sched_aggr returned %d\n", ret);

- /*
- * If we succeed in scheduling something, immediately restart to make
- * sure we keep the HW busy.
- */
- if(ath_tx_sched_aggr(sc, txq, tid)) {
- if (!active) {
- spin_lock_bh(&acq->lock);
- list_move_tail(&tid->list, &acq->acq_old);
- spin_unlock_bh(&acq->lock);
- }
- goto begin;
+ if (ret != -ENOENT)
+ ieee80211_schedule_txq(hw, queue);
}

out:
@@ -2886,9 +2789,6 @@ void ath_tx_node_init(struct ath_softc *sc, struct ath_node *an)
struct ath_atx_tid *tid;
int tidno, acno;

- for (acno = 0; acno < IEEE80211_NUM_ACS; acno++)
- an->airtime_deficit[acno] = ATH_AIRTIME_QUANTUM;
-
for (tidno = 0; tidno < IEEE80211_NUM_TIDS; tidno++) {
tid = ath_node_to_tid(an, tidno);
tid->an = an;
@@ -2898,7 +2798,6 @@ void ath_tx_node_init(struct ath_softc *sc, struct ath_node *an)
tid->baw_head = tid->baw_tail = 0;
tid->active = false;
tid->clear_ps_filter = true;
- tid->has_queued = false;
__skb_queue_head_init(&tid->retry_q);
INIT_LIST_HEAD(&tid->list);
acno = TID_TO_WME_AC(tidno);

2018-09-10 18:02:58

by Toke Høiland-Jørgensen

[permalink] [raw]
Subject: Re: [PATCH RFC v3 1/4] mac80211: Add TXQ scheduling API

Johannes Berg <[email protected]> writes:

> On Mon, 2018-09-10 at 14:39 +0200, Toke H=C3=B8iland-J=C3=B8rgensen wrote:
>
>> > From then on, right now we basically just try to refill the hardware
>> > queue from the TXQ whenever the hardware releases frames from that
>> > queue. If there are none, we fall back to putting them on the hardware
>> > queue from the wake signal.
>>=20
>> OK. So if the TXQ is ineligible to dequeue packets, but still have them
>> queued, there would need to be a signal (such as wake_txq()) when it
>> becomes eligible again, right?
>
> Right. It wouldn't have to be wake_txq, but that seems as appropriate as
> anything else.
>
> If we were to use ieee80211_airtime_may_transmit(), we'd have to have
> something that tells us "I previously told you it may not transmit, but
> now I changed my mind" :-)

Yes, exactly. Hmm, I'll see what I can come up with :)

-Toke

2018-09-10 19:54:43

by Toke Høiland-Jørgensen

[permalink] [raw]
Subject: Re: [PATCH RFC v3 1/4] mac80211: Add TXQ scheduling API

Johannes Berg <[email protected]> writes:

> On Mon, 2018-09-10 at 15:18 +0200, Toke H=C3=B8iland-J=C3=B8rgensen wrote:
>
>> If we have the start_schedule() / end_schedule() pair anyway, the latter
>> could notify any TXQs that became eligible during the scheduling round.
>
> Do we even need end_schedule()? It's hard to pass multiple things to a
> single call (do you build a list?), so having
>
> start_schedule(), get_txq(), return_txq()
>
> would be sufficient?

Well, start_schedule() / end_schedule() would be needed if we are going
to add locking in mac80211?

>> Also, instead of having the three different API functions
>> (next_txq()/may_tx()/schedule_txq()), we could have get_txq(txq)/put_tx=
q(txq)
>> which would always need to be paired; but the argument to get_txq()
>> could be optional, and if the driver passes NULL it means "give me the
>> next available TXQ".
>
> I can't say I like this. It makes the meaning totally different:
>
> * with NULL: use the internal scheduler to determine which one is good
> to use next
> * non-NULL: essentially equivalent to may_tx()

Yeah, it'll be two completely different uses for the same function; but
there wouldn't be two different APIs to keep track of. I'm fine with
keeping them as separately named functions. :)

>> So for ath9k it would be:
>>=20
>>=20
>> start_schedule(ac);
>> while ((txq =3D get_txq(NULL)) {
>> queue_aggregate(txq);
>> put_txq(txq);
>> }
>> end_schedule(ac);
>>=20
>> And for ath10k/iwlwifi it would be:
>>=20
>> on_hw_notify(txq) {
>> start_schedule(ac);
>> if (txq =3D get_txq(txq)) {
>> queue_packets(txq);
>> put_txq(txq);
>> }
>> end_schedule(ac);
>> }
>>=20
>>=20
>> I think that would be simpler, API-wise?
>
> I can't say I see much point in overloading get_txq() that way. You'd
> never use it the same way.
>
> Also, would you really start_schedule(ac) in the hw-managed case?

If we decide mac80211 needs to do locking to prevent two threads from
scheduling the same ac, that would also be needed for the hw-managed
case?

> It seems like not? Basically it seems to me that in the hw-managed
> case all you need is may_tx()? And in fact, once you opt in you don't
> even really need *that* since mac80211 can just return NULL from
> get_skb()?

Yeah, we could just throttle in get_skb(); the separate call was to
avoid the overhead of the check for every packet. Typically, you'll pick
a TXQ, then dequeue multiple packets from it in succession; with a
separate call to may_tx(), you only do the check once, not for every
packet...

-Toke

2018-09-10 17:33:01

by Toke Høiland-Jørgensen

[permalink] [raw]
Subject: Re: [PATCH RFC v3 1/4] mac80211: Add TXQ scheduling API

Johannes Berg <[email protected]> writes:

> On Mon, 2018-09-10 at 12:57 +0200, Toke H=C3=B8iland-J=C3=B8rgensen wrote:
>> Johannes Berg <[email protected]> writes:
>>=20
>> > On Sat, 2018-09-08 at 00:22 +0200, Toke H=C3=B8iland-J=C3=B8rgensen wr=
ote:
>> > >=20
>> > > Usage of the new API is optional, so drivers can be ported one at a =
time.
>> >=20
>> > With the 1:1 hardware queue/txq mapping in iwlwifi (we're close to
>> > getting that patch in, though now the Jewish holidays mean a delay),
>> > I'm not sure we'd be able to do this at all in iwlwifi. So this may
>> > not be a case of porting one at a time until we can get rid of it ...
>>=20
>> Could you elaborate a bit more on how the hwq/txq stuff works in iwl?
>
> I can try.
>
>> Does the driver just hand off a TXQ to the hardware on wake_txq(), which
>> is then scheduled by the hardware after that? Or how does the mapping to
>> hwqs work, and how is the hardware notified that there are still packets
>> queued / that new packets have arrived for an already mapped txq?
>
> Basically, we use the TXQ driver data to do this. On initialization, we
> set all TXQs to have no hardware queue mapped. Then, when the first wake
> happens for this TXQ, we allocate a hardware queue for this RA/TID.
>
> From then on, right now we basically just try to refill the hardware
> queue from the TXQ whenever the hardware releases frames from that
> queue. If there are none, we fall back to putting them on the hardware
> queue from the wake signal.

OK. So if the TXQ is ineligible to dequeue packets, but still have them
queued, there would need to be a signal (such as wake_txq()) when it
becomes eligible again, right?

-Toke

2018-09-10 16:06:13

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH RFC v3 1/4] mac80211: Add TXQ scheduling API

On Mon, 2018-09-10 at 13:02 +0200, Toke Høiland-Jørgensen wrote:
>
> > Is there a case where it's allowed to call this while *not* empty? At
> > least for the drivers it seems not, so perhaps when called from the
> > driver it should WARN() here or so?
>
> Yeah, mac80211 calls this on wake_txq, where it will often be scheduled
> already. And since that can happen while drivers schedule stuff, it
> could also be the case that a driver re-schedules a queue that is
> already scheduled.

Right, I kinda gathered that but was thinking more from a driver POV as
I was reviewing, makes sense.

> > I'm just a bit worried that drivers will get this a bit wrong, and
> > we'll never know, but things won't work properly in some weird way.
>
> What, subtle bugs in the data path that causes hangs in hard-to-debug
> cases? Nah, that never happens ;)

Good :-P

Actually though, we can't put a warn on there, because the driver might
take a txq, then mac80211 puts it on the list due to a new packet, and
then the driver also returns it.


> > > + txqi = list_first_entry(&local->active_txqs[ac],
> > > + struct txq_info,
> > > + schedule_order);
> > > +
> > > + if (txqi->schedule_round == local->schedule_round[ac])
> > > + txqi = NULL;
> >
> > that assigment seems unnecessary? txqi defaults to NULL.
>
> No, this is an 'undo' of the previous line. I.e., we get the first txqi,
> check if we've already seen it this round, and if we have, unset txqi so
> the function returns NULL (causing the driver to stop looping).

Err, yeah, obviously.

johannes

2018-09-11 14:47:11

by Toke Høiland-Jørgensen

[permalink] [raw]
Subject: Re: [PATCH RFC v3 1/4] mac80211: Add TXQ scheduling API

Johannes Berg <[email protected]> writes:

> On Mon, 2018-09-10 at 17:00 +0200, Toke H=C3=B8iland-J=C3=B8rgensen wrote:
>
>> > Do we even need end_schedule()? It's hard to pass multiple things to a
>> > single call (do you build a list?), so having
>> >=20
>> > start_schedule(), get_txq(), return_txq()
>> >=20
>> > would be sufficient?
>>=20
>> Well, start_schedule() / end_schedule() would be needed if we are going
>> to add locking in mac80211?
>
> [...]
>
>> If we decide mac80211 needs to do locking to prevent two threads from
>> scheduling the same ac, that would also be needed for the hw-managed
>> case?
>
> Yes, good point.
>
>> > It seems like not? Basically it seems to me that in the hw-managed
>> > case all you need is may_tx()? And in fact, once you opt in you don't
>> > even really need *that* since mac80211 can just return NULL from
>> > get_skb()?
>>=20
>> Yeah, we could just throttle in get_skb(); the separate call was to
>> avoid the overhead of the check for every packet. Typically, you'll pick
>> a TXQ, then dequeue multiple packets from it in succession; with a
>> separate call to may_tx(), you only do the check once, not for every
>> packet...
>
> Yeah, also a good point.
>
> Still, txq =3D get_txq(txq) doesn't feel right, so better to keep that
> separate I think.

Right, will do :)

-Toke

2018-09-10 15:51:09

by Toke Høiland-Jørgensen

[permalink] [raw]
Subject: Re: [PATCH RFC v3 1/4] mac80211: Add TXQ scheduling API

Johannes Berg <[email protected]> writes:

> On Sat, 2018-09-08 at 00:22 +0200, Toke H=C3=B8iland-J=C3=B8rgensen wrote:
>>=20
>> Usage of the new API is optional, so drivers can be ported one at a time.
>
> With the 1:1 hardware queue/txq mapping in iwlwifi (we're close to
> getting that patch in, though now the Jewish holidays mean a delay),
> I'm not sure we'd be able to do this at all in iwlwifi. So this may
> not be a case of porting one at a time until we can get rid of it ...

Could you elaborate a bit more on how the hwq/txq stuff works in iwl?
Does the driver just hand off a TXQ to the hardware on wake_txq(), which
is then scheduled by the hardware after that? Or how does the mapping to
hwqs work, and how is the hardware notified that there are still packets
queued / that new packets have arrived for an already mapped txq?

> It would be nice to be able to use it, for better queue behaviour, but
> it would mean much more accounting in iwlwifi? Not even sure off the
> top of my head how to do that.

I think we'll need to have some kind of fallback airtime estimation in
mac80211 that calculates airtime from packet size and rate information.
Not all drivers can get this from the hardware, it seems. See my reply
to Kan about the BQL-like behaviour as well.

Does iwl get airtime usage information for every packet on tx complete?

-Toke

2018-09-10 16:20:14

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH RFC v3 0/4] Move TXQ scheduling into mac80211

On Mon, 2018-09-10 at 13:17 +0200, Toke Høiland-Jørgensen wrote:

> > > - I did not add any locking around next_txq(); the driver is still supposed
> > > to maintain a lock that prevents two threads from trying to schedule the
> > > same AC at the same time. This is what drivers already do, so I figured it
> > > was easier to just keep it that way rather than do it in mac80211.
> >
> > I'll look at this in the code, but from a maintainer perspective I'm
> > somewhat worried that this will lead to issues that are really the
> > driver's fault, but surface in mac80211. I don't know how easy it
> > would be to catch that.
>
> Yeah, I get what you mean. The alternative would be to have a
> ieee80211_start_schedule(ac) and ieee80211_end_schedule(ac), which
> basically just takes a lock.

And I guess start would increment the schedule number, which is now
dependent on first

> Would mean we could get rid of the 'first'
> parameter for next_txq(), so might not be such a bad idea;

Right, that's what I meant.

> and if the
> driver has its own locking the extra locking in mac80211 would just be
> an always-uncontested spinlock, which shouldn't be much overhead, right?

It may still bounce around CPUs if you call this from other places, but
I suspect that wouldn't be the biggest issue. There are a lot of
calculations going on too...

johannes

2018-09-13 14:34:03

by Toke Høiland-Jørgensen

[permalink] [raw]
Subject: Re: [PATCH RFC v3 0/4] Move TXQ scheduling into mac80211

Kan Yan <[email protected]> writes:

>> I think this is a great approach, that we should definitely adopt in
>> mac80211. However, I'm not sure the outstanding airtime limiting should
>> be integrated into the fairness scheduling, since there are drivers such
>> as ath9k that don't need it.
>>
>> Rather, I'd propose that we figure out the API for fairness scheduling
>> first, and add the BQL-like limiter as a separate layer. They can be
>> integrated in the sense that the limiter's estimate of airtime can be
>> used for fairness scheduling as well in the case where the
>> driver/firmware doesn't have a better source of airtime usage.
>>
>> Does that make sense?
> Sure that make sense. Yes, the airtime based BQL like queue limiter is
> not needed for drivers such as ath9k. We could leave the outstanding
> airtime accounting and queue depth limit to another subject and get
> airtime fairness scheduling API done first.

Cool! I was thinking I would stub out an untested PoC in a separate
patch on my next submission, to show how I think this could be
incorporated. Have some other stuff to finish up this week, but
hopefully I'll have time to do the next version over the weekend :)

-Toke

2018-09-10 17:40:33

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH RFC v3 1/4] mac80211: Add TXQ scheduling API

On Mon, 2018-09-10 at 14:39 +0200, Toke Høiland-Jørgensen wrote:

> > From then on, right now we basically just try to refill the hardware
> > queue from the TXQ whenever the hardware releases frames from that
> > queue. If there are none, we fall back to putting them on the hardware
> > queue from the wake signal.
>
> OK. So if the TXQ is ineligible to dequeue packets, but still have them
> queued, there would need to be a signal (such as wake_txq()) when it
> becomes eligible again, right?

Right. It wouldn't have to be wake_txq, but that seems as appropriate as
anything else.

If we were to use ieee80211_airtime_may_transmit(), we'd have to have
something that tells us "I previously told you it may not transmit, but
now I changed my mind" :-)

johannes

2018-09-10 16:18:39

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH RFC v3 0/4] Move TXQ scheduling into mac80211

On Mon, 2018-09-10 at 13:16 +0200, Toke Høiland-Jørgensen wrote:

> > > - I didn't get rid of the register_airtime() callback. As far as I can tell,
> > > only iwlwifi uses the tx_time field in the struct tx_info. Which means that
> > > we *could* probably use it for this and just make the other drivers set it;
> > > but I'm not sure what effects that would have in relation to WMM-AC for
> > > those drivers, so I chickened out. Will have to try it out, I guess; but it
> > > also depends on whether ath10k needs to be able to report airtime
> > > asynchronously anyway. So I'll hold off on that for a bit more.
> >
> > I don't think you need to be concerned, the reporting through this has
> > no immediate effect as the driver would also have to set the feature
> > flag (NL80211_FEATURE_SUPPORTS_WMM_ADMISSION) for userspace to be able
> > to use WMM admission TSPECs, and getting tx_tspec->admitted_time to be
> > non-zero in ieee80211_sta_tx_wmm_ac_notify().
>
> Great! In that case I'll try moving the reporting go through the tx_info
> struct and check that it works for ath9k :)

I'm not sure it would work for everyone (though I guess it should for
ath9k), so it may well be necessary to report it out-of-band from
frames. My objection wasn't so much to the out-of-band mechanism, but to
the fact that we have two reporting mechanisms that are each tied to one
feature, while reporting the same thing.

So if you figure out that you need to keep the reporting out-of-band I'm
perfectly fine with that, but I guess I'd argue the two should cross
over: reporting in-band should feed back to this, reporting out-of-band
should feed the admission mechanism.

johannes

2018-09-10 18:12:46

by Toke Høiland-Jørgensen

[permalink] [raw]
Subject: Re: [PATCH RFC v3 1/4] mac80211: Add TXQ scheduling API

Johannes Berg <[email protected]> writes:

> On Mon, 2018-09-10 at 15:08 +0200, Toke H=C3=B8iland-J=C3=B8rgensen wrote:
>> Johannes Berg <[email protected]> writes:
>>=20
>> > On Mon, 2018-09-10 at 14:39 +0200, Toke H=C3=B8iland-J=C3=B8rgensen wr=
ote:
>> >=20
>> > > > From then on, right now we basically just try to refill the hardwa=
re
>> > > > queue from the TXQ whenever the hardware releases frames from that
>> > > > queue. If there are none, we fall back to putting them on the hard=
ware
>> > > > queue from the wake signal.
>> > >=20
>> > > OK. So if the TXQ is ineligible to dequeue packets, but still have t=
hem
>> > > queued, there would need to be a signal (such as wake_txq()) when it
>> > > becomes eligible again, right?
>> >=20
>> > Right. It wouldn't have to be wake_txq, but that seems as appropriate =
as
>> > anything else.
>> >=20
>> > If we were to use ieee80211_airtime_may_transmit(), we'd have to have
>> > something that tells us "I previously told you it may not transmit, but
>> > now I changed my mind" :-)
>>=20
>> Yes, exactly. Hmm, I'll see what I can come up with :)
>
> No need to look at this right now - let's get this stuff sorted out
> first :)

Right, sure, I'll get the port of ath9k done first; but doesn't hurt to
think about API compatibility with the other drivers at the same time as
well...

If we have the start_schedule() / end_schedule() pair anyway, the latter
could notify any TXQs that became eligible during the scheduling round.

Also, instead of having the three different API functions
(next_txq()/may_tx()/schedule_txq()), we could have get_txq(txq)/put_txq(t=
xq)
which would always need to be paired; but the argument to get_txq()
could be optional, and if the driver passes NULL it means "give me the
next available TXQ".

So for ath9k it would be:


start_schedule(ac);
while ((txq =3D get_txq(NULL)) {
queue_aggregate(txq);
put_txq(txq);
}
end_schedule(ac);

And for ath10k/iwlwifi it would be:

on_hw_notify(txq) {
start_schedule(ac);
if (txq =3D get_txq(txq)) {
queue_packets(txq);
put_txq(txq);
}
end_schedule(ac);
}


I think that would be simpler, API-wise?

-Toke

2018-09-08 03:05:58

by Toke Høiland-Jørgensen

[permalink] [raw]
Subject: [PATCH RFC v3 1/4] mac80211: Add TXQ scheduling API

This adds an API to mac80211 to handle scheduling of TXQs and changes the
interface between driver and mac80211 for TXQ handling as follows:

- The wake_tx_queue callback interface includes only the ac number
instead of the TXQ. The driver is expected to retrieve TXQs from
ieee80211_next_txq()

- Two new mac80211 functions are added: ieee80211_next_txq() and
ieee80211_schedule_txq(). The former returns the next TXQ that should be
scheduled, and is how the driver gets a queue to pull packets from. The
latter is called internally by mac80211 to start scheduling a queue, and
the driver is supposed to call it to re-schedule the TXQ after it is
finished pulling packets from it (unless the queue emptied). Drivers
can optionally filter TXQs on ac to support per-AC hardware queue
designs.

Usage of the new API is optional, so drivers can be ported one at a time.

Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
---
include/net/mac80211.h | 45 ++++++++++++++++++++++++++++++---
net/mac80211/agg-tx.c | 2 +
net/mac80211/driver-ops.h | 7 +++++
net/mac80211/ieee80211_i.h | 9 +++++++
net/mac80211/main.c | 4 +++
net/mac80211/sta_info.c | 2 +
net/mac80211/tx.c | 60 +++++++++++++++++++++++++++++++++++++++++++-
7 files changed, 122 insertions(+), 7 deletions(-)

diff --git a/include/net/mac80211.h b/include/net/mac80211.h
index c4fadbafbf21..ff413d9caa50 100644
--- a/include/net/mac80211.h
+++ b/include/net/mac80211.h
@@ -108,9 +108,16 @@
* The driver is expected to initialize its private per-queue data for stations
* and interfaces in the .add_interface and .sta_add ops.
*
- * The driver can't access the queue directly. To dequeue a frame, it calls
- * ieee80211_tx_dequeue(). Whenever mac80211 adds a new frame to a queue, it
- * calls the .wake_tx_queue driver op.
+ * The driver can't access the queue directly. To dequeue a frame from a
+ * txq, it calls ieee80211_tx_dequeue(). Whenever mac80211 adds a new frame to a
+ * queue, it calls the .wake_tx_queue driver op.
+ *
+ * Drivers can optionally delegate responsibility for scheduling queues to
+ * mac80211, to take advantage of airtime fairness accounting. In this case, to
+ * obtain the next queue to pull frames from, the driver calls
+ * ieee80211_next_txq(). The driver is then expected to re-schedule the txq
+ * using ieee80211_schedule_txq() if it is still active after the driver has
+ * finished pulling packets from it.
*
* For AP powersave TIM handling, the driver only needs to indicate if it has
* buffered packets in the driver specific data structures by calling
@@ -6045,13 +6052,43 @@ void ieee80211_unreserve_tid(struct ieee80211_sta *sta, u8 tid);
* ieee80211_tx_dequeue - dequeue a packet from a software tx queue
*
* @hw: pointer as obtained from ieee80211_alloc_hw()
- * @txq: pointer obtained from station or virtual interface
+ * @txq: pointer obtained from station or virtual interface, or from
+ * ieee80211_next_txq()
*
* Returns the skb if successful, %NULL if no frame was available.
*/
struct sk_buff *ieee80211_tx_dequeue(struct ieee80211_hw *hw,
struct ieee80211_txq *txq);

+/**
+ * ieee80211_schedule_txq - add txq to scheduling loop
+ *
+ * @hw: pointer as obtained from ieee80211_alloc_hw()
+ * @txq: pointer obtained from station or virtual interface
+ *
+ * Returns %true if the txq was actually added to the scheduling,
+ * %false otherwise.
+ */
+bool ieee80211_schedule_txq(struct ieee80211_hw *hw,
+ struct ieee80211_txq *txq);
+
+/**
+ * ieee80211_next_txq - get next tx queue to pull packets from
+ *
+ * @hw: pointer as obtained from ieee80211_alloc_hw()
+ * @ac: AC number to return packets from.
+ * @first: Setting this to true signifies the start of a new scheduling round.
+ * Each TXQ will only be returned at most exactly once in each round.
+ *
+ * Returns the next txq if successful, %NULL if no queue is eligible. If a txq
+ * is returned, it will have been removed from the scheduler queue and needs to
+ * be re-scheduled with ieee80211_schedule_txq() to continue to be active. Note
+ * that the driver is responsible for locking to ensuring two different threads
+ * don't schedule the same AC simultaneously.
+ */
+struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, s8 ac,
+ bool first);
+
/**
* ieee80211_txq_get_depth - get pending frame/byte count of given txq
*
diff --git a/net/mac80211/agg-tx.c b/net/mac80211/agg-tx.c
index 69e831bc317b..e94b1a0407af 100644
--- a/net/mac80211/agg-tx.c
+++ b/net/mac80211/agg-tx.c
@@ -229,7 +229,7 @@ ieee80211_agg_start_txq(struct sta_info *sta, int tid, bool enable)
clear_bit(IEEE80211_TXQ_STOP, &txqi->flags);
local_bh_disable();
rcu_read_lock();
- drv_wake_tx_queue(sta->sdata->local, txqi);
+ schedule_and_wake_txq(sta->sdata->local, txqi);
rcu_read_unlock();
local_bh_enable();
}
diff --git a/net/mac80211/driver-ops.h b/net/mac80211/driver-ops.h
index e42c641b6190..f1578c394f75 100644
--- a/net/mac80211/driver-ops.h
+++ b/net/mac80211/driver-ops.h
@@ -1173,6 +1173,13 @@ static inline void drv_wake_tx_queue(struct ieee80211_local *local,
local->ops->wake_tx_queue(&local->hw, &txq->txq);
}

+static inline void schedule_and_wake_txq(struct ieee80211_local *local,
+ struct txq_info *txqi)
+{
+ ieee80211_schedule_txq(&local->hw, &txqi->txq);
+ drv_wake_tx_queue(local, txqi);
+}
+
static inline int drv_can_aggregate_in_amsdu(struct ieee80211_local *local,
struct sk_buff *head,
struct sk_buff *skb)
diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h
index f40a2167935f..75f9c9ab364f 100644
--- a/net/mac80211/ieee80211_i.h
+++ b/net/mac80211/ieee80211_i.h
@@ -829,6 +829,8 @@ enum txq_info_flags {
* a fq_flow which is already owned by a different tin
* @def_cvars: codel vars for @def_flow
* @frags: used to keep fragments created after dequeue
+ * @schedule_order: used with ieee80211_local->active_txqs
+ * @schedule_round: counter to prevent infinite loops on TXQ scheduling
*/
struct txq_info {
struct fq_tin tin;
@@ -836,6 +838,8 @@ struct txq_info {
struct codel_vars def_cvars;
struct codel_stats cstats;
struct sk_buff_head frags;
+ struct list_head schedule_order;
+ u16 schedule_round;
unsigned long flags;

/* keep last! */
@@ -1127,6 +1131,11 @@ struct ieee80211_local {
struct codel_vars *cvars;
struct codel_params cparams;

+ /* protects active_txqs and txqi->schedule_order */
+ spinlock_t active_txq_lock;
+ struct list_head active_txqs[IEEE80211_NUM_ACS];
+ u16 schedule_round[IEEE80211_NUM_ACS];
+
const struct ieee80211_ops *ops;

/*
diff --git a/net/mac80211/main.c b/net/mac80211/main.c
index 77381017bac7..c6d88758cbb7 100644
--- a/net/mac80211/main.c
+++ b/net/mac80211/main.c
@@ -663,6 +663,10 @@ struct ieee80211_hw *ieee80211_alloc_hw_nm(size_t priv_data_len,
spin_lock_init(&local->rx_path_lock);
spin_lock_init(&local->queue_stop_reason_lock);

+ for (i = 0; i < IEEE80211_NUM_ACS; i++)
+ INIT_LIST_HEAD(&local->active_txqs[i]);
+ spin_lock_init(&local->active_txq_lock);
+
INIT_LIST_HEAD(&local->chanctx_list);
mutex_init(&local->chanctx_mtx);

diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index fb8c2252ac0e..c2f5cb7df54f 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -1249,7 +1249,7 @@ void ieee80211_sta_ps_deliver_wakeup(struct sta_info *sta)
if (!sta->sta.txq[i] || !txq_has_queue(sta->sta.txq[i]))
continue;

- drv_wake_tx_queue(local, to_txq_info(sta->sta.txq[i]));
+ schedule_and_wake_txq(local, to_txq_info(sta->sta.txq[i]));
}

skb_queue_head_init(&pending);
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index c42bfa1dcd2c..8bcc447e2cbc 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -1445,6 +1445,7 @@ void ieee80211_txq_init(struct ieee80211_sub_if_data *sdata,
codel_vars_init(&txqi->def_cvars);
codel_stats_init(&txqi->cstats);
__skb_queue_head_init(&txqi->frags);
+ INIT_LIST_HEAD(&txqi->schedule_order);

txqi->txq.vif = &sdata->vif;

@@ -1485,6 +1486,9 @@ void ieee80211_txq_purge(struct ieee80211_local *local,

fq_tin_reset(fq, tin, fq_skb_free_func);
ieee80211_purge_tx_queue(&local->hw, &txqi->frags);
+ spin_lock_bh(&local->active_txq_lock);
+ list_del_init(&txqi->schedule_order);
+ spin_unlock_bh(&local->active_txq_lock);
}

void ieee80211_txq_set_params(struct ieee80211_local *local)
@@ -1601,7 +1605,7 @@ static bool ieee80211_queue_skb(struct ieee80211_local *local,
ieee80211_txq_enqueue(local, txqi, skb);
spin_unlock_bh(&fq->lock);

- drv_wake_tx_queue(local, txqi);
+ schedule_and_wake_txq(local, txqi);

return true;
}
@@ -3623,6 +3627,60 @@ struct sk_buff *ieee80211_tx_dequeue(struct ieee80211_hw *hw,
}
EXPORT_SYMBOL(ieee80211_tx_dequeue);

+bool ieee80211_schedule_txq(struct ieee80211_hw *hw,
+ struct ieee80211_txq *txq)
+{
+ struct ieee80211_local *local = hw_to_local(hw);
+ struct txq_info *txqi = to_txq_info(txq);
+ bool ret = false;
+
+ spin_lock_bh(&local->active_txq_lock);
+
+ if (list_empty(&txqi->schedule_order)) {
+ list_add_tail(&txqi->schedule_order,
+ &local->active_txqs[txq->ac]);
+ ret = true;
+ }
+
+ spin_unlock_bh(&local->active_txq_lock);
+
+ return ret;
+}
+EXPORT_SYMBOL(ieee80211_schedule_txq);
+
+struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, s8 ac,
+ bool first)
+{
+ struct ieee80211_local *local = hw_to_local(hw);
+ struct txq_info *txqi = NULL;
+
+ spin_lock_bh(&local->active_txq_lock);
+
+ if (first)
+ local->schedule_round[ac]++;
+
+ if (list_empty(&local->active_txqs[ac]))
+ goto out;
+
+ txqi = list_first_entry(&local->active_txqs[ac],
+ struct txq_info,
+ schedule_order);
+
+ if (txqi->schedule_round == local->schedule_round[ac])
+ txqi = NULL;
+ else
+ list_del_init(&txqi->schedule_order);
+ out:
+ spin_unlock_bh(&local->active_txq_lock);
+
+ if (!txqi)
+ return NULL;
+
+ txqi->schedule_round = local->schedule_round[ac];
+ return &txqi->txq;
+}
+EXPORT_SYMBOL(ieee80211_next_txq);
+
void __ieee80211_subif_start_xmit(struct sk_buff *skb,
struct net_device *dev,
u32 info_flags)

2018-09-10 12:57:23

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH RFC v3 1/4] mac80211: Add TXQ scheduling API

On Sat, 2018-09-08 at 00:22 +0200, Toke Høiland-Jørgensen wrote:

> +/**
> + * ieee80211_next_txq - get next tx queue to pull packets from
> + *
> + * @hw: pointer as obtained from ieee80211_alloc_hw()
> + * @ac: AC number to return packets from.
> + * @first: Setting this to true signifies the start of a new scheduling round.
> + * Each TXQ will only be returned at most exactly once in each round.
> + *
> + * Returns the next txq if successful, %NULL if no queue is eligible. If a txq
> + * is returned, it will have been removed from the scheduler queue and needs to
> + * be re-scheduled with ieee80211_schedule_txq() to continue to be active. Note
> + * that the driver is responsible for locking to ensuring two different threads
> + * don't schedule the same AC simultaneously.

"Note that [...] to ensure [...]" would seem better to me?

> + */
> +struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, s8 ac,
> + bool first);

Why should "ac" be signed? Is there some (undocumented) behaviour that
allows for passing -1 for "don't care" or "pick highest with frames"?

You said "optionally" in the commit message, but I don't think that's
true, since you just use ac to index the array in the code.

> +bool ieee80211_schedule_txq(struct ieee80211_hw *hw,
> + struct ieee80211_txq *txq)
> +{
> + struct ieee80211_local *local = hw_to_local(hw);
> + struct txq_info *txqi = to_txq_info(txq);
> + bool ret = false;
> +
> + spin_lock_bh(&local->active_txq_lock);
> +
> + if (list_empty(&txqi->schedule_order)) {

Is there a case where it's allowed to call this while *not* empty? At
least for the drivers it seems not, so perhaps when called from the
driver it should WARN() here or so?

I'm just a bit worried that drivers will get this a bit wrong, and we'll
never know, but things won't work properly in some weird way.

> + list_add_tail(&txqi->schedule_order,
> + &local->active_txqs[txq->ac]);
> + ret = true;
> + }
> +
> + spin_unlock_bh(&local->active_txq_lock);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL(ieee80211_schedule_txq);

I'd remove the return value - the driver has no need to know that it
raced with potential rescheduling in mac80211 due to a new packet, and
you don't use the return value in mac80211 either.

It also seems to me that you forgot to say when it should be
rescheduled? As is, following the documentation would sort of makes it
seem you should take the queue and put it back at the end (without any
conditions), and that could lead to "scheduling loops" since empty
queues would always be put back when they shouldn't be?

Also, come to think of it, but I guess the next patch(es) solve(s) that
- if mac80211 adds a packet while you have this queue scheduled then you
would take that packet even if it's questionable it belongs to this
round?

>From a driver perspective, I think I'd prefer an API called e.g.

ieee80211_return_txq()

that does the check internally if we need to schedule the queue (i.e.
there are packets on it). I was going to say we could even annotate with
sparse, but no - we can't because ieee80211_next_txq() may return NULL.
Still, it's easier to ensure that all cleanup paths call
ieee80211_return_txq() if it's not conditional, IMHO.


> +struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, s8 ac,
> + bool first)
> +{
> + struct ieee80211_local *local = hw_to_local(hw);
> + struct txq_info *txqi = NULL;
> +
> + spin_lock_bh(&local->active_txq_lock);
> +
> + if (first)
> + local->schedule_round[ac]++;

(here - clearly ac must be 0 <= ac < IEEE80211_NUM_ACS)

> + if (list_empty(&local->active_txqs[ac]))
> + goto out;
> +
> + txqi = list_first_entry(&local->active_txqs[ac],
> + struct txq_info,
> + schedule_order);
> +
> + if (txqi->schedule_round == local->schedule_round[ac])
> + txqi = NULL;

that assigment seems unnecessary? txqi defaults to NULL.

> + else
> + list_del_init(&txqi->schedule_order);
> + out:
> + spin_unlock_bh(&local->active_txq_lock);
> +
> + if (!txqi)
> + return NULL;
> +
> + txqi->schedule_round = local->schedule_round[ac];
> + return &txqi->txq;
> +}
> +EXPORT_SYMBOL(ieee80211_next_txq);

johannes

2018-09-11 14:18:42

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH RFC v3 1/4] mac80211: Add TXQ scheduling API

On Mon, 2018-09-10 at 17:00 +0200, Toke Høiland-Jørgensen wrote:

> > Do we even need end_schedule()? It's hard to pass multiple things to a
> > single call (do you build a list?), so having
> >
> > start_schedule(), get_txq(), return_txq()
> >
> > would be sufficient?
>
> Well, start_schedule() / end_schedule() would be needed if we are going
> to add locking in mac80211?

[...]

> If we decide mac80211 needs to do locking to prevent two threads from
> scheduling the same ac, that would also be needed for the hw-managed
> case?

Yes, good point.

> > It seems like not? Basically it seems to me that in the hw-managed
> > case all you need is may_tx()? And in fact, once you opt in you don't
> > even really need *that* since mac80211 can just return NULL from
> > get_skb()?
>
> Yeah, we could just throttle in get_skb(); the separate call was to
> avoid the overhead of the check for every packet. Typically, you'll pick
> a TXQ, then dequeue multiple packets from it in succession; with a
> separate call to may_tx(), you only do the check once, not for every
> packet...

Yeah, also a good point.

Still, txq = get_txq(txq) doesn't feel right, so better to keep that
separate I think.

johannes

2018-09-12 05:09:29

by Rajkumar Manoharan

[permalink] [raw]
Subject: Re: [PATCH RFC v3 2/4] mac80211: Add airtime accounting and scheduling to TXQs

On 2018-09-10 04:13, Toke Høiland-Jørgensen wrote:
> Johannes Berg <[email protected]> writes:
>>> - txqi->flags & (1<<IEEE80211_TXQ_STOP) ? "STOP" : "RUN",
>>> - txqi->flags & (1<<IEEE80211_TXQ_AMPDU) ? " AMPDU" : "",
>>> - txqi->flags & (1<<IEEE80211_TXQ_NO_AMSDU) ? " NO-AMSDU" :
>>> "");
>>> + txqi->flags & (1 << IEEE80211_TXQ_STOP) ? "STOP" : "RUN",
>>> + txqi->flags & (1 << IEEE80211_TXQ_AMPDU) ? " AMPDU" : "",
>>> + txqi->flags & (1 << IEEE80211_TXQ_NO_AMSDU) ? " NO-AMSDU"
>>> : "");
>>
>> consider BIT() instead as a cleanup? :)
>>
>> (or maybe this is just a leftover from merging some other patches?)
>
> Yeah, this is a merging artifact; Rajkumar's patch added another flag,
> that I removed again. Didn't notice that there was still a whitespace
> change in this patch...
>
I added the flag based on our last discussion. The driver needs to check
txq status for each tx_dequeue(). One time txq check is not sufficient
as it allows the driver to dequeue all frames from txq.

drv_func() {
while (ieee80211_airtime_may_transmit(txq) &&
hw_has_space() &&
(pkt = ieee80211_tx_dequeue(hw, txq)))
push_to_hw(pkt);
}

>>> +bool ieee80211_txq_may_transmit(struct ieee80211_hw *hw,
>>> + struct ieee80211_txq *txq)
>>> +{
>>> + struct ieee80211_local *local = hw_to_local(hw);
>>> + struct txq_info *txqi = to_txq_info(txq);
>>> + bool may_tx = false;
>>> +
>>> + spin_lock_bh(&local->active_txq_lock);
>>> +
>>> + if (ieee80211_txq_check_deficit(local, txqi)) {
>>> + may_tx = true;
>>> + list_del_init(&txqi->schedule_order);
>>

To handle above case, may_transmit should remove the node only
when it is in list.

if (list_empty(&txqi->schedule_order))
list_del_init(&txqi->schedule_order);

So that it can be used to determine whether txq is running negative.

-Rajkumar

2018-09-10 16:07:31

by Toke Høiland-Jørgensen

[permalink] [raw]
Subject: Re: [PATCH RFC v3 2/4] mac80211: Add airtime accounting and scheduling to TXQs

Johannes Berg <[email protected]> writes:

> On Sat, 2018-09-08 at 00:22 +0200, Toke H=C3=B8iland-J=C3=B8rgensen wrote:
>>=20
>> +/**
>> + * ieee80211_airtime_may_transmit - check whether TXQ is allowed to tra=
nsmit
>> + *
>> + * This function is used to check whether given txq is allowed to trans=
mit by
>> + * the airtime scheduler, and can be used by drivers to access the airt=
ime
>> + * fairness accounting without going using the scheduling order enfored=
by
>> + * next_txq().
>> + *
>> + * Returns %true if the airtime scheduler does not think the TXQ should=
be
>> + * throttled. This function can also have the side effect of rotating t=
he TXQ in
>> + * the scheduler rotation, which will eventually bring the deficit to p=
ositive
>> + * and allow the station to transmit again.
>> + *
>> + * If this function returns %true, the TXQ will have been removed from =
the
>> + * scheduling. It is the driver's responsibility to add it back (by cal=
ling
>> + * ieee80211_schedule_txq()) if needed.
>> + *
>> + * @hw: pointer as obtained from ieee80211_alloc_hw()
>> + * @txq: pointer obtained from station or virtual interface
>> + *
>> + */
>
> Spurious newline there at the end.
>
> (As an aside from the review, perhaps this would be what we could use in
> iwlwifi, but I'll need to think about _when_ to add it back to the
> scheduling later).

Yeah, this is the API that would be used by drivers where the actual
scheduling of TXQs is done by the hardware (i.e., ath10k in pull/push
mode, and iwlwifi if I understand you correctly); whereas the next_txq()
call is for drivers that do the scheduling in software (ath10k without
pull/push, ath9k and mt76).

Basically, the way I envision this is the drivers doing something like:

function on_hw_notify(hw, txq) {
if (ieee80211_airtime_may_transmit(txq)) {
while (hw_has_space() && (pkt =3D ieee80211_tx_dequeue(hw, txq)))
push_to_hw(pkt);

ieee80211_schedule_txq(txq);
}
}
=20=20=20=20=20=20
>> + * @NL80211_EXT_FEATURE_AIRTIME_FAIRNESS: Driver supports airtime fairn=
ess
>> + * scheduling.
>> + *
>> * @NUM_NL80211_EXT_FEATURES: number of extended features.
>> * @MAX_NL80211_EXT_FEATURES: highest extended feature index.
>> */
>> @@ -5269,6 +5272,7 @@ enum nl80211_ext_feature_index {
>> NL80211_EXT_FEATURE_SCAN_RANDOM_SN,
>> NL80211_EXT_FEATURE_SCAN_MIN_PREQ_CONTENT,
>> NL80211_EXT_FEATURE_CAN_REPLACE_PTK0,
>> + NL80211_EXT_FEATURE_AIRTIME_FAIRNESS,
>
> Why is this necessary in this patch?

It's meant as a way for the driver to signal that it has a source of
airtime information, and thus would like to be scheduled to take this
into account.

> I think that this should probably come with more documentation too,
> particularly what's expected of the driver here.

Right :)

> I'd prefer you reorder patches 2 and 3, and define everything in
> cfg80211 first. That also makes it easier to reason about what happens
> when the feature is not supported (since it will not be supported
> anywhere). Then this can be included in the cfg80211 patch instead,
> which places it better, and the mac80211 bits from the cfg80211 patch=20
> can be included here, which also places those better.

Good idea, will do!

>> - txqi->flags & (1<<IEEE80211_TXQ_STOP) ? "STOP" : "RUN",
>> - txqi->flags & (1<<IEEE80211_TXQ_AMPDU) ? " AMPDU" : "",
>> - txqi->flags & (1<<IEEE80211_TXQ_NO_AMSDU) ? " NO-AMSDU" : "");
>> + txqi->flags & (1 << IEEE80211_TXQ_STOP) ? "STOP" : "RUN",
>> + txqi->flags & (1 << IEEE80211_TXQ_AMPDU) ? " AMPDU" : "",
>> + txqi->flags & (1 << IEEE80211_TXQ_NO_AMSDU) ? " NO-AMSDU" : "=
");
>
> consider BIT() instead as a cleanup? :)
>
> (or maybe this is just a leftover from merging some other patches?)

Yeah, this is a merging artifact; Rajkumar's patch added another flag,
that I removed again. Didn't notice that there was still a whitespace
change in this patch...

>> +static bool ieee80211_txq_check_deficit(struct ieee80211_local *local,
>> + struct txq_info *txqi)
>
> Maybe this should be called "has_deficit" or so - the polarity isn't
> immediately clear to me.

Yup, I suck at naming; gotcha ;)

>> +{
>> + struct sta_info *sta;
>> +
>> + lockdep_assert_held(&local->active_txq_lock);
>> +
>> + if (!txqi->txq.sta)
>> + return true;
>> +
>> + sta =3D container_of(txqi->txq.sta, struct sta_info, sta);
>> +
>> + if (sta->airtime.deficit[txqi->txq.ac] > 0)
>> + return true;
>> +
>> + if (txqi =3D=3D list_first_entry_or_null(&local->active_txqs[txqi->txq=
.ac],
>
> nit: I don't think you need _or_null here, since list_first_entry() will
> "return" the head itself if the list is empty, which cannot match txqi?
> Doesn't matter that much though, and if you think this is easier to
> understand then that's fine.
>
>> + struct txq_info,
>> + schedule_order)) {
>> + sta->airtime.deficit[txqi->txq.ac] +=3D sta->airtime.weight;
>> + list_move_tail(&txqi->schedule_order,
>> + &local->active_txqs[txqi->txq.ac]);
>> + }
>> +
>> + return false;
>> +}
>> +
>> struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, s8 ac,
>> bool first)
>> {
>> @@ -3659,6 +3701,7 @@ struct ieee80211_txq *ieee80211_next_txq(struct ie=
ee80211_hw *hw, s8 ac,
>> if (first)
>> local->schedule_round[ac]++;
>>=20=20
>> + begin:
>> if (list_empty(&local->active_txqs[ac]))
>> goto out;
>>=20=20
>> @@ -3666,6 +3709,9 @@ struct ieee80211_txq *ieee80211_next_txq(struct ie=
ee80211_hw *hw, s8 ac,
>> struct txq_info,
>> schedule_order);
>>=20=20
>> + if (!ieee80211_txq_check_deficit(local, txqi))
>> + goto begin;
>
> This code isn't so badly indented - you could use a do { } while () loop
> instead?
>
>> if (txqi->schedule_round =3D=3D local->schedule_round[ac])
>> txqi =3D NULL;
>
> and maybe you want this check first, it's much cheaper?
>
> do {
> ...
> } while (txqi->schedule_round !=3D local->schedule_round[ac] &&
> !has_deficit())
>
> or so?
>
> That is to say, I'm not sure why you'd want to abort if you find a
> queue that has no deficit but is part of the next round? Or is that
> the abort condition for having gone around the whole list once? Hmm.
> But check_deficit() also moves them to the end, so you don't really
> get that anyway?

The move to the end for check_deficit() is what replenishes the airtime
deficit and ensures fairness. So that can loop several times in a single
call to the function. The schedule_round counter is there to prevent the
driver from looping indefinitely when doing `while (txq =3D
ieee80211_next_txq())`. We'd generally want the deficit replenish to
happen in any case; previously, the schedule_round type check was in the
driver; moving it here is an attempt at simplifying the logic needed in
the driver when using the API.

>> @@ -3681,6 +3727,25 @@ struct ieee80211_txq *ieee80211_next_txq(struct i=
eee80211_hw *hw, s8 ac,
>> }
>> EXPORT_SYMBOL(ieee80211_next_txq);
>>=20=20
>> +bool ieee80211_txq_may_transmit(struct ieee80211_hw *hw,
>> + struct ieee80211_txq *txq)
>> +{
>> + struct ieee80211_local *local =3D hw_to_local(hw);
>> + struct txq_info *txqi =3D to_txq_info(txq);
>> + bool may_tx =3D false;
>> +
>> + spin_lock_bh(&local->active_txq_lock);
>> +
>> + if (ieee80211_txq_check_deficit(local, txqi)) {
>> + may_tx =3D true;
>> + list_del_init(&txqi->schedule_order);
>
> Manipulating the list twice like this here seems slightly odd ...
> perhaps move the list manipulation out of txq_check_deficit() entirely?
> Clearly it's logically fine, but seems harder than necessary to
> understand.
>
> Also, if the check_deficit() function just returns a value, the renaming
> I suggested is easier to accept :)

Yeah, maybe; it'll result in some duplication between next_txq() and
txq_may_transmit() (to the point where I'm not sure if the separate
check_deficit() function is really needed anymore), but it may be that
that is actually easier to understand?

-Toke

2018-09-10 15:56:17

by Toke Høiland-Jørgensen

[permalink] [raw]
Subject: Re: [PATCH RFC v3 1/4] mac80211: Add TXQ scheduling API

Johannes Berg <[email protected]> writes:

> On Sat, 2018-09-08 at 00:22 +0200, Toke H=C3=B8iland-J=C3=B8rgensen wrote:
>
>> +/**
>> + * ieee80211_next_txq - get next tx queue to pull packets from
>> + *
>> + * @hw: pointer as obtained from ieee80211_alloc_hw()
>> + * @ac: AC number to return packets from.
>> + * @first: Setting this to true signifies the start of a new scheduling=
round.
>> + * Each TXQ will only be returned at most exactly once in each =
round.
>> + *
>> + * Returns the next txq if successful, %NULL if no queue is eligible. I=
f a txq
>> + * is returned, it will have been removed from the scheduler queue and =
needs to
>> + * be re-scheduled with ieee80211_schedule_txq() to continue to be acti=
ve. Note
>> + * that the driver is responsible for locking to ensuring two different=
threads
>> + * don't schedule the same AC simultaneously.
>
> "Note that [...] to ensure [...]" would seem better to me?
>
>> + */
>> +struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, s8 ac,
>> + bool first);
>
> Why should "ac" be signed? Is there some (undocumented) behaviour that
> allows for passing -1 for "don't care" or "pick highest with frames"?
>
> You said "optionally" in the commit message, but I don't think that's
> true, since you just use ac to index the array in the code.

There was in the previous version, but I removed that; guess I forgot to
change the sign and the commit message ;)

>> +bool ieee80211_schedule_txq(struct ieee80211_hw *hw,
>> + struct ieee80211_txq *txq)
>> +{
>> + struct ieee80211_local *local =3D hw_to_local(hw);
>> + struct txq_info *txqi =3D to_txq_info(txq);
>> + bool ret =3D false;
>> +
>> + spin_lock_bh(&local->active_txq_lock);
>> +
>> + if (list_empty(&txqi->schedule_order)) {
>
> Is there a case where it's allowed to call this while *not* empty? At
> least for the drivers it seems not, so perhaps when called from the
> driver it should WARN() here or so?

Yeah, mac80211 calls this on wake_txq, where it will often be scheduled
already. And since that can happen while drivers schedule stuff, it
could also be the case that a driver re-schedules a queue that is
already scheduled.

> I'm just a bit worried that drivers will get this a bit wrong, and
> we'll never know, but things won't work properly in some weird way.

What, subtle bugs in the data path that causes hangs in hard-to-debug
cases? Nah, that never happens ;)

>> + list_add_tail(&txqi->schedule_order,
>> + &local->active_txqs[txq->ac]);
>> + ret =3D true;
>> + }
>> +
>> + spin_unlock_bh(&local->active_txq_lock);
>> +
>> + return ret;
>> +}
>> +EXPORT_SYMBOL(ieee80211_schedule_txq);
>
> I'd remove the return value - the driver has no need to know that it
> raced with potential rescheduling in mac80211 due to a new packet, and
> you don't use the return value in mac80211 either.

Yeah, good point; that was left over from when the call to wake_txq was
conditional on this actually doing something...

> It also seems to me that you forgot to say when it should be
> rescheduled? As is, following the documentation would sort of makes it
> seem you should take the queue and put it back at the end (without any
> conditions), and that could lead to "scheduling loops" since empty
> queues would always be put back when they shouldn't be?
>
> Also, come to think of it, but I guess the next patch(es) solve(s) that
> - if mac80211 adds a packet while you have this queue scheduled then you
> would take that packet even if it's questionable it belongs to this
> round?
>
> From a driver perspective, I think I'd prefer an API called e.g.
>
> ieee80211_return_txq()
>
> that does the check internally if we need to schedule the queue (i.e.
> there are packets on it). I was going to say we could even annotate with
> sparse, but no - we can't because ieee80211_next_txq() may return NULL.
> Still, it's easier to ensure that all cleanup paths call
> ieee80211_return_txq() if it's not conditional, IMHO.

Good idea, will do.

>> +struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, s8 ac,
>> + bool first)
>> +{
>> + struct ieee80211_local *local =3D hw_to_local(hw);
>> + struct txq_info *txqi =3D NULL;
>> +
>> + spin_lock_bh(&local->active_txq_lock);
>> +
>> + if (first)
>> + local->schedule_round[ac]++;
>
> (here - clearly ac must be 0 <=3D ac < IEEE80211_NUM_ACS)
>
>> + if (list_empty(&local->active_txqs[ac]))
>> + goto out;
>> +
>> + txqi =3D list_first_entry(&local->active_txqs[ac],
>> + struct txq_info,
>> + schedule_order);
>> +
>> + if (txqi->schedule_round =3D=3D local->schedule_round[ac])
>> + txqi =3D NULL;
>
> that assigment seems unnecessary? txqi defaults to NULL.

No, this is an 'undo' of the previous line. I.e., we get the first txqi,
check if we've already seen it this round, and if we have, unset txqi so
the function returns NULL (causing the driver to stop looping).

-Toke

2018-09-10 15:45:51

by Toke Høiland-Jørgensen

[permalink] [raw]
Subject: Re: [PATCH RFC v3 0/4] Move TXQ scheduling into mac80211

Kan Yan <[email protected]> writes:

> Hi Toke,
>
> It is great to see finally there will be a general airtime fairness
> support in mac80211. I feel this patch set could still use some
> improvements to make it works better with 11ac drivers like ath10k. I
> did a version of airtime fairness in the ath10k driver that used in
> Google Wifi and I'd like to share a few things I learned from this
> experience.

Hi Kan

Thanks for weighing in!

> The idea is to use airtime accounting to give firmware just enough
> frames to keep it busy and form good aggregations, and keeps the rest
> of frames in mac80211 layer queues so Fq_Codel can work on it to
> control latency.

I think this is a great approach, that we should definitely adopt in
mac80211. However, I'm not sure the outstanding airtime limiting should
be integrated into the fairness scheduling, since there are drivers such
as ath9k that don't need it.

Rather, I'd propose that we figure out the API for fairness scheduling
first, and add the BQL-like limiter as a separate layer. They can be
integrated in the sense that the limiter's estimate of airtime can be
used for fairness scheduling as well in the case where the
driver/firmware doesn't have a better source of airtime usage.

Does that make sense?

-Toke

2018-09-10 12:38:54

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH RFC v3 0/4] Move TXQ scheduling into mac80211

On Sat, 2018-09-08 at 00:22 +0200, Toke Høiland-Jørgensen wrote:

> - I didn't get rid of the register_airtime() callback. As far as I can tell,
> only iwlwifi uses the tx_time field in the struct tx_info. Which means that
> we *could* probably use it for this and just make the other drivers set it;
> but I'm not sure what effects that would have in relation to WMM-AC for
> those drivers, so I chickened out. Will have to try it out, I guess; but it
> also depends on whether ath10k needs to be able to report airtime
> asynchronously anyway. So I'll hold off on that for a bit more.

I don't think you need to be concerned, the reporting through this has
no immediate effect as the driver would also have to set the feature
flag (NL80211_FEATURE_SUPPORTS_WMM_ADMISSION) for userspace to be able
to use WMM admission TSPECs, and getting tx_tspec->admitted_time to be
non-zero in ieee80211_sta_tx_wmm_ac_notify().

I just think that this may be desirable to drivers eventually, and/or
maybe iwlwifi wants to take advantage of the airtime scheduling
eventually, so having two APIs overlapping seems a bit strange.

johannes

2018-09-13 09:18:13

by Kan Yan

[permalink] [raw]
Subject: Re: [PATCH RFC v3 0/4] Move TXQ scheduling into mac80211

> I think this is a great approach, that we should definitely adopt in
> mac80211. However, I'm not sure the outstanding airtime limiting should
> be integrated into the fairness scheduling, since there are drivers such
> as ath9k that don't need it.
>
> Rather, I'd propose that we figure out the API for fairness scheduling
> first, and add the BQL-like limiter as a separate layer. They can be
> integrated in the sense that the limiter's estimate of airtime can be
> used for fairness scheduling as well in the case where the
> driver/firmware doesn't have a better source of airtime usage.
>
> Does that make sense?
Sure that make sense. Yes, the airtime based BQL like queue limiter is
not needed for drivers such as ath9k. We could leave the outstanding
airtime accounting and queue depth limit to another subject and get
airtime fairness scheduling API done first.


On Mon, Sep 10, 2018 at 4:26 AM, Johannes Berg
<[email protected]> wrote:
> On Mon, 2018-09-10 at 13:17 +0200, Toke H=C3=B8iland-J=C3=B8rgensen wrote=
:
>
>> > > - I did not add any locking around next_txq(); the driver is still s=
upposed
>> > > to maintain a lock that prevents two threads from trying to schedu=
le the
>> > > same AC at the same time. This is what drivers already do, so I fi=
gured it
>> > > was easier to just keep it that way rather than do it in mac80211.
>> >
>> > I'll look at this in the code, but from a maintainer perspective I'm
>> > somewhat worried that this will lead to issues that are really the
>> > driver's fault, but surface in mac80211. I don't know how easy it
>> > would be to catch that.
>>
>> Yeah, I get what you mean. The alternative would be to have a
>> ieee80211_start_schedule(ac) and ieee80211_end_schedule(ac), which
>> basically just takes a lock.
>
> And I guess start would increment the schedule number, which is now
> dependent on first
>
>> Would mean we could get rid of the 'first'
>> parameter for next_txq(), so might not be such a bad idea;
>
> Right, that's what I meant.
>
>> and if the
>> driver has its own locking the extra locking in mac80211 would just be
>> an always-uncontested spinlock, which shouldn't be much overhead, right?
>
> It may still bounce around CPUs if you call this from other places, but
> I suspect that wouldn't be the biggest issue. There are a lot of
> calculations going on too...
>
> johannes

2018-09-10 16:09:54

by Toke Høiland-Jørgensen

[permalink] [raw]
Subject: Re: [PATCH RFC v3 0/4] Move TXQ scheduling into mac80211

Johannes Berg <[email protected]> writes:

> On Sat, 2018-09-08 at 00:22 +0200, Toke H=C3=B8iland-J=C3=B8rgensen wrote:
>
>> - I didn't get rid of the register_airtime() callback. As far as I can t=
ell,
>> only iwlwifi uses the tx_time field in the struct tx_info. Which means=
that
>> we *could* probably use it for this and just make the other drivers se=
t it;
>> but I'm not sure what effects that would have in relation to WMM-AC for
>> those drivers, so I chickened out. Will have to try it out, I guess; b=
ut it
>> also depends on whether ath10k needs to be able to report airtime
>> asynchronously anyway. So I'll hold off on that for a bit more.
>
> I don't think you need to be concerned, the reporting through this has
> no immediate effect as the driver would also have to set the feature
> flag (NL80211_FEATURE_SUPPORTS_WMM_ADMISSION) for userspace to be able
> to use WMM admission TSPECs, and getting tx_tspec->admitted_time to be
> non-zero in ieee80211_sta_tx_wmm_ac_notify().

Great! In that case I'll try moving the reporting go through the tx_info
struct and check that it works for ath9k :)

-Toke

2018-09-10 19:46:28

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH RFC v3 1/4] mac80211: Add TXQ scheduling API

On Mon, 2018-09-10 at 15:18 +0200, Toke Høiland-Jørgensen wrote:

> If we have the start_schedule() / end_schedule() pair anyway, the latter
> could notify any TXQs that became eligible during the scheduling round.

Do we even need end_schedule()? It's hard to pass multiple things to a
single call (do you build a list?), so having

start_schedule(), get_txq(), return_txq()

would be sufficient?

> Also, instead of having the three different API functions
> (next_txq()/may_tx()/schedule_txq()), we could have get_txq(txq)/put_txq(txq)
> which would always need to be paired; but the argument to get_txq()
> could be optional, and if the driver passes NULL it means "give me the
> next available TXQ".

I can't say I like this. It makes the meaning totally different:

* with NULL: use the internal scheduler to determine which one is good
to use next
* non-NULL: essentially equivalent to may_tx()

> So for ath9k it would be:
>
>
> start_schedule(ac);
> while ((txq = get_txq(NULL)) {
> queue_aggregate(txq);
> put_txq(txq);
> }
> end_schedule(ac);
>
> And for ath10k/iwlwifi it would be:
>
> on_hw_notify(txq) {
> start_schedule(ac);
> if (txq = get_txq(txq)) {
> queue_packets(txq);
> put_txq(txq);
> }
> end_schedule(ac);
> }
>
>
> I think that would be simpler, API-wise?

I can't say I see much point in overloading get_txq() that way. You'd
never use it the same way.

Also, would you really start_schedule(ac) in the hw-managed case? It
seems like not? Basically it seems to me that in the hw-managed case all
you need is may_tx()? And in fact, once you opt in you don't even really
need *that* since mac80211 can just return NULL from get_skb()?

johannes