Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.8 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CDA63C4360F for ; Thu, 4 Apr 2019 04:43:48 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 932472064A for ; Thu, 4 Apr 2019 04:43:48 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="key not found in DNS" (0-bit key) header.d=codeaurora.org header.i=@codeaurora.org header.b="gW4LEbuo"; dkim=fail reason="key not found in DNS" (0-bit key) header.d=codeaurora.org header.i=@codeaurora.org header.b="IiRS/ndm" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726229AbfDDEnr (ORCPT ); Thu, 4 Apr 2019 00:43:47 -0400 Received: from smtp.codeaurora.org ([198.145.29.96]:48392 "EHLO smtp.codeaurora.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725903AbfDDEnr (ORCPT ); Thu, 4 Apr 2019 00:43:47 -0400 Received: by smtp.codeaurora.org (Postfix, from userid 1000) id EF9F56079C; Thu, 4 Apr 2019 04:43:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=codeaurora.org; s=default; t=1554353026; bh=T9etIlKta994IpxNEKKnl6zjQXj/vhmA4yNOjlITGtk=; h=Date:From:To:Cc:Subject:In-Reply-To:References:From; b=gW4LEbuoofMWfP+Z3OH5jJfMJ3lgzBq/VxNerMuzSslyr6HimI8hVerra0AmuoUlU j93Svz82gBoHKHkX7WgKfqMSTqlui01at2bgTtwgkakxb6CbIgf9nMeluPwKIUkt3U VZ2JCZDNj1KbIy8Vx6cIkto+/75yFEdY5eSV0gtU= Received: from mail.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.codeaurora.org (Postfix) with ESMTP id ACA4960712; Thu, 4 Apr 2019 04:43:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=codeaurora.org; s=default; t=1554353023; bh=T9etIlKta994IpxNEKKnl6zjQXj/vhmA4yNOjlITGtk=; h=Date:From:To:Cc:Subject:In-Reply-To:References:From; b=IiRS/ndmNRhzAiPrWoc2LSSYRyrtZ5VG2pKBKPQdHJRLfB58hC66cq9EWWO+/IkR0 SCTqczPuk6emM6ROSkMf8eEc2n92QbpjOh8yS0iKaodSvXKFYEF2+JuS6how1VwCXW 3ik8CwIJjohltobwh95wDCslU7S/LNzWfz7NEy+w= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Date: Thu, 04 Apr 2019 12:43:43 +0800 From: Yibo Zhao To: =?UTF-8?Q?Toke_H=C3=B8iland-J=C3=B8rgensen?= Cc: make-wifi-fast@lists.bufferbloat.net, linux-wireless@vger.kernel.org, Felix Fietkau , Rajkumar Manoharan , Kan Yan , linux-wireless-owner@vger.kernel.org Subject: Re: [RFC/RFT] mac80211: Switch to a virtual time-based airtime scheduler In-Reply-To: References: <20190215170512.31512-1-toke@redhat.com> Message-ID: X-Sender: yiboz@codeaurora.org User-Agent: Roundcube Webmail/1.2.5 Sender: linux-wireless-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-wireless@vger.kernel.org On 2019-04-04 12:41, Yibo Zhao wrote: > On 2019-02-16 01:05, Toke Høiland-Jørgensen wrote: >> This switches the airtime scheduler in mac80211 to use a virtual >> time-based >> scheduler instead of the round-robin scheduler used before. This has a >> couple of advantages: >> >> - No need to sync up the round-robin scheduler in firmware/hardware >> with >> the round-robin airtime scheduler. >> >> - If several stations are eligible for transmission we can schedule >> both of >> them; no need to hard-block the scheduling rotation until the head >> of the >> queue has used up its quantum. >> >> - The check of whether a station is eligible for transmission becomes >> simpler (in ieee80211_txq_may_transmit()). >> >> The drawback is that scheduling becomes slightly more expensive, as we >> need >> to maintain an rbtree of TXQs sorted by virtual time. This means that >> ieee80211_register_airtime() becomes O(logN) in the number of >> currently >> scheduled TXQs. However, hopefully this number rarely grows too big >> (it's >> only TXQs currently backlogged, not all associated stations), so it >> shouldn't be too big of an issue. >> >> Signed-off-by: Toke Høiland-Jørgensen >> --- >> @@ -1831,18 +1830,32 @@ void ieee80211_sta_register_airtime(struct >> ieee80211_sta *pubsta, u8 tid, >> { >> struct sta_info *sta = container_of(pubsta, struct sta_info, sta); >> struct ieee80211_local *local = sta->sdata->local; >> + struct ieee80211_txq *txq = sta->sta.txq[tid]; >> u8 ac = ieee80211_ac_from_tid(tid); >> - u32 airtime = 0; >> + u64 airtime = 0, weight_sum; >> + >> + if (!txq) >> + return; >> >> if (sta->local->airtime_flags & AIRTIME_USE_TX) >> airtime += tx_airtime; >> if (sta->local->airtime_flags & AIRTIME_USE_RX) >> airtime += rx_airtime; >> >> + /* Weights scale so the unit weight is 256 */ >> + airtime <<= 8; >> + >> spin_lock_bh(&local->active_txq_lock[ac]); >> + >> sta->airtime[ac].tx_airtime += tx_airtime; >> sta->airtime[ac].rx_airtime += rx_airtime; >> - sta->airtime[ac].deficit -= airtime; >> + >> + weight_sum = local->airtime_weight_sum[ac] ?: sta->airtime_weight; >> + >> + local->airtime_v_t[ac] += airtime / weight_sum; > Hi Toke, > > it looks like local->airtime_v_t acts like a Tx criteria. Only the > stations with less airtime than that are valid for Tx. That means > there are situations, like 50 clients, that some of the stations can > be used to Tx when putting next_txq in the loop. Am I right? > >> + sta->airtime[ac].v_t += airtime / sta->airtime_weight; > Another question, any plan for taking v_t overflow situation into consideration? u64 might be enough for low throughput products but not sure for high end products. Something like below: u64 prev = sta->airtime[ac].v_t; if () >> + ieee80211_resort_txq(&local->hw, txq); >> + >> spin_unlock_bh(&local->active_txq_lock[ac]); >> } >> EXPORT_SYMBOL(ieee80211_sta_register_airtime); >> diff --git a/net/mac80211/sta_info.h b/net/mac80211/sta_info.h >> index 05647d835894..4b901a2a998e 100644 >> --- a/net/mac80211/sta_info.h >> +++ b/net/mac80211/sta_info.h >> @@ -130,11 +130,12 @@ enum ieee80211_agg_stop_reason { >> /* Debugfs flags to enable/disable use of RX/TX airtime in scheduler >> */ >> #define AIRTIME_USE_TX BIT(0) >> #define AIRTIME_USE_RX BIT(1) >> +#define AIRTIME_GRACE 500 /* usec of grace period before reset */ >> >> struct airtime_info { >> u64 rx_airtime; >> u64 tx_airtime; >> - s64 deficit; >> + u64 v_t; >> }; >> >> struct sta_info; >> diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c >> index 61c7ea9de2cc..8e125df8ade0 100644 >> --- a/net/mac80211/tx.c >> +++ b/net/mac80211/tx.c >> @@ -1449,7 +1449,7 @@ void ieee80211_txq_init(struct >> ieee80211_sub_if_data *sdata, >> codel_vars_init(&txqi->def_cvars); >> codel_stats_init(&txqi->cstats); >> __skb_queue_head_init(&txqi->frags); >> - INIT_LIST_HEAD(&txqi->schedule_order); >> + RB_CLEAR_NODE(&txqi->schedule_order); >> >> txqi->txq.vif = &sdata->vif; >> >> @@ -1493,9 +1493,7 @@ void ieee80211_txq_purge(struct ieee80211_local >> *local, >> ieee80211_purge_tx_queue(&local->hw, &txqi->frags); >> spin_unlock_bh(&fq->lock); >> >> - spin_lock_bh(&local->active_txq_lock[txqi->txq.ac]); >> - list_del_init(&txqi->schedule_order); >> - spin_unlock_bh(&local->active_txq_lock[txqi->txq.ac]); >> + ieee80211_unschedule_txq(&local->hw, &txqi->txq); >> } >> >> void ieee80211_txq_set_params(struct ieee80211_local *local) >> @@ -3640,126 +3638,191 @@ EXPORT_SYMBOL(ieee80211_tx_dequeue); >> struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, u8 >> ac) >> { >> struct ieee80211_local *local = hw_to_local(hw); >> + struct rb_node *node = local->schedule_pos[ac]; >> struct txq_info *txqi = NULL; >> + bool first = false; >> >> lockdep_assert_held(&local->active_txq_lock[ac]); >> >> - begin: >> - txqi = list_first_entry_or_null(&local->active_txqs[ac], >> - struct txq_info, >> - schedule_order); >> - if (!txqi) >> + if (!node) { >> + node = rb_first_cached(&local->active_txqs[ac]); >> + first = true; >> + } else >> + node = rb_next(node); >> + >> + if (!node) >> return NULL; >> >> + txqi = container_of(node, struct txq_info, schedule_order); >> + >> if (txqi->txq.sta) { >> struct sta_info *sta = container_of(txqi->txq.sta, >> struct sta_info, sta); >> >> - if (sta->airtime[txqi->txq.ac].deficit < 0) { >> - sta->airtime[txqi->txq.ac].deficit += >> - sta->airtime_weight; >> - list_move_tail(&txqi->schedule_order, >> - &local->active_txqs[txqi->txq.ac]); >> - goto begin; >> + if (sta->airtime[ac].v_t > local->airtime_v_t[ac]) { >> + if (first) >> + local->airtime_v_t[ac] = sta->airtime[ac].v_t; >> + else >> + return NULL; >> } >> } >> >> >> - if (txqi->schedule_round == local->schedule_round[ac]) >> - return NULL; >> - >> - list_del_init(&txqi->schedule_order); >> - txqi->schedule_round = local->schedule_round[ac]; >> + local->schedule_pos[ac] = node; >> return &txqi->txq; >> } >> EXPORT_SYMBOL(ieee80211_next_txq); >> >> -void ieee80211_return_txq(struct ieee80211_hw *hw, >> +static void __ieee80211_insert_txq(struct rb_root_cached *root, >> + struct txq_info *txqi, u8 ac) >> +{ >> + struct rb_node **new = &root->rb_root.rb_node; >> + struct rb_node *parent = NULL; >> + struct txq_info *__txqi; >> + bool leftmost = true; >> + >> + while (*new) { >> + parent = *new; >> + __txqi = rb_entry(parent, struct txq_info, schedule_order); >> + >> + if (!txqi->txq.sta) { >> + /* new txqi has no sta - insert to the left */ >> + new = &parent->rb_left; >> + } else if (!__txqi->txq.sta) { >> + /* existing txqi has no sta - insert to the right */ >> + new = &parent->rb_right; >> + leftmost = false; >> + } else { >> + struct sta_info *old_sta = container_of(__txqi->txq.sta, >> + struct sta_info, >> + sta); >> + struct sta_info *new_sta = container_of(txqi->txq.sta, >> + struct sta_info, >> + sta); >> + >> + if (new_sta->airtime[ac].v_t <= old_sta->airtime[ac].v_t) >> + new = &parent->rb_left; >> + else { >> + new = &parent->rb_right; >> + leftmost = false; >> + } >> + >> + } >> + } >> + >> + rb_link_node(&txqi->schedule_order, parent, new); >> + rb_insert_color_cached(&txqi->schedule_order, root, leftmost); >> +} >> + >> +void ieee80211_schedule_txq(struct ieee80211_hw *hw, >> + struct ieee80211_txq *txq) >> + __acquires(txq_lock) __releases(txq_lock) >> +{ >> + struct ieee80211_local *local = hw_to_local(hw); >> + struct txq_info *txqi = to_txq_info(txq); >> + u8 ac = txq->ac; >> + >> + spin_lock_bh(&local->active_txq_lock[ac]); >> + >> + if (!RB_EMPTY_NODE(&txqi->schedule_order)) >> + goto out; >> + >> + if (txq->sta) { >> + struct sta_info *sta = container_of(txq->sta, >> + struct sta_info, sta); >> + >> + local->airtime_weight_sum[ac] += sta->airtime_weight; >> + if (local->airtime_v_t[ac] > AIRTIME_GRACE) >> + sta->airtime[ac].v_t = max(local->airtime_v_t[ac] - AIRTIME_GRACE, >> + sta->airtime[ac].v_t); >> + } >> + >> + __ieee80211_insert_txq(&local->active_txqs[ac], txqi, ac); >> + >> + out: >> + spin_unlock_bh(&local->active_txq_lock[ac]); >> +} >> +EXPORT_SYMBOL(ieee80211_schedule_txq); >> + >> +void ieee80211_resort_txq(struct ieee80211_hw *hw, >> struct ieee80211_txq *txq) >> { >> struct ieee80211_local *local = hw_to_local(hw); >> struct txq_info *txqi = to_txq_info(txq); >> + u8 ac = txq->ac; >> + >> + if (!RB_EMPTY_NODE(&txqi->schedule_order)) { >> + rb_erase_cached(&txqi->schedule_order, >> + &local->active_txqs[ac]); >> + RB_CLEAR_NODE(&txqi->schedule_order); >> + __ieee80211_insert_txq(&local->active_txqs[ac], txqi, ac); >> + } >> +} >> + >> +static void __ieee80211_unschedule_txq(struct ieee80211_hw *hw, >> + struct ieee80211_txq *txq) >> +{ >> + struct ieee80211_local *local = hw_to_local(hw); >> + struct txq_info *txqi = to_txq_info(txq); >> + u8 ac = txq->ac; >> >> lockdep_assert_held(&local->active_txq_lock[txq->ac]); >> >> - if (list_empty(&txqi->schedule_order) && >> - (!skb_queue_empty(&txqi->frags) || txqi->tin.backlog_packets)) { >> - /* If airtime accounting is active, always enqueue STAs at the >> - * head of the list to ensure that they only get moved to the >> - * back by the airtime DRR scheduler once they have a negative >> - * deficit. A station that already has a negative deficit will >> - * get immediately moved to the back of the list on the next >> - * call to ieee80211_next_txq(). >> - */ >> - if (txqi->txq.sta && >> - wiphy_ext_feature_isset(local->hw.wiphy, >> - NL80211_EXT_FEATURE_AIRTIME_FAIRNESS)) >> - list_add(&txqi->schedule_order, >> - &local->active_txqs[txq->ac]); >> - else >> - list_add_tail(&txqi->schedule_order, >> - &local->active_txqs[txq->ac]); >> + if (RB_EMPTY_NODE(&txqi->schedule_order)) >> + return; >> + >> + if (txq->sta) { >> + struct sta_info *sta = container_of(txq->sta, >> + struct sta_info, sta); >> + >> + local->airtime_weight_sum[ac] -= sta->airtime_weight; >> } >> + >> + rb_erase_cached(&txqi->schedule_order, >> + &local->active_txqs[txq->ac]); >> + RB_CLEAR_NODE(&txqi->schedule_order); >> } >> -EXPORT_SYMBOL(ieee80211_return_txq); >> >> -void ieee80211_schedule_txq(struct ieee80211_hw *hw, >> - struct ieee80211_txq *txq) >> +void ieee80211_unschedule_txq(struct ieee80211_hw *hw, >> + struct ieee80211_txq *txq) >> __acquires(txq_lock) __releases(txq_lock) >> { >> struct ieee80211_local *local = hw_to_local(hw); >> >> spin_lock_bh(&local->active_txq_lock[txq->ac]); >> - ieee80211_return_txq(hw, txq); >> + __ieee80211_unschedule_txq(hw, txq); >> spin_unlock_bh(&local->active_txq_lock[txq->ac]); >> } >> -EXPORT_SYMBOL(ieee80211_schedule_txq); >> + >> +void ieee80211_return_txq(struct ieee80211_hw *hw, >> + struct ieee80211_txq *txq) >> +{ >> + struct ieee80211_local *local = hw_to_local(hw); >> + struct txq_info *txqi = to_txq_info(txq); >> + >> + lockdep_assert_held(&local->active_txq_lock[txq->ac]); >> + >> + if (!RB_EMPTY_NODE(&txqi->schedule_order) && >> + (skb_queue_empty(&txqi->frags) && !txqi->tin.backlog_packets)) >> + __ieee80211_unschedule_txq(hw, txq); >> +} >> +EXPORT_SYMBOL(ieee80211_return_txq); >> >> bool ieee80211_txq_may_transmit(struct ieee80211_hw *hw, >> struct ieee80211_txq *txq) >> { >> struct ieee80211_local *local = hw_to_local(hw); >> - struct txq_info *iter, *tmp, *txqi = to_txq_info(txq); >> + struct txq_info *txqi = to_txq_info(txq); >> struct sta_info *sta; >> u8 ac = txq->ac; >> >> lockdep_assert_held(&local->active_txq_lock[ac]); >> >> if (!txqi->txq.sta) >> - goto out; >> - >> - if (list_empty(&txqi->schedule_order)) >> - goto out; >> - >> - list_for_each_entry_safe(iter, tmp, &local->active_txqs[ac], >> - schedule_order) { >> - if (iter == txqi) >> - break; >> - >> - if (!iter->txq.sta) { >> - list_move_tail(&iter->schedule_order, >> - &local->active_txqs[ac]); >> - continue; >> - } >> - sta = container_of(iter->txq.sta, struct sta_info, sta); >> - if (sta->airtime[ac].deficit < 0) >> - sta->airtime[ac].deficit += sta->airtime_weight; >> - list_move_tail(&iter->schedule_order, &local->active_txqs[ac]); >> - } >> + return true; >> >> sta = container_of(txqi->txq.sta, struct sta_info, sta); >> - if (sta->airtime[ac].deficit >= 0) >> - goto out; >> - >> - sta->airtime[ac].deficit += sta->airtime_weight; >> - list_move_tail(&txqi->schedule_order, &local->active_txqs[ac]); >> - >> - return false; >> -out: >> - if (!list_empty(&txqi->schedule_order)) >> - list_del_init(&txqi->schedule_order); >> - >> - return true; >> + return (sta->airtime[ac].v_t <= local->airtime_v_t[ac]); >> } >> EXPORT_SYMBOL(ieee80211_txq_may_transmit); >> >> @@ -3769,7 +3832,6 @@ void ieee80211_txq_schedule_start(struct >> ieee80211_hw *hw, u8 ac) >> struct ieee80211_local *local = hw_to_local(hw); >> >> spin_lock_bh(&local->active_txq_lock[ac]); >> - local->schedule_round[ac]++; >> } >> EXPORT_SYMBOL(ieee80211_txq_schedule_start); >> >> @@ -3778,6 +3840,7 @@ void ieee80211_txq_schedule_end(struct >> ieee80211_hw *hw, u8 ac) >> { >> struct ieee80211_local *local = hw_to_local(hw); >> >> + local->schedule_pos[ac] = NULL; >> spin_unlock_bh(&local->active_txq_lock[ac]); >> } >> EXPORT_SYMBOL(ieee80211_txq_schedule_end); -- Yibo