When used with ath10k, the following may happen:
a) radio card firmware receives ADDBA-REQ from peer
b) radio sends back ADDBA-RESP to peer and signals the ath10k driver
c) ath10k calls back ieee80211_manage_rx_ba_offl() in mac80211
to signal rx ba offloading
d) mac80211::agg-rx.c::ieee80211_manage_rx_ba_offl()
d1) sets a flag: sta->ampdu_mlme.tid_rx_manage_offl
d2) queues a call to ht.c::ieee80211_ba_session_work()
e) ...scheduler runs...
f) ht.c::ieee80211_ba_session_work() checks the flag, clears it
and sets up the rx ba session.
During (e), a fast peer may already have sent a BAREQ which is
propagated to rx.c::ieee80211_rx_h_ctrl(). Since the session is not
yet established, mac80211 sends back a DELBA to the peer, which can
hang the BA session.
The phenomenon can be observed between two QCA988X fw 10.2.4 radios,
using a loop of associate/arping from client to AP/disconnect. After
a few thousand loops, arping does not get a response and a sniffer
detects a DELBA action frame from the client, following an ADDBA.
Fix:
1) check the offload flag in addition to the check for a valid
aggregation session
2) surround the checks with the existing dedicated mutex, to avoid
interference from ieee80211_ba_session_work() during the check.
Note 1: there is another dubious DELBA generation in
ieee80211_rx_reorder_ampdu(), where the same kind of fix should fit,
but I did not fix it since I knew no easy way to test.
Note 2: this fix applies to wireless backports from 5.4-rc8.
---
V2: remove debugging code leftovers, sorry for that
Index: b/net/mac80211/rx.c
===================================================================
--- a/net/mac80211/rx.c
+++ b/net/mac80211/rx.c
Index: bp/net/mac80211/rx.c
===================================================================
--- bp.orig/net/mac80211/rx.c
+++ bp/net/mac80211/rx.c
@@ -3008,11 +3008,18 @@ ieee80211_rx_h_ctrl(struct ieee80211_rx_
tid = le16_to_cpu(bar_data.control) >> 12;
+ mutex_lock(&rx->sta->ampdu_mlme.mtx);
if (!test_bit(tid, rx->sta->ampdu_mlme.agg_session_valid) &&
- !test_and_set_bit(tid, rx->sta->ampdu_mlme.unexpected_agg))
+ /* back_req is allowed if the fw just received addba */
+ !test_bit(tid, rx->sta->ampdu_mlme.tid_rx_manage_offl) &&
+ !test_and_set_bit(tid, rx->sta->ampdu_mlme.unexpected_agg)) {
+ mutex_unlock(&rx->sta->ampdu_mlme.mtx);
ieee80211_send_delba(rx->sdata, rx->sta->sta.addr, tid,
WLAN_BACK_RECIPIENT,
WLAN_REASON_QSTA_REQUIRE_SETUP);
+ } else {
+ mutex_unlock(&rx->sta->ampdu_mlme.mtx);
+ }
tid_agg_rx = rcu_dereference(rx->sta->ampdu_mlme.tid_rx[tid]);
if (!tid_agg_rx)
--
quilt 0.63
On Tue, 2021-10-26 at 14:19 +0000, Jean-Pierre TOSONI wrote:
> When used with ath10k, the following may happen:
> a) radio card firmware receives ADDBA-REQ from peer
> b) radio sends back ADDBA-RESP to peer and signals the ath10k driver
> c) ath10k calls back ieee80211_manage_rx_ba_offl() in mac80211
> to signal rx ba offloading
> d) mac80211::agg-rx.c::ieee80211_manage_rx_ba_offl()
> d1) sets a flag: sta->ampdu_mlme.tid_rx_manage_offl
> d2) queues a call to ht.c::ieee80211_ba_session_work()
> e) ...scheduler runs...
> f) ht.c::ieee80211_ba_session_work() checks the flag, clears it
> and sets up the rx ba session.
>
> During (e), a fast peer may already have sent a BAREQ which is
> propagated to rx.c::ieee80211_rx_h_ctrl(). Since the session is not
> yet established, mac80211 sends back a DELBA to the peer, which can
> hang the BA session.
>
> The phenomenon can be observed between two QCA988X fw 10.2.4 radios,
> using a loop of associate/arping from client to AP/disconnect. After
> a few thousand loops, arping does not get a response and a sniffer
> detects a DELBA action frame from the client, following an ADDBA.
>
> Fix:
> 1) check the offload flag in addition to the check for a valid
> aggregation session
> 2) surround the checks with the existing dedicated mutex, to avoid
> interference from ieee80211_ba_session_work() during the check.
>
> Note 1: there is another dubious DELBA generation in
> ieee80211_rx_reorder_ampdu(), where the same kind of fix should fit,
> but I did not fix it since I knew no easy way to test.
>
> Note 2: this fix applies to wireless backports from 5.4-rc8.
> ---
> V2: remove debugging code leftovers, sorry for that
>
> Index: b/net/mac80211/rx.c
> ===================================================================
> --- a/net/mac80211/rx.c
> +++ b/net/mac80211/rx.c
> Index: bp/net/mac80211/rx.c
> ===================================================================
> --- bp.orig/net/mac80211/rx.c
> +++ bp/net/mac80211/rx.c
> @@ -3008,11 +3008,18 @@ ieee80211_rx_h_ctrl(struct ieee80211_rx_
>
>
> tid = le16_to_cpu(bar_data.control) >> 12;
>
>
> + mutex_lock(&rx->sta->ampdu_mlme.mtx);
You cannot take a mutex in this context.
johannes