2021-10-28 09:15:07

by Jean-Pierre TOSONI

[permalink] [raw]
Subject: [RFC v3] mac80211: fix rx blockack session race condition

When the mac80211 layer is used with ath10k, the following may happen:

a) radio card firmware receives ADDBA-REQ from peer
b) radio sends back ADDBA-RESP to peer and signals the ath10k driver
c) ath10k calls back ieee80211_manage_rx_ba_offl() in mac80211
to signal rx ba offloading
d) mac80211::agg-rx.c::ieee80211_manage_rx_ba_offl()
d1) sets a flag: sta->ampdu_mlme.tid_rx_manage_offl
d2) queues a call to ht.c::ieee80211_ba_session_work()
e) ...scheduler runs...
f) ht.c::ieee80211_ba_session_work() checks the flag, clears it
and sets up the rx ba session.

During (e), a fast peer may already have sent a BAREQ which is
propagated to rx.c::ieee80211_rx_h_ctrl(). Since the session is not
yet established, mac80211 sends back a DELBA to the peer, which can
hang the BA session.

The phenomenon can be observed between two QCA988X fw 10.2.4 radios,
using a loop of associate/arping from client to AP/disconnect. After
a few thousand loops, arping does not get a response and a sniffer
detects a DELBA action frame from the client, following an ADDBA.

Fix:
1) check the offload flag in addition to the check for a valid
aggregation session
2) protect the paired checks of (offload flag, valid aggregation session)
with a spinlock against interference from ieee80211_ba_session_work().

Note 1: there is another dubious DELBA generation in
ieee80211_rx_reorder_ampdu(), where the same kind of fix should fit,
but I did not fix it since I knew no easy way to test.

Note 2: this fix applies to wireless backports from 5.4-rc8.
---
V2: remove debugging code leftovers, sorry for that
V3: use spin_lock_bh instead of a mutex

Index: b/net/mac80211/rx.c
===================================================================
--- a/net/mac80211/rx.c
+++ b/net/mac80211/rx.c
@@ -3085,11 +3085,18 @@ ieee80211_rx_h_ctrl(struct ieee80211_rx_

tid = le16_to_cpu(bar_data.control) >> 12;

+ spin_lock_bh(&rx->sta->ampdu_mlme.rx_offl_lock);
if (!test_bit(tid, rx->sta->ampdu_mlme.agg_session_valid) &&
- !test_and_set_bit(tid, rx->sta->ampdu_mlme.unexpected_agg))
+ /* back_req is allowed if the fw just received addba */
+ !test_bit(tid, rx->sta->ampdu_mlme.tid_rx_manage_offl) &&
+ !test_and_set_bit(tid, rx->sta->ampdu_mlme.unexpected_agg)) {
+ spin_unlock_bh(&rx->sta->ampdu_mlme.rx_offl_lock);
ieee80211_send_delba(rx->sdata, rx->sta->sta.addr, tid,
WLAN_BACK_RECIPIENT,
WLAN_REASON_QSTA_REQUIRE_SETUP);
+ } else {
+ spin_unlock_bh(&rx->sta->ampdu_mlme.rx_offl_lock);
+ }

tid_agg_rx = rcu_dereference(rx->sta->ampdu_mlme.tid_rx[tid]);
if (!tid_agg_rx)
Index: b/net/mac80211/ht.c
===================================================================
--- a/net/mac80211/ht.c
+++ b/net/mac80211/ht.c
@@ -367,11 +367,13 @@ void ieee80211_ba_session_work(struct wo
IEEE80211_MAX_AMPDU_BUF_HT,
false, true, NULL);

+ spin_lock_bh(&sta->ampdu_mlme.rx_offl_lock);
if (test_and_clear_bit(tid + IEEE80211_NUM_TIDS,
sta->ampdu_mlme.tid_rx_manage_offl))
___ieee80211_stop_rx_ba_session(
sta, tid, WLAN_BACK_RECIPIENT,
0, false);
+ spin_unlock_bh(&sta->ampdu_mlme.rx_offl_lock);

spin_lock_bh(&sta->lock);

Index: b/net/mac80211/sta_info.c
===================================================================
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -354,6 +354,7 @@ struct sta_info *sta_info_alloc(struct i

spin_lock_init(&sta->lock);
spin_lock_init(&sta->ps_lock);
+ spin_lock_init(&sta->ampdu_mlme.rx_offl_lock);
INIT_WORK(&sta->drv_deliver_wk, sta_deliver_ps_frames);
INIT_WORK(&sta->ampdu_mlme.work, ieee80211_ba_session_work);
mutex_init(&sta->ampdu_mlme.mtx);
Index: b/net/mac80211/sta_info.h
===================================================================
--- a/net/mac80211/sta_info.h
+++ b/net/mac80211/sta_info.h
@@ -266,6 +266,7 @@ struct tid_ampdu_rx {
* @mtx: mutex to protect all TX data (except non-NULL assignments
* to tid_tx[idx], which are protected by the sta spinlock)
* tid_start_tx is also protected by sta->lock.
+ * @rx_offl_lock: protects transfer from tid_rx_manage_offl to agg_session_valid
* @tid_rx: aggregation info for Rx per TID -- RCU protected
* @tid_rx_token: dialog tokens for valid aggregation sessions
* @tid_rx_timer_expired: bitmap indicating on which TIDs the
@@ -287,6 +288,7 @@ struct tid_ampdu_rx {
struct sta_ampdu_mlme {
struct mutex mtx;
/* rx */
+ spinlock_t rx_offl_lock;
struct tid_ampdu_rx __rcu *tid_rx[IEEE80211_NUM_TIDS];
u8 tid_rx_token[IEEE80211_NUM_TIDS];
unsigned long tid_rx_timer_expired[BITS_TO_LONGS(IEEE80211_NUM_TIDS)];
--
quilt 0.63


2021-10-28 10:08:53

by Jean-Pierre TOSONI

[permalink] [raw]
Subject: RE: [RFC v3] mac80211: fix rx blockack session race condition

Please disregard this v3. The spinlock is misplaced in ht.c.
Sorry.

-----Message d'origine-----
De?: Jean-Pierre TOSONI <[email protected]>
Envoy??: jeudi 28 octobre 2021 11:15
??: Johannes Berg ([email protected]) <[email protected]>; '[email protected]' <[email protected]>
Objet?: [RFC v3] mac80211: fix rx blockack session race condition

When the mac80211 layer is used with ath10k, the following may happen:

a) radio card firmware receives ADDBA-REQ from peer
b) radio sends back ADDBA-RESP to peer and signals the ath10k driver
c) ath10k calls back ieee80211_manage_rx_ba_offl() in mac80211
to signal rx ba offloading
d) mac80211::agg-rx.c::ieee80211_manage_rx_ba_offl()
d1) sets a flag: sta->ampdu_mlme.tid_rx_manage_offl
d2) queues a call to ht.c::ieee80211_ba_session_work()
e) ...scheduler runs...
f) ht.c::ieee80211_ba_session_work() checks the flag, clears it
and sets up the rx ba session.

During (e), a fast peer may already have sent a BAREQ which is
propagated to rx.c::ieee80211_rx_h_ctrl(). Since the session is not
yet established, mac80211 sends back a DELBA to the peer, which can
hang the BA session.

The phenomenon can be observed between two QCA988X fw 10.2.4 radios,
using a loop of associate/arping from client to AP/disconnect. After
a few thousand loops, arping does not get a response and a sniffer
detects a DELBA action frame from the client, following an ADDBA.

Fix:
1) check the offload flag in addition to the check for a valid
aggregation session
2) protect the paired checks of (offload flag, valid aggregation session)
with a spinlock against interference from ieee80211_ba_session_work().

Note 1: there is another dubious DELBA generation in
ieee80211_rx_reorder_ampdu(), where the same kind of fix should fit,
but I did not fix it since I knew no easy way to test.

Note 2: this fix applies to wireless backports from 5.4-rc8.
---
V2: remove debugging code leftovers, sorry for that
V3: use spin_lock_bh instead of a mutex

Index: b/net/mac80211/rx.c
===================================================================
--- a/net/mac80211/rx.c
+++ b/net/mac80211/rx.c
@@ -3085,11 +3085,18 @@ ieee80211_rx_h_ctrl(struct ieee80211_rx_

tid = le16_to_cpu(bar_data.control) >> 12;

+ spin_lock_bh(&rx->sta->ampdu_mlme.rx_offl_lock);
if (!test_bit(tid, rx->sta->ampdu_mlme.agg_session_valid) &&
- !test_and_set_bit(tid, rx->sta->ampdu_mlme.unexpected_agg))
+ /* back_req is allowed if the fw just received addba */
+ !test_bit(tid, rx->sta->ampdu_mlme.tid_rx_manage_offl) &&
+ !test_and_set_bit(tid, rx->sta->ampdu_mlme.unexpected_agg)) {
+ spin_unlock_bh(&rx->sta->ampdu_mlme.rx_offl_lock);
ieee80211_send_delba(rx->sdata, rx->sta->sta.addr, tid,
WLAN_BACK_RECIPIENT,
WLAN_REASON_QSTA_REQUIRE_SETUP);
+ } else {
+ spin_unlock_bh(&rx->sta->ampdu_mlme.rx_offl_lock);
+ }

tid_agg_rx = rcu_dereference(rx->sta->ampdu_mlme.tid_rx[tid]);
if (!tid_agg_rx)
Index: b/net/mac80211/ht.c
===================================================================
--- a/net/mac80211/ht.c
+++ b/net/mac80211/ht.c
@@ -367,11 +367,13 @@ void ieee80211_ba_session_work(struct wo
IEEE80211_MAX_AMPDU_BUF_HT,
false, true, NULL);

+ spin_lock_bh(&sta->ampdu_mlme.rx_offl_lock);
if (test_and_clear_bit(tid + IEEE80211_NUM_TIDS,
sta->ampdu_mlme.tid_rx_manage_offl))
___ieee80211_stop_rx_ba_session(
sta, tid, WLAN_BACK_RECIPIENT,
0, false);
+ spin_unlock_bh(&sta->ampdu_mlme.rx_offl_lock);

spin_lock_bh(&sta->lock);

Index: b/net/mac80211/sta_info.c
===================================================================
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -354,6 +354,7 @@ struct sta_info *sta_info_alloc(struct i

spin_lock_init(&sta->lock);
spin_lock_init(&sta->ps_lock);
+ spin_lock_init(&sta->ampdu_mlme.rx_offl_lock);
INIT_WORK(&sta->drv_deliver_wk, sta_deliver_ps_frames);
INIT_WORK(&sta->ampdu_mlme.work, ieee80211_ba_session_work);
mutex_init(&sta->ampdu_mlme.mtx);
Index: b/net/mac80211/sta_info.h
===================================================================
--- a/net/mac80211/sta_info.h
+++ b/net/mac80211/sta_info.h
@@ -266,6 +266,7 @@ struct tid_ampdu_rx {
* @mtx: mutex to protect all TX data (except non-NULL assignments
* to tid_tx[idx], which are protected by the sta spinlock)
* tid_start_tx is also protected by sta->lock.
+ * @rx_offl_lock: protects transfer from tid_rx_manage_offl to agg_session_valid
* @tid_rx: aggregation info for Rx per TID -- RCU protected
* @tid_rx_token: dialog tokens for valid aggregation sessions
* @tid_rx_timer_expired: bitmap indicating on which TIDs the
@@ -287,6 +288,7 @@ struct tid_ampdu_rx {
struct sta_ampdu_mlme {
struct mutex mtx;
/* rx */
+ spinlock_t rx_offl_lock;
struct tid_ampdu_rx __rcu *tid_rx[IEEE80211_NUM_TIDS];
u8 tid_rx_token[IEEE80211_NUM_TIDS];
unsigned long tid_rx_timer_expired[BITS_TO_LONGS(IEEE80211_NUM_TIDS)];
--
quilt 0.63