2022-09-15 12:57:11

by Alexander Wetzel

[permalink] [raw]
Subject: [PATCH] mac80211: Fix deadlock: Don't start TX while holding fq->lock

ieee80211_txq_purge() calls fq_tin_reset() and
ieee80211_purge_tx_queue(); Both are then calling
ieee80211_free_txskb(). Which can decide to TX the skb again.

There are at least two ways to get a deadlock:

1) When we have a TDLS teardown packet queued in either tin or frags
ieee80211_tdls_td_tx_handle() will call ieee80211_subif_start_xmit()
while we still hold fq->lock. ieee80211_txq_enqueue() will thus
deadlock.

2) A variant of the above happens if aggregation is up and running:
In that case ieee80211_iface_work() will deadlock with the original
task: The original tasks already holds fq->lock and tries to get
sta->lock after kicking off ieee80211_iface_work(). But the worker
can get sta->lock prior to the original task and will then spin for
fq->lock.

Avoid these deadlocks by not sending out any skbs when called via
ieee80211_free_txskb().

Signed-off-by: Alexander Wetzel <[email protected]>
---

I have deadlock traces for the two scenarios above I can share, if
desired.
I found the issue while running the hostapd tests with my WIP patch
to switch all drivers over to iTXQ.

net/mac80211/status.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/mac80211/status.c b/net/mac80211/status.c
index 8e77fd2e9fdf..3f9ddd7f04b6 100644
--- a/net/mac80211/status.c
+++ b/net/mac80211/status.c
@@ -729,7 +729,7 @@ static void ieee80211_report_used_skb(struct ieee80211_local *local,

if (!sdata) {
skb->dev = NULL;
- } else {
+ } else if (!dropped) {
unsigned int hdr_size =
ieee80211_hdrlen(hdr->frame_control);

--
2.37.3