When a vif is being removed and sdata->bss is cleared, __ieee80211_wake_txqs
can still be called on it, which crashes as soon as sdata->bss is
dereferenced.
To fix this properly, check for SDATA_STATE_RUNNING before waking queues,
and take the fq lock when clearing it (to ensure that __ieee80211_wake_txqs
observes the change when running on a different CPU).

Signed-off-by: Felix Fietkau <[email protected]>
---
 net/mac80211/iface.c | 2 ++
 net/mac80211/util.c  | 3 +++
 2 files changed, 5 insertions(+)

diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index 41531478437c..15a73b7fdd75 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -377,7 +377,9 @@ static void ieee80211_do_stop(struct ieee80211_sub_if_data *sdata, bool going_do
 	bool cancel_scan;
 	struct cfg80211_nan_func *func;
 
+	spin_lock_bh(&local->fq.lock);
 	clear_bit(SDATA_STATE_RUNNING, &sdata->state);
+	spin_unlock_bh(&local->fq.lock);
 
 	cancel_scan = rcu_access_pointer(local->scan_sdata) == sdata;
 	if (cancel_scan)
diff --git a/net/mac80211/util.c b/net/mac80211/util.c
index 1e26b5235add..dad42d42aa84 100644
--- a/net/mac80211/util.c
+++ b/net/mac80211/util.c
@@ -301,6 +301,9 @@ static void __ieee80211_wake_txqs(struct ieee80211_sub_if_data *sdata, int ac)
 	local_bh_disable();
 	spin_lock(&fq->lock);
 
+	if (!test_bit(SDATA_STATE_RUNNING, &sdata->state))
+		goto out;
+
 	if (sdata->vif.type == NL80211_IFTYPE_AP)
 		ps = &sdata->bss->ps;
 
--
2.36.1
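
For readers following the thread, the locking pattern described in the commit
message can be sketched in userspace C. This is only an illustration under
stated assumptions, not the mac80211 code: struct vif, struct bss, vif_stop()
and vif_wake() are hypothetical names, a pthread mutex stands in for
local->fq.lock, and a plain bool stands in for the SDATA_STATE_RUNNING bit.
The sketch also assumes the pointer is dropped only after the flag has been
cleared under the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the mac80211 structures; not kernel code. */
struct bss {
	int ps;
};

struct vif {
	pthread_mutex_t lock;	/* plays the role of local->fq.lock */
	bool running;		/* plays the role of SDATA_STATE_RUNNING */
	struct bss *bss;	/* dropped during teardown, like sdata->bss */
};

/* Teardown path: clear the flag under the lock, then drop the pointer. */
static void vif_stop(struct vif *v)
{
	pthread_mutex_lock(&v->lock);
	v->running = false;
	pthread_mutex_unlock(&v->lock);

	/*
	 * Any waker that takes the lock from here on sees running == false
	 * and bails out before touching v->bss.
	 */
	v->bss = NULL;
}

/* Wake path: returns true if it touched the bss, false if it skipped. */
static bool vif_wake(struct vif *v)
{
	bool woke = false;

	pthread_mutex_lock(&v->lock);
	if (!v->running)
		goto out;

	/* The flag is still set, so teardown has not dropped the pointer. */
	assert(v->bss != NULL);
	v->bss->ps++;
	woke = true;
out:
	pthread_mutex_unlock(&v->lock);
	return woke;
}
```

A waker that already holds the lock finishes before vif_stop() can clear the
flag, and a waker that takes the lock afterwards skips the dereference. Whether
the same ordering holds for every caller in the real driver paths is not
something this sketch can show.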
Felix Fietkau <[email protected]> writes:
> When a vif is being removed and sdata->bss is cleared, __ieee80211_wake_txqs
> can still be called on it, which crashes as soon as sdata->bss is
> dereferenced.
> To fix this properly, check for SDATA_STATE_RUNNING before waking queues,
> and take the fq lock when clearing it (to ensure that __ieee80211_wake_txqs
> observes the change when running on a different CPU).
>
> Signed-off-by: Felix Fietkau <[email protected]>

I think it's a little ugly to expand usage of fq.lock across more and
more places, but I don't really have a good alternative, so:

Acked-by: Toke Høiland-Jørgensen <[email protected]>
On 5/31/22 12:08 PM, Felix Fietkau wrote:
> When a vif is being removed and sdata->bss is cleared, __ieee80211_wake_txqs
> can still be called on it, which crashes as soon as sdata->bss is
> dereferenced.
> To fix this properly, check for SDATA_STATE_RUNNING before waking queues,
> and take the fq lock when clearing it (to ensure that __ieee80211_wake_txqs
> observes the change when running on a different CPU).

I patched this into my 5.17+ kernel, and in a test that brings up 16 virtual
station vdevs on an mtk7915, 4 of them on each of the two radios will not
associate. They get 4-way handshake timeouts.

So I think there must be something wrong with the assumptions in this patch,
or maybe it depends on some other patch I am missing. I'll remove it from my
tree...

Thanks,
Ben
>
> Signed-off-by: Felix Fietkau <[email protected]>
> ---
> net/mac80211/iface.c | 2 ++
> net/mac80211/util.c | 3 +++
> 2 files changed, 5 insertions(+)
>
> diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
> index 41531478437c..15a73b7fdd75 100644
> --- a/net/mac80211/iface.c
> +++ b/net/mac80211/iface.c
> @@ -377,7 +377,9 @@ static void ieee80211_do_stop(struct ieee80211_sub_if_data *sdata, bool going_do
>  	bool cancel_scan;
>  	struct cfg80211_nan_func *func;
>
> +	spin_lock_bh(&local->fq.lock);
>  	clear_bit(SDATA_STATE_RUNNING, &sdata->state);
> +	spin_unlock_bh(&local->fq.lock);
>
>  	cancel_scan = rcu_access_pointer(local->scan_sdata) == sdata;
>  	if (cancel_scan)
> diff --git a/net/mac80211/util.c b/net/mac80211/util.c
> index 1e26b5235add..dad42d42aa84 100644
> --- a/net/mac80211/util.c
> +++ b/net/mac80211/util.c
> @@ -301,6 +301,9 @@ static void __ieee80211_wake_txqs(struct ieee80211_sub_if_data *sdata, int ac)
>  	local_bh_disable();
>  	spin_lock(&fq->lock);
>
> +	if (!test_bit(SDATA_STATE_RUNNING, &sdata->state))
> +		goto out;
> +
>  	if (sdata->vif.type == NL80211_IFTYPE_AP)
>  		ps = &sdata->bss->ps;
>
>
--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com