2019-03-01 13:48:41

by Felix Fietkau

[permalink] [raw]
Subject: [PATCH] mac80211: do not call driver wake_tx_queue op during reconfig

There are several scenarios in which mac80211 can call drv_wake_tx_queue
after ieee80211_restart_hw has been called and has not yet completed.
Driver private structs are considered uninitialized until mac80211 has
uploaded the vifs, stations and keys again, so using private tx queue
data during that time is not safe.

The driver can also not rely on drv_reconfig_complete to figure out when
it is safe to accept drv_wake_tx_queue calls again, because it is only
called after all tx queues are woken again.

To fix this, bail out early in drv_wake_tx_queue if local->in_reconfig
is set.

Cc: [email protected]
Signed-off-by: Felix Fietkau <[email protected]>
---
net/mac80211/driver-ops.h | 3 +++
1 file changed, 3 insertions(+)

diff --git a/net/mac80211/driver-ops.h b/net/mac80211/driver-ops.h
index 28d022a3eee3..ae4f0be3b393 100644
--- a/net/mac80211/driver-ops.h
+++ b/net/mac80211/driver-ops.h
@@ -1195,6 +1195,9 @@ static inline void drv_wake_tx_queue(struct ieee80211_local *local,
{
struct ieee80211_sub_if_data *sdata = vif_to_sdata(txq->txq.vif);

+ if (local->in_reconfig)
+ return;
+
if (!check_sdata_in_driver(sdata))
return;

--
2.17.0



2019-03-01 15:29:41

by Ben Greear

[permalink] [raw]
Subject: Re: [PATCH] mac80211: do not call driver wake_tx_queue op during reconfig

On 3/1/19 5:48 AM, Felix Fietkau wrote:
> There are several scenarios in which mac80211 can call drv_wake_tx_queue
> after ieee80211_restart_hw has been called and has not yet completed.
> Driver private structs are considered uninitialized until mac80211 has
> uploaded the vifs, stations and keys again, so using private tx queue
> data during that time is not safe.
>
> The driver can also not rely on drv_reconfig_complete to figure out when
> it is safe to accept drv_wake_tx_queue calls again, because it is only
> called after all tx queues are woken again.
>
> To fix this, bail out early in drv_wake_tx_queue if local->in_reconfig
> is set.

This reminded me of a patch I posted back in 2016. The discussion just sort of
ended on it, but curious if you have a new opinion on it after debugging the
issue in this patch:

https://patchwork.kernel.org/patch/9457709/

For what its worth, I've been using the patch above since I posted it, and
it seems to work well for ath9k and ath10k.

Thanks,
Ben


>
> Cc: [email protected]
> Signed-off-by: Felix Fietkau <[email protected]>
> ---
> net/mac80211/driver-ops.h | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/net/mac80211/driver-ops.h b/net/mac80211/driver-ops.h
> index 28d022a3eee3..ae4f0be3b393 100644
> --- a/net/mac80211/driver-ops.h
> +++ b/net/mac80211/driver-ops.h
> @@ -1195,6 +1195,9 @@ static inline void drv_wake_tx_queue(struct ieee80211_local *local,
> {
> struct ieee80211_sub_if_data *sdata = vif_to_sdata(txq->txq.vif);
>
> + if (local->in_reconfig)
> + return;
> +
> if (!check_sdata_in_driver(sdata))
> return;
>
>


--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com


2019-03-01 16:06:53

by Felix Fietkau

[permalink] [raw]
Subject: Re: [PATCH] mac80211: do not call driver wake_tx_queue op during reconfig

On 2019-03-01 16:29, Ben Greear wrote:
> On 3/1/19 5:48 AM, Felix Fietkau wrote:
>> There are several scenarios in which mac80211 can call drv_wake_tx_queue
>> after ieee80211_restart_hw has been called and has not yet completed.
>> Driver private structs are considered uninitialized until mac80211 has
>> uploaded the vifs, stations and keys again, so using private tx queue
>> data during that time is not safe.
>>
>> The driver can also not rely on drv_reconfig_complete to figure out when
>> it is safe to accept drv_wake_tx_queue calls again, because it is only
>> called after all tx queues are woken again.
>>
>> To fix this, bail out early in drv_wake_tx_queue if local->in_reconfig
>> is set.
>
> This reminded me of a patch I posted back in 2016. The discussion just sort of
> ended on it, but curious if you have a new opinion on it after debugging the
> issue in this patch:
>
> https://patchwork.kernel.org/patch/9457709/
>
> For what its worth, I've been using the patch above since I posted it, and
> it seems to work well for ath9k and ath10k.
I agree with what Johannes wrote about that patch. Fixing this could
likely be as simple as clearing IEEE80211_SDATA_IN_DRIVER on all
interfaces before bringing any of them back up. That way the normal
interface add logic applies without nasty special cases.
The reconfig code checks for ieee80211_sdata_running (which is
unaffected), so I don't think we need to save the previous value of that
flag.

- Felix

2019-03-01 16:13:52

by Felix Fietkau

[permalink] [raw]
Subject: Re: [PATCH] mac80211: do not call driver wake_tx_queue op during reconfig

On 2019-03-01 17:06, Felix Fietkau wrote:
> On 2019-03-01 16:29, Ben Greear wrote:
>> On 3/1/19 5:48 AM, Felix Fietkau wrote:
>>> There are several scenarios in which mac80211 can call drv_wake_tx_queue
>>> after ieee80211_restart_hw has been called and has not yet completed.
>>> Driver private structs are considered uninitialized until mac80211 has
>>> uploaded the vifs, stations and keys again, so using private tx queue
>>> data during that time is not safe.
>>>
>>> The driver can also not rely on drv_reconfig_complete to figure out when
>>> it is safe to accept drv_wake_tx_queue calls again, because it is only
>>> called after all tx queues are woken again.
>>>
>>> To fix this, bail out early in drv_wake_tx_queue if local->in_reconfig
>>> is set.
>>
>> This reminded me of a patch I posted back in 2016. The discussion just sort of
>> ended on it, but curious if you have a new opinion on it after debugging the
>> issue in this patch:
>>
>> https://patchwork.kernel.org/patch/9457709/
>>
>> For what its worth, I've been using the patch above since I posted it, and
>> it seems to work well for ath9k and ath10k.
> I agree with what Johannes wrote about that patch. Fixing this could
> likely be as simple as clearing IEEE80211_SDATA_IN_DRIVER on all
> interfaces before bringing any of them back up. That way the normal
> interface add logic applies without nasty special cases.
> The reconfig code checks for ieee80211_sdata_running (which is
> unaffected), so I don't think we need to save the previous value of that
> flag.
How about this? (untested)
---
--- a/net/mac80211/util.c
+++ b/net/mac80211/util.c
@@ -1956,7 +1956,6 @@ static void ieee80211_flush_completed_scan(struct ieee80211_local *local,

static void ieee80211_handle_reconfig_failure(struct ieee80211_local *local)
{
- struct ieee80211_sub_if_data *sdata;
struct ieee80211_chanctx *ctx;

/*
@@ -1980,9 +1979,6 @@ static void ieee80211_handle_reconfig_failure(struct ieee80211_local *local)
*/
ieee80211_sched_scan_end(local);

- list_for_each_entry(sdata, &local->interfaces, list)
- sdata->flags &= ~IEEE80211_SDATA_IN_DRIVER;
-
/* Mark channel contexts as not being in the driver any more to avoid
* removing them from the driver during the shutdown process...
*/
@@ -2135,6 +2131,9 @@ int ieee80211_reconfig(struct ieee80211_local *local)

local->started = false;

+ list_for_each_entry(sdata, &local->interfaces, list)
+ sdata->flags &= ~IEEE80211_SDATA_IN_DRIVER;
+
/*
* Upon resume hardware can sometimes be goofy due to
* various platform / driver / bus issues, so restarting


2019-03-01 17:21:39

by Ben Greear

[permalink] [raw]
Subject: Re: [PATCH] mac80211: do not call driver wake_tx_queue op during reconfig

On 3/1/19 8:13 AM, Felix Fietkau wrote:
> On 2019-03-01 17:06, Felix Fietkau wrote:
>> On 2019-03-01 16:29, Ben Greear wrote:
>>> On 3/1/19 5:48 AM, Felix Fietkau wrote:
>>>> There are several scenarios in which mac80211 can call drv_wake_tx_queue
>>>> after ieee80211_restart_hw has been called and has not yet completed.
>>>> Driver private structs are considered uninitialized until mac80211 has
>>>> uploaded the vifs, stations and keys again, so using private tx queue
>>>> data during that time is not safe.
>>>>
>>>> The driver can also not rely on drv_reconfig_complete to figure out when
>>>> it is safe to accept drv_wake_tx_queue calls again, because it is only
>>>> called after all tx queues are woken again.
>>>>
>>>> To fix this, bail out early in drv_wake_tx_queue if local->in_reconfig
>>>> is set.
>>>
>>> This reminded me of a patch I posted back in 2016. The discussion just sort of
>>> ended on it, but curious if you have a new opinion on it after debugging the
>>> issue in this patch:
>>>
>>> https://patchwork.kernel.org/patch/9457709/
>>>
>>> For what its worth, I've been using the patch above since I posted it, and
>>> it seems to work well for ath9k and ath10k.
>> I agree with what Johannes wrote about that patch. Fixing this could
>> likely be as simple as clearing IEEE80211_SDATA_IN_DRIVER on all
>> interfaces before bringing any of them back up. That way the normal
>> interface add logic applies without nasty special cases.
>> The reconfig code checks for ieee80211_sdata_running (which is
>> unaffected), so I don't think we need to save the previous value of that
>> flag.
> How about this? (untested)
> ---
> --- a/net/mac80211/util.c
> +++ b/net/mac80211/util.c
> @@ -1956,7 +1956,6 @@ static void ieee80211_flush_completed_scan(struct ieee80211_local *local,
>
> static void ieee80211_handle_reconfig_failure(struct ieee80211_local *local)
> {
> - struct ieee80211_sub_if_data *sdata;
> struct ieee80211_chanctx *ctx;
>
> /*
> @@ -1980,9 +1979,6 @@ static void ieee80211_handle_reconfig_failure(struct ieee80211_local *local)
> */
> ieee80211_sched_scan_end(local);
>
> - list_for_each_entry(sdata, &local->interfaces, list)
> - sdata->flags &= ~IEEE80211_SDATA_IN_DRIVER;
> -
> /* Mark channel contexts as not being in the driver any more to avoid
> * removing them from the driver during the shutdown process...
> */
> @@ -2135,6 +2131,9 @@ int ieee80211_reconfig(struct ieee80211_local *local)
>
> local->started = false;
>
> + list_for_each_entry(sdata, &local->interfaces, list)
> + sdata->flags &= ~IEEE80211_SDATA_IN_DRIVER;
> +
> /*
> * Upon resume hardware can sometimes be goofy due to
> * various platform / driver / bus issues, so restarting
>

Let me see if anyone (Johannes?) has comments on this... I can test it out
a bit later if no one suggests changes.

Thanks,
Ben

--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com