2014-05-09 12:23:42

by Michal Kazior

[permalink] [raw]
Subject: [PATCH 0/5] ath10k: recovery fixes 2014-05-09

Hi,

Here's is a bunch of fixes for recovery/hw
restart. This notably fixes a bug Ben has hit some
time ago.

I also have some improvements for recovery code as
well in a follow up patchset that I'll be sending
separately.


Michal Kazior (5):
ath10k: clean up start() callback
ath10k: perform hw restart lazily
ath10k: drain tx before restarting hw
ath10k: protect wep tx key setup
ath10k: dont configure bssid for ap mode

drivers/net/wireless/ath/ath10k/core.c | 5 +-
drivers/net/wireless/ath/ath10k/mac.c | 103 +++++++++++++++++++++------------
2 files changed, 70 insertions(+), 38 deletions(-)

--
1.8.5.3



2014-05-26 09:43:12

by Michal Kazior

[permalink] [raw]
Subject: Re: [PATCH v2 3/5] ath10k: drain tx before restarting hw

On 26 May 2014 11:40, Kalle Valo <[email protected]> wrote:
> Michal Kazior <[email protected]> writes:
>
>> This makes sure no further tx requests are
>> submitted to HTT before driver teardown.
>>
>> This should prevent invalid pointer/NULL
>> dereference on htt tx pool in ath10k_htt_tx() in
>> some cases of heavy traffic.
>>
>> Reported-By: Ben Greear <[email protected]>
>> Signed-off-by: Michal Kazior <[email protected]>
>> ---
>> drivers/net/wireless/ath/ath10k/mac.c | 30 ++++++++++++++++++++++++------
>> 1 file changed, 24 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
>> index b5d75b8..577a3a5 100644
>> --- a/drivers/net/wireless/ath/ath10k/mac.c
>> +++ b/drivers/net/wireless/ath/ath10k/mac.c
>> @@ -2289,6 +2289,21 @@ static void ath10k_tx(struct ieee80211_hw *hw,
>> /*
>> * Initialize various parameters with default vaules.
>> */
>> +static void ath10k_drain_tx(struct ath10k *ar)
>> +{
>> + /* workers can hold conf_mutex -- avoid deadlock */
>> + WARN_ON(lockdep_is_held(&ar->conf_mutex));
>
> Based on the discussion I modified the function to this now. Are you ok
> with that?
>
> /* Must not be called with conf_mutex held as workers can use that also. */
> static void ath10k_drain_tx(struct ath10k *ar)
> {
> /* make sure rcu-protected mac80211 tx path itself is drained */
> synchronize_net();
>
> ath10k_offchan_tx_purge(ar);
> ath10k_mgmt_over_wmi_tx_purge(ar);
>
> cancel_work_sync(&ar->offchan_tx_work);
> cancel_work_sync(&ar->wmi_mgmt_tx_work);
> }
>
> Full patch:
>
> https://github.com/kvalo/ath/commit/e3c513cd03802e75802a6ce5efbadaa0dbf04b27

Looks good, thanks!


Michał

2014-05-15 08:49:13

by Michal Kazior

[permalink] [raw]
Subject: [PATCH v2 3/5] ath10k: drain tx before restarting hw

This makes sure no further tx requests are
submitted to HTT before driver teardown.

This should prevent invalid pointer/NULL
dereference on htt tx pool in ath10k_htt_tx() in
some cases of heavy traffic.

Reported-By: Ben Greear <[email protected]>
Signed-off-by: Michal Kazior <[email protected]>
---
drivers/net/wireless/ath/ath10k/mac.c | 30 ++++++++++++++++++++++++------
1 file changed, 24 insertions(+), 6 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index b5d75b8..577a3a5 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -2289,6 +2289,21 @@ static void ath10k_tx(struct ieee80211_hw *hw,
/*
* Initialize various parameters with default vaules.
*/
+static void ath10k_drain_tx(struct ath10k *ar)
+{
+ /* workers can hold conf_mutex -- avoid deadlock */
+ WARN_ON(lockdep_is_held(&ar->conf_mutex));
+
+ /* make sure rcu-protected mac80211 tx path itself is drained */
+ synchronize_net();
+
+ ath10k_offchan_tx_purge(ar);
+ ath10k_mgmt_over_wmi_tx_purge(ar);
+
+ cancel_work_sync(&ar->offchan_tx_work);
+ cancel_work_sync(&ar->wmi_mgmt_tx_work);
+}
+
void ath10k_halt(struct ath10k *ar)
{
struct ath10k_vif *arvif;
@@ -2304,8 +2319,6 @@ void ath10k_halt(struct ath10k *ar)

del_timer_sync(&ar->scan.timeout);
ath10k_reset_scan((unsigned long)ar);
- ath10k_offchan_tx_purge(ar);
- ath10k_mgmt_over_wmi_tx_purge(ar);
ath10k_peer_cleanup_all(ar);
ath10k_core_stop(ar);
ath10k_hif_power_down(ar);
@@ -2329,6 +2342,13 @@ static int ath10k_start(struct ieee80211_hw *hw)
struct ath10k *ar = hw->priv;
int ret = 0;

+ /*
+ * This makes sense only when restarting hw. It is harmless to call
+ * uncoditionally. This is necessary to make sure no HTT/WMI tx
+ * commands will be submitted while restarting.
+ */
+ ath10k_drain_tx(ar);
+
mutex_lock(&ar->conf_mutex);

switch (ar->state) {
@@ -2412,6 +2432,8 @@ static void ath10k_stop(struct ieee80211_hw *hw)
{
struct ath10k *ar = hw->priv;

+ ath10k_drain_tx(ar);
+
mutex_lock(&ar->conf_mutex);
if (ar->state != ATH10K_STATE_OFF) {
ath10k_halt(ar);
@@ -2419,10 +2441,6 @@ static void ath10k_stop(struct ieee80211_hw *hw)
}
mutex_unlock(&ar->conf_mutex);

- ath10k_mgmt_over_wmi_tx_purge(ar);
-
- cancel_work_sync(&ar->offchan_tx_work);
- cancel_work_sync(&ar->wmi_mgmt_tx_work);
cancel_work_sync(&ar->restart_work);
}

--
1.8.5.3


2014-05-26 05:48:58

by Michal Kazior

[permalink] [raw]
Subject: Re: [PATCH v2 3/5] ath10k: drain tx before restarting hw

On 23 May 2014 15:03, Johannes Berg <[email protected]> wrote:
> On Fri, 2014-05-23 at 11:06 +0200, Michal Kazior wrote:
>
>> There's nothing wrong with other thread holding it. Actually that's
>> the reason for this very check.
>>
>> The point is to prevent ath10k_drain_tx() being called while caller
>> (current thread) holds conf_mutex. If it were to hold conf_mutex then
>> cancel_work_sync() can deadlock as both workers it tries to stop try
>> to get a hold of the lock too.
>
> That seems pointless - lockdep would warn you about that *anyway*
> because cancel_work_sync() and friends have proper annotations. Try it
> sometime :)

Does it splat every time or just by chance or after a deadlock?


Michał

2014-05-09 12:31:40

by Michal Kazior

[permalink] [raw]
Subject: [PATCH 1/5] ath10k: clean up start() callback

This fixes failpath when override AC pdev param
setup fails and makes other pdev params setting
fail as well.

Signed-off-by: Michal Kazior <[email protected]>
---
drivers/net/wireless/ath/ath10k/mac.c | 28 ++++++++++++++++++----------
1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index e2c01dc..eef9104 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -2339,22 +2339,19 @@ static int ath10k_start(struct ieee80211_hw *hw)
if (ar->state != ATH10K_STATE_OFF &&
ar->state != ATH10K_STATE_RESTARTING) {
ret = -EINVAL;
- goto exit;
+ goto out;
}

ret = ath10k_hif_power_up(ar);
if (ret) {
ath10k_err("Could not init hif: %d\n", ret);
- ar->state = ATH10K_STATE_OFF;
- goto exit;
+ goto err_off;
}

ret = ath10k_core_start(ar);
if (ret) {
ath10k_err("Could not init core: %d\n", ret);
- ath10k_hif_power_down(ar);
- ar->state = ATH10K_STATE_OFF;
- goto exit;
+ goto err_power_down;
}

if (ar->state == ATH10K_STATE_OFF)
@@ -2363,12 +2360,16 @@ static int ath10k_start(struct ieee80211_hw *hw)
ar->state = ATH10K_STATE_RESTARTED;

ret = ath10k_wmi_pdev_set_param(ar, ar->wmi.pdev_param->pmf_qos, 1);
- if (ret)
+ if (ret) {
ath10k_warn("failed to enable PMF QOS: %d\n", ret);
+ goto err_core_stop;
+ }

ret = ath10k_wmi_pdev_set_param(ar, ar->wmi.pdev_param->dynamic_bw, 1);
- if (ret)
+ if (ret) {
ath10k_warn("failed to enable dynamic BW: %d\n", ret);
+ goto err_core_stop;
+ }

/*
* By default FW set ARP frames ac to voice (6). In that case ARP
@@ -2384,14 +2385,21 @@ static int ath10k_start(struct ieee80211_hw *hw)
if (ret) {
ath10k_warn("failed to set arp ac override parameter: %d\n",
ret);
- goto exit;
+ goto err_core_stop;
}

ar->num_started_vdevs = 0;
ath10k_regd_update(ar);
ret = 0;
+ goto out;

-exit:
+err_core_stop:
+ ath10k_core_stop(ar);
+err_power_down:
+ ath10k_hif_power_down(ar);
+err_off:
+ ar->state = ATH10K_STATE_OFF;
+out:
mutex_unlock(&ar->conf_mutex);
return ret;
}
--
1.8.5.3


2014-05-14 19:50:56

by Kalle Valo

[permalink] [raw]
Subject: Re: [PATCH 4/5] ath10k: protect wep tx key setup

Michal Kazior <[email protected]> writes:

> All configuration sequences should be protected
> with conf_mutex to avoid concurrent/conflicting
> requests.
>
> This should make sure that wep tx key setup is not
> performed while hw is restarted (at least).

Locks and mutexes should protect data, not thread of execution. What are
we exactly protecting here, arvif->def_wep_key_idx?

The patch makes sense, I'm just trying to understand the idea behind the
implementation.

> --- a/drivers/net/wireless/ath/ath10k/mac.c
> +++ b/drivers/net/wireless/ath/ath10k/mac.c
> @@ -1888,8 +1888,10 @@ static void ath10k_tx_wep_key_work(struct work_struct *work)
> wep_key_work);
> int ret, keyidx = arvif->def_wep_key_newidx;
>
> + mutex_lock(&arvif->ar->conf_mutex);
> +
> if (arvif->def_wep_key_idx == keyidx)
> - return;
> + goto unlock;

BTW, shouldn't we also check the state and bail out if the state is not
ON (and possibly RESTARTED)? How can this work otherwise?

--
Kalle Valo

2014-05-15 06:04:30

by Michal Kazior

[permalink] [raw]
Subject: Re: [PATCH 4/5] ath10k: protect wep tx key setup

On 14 May 2014 21:50, Kalle Valo <[email protected]> wrote:
> Michal Kazior <[email protected]> writes:
>
>> All configuration sequences should be protected
>> with conf_mutex to avoid concurrent/conflicting
>> requests.
>>
>> This should make sure that wep tx key setup is not
>> performed while hw is restarted (at least).
>
> Locks and mutexes should protect data, not thread of execution. What are
> we exactly protecting here, arvif->def_wep_key_idx?

Actually conf_mutex idea is to protect and serialize configuration
sequences (WMI) too, not just data. Perhaps we should revise this?


> The patch makes sense, I'm just trying to understand the idea behind the
> implementation.
>
>> --- a/drivers/net/wireless/ath/ath10k/mac.c
>> +++ b/drivers/net/wireless/ath/ath10k/mac.c
>> @@ -1888,8 +1888,10 @@ static void ath10k_tx_wep_key_work(struct work_struct *work)
>> wep_key_work);
>> int ret, keyidx = arvif->def_wep_key_newidx;
>>
>> + mutex_lock(&arvif->ar->conf_mutex);
>> +
>> if (arvif->def_wep_key_idx == keyidx)
>> - return;
>> + goto unlock;
>
> BTW, shouldn't we also check the state and bail out if the state is not
> ON (and possibly RESTARTED)? How can this work otherwise?

Very good point!

But actually.. this should be cancelled during recovery in the first
place. I need to take a look at it.


Michał

2014-05-09 12:31:01

by Michal Kazior

[permalink] [raw]
Subject: [PATCH 5/5] ath10k: dont configure bssid for ap mode

FW creates self-peer for AP internally.

This prevents ath10k from trying to create
explicit self-peer during hw recovery and thus
prevents a timeout and a warning during teardown:

ath10k: removing stale peer $AP_BSSID from vdev_id 0

Signed-off-by: Michal Kazior <[email protected]>
---
drivers/net/wireless/ath/ath10k/mac.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index 78b427d..9b14ec8 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -2953,7 +2953,8 @@ static void ath10k_bss_info_changed(struct ieee80211_hw *hw,
arvif->u.ap.hidden_ssid = info->hidden_ssid;
}

- if (changed & BSS_CHANGED_BSSID) {
+ if (changed & BSS_CHANGED_BSSID &&
+ vif->type != NL80211_IFTYPE_AP) {
if (!is_zero_ether_addr(info->bssid)) {
ath10k_dbg(ATH10K_DBG_MAC,
"mac vdev %d create peer %pM\n",
--
1.8.5.3


2014-05-14 18:58:27

by Kalle Valo

[permalink] [raw]
Subject: Re: [PATCH 5/5] ath10k: dont configure bssid for ap mode

Michal Kazior <[email protected]> writes:

> FW creates self-peer for AP internally.
>
> This prevents ath10k from trying to create
> explicit self-peer during hw recovery and thus
> prevents a timeout and a warning during teardown:
>
> ath10k: removing stale peer $AP_BSSID from vdev_id 0
>
> Signed-off-by: Michal Kazior <[email protected]>

[...]

> --- a/drivers/net/wireless/ath/ath10k/mac.c
> +++ b/drivers/net/wireless/ath/ath10k/mac.c
> @@ -2953,7 +2953,8 @@ static void ath10k_bss_info_changed(struct ieee80211_hw *hw,
> arvif->u.ap.hidden_ssid = info->hidden_ssid;
> }
>
> - if (changed & BSS_CHANGED_BSSID) {
> + if (changed & BSS_CHANGED_BSSID &&
> + vif->type != NL80211_IFTYPE_AP) {
> if (!is_zero_ether_addr(info->bssid)) {
> ath10k_dbg(ATH10K_DBG_MAC,
> "mac vdev %d create peer %pM\n",

I think this deserves a comment, at least "Firmware creates self-peer in
AP mode automatically" or something like that.

--
Kalle Valo

2014-05-15 08:49:15

by Michal Kazior

[permalink] [raw]
Subject: [PATCH v2 5/5] ath10k: dont configure bssid for ap mode

FW creates self-peer for AP internally.

This prevents ath10k from trying to create
explicit self-peer during hw recovery and thus
prevents a timeout and a warning during teardown:

ath10k: removing stale peer $AP_BSSID from vdev_id 0

Signed-off-by: Michal Kazior <[email protected]>
---
v2:
* add comment why CHANGED_BSSID is skipped for AP [Kalle]

drivers/net/wireless/ath/ath10k/mac.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index e474eca..5b01b82 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -2961,7 +2961,12 @@ static void ath10k_bss_info_changed(struct ieee80211_hw *hw,
arvif->u.ap.hidden_ssid = info->hidden_ssid;
}

- if (changed & BSS_CHANGED_BSSID) {
+ /*
+ * Firmware manages AP self-peer internally so make sure to not create
+ * it in driver. Otherwise AP self-peer deletion may timeout later.
+ */
+ if (changed & BSS_CHANGED_BSSID &&
+ vif->type != NL80211_IFTYPE_AP) {
if (!is_zero_ether_addr(info->bssid)) {
ath10k_dbg(ATH10K_DBG_MAC,
"mac vdev %d create peer %pM\n",
--
1.8.5.3


2014-05-09 12:31:31

by Michal Kazior

[permalink] [raw]
Subject: [PATCH 2/5] ath10k: perform hw restart lazily

This reduces risk of races and prepares for more
hw restart fixes.

It also makes sense to perform teardown after
mac80211 starts its restart routine as it
guarantees it has stopped itself by then
(including tx queues).

Signed-off-by: Michal Kazior <[email protected]>
---
drivers/net/wireless/ath/ath10k/core.c | 5 ++++-
drivers/net/wireless/ath/ath10k/mac.c | 34 ++++++++++++++++------------------
2 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/core.c b/drivers/net/wireless/ath/ath10k/core.c
index 75b3dfb..7034c72 100644
--- a/drivers/net/wireless/ath/ath10k/core.c
+++ b/drivers/net/wireless/ath/ath10k/core.c
@@ -681,7 +681,8 @@ static void ath10k_core_restart(struct work_struct *work)
switch (ar->state) {
case ATH10K_STATE_ON:
ar->state = ATH10K_STATE_RESTARTING;
- ath10k_halt(ar);
+ del_timer_sync(&ar->scan.timeout);
+ ath10k_reset_scan((unsigned long)ar);
ieee80211_restart_hw(ar->hw);
break;
case ATH10K_STATE_OFF:
@@ -690,6 +691,8 @@ static void ath10k_core_restart(struct work_struct *work)
ath10k_warn("cannot restart a device that hasn't been started\n");
break;
case ATH10K_STATE_RESTARTING:
+ /* hw restart might be requested from multiple places */
+ break;
case ATH10K_STATE_RESTARTED:
ar->state = ATH10K_STATE_WEDGED;
/* fall through */
diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index eef9104..dbf0960 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -2303,6 +2303,7 @@ void ath10k_halt(struct ath10k *ar)
}

del_timer_sync(&ar->scan.timeout);
+ ath10k_reset_scan((unsigned long)ar);
ath10k_offchan_tx_purge(ar);
ath10k_mgmt_over_wmi_tx_purge(ar);
ath10k_peer_cleanup_all(ar);
@@ -2310,12 +2311,6 @@ void ath10k_halt(struct ath10k *ar)
ath10k_hif_power_down(ar);

spin_lock_bh(&ar->data_lock);
- if (ar->scan.in_progress) {
- del_timer(&ar->scan.timeout);
- ar->scan.in_progress = false;
- ieee80211_scan_completed(ar->hw, true);
- }
-
list_for_each_entry(arvif, &ar->arvifs, list) {
if (!arvif->beacon)
continue;
@@ -2336,8 +2331,18 @@ static int ath10k_start(struct ieee80211_hw *hw)

mutex_lock(&ar->conf_mutex);

- if (ar->state != ATH10K_STATE_OFF &&
- ar->state != ATH10K_STATE_RESTARTING) {
+ switch (ar->state) {
+ case ATH10K_STATE_OFF:
+ ar->state = ATH10K_STATE_ON;
+ break;
+ case ATH10K_STATE_RESTARTING:
+ ath10k_halt(ar);
+ ar->state = ATH10K_STATE_RESTARTED;
+ break;
+ case ATH10K_STATE_ON:
+ case ATH10K_STATE_RESTARTED:
+ case ATH10K_STATE_WEDGED:
+ WARN_ON(1);
ret = -EINVAL;
goto out;
}
@@ -2354,11 +2359,6 @@ static int ath10k_start(struct ieee80211_hw *hw)
goto err_power_down;
}

- if (ar->state == ATH10K_STATE_OFF)
- ar->state = ATH10K_STATE_ON;
- else if (ar->state == ATH10K_STATE_RESTARTING)
- ar->state = ATH10K_STATE_RESTARTED;
-
ret = ath10k_wmi_pdev_set_param(ar, ar->wmi.pdev_param->pmf_qos, 1);
if (ret) {
ath10k_warn("failed to enable PMF QOS: %d\n", ret);
@@ -2409,12 +2409,10 @@ static void ath10k_stop(struct ieee80211_hw *hw)
struct ath10k *ar = hw->priv;

mutex_lock(&ar->conf_mutex);
- if (ar->state == ATH10K_STATE_ON ||
- ar->state == ATH10K_STATE_RESTARTED ||
- ar->state == ATH10K_STATE_WEDGED)
+ if (ar->state != ATH10K_STATE_OFF) {
ath10k_halt(ar);
-
- ar->state = ATH10K_STATE_OFF;
+ ar->state = ATH10K_STATE_OFF;
+ }
mutex_unlock(&ar->conf_mutex);

ath10k_mgmt_over_wmi_tx_purge(ar);
--
1.8.5.3


2014-05-23 08:45:32

by Kalle Valo

[permalink] [raw]
Subject: Re: [PATCH v2 3/5] ath10k: drain tx before restarting hw

Michal Kazior <[email protected]> writes:

> On 23 May 2014 10:27, Kalle Valo <[email protected]> wrote:
>> Michal Kazior <[email protected]> writes:
>>
>>> +static void ath10k_drain_tx(struct ath10k *ar)
>>> +{
>>> + /* workers can hold conf_mutex -- avoid deadlock */
>>> + WARN_ON(lockdep_is_held(&ar->conf_mutex));
>>
>> If CONFIG_LOCKDEP is disabled this will fail with:
>>
>> drivers/net/wireless/ath/ath10k/mac.c:2301:2: error: implicit declaration of function 'lockdep_is_held' [-Werror=implicit-function-declaration]
>>
>> I wasn't able to come up with any other way than adding '#ifdef
>> CONFIG_LOCKDEP' which is quite ugly. What should we do?
>
> I guess we could define lockdep_assert_not_held() in debug.h?

Sure, but I still think it's a bit ugly. The right way to fix this would
be to add it to include/lockdep.h, instead of adding custom checks to a
driver. But does this check even make sense? There's nothing preventing
to another thread to take lock just after the WARN_ON(), right?

Oh, and if we add it to debug.h we should name ath10k_assert_not_held().
We should not use lockdep's namespace for anything inside ath10k.

--
Kalle Valo

2014-05-26 06:29:33

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH v2 3/5] ath10k: drain tx before restarting hw

On Mon, 2014-05-26 at 07:48 +0200, Michal Kazior wrote:
> On 23 May 2014 15:03, Johannes Berg <[email protected]> wrote:
> > On Fri, 2014-05-23 at 11:06 +0200, Michal Kazior wrote:
> >
> >> There's nothing wrong with other thread holding it. Actually that's
> >> the reason for this very check.
> >>
> >> The point is to prevent ath10k_drain_tx() being called while caller
> >> (current thread) holds conf_mutex. If it were to hold conf_mutex then
> >> cancel_work_sync() can deadlock as both workers it tries to stop try
> >> to get a hold of the lock too.
> >
> > That seems pointless - lockdep would warn you about that *anyway*
> > because cancel_work_sync() and friends have proper annotations. Try it
> > sometime :)
>
> Does it splat every time or just by chance or after a deadlock?

Whenever you execute the second path for the first time.

johannes


2014-05-15 08:49:10

by Michal Kazior

[permalink] [raw]
Subject: [PATCH v2 1/5] ath10k: clean up start() callback

This fixes failpath when override AC pdev param
setup fails and makes other pdev params setting
fail as well.

Signed-off-by: Michal Kazior <[email protected]>
---
v2:
* add empty lines before each goto label [Kalle]
* dont use err goto paths on main code path [Kalle]

drivers/net/wireless/ath/ath10k/mac.c | 34 +++++++++++++++++++++++-----------
1 file changed, 23 insertions(+), 11 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index 7026f02..854e183 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -2339,22 +2339,19 @@ static int ath10k_start(struct ieee80211_hw *hw)
if (ar->state != ATH10K_STATE_OFF &&
ar->state != ATH10K_STATE_RESTARTING) {
ret = -EINVAL;
- goto exit;
+ goto err;
}

ret = ath10k_hif_power_up(ar);
if (ret) {
ath10k_err("Could not init hif: %d\n", ret);
- ar->state = ATH10K_STATE_OFF;
- goto exit;
+ goto err_off;
}

ret = ath10k_core_start(ar);
if (ret) {
ath10k_err("Could not init core: %d\n", ret);
- ath10k_hif_power_down(ar);
- ar->state = ATH10K_STATE_OFF;
- goto exit;
+ goto err_power_down;
}

if (ar->state == ATH10K_STATE_OFF)
@@ -2363,12 +2360,16 @@ static int ath10k_start(struct ieee80211_hw *hw)
ar->state = ATH10K_STATE_RESTARTED;

ret = ath10k_wmi_pdev_set_param(ar, ar->wmi.pdev_param->pmf_qos, 1);
- if (ret)
+ if (ret) {
ath10k_warn("failed to enable PMF QOS: %d\n", ret);
+ goto err_core_stop;
+ }

ret = ath10k_wmi_pdev_set_param(ar, ar->wmi.pdev_param->dynamic_bw, 1);
- if (ret)
+ if (ret) {
ath10k_warn("failed to enable dynamic BW: %d\n", ret);
+ goto err_core_stop;
+ }

/*
* By default FW set ARP frames ac to voice (6). In that case ARP
@@ -2384,14 +2385,25 @@ static int ath10k_start(struct ieee80211_hw *hw)
if (ret) {
ath10k_warn("failed to set arp ac override parameter: %d\n",
ret);
- goto exit;
+ goto err_core_stop;
}

ar->num_started_vdevs = 0;
ath10k_regd_update(ar);
- ret = 0;

-exit:
+ mutex_unlock(&ar->conf_mutex);
+ return 0;
+
+err_core_stop:
+ ath10k_core_stop(ar);
+
+err_power_down:
+ ath10k_hif_power_down(ar);
+
+err_off:
+ ar->state = ATH10K_STATE_OFF;
+
+err:
mutex_unlock(&ar->conf_mutex);
return ret;
}
--
1.8.5.3


2014-05-09 12:29:57

by Michal Kazior

[permalink] [raw]
Subject: [PATCH 4/5] ath10k: protect wep tx key setup

All configuration sequences should be protected
with conf_mutex to avoid concurrent/conflicting
requests.

This should make sure that wep tx key setup is not
performed while hw is restarted (at least).

Signed-off-by: Michal Kazior <[email protected]>
---
drivers/net/wireless/ath/ath10k/mac.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index 40ce1e9..78b427d 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -1888,8 +1888,10 @@ static void ath10k_tx_wep_key_work(struct work_struct *work)
wep_key_work);
int ret, keyidx = arvif->def_wep_key_newidx;

+ mutex_lock(&arvif->ar->conf_mutex);
+
if (arvif->def_wep_key_idx == keyidx)
- return;
+ goto unlock;

ath10k_dbg(ATH10K_DBG_MAC, "mac vdev %d set keyidx %d\n",
arvif->vdev_id, keyidx);
@@ -1902,10 +1904,12 @@ static void ath10k_tx_wep_key_work(struct work_struct *work)
ath10k_warn("failed to update wep key index for vdev %d: %d\n",
arvif->vdev_id,
ret);
- return;
+ goto unlock;
}

arvif->def_wep_key_idx = keyidx;
+unlock:
+ mutex_unlock(&arvif->ar->conf_mutex);
}

static void ath10k_tx_h_update_wep_key(struct sk_buff *skb)
--
1.8.5.3


2014-05-09 12:30:47

by Michal Kazior

[permalink] [raw]
Subject: [PATCH 3/5] ath10k: drain tx before restarting hw

This makes sure no further tx requests are
submitted to HTT before driver teardown.

This should prevent invalid pointer/NULL
dereference on htt tx pool in ath10k_htt_tx() in
some cases of heavy traffic.

Reported-By: Ben Greear <[email protected]>
Signed-off-by: Michal Kazior <[email protected]>
---
drivers/net/wireless/ath/ath10k/mac.c | 30 ++++++++++++++++++++++++------
1 file changed, 24 insertions(+), 6 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index dbf0960..40ce1e9 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -2289,6 +2289,21 @@ static void ath10k_tx(struct ieee80211_hw *hw,
/*
* Initialize various parameters with default vaules.
*/
+static void ath10k_drain_tx(struct ath10k *ar)
+{
+ /* workers can hold conf_mutex -- avoid deadlock */
+ WARN_ON(lockdep_is_held(&ar->conf_mutex));
+
+ /* make sure rcu-protected mac80211 tx path itself is drained */
+ synchronize_net();
+
+ ath10k_offchan_tx_purge(ar);
+ ath10k_mgmt_over_wmi_tx_purge(ar);
+
+ cancel_work_sync(&ar->offchan_tx_work);
+ cancel_work_sync(&ar->wmi_mgmt_tx_work);
+}
+
void ath10k_halt(struct ath10k *ar)
{
struct ath10k_vif *arvif;
@@ -2304,8 +2319,6 @@ void ath10k_halt(struct ath10k *ar)

del_timer_sync(&ar->scan.timeout);
ath10k_reset_scan((unsigned long)ar);
- ath10k_offchan_tx_purge(ar);
- ath10k_mgmt_over_wmi_tx_purge(ar);
ath10k_peer_cleanup_all(ar);
ath10k_core_stop(ar);
ath10k_hif_power_down(ar);
@@ -2329,6 +2342,13 @@ static int ath10k_start(struct ieee80211_hw *hw)
struct ath10k *ar = hw->priv;
int ret = 0;

+ /*
+ * This makes sense only when restarting hw. It is harmless to call
+ * uncoditionally. This is necessary to make sure no HTT/WMI tx
+ * commands will be submitted while restarting.
+ */
+ ath10k_drain_tx(ar);
+
mutex_lock(&ar->conf_mutex);

switch (ar->state) {
@@ -2408,6 +2428,8 @@ static void ath10k_stop(struct ieee80211_hw *hw)
{
struct ath10k *ar = hw->priv;

+ ath10k_drain_tx(ar);
+
mutex_lock(&ar->conf_mutex);
if (ar->state != ATH10K_STATE_OFF) {
ath10k_halt(ar);
@@ -2415,10 +2437,6 @@ static void ath10k_stop(struct ieee80211_hw *hw)
}
mutex_unlock(&ar->conf_mutex);

- ath10k_mgmt_over_wmi_tx_purge(ar);
-
- cancel_work_sync(&ar->offchan_tx_work);
- cancel_work_sync(&ar->wmi_mgmt_tx_work);
cancel_work_sync(&ar->restart_work);
}

--
1.8.5.3


2014-05-23 08:27:07

by Kalle Valo

[permalink] [raw]
Subject: Re: [PATCH v2 3/5] ath10k: drain tx before restarting hw

Michal Kazior <[email protected]> writes:

> This makes sure no further tx requests are
> submitted to HTT before driver teardown.
>
> This should prevent invalid pointer/NULL
> dereference on htt tx pool in ath10k_htt_tx() in
> some cases of heavy traffic.
>
> Reported-By: Ben Greear <[email protected]>
> Signed-off-by: Michal Kazior <[email protected]>

[...]

> --- a/drivers/net/wireless/ath/ath10k/mac.c
> +++ b/drivers/net/wireless/ath/ath10k/mac.c
> @@ -2289,6 +2289,21 @@ static void ath10k_tx(struct ieee80211_hw *hw,
> /*
> * Initialize various parameters with default vaules.
> */
> +static void ath10k_drain_tx(struct ath10k *ar)
> +{
> + /* workers can hold conf_mutex -- avoid deadlock */
> + WARN_ON(lockdep_is_held(&ar->conf_mutex));

If CONFIG_LOCKDEP is disabled this will fail with:

drivers/net/wireless/ath/ath10k/mac.c:2301:2: error: implicit declaration of function 'lockdep_is_held' [-Werror=implicit-function-declaration]

I wasn't able to come up with any other way than adding '#ifdef
CONFIG_LOCKDEP' which is quite ugly. What should we do?

--
Kalle Valo

2014-05-14 19:17:52

by Kalle Valo

[permalink] [raw]
Subject: Re: [PATCH 4/5] ath10k: protect wep tx key setup

Michal Kazior <[email protected]> writes:

> All configuration sequences should be protected
> with conf_mutex to avoid concurrent/conflicting
> requests.
>
> This should make sure that wep tx key setup is not
> performed while hw is restarted (at least).
>
> Signed-off-by: Michal Kazior <[email protected]>

[...]

> @@ -1902,10 +1904,12 @@ static void ath10k_tx_wep_key_work(struct work_struct *work)
> ath10k_warn("failed to update wep key index for vdev %d: %d\n",
> arvif->vdev_id,
> ret);
> - return;
> + goto unlock;
> }
>
> arvif->def_wep_key_idx = keyidx;
> +unlock:
> + mutex_unlock(&arvif->ar->conf_mutex);
> }

Empty line before the label, please.

--
Kalle Valo

2014-05-15 08:49:09

by Michal Kazior

[permalink] [raw]
Subject: [PATCH v2 0/5] ath10k: recovery fixes 2014-05-09

Hi,

Here's is a bunch of fixes for recovery/hw
restart. This notably fixes a bug Ben has hit some
time ago.

I also have some improvements for recovery code as
well in a follow up patchset that I'll be sending
separately.

v2:
* minor code style adjustments
* extra comment
* sanity check in wep key worker


Michal Kazior (5):
ath10k: clean up start() callback
ath10k: perform hw restart lazily
ath10k: drain tx before restarting hw
ath10k: protect wep tx key setup
ath10k: dont configure bssid for ap mode

drivers/net/wireless/ath/ath10k/core.c | 5 +-
drivers/net/wireless/ath/ath10k/mac.c | 117 ++++++++++++++++++++++-----------
2 files changed, 83 insertions(+), 39 deletions(-)

--
1.8.5.3


2014-05-23 13:03:48

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH v2 3/5] ath10k: drain tx before restarting hw

On Fri, 2014-05-23 at 11:06 +0200, Michal Kazior wrote:

> There's nothing wrong with other thread holding it. Actually that's
> the reason for this very check.
>
> The point is to prevent ath10k_drain_tx() being called while caller
> (current thread) holds conf_mutex. If it were to hold conf_mutex then
> cancel_work_sync() can deadlock as both workers it tries to stop try
> to get a hold of the lock too.

That seems pointless - lockdep would warn you about that *anyway*
because cancel_work_sync() and friends have proper annotations. Try it
sometime :)

johannes


2014-05-23 10:31:08

by Kalle Valo

[permalink] [raw]
Subject: Re: [PATCH v2 3/5] ath10k: drain tx before restarting hw

Michal Kazior <[email protected]> writes:

>> Sure, but I still think it's a bit ugly. The right way to fix this would
>> be to add it to include/lockdep.h, instead of adding custom checks to a
>> driver.
>
> Good point. I wonder if it's generic enough to justify.

No idea. But no matter what it will take some time to get it accepted,
so we need to solve this in a faster way so that I can apply this patch.

>> But does this check even make sense? There's nothing preventing
>> to another thread to take lock just after the WARN_ON(), right?
>
> There's nothing wrong with other thread holding it. Actually that's
> the reason for this very check.
>
> The point is to prevent ath10k_drain_tx() being called while caller
> (current thread) holds conf_mutex. If it were to hold conf_mutex then
> cancel_work_sync() can deadlock as both workers it tries to stop try
> to get a hold of the lock too.

Ah, now I understand.

What if we just drop the WARN_ON() from this patch just so that I can
apply the patch and we add a proper code for checking the mutex in a
followup patch? Are you ok with that?


--
Kalle Valo

2014-05-26 09:40:30

by Kalle Valo

[permalink] [raw]
Subject: Re: [PATCH v2 3/5] ath10k: drain tx before restarting hw

Michal Kazior <[email protected]> writes:

> This makes sure no further tx requests are
> submitted to HTT before driver teardown.
>
> This should prevent invalid pointer/NULL
> dereference on htt tx pool in ath10k_htt_tx() in
> some cases of heavy traffic.
>
> Reported-By: Ben Greear <[email protected]>
> Signed-off-by: Michal Kazior <[email protected]>
> ---
> drivers/net/wireless/ath/ath10k/mac.c | 30 ++++++++++++++++++++++++------
> 1 file changed, 24 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
> index b5d75b8..577a3a5 100644
> --- a/drivers/net/wireless/ath/ath10k/mac.c
> +++ b/drivers/net/wireless/ath/ath10k/mac.c
> @@ -2289,6 +2289,21 @@ static void ath10k_tx(struct ieee80211_hw *hw,
> /*
> * Initialize various parameters with default vaules.
> */
> +static void ath10k_drain_tx(struct ath10k *ar)
> +{
> + /* workers can hold conf_mutex -- avoid deadlock */
> + WARN_ON(lockdep_is_held(&ar->conf_mutex));

Based on the discussion I modified the function to this now. Are you ok
with that?

/* Must not be called with conf_mutex held as workers can use that also. */
static void ath10k_drain_tx(struct ath10k *ar)
{
/* make sure rcu-protected mac80211 tx path itself is drained */
synchronize_net();

ath10k_offchan_tx_purge(ar);
ath10k_mgmt_over_wmi_tx_purge(ar);

cancel_work_sync(&ar->offchan_tx_work);
cancel_work_sync(&ar->wmi_mgmt_tx_work);
}

Full patch:

https://github.com/kvalo/ath/commit/e3c513cd03802e75802a6ce5efbadaa0dbf04b27

--
Kalle Valo

2014-05-15 13:18:47

by Kalle Valo

[permalink] [raw]
Subject: Re: [PATCH v2 3/5] ath10k: drain tx before restarting hw

Michal Kazior <[email protected]> writes:

> This makes sure no further tx requests are
> submitted to HTT before driver teardown.
>
> This should prevent invalid pointer/NULL
> dereference on htt tx pool in ath10k_htt_tx() in
> some cases of heavy traffic.
>
> Reported-By: Ben Greear <[email protected]>
> Signed-off-by: Michal Kazior <[email protected]>

For some reason I'm not receiving buildbot emails anymore, but I noticed
via the kbuild-all archives that this patch has a warning:

drivers/net/wireless/ath/ath10k/mac.c: In function 'ath10k_drain_tx':
>> drivers/net/wireless/ath/ath10k/mac.c:2442:2: error: implicit
declaration of function 'lockdep_is_held'
[-Werror=implicit-function-declaration]

https://lists.01.org/pipermail/kbuild-all/2014-May/004387.html

--
Kalle Valo

2014-05-14 19:07:57

by Kalle Valo

[permalink] [raw]
Subject: Re: [PATCH 1/5] ath10k: clean up start() callback

Michal Kazior <[email protected]> writes:

> This fixes failpath when override AC pdev param
> setup fails and makes other pdev params setting
> fail as well.
>
> Signed-off-by: Michal Kazior <[email protected]>

[...]

> ar->num_started_vdevs = 0;
> ath10k_regd_update(ar);
> ret = 0;
> + goto out;
>
> -exit:
> +err_core_stop:
> + ath10k_core_stop(ar);
> +err_power_down:
> + ath10k_hif_power_down(ar);
> +err_off:
> + ar->state = ATH10K_STATE_OFF;
> +out:
> mutex_unlock(&ar->conf_mutex);
> return ret;
> }

Having err_ labels on the "main" code path is not good, the error
handling code should be clearly separated to make it easier to read. I
think you should use the same style as pci.c uses, for example something
like this:

mutex_unlock(&ar->conf_mutex);
return 0;

err_core_stop:
ath10k_core_stop(ar);

err_power_down:
ath10k_hif_power_down(ar);

err_off:
ar->state = ATH10K_STATE_OFF;

mutex_unlock(&ar->conf_mutex);

return ret;

I know this has mutex_unlock() so it's not good either, but I think it's
still better. Other option might be to do like this:

ret = 0;

out:
mutex_unlock(&ar->conf_mutex);
return ret;

err_core_stop:
ath10k_core_stop(ar);

err_power_down:
ath10k_hif_power_down(ar);

err_off:
ar->state = ATH10K_STATE_OFF;

goto out;

But not sure if it's any better. More ideas?

Oh yeah, please also add an empty line before each label.

--
Kalle Valo

2014-05-27 09:29:27

by Kalle Valo

[permalink] [raw]
Subject: Re: [PATCH v2 0/5] ath10k: recovery fixes 2014-05-09

Michal Kazior <[email protected]> writes:

> Hi,
>
> Here's is a bunch of fixes for recovery/hw
> restart. This notably fixes a bug Ben has hit some
> time ago.
>
> I also have some improvements for recovery code as
> well in a follow up patchset that I'll be sending
> separately.
>
> v2:
> * minor code style adjustments
> * extra comment
> * sanity check in wep key worker
>
>
> Michal Kazior (5):
> ath10k: clean up start() callback
> ath10k: perform hw restart lazily
> ath10k: drain tx before restarting hw
> ath10k: protect wep tx key setup
> ath10k: dont configure bssid for ap mode

Thanks, all five patches applied.

--
Kalle Valo

2014-05-16 14:37:28

by Kalle Valo

[permalink] [raw]
Subject: Re: [PATCH v2 3/5] ath10k: drain tx before restarting hw

Kalle Valo <[email protected]> writes:

> Michal Kazior <[email protected]> writes:
>
>> This makes sure no further tx requests are
>> submitted to HTT before driver teardown.
>>
>> This should prevent invalid pointer/NULL
>> dereference on htt tx pool in ath10k_htt_tx() in
>> some cases of heavy traffic.
>>
>> Reported-By: Ben Greear <[email protected]>
>> Signed-off-by: Michal Kazior <[email protected]>
>
> For some reason I'm not receiving buildbot emails anymore, but I noticed
> via the kbuild-all archives that this patch has a warning:
>
> drivers/net/wireless/ath/ath10k/mac.c: In function 'ath10k_drain_tx':
>>> drivers/net/wireless/ath/ath10k/mac.c:2442:2: error: implicit
> declaration of function 'lockdep_is_held'
> [-Werror=implicit-function-declaration]
>
> https://lists.01.org/pipermail/kbuild-all/2014-May/004387.html

I fixed it with this:

--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -19,6 +19,7 @@

#include <net/mac80211.h>
#include <linux/etherdevice.h>
+#include <linux/lockdep.h>

#include "hif.h"
#include "core.h"

Let's see if buildbot is happier now.

--
Kalle Valo

2014-05-23 09:06:27

by Michal Kazior

[permalink] [raw]
Subject: Re: [PATCH v2 3/5] ath10k: drain tx before restarting hw

On 23 May 2014 10:45, Kalle Valo <[email protected]> wrote:
> Michal Kazior <[email protected]> writes:
>
>> On 23 May 2014 10:27, Kalle Valo <[email protected]> wrote:
>>> Michal Kazior <[email protected]> writes:
>>>
>>>> +static void ath10k_drain_tx(struct ath10k *ar)
>>>> +{
>>>> + /* workers can hold conf_mutex -- avoid deadlock */
>>>> + WARN_ON(lockdep_is_held(&ar->conf_mutex));
>>>
>>> If CONFIG_LOCKDEP is disabled this will fail with:
>>>
>>> drivers/net/wireless/ath/ath10k/mac.c:2301:2: error: implicit declaration of function 'lockdep_is_held' [-Werror=implicit-function-declaration]
>>>
>>> I wasn't able to come up with any other way than adding '#ifdef
>>> CONFIG_LOCKDEP' which is quite ugly. What should we do?
>>
>> I guess we could define lockdep_assert_not_held() in debug.h?
>
> Sure, but I still think it's a bit ugly. The right way to fix this would
> be to add it to include/lockdep.h, instead of adding custom checks to a
> driver.

Good point. I wonder if it's generic enough to justify.


> But does this check even make sense? There's nothing preventing
> to another thread to take lock just after the WARN_ON(), right?

There's nothing wrong with other thread holding it. Actually that's
the reason for this very check.

The point is to prevent ath10k_drain_tx() being called while caller
(current thread) holds conf_mutex. If it were to hold conf_mutex then
cancel_work_sync() can deadlock as both workers it tries to stop try
to get a hold of the lock too.


> Oh, and if we add it to debug.h we should name ath10k_assert_not_held().
> We should not use lockdep's namespace for anything inside ath10k.

Good point.


Michał

2014-05-23 08:38:23

by Michal Kazior

[permalink] [raw]
Subject: Re: [PATCH v2 3/5] ath10k: drain tx before restarting hw

On 23 May 2014 10:27, Kalle Valo <[email protected]> wrote:
> Michal Kazior <[email protected]> writes:
>
>> This makes sure no further tx requests are
>> submitted to HTT before driver teardown.
>>
>> This should prevent invalid pointer/NULL
>> dereference on htt tx pool in ath10k_htt_tx() in
>> some cases of heavy traffic.
>>
>> Reported-By: Ben Greear <[email protected]>
>> Signed-off-by: Michal Kazior <[email protected]>
>
> [...]
>
>> --- a/drivers/net/wireless/ath/ath10k/mac.c
>> +++ b/drivers/net/wireless/ath/ath10k/mac.c
>> @@ -2289,6 +2289,21 @@ static void ath10k_tx(struct ieee80211_hw *hw,
>> /*
>> * Initialize various parameters with default vaules.
>> */
>> +static void ath10k_drain_tx(struct ath10k *ar)
>> +{
>> + /* workers can hold conf_mutex -- avoid deadlock */
>> + WARN_ON(lockdep_is_held(&ar->conf_mutex));
>
> If CONFIG_LOCKDEP is disabled this will fail with:
>
> drivers/net/wireless/ath/ath10k/mac.c:2301:2: error: implicit declaration of function 'lockdep_is_held' [-Werror=implicit-function-declaration]
>
> I wasn't able to come up with any other way than adding '#ifdef
> CONFIG_LOCKDEP' which is quite ugly. What should we do?

I guess we could define lockdep_assert_not_held() in debug.h?


Michał

2014-05-15 08:49:11

by Michal Kazior

[permalink] [raw]
Subject: [PATCH v2 2/5] ath10k: perform hw restart lazily

This reduces risk of races and prepares for more
hw restart fixes.

It also makes sense to perform teardown after
mac80211 starts its restart routine as it
guarantees it has stopped itself by then
(including tx queues).

Signed-off-by: Michal Kazior <[email protected]>
---
drivers/net/wireless/ath/ath10k/core.c | 5 ++++-
drivers/net/wireless/ath/ath10k/mac.c | 34 ++++++++++++++++------------------
2 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/core.c b/drivers/net/wireless/ath/ath10k/core.c
index 75b3dfb..7034c72 100644
--- a/drivers/net/wireless/ath/ath10k/core.c
+++ b/drivers/net/wireless/ath/ath10k/core.c
@@ -681,7 +681,8 @@ static void ath10k_core_restart(struct work_struct *work)
switch (ar->state) {
case ATH10K_STATE_ON:
ar->state = ATH10K_STATE_RESTARTING;
- ath10k_halt(ar);
+ del_timer_sync(&ar->scan.timeout);
+ ath10k_reset_scan((unsigned long)ar);
ieee80211_restart_hw(ar->hw);
break;
case ATH10K_STATE_OFF:
@@ -690,6 +691,8 @@ static void ath10k_core_restart(struct work_struct *work)
ath10k_warn("cannot restart a device that hasn't been started\n");
break;
case ATH10K_STATE_RESTARTING:
+ /* hw restart might be requested from multiple places */
+ break;
case ATH10K_STATE_RESTARTED:
ar->state = ATH10K_STATE_WEDGED;
/* fall through */
diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index 854e183..b5d75b8 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -2303,6 +2303,7 @@ void ath10k_halt(struct ath10k *ar)
}

del_timer_sync(&ar->scan.timeout);
+ ath10k_reset_scan((unsigned long)ar);
ath10k_offchan_tx_purge(ar);
ath10k_mgmt_over_wmi_tx_purge(ar);
ath10k_peer_cleanup_all(ar);
@@ -2310,12 +2311,6 @@ void ath10k_halt(struct ath10k *ar)
ath10k_hif_power_down(ar);

spin_lock_bh(&ar->data_lock);
- if (ar->scan.in_progress) {
- del_timer(&ar->scan.timeout);
- ar->scan.in_progress = false;
- ieee80211_scan_completed(ar->hw, true);
- }
-
list_for_each_entry(arvif, &ar->arvifs, list) {
if (!arvif->beacon)
continue;
@@ -2336,8 +2331,18 @@ static int ath10k_start(struct ieee80211_hw *hw)

mutex_lock(&ar->conf_mutex);

- if (ar->state != ATH10K_STATE_OFF &&
- ar->state != ATH10K_STATE_RESTARTING) {
+ switch (ar->state) {
+ case ATH10K_STATE_OFF:
+ ar->state = ATH10K_STATE_ON;
+ break;
+ case ATH10K_STATE_RESTARTING:
+ ath10k_halt(ar);
+ ar->state = ATH10K_STATE_RESTARTED;
+ break;
+ case ATH10K_STATE_ON:
+ case ATH10K_STATE_RESTARTED:
+ case ATH10K_STATE_WEDGED:
+ WARN_ON(1);
ret = -EINVAL;
goto err;
}
@@ -2354,11 +2359,6 @@ static int ath10k_start(struct ieee80211_hw *hw)
goto err_power_down;
}

- if (ar->state == ATH10K_STATE_OFF)
- ar->state = ATH10K_STATE_ON;
- else if (ar->state == ATH10K_STATE_RESTARTING)
- ar->state = ATH10K_STATE_RESTARTED;
-
ret = ath10k_wmi_pdev_set_param(ar, ar->wmi.pdev_param->pmf_qos, 1);
if (ret) {
ath10k_warn("failed to enable PMF QOS: %d\n", ret);
@@ -2413,12 +2413,10 @@ static void ath10k_stop(struct ieee80211_hw *hw)
struct ath10k *ar = hw->priv;

mutex_lock(&ar->conf_mutex);
- if (ar->state == ATH10K_STATE_ON ||
- ar->state == ATH10K_STATE_RESTARTED ||
- ar->state == ATH10K_STATE_WEDGED)
+ if (ar->state != ATH10K_STATE_OFF) {
ath10k_halt(ar);
-
- ar->state = ATH10K_STATE_OFF;
+ ar->state = ATH10K_STATE_OFF;
+ }
mutex_unlock(&ar->conf_mutex);

ath10k_mgmt_over_wmi_tx_purge(ar);
--
1.8.5.3


2014-05-15 08:49:14

by Michal Kazior

[permalink] [raw]
Subject: [PATCH v2 4/5] ath10k: protect wep tx key setup

All configuration sequences should be protected
with conf_mutex to avoid concurrent/conflicting
requests.

This should make sure that wep tx key setup is not
performed while hw is restarted (at least).

Signed-off-by: Michal Kazior <[email protected]>
---
v2:
* skip wep update if state != ON [Kalle]
* add empty space before goto label [Kalle]

drivers/net/wireless/ath/ath10k/mac.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index 577a3a5..e474eca 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -1888,8 +1888,13 @@ static void ath10k_tx_wep_key_work(struct work_struct *work)
wep_key_work);
int ret, keyidx = arvif->def_wep_key_newidx;

+ mutex_lock(&arvif->ar->conf_mutex);
+
+ if (arvif->ar->state != ATH10K_STATE_ON)
+ goto unlock;
+
if (arvif->def_wep_key_idx == keyidx)
- return;
+ goto unlock;

ath10k_dbg(ATH10K_DBG_MAC, "mac vdev %d set keyidx %d\n",
arvif->vdev_id, keyidx);
@@ -1902,10 +1907,13 @@ static void ath10k_tx_wep_key_work(struct work_struct *work)
ath10k_warn("failed to update wep key index for vdev %d: %d\n",
arvif->vdev_id,
ret);
- return;
+ goto unlock;
}

arvif->def_wep_key_idx = keyidx;
+
+unlock:
+ mutex_unlock(&arvif->ar->conf_mutex);
}

static void ath10k_tx_h_update_wep_key(struct sk_buff *skb)
--
1.8.5.3


2014-05-23 10:37:33

by Michal Kazior

[permalink] [raw]
Subject: Re: [PATCH v2 3/5] ath10k: drain tx before restarting hw

On 23 May 2014 12:31, Kalle Valo <[email protected]> wrote:
> Michal Kazior <[email protected]> writes:
>
>>> Sure, but I still think it's a bit ugly. The right way to fix this would
>>> be to add it to include/lockdep.h, instead of adding custom checks to a
>>> driver.
>>
>> Good point. I wonder if it's generic enough to justify.
>
> No idea. But no matter what it will take some time to get it accepted,
> so we need to solve this in a faster way so that I can apply this patch.
>
>>> But does this check even make sense? There's nothing preventing
>>> to another thread to take lock just after the WARN_ON(), right?
>>
>> There's nothing wrong with other thread holding it. Actually that's
>> the reason for this very check.
>>
>> The point is to prevent ath10k_drain_tx() being called while caller
>> (current thread) holds conf_mutex. If it were to hold conf_mutex then
>> cancel_work_sync() can deadlock as both workers it tries to stop try
>> to get a hold of the lock too.
>
> Ah, now I understand.
>
> What if we just drop the WARN_ON() from this patch just so that I can
> apply the patch and we add a proper code for checking the mutex in a
> followup patch? Are you ok with that?

I'm okay with this as long as we at least have a comment stating that
conf_mutex must not be held while calling the function to avoid
deadlocks.


Michał