2016-03-31 20:59:49

by Ben Greear

Subject: [PATCH 1/3] ath10k: Ensure txrx-compl-task is stopped when cleaning htt-tx.

From: Ben Greear <[email protected]>

Otherwise, the txrx-compl-task may access some bad memory?

Signed-off-by: Ben Greear <[email protected]>
---
drivers/net/wireless/ath/ath10k/htt_tx.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/net/wireless/ath/ath10k/htt_tx.c b/drivers/net/wireless/ath/ath10k/htt_tx.c
index 07b960e..58e88d3 100644
--- a/drivers/net/wireless/ath/ath10k/htt_tx.c
+++ b/drivers/net/wireless/ath/ath10k/htt_tx.c
@@ -376,6 +376,8 @@ void ath10k_htt_tx_free(struct ath10k_htt *htt)
{
int size;

+ tasklet_kill(&htt->txrx_compl_task);
+
idr_for_each(&htt->pending_tx, ath10k_htt_tx_clean_up_pending, htt->ar);
idr_destroy(&htt->pending_tx);

--
2.4.3



2016-03-31 20:59:50

by Ben Greear

Subject: [PATCH 2/3] ath10k: Ensure peer_map references are cleaned up.

From: Ben Greear <[email protected]>

While debugging OS crashes due to firmware crashes, I enabled
kasan, and it noticed that peer objects were being used-after-freed.

Looks like there are two places we could be leaving stale references
in the peer-map, so clean that up.

Signed-off-by: Ben Greear <[email protected]>
---
drivers/net/wireless/ath/ath10k/mac.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)

diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index 0a81ca2..07b155d 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -794,6 +794,7 @@ static void ath10k_peer_cleanup(struct ath10k *ar, u32 vdev_id)
{
struct ath10k_peer *peer, *tmp;
int peer_id;
+ int i;

lockdep_assert_held(&ar->conf_mutex);

@@ -812,6 +813,15 @@ static void ath10k_peer_cleanup(struct ath10k *ar, u32 vdev_id)
ar->peer_map[peer_id] = NULL;
}

+ /* Double check that peer is properly un-referenced from the peer_map */
+ for (i = 0; i < ARRAY_SIZE(ar->peer_map); i++) {
+ if (ar->peer_map[i] == peer) {
+ ath10k_warn(ar, "ERROR: Faile to properly clean up peer: %p %pM, idx: %d, will fix.\n",
+ peer, peer->addr, i);
+ ar->peer_map[i] = NULL;
+ }
+ }
+
list_del(&peer->list);
kfree(peer);
ar->num_peers--;
@@ -840,6 +850,7 @@ void ath10k_dump_peer_info(struct ath10k *ar)
static void ath10k_peer_cleanup_all(struct ath10k *ar)
{
struct ath10k_peer *peer, *tmp;
+ int i;

lockdep_assert_held(&ar->conf_mutex);

@@ -850,6 +861,11 @@ static void ath10k_peer_cleanup_all(struct ath10k *ar)
list_del(&peer->list);
kfree(peer);
}
+
+ /* Clean up peer-map */
+ for (i = 0; i < ARRAY_SIZE(ar->peer_map); i++)
+ ar->peer_map[i] = NULL;
+
spin_unlock_bh(&ar->data_lock);

ar->num_peers = 0;
--
2.4.3


2016-03-31 20:59:55

by Ben Greear

Subject: [PATCH 3/3] ath10k: Add BUG_ON if we over-write peer-map pointer.

From: Ben Greear <[email protected]>

Not sure this can happen, but seems like a reasonable sanity
check.

Signed-off-by: Ben Greear <[email protected]>
---
drivers/net/wireless/ath/ath10k/txrx.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/net/wireless/ath/ath10k/txrx.c b/drivers/net/wireless/ath/ath10k/txrx.c
index 94c27f6..172b1d6 100644
--- a/drivers/net/wireless/ath/ath10k/txrx.c
+++ b/drivers/net/wireless/ath/ath10k/txrx.c
@@ -309,6 +309,7 @@ void ath10k_peer_map_event(struct ath10k_htt *htt,
ath10k_warn(ar, /*ATH10K_DBG_HTT,*/ "htt peer map vdev %d peer %pM id %d\n",
ev->vdev_id, ev->addr, ev->peer_id);

+ BUG_ON(ar->peer_map[ev->peer_id] && (ar->peer_map[ev->peer_id] != peer));
ar->peer_map[ev->peer_id] = peer;
set_bit(ev->peer_id, peer->peer_ids);
exit:
--
2.4.3


2016-04-01 20:36:10

by Ben Greear

Subject: Re: [PATCH 1/3] ath10k: Ensure txrx-compl-task is stopped when cleaning htt-tx.

On 03/31/2016 11:12 PM, Michal Kazior wrote:
> On 31 March 2016 at 22:59, <[email protected]> wrote:
>> From: Ben Greear <[email protected]>
>>
>> Otherwise, the txrx-compl-task may access some bad memory?
>>
>> Signed-off-by: Ben Greear <[email protected]>
>> ---
>> drivers/net/wireless/ath/ath10k/htt_tx.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/net/wireless/ath/ath10k/htt_tx.c b/drivers/net/wireless/ath/ath10k/htt_tx.c
>> index 07b960e..58e88d3 100644
>> --- a/drivers/net/wireless/ath/ath10k/htt_tx.c
>> +++ b/drivers/net/wireless/ath/ath10k/htt_tx.c
>> @@ -376,6 +376,8 @@ void ath10k_htt_tx_free(struct ath10k_htt *htt)
>> {
>> int size;
>>
>> + tasklet_kill(&htt->txrx_compl_task);
>> +
>
> I think, instead, the ordering of ath10k_htt_tx_free() and
> ath10k_htt_rx_free() should be reversed. It's already in the "correct"
> order in ath10k_core_stop() on error path.

That seems a bit too subtle for my taste, but either way
is fine with me. You want to cook up a patch?

Thanks,
Ben

--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com


2016-04-01 14:04:27

by Ben Greear

Subject: Re: [PATCH 3/3] ath10k: Add BUG_ON if we over-write peer-map pointer.



On 03/31/2016 11:09 PM, Michal Kazior wrote:
> On 31 March 2016 at 22:59, <[email protected]> wrote:
>> From: Ben Greear <[email protected]>
>>
>> Not sure this can happen, but seems like a reasonable sanity
>> check.
>>
>> Signed-off-by: Ben Greear <[email protected]>
>> ---
>> drivers/net/wireless/ath/ath10k/txrx.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/net/wireless/ath/ath10k/txrx.c b/drivers/net/wireless/ath/ath10k/txrx.c
>> index 94c27f6..172b1d6 100644
>> --- a/drivers/net/wireless/ath/ath10k/txrx.c
>> +++ b/drivers/net/wireless/ath/ath10k/txrx.c
>> @@ -309,6 +309,7 @@ void ath10k_peer_map_event(struct ath10k_htt *htt,
>> ath10k_warn(ar, /*ATH10K_DBG_HTT,*/ "htt peer map vdev %d peer %pM id %d\n",
>> ev->vdev_id, ev->addr, ev->peer_id);
>>
>> + BUG_ON(ar->peer_map[ev->peer_id] && (ar->peer_map[ev->peer_id] != peer));
>
> Does this really need to be a BUG_ON?

It means we have memory or logic corruption, or maybe we are out of sync with the
firmware, so I think it should be very visible, at least for a while. I haven't
hit it, so I'm not sure it can happen anyway.
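
For readers weighing the two positions, here is a minimal user-space sketch of a non-fatal variant of the same check: report the conflicting overwrite loudly, then carry on. The types, `PEER_MAP_SIZE`, and `peer_map_set()` are hypothetical stand-ins and not ath10k code; in the kernel the report would presumably be a `WARN_ON()` rather than `fprintf()`, which is an assumption about the alternative being hinted at, not something stated in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define PEER_MAP_SIZE 8 /* hypothetical size, for illustration only */

struct peer { int id; }; /* stand-in for struct ath10k_peer */

/*
 * Store peer at map[idx]. Returns true if a different, non-NULL entry
 * was already there, i.e. the condition the proposed BUG_ON() tests.
 */
static bool peer_map_set(struct peer **map, size_t idx, struct peer *peer)
{
	bool conflict;

	assert(map && idx < PEER_MAP_SIZE);

	conflict = map[idx] && map[idx] != peer;
	if (conflict)
		fprintf(stderr, "peer_map[%zu] overwritten: %p -> %p\n",
			idx, (void *)map[idx], (void *)peer);

	map[idx] = peer;
	return conflict;
}
```

Re-mapping the same peer to the same id is not treated as a conflict here, matching the condition in the patch.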

Thanks,
Ben

--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com

2016-04-01 06:18:23

by Michal Kazior

Subject: Re: [PATCH 2/3] ath10k: Ensure peer_map references are cleaned up.

On 31 March 2016 at 22:59, <[email protected]> wrote:
[...]
> @@ -812,6 +813,15 @@ static void ath10k_peer_cleanup(struct ath10k *ar, u32 vdev_id)
> ar->peer_map[peer_id] = NULL;
> }
>
> + /* Double check that peer is properly un-referenced from the peer_map */
> + for (i = 0; i < ARRAY_SIZE(ar->peer_map); i++) {
> + if (ar->peer_map[i] == peer) {
> + ath10k_warn(ar, "ERROR: Faile to properly clean up peer: %p %pM, idx: %d, will fix.\n",
> + peer, peer->addr, i);

The message doesn't fit the style and has typos. I think it would be more
in line with the preceding warning to use:
"removing stale peer_map entry for %pM (ptr %p idx %d)"


[...]
> @@ -850,6 +861,11 @@ static void ath10k_peer_cleanup_all(struct ath10k *ar)
> list_del(&peer->list);
> kfree(peer);
> }
> +
> + /* Clean up peer-map */

No need for the comment.

> + for (i = 0; i < ARRAY_SIZE(ar->peer_map); i++)
> + ar->peer_map[i] = NULL;

Could use memset() as well?
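
A minimal user-space sketch of the two forms side by side, assuming `peer_map` is a fixed-size array member (so `sizeof` covers every slot) and that a null pointer is all-bits-zero, which holds on the platforms Linux supports. The struct names and `PEER_MAP_SIZE` are hypothetical stand-ins, not the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PEER_MAP_SIZE 8 /* hypothetical size, for illustration only */

struct peer { int id; }; /* stand-in for struct ath10k_peer */

struct dev { /* stand-in for struct ath10k */
	struct peer *peer_map[PEER_MAP_SIZE];
};

/* Loop form, as written in the patch. */
static void peer_map_clear_loop(struct dev *ar)
{
	size_t i;

	assert(ar);
	for (i = 0; i < sizeof(ar->peer_map) / sizeof(ar->peer_map[0]); i++)
		ar->peer_map[i] = NULL;
}

/* memset form: one call zeroes the whole array member. */
static void peer_map_clear_memset(struct dev *ar)
{
	assert(ar);
	memset(ar->peer_map, 0, sizeof(ar->peer_map));
}

/* Returns 1 if every slot is NULL, 0 otherwise. */
static int peer_map_is_empty(const struct dev *ar)
{
	size_t i;

	for (i = 0; i < PEER_MAP_SIZE; i++)
		if (ar->peer_map[i])
			return 0;
	return 1;
}
```

Note that the `sizeof` trick only works because `peer_map` is an array; if it were a pointer to allocated memory, the length would have to be passed explicitly.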


Anyway, thanks for catching these! :)


Michał

2016-04-01 06:09:40

by Michal Kazior

Subject: Re: [PATCH 3/3] ath10k: Add BUG_ON if we over-write peer-map pointer.

On 31 March 2016 at 22:59, <[email protected]> wrote:
> From: Ben Greear <[email protected]>
>
> Not sure this can happen, but seems like a reasonable sanity
> check.
>
> Signed-off-by: Ben Greear <[email protected]>
> ---
> drivers/net/wireless/ath/ath10k/txrx.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/wireless/ath/ath10k/txrx.c b/drivers/net/wireless/ath/ath10k/txrx.c
> index 94c27f6..172b1d6 100644
> --- a/drivers/net/wireless/ath/ath10k/txrx.c
> +++ b/drivers/net/wireless/ath/ath10k/txrx.c
> @@ -309,6 +309,7 @@ void ath10k_peer_map_event(struct ath10k_htt *htt,
> ath10k_warn(ar, /*ATH10K_DBG_HTT,*/ "htt peer map vdev %d peer %pM id %d\n",
> ev->vdev_id, ev->addr, ev->peer_id);
>
> + BUG_ON(ar->peer_map[ev->peer_id] && (ar->peer_map[ev->peer_id] != peer));

Does this really need to be a BUG_ON?


Michał

2016-04-01 06:12:09

by Michal Kazior

Subject: Re: [PATCH 1/3] ath10k: Ensure txrx-compl-task is stopped when cleaning htt-tx.

On 31 March 2016 at 22:59, <[email protected]> wrote:
> From: Ben Greear <[email protected]>
>
> Otherwise, the txrx-compl-task may access some bad memory?
>
> Signed-off-by: Ben Greear <[email protected]>
> ---
> drivers/net/wireless/ath/ath10k/htt_tx.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/net/wireless/ath/ath10k/htt_tx.c b/drivers/net/wireless/ath/ath10k/htt_tx.c
> index 07b960e..58e88d3 100644
> --- a/drivers/net/wireless/ath/ath10k/htt_tx.c
> +++ b/drivers/net/wireless/ath/ath10k/htt_tx.c
> @@ -376,6 +376,8 @@ void ath10k_htt_tx_free(struct ath10k_htt *htt)
> {
> int size;
>
> + tasklet_kill(&htt->txrx_compl_task);
> +

I think, instead, the ordering of ath10k_htt_tx_free() and
ath10k_htt_rx_free() should be reversed. It's already in the "correct"
order in ath10k_core_stop() on error path.
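
To illustrate why the order matters, here is a small user-space model. It assumes, as the suggestion implies but the thread does not spell out, that the rx-side teardown is what stops the completion task. All names here (`struct htt`, `htt_rx_free()`, `htt_tx_free()`) are hypothetical stand-ins, and the "task" is just a flag rather than a real tasklet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct htt {
	bool task_armed; /* stand-in for a tasklet that may still run */
	int *pending_tx; /* stand-in for tx state the task would read */
};

static void htt_init(struct htt *h)
{
	h->pending_tx = calloc(4, sizeof(*h->pending_tx));
	assert(h->pending_tx);
	h->task_armed = true;
}

/* Stand-in for the rx-side teardown: assumed to stop the task. */
static void htt_rx_free(struct htt *h)
{
	h->task_armed = false;
}

/*
 * Stand-in for the tx-side teardown. Returns true if the task was
 * already stopped, false if it could still have run against the
 * state being freed.
 */
static bool htt_tx_free(struct htt *h)
{
	bool safe = !h->task_armed;

	free(h->pending_tx);
	h->pending_tx = NULL;
	return safe;
}
```

In this model, calling `htt_rx_free()` before `htt_tx_free()` is the safe order; the reverse frees the tx state while the task is still armed, which is the window the original patch closes with an explicit `tasklet_kill()`.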


Michał