2014-05-07 13:07:12

by Michal Kazior

[permalink] [raw]
Subject: [PATCH 0/2] ath10k: 2014-05-07 fixes

Hi,

These 2 patches address kernel panics reported by
Avery recently. One of them was most likely
reported by Ben some time ago already.

I've never seen these panics myself. It'd be great
if you guys could give it a try and (assuming
you're capable of reproducing) see if it actually
helps.


Michal Kazior (2):
ath10k: fix htt rx ring clean up
ath10k: fix handling of wierd MSDU chaining cases

drivers/net/wireless/ath/ath10k/htt_rx.c | 56 +++++++++++++++++++++++---------
1 file changed, 40 insertions(+), 16 deletions(-)

--
1.8.5.3



2014-05-14 13:43:16

by Kalle Valo

[permalink] [raw]
Subject: Re: [PATCH 0/2] ath10k: 2014-05-07 fixes

Michal Kazior <[email protected]> writes:

> These 2 patches address kernel panics reported by
> Avery recently. One of them was most likely
> reported by Ben some time ago already.
>
> I've never seen these panics myself. It'd be great
> if you guys could give it a try and (assuming
> you're capable of reproducing) see if it actually
> helps.
>
>
> Michal Kazior (2):
> ath10k: fix htt rx ring clean up
> ath10k: fix handling of wierd MSDU chaining cases

Thanks, applied.

I didn't see any reports if these help or not, but the patches looked
good and I applied them anyway.

--
Kalle Valo

2014-05-07 13:07:14

by Michal Kazior

[permalink] [raw]
Subject: [PATCH 2/2] ath10k: fix handling of wierd MSDU chaining cases

Apparently firmware can sometimes report a
sequence with the first rx descriptor saying it's
not the last MSDU. In that case msdu_chaining
value could be overwritten saying it's not a
chained MSDU. This in turn led to skb_push panic
as the frame could be treated as an A-MSDU instead
of a chained MSDU.

Reported-By: Avery Pennarun <[email protected]>
Signed-off-by: Michal Kazior <[email protected]>
---
drivers/net/wireless/ath/ath10k/htt_rx.c | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath10k/htt_rx.c b/drivers/net/wireless/ath/ath10k/htt_rx.c
index db6c8af..ac6a5fe 100644
--- a/drivers/net/wireless/ath/ath10k/htt_rx.c
+++ b/drivers/net/wireless/ath/ath10k/htt_rx.c
@@ -312,6 +312,7 @@ static int ath10k_htt_rx_amsdu_pop(struct ath10k_htt *htt,
int msdu_len, msdu_chaining = 0;
struct sk_buff *msdu;
struct htt_rx_desc *rx_desc;
+ bool corrupted = false;

lockdep_assert_held(&htt->rx_ring.lock);

@@ -405,7 +406,6 @@ static int ath10k_htt_rx_amsdu_pop(struct ath10k_htt *htt,
msdu_len = MS(__le32_to_cpu(rx_desc->msdu_start.info0),
RX_MSDU_START_INFO0_MSDU_LENGTH);
msdu_chained = rx_desc->frag_info.ring2_more_count;
- msdu_chaining = msdu_chained;

if (msdu_len_invalid)
msdu_len = 0;
@@ -433,11 +433,15 @@ static int ath10k_htt_rx_amsdu_pop(struct ath10k_htt *htt,

msdu->next = next;
msdu = next;
+ msdu_chaining = 1;
}

last_msdu = __le32_to_cpu(rx_desc->msdu_end.info0) &
RX_MSDU_END_INFO0_LAST_MSDU;

+ if (msdu_chaining && !last_msdu)
+ corrupted = true;
+
if (last_msdu) {
msdu->next = NULL;
break;
@@ -453,6 +457,20 @@ static int ath10k_htt_rx_amsdu_pop(struct ath10k_htt *htt,
msdu_chaining = -1;

/*
+ * Apparently FW sometimes reports weird chained MSDU sequences with
+ * more than one rx descriptor. This seems like a bug but needs more
+ * analyzing. For the time being fix it by dropping such sequences to
+ * avoid blowing up the host system.
+ */
+ if (corrupted) {
+ ath10k_warn("failed to pop chained msdus, dropping\n");
+ ath10k_htt_rx_free_msdu_chain(*head_msdu);
+ *head_msdu = NULL;
+ *tail_msdu = NULL;
+ msdu_chaining = -EINVAL;
+ }
+
+ /*
* Don't refill the ring yet.
*
* First, the elements popped here are still in use - it is not
--
1.8.5.3


2014-05-07 13:36:45

by Ben Greear

[permalink] [raw]
Subject: Re: [PATCH 0/2] ath10k: 2014-05-07 fixes

On 05/07/2014 05:33 AM, Michal Kazior wrote:
> Hi,
>
> These 2 patches address kernel panics reported by
> Avery recently. One of them was most likely
> reported by Ben some time ago already.
>
> I've never seen these panics myself. It'd be great
> if you guys could give it a try and (assuming
> you're capable of reproducing) see if it actually
> helps.

I'll add these to my tree today and we'll run them through
our test harness.

I have not been able to repeat any crashes like this recently...but
I'll let you know if I see anything strange after applying
the patches.

Thanks,
Ben

--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com


2014-05-07 13:07:13

by Michal Kazior

[permalink] [raw]
Subject: [PATCH 1/2] ath10k: fix htt rx ring clean up

msdu_payId was read before txrx tasklet was killed
so it was possible to end up using an invalid
sk_buff pointer leading to a panic.

Make sure to sanitize rx ring sk_buff pointers and
make the clean up go through all possible entries
and not rely on coherent-DMA mapped u32 index
which could be (in theory) corrupted by the device
as well.

Reported-By: Avery Pennarun <[email protected]>
Reported-By: Ben Greear <[email protected]>
Signed-off-by: Michal Kazior <[email protected]>
---
drivers/net/wireless/ath/ath10k/htt_rx.c | 36 +++++++++++++++++++-------------
1 file changed, 21 insertions(+), 15 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/htt_rx.c b/drivers/net/wireless/ath/ath10k/htt_rx.c
index f85a3cf..db6c8af 100644
--- a/drivers/net/wireless/ath/ath10k/htt_rx.c
+++ b/drivers/net/wireless/ath/ath10k/htt_rx.c
@@ -225,10 +225,26 @@ static void ath10k_htt_rx_ring_refill_retry(unsigned long arg)
ath10k_htt_rx_msdu_buff_replenish(htt);
}

-void ath10k_htt_rx_detach(struct ath10k_htt *htt)
+static void ath10k_htt_rx_ring_clean_up(struct ath10k_htt *htt)
{
- int sw_rd_idx = htt->rx_ring.sw_rd_idx.msdu_payld;
+ struct sk_buff *skb;
+ int i;
+
+ for (i = 0; i < htt->rx_ring.size; i++) {
+ skb = htt->rx_ring.netbufs_ring[i];
+ if (!skb)
+ continue;
+
+ dma_unmap_single(htt->ar->dev, ATH10K_SKB_CB(skb)->paddr,
+ skb->len + skb_tailroom(skb),
+ DMA_FROM_DEVICE);
+ dev_kfree_skb_any(skb);
+ htt->rx_ring.netbufs_ring[i] = NULL;
+ }
+}

+void ath10k_htt_rx_detach(struct ath10k_htt *htt)
+{
del_timer_sync(&htt->rx_ring.refill_retry_timer);
tasklet_kill(&htt->rx_replenish_task);
tasklet_kill(&htt->txrx_compl_task);
@@ -236,18 +252,7 @@ void ath10k_htt_rx_detach(struct ath10k_htt *htt)
skb_queue_purge(&htt->tx_compl_q);
skb_queue_purge(&htt->rx_compl_q);

- while (sw_rd_idx != __le32_to_cpu(*(htt->rx_ring.alloc_idx.vaddr))) {
- struct sk_buff *skb =
- htt->rx_ring.netbufs_ring[sw_rd_idx];
- struct ath10k_skb_cb *cb = ATH10K_SKB_CB(skb);
-
- dma_unmap_single(htt->ar->dev, cb->paddr,
- skb->len + skb_tailroom(skb),
- DMA_FROM_DEVICE);
- dev_kfree_skb_any(htt->rx_ring.netbufs_ring[sw_rd_idx]);
- sw_rd_idx++;
- sw_rd_idx &= htt->rx_ring.size_mask;
- }
+ ath10k_htt_rx_ring_clean_up(htt);

dma_free_coherent(htt->ar->dev,
(htt->rx_ring.size *
@@ -277,6 +282,7 @@ static inline struct sk_buff *ath10k_htt_rx_netbuf_pop(struct ath10k_htt *htt)

idx = htt->rx_ring.sw_rd_idx.msdu_payld;
msdu = htt->rx_ring.netbufs_ring[idx];
+ htt->rx_ring.netbufs_ring[idx] = NULL;

idx++;
idx &= htt->rx_ring.size_mask;
@@ -494,7 +500,7 @@ int ath10k_htt_rx_attach(struct ath10k_htt *htt)
htt->rx_ring.fill_level = ath10k_htt_rx_ring_fill_level(htt);

htt->rx_ring.netbufs_ring =
- kmalloc(htt->rx_ring.size * sizeof(struct sk_buff *),
+ kzalloc(htt->rx_ring.size * sizeof(struct sk_buff *),
GFP_KERNEL);
if (!htt->rx_ring.netbufs_ring)
goto err_netbuf;
--
1.8.5.3