2016-02-12 06:11:14

by Rajkumar Manoharan

Subject: [PATCH 1/2] ath10k: reduce rx_lock contention for htt rx indication

Received frame indications are queued into an skb list and later
processed by the txrx tasklet. This skb queue is protected by the htt rx
lock. Since the entire rx processing, up to delivering frames to
mac80211 and running the replenish tasks, happens under rx_lock
protection, queuing a newly received rx frame into that list might be
delayed on multicore systems. Optimize this by using the skb list lock
instead of the htt rx lock when accessing the rx completion queue.

Signed-off-by: Rajkumar Manoharan <[email protected]>
---
drivers/net/wireless/ath/ath10k/htt_rx.c | 18 ++++++++----------
1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/htt_rx.c b/drivers/net/wireless/ath/ath10k/htt_rx.c
index cc957a6..bedd8c3 100644
--- a/drivers/net/wireless/ath/ath10k/htt_rx.c
+++ b/drivers/net/wireless/ath/ath10k/htt_rx.c
@@ -2011,9 +2011,7 @@ void ath10k_htt_t2h_msg_handler(struct ath10k *ar, struct sk_buff *skb)
 		break;
 	}
 	case HTT_T2H_MSG_TYPE_RX_IND:
-		spin_lock_bh(&htt->rx_ring.lock);
-		__skb_queue_tail(&htt->rx_compl_q, skb);
-		spin_unlock_bh(&htt->rx_ring.lock);
+		skb_queue_tail(&htt->rx_compl_q, skb);
 		tasklet_schedule(&htt->txrx_compl_task);
 		return;
 	case HTT_T2H_MSG_TYPE_PEER_MAP: {
@@ -2111,9 +2109,7 @@ void ath10k_htt_t2h_msg_handler(struct ath10k *ar, struct sk_buff *skb)
 		break;
 	}
 	case HTT_T2H_MSG_TYPE_RX_IN_ORD_PADDR_IND: {
-		spin_lock_bh(&htt->rx_ring.lock);
-		__skb_queue_tail(&htt->rx_in_ord_compl_q, skb);
-		spin_unlock_bh(&htt->rx_ring.lock);
+		skb_queue_tail(&htt->rx_in_ord_compl_q, skb);
 		tasklet_schedule(&htt->txrx_compl_task);
 		return;
 	}
@@ -2174,16 +2170,18 @@ static void ath10k_htt_txrx_compl_task(unsigned long ptr)
 		dev_kfree_skb_any(skb);
 	}

-	spin_lock_bh(&htt->rx_ring.lock);
-	while ((skb = __skb_dequeue(&htt->rx_compl_q))) {
+	while ((skb = skb_dequeue(&htt->rx_compl_q))) {
 		resp = (struct htt_resp *)skb->data;
+		spin_lock_bh(&htt->rx_ring.lock);
 		ath10k_htt_rx_handler(htt, &resp->rx_ind);
+		spin_unlock_bh(&htt->rx_ring.lock);
 		dev_kfree_skb_any(skb);
 	}

-	while ((skb = __skb_dequeue(&htt->rx_in_ord_compl_q))) {
+	while ((skb = skb_dequeue(&htt->rx_in_ord_compl_q))) {
+		spin_lock_bh(&htt->rx_ring.lock);
 		ath10k_htt_rx_in_ord_ind(ar, skb);
+		spin_unlock_bh(&htt->rx_ring.lock);
 		dev_kfree_skb_any(skb);
 	}
-	spin_unlock_bh(&htt->rx_ring.lock);
 }
--
2.7.0
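
For readers following the thread, the locking change in this patch can be
sketched in userspace roughly as below. This is only an illustration under
stated assumptions, not the driver code: pthread mutexes stand in for the
kernel spinlocks, and every name (struct node, struct queue, struct rx_ctx,
queue_tail, queue_dequeue, rx_drain) is hypothetical. The point it tries to
show is that the producer takes only the queue's own lock, while the
heavier "ring" lock is held per item and only around processing.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are ath10k or kernel types. */
struct node {
	struct node *next;
	int payload;
};

struct queue {
	pthread_mutex_t lock;	/* plays the role of sk_buff_head.lock */
	struct node *head;
	struct node *tail;
};

struct rx_ctx {
	pthread_mutex_t ring_lock;	/* plays the role of htt->rx_ring.lock */
	struct queue compl_q;
	int processed_sum;
};

static void rx_ctx_init(struct rx_ctx *ctx)
{
	pthread_mutex_init(&ctx->ring_lock, NULL);
	pthread_mutex_init(&ctx->compl_q.lock, NULL);
	ctx->compl_q.head = NULL;
	ctx->compl_q.tail = NULL;
	ctx->processed_sum = 0;
}

/* Analogous to skb_queue_tail(): only the queue's own lock is taken. */
static void queue_tail(struct queue *q, struct node *n)
{
	pthread_mutex_lock(&q->lock);
	n->next = NULL;
	if (q->tail)
		q->tail->next = n;
	else
		q->head = n;
	q->tail = n;
	pthread_mutex_unlock(&q->lock);
}

/* Analogous to skb_dequeue(): again only the queue's own lock is taken. */
static struct node *queue_dequeue(struct queue *q)
{
	struct node *n;

	pthread_mutex_lock(&q->lock);
	n = q->head;
	if (n) {
		q->head = n->next;
		if (!q->head)
			q->tail = NULL;
		n->next = NULL;
	}
	pthread_mutex_unlock(&q->lock);
	return n;
}

/*
 * Consumer side: the ring lock is held around each item's processing,
 * never while the queue itself is being manipulated, so a producer
 * calling queue_tail() does not wait for the ring lock.
 */
static int rx_drain(struct rx_ctx *ctx)
{
	struct node *n;
	int count = 0;

	while ((n = queue_dequeue(&ctx->compl_q))) {
		pthread_mutex_lock(&ctx->ring_lock);
		ctx->processed_sum += n->payload;
		pthread_mutex_unlock(&ctx->ring_lock);
		count++;
	}
	return count;
}
```

In this sketch a producer thread would call queue_tail() and the consumer
would call rx_drain(); the two contend only on the short queue lock.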



2016-02-12 06:11:27

by Rajkumar Manoharan

Subject: [PATCH 2/2] ath10k: process htt rx indication as batch mode

On multicore systems, the txrx tasklet can run in parallel with the
pci tasklet (i.e. when the smp affinity of the ath10k irq is assigned
to multiple CPUs). Feeding and consuming from the same rx completion
list makes the txrx tasklet run for a longer period. Prevent this by
processing a snapshot of the rx queue, taken by moving the list into
a temporary list. Frames received afterwards will be processed in the
next batch.

Signed-off-by: Rajkumar Manoharan <[email protected]>
---
drivers/net/wireless/ath/ath10k/htt_rx.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/htt_rx.c b/drivers/net/wireless/ath/ath10k/htt_rx.c
index bedd8c3..61d9507 100644
--- a/drivers/net/wireless/ath/ath10k/htt_rx.c
+++ b/drivers/net/wireless/ath/ath10k/htt_rx.c
@@ -2155,22 +2155,34 @@ static void ath10k_htt_txrx_compl_task(unsigned long ptr)
 	struct ath10k_htt *htt = (struct ath10k_htt *)ptr;
 	struct ath10k *ar = htt->ar;
 	struct sk_buff_head tx_q;
+	struct sk_buff_head rx_q;
+	struct sk_buff_head rx_ind_q;
 	struct htt_resp *resp;
 	struct sk_buff *skb;
 	unsigned long flags;

 	__skb_queue_head_init(&tx_q);
+	__skb_queue_head_init(&rx_q);
+	__skb_queue_head_init(&rx_ind_q);

 	spin_lock_irqsave(&htt->tx_compl_q.lock, flags);
 	skb_queue_splice_init(&htt->tx_compl_q, &tx_q);
 	spin_unlock_irqrestore(&htt->tx_compl_q.lock, flags);

+	spin_lock_irqsave(&htt->rx_compl_q.lock, flags);
+	skb_queue_splice_init(&htt->rx_compl_q, &rx_q);
+	spin_unlock_irqrestore(&htt->rx_compl_q.lock, flags);
+
+	spin_lock_irqsave(&htt->rx_in_ord_compl_q.lock, flags);
+	skb_queue_splice_init(&htt->rx_in_ord_compl_q, &rx_ind_q);
+	spin_unlock_irqrestore(&htt->rx_in_ord_compl_q.lock, flags);
+
 	while ((skb = __skb_dequeue(&tx_q))) {
 		ath10k_htt_rx_frm_tx_compl(htt->ar, skb);
 		dev_kfree_skb_any(skb);
 	}

-	while ((skb = skb_dequeue(&htt->rx_compl_q))) {
+	while ((skb = __skb_dequeue(&rx_q))) {
 		resp = (struct htt_resp *)skb->data;
 		spin_lock_bh(&htt->rx_ring.lock);
 		ath10k_htt_rx_handler(htt, &resp->rx_ind);
@@ -2178,7 +2190,7 @@ static void ath10k_htt_txrx_compl_task(unsigned long ptr)
 		dev_kfree_skb_any(skb);
 	}

-	while ((skb = skb_dequeue(&htt->rx_in_ord_compl_q))) {
+	while ((skb = __skb_dequeue(&rx_ind_q))) {
 		spin_lock_bh(&htt->rx_ring.lock);
 		ath10k_htt_rx_in_ord_ind(ar, skb);
 		spin_unlock_bh(&htt->rx_ring.lock);
--
2.7.0
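
The batch idea described in this patch can be sketched in userspace along
the following lines. It is a simplified illustration under assumptions,
not the driver code: a pthread mutex stands in for the queue spinlock, all
names (struct item, struct item_list, struct shared_q, shared_q_snapshot,
process_batch) are hypothetical, and the snapshot helper overwrites the
private list instead of appending to it as skb_queue_splice_init() does.
The `late` parameter exists only to mimic a producer queuing a frame while
a batch is being processed.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are ath10k or kernel types. */
struct item {
	struct item *next;
	int id;
};

struct item_list {
	struct item *head;
	struct item *tail;
};

struct shared_q {
	pthread_mutex_t lock;	/* plays the role of the skb queue lock */
	struct item_list list;
};

static void list_add_tail(struct item_list *l, struct item *it)
{
	it->next = NULL;
	if (l->tail)
		l->tail->next = it;
	else
		l->head = it;
	l->tail = it;
}

static struct item *list_pop(struct item_list *l)
{
	struct item *it = l->head;

	if (it) {
		l->head = it->next;
		if (!l->head)
			l->tail = NULL;
		it->next = NULL;
	}
	return it;
}

static void shared_q_init(struct shared_q *q)
{
	pthread_mutex_init(&q->lock, NULL);
	q->list.head = NULL;
	q->list.tail = NULL;
}

/* Producer side: append under the queue lock. */
static void shared_q_tail(struct shared_q *q, struct item *it)
{
	pthread_mutex_lock(&q->lock);
	list_add_tail(&q->list, it);
	pthread_mutex_unlock(&q->lock);
}

/* Move everything queued so far to a private list; the shared queue is left empty. */
static void shared_q_snapshot(struct shared_q *q, struct item_list *batch)
{
	pthread_mutex_lock(&q->lock);
	*batch = q->list;
	q->list.head = NULL;
	q->list.tail = NULL;
	pthread_mutex_unlock(&q->lock);
}

/*
 * Process one snapshot without holding the queue lock. If `late` is
 * non-NULL it is queued while the batch is running; it lands on the
 * shared queue and is therefore left for the next batch.
 */
static int process_batch(struct shared_q *q, struct item *late)
{
	struct item_list batch;
	struct item *it;
	int handled = 0;

	shared_q_snapshot(q, &batch);
	while ((it = list_pop(&batch))) {
		if (late) {
			shared_q_tail(q, late);
			late = NULL;
		}
		handled++;
	}
	return handled;
}
```

Because the consumer walks only its private list, the amount of work per
run is bounded by what was queued at snapshot time.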


2016-03-04 08:42:42

by Kalle Valo

Subject: Re: [PATCH 1/2] ath10k: reduce rx_lock contention for htt rx indication

Rajkumar Manoharan <[email protected]> writes:

> Received frame indications are queued into a skb list and later
> processed by txrx tasklet. This skb queue is protected by htt rx lock.
> Since the entire rx processing till delivering frame to mac80211 and
> replenish tasks are processed under rx_lock protection, there might be
> some delay in queuing newly received rx frame into that list on
> multicore systems. Optimize this by using skb list lock while accessing
> rx completion queue instead of htt rx lock.
>
> Signed-off-by: Rajkumar Manoharan <[email protected]>

Both applied, thanks.

--
Kalle Valo