2017-08-15 13:06:55

by Sergey Matyukevich

Subject: [PATCH] qtnfmac: pcie datapath optimizations and cleanups

Hello Kalle and all,

This patchset implements several optimizations and cleanups for the PCIe
datapath in the qtnfmac driver. Major changes include:
- switch to kernel circ_buf implementation
- modify tx reclaim locking
- introduce counter for rx underflow events

Signed-off-by: Sergey Matyukevich <[email protected]>

bus.h | 1
pearl/pcie.c | 285 +++++++++++++++++++++++++++++-------------------
pearl/pcie_bus_priv.h | 15 +-
pearl/pcie_ipc.h | 1
pearl/pcie_regs_pearl.h | 1
5 files changed, 183 insertions(+), 120 deletions(-)


2017-08-17 06:53:36

by Igor Mitsyanko

Subject: Re: [PATCH] qtnfmac: pcie datapath optimizations and cleanups

On 08/15/2017 06:06 AM, Sergey Matyukevich wrote:
> Hello Kalle and all,
>
> This patchset implements several optimizations and cleanups for the PCIe
> datapath in the qtnfmac driver. Major changes include:
> - switch to kernel circ_buf implementation
> - modify tx reclaim locking
> - introduce counter for rx underflow events
>
> Signed-off-by: Sergey Matyukevich <[email protected]>

Entire series:
Tested-by: Igor Mitsyanko <[email protected]>

>
> bus.h | 1
> pearl/pcie.c | 285 +++++++++++++++++++++++++++++-------------------
> pearl/pcie_bus_priv.h | 15 +-
> pearl/pcie_ipc.h | 1
> pearl/pcie_regs_pearl.h | 1
> 5 files changed, 183 insertions(+), 120 deletions(-)
>

2017-08-16 16:28:19

by Kalle Valo

Subject: Re: [PATCH] qtnfmac: pcie datapath optimizations and cleanups

Sergey Matyukevich <[email protected]> writes:

>> I don't see patch 9 in the mailing list. Did it get lost somewhere?
>
> No, this is my fault. The last patch was a sort of experimental change
> shuffling DMA memory barriers around. I kept it for testing but forgot to
> remove it before sending. Let me know if it matters, and I will resend
> a properly numbered patchset.

Not a problem, I just always check if I see a patch missing.

--
Kalle Valo

2017-08-16 15:04:47

by Kalle Valo

Subject: Re: [PATCH 2/9] qtnfmac: switch to napi_gro_receive

Sergey Matyukevich <[email protected]> writes:

> Use napi_gro_receive() rather than netif_receive_skb() in the qtnfmac driver.

Yes, I can easily see that from the diff so no need to document that :)
But the commit log should tell _why_ you did it.

--
Kalle Valo

2017-08-15 13:07:07

by Sergey Matyukevich

Subject: [PATCH 7/9] qtnfmac: introduce counter for Rx underflow events

Introduce a counter for Rx underflow events. Export this counter via debugfs.

Signed-off-by: Sergey Matyukevich <[email protected]>
---
drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c | 16 ++++++++++++++--
.../net/wireless/quantenna/qtnfmac/pearl/pcie_bus_priv.h | 1 +
drivers/net/wireless/quantenna/qtnfmac/pearl/pcie_ipc.h | 1 +
.../wireless/quantenna/qtnfmac/pearl/pcie_regs_pearl.h | 1 +
4 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c b/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c
index f8207ab25576..72730aff2a41 100644
--- a/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c
+++ b/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c
@@ -686,14 +686,21 @@ static irqreturn_t qtnf_interrupt(int irq, void *data)
if (!(status & priv->pcie_irq_mask))
goto irq_done;

- if (status & PCIE_HDP_INT_RX_BITS) {
+ if (status & PCIE_HDP_INT_RX_BITS)
priv->pcie_irq_rx_count++;
+
+ if (status & PCIE_HDP_INT_TX_BITS)
+ priv->pcie_irq_tx_count++;
+
+ if (status & PCIE_HDP_INT_HHBM_UF)
+ priv->pcie_irq_uf_count++;
+
+ if (status & PCIE_HDP_INT_RX_BITS) {
qtnf_dis_rxdone_irq(priv);
napi_schedule(&bus->mux_napi);
}

if (status & PCIE_HDP_INT_TX_BITS) {
- priv->pcie_irq_tx_count++;
qtnf_dis_txdone_irq(priv);
tasklet_hi_schedule(&priv->reclaim_tq);
}
@@ -1104,6 +1111,10 @@ static int qtnf_dbg_irq_stats(struct seq_file *s, void *data)
status = reg & PCIE_HDP_INT_RX_BITS;
seq_printf(s, "pcie_irq_rx_status(%s)\n",
(status == PCIE_HDP_INT_RX_BITS) ? "EN" : "DIS");
+ seq_printf(s, "pcie_irq_uf_count(%u)\n", priv->pcie_irq_uf_count);
+ status = reg & PCIE_HDP_INT_HHBM_UF;
+ seq_printf(s, "pcie_irq_hhbm_uf_status(%s)\n",
+ (status == PCIE_HDP_INT_HHBM_UF) ? "EN" : "DIS");

return 0;
}
@@ -1189,6 +1200,7 @@ static int qtnf_pcie_probe(struct pci_dev *pdev, const struct pci_device_id *id)
pcie_priv->pcie_irq_count = 0;
pcie_priv->pcie_irq_rx_count = 0;
pcie_priv->pcie_irq_tx_count = 0;
+ pcie_priv->pcie_irq_uf_count = 0;
pcie_priv->tx_reclaim_done = 0;
pcie_priv->tx_reclaim_req = 0;

diff --git a/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie_bus_priv.h b/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie_bus_priv.h
index 1b37914299e9..698e42132ed4 100644
--- a/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie_bus_priv.h
+++ b/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie_bus_priv.h
@@ -78,6 +78,7 @@ struct qtnf_pcie_bus_priv {
u32 pcie_irq_count;
u32 pcie_irq_rx_count;
u32 pcie_irq_tx_count;
+ u32 pcie_irq_uf_count;
u32 tx_full_count;
u32 tx_done_count;
u32 tx_reclaim_done;
diff --git a/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie_ipc.h b/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie_ipc.h
index e00d508fbcf0..667f5ec457e3 100644
--- a/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie_ipc.h
+++ b/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie_ipc.h
@@ -50,6 +50,7 @@
#define PCIE_HDP_INT_RX_BITS (0 \
| PCIE_HDP_INT_EP_TXDMA \
| PCIE_HDP_INT_EP_TXEMPTY \
+ | PCIE_HDP_INT_HHBM_UF \
)

#define PCIE_HDP_INT_TX_BITS (0 \
diff --git a/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie_regs_pearl.h b/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie_regs_pearl.h
index 78715b8a8ef9..69696f118769 100644
--- a/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie_regs_pearl.h
+++ b/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie_regs_pearl.h
@@ -333,6 +333,7 @@
#define PCIE_HDP_INT_RX_LEN_ERR (BIT(2))
#define PCIE_HDP_INT_RX_HDR_LEN_ERR (BIT(3))
#define PCIE_HDP_INT_EP_TXDMA (BIT(12))
+#define PCIE_HDP_INT_HHBM_UF (BIT(13))
#define PCIE_HDP_INT_EP_TXEMPTY (BIT(15))
#define PCIE_HDP_INT_IPC (BIT(29))

--
2.11.0

2017-08-16 15:02:37

by Kalle Valo

Subject: Re: [PATCH] qtnfmac: pcie datapath optimizations and cleanups

Sergey Matyukevich <[email protected]> writes:

> Hello Kalle and all,
>
> This patchset implements several optimizations and cleanups for the PCIe
> datapath in the qtnfmac driver. Major changes include:
> - switch to kernel circ_buf implementation
> - modify tx reclaim locking
> - introduce counter for rx underflow events
>
> Signed-off-by: Sergey Matyukevich <[email protected]>
>
> bus.h | 1
> pearl/pcie.c | 285 +++++++++++++++++++++++++++++-------------------
> pearl/pcie_bus_priv.h | 15 +-
> pearl/pcie_ipc.h | 1
> pearl/pcie_regs_pearl.h | 1
> 5 files changed, 183 insertions(+), 120 deletions(-)

I don't see patch 9 in the mailing list. Did it get lost somewhere?

--
Kalle Valo

2017-08-15 13:07:03

by Sergey Matyukevich

Subject: [PATCH 5/9] qtnfmac: decrease default Tx queue size

Avoid extra buffering in the driver by default. Use the maximum hardware Tx queue size.

Signed-off-by: Sergey Matyukevich <[email protected]>
---
drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c b/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c
index a0b65d487ddb..f18e8a724c68 100644
--- a/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c
+++ b/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c
@@ -36,7 +36,7 @@ static bool use_msi = true;
module_param(use_msi, bool, 0644);
MODULE_PARM_DESC(use_msi, "set 0 to use legacy interrupt");

-static unsigned int tx_bd_size_param = 256;
+static unsigned int tx_bd_size_param = 32;
module_param(tx_bd_size_param, uint, 0644);
MODULE_PARM_DESC(tx_bd_size_param, "Tx descriptors queue size");

--
2.11.0

2017-08-16 15:48:15

by Sergey Matyukevich

Subject: Re: [PATCH 2/9] qtnfmac: switch to napi_gro_receive

> >> Use napi_gro_receive() rather than netif_receive_skb() in the qtnfmac driver.
> >
> > Yes, I can easily see that from the diff so no need to document that :)
> > But the commit log should tell _why_ you did it.
>
> And no need to resend because of this, I can improve the commit log
> during commit. Just let me know what I should add.

Sure. The motivation is to improve performance when GRO is enabled, e.g. by
reducing the number of TCP ACKs. An updated message would be:

Use napi_gro_receive() rather than netif_receive_skb() to improve
performance when GRO is enabled.

Regards,
Sergey

2017-08-15 13:06:55

by Sergey Matyukevich

Subject: [PATCH 1/9] qtnfmac: remove unused qtnf_rx_frame declaration

Signed-off-by: Sergey Matyukevich <[email protected]>
---
drivers/net/wireless/quantenna/qtnfmac/bus.h | 1 -
1 file changed, 1 deletion(-)

diff --git a/drivers/net/wireless/quantenna/qtnfmac/bus.h b/drivers/net/wireless/quantenna/qtnfmac/bus.h
index dda05003d522..56e5fed92a2a 100644
--- a/drivers/net/wireless/quantenna/qtnfmac/bus.h
+++ b/drivers/net/wireless/quantenna/qtnfmac/bus.h
@@ -130,7 +130,6 @@ static __always_inline void qtnf_bus_unlock(struct qtnf_bus *bus)

/* interface functions from common layer */

-void qtnf_rx_frame(struct device *dev, struct sk_buff *rxp);
int qtnf_core_attach(struct qtnf_bus *bus);
void qtnf_core_detach(struct qtnf_bus *bus);
void qtnf_txflowblock(struct device *dev, bool state);
--
2.11.0

2017-08-15 13:06:57

by Sergey Matyukevich

Subject: [PATCH 3/9] qtnfmac: use __netdev_alloc_skb_ip_align

Replace __dev_alloc_skb() and explicit NET_IP_ALIGN alignment with the
built-in __netdev_alloc_skb_ip_align() function.
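
For context, __netdev_alloc_skb_ip_align(NULL, size, gfp) wraps the two-step
pattern being removed. A minimal sketch of that equivalence (the helper name
below is illustrative only, not driver code):

#include <linux/skbuff.h>

/* Equivalent of the open-coded sequence this patch removes: allocate with
 * NET_IP_ALIGN bytes of extra headroom and reserve them, so the IP header
 * of a received frame ends up naturally aligned.
 */
static struct sk_buff *example_alloc_rx_skb(unsigned int size)
{
        struct sk_buff *skb;

        skb = __dev_alloc_skb(size + NET_IP_ALIGN, GFP_ATOMIC);
        if (!skb)
                return NULL;

        skb_reserve(skb, NET_IP_ALIGN);

        /* same result as __netdev_alloc_skb_ip_align(NULL, size, GFP_ATOMIC) */
        return skb;
}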

Signed-off-by: Sergey Matyukevich <[email protected]>
---
drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c b/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c
index 08b35dc30bc8..079aa1693ff5 100644
--- a/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c
+++ b/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c
@@ -429,8 +429,7 @@ static int skb2rbd_attach(struct qtnf_pcie_bus_priv *priv, u16 rx_bd_index)
struct sk_buff *skb;
dma_addr_t paddr;

- skb = __dev_alloc_skb(SKB_BUF_SIZE + NET_IP_ALIGN,
- GFP_ATOMIC);
+ skb = __netdev_alloc_skb_ip_align(NULL, SKB_BUF_SIZE, GFP_ATOMIC);
if (!skb) {
priv->rx_skb[rx_bd_index] = NULL;
return -ENOMEM;
@@ -438,8 +437,6 @@ static int skb2rbd_attach(struct qtnf_pcie_bus_priv *priv, u16 rx_bd_index)

priv->rx_skb[rx_bd_index] = skb;

- skb_reserve(skb, NET_IP_ALIGN);
-
rxbd = &priv->rx_bd_vbase[rx_bd_index];

paddr = pci_map_single(priv->pdev, skb->data,
--
2.11.0

2017-08-17 07:06:04

by Kalle Valo

Subject: Re: [1/9] qtnfmac: remove unused qtnf_rx_frame declaration

Sergey Matyukevich <[email protected]> wrote:

> Signed-off-by: Sergey Matyukevich <[email protected]>

8 patches applied to wireless-drivers-next.git, thanks.

0db63e37992c qtnfmac: remove unused qtnf_rx_frame declaration
7376947dfb80 qtnfmac: switch to napi_gro_receive
c58730cab8ea qtnfmac: use __netdev_alloc_skb_ip_align
867ba964fa69 qtnfmac: skb2rbd_attach cleanup
dfb13db68f3e qtnfmac: decrease default Tx queue size
3cbc3a0f19ac qtnfmac: switch to kernel circ_buf implementation
cc75f9e5bc66 qtnfmac: introduce counter for Rx underflow events
0593da274d4d qtnfmac: modify tx reclaim locking

--
https://patchwork.kernel.org/patch/9901845/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

2017-08-15 13:07:08

by Sergey Matyukevich

Subject: [PATCH 8/9] qtnfmac: modify tx reclaim locking

Perform additional reclaim from qtnf_pcie_data_tx. The tx_lock spinlock
serves only to synchronize reclaim, so rename it accordingly and improve
granularity by moving the lock into qtnf_pcie_data_tx_reclaim.
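
The resulting locking pattern, sketched below with simplified surrounding
logic (only the lock handling mirrors the patch; the struct and function
names are illustrative):

#include <linux/spinlock.h>

struct example_pcie_priv {
        spinlock_t tx_reclaim_lock;     /* protects tx descriptor reclaim only */
};

static void example_tx_reclaim(struct example_pcie_priv *priv)
{
        unsigned long flags;

        /* reclaim takes its own lock, so both the xmit path and the
         * reclaim tasklet may call it without a driver-wide tx lock
         */
        spin_lock_irqsave(&priv->tx_reclaim_lock, flags);
        /* ... unmap and free completed tx skbs, advance the read index ... */
        spin_unlock_irqrestore(&priv->tx_reclaim_lock, flags);
}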

Signed-off-by: Sergey Matyukevich <[email protected]>
---
drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c | 17 ++++++-----------
.../wireless/quantenna/qtnfmac/pearl/pcie_bus_priv.h | 4 ++--
2 files changed, 8 insertions(+), 13 deletions(-)

diff --git a/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c b/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c
index 72730aff2a41..cd2f2b667643 100644
--- a/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c
+++ b/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c
@@ -534,11 +534,13 @@ static void qtnf_pcie_data_tx_reclaim(struct qtnf_pcie_bus_priv *priv)
{
struct qtnf_tx_bd *txbd;
struct sk_buff *skb;
+ unsigned long flags;
dma_addr_t paddr;
u32 tx_done_index;
int count = 0;
int i;

+ spin_lock_irqsave(&priv->tx_reclaim_lock, flags);

tx_done_index = readl(PCIE_HDP_RX0DMA_CNT(priv->pcie_reg_base))
& (priv->tx_bd_num - 1);
@@ -576,6 +578,7 @@ static void qtnf_pcie_data_tx_reclaim(struct qtnf_pcie_bus_priv *priv)
priv->tx_reclaim_req++;
priv->tx_bd_r_index = i;

+ spin_unlock_irqrestore(&priv->tx_reclaim_lock, flags);
}

static int qtnf_tx_queue_ready(struct qtnf_pcie_bus_priv *priv)
@@ -600,20 +603,14 @@ static int qtnf_pcie_data_tx(struct qtnf_bus *bus, struct sk_buff *skb)
struct qtnf_pcie_bus_priv *priv = (void *)get_bus_priv(bus);
dma_addr_t txbd_paddr, skb_paddr;
struct qtnf_tx_bd *txbd;
- unsigned long flags;
int len, i;
u32 info;
int ret = 0;

- spin_lock_irqsave(&priv->tx_lock, flags);
-
- priv->tx_done_count++;
-
if (!qtnf_tx_queue_ready(priv)) {
if (skb->dev)
netif_stop_queue(skb->dev);

- spin_unlock_irqrestore(&priv->tx_lock, flags);
return NETDEV_TX_BUSY;
}

@@ -659,7 +656,8 @@ static int qtnf_pcie_data_tx(struct qtnf_bus *bus, struct sk_buff *skb)
dev_kfree_skb_any(skb);
}

- spin_unlock_irqrestore(&priv->tx_lock, flags);
+ qtnf_pcie_data_tx_reclaim(priv);
+ priv->tx_done_count++;

return NETDEV_TX_OK;
}
@@ -1067,11 +1065,8 @@ static int qtnf_bringup_fw(struct qtnf_bus *bus)
static void qtnf_reclaim_tasklet_fn(unsigned long data)
{
struct qtnf_pcie_bus_priv *priv = (void *)data;
- unsigned long flags;

- spin_lock_irqsave(&priv->tx_lock, flags);
qtnf_pcie_data_tx_reclaim(priv);
- spin_unlock_irqrestore(&priv->tx_lock, flags);
qtnf_en_txdone_irq(priv);
}

@@ -1192,7 +1187,7 @@ static int qtnf_pcie_probe(struct pci_dev *pdev, const struct pci_device_id *id)
init_completion(&bus->request_firmware_complete);
mutex_init(&bus->bus_lock);
spin_lock_init(&pcie_priv->irq_lock);
- spin_lock_init(&pcie_priv->tx_lock);
+ spin_lock_init(&pcie_priv->tx_reclaim_lock);

/* init stats */
pcie_priv->tx_full_count = 0;
diff --git a/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie_bus_priv.h b/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie_bus_priv.h
index 698e42132ed4..e76a23716ee0 100644
--- a/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie_bus_priv.h
+++ b/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie_bus_priv.h
@@ -32,8 +32,8 @@ struct qtnf_pcie_bus_priv {
/* lock for irq configuration changes */
spinlock_t irq_lock;

- /* lock for tx operations */
- spinlock_t tx_lock;
+ /* lock for tx reclaim operations */
+ spinlock_t tx_reclaim_lock;
u8 msi_enabled;
int mps;

--
2.11.0

2017-08-16 15:53:29

by Sergey Matyukevich

Subject: Re: [PATCH] qtnfmac: pcie datapath optimizations and cleanups

> I don't see patch 9 in the mailing list. Did it get lost somewhere?

No, this is my fault. The last patch was a sort of experimental change
shuffling DMA memory barriers around. I kept it for testing but forgot to
remove it before sending. Let me know if it matters, and I will resend
a properly numbered patchset.

Regards,
Sergey

2017-08-16 15:09:06

by Kalle Valo

Subject: Re: [PATCH 2/9] qtnfmac: switch to napi_gro_receive

Kalle Valo <[email protected]> writes:

> Sergey Matyukevich <[email protected]> writes:
>
>> Use napi_gro_receive() rather than netif_receive_skb() in the qtnfmac driver.
>
> Yes, I can easily see that from the diff so no need to document that :)
> But the commit log should tell _why_ you did it.

And no need to resend because of this, I can improve the commit log
during commit. Just let me know what I should add.

--
Kalle Valo

2017-08-15 13:06:59

by Sergey Matyukevich

Subject: [PATCH 4/9] qtnfmac: skb2rbd_attach cleanup

Update the PCIE_HDP_TX_HOST_Q_WR_PTR register in skb2rbd_attach as part of
the procedure for passing a new Rx buffer to hardware. Sync up all qtnf_rx_bd
descriptor updates before passing the Rx buffer to hardware.

Signed-off-by: Sergey Matyukevich <[email protected]>
---
.../net/wireless/quantenna/qtnfmac/pearl/pcie.c | 31 +++++++++-------------
1 file changed, 13 insertions(+), 18 deletions(-)

diff --git a/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c b/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c
index 079aa1693ff5..a0b65d487ddb 100644
--- a/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c
+++ b/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c
@@ -411,11 +411,6 @@ static int alloc_bd_table(struct qtnf_pcie_bus_priv *priv)
writel(priv->rx_bd_num | (sizeof(struct qtnf_rx_bd)) << 16,
PCIE_HDP_TX_HOST_Q_SZ_CTRL(priv->pcie_reg_base));

- priv->hw_txproc_wr_ptr = priv->rx_bd_num - rx_bd_reserved_param;
-
- writel(priv->hw_txproc_wr_ptr,
- PCIE_HDP_TX_HOST_Q_WR_PTR(priv->pcie_reg_base));
-
pr_debug("RX descriptor table: vaddr=0x%p paddr=%pad\n", vaddr, &paddr);

priv->rx_bd_index = 0;
@@ -423,7 +418,7 @@ static int alloc_bd_table(struct qtnf_pcie_bus_priv *priv)
return 0;
}

-static int skb2rbd_attach(struct qtnf_pcie_bus_priv *priv, u16 rx_bd_index)
+static int skb2rbd_attach(struct qtnf_pcie_bus_priv *priv, u16 index)
{
struct qtnf_rx_bd *rxbd;
struct sk_buff *skb;
@@ -431,13 +426,12 @@ static int skb2rbd_attach(struct qtnf_pcie_bus_priv *priv, u16 rx_bd_index)

skb = __netdev_alloc_skb_ip_align(NULL, SKB_BUF_SIZE, GFP_ATOMIC);
if (!skb) {
- priv->rx_skb[rx_bd_index] = NULL;
+ priv->rx_skb[index] = NULL;
return -ENOMEM;
}

- priv->rx_skb[rx_bd_index] = skb;
-
- rxbd = &priv->rx_bd_vbase[rx_bd_index];
+ priv->rx_skb[index] = skb;
+ rxbd = &priv->rx_bd_vbase[index];

paddr = pci_map_single(priv->pdev, skb->data,
SKB_BUF_SIZE, PCI_DMA_FROMDEVICE);
@@ -446,17 +440,20 @@ static int skb2rbd_attach(struct qtnf_pcie_bus_priv *priv, u16 rx_bd_index)
return -ENOMEM;
}

- writel(QTN_HOST_LO32(paddr),
- PCIE_HDP_HHBM_BUF_PTR(priv->pcie_reg_base));
- writel(QTN_HOST_HI32(paddr),
- PCIE_HDP_HHBM_BUF_PTR_H(priv->pcie_reg_base));
-
/* keep rx skb paddrs in rx buffer descriptors for cleanup purposes */
rxbd->addr = cpu_to_le32(QTN_HOST_LO32(paddr));
rxbd->addr_h = cpu_to_le32(QTN_HOST_HI32(paddr));
-
rxbd->info = 0x0;

+ /* sync up all descriptor updates */
+ wmb();
+
+ writel(QTN_HOST_HI32(paddr),
+ PCIE_HDP_HHBM_BUF_PTR_H(priv->pcie_reg_base));
+ writel(QTN_HOST_LO32(paddr),
+ PCIE_HDP_HHBM_BUF_PTR(priv->pcie_reg_base));
+
+ writel(index, PCIE_HDP_TX_HOST_Q_WR_PTR(priv->pcie_reg_base));
return 0;
}

@@ -787,8 +784,6 @@ static int qtnf_rx_poll(struct napi_struct *napi, int budget)
break;
}

- writel(priv->hw_txproc_wr_ptr,
- PCIE_HDP_TX_HOST_Q_WR_PTR(priv->pcie_reg_base));
}

if (processed < budget) {
--
2.11.0

2017-08-15 13:07:05

by Sergey Matyukevich

Subject: [PATCH 6/9] qtnfmac: switch to kernel circ_buf implementation

The current code for both Rx and Tx queue management is a custom and
incomplete circular buffer implementation. It makes a lot of sense to switch
to the kernel's built-in circ_buf implementation.
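
For readers unfamiliar with linux/circ_buf.h: both macros used below operate
on free-running read/write indices and require a power-of-two ring size. A
minimal sketch of the conventions (helper names are illustrative, not driver
code):

#include <linux/circ_buf.h>
#include <linux/types.h>

/* number of filled slots ready to be consumed */
static u32 example_ring_used(u32 w_index, u32 r_index, u32 ring_size)
{
        return CIRC_CNT(w_index, r_index, ring_size);
}

/* number of free slots left for the producer; one slot always stays
 * empty so that a full ring can be told apart from an empty one
 */
static u32 example_ring_free(u32 w_index, u32 r_index, u32 ring_size)
{
        return CIRC_SPACE(w_index, r_index, ring_size);
}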

Signed-off-by: Sergey Matyukevich <[email protected]>
---
.../net/wireless/quantenna/qtnfmac/pearl/pcie.c | 206 +++++++++++++--------
.../quantenna/qtnfmac/pearl/pcie_bus_priv.h | 10 +-
2 files changed, 136 insertions(+), 80 deletions(-)

diff --git a/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c b/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c
index f18e8a724c68..f8207ab25576 100644
--- a/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c
+++ b/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c
@@ -25,6 +25,7 @@
#include <linux/completion.h>
#include <linux/crc32.h>
#include <linux/spinlock.h>
+#include <linux/circ_buf.h>

#include "qtn_hw_ids.h"
#include "pcie_bus_priv.h"
@@ -44,10 +45,6 @@ static unsigned int rx_bd_size_param = 256;
module_param(rx_bd_size_param, uint, 0644);
MODULE_PARM_DESC(rx_bd_size_param, "Rx descriptors queue size");

-static unsigned int rx_bd_reserved_param = 16;
-module_param(rx_bd_reserved_param, uint, 0644);
-MODULE_PARM_DESC(rx_bd_reserved_param, "Reserved RX descriptors");
-
static u8 flashboot = 1;
module_param(flashboot, byte, 0644);
MODULE_PARM_DESC(flashboot, "set to 0 to use FW binary file on FS");
@@ -392,9 +389,8 @@ static int alloc_bd_table(struct qtnf_pcie_bus_priv *priv)

pr_debug("TX descriptor table: vaddr=0x%p paddr=%pad\n", vaddr, &paddr);

- priv->tx_bd_reclaim_start = 0;
- priv->tx_bd_index = 0;
- priv->tx_queue_len = 0;
+ priv->tx_bd_r_index = 0;
+ priv->tx_bd_w_index = 0;

/* rx bd */

@@ -413,8 +409,6 @@ static int alloc_bd_table(struct qtnf_pcie_bus_priv *priv)

pr_debug("RX descriptor table: vaddr=0x%p paddr=%pad\n", vaddr, &paddr);

- priv->rx_bd_index = 0;
-
return 0;
}

@@ -445,6 +439,8 @@ static int skb2rbd_attach(struct qtnf_pcie_bus_priv *priv, u16 index)
rxbd->addr_h = cpu_to_le32(QTN_HOST_HI32(paddr));
rxbd->info = 0x0;

+ priv->rx_bd_w_index = index;
+
/* sync up all descriptor updates */
wmb();

@@ -510,6 +506,8 @@ static int qtnf_pcie_init_xfer(struct qtnf_pcie_bus_priv *priv)

priv->tx_bd_num = tx_bd_size_param;
priv->rx_bd_num = rx_bd_size_param;
+ priv->rx_bd_w_index = 0;
+ priv->rx_bd_r_index = 0;

ret = alloc_skb_array(priv);
if (ret) {
@@ -532,67 +530,69 @@ static int qtnf_pcie_init_xfer(struct qtnf_pcie_bus_priv *priv)
return ret;
}

-static int qtnf_pcie_data_tx_reclaim(struct qtnf_pcie_bus_priv *priv)
+static void qtnf_pcie_data_tx_reclaim(struct qtnf_pcie_bus_priv *priv)
{
struct qtnf_tx_bd *txbd;
struct sk_buff *skb;
dma_addr_t paddr;
- int last_sent;
- int count;
+ u32 tx_done_index;
+ int count = 0;
int i;

- last_sent = readl(PCIE_HDP_RX0DMA_CNT(priv->pcie_reg_base))
- % priv->tx_bd_num;
- i = priv->tx_bd_reclaim_start;
- count = 0;

- while (i != last_sent) {
- skb = priv->tx_skb[i];
- if (!skb)
- break;
+ tx_done_index = readl(PCIE_HDP_RX0DMA_CNT(priv->pcie_reg_base))
+ & (priv->tx_bd_num - 1);

- txbd = &priv->tx_bd_vbase[i];
- paddr = QTN_HOST_ADDR(le32_to_cpu(txbd->addr_h),
- le32_to_cpu(txbd->addr));
- pci_unmap_single(priv->pdev, paddr, skb->len, PCI_DMA_TODEVICE);
+ i = priv->tx_bd_r_index;

- if (skb->dev) {
- skb->dev->stats.tx_packets++;
- skb->dev->stats.tx_bytes += skb->len;
+ while (CIRC_CNT(tx_done_index, i, priv->tx_bd_num)) {
+ skb = priv->tx_skb[i];
+ if (likely(skb)) {
+ txbd = &priv->tx_bd_vbase[i];
+ paddr = QTN_HOST_ADDR(le32_to_cpu(txbd->addr_h),
+ le32_to_cpu(txbd->addr));
+ pci_unmap_single(priv->pdev, paddr, skb->len,
+ PCI_DMA_TODEVICE);
+
+ if (skb->dev) {
+ skb->dev->stats.tx_packets++;
+ skb->dev->stats.tx_bytes += skb->len;
+
+ if (netif_queue_stopped(skb->dev))
+ netif_wake_queue(skb->dev);
+ }

- if (netif_queue_stopped(skb->dev))
- netif_wake_queue(skb->dev);
+ dev_kfree_skb_any(skb);
}

- dev_kfree_skb_any(skb);
priv->tx_skb[i] = NULL;
- priv->tx_queue_len--;
count++;

if (++i >= priv->tx_bd_num)
i = 0;
}

- priv->tx_bd_reclaim_start = i;
priv->tx_reclaim_done += count;
priv->tx_reclaim_req++;
+ priv->tx_bd_r_index = i;

- return count;
}

-static bool qtnf_tx_queue_ready(struct qtnf_pcie_bus_priv *priv)
+static int qtnf_tx_queue_ready(struct qtnf_pcie_bus_priv *priv)
{
- if (priv->tx_queue_len >= priv->tx_bd_num - 1) {
+ if (!CIRC_SPACE(priv->tx_bd_w_index, priv->tx_bd_r_index,
+ priv->tx_bd_num)) {
pr_err_ratelimited("reclaim full Tx queue\n");
qtnf_pcie_data_tx_reclaim(priv);

- if (priv->tx_queue_len >= priv->tx_bd_num - 1) {
+ if (!CIRC_SPACE(priv->tx_bd_w_index, priv->tx_bd_r_index,
+ priv->tx_bd_num)) {
priv->tx_full_count++;
- return false;
+ return 0;
}
}

- return true;
+ return 1;
}

static int qtnf_pcie_data_tx(struct qtnf_bus *bus, struct sk_buff *skb)
@@ -617,7 +617,7 @@ static int qtnf_pcie_data_tx(struct qtnf_bus *bus, struct sk_buff *skb)
return NETDEV_TX_BUSY;
}

- i = priv->tx_bd_index;
+ i = priv->tx_bd_w_index;
priv->tx_skb[i] = skb;
len = skb->len;

@@ -649,8 +649,7 @@ static int qtnf_pcie_data_tx(struct qtnf_bus *bus, struct sk_buff *skb)
if (++i >= priv->tx_bd_num)
i = 0;

- priv->tx_bd_index = i;
- priv->tx_queue_len++;
+ priv->tx_bd_w_index = i;

tx_done:
if (ret && skb) {
@@ -709,16 +708,19 @@ static irqreturn_t qtnf_interrupt(int irq, void *data)
return IRQ_HANDLED;
}

-static inline void hw_txproc_wr_ptr_inc(struct qtnf_pcie_bus_priv *priv)
+static int qtnf_rx_data_ready(struct qtnf_pcie_bus_priv *priv)
{
- u32 index;
+ u16 index = priv->rx_bd_r_index;
+ struct qtnf_rx_bd *rxbd;
+ u32 descw;

- index = priv->hw_txproc_wr_ptr;
+ rxbd = &priv->rx_bd_vbase[index];
+ descw = le32_to_cpu(rxbd->info);

- if (++index >= priv->rx_bd_num)
- index = 0;
+ if (descw & QTN_TXDONE_MASK)
+ return 1;

- priv->hw_txproc_wr_ptr = index;
+ return 0;
}

static int qtnf_rx_poll(struct napi_struct *napi, int budget)
@@ -730,26 +732,52 @@ static int qtnf_rx_poll(struct napi_struct *napi, int budget)
int processed = 0;
struct qtnf_rx_bd *rxbd;
dma_addr_t skb_paddr;
+ int consume;
u32 descw;
- u16 index;
+ u32 psize;
+ u16 r_idx;
+ u16 w_idx;
int ret;

- index = priv->rx_bd_index;
- rxbd = &priv->rx_bd_vbase[index];
+ while (processed < budget) {

- descw = le32_to_cpu(rxbd->info);

- while ((descw & QTN_TXDONE_MASK) && (processed < budget)) {
- skb = priv->rx_skb[index];
+ if (!qtnf_rx_data_ready(priv))
+ goto rx_out;

- if (likely(skb)) {
- skb_put(skb, QTN_GET_LEN(descw));
+ r_idx = priv->rx_bd_r_index;
+ rxbd = &priv->rx_bd_vbase[r_idx];
+ descw = le32_to_cpu(rxbd->info);
+
+ skb = priv->rx_skb[r_idx];
+ psize = QTN_GET_LEN(descw);
+ consume = 1;

+ if (!(descw & QTN_TXDONE_MASK)) {
+ pr_warn("skip invalid rxbd[%d]\n", r_idx);
+ consume = 0;
+ }
+
+ if (!skb) {
+ pr_warn("skip missing rx_skb[%d]\n", r_idx);
+ consume = 0;
+ }
+
+ if (skb && (skb_tailroom(skb) < psize)) {
+ pr_err("skip packet with invalid length: %u > %u\n",
+ psize, skb_tailroom(skb));
+ consume = 0;
+ }
+
+ if (skb) {
skb_paddr = QTN_HOST_ADDR(le32_to_cpu(rxbd->addr_h),
le32_to_cpu(rxbd->addr));
pci_unmap_single(priv->pdev, skb_paddr, SKB_BUF_SIZE,
PCI_DMA_FROMDEVICE);
+ }

+ if (consume) {
+ skb_put(skb, psize);
ndev = qtnf_classify_skb(bus, skb);
if (likely(ndev)) {
ndev->stats.rx_packets++;
@@ -762,30 +790,38 @@ static int qtnf_rx_poll(struct napi_struct *napi, int budget)
bus->mux_dev.stats.rx_dropped++;
dev_kfree_skb_any(skb);
}
-
- processed++;
} else {
- pr_err("missing rx_skb[%d]\n", index);
+ if (skb) {
+ bus->mux_dev.stats.rx_dropped++;
+ dev_kfree_skb_any(skb);
+ }
}

- /* attached rx buffer is passed upstream: map a new one */
- ret = skb2rbd_attach(priv, index);
- if (likely(!ret)) {
- if (++index >= priv->rx_bd_num)
- index = 0;
+ priv->rx_skb[r_idx] = NULL;
+ if (++r_idx >= priv->rx_bd_num)
+ r_idx = 0;

- priv->rx_bd_index = index;
- hw_txproc_wr_ptr_inc(priv);
+ priv->rx_bd_r_index = r_idx;

- rxbd = &priv->rx_bd_vbase[index];
- descw = le32_to_cpu(rxbd->info);
- } else {
- pr_err("failed to allocate new rx_skb[%d]\n", index);
- break;
+ /* replace processed buffer with a new one */
+ w_idx = priv->rx_bd_w_index;
+ while (CIRC_SPACE(priv->rx_bd_w_index, priv->rx_bd_r_index,
+ priv->rx_bd_num) > 0) {
+ if (++w_idx >= priv->rx_bd_num)
+ w_idx = 0;
+
+ ret = skb2rbd_attach(priv, w_idx);
+ if (ret) {
+ pr_err("failed to allocate new rx_skb[%d]\n",
+ w_idx);
+ break;
+ }
}

+ processed++;
}

+rx_out:
if (processed < budget) {
napi_complete(napi);
qtnf_en_rxdone_irq(priv);
@@ -1056,10 +1092,18 @@ static int qtnf_dbg_irq_stats(struct seq_file *s, void *data)
{
struct qtnf_bus *bus = dev_get_drvdata(s->private);
struct qtnf_pcie_bus_priv *priv = get_bus_priv(bus);
+ u32 reg = readl(PCIE_HDP_INT_EN(priv->pcie_reg_base));
+ u32 status;

seq_printf(s, "pcie_irq_count(%u)\n", priv->pcie_irq_count);
seq_printf(s, "pcie_irq_tx_count(%u)\n", priv->pcie_irq_tx_count);
+ status = reg & PCIE_HDP_INT_TX_BITS;
+ seq_printf(s, "pcie_irq_tx_status(%s)\n",
+ (status == PCIE_HDP_INT_TX_BITS) ? "EN" : "DIS");
seq_printf(s, "pcie_irq_rx_count(%u)\n", priv->pcie_irq_rx_count);
+ status = reg & PCIE_HDP_INT_RX_BITS;
+ seq_printf(s, "pcie_irq_rx_status(%s)\n",
+ (status == PCIE_HDP_INT_RX_BITS) ? "EN" : "DIS");

return 0;
}
@@ -1073,10 +1117,24 @@ static int qtnf_dbg_hdp_stats(struct seq_file *s, void *data)
seq_printf(s, "tx_done_count(%u)\n", priv->tx_done_count);
seq_printf(s, "tx_reclaim_done(%u)\n", priv->tx_reclaim_done);
seq_printf(s, "tx_reclaim_req(%u)\n", priv->tx_reclaim_req);
- seq_printf(s, "tx_bd_reclaim_start(%u)\n", priv->tx_bd_reclaim_start);
- seq_printf(s, "tx_bd_index(%u)\n", priv->tx_bd_index);
- seq_printf(s, "rx_bd_index(%u)\n", priv->rx_bd_index);
- seq_printf(s, "tx_queue_len(%u)\n", priv->tx_queue_len);
+
+ seq_printf(s, "tx_bd_r_index(%u)\n", priv->tx_bd_r_index);
+ seq_printf(s, "tx_bd_p_index(%u)\n",
+ readl(PCIE_HDP_RX0DMA_CNT(priv->pcie_reg_base))
+ & (priv->tx_bd_num - 1));
+ seq_printf(s, "tx_bd_w_index(%u)\n", priv->tx_bd_w_index);
+ seq_printf(s, "tx queue len(%u)\n",
+ CIRC_CNT(priv->tx_bd_w_index, priv->tx_bd_r_index,
+ priv->tx_bd_num));
+
+ seq_printf(s, "rx_bd_r_index(%u)\n", priv->rx_bd_r_index);
+ seq_printf(s, "rx_bd_p_index(%u)\n",
+ readl(PCIE_HDP_TX0DMA_CNT(priv->pcie_reg_base))
+ & (priv->rx_bd_num - 1));
+ seq_printf(s, "rx_bd_w_index(%u)\n", priv->rx_bd_w_index);
+ seq_printf(s, "rx alloc queue len(%u)\n",
+ CIRC_SPACE(priv->rx_bd_w_index, priv->rx_bd_r_index,
+ priv->rx_bd_num));

return 0;
}
diff --git a/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie_bus_priv.h b/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie_bus_priv.h
index 2a897db2bd79..1b37914299e9 100644
--- a/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie_bus_priv.h
+++ b/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie_bus_priv.h
@@ -66,13 +66,11 @@ struct qtnf_pcie_bus_priv {
void *bd_table_vaddr;
u32 bd_table_len;

- u32 hw_txproc_wr_ptr;
+ u32 rx_bd_w_index;
+ u32 rx_bd_r_index;

- u16 tx_bd_reclaim_start;
- u16 tx_bd_index;
- u32 tx_queue_len;
-
- u16 rx_bd_index;
+ u32 tx_bd_w_index;
+ u32 tx_bd_r_index;

u32 pcie_irq_mask;

--
2.11.0

2017-08-15 13:06:56

by Sergey Matyukevich

Subject: [PATCH 2/9] qtnfmac: switch to napi_gro_receive

Use napi_gro_receive() rather than netif_receive_skb() in the qtnfmac driver.

Signed-off-by: Sergey Matyukevich <[email protected]>
---
drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c b/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c
index ae8acc1bf291..08b35dc30bc8 100644
--- a/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c
+++ b/drivers/net/wireless/quantenna/qtnfmac/pearl/pcie.c
@@ -762,7 +762,7 @@ static int qtnf_rx_poll(struct napi_struct *napi, int budget)
ndev->stats.rx_bytes += skb->len;

skb->protocol = eth_type_trans(skb, ndev);
- netif_receive_skb(skb);
+ napi_gro_receive(napi, skb);
} else {
pr_debug("drop untagged skb\n");
bus->mux_dev.stats.rx_dropped++;
--
2.11.0