2024-05-28 13:49:29

by Alexander Lobakin

Subject: [PATCH iwl-next 00/12] idpf: XDP chapter I: convert Rx to libeth

XDP for idpf is currently 5 chapters:
* convert Rx to libeth (this);
* convert Tx and stats to libeth;
* generic XDP and XSk code changes, libeth_xdp;
* actual XDP for idpf via libeth_xdp;
* XSk for idpf (^).

Part I does the following:
* splits &idpf_queue into 4 (RQ, SQ, FQ, CQ; see the simplified sketch
  after this list) and puts them on a diet;
* ensures optimal cacheline placement, strictly asserts CL sizes;
* moves currently unused/dead singleq mode out of line;
* reuses libeth's Rx ptype definitions and helpers;
* uses libeth's Rx buffer management for both header and payload;
* eliminates memcpy()s and coherent DMA uses on hotpath, uses
napi_build_skb() instead of in-place short skb allocation.
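
For orientation, the four strictly-typed queue structs roughly look as
follows. This is a heavily trimmed sketch with only a couple of assumed
fields each (the descriptor pointer types are the ones from the ring
union introduced in patch 02, the struct names are the ones used later
in the series; the real definitions carry far more state):

struct idpf_rx_queue {			/* RQ: Rx (completion) queue */
	union virtchnl2_rx_desc *rx;
	u32 desc_count;
};

struct idpf_tx_queue {			/* SQ: Tx (send) queue */
	union idpf_tx_flex_desc *flex_tx;
	u32 desc_count;
};

struct idpf_buf_queue {			/* FQ: Rx buffer (fill) queue */
	struct virtchnl2_splitq_rx_buf_desc *split_buf;
	u32 desc_count;
};

struct idpf_compl_queue {		/* CQ: Tx completion queue */
	struct idpf_splitq_tx_compl_desc *comp;
	u32 desc_count;
};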

Most idpf patches, except for the queue split, remove more lines
than they add.
Expect far better memory utilization and a +5-8% Rx performance gain
depending on the case (+17% on skb XDP_DROP :>).

Alexander Lobakin (12):
libeth: add cacheline / struct alignment helpers
idpf: stop using macros for accessing queue descriptors
idpf: split &idpf_queue into 4 strictly-typed queue structures
idpf: avoid bloating &idpf_q_vector with big %NR_CPUS
idpf: strictly assert cachelines of queue and queue vector structures
idpf: merge singleq and splitq &net_device_ops
idpf: compile singleq code only under default-n CONFIG_IDPF_SINGLEQ
idpf: reuse libeth's definitions of parsed ptype structures
idpf: remove legacy Page Pool Ethtool stats
libeth: support different types of buffers for Rx
idpf: convert header split mode to libeth + napi_build_skb()
idpf: use libeth Rx buffer management for payload buffer

drivers/net/ethernet/intel/Kconfig | 13 +-
drivers/net/ethernet/intel/idpf/Kconfig | 26 +
drivers/net/ethernet/intel/idpf/Makefile | 3 +-
scripts/kernel-doc | 1 +
drivers/net/ethernet/intel/idpf/idpf.h | 11 +-
.../net/ethernet/intel/idpf/idpf_lan_txrx.h | 2 +
drivers/net/ethernet/intel/idpf/idpf_txrx.h | 800 +++++-----
include/net/libeth/cache.h | 100 ++
include/net/libeth/rx.h | 19 +
.../net/ethernet/intel/idpf/idpf_ethtool.c | 152 +-
drivers/net/ethernet/intel/idpf/idpf_lib.c | 88 +-
drivers/net/ethernet/intel/idpf/idpf_main.c | 1 +
.../ethernet/intel/idpf/idpf_singleq_txrx.c | 311 ++--
drivers/net/ethernet/intel/idpf/idpf_txrx.c | 1410 +++++++++--------
.../net/ethernet/intel/idpf/idpf_virtchnl.c | 178 ++-
drivers/net/ethernet/intel/libeth/rx.c | 132 +-
16 files changed, 1824 insertions(+), 1423 deletions(-)
create mode 100644 drivers/net/ethernet/intel/idpf/Kconfig
create mode 100644 include/net/libeth/cache.h

---
Applies on top of "idpf: don't enable NAPI and interrupts prior to
allocating Rx buffers" from the -net tree; otherwise it may be unstable.

Changes from the RFC [0]:
* *: add kdocs where needed and fix the existing ones to build cleanly;
fix minor checkpatch and codespell warnings;
add RBs from Przemek;
* 01: fix kdoc script to understand new libeth_cacheline_group() macro;
add an additional assert for queue struct alignment;
* 02: pick RB from Mina;
* 06: make idpf_chk_linearize() static as it's now used only in one file;
* 07: rephrase the commit message: HW supports it, but never wants it;
* 08: fix crashes on some configurations (Mina);
* 11: constify header buffer pointer in idpf_rx_hsplit_wa().

Testing hints: basic Rx regression tests (+ perf and memory usage
before/after if needed).

Note that I'm on vacation starting 30th May and returning on 10th June.
I may reply occasionally, but there will be no new versions in the
meantime =\

[0] https://lore.kernel.org/netdev/[email protected]
--
2.45.1



2024-05-28 13:49:46

by Alexander Lobakin

Subject: [PATCH iwl-next 01/12] libeth: add cacheline / struct alignment helpers

Following the latest netdev trend of effective, usage-based field
cacheline placement, add helpers to group struct fields by cacheline
and then assert the group sizes.
For 64-bit kernels with 64-byte cachelines, the assertions are strict,
as the size can then be easily predicted. For the rest, just make sure
the groups don't cross the specified bound.
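
A minimal usage sketch (hypothetical struct and sizes, not taken from
this series): group the fields, then pass the per-group field size sums
to the set assertion. On 64-bit with 64-byte cachelines the sizes must
match exactly; elsewhere they act as upper bounds.

#include <linux/stddef.h>
#include <linux/types.h>
#include <net/libeth/cache.h>

struct ex_queue {
	libeth_cacheline_group(read_mostly,
		void *desc_ring;		/* 8 bytes */
		u32 desc_count;			/* 4 bytes */
		u32 flags;			/* 4 bytes */
	);
	libeth_cacheline_group(read_write,
		u32 next_to_use;		/* 4 bytes */
		u32 next_to_clean;		/* 4 bytes */
	);
	libeth_cacheline_group(cold,
		dma_addr_t dma;			/* 8 bytes */
	);
};

/* 16 + 8 + 8 bytes of fields ==> 3 cachelines after alignment */
libeth_cacheline_set_assert(struct ex_queue, 16, 8, 8);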

Signed-off-by: Alexander Lobakin <[email protected]>
---
scripts/kernel-doc | 1 +
include/net/libeth/cache.h | 100 +++++++++++++++++++++++++++++++++++++
2 files changed, 101 insertions(+)
create mode 100644 include/net/libeth/cache.h

diff --git a/scripts/kernel-doc b/scripts/kernel-doc
index 95a59ac78f82..d0cf9a2d82de 100755
--- a/scripts/kernel-doc
+++ b/scripts/kernel-doc
@@ -1155,6 +1155,7 @@ sub dump_struct($$) {
$members =~ s/\bstruct_group_attr\s*\(([^,]*,){2}/STRUCT_GROUP(/gos;
$members =~ s/\bstruct_group_tagged\s*\(([^,]*),([^,]*),/struct $1 $2; STRUCT_GROUP(/gos;
$members =~ s/\b__struct_group\s*\(([^,]*,){3}/STRUCT_GROUP(/gos;
+ $members =~ s/\blibeth_cacheline_group\s*\(([^,]*,)/struct { } $1; STRUCT_GROUP(/gos;
$members =~ s/\bSTRUCT_GROUP(\(((?:(?>[^)(]+)|(?1))*)\))[^;]*;/$2/gos;

my $args = qr{([^,)]+)};
diff --git a/include/net/libeth/cache.h b/include/net/libeth/cache.h
new file mode 100644
index 000000000000..5579240913d2
--- /dev/null
+++ b/include/net/libeth/cache.h
@@ -0,0 +1,100 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright (C) 2024 Intel Corporation */
+
+#ifndef __LIBETH_CACHE_H
+#define __LIBETH_CACHE_H
+
+#include <linux/cache.h>
+
+/* ``__aligned_largest`` is architecture-dependent. Get the actual alignment */
+#define ___LIBETH_LARGEST_ALIGN \
+ sizeof(struct { long __UNIQUE_ID(long_); } __aligned_largest)
+#define __LIBETH_LARGEST_ALIGN \
+ (___LIBETH_LARGEST_ALIGN > SMP_CACHE_BYTES ? \
+ ___LIBETH_LARGEST_ALIGN : SMP_CACHE_BYTES)
+#define __LIBETH_LARGEST_ALIGNED(sz) \
+ ALIGN(sz, __LIBETH_LARGEST_ALIGN)
+
+#define __libeth_cacheline_group_begin(grp) \
+ __cacheline_group_begin(grp) __aligned(__LIBETH_LARGEST_ALIGN)
+#define __libeth_cacheline_group_end(grp) \
+ __cacheline_group_end(grp) __aligned(sizeof(long))
+
+/**
+ * libeth_cacheline_group - declare a cacheline-aligned field group
+ * @grp: name of the group (usually 'read_mostly', 'read_write', or 'cold')
+ * @...: struct fields inside the group
+ *
+ * Note that the whole group is cacheline-aligned, but the end marker is
+ * aligned to long, so that you pass the (almost) actual field size sum to
+ * the assertion macros below instead of CL-aligned values.
+ * Each cacheline group must be described in struct's kernel-doc.
+ */
+#define libeth_cacheline_group(grp, ...) \
+ struct_group(grp, \
+ __libeth_cacheline_group_begin(grp); \
+ __VA_ARGS__ \
+ __libeth_cacheline_group_end(grp); \
+ )
+
+/**
+ * libeth_cacheline_group_assert - make sure cacheline group size is expected
+ * @type: type of the structure containing the group
+ * @grp: group name inside the struct
+ * @sz: expected group size
+ */
+#if defined(CONFIG_64BIT) && SMP_CACHE_BYTES == 64
+#define libeth_cacheline_group_assert(type, grp, sz) \
+ static_assert(offsetof(type, __cacheline_group_end__##grp) - \
+ offsetofend(type, __cacheline_group_begin__##grp) == \
+ (sz))
+#define __libeth_cacheline_struct_assert(type, sz) \
+ static_assert(sizeof(type) == (sz))
+#else /* !CONFIG_64BIT || SMP_CACHE_BYTES != 64 */
+#define libeth_cacheline_group_assert(type, grp, sz) \
+ static_assert(offsetof(type, __cacheline_group_end__##grp) - \
+ offsetofend(type, __cacheline_group_begin__##grp) <= \
+ (sz))
+#define __libeth_cacheline_struct_assert(type, sz) \
+ static_assert(sizeof(type) <= (sz))
+#endif /* !CONFIG_64BIT || SMP_CACHE_BYTES != 64 */
+
+#define __libeth_cls1(sz1) \
+ __LIBETH_LARGEST_ALIGNED(sz1)
+#define __libeth_cls2(sz1, sz2) \
+ (__LIBETH_LARGEST_ALIGNED(sz1) + __LIBETH_LARGEST_ALIGNED(sz2))
+#define __libeth_cls3(sz1, sz2, sz3) \
+ (__LIBETH_LARGEST_ALIGNED(sz1) + __LIBETH_LARGEST_ALIGNED(sz2) + \
+ __LIBETH_LARGEST_ALIGNED(sz3))
+#define __libeth_cls(...) \
+ CONCATENATE(__libeth_cls, COUNT_ARGS(__VA_ARGS__))(__VA_ARGS__)
+
+/**
+ * libeth_cacheline_struct_assert - make sure CL-based struct size is expected
+ * @type: type of the struct
+ * @...: from 1 to 3 CL group sizes (read-mostly, read-write, cold)
+ *
+ * When a struct contains several CL groups, it's difficult to predict its size
+ * on different architectures. The macro instead takes sizes of all of the
+ * groups the structure contains and generates the final struct size.
+ */
+#define libeth_cacheline_struct_assert(type, ...) \
+ __libeth_cacheline_struct_assert(type, __libeth_cls(__VA_ARGS__)); \
+ static_assert(__alignof(type) >= __LIBETH_LARGEST_ALIGN)
+
+/**
+ * libeth_cacheline_set_assert - make sure CL-based struct layout is expected
+ * @type: type of the struct
+ * @ro: expected size of the read-mostly group
+ * @rw: expected size of the read-write group
+ * @c: expected size of the cold group
+ *
+ * Check that each group size is expected and then do final struct size check.
+ */
+#define libeth_cacheline_set_assert(type, ro, rw, c) \
+ libeth_cacheline_group_assert(type, read_mostly, ro); \
+ libeth_cacheline_group_assert(type, read_write, rw); \
+ libeth_cacheline_group_assert(type, cold, c); \
+ libeth_cacheline_struct_assert(type, ro, rw, c)
+
+#endif /* __LIBETH_CACHE_H */
--
2.45.1


2024-05-28 13:50:06

by Alexander Lobakin

Subject: [PATCH iwl-next 02/12] idpf: stop using macros for accessing queue descriptors

In C, we have structures and unions.
Casting `void *` via macros is not only error-prone, but also looks
confusing and awful in general.
In preparation for splitting the queue structs, replace the macros with
a union of typed ring pointers and direct array dereferences.
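
For illustration, the access pattern goes from a cast hidden behind a
macro to a plain dereference of a typed union member (taken from the
hunks below):

	/* before */
	tx_desc = IDPF_BASE_TX_DESC(tx_q, i);
	rx_desc = IDPF_RX_DESC(rx_q, ntc);

	/* after: ->desc_ring is unionized with typed ring pointers */
	tx_desc = &tx_q->base_tx[i];
	rx_desc = &rx_q->rx[ntc];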

Reviewed-by: Przemek Kitszel <[email protected]>
Reviewed-by: Mina Almasry <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
drivers/net/ethernet/intel/idpf/idpf.h | 1 -
.../net/ethernet/intel/idpf/idpf_lan_txrx.h | 2 +
drivers/net/ethernet/intel/idpf/idpf_txrx.h | 47 ++++++++++---------
.../ethernet/intel/idpf/idpf_singleq_txrx.c | 20 ++++----
drivers/net/ethernet/intel/idpf/idpf_txrx.c | 32 ++++++-------
5 files changed, 52 insertions(+), 50 deletions(-)

diff --git a/drivers/net/ethernet/intel/idpf/idpf.h b/drivers/net/ethernet/intel/idpf/idpf.h
index e7a036538246..0b26dd9b8a51 100644
--- a/drivers/net/ethernet/intel/idpf/idpf.h
+++ b/drivers/net/ethernet/intel/idpf/idpf.h
@@ -20,7 +20,6 @@ struct idpf_vport_max_q;
#include <linux/dim.h>

#include "virtchnl2.h"
-#include "idpf_lan_txrx.h"
#include "idpf_txrx.h"
#include "idpf_controlq.h"

diff --git a/drivers/net/ethernet/intel/idpf/idpf_lan_txrx.h b/drivers/net/ethernet/intel/idpf/idpf_lan_txrx.h
index a5752dcab888..8c7f8ef8f1a1 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_lan_txrx.h
+++ b/drivers/net/ethernet/intel/idpf/idpf_lan_txrx.h
@@ -4,6 +4,8 @@
#ifndef _IDPF_LAN_TXRX_H_
#define _IDPF_LAN_TXRX_H_

+#include <linux/bits.h>
+
enum idpf_rss_hash {
IDPF_HASH_INVALID = 0,
/* Values 1 - 28 are reserved for future use */
diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.h b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
index 551391e20464..6dce14483215 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.h
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
@@ -8,6 +8,7 @@
#include <net/tcp.h>
#include <net/netdev_queues.h>

+#include "idpf_lan_txrx.h"
#include "virtchnl2_lan_desc.h"

#define IDPF_LARGE_MAX_Q 256
@@ -117,24 +118,6 @@ do { \
#define IDPF_RXD_EOF_SPLITQ VIRTCHNL2_RX_FLEX_DESC_ADV_STATUS0_EOF_M
#define IDPF_RXD_EOF_SINGLEQ VIRTCHNL2_RX_BASE_DESC_STATUS_EOF_M

-#define IDPF_SINGLEQ_RX_BUF_DESC(rxq, i) \
- (&(((struct virtchnl2_singleq_rx_buf_desc *)((rxq)->desc_ring))[i]))
-#define IDPF_SPLITQ_RX_BUF_DESC(rxq, i) \
- (&(((struct virtchnl2_splitq_rx_buf_desc *)((rxq)->desc_ring))[i]))
-#define IDPF_SPLITQ_RX_BI_DESC(rxq, i) ((((rxq)->ring))[i])
-
-#define IDPF_BASE_TX_DESC(txq, i) \
- (&(((struct idpf_base_tx_desc *)((txq)->desc_ring))[i]))
-#define IDPF_BASE_TX_CTX_DESC(txq, i) \
- (&(((struct idpf_base_tx_ctx_desc *)((txq)->desc_ring))[i]))
-#define IDPF_SPLITQ_TX_COMPLQ_DESC(txcq, i) \
- (&(((struct idpf_splitq_tx_compl_desc *)((txcq)->desc_ring))[i]))
-
-#define IDPF_FLEX_TX_DESC(txq, i) \
- (&(((union idpf_tx_flex_desc *)((txq)->desc_ring))[i]))
-#define IDPF_FLEX_TX_CTX_DESC(txq, i) \
- (&(((struct idpf_flex_tx_ctx_desc *)((txq)->desc_ring))[i]))
-
#define IDPF_DESC_UNUSED(txq) \
((((txq)->next_to_clean > (txq)->next_to_use) ? 0 : (txq)->desc_count) + \
(txq)->next_to_clean - (txq)->next_to_use - 1)
@@ -317,8 +300,6 @@ struct idpf_rx_extracted {

#define IDPF_RX_DMA_ATTR \
(DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_WEAK_ORDERING)
-#define IDPF_RX_DESC(rxq, i) \
- (&(((union virtchnl2_rx_desc *)((rxq)->desc_ring))[i]))

struct idpf_rx_buf {
struct page *page;
@@ -655,7 +636,15 @@ union idpf_queue_stats {
* @q_vector: Backreference to associated vector
* @size: Length of descriptor ring in bytes
* @dma: Physical address of ring
- * @desc_ring: Descriptor ring memory
+ * @rx: universal receive descriptor array
+ * @single_buf: Rx buffer descriptor array in singleq
+ * @split_buf: Rx buffer descriptor array in splitq
+ * @base_tx: basic Tx descriptor array
+ * @base_ctx: basic Tx context descriptor array
+ * @flex_tx: flex Tx descriptor array
+ * @flex_ctx: flex Tx context descriptor array
+ * @comp: completion descriptor array
+ * @desc_ring: virtual descriptor ring address
* @tx_max_bufs: Max buffers that can be transmitted with scatter-gather
* @tx_min_pkt_len: Min supported packet length
* @num_completions: Only relevant for TX completion queue. It tracks the
@@ -733,7 +722,21 @@ struct idpf_queue {
struct idpf_q_vector *q_vector;
unsigned int size;
dma_addr_t dma;
- void *desc_ring;
+ union {
+ union virtchnl2_rx_desc *rx;
+
+ struct virtchnl2_singleq_rx_buf_desc *single_buf;
+ struct virtchnl2_splitq_rx_buf_desc *split_buf;
+
+ struct idpf_base_tx_desc *base_tx;
+ struct idpf_base_tx_ctx_desc *base_ctx;
+ union idpf_tx_flex_desc *flex_tx;
+ struct idpf_flex_tx_ctx_desc *flex_ctx;
+
+ struct idpf_splitq_tx_compl_desc *comp;
+
+ void *desc_ring;
+ };

u16 tx_max_bufs;
u8 tx_min_pkt_len;
diff --git a/drivers/net/ethernet/intel/idpf/idpf_singleq_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_singleq_txrx.c
index 27b93592c4ba..b17d88e15000 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_singleq_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_singleq_txrx.c
@@ -205,7 +205,7 @@ static void idpf_tx_singleq_map(struct idpf_queue *tx_q,
data_len = skb->data_len;
size = skb_headlen(skb);

- tx_desc = IDPF_BASE_TX_DESC(tx_q, i);
+ tx_desc = &tx_q->base_tx[i];

dma = dma_map_single(tx_q->dev, skb->data, size, DMA_TO_DEVICE);

@@ -239,7 +239,7 @@ static void idpf_tx_singleq_map(struct idpf_queue *tx_q,
i++;

if (i == tx_q->desc_count) {
- tx_desc = IDPF_BASE_TX_DESC(tx_q, 0);
+ tx_desc = &tx_q->base_tx[0];
i = 0;
}

@@ -259,7 +259,7 @@ static void idpf_tx_singleq_map(struct idpf_queue *tx_q,
i++;

if (i == tx_q->desc_count) {
- tx_desc = IDPF_BASE_TX_DESC(tx_q, 0);
+ tx_desc = &tx_q->base_tx[0];
i = 0;
}

@@ -307,7 +307,7 @@ idpf_tx_singleq_get_ctx_desc(struct idpf_queue *txq)
memset(&txq->tx_buf[ntu], 0, sizeof(struct idpf_tx_buf));
txq->tx_buf[ntu].ctx_entry = true;

- ctx_desc = IDPF_BASE_TX_CTX_DESC(txq, ntu);
+ ctx_desc = &txq->base_ctx[ntu];

IDPF_SINGLEQ_BUMP_RING_IDX(txq, ntu);
txq->next_to_use = ntu;
@@ -455,7 +455,7 @@ static bool idpf_tx_singleq_clean(struct idpf_queue *tx_q, int napi_budget,
struct netdev_queue *nq;
bool dont_wake;

- tx_desc = IDPF_BASE_TX_DESC(tx_q, ntc);
+ tx_desc = &tx_q->base_tx[ntc];
tx_buf = &tx_q->tx_buf[ntc];
ntc -= tx_q->desc_count;

@@ -517,7 +517,7 @@ static bool idpf_tx_singleq_clean(struct idpf_queue *tx_q, int napi_budget,
if (unlikely(!ntc)) {
ntc -= tx_q->desc_count;
tx_buf = tx_q->tx_buf;
- tx_desc = IDPF_BASE_TX_DESC(tx_q, 0);
+ tx_desc = &tx_q->base_tx[0];
}

/* unmap any remaining paged data */
@@ -540,7 +540,7 @@ static bool idpf_tx_singleq_clean(struct idpf_queue *tx_q, int napi_budget,
if (unlikely(!ntc)) {
ntc -= tx_q->desc_count;
tx_buf = tx_q->tx_buf;
- tx_desc = IDPF_BASE_TX_DESC(tx_q, 0);
+ tx_desc = &tx_q->base_tx[0];
}
} while (likely(budget));

@@ -895,7 +895,7 @@ bool idpf_rx_singleq_buf_hw_alloc_all(struct idpf_queue *rx_q,
if (!cleaned_count)
return false;

- desc = IDPF_SINGLEQ_RX_BUF_DESC(rx_q, nta);
+ desc = &rx_q->single_buf[nta];
buf = &rx_q->rx_buf.buf[nta];

do {
@@ -915,7 +915,7 @@ bool idpf_rx_singleq_buf_hw_alloc_all(struct idpf_queue *rx_q,
buf++;
nta++;
if (unlikely(nta == rx_q->desc_count)) {
- desc = IDPF_SINGLEQ_RX_BUF_DESC(rx_q, 0);
+ desc = &rx_q->single_buf[0];
buf = rx_q->rx_buf.buf;
nta = 0;
}
@@ -1016,7 +1016,7 @@ static int idpf_rx_singleq_clean(struct idpf_queue *rx_q, int budget)
struct idpf_rx_buf *rx_buf;

/* get the Rx desc from Rx queue based on 'next_to_clean' */
- rx_desc = IDPF_RX_DESC(rx_q, ntc);
+ rx_desc = &rx_q->rx[ntc];

/* status_error_ptype_len will always be zero for unused
* descriptors because it's cleared in cleanup, and overlaps
diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
index b023704bbbda..01301e640fba 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
@@ -531,7 +531,7 @@ static bool idpf_rx_post_buf_desc(struct idpf_queue *bufq, u16 buf_id)
struct idpf_rx_buf *buf;
dma_addr_t addr;

- splitq_rx_desc = IDPF_SPLITQ_RX_BUF_DESC(bufq, nta);
+ splitq_rx_desc = &bufq->split_buf[nta];
buf = &bufq->rx_buf.buf[buf_id];

if (bufq->rx_hsplit_en) {
@@ -1584,7 +1584,7 @@ do { \
if (unlikely(!(ntc))) { \
ntc -= (txq)->desc_count; \
buf = (txq)->tx_buf; \
- desc = IDPF_FLEX_TX_DESC(txq, 0); \
+ desc = &(txq)->flex_tx[0]; \
} else { \
(buf)++; \
(desc)++; \
@@ -1617,8 +1617,8 @@ static void idpf_tx_splitq_clean(struct idpf_queue *tx_q, u16 end,
s16 ntc = tx_q->next_to_clean;
struct idpf_tx_buf *tx_buf;

- tx_desc = IDPF_FLEX_TX_DESC(tx_q, ntc);
- next_pending_desc = IDPF_FLEX_TX_DESC(tx_q, end);
+ tx_desc = &tx_q->flex_tx[ntc];
+ next_pending_desc = &tx_q->flex_tx[end];
tx_buf = &tx_q->tx_buf[ntc];
ntc -= tx_q->desc_count;

@@ -1814,7 +1814,7 @@ static bool idpf_tx_clean_complq(struct idpf_queue *complq, int budget,
int i;

complq_budget = vport->compln_clean_budget;
- tx_desc = IDPF_SPLITQ_TX_COMPLQ_DESC(complq, ntc);
+ tx_desc = &complq->comp[ntc];
ntc -= complq->desc_count;

do {
@@ -1879,7 +1879,7 @@ static bool idpf_tx_clean_complq(struct idpf_queue *complq, int budget,
ntc++;
if (unlikely(!ntc)) {
ntc -= complq->desc_count;
- tx_desc = IDPF_SPLITQ_TX_COMPLQ_DESC(complq, 0);
+ tx_desc = &complq->comp[0];
change_bit(__IDPF_Q_GEN_CHK, complq->flags);
}

@@ -2143,7 +2143,7 @@ void idpf_tx_dma_map_error(struct idpf_queue *txq, struct sk_buff *skb,
* used one additional descriptor for a context
* descriptor. Reset that here.
*/
- tx_desc = IDPF_FLEX_TX_DESC(txq, idx);
+ tx_desc = &txq->flex_tx[idx];
memset(tx_desc, 0, sizeof(struct idpf_flex_tx_ctx_desc));
if (idx == 0)
idx = txq->desc_count;
@@ -2202,7 +2202,7 @@ static void idpf_tx_splitq_map(struct idpf_queue *tx_q,
data_len = skb->data_len;
size = skb_headlen(skb);

- tx_desc = IDPF_FLEX_TX_DESC(tx_q, i);
+ tx_desc = &tx_q->flex_tx[i];

dma = dma_map_single(tx_q->dev, skb->data, size, DMA_TO_DEVICE);

@@ -2275,7 +2275,7 @@ static void idpf_tx_splitq_map(struct idpf_queue *tx_q,
i++;

if (i == tx_q->desc_count) {
- tx_desc = IDPF_FLEX_TX_DESC(tx_q, 0);
+ tx_desc = &tx_q->flex_tx[0];
i = 0;
tx_q->compl_tag_cur_gen =
IDPF_TX_ADJ_COMPL_TAG_GEN(tx_q);
@@ -2320,7 +2320,7 @@ static void idpf_tx_splitq_map(struct idpf_queue *tx_q,
i++;

if (i == tx_q->desc_count) {
- tx_desc = IDPF_FLEX_TX_DESC(tx_q, 0);
+ tx_desc = &tx_q->flex_tx[0];
i = 0;
tx_q->compl_tag_cur_gen = IDPF_TX_ADJ_COMPL_TAG_GEN(tx_q);
}
@@ -2553,7 +2553,7 @@ idpf_tx_splitq_get_ctx_desc(struct idpf_queue *txq)
txq->tx_buf[i].compl_tag = IDPF_SPLITQ_TX_INVAL_COMPL_TAG;

/* grab the next descriptor */
- desc = IDPF_FLEX_TX_CTX_DESC(txq, i);
+ desc = &txq->flex_ctx[i];
txq->next_to_use = idpf_tx_splitq_bump_ntu(txq, i);

return desc;
@@ -3128,7 +3128,6 @@ static int idpf_rx_splitq_clean(struct idpf_queue *rxq, int budget)
struct idpf_sw_queue *refillq = NULL;
struct idpf_rxq_set *rxq_set = NULL;
struct idpf_rx_buf *rx_buf = NULL;
- union virtchnl2_rx_desc *desc;
unsigned int pkt_len = 0;
unsigned int hdr_len = 0;
u16 gen_id, buf_id = 0;
@@ -3138,8 +3137,7 @@ static int idpf_rx_splitq_clean(struct idpf_queue *rxq, int budget)
u8 rxdid;

/* get the Rx desc from Rx queue based on 'next_to_clean' */
- desc = IDPF_RX_DESC(rxq, ntc);
- rx_desc = (struct virtchnl2_rx_flex_desc_adv_nic_3 *)desc;
+ rx_desc = &rxq->rx[ntc].flex_adv_nic_3_wb;

/* This memory barrier is needed to keep us from reading
* any other fields out of the rx_desc
@@ -3320,11 +3318,11 @@ static void idpf_rx_clean_refillq(struct idpf_queue *bufq,
int cleaned = 0;
u16 gen;

- buf_desc = IDPF_SPLITQ_RX_BUF_DESC(bufq, bufq_nta);
+ buf_desc = &bufq->split_buf[bufq_nta];

/* make sure we stop at ring wrap in the unlikely case ring is full */
while (likely(cleaned < refillq->desc_count)) {
- u16 refill_desc = IDPF_SPLITQ_RX_BI_DESC(refillq, ntc);
+ u16 refill_desc = refillq->ring[ntc];
bool failure;

gen = FIELD_GET(IDPF_RX_BI_GEN_M, refill_desc);
@@ -3342,7 +3340,7 @@ static void idpf_rx_clean_refillq(struct idpf_queue *bufq,
}

if (unlikely(++bufq_nta == bufq->desc_count)) {
- buf_desc = IDPF_SPLITQ_RX_BUF_DESC(bufq, 0);
+ buf_desc = &bufq->split_buf[0];
bufq_nta = 0;
} else {
buf_desc++;
--
2.45.1


2024-05-28 13:50:23

by Alexander Lobakin

Subject: [PATCH iwl-next 04/12] idpf: avoid bloating &idpf_q_vector with big %NR_CPUS

With CONFIG_MAXSMP, sizeof(cpumask_t) is 1 KB. The queue vector
structure has one embedded, which means 1 additional KB of data that is
not really hotpath.
We have cpumask_var_t, which is either an embedded cpumask or a pointer
to a dynamically allocated one when it's big. Use it instead of a plain
cpumask and put &idpf_q_vector on a good diet.
Also remove the redundant pointer to the interrupt name from the
structure: request_irq() saves it and free_irq() returns it on deinit,
so the memory can be freed right after that.
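
The resulting allocation/teardown pattern, condensed from the hunks
below (not a complete function):

	/* per-vector allocation in idpf_vport_intr_alloc() */
	if (!zalloc_cpumask_var(&q_vector->affinity_mask, GFP_KERNEL))
		goto error;

	/* the kasprintf()'d name is kept by the IRQ core ... */
	name = kasprintf(GFP_KERNEL, "%s-%s-%d", basename, vec_name, vidx);
	err = request_irq(irq_num, idpf_vport_intr_clean_queues, 0, name,
			  q_vector);

	/* ... and handed back by free_irq(), so free it right there */
	kfree(free_irq(irq_num, q_vector));
	free_cpumask_var(q_vector->affinity_mask);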

Reviewed-by: Przemek Kitszel <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
drivers/net/ethernet/intel/idpf/idpf_txrx.h | 7 +++----
drivers/net/ethernet/intel/idpf/idpf_lib.c | 13 ++++++-------
drivers/net/ethernet/intel/idpf/idpf_txrx.c | 20 +++++++++++++-------
3 files changed, 22 insertions(+), 18 deletions(-)

diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.h b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
index ce4ade9c864f..9e4ba0aaf3ab 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.h
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
@@ -505,7 +505,6 @@ struct idpf_intr_reg {
/**
* struct idpf_q_vector
* @vport: Vport back pointer
- * @affinity_mask: CPU affinity mask
* @napi: napi handler
* @v_idx: Vector index
* @intr_reg: See struct idpf_intr_reg
@@ -526,11 +525,10 @@ struct idpf_intr_reg {
* @num_bufq: Number of buffer queues
* @bufq: Array of buffer queues to service
* @total_events: Number of interrupts processed
- * @name: Queue vector name
+ * @affinity_mask: CPU affinity mask
*/
struct idpf_q_vector {
struct idpf_vport *vport;
- cpumask_t affinity_mask;
struct napi_struct napi;
u16 v_idx;
struct idpf_intr_reg intr_reg;
@@ -556,7 +554,8 @@ struct idpf_q_vector {
struct idpf_buf_queue **bufq;

u16 total_events;
- char *name;
+
+ cpumask_var_t affinity_mask;
};

struct idpf_rx_queue_stats {
diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c
index 3e8b24430dd8..a8be09a89943 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_lib.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c
@@ -69,7 +69,7 @@ static void idpf_deinit_vector_stack(struct idpf_adapter *adapter)
static void idpf_mb_intr_rel_irq(struct idpf_adapter *adapter)
{
clear_bit(IDPF_MB_INTR_MODE, adapter->flags);
- free_irq(adapter->msix_entries[0].vector, adapter);
+ kfree(free_irq(adapter->msix_entries[0].vector, adapter));
queue_delayed_work(adapter->mbx_wq, &adapter->mbx_task, 0);
}

@@ -124,15 +124,14 @@ static void idpf_mb_irq_enable(struct idpf_adapter *adapter)
*/
static int idpf_mb_intr_req_irq(struct idpf_adapter *adapter)
{
- struct idpf_q_vector *mb_vector = &adapter->mb_vector;
int irq_num, mb_vidx = 0, err;
+ char *name;

irq_num = adapter->msix_entries[mb_vidx].vector;
- mb_vector->name = kasprintf(GFP_KERNEL, "%s-%s-%d",
- dev_driver_string(&adapter->pdev->dev),
- "Mailbox", mb_vidx);
- err = request_irq(irq_num, adapter->irq_mb_handler, 0,
- mb_vector->name, adapter);
+ name = kasprintf(GFP_KERNEL, "%s-%s-%d",
+ dev_driver_string(&adapter->pdev->dev),
+ "Mailbox", mb_vidx);
+ err = request_irq(irq_num, adapter->irq_mb_handler, 0, name, adapter);
if (err) {
dev_err(&adapter->pdev->dev,
"IRQ request for mailbox failed, error: %d\n", err);
diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
index 7be5a723f558..f569ea389b04 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
@@ -3613,6 +3613,8 @@ void idpf_vport_intr_rel(struct idpf_vport *vport)
q_vector->tx = NULL;
kfree(q_vector->rx);
q_vector->rx = NULL;
+
+ free_cpumask_var(q_vector->affinity_mask);
}

/* Clean up the mapping of queues to vectors */
@@ -3661,7 +3663,7 @@ static void idpf_vport_intr_rel_irq(struct idpf_vport *vport)

/* clear the affinity_mask in the IRQ descriptor */
irq_set_affinity_hint(irq_num, NULL);
- free_irq(irq_num, q_vector);
+ kfree(free_irq(irq_num, q_vector));
}
}

@@ -3812,6 +3814,7 @@ static int idpf_vport_intr_req_irq(struct idpf_vport *vport, char *basename)

for (vector = 0; vector < vport->num_q_vectors; vector++) {
struct idpf_q_vector *q_vector = &vport->q_vectors[vector];
+ char *name;

vidx = vport->q_vector_idxs[vector];
irq_num = adapter->msix_entries[vidx].vector;
@@ -3825,18 +3828,18 @@ static int idpf_vport_intr_req_irq(struct idpf_vport *vport, char *basename)
else
continue;

- q_vector->name = kasprintf(GFP_KERNEL, "%s-%s-%d",
- basename, vec_name, vidx);
+ name = kasprintf(GFP_KERNEL, "%s-%s-%d", basename, vec_name,
+ vidx);

err = request_irq(irq_num, idpf_vport_intr_clean_queues, 0,
- q_vector->name, q_vector);
+ name, q_vector);
if (err) {
netdev_err(vport->netdev,
"Request_irq failed, error: %d\n", err);
goto free_q_irqs;
}
/* assign the mask for this irq */
- irq_set_affinity_hint(irq_num, &q_vector->affinity_mask);
+ irq_set_affinity_hint(irq_num, q_vector->affinity_mask);
}

return 0;
@@ -3845,7 +3848,7 @@ static int idpf_vport_intr_req_irq(struct idpf_vport *vport, char *basename)
while (--vector >= 0) {
vidx = vport->q_vector_idxs[vector];
irq_num = adapter->msix_entries[vidx].vector;
- free_irq(irq_num, &vport->q_vectors[vector]);
+ kfree(free_irq(irq_num, &vport->q_vectors[vector]));
}

return err;
@@ -4255,7 +4258,7 @@ static void idpf_vport_intr_napi_add_all(struct idpf_vport *vport)

/* only set affinity_mask if the CPU is online */
if (cpu_online(v_idx))
- cpumask_set_cpu(v_idx, &q_vector->affinity_mask);
+ cpumask_set_cpu(v_idx, q_vector->affinity_mask);
}
}

@@ -4299,6 +4302,9 @@ int idpf_vport_intr_alloc(struct idpf_vport *vport)
q_vector->rx_intr_mode = IDPF_ITR_DYNAMIC;
q_vector->rx_itr_idx = VIRTCHNL2_ITR_IDX_0;

+ if (!zalloc_cpumask_var(&q_vector->affinity_mask, GFP_KERNEL))
+ goto error;
+
q_vector->tx = kcalloc(txqs_per_vector, sizeof(*q_vector->tx),
GFP_KERNEL);
if (!q_vector->tx)
--
2.45.1


2024-05-28 13:51:06

by Alexander Lobakin

Subject: [PATCH iwl-next 06/12] idpf: merge singleq and splitq &net_device_ops

It makes no sense to have a second &net_device_ops struct (800 bytes of
rodata) with only one difference in .ndo_start_xmit, which can easily
be just one `if`. This `if` is a drop in the ocean and you won't see
any difference.
Define a unified idpf_tx_start(). The preparation for sending is the
same; just call either idpf_tx_splitq_frame() or idpf_tx_singleq_frame()
depending on the active queue model to actually map and send the skb.
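
The dispatch then boils down to (excerpt from idpf_tx_start() in the
hunk below):

	if (idpf_is_queue_model_split(vport->txq_model))
		return idpf_tx_splitq_frame(skb, tx_q);
	else
		return idpf_tx_singleq_frame(skb, tx_q);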

Reviewed-by: Przemek Kitszel <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
drivers/net/ethernet/intel/idpf/idpf_txrx.h | 9 ++----
drivers/net/ethernet/intel/idpf/idpf_lib.c | 26 +++-------------
.../ethernet/intel/idpf/idpf_singleq_txrx.c | 31 ++-----------------
drivers/net/ethernet/intel/idpf/idpf_txrx.c | 17 ++++++----
4 files changed, 20 insertions(+), 63 deletions(-)

diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.h b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
index 731e2a73def5..e8a71da62d80 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.h
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
@@ -1204,14 +1204,11 @@ void idpf_tx_dma_map_error(struct idpf_tx_queue *txq, struct sk_buff *skb,
struct idpf_tx_buf *first, u16 ring_idx);
unsigned int idpf_tx_desc_count_required(struct idpf_tx_queue *txq,
struct sk_buff *skb);
-bool idpf_chk_linearize(struct sk_buff *skb, unsigned int max_bufs,
- unsigned int count);
int idpf_tx_maybe_stop_common(struct idpf_tx_queue *tx_q, unsigned int size);
void idpf_tx_timeout(struct net_device *netdev, unsigned int txqueue);
-netdev_tx_t idpf_tx_splitq_start(struct sk_buff *skb,
- struct net_device *netdev);
-netdev_tx_t idpf_tx_singleq_start(struct sk_buff *skb,
- struct net_device *netdev);
+netdev_tx_t idpf_tx_singleq_frame(struct sk_buff *skb,
+ struct idpf_tx_queue *tx_q);
+netdev_tx_t idpf_tx_start(struct sk_buff *skb, struct net_device *netdev);
bool idpf_rx_singleq_buf_hw_alloc_all(struct idpf_rx_queue *rxq,
u16 cleaned_count);
int idpf_tso(struct sk_buff *skb, struct idpf_tx_offload_params *off);
diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c
index a8be09a89943..fe91475c7b4c 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_lib.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c
@@ -4,8 +4,7 @@
#include "idpf.h"
#include "idpf_virtchnl.h"

-static const struct net_device_ops idpf_netdev_ops_splitq;
-static const struct net_device_ops idpf_netdev_ops_singleq;
+static const struct net_device_ops idpf_netdev_ops;

/**
* idpf_init_vector_stack - Fill the MSIX vector stack with vector index
@@ -764,10 +763,7 @@ static int idpf_cfg_netdev(struct idpf_vport *vport)
}

/* assign netdev_ops */
- if (idpf_is_queue_model_split(vport->txq_model))
- netdev->netdev_ops = &idpf_netdev_ops_splitq;
- else
- netdev->netdev_ops = &idpf_netdev_ops_singleq;
+ netdev->netdev_ops = &idpf_netdev_ops;

/* setup watchdog timeout value to be 5 second */
netdev->watchdog_timeo = 5 * HZ;
@@ -2353,24 +2349,10 @@ void idpf_free_dma_mem(struct idpf_hw *hw, struct idpf_dma_mem *mem)
mem->pa = 0;
}

-static const struct net_device_ops idpf_netdev_ops_splitq = {
- .ndo_open = idpf_open,
- .ndo_stop = idpf_stop,
- .ndo_start_xmit = idpf_tx_splitq_start,
- .ndo_features_check = idpf_features_check,
- .ndo_set_rx_mode = idpf_set_rx_mode,
- .ndo_validate_addr = eth_validate_addr,
- .ndo_set_mac_address = idpf_set_mac,
- .ndo_change_mtu = idpf_change_mtu,
- .ndo_get_stats64 = idpf_get_stats64,
- .ndo_set_features = idpf_set_features,
- .ndo_tx_timeout = idpf_tx_timeout,
-};
-
-static const struct net_device_ops idpf_netdev_ops_singleq = {
+static const struct net_device_ops idpf_netdev_ops = {
.ndo_open = idpf_open,
.ndo_stop = idpf_stop,
- .ndo_start_xmit = idpf_tx_singleq_start,
+ .ndo_start_xmit = idpf_tx_start,
.ndo_features_check = idpf_features_check,
.ndo_set_rx_mode = idpf_set_rx_mode,
.ndo_validate_addr = eth_validate_addr,
diff --git a/drivers/net/ethernet/intel/idpf/idpf_singleq_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_singleq_txrx.c
index b51f7cd6db01..a3b60a2dfcaa 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_singleq_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_singleq_txrx.c
@@ -351,8 +351,8 @@ static void idpf_tx_singleq_build_ctx_desc(struct idpf_tx_queue *txq,
*
* Returns NETDEV_TX_OK if sent, else an error code
*/
-static netdev_tx_t idpf_tx_singleq_frame(struct sk_buff *skb,
- struct idpf_tx_queue *tx_q)
+netdev_tx_t idpf_tx_singleq_frame(struct sk_buff *skb,
+ struct idpf_tx_queue *tx_q)
{
struct idpf_tx_offload_params offload = { };
struct idpf_tx_buf *first;
@@ -408,33 +408,6 @@ static netdev_tx_t idpf_tx_singleq_frame(struct sk_buff *skb,
return idpf_tx_drop_skb(tx_q, skb);
}

-/**
- * idpf_tx_singleq_start - Selects the right Tx queue to send buffer
- * @skb: send buffer
- * @netdev: network interface device structure
- *
- * Returns NETDEV_TX_OK if sent, else an error code
- */
-netdev_tx_t idpf_tx_singleq_start(struct sk_buff *skb,
- struct net_device *netdev)
-{
- struct idpf_vport *vport = idpf_netdev_to_vport(netdev);
- struct idpf_tx_queue *tx_q;
-
- tx_q = vport->txqs[skb_get_queue_mapping(skb)];
-
- /* hardware can't handle really short frames, hardware padding works
- * beyond this point
- */
- if (skb_put_padto(skb, IDPF_TX_MIN_PKT_LEN)) {
- idpf_tx_buf_hw_update(tx_q, tx_q->next_to_use, false);
-
- return NETDEV_TX_OK;
- }
-
- return idpf_tx_singleq_frame(skb, tx_q);
-}
-
/**
* idpf_tx_singleq_clean - Reclaim resources from queue
* @tx_q: Tx queue to clean
diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
index f569ea389b04..5dd1b1a9e624 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
@@ -4,6 +4,9 @@
#include "idpf.h"
#include "idpf_virtchnl.h"

+static bool idpf_chk_linearize(struct sk_buff *skb, unsigned int max_bufs,
+ unsigned int count);
+
/**
* idpf_buf_lifo_push - push a buffer pointer onto stack
* @stack: pointer to stack struct
@@ -2702,8 +2705,8 @@ static bool __idpf_chk_linearize(struct sk_buff *skb, unsigned int max_bufs)
* E.g.: a packet with 7 fragments can require 9 DMA transactions; 1 for TSO
* header, 1 for segment payload, and then 7 for the fragments.
*/
-bool idpf_chk_linearize(struct sk_buff *skb, unsigned int max_bufs,
- unsigned int count)
+static bool idpf_chk_linearize(struct sk_buff *skb, unsigned int max_bufs,
+ unsigned int count)
{
if (likely(count < max_bufs))
return false;
@@ -2849,14 +2852,13 @@ static netdev_tx_t idpf_tx_splitq_frame(struct sk_buff *skb,
}

/**
- * idpf_tx_splitq_start - Selects the right Tx queue to send buffer
+ * idpf_tx_start - Selects the right Tx queue to send buffer
* @skb: send buffer
* @netdev: network interface device structure
*
* Returns NETDEV_TX_OK if sent, else an error code
*/
-netdev_tx_t idpf_tx_splitq_start(struct sk_buff *skb,
- struct net_device *netdev)
+netdev_tx_t idpf_tx_start(struct sk_buff *skb, struct net_device *netdev)
{
struct idpf_vport *vport = idpf_netdev_to_vport(netdev);
struct idpf_tx_queue *tx_q;
@@ -2878,7 +2880,10 @@ netdev_tx_t idpf_tx_splitq_start(struct sk_buff *skb,
return NETDEV_TX_OK;
}

- return idpf_tx_splitq_frame(skb, tx_q);
+ if (idpf_is_queue_model_split(vport->txq_model))
+ return idpf_tx_splitq_frame(skb, tx_q);
+ else
+ return idpf_tx_singleq_frame(skb, tx_q);
}

/**
--
2.45.1


2024-05-28 13:51:52

by Alexander Lobakin

Subject: [PATCH iwl-next 08/12] idpf: reuse libeth's definitions of parsed ptype structures

idpf's in-kernel parsed ptype structure is almost identical to the one
used in the previous Intel drivers, which means it can be converted to
use libeth's definitions and even helpers. The only difference is that
it doesn't use a constant table (as libie does), but one obtained from
the device.
Remove the driver counterpart and use libeth's helpers for hashes and
checksums. This slightly optimizes skb field processing thanks to faster
checks. Also, don't define a big static array of ptypes in &idpf_vport --
allocate it dynamically. The pointer to it is cached in &idpf_rx_queue
anyway.
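
The per-packet flow then relies on libeth's generic helpers, roughly
(condensed from idpf_rx_hash() / idpf_rx_process_skb_fields() below):

	struct libeth_rx_pt decoded = rxq->rx_ptype_lkup[rx_ptype];

	/* NETIF_F_RXHASH + "ptype provides a hash" folded into one check */
	if (libeth_rx_pt_has_hash(rxq->netdev, decoded))
		libeth_rx_pt_set_hash(skb, hash, decoded);

	/* same for NETIF_F_RXCSUM + "ptype is checksummable" */
	if (!libeth_rx_pt_has_checksum(rxq->netdev, decoded))
		return;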

Reviewed-by: Przemek Kitszel <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
drivers/net/ethernet/intel/idpf/Kconfig | 1 +
drivers/net/ethernet/intel/idpf/idpf.h | 2 +-
drivers/net/ethernet/intel/idpf/idpf_txrx.h | 88 +-----------
drivers/net/ethernet/intel/idpf/idpf_lib.c | 3 +
drivers/net/ethernet/intel/idpf/idpf_main.c | 1 +
.../ethernet/intel/idpf/idpf_singleq_txrx.c | 113 +++++++---------
drivers/net/ethernet/intel/idpf/idpf_txrx.c | 125 +++++++-----------
.../net/ethernet/intel/idpf/idpf_virtchnl.c | 72 ++++++----
8 files changed, 153 insertions(+), 252 deletions(-)

diff --git a/drivers/net/ethernet/intel/idpf/Kconfig b/drivers/net/ethernet/intel/idpf/Kconfig
index 9082c16edb7e..638484c5723c 100644
--- a/drivers/net/ethernet/intel/idpf/Kconfig
+++ b/drivers/net/ethernet/intel/idpf/Kconfig
@@ -5,6 +5,7 @@ config IDPF
tristate "Intel(R) Infrastructure Data Path Function Support"
depends on PCI_MSI
select DIMLIB
+ select LIBETH
select PAGE_POOL
select PAGE_POOL_STATS
help
diff --git a/drivers/net/ethernet/intel/idpf/idpf.h b/drivers/net/ethernet/intel/idpf/idpf.h
index 5d9529f5b41b..078340a01757 100644
--- a/drivers/net/ethernet/intel/idpf/idpf.h
+++ b/drivers/net/ethernet/intel/idpf/idpf.h
@@ -312,7 +312,7 @@ struct idpf_vport {
u16 num_rxq_grp;
struct idpf_rxq_group *rxq_grps;
u32 rxq_model;
- struct idpf_rx_ptype_decoded rx_ptype_lkup[IDPF_RX_MAX_PTYPE];
+ struct libeth_rx_pt *rx_ptype_lkup;

struct idpf_adapter *adapter;
struct net_device *netdev;
diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.h b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
index e8a71da62d80..a6f5916bee8e 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.h
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
@@ -331,72 +331,6 @@ struct idpf_rx_buf {
#define IDPF_RX_MAX_BASE_PTYPE 256
#define IDPF_INVALID_PTYPE_ID 0xFFFF

-/* Packet type non-ip values */
-enum idpf_rx_ptype_l2 {
- IDPF_RX_PTYPE_L2_RESERVED = 0,
- IDPF_RX_PTYPE_L2_MAC_PAY2 = 1,
- IDPF_RX_PTYPE_L2_TIMESYNC_PAY2 = 2,
- IDPF_RX_PTYPE_L2_FIP_PAY2 = 3,
- IDPF_RX_PTYPE_L2_OUI_PAY2 = 4,
- IDPF_RX_PTYPE_L2_MACCNTRL_PAY2 = 5,
- IDPF_RX_PTYPE_L2_LLDP_PAY2 = 6,
- IDPF_RX_PTYPE_L2_ECP_PAY2 = 7,
- IDPF_RX_PTYPE_L2_EVB_PAY2 = 8,
- IDPF_RX_PTYPE_L2_QCN_PAY2 = 9,
- IDPF_RX_PTYPE_L2_EAPOL_PAY2 = 10,
- IDPF_RX_PTYPE_L2_ARP = 11,
-};
-
-enum idpf_rx_ptype_outer_ip {
- IDPF_RX_PTYPE_OUTER_L2 = 0,
- IDPF_RX_PTYPE_OUTER_IP = 1,
-};
-
-#define IDPF_RX_PTYPE_TO_IPV(ptype, ipv) \
- (((ptype)->outer_ip == IDPF_RX_PTYPE_OUTER_IP) && \
- ((ptype)->outer_ip_ver == (ipv)))
-
-enum idpf_rx_ptype_outer_ip_ver {
- IDPF_RX_PTYPE_OUTER_NONE = 0,
- IDPF_RX_PTYPE_OUTER_IPV4 = 1,
- IDPF_RX_PTYPE_OUTER_IPV6 = 2,
-};
-
-enum idpf_rx_ptype_outer_fragmented {
- IDPF_RX_PTYPE_NOT_FRAG = 0,
- IDPF_RX_PTYPE_FRAG = 1,
-};
-
-enum idpf_rx_ptype_tunnel_type {
- IDPF_RX_PTYPE_TUNNEL_NONE = 0,
- IDPF_RX_PTYPE_TUNNEL_IP_IP = 1,
- IDPF_RX_PTYPE_TUNNEL_IP_GRENAT = 2,
- IDPF_RX_PTYPE_TUNNEL_IP_GRENAT_MAC = 3,
- IDPF_RX_PTYPE_TUNNEL_IP_GRENAT_MAC_VLAN = 4,
-};
-
-enum idpf_rx_ptype_tunnel_end_prot {
- IDPF_RX_PTYPE_TUNNEL_END_NONE = 0,
- IDPF_RX_PTYPE_TUNNEL_END_IPV4 = 1,
- IDPF_RX_PTYPE_TUNNEL_END_IPV6 = 2,
-};
-
-enum idpf_rx_ptype_inner_prot {
- IDPF_RX_PTYPE_INNER_PROT_NONE = 0,
- IDPF_RX_PTYPE_INNER_PROT_UDP = 1,
- IDPF_RX_PTYPE_INNER_PROT_TCP = 2,
- IDPF_RX_PTYPE_INNER_PROT_SCTP = 3,
- IDPF_RX_PTYPE_INNER_PROT_ICMP = 4,
- IDPF_RX_PTYPE_INNER_PROT_TIMESYNC = 5,
-};
-
-enum idpf_rx_ptype_payload_layer {
- IDPF_RX_PTYPE_PAYLOAD_LAYER_NONE = 0,
- IDPF_RX_PTYPE_PAYLOAD_LAYER_PAY2 = 1,
- IDPF_RX_PTYPE_PAYLOAD_LAYER_PAY3 = 2,
- IDPF_RX_PTYPE_PAYLOAD_LAYER_PAY4 = 3,
-};
-
enum idpf_tunnel_state {
IDPF_PTYPE_TUNNEL_IP = BIT(0),
IDPF_PTYPE_TUNNEL_IP_GRENAT = BIT(1),
@@ -404,22 +338,9 @@ enum idpf_tunnel_state {
};

struct idpf_ptype_state {
- bool outer_ip;
- bool outer_frag;
- u8 tunnel_state;
-};
-
-struct idpf_rx_ptype_decoded {
- u32 ptype:10;
- u32 known:1;
- u32 outer_ip:1;
- u32 outer_ip_ver:2;
- u32 outer_frag:1;
- u32 tunnel_type:3;
- u32 tunnel_end_prot:2;
- u32 tunnel_end_frag:1;
- u32 inner_prot:4;
- u32 payload_layer:3;
+ bool outer_ip:1;
+ bool outer_frag:1;
+ u8 tunnel_state:6;
};

/**
@@ -681,7 +602,7 @@ struct idpf_rx_queue {
u16 desc_count;

u32 rxdids;
- const struct idpf_rx_ptype_decoded *rx_ptype_lkup;
+ const struct libeth_rx_pt *rx_ptype_lkup;
);
libeth_cacheline_group(read_write,
u16 next_to_use;
@@ -1186,7 +1107,6 @@ void idpf_vport_intr_update_itr_ena_irq(struct idpf_q_vector *q_vector);
void idpf_vport_intr_deinit(struct idpf_vport *vport);
int idpf_vport_intr_init(struct idpf_vport *vport);
void idpf_vport_intr_ena(struct idpf_vport *vport);
-enum pkt_hash_types idpf_ptype_to_htype(const struct idpf_rx_ptype_decoded *decoded);
int idpf_config_rss(struct idpf_vport *vport);
int idpf_init_rss(struct idpf_vport *vport);
void idpf_deinit_rss(struct idpf_vport *vport);
diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c
index fe91475c7b4c..f450f18248b3 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_lib.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c
@@ -942,6 +942,9 @@ static void idpf_decfg_netdev(struct idpf_vport *vport)
{
struct idpf_adapter *adapter = vport->adapter;

+ kfree(vport->rx_ptype_lkup);
+ vport->rx_ptype_lkup = NULL;
+
unregister_netdev(vport->netdev);
free_netdev(vport->netdev);
vport->netdev = NULL;
diff --git a/drivers/net/ethernet/intel/idpf/idpf_main.c b/drivers/net/ethernet/intel/idpf/idpf_main.c
index f784eea044bd..db476b3314c8 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_main.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_main.c
@@ -8,6 +8,7 @@
#define DRV_SUMMARY "Intel(R) Infrastructure Data Path Function Linux Driver"

MODULE_DESCRIPTION(DRV_SUMMARY);
+MODULE_IMPORT_NS(LIBETH);
MODULE_LICENSE("GPL");

/**
diff --git a/drivers/net/ethernet/intel/idpf/idpf_singleq_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_singleq_txrx.c
index a3b60a2dfcaa..7c71c72b814f 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_singleq_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_singleq_txrx.c
@@ -1,6 +1,8 @@
// SPDX-License-Identifier: GPL-2.0-only
/* Copyright (C) 2023 Intel Corporation */

+#include <net/libeth/rx.h>
+
#include "idpf.h"

/**
@@ -602,75 +604,62 @@ static bool idpf_rx_singleq_is_non_eop(const union virtchnl2_rx_desc *rx_desc)
* @rxq: Rx ring being processed
* @skb: skb currently being received and modified
* @csum_bits: checksum bits from descriptor
- * @ptype: the packet type decoded by hardware
+ * @decoded: the packet type decoded by hardware
*
* skb->protocol must be set before this function is called
*/
-static void idpf_rx_singleq_csum(struct idpf_rx_queue *rxq, struct sk_buff *skb,
- struct idpf_rx_csum_decoded *csum_bits,
- u16 ptype)
+static void idpf_rx_singleq_csum(struct idpf_rx_queue *rxq,
+ struct sk_buff *skb,
+ struct idpf_rx_csum_decoded csum_bits,
+ struct libeth_rx_pt decoded)
{
- struct idpf_rx_ptype_decoded decoded;
bool ipv4, ipv6;

/* check if Rx checksum is enabled */
- if (unlikely(!(rxq->netdev->features & NETIF_F_RXCSUM)))
+ if (!libeth_rx_pt_has_checksum(rxq->netdev, decoded))
return;

/* check if HW has decoded the packet and checksum */
- if (unlikely(!(csum_bits->l3l4p)))
- return;
-
- decoded = rxq->rx_ptype_lkup[ptype];
- if (unlikely(!(decoded.known && decoded.outer_ip)))
+ if (unlikely(!csum_bits.l3l4p))
return;

- ipv4 = IDPF_RX_PTYPE_TO_IPV(&decoded, IDPF_RX_PTYPE_OUTER_IPV4);
- ipv6 = IDPF_RX_PTYPE_TO_IPV(&decoded, IDPF_RX_PTYPE_OUTER_IPV6);
+ ipv4 = libeth_rx_pt_get_ip_ver(decoded) == LIBETH_RX_PT_OUTER_IPV4;
+ ipv6 = libeth_rx_pt_get_ip_ver(decoded) == LIBETH_RX_PT_OUTER_IPV6;

/* Check if there were any checksum errors */
- if (unlikely(ipv4 && (csum_bits->ipe || csum_bits->eipe)))
+ if (unlikely(ipv4 && (csum_bits.ipe || csum_bits.eipe)))
goto checksum_fail;

/* Device could not do any checksum offload for certain extension
* headers as indicated by setting IPV6EXADD bit
*/
- if (unlikely(ipv6 && csum_bits->ipv6exadd))
+ if (unlikely(ipv6 && csum_bits.ipv6exadd))
return;

/* check for L4 errors and handle packets that were not able to be
* checksummed due to arrival speed
*/
- if (unlikely(csum_bits->l4e))
+ if (unlikely(csum_bits.l4e))
goto checksum_fail;

- if (unlikely(csum_bits->nat && csum_bits->eudpe))
+ if (unlikely(csum_bits.nat && csum_bits.eudpe))
goto checksum_fail;

/* Handle packets that were not able to be checksummed due to arrival
* speed, in this case the stack can compute the csum.
*/
- if (unlikely(csum_bits->pprs))
+ if (unlikely(csum_bits.pprs))
return;

/* If there is an outer header present that might contain a checksum
* we need to bump the checksum level by 1 to reflect the fact that
* we are indicating we validated the inner checksum.
*/
- if (decoded.tunnel_type >= IDPF_RX_PTYPE_TUNNEL_IP_GRENAT)
+ if (decoded.tunnel_type >= LIBETH_RX_PT_TUNNEL_IP_GRENAT)
skb->csum_level = 1;

- /* Only report checksum unnecessary for ICMP, TCP, UDP, or SCTP */
- switch (decoded.inner_prot) {
- case IDPF_RX_PTYPE_INNER_PROT_ICMP:
- case IDPF_RX_PTYPE_INNER_PROT_TCP:
- case IDPF_RX_PTYPE_INNER_PROT_UDP:
- case IDPF_RX_PTYPE_INNER_PROT_SCTP:
- skb->ip_summed = CHECKSUM_UNNECESSARY;
- return;
- default:
- return;
- }
+ skb->ip_summed = CHECKSUM_UNNECESSARY;
+ return;

checksum_fail:
u64_stats_update_begin(&rxq->stats_sync);
@@ -680,20 +669,17 @@ static void idpf_rx_singleq_csum(struct idpf_rx_queue *rxq, struct sk_buff *skb,

/**
* idpf_rx_singleq_base_csum - Indicate in skb if hw indicated a good cksum
- * @rx_q: Rx completion queue
- * @skb: skb currently being received and modified
* @rx_desc: the receive descriptor
- * @ptype: Rx packet type
*
* This function only operates on the VIRTCHNL2_RXDID_1_32B_BASE_M base 32byte
* descriptor writeback format.
+ *
+ * Return: parsed checksum status.
**/
-static void idpf_rx_singleq_base_csum(struct idpf_rx_queue *rx_q,
- struct sk_buff *skb,
- const union virtchnl2_rx_desc *rx_desc,
- u16 ptype)
+static struct idpf_rx_csum_decoded
+idpf_rx_singleq_base_csum(const union virtchnl2_rx_desc *rx_desc)
{
- struct idpf_rx_csum_decoded csum_bits;
+ struct idpf_rx_csum_decoded csum_bits = { };
u32 rx_error, rx_status;
u64 qword;

@@ -712,28 +698,23 @@ static void idpf_rx_singleq_base_csum(struct idpf_rx_queue *rx_q,
rx_status);
csum_bits.ipv6exadd = FIELD_GET(VIRTCHNL2_RX_BASE_DESC_STATUS_IPV6EXADD_M,
rx_status);
- csum_bits.nat = 0;
- csum_bits.eudpe = 0;

- idpf_rx_singleq_csum(rx_q, skb, &csum_bits, ptype);
+ return csum_bits;
}

/**
* idpf_rx_singleq_flex_csum - Indicate in skb if hw indicated a good cksum
- * @rx_q: Rx completion queue
- * @skb: skb currently being received and modified
* @rx_desc: the receive descriptor
- * @ptype: Rx packet type
*
* This function only operates on the VIRTCHNL2_RXDID_2_FLEX_SQ_NIC flexible
* descriptor writeback format.
+ *
+ * Return: parsed checksum status.
**/
-static void idpf_rx_singleq_flex_csum(struct idpf_rx_queue *rx_q,
- struct sk_buff *skb,
- const union virtchnl2_rx_desc *rx_desc,
- u16 ptype)
+static struct idpf_rx_csum_decoded
+idpf_rx_singleq_flex_csum(const union virtchnl2_rx_desc *rx_desc)
{
- struct idpf_rx_csum_decoded csum_bits;
+ struct idpf_rx_csum_decoded csum_bits = { };
u16 rx_status0, rx_status1;

rx_status0 = le16_to_cpu(rx_desc->flex_nic_wb.status_error0);
@@ -753,9 +734,8 @@ static void idpf_rx_singleq_flex_csum(struct idpf_rx_queue *rx_q,
rx_status0);
csum_bits.nat = FIELD_GET(VIRTCHNL2_RX_FLEX_DESC_STATUS1_NAT_M,
rx_status1);
- csum_bits.pprs = 0;

- idpf_rx_singleq_csum(rx_q, skb, &csum_bits, ptype);
+ return csum_bits;
}

/**
@@ -771,11 +751,11 @@ static void idpf_rx_singleq_flex_csum(struct idpf_rx_queue *rx_q,
static void idpf_rx_singleq_base_hash(struct idpf_rx_queue *rx_q,
struct sk_buff *skb,
const union virtchnl2_rx_desc *rx_desc,
- struct idpf_rx_ptype_decoded *decoded)
+ struct libeth_rx_pt decoded)
{
u64 mask, qw1;

- if (unlikely(!(rx_q->netdev->features & NETIF_F_RXHASH)))
+ if (!libeth_rx_pt_has_hash(rx_q->netdev, decoded))
return;

mask = VIRTCHNL2_RX_BASE_DESC_FLTSTAT_RSS_HASH_M;
@@ -784,7 +764,7 @@ static void idpf_rx_singleq_base_hash(struct idpf_rx_queue *rx_q,
if (FIELD_GET(mask, qw1) == mask) {
u32 hash = le32_to_cpu(rx_desc->base_wb.qword0.hi_dword.rss);

- skb_set_hash(skb, hash, idpf_ptype_to_htype(decoded));
+ libeth_rx_pt_set_hash(skb, hash, decoded);
}
}

@@ -801,15 +781,17 @@ static void idpf_rx_singleq_base_hash(struct idpf_rx_queue *rx_q,
static void idpf_rx_singleq_flex_hash(struct idpf_rx_queue *rx_q,
struct sk_buff *skb,
const union virtchnl2_rx_desc *rx_desc,
- struct idpf_rx_ptype_decoded *decoded)
+ struct libeth_rx_pt decoded)
{
- if (unlikely(!(rx_q->netdev->features & NETIF_F_RXHASH)))
+ if (!libeth_rx_pt_has_hash(rx_q->netdev, decoded))
return;

if (FIELD_GET(VIRTCHNL2_RX_FLEX_DESC_STATUS0_RSS_VALID_M,
- le16_to_cpu(rx_desc->flex_nic_wb.status_error0)))
- skb_set_hash(skb, le32_to_cpu(rx_desc->flex_nic_wb.rss_hash),
- idpf_ptype_to_htype(decoded));
+ le16_to_cpu(rx_desc->flex_nic_wb.status_error0))) {
+ u32 hash = le32_to_cpu(rx_desc->flex_nic_wb.rss_hash);
+
+ libeth_rx_pt_set_hash(skb, hash, decoded);
+ }
}

/**
@@ -830,19 +812,22 @@ idpf_rx_singleq_process_skb_fields(struct idpf_rx_queue *rx_q,
const union virtchnl2_rx_desc *rx_desc,
u16 ptype)
{
- struct idpf_rx_ptype_decoded decoded = rx_q->rx_ptype_lkup[ptype];
+ struct libeth_rx_pt decoded = rx_q->rx_ptype_lkup[ptype];
+ struct idpf_rx_csum_decoded csum_bits;

/* modifies the skb - consumes the enet header */
skb->protocol = eth_type_trans(skb, rx_q->netdev);

/* Check if we're using base mode descriptor IDs */
if (rx_q->rxdids == VIRTCHNL2_RXDID_1_32B_BASE_M) {
- idpf_rx_singleq_base_hash(rx_q, skb, rx_desc, &decoded);
- idpf_rx_singleq_base_csum(rx_q, skb, rx_desc, ptype);
+ idpf_rx_singleq_base_hash(rx_q, skb, rx_desc, decoded);
+ csum_bits = idpf_rx_singleq_base_csum(rx_desc);
} else {
- idpf_rx_singleq_flex_hash(rx_q, skb, rx_desc, &decoded);
- idpf_rx_singleq_flex_csum(rx_q, skb, rx_desc, ptype);
+ idpf_rx_singleq_flex_hash(rx_q, skb, rx_desc, decoded);
+ csum_bits = idpf_rx_singleq_flex_csum(rx_desc);
}
+
+ idpf_rx_singleq_csum(rx_q, skb, csum_bits, decoded);
}

/**
diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
index e79a8f9dfc40..2af9a482652a 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
@@ -1,6 +1,8 @@
// SPDX-License-Identifier: GPL-2.0-only
/* Copyright (C) 2023 Intel Corporation */

+#include <net/libeth/rx.h>
+
#include "idpf.h"
#include "idpf_virtchnl.h"

@@ -2886,30 +2888,6 @@ netdev_tx_t idpf_tx_start(struct sk_buff *skb, struct net_device *netdev)
return idpf_tx_singleq_frame(skb, tx_q);
}

-/**
- * idpf_ptype_to_htype - get a hash type
- * @decoded: Decoded Rx packet type related fields
- *
- * Returns appropriate hash type (such as PKT_HASH_TYPE_L2/L3/L4) to be used by
- * skb_set_hash based on PTYPE as parsed by HW Rx pipeline and is part of
- * Rx desc.
- */
-enum pkt_hash_types idpf_ptype_to_htype(const struct idpf_rx_ptype_decoded *decoded)
-{
- if (!decoded->known)
- return PKT_HASH_TYPE_NONE;
- if (decoded->payload_layer == IDPF_RX_PTYPE_PAYLOAD_LAYER_PAY2 &&
- decoded->inner_prot)
- return PKT_HASH_TYPE_L4;
- if (decoded->payload_layer == IDPF_RX_PTYPE_PAYLOAD_LAYER_PAY2 &&
- decoded->outer_ip)
- return PKT_HASH_TYPE_L3;
- if (decoded->outer_ip == IDPF_RX_PTYPE_OUTER_L2)
- return PKT_HASH_TYPE_L2;
-
- return PKT_HASH_TYPE_NONE;
-}
-
/**
* idpf_rx_hash - set the hash value in the skb
* @rxq: Rx descriptor ring packet is being transacted on
@@ -2920,18 +2898,18 @@ enum pkt_hash_types idpf_ptype_to_htype(const struct idpf_rx_ptype_decoded *deco
static void
idpf_rx_hash(const struct idpf_rx_queue *rxq, struct sk_buff *skb,
const struct virtchnl2_rx_flex_desc_adv_nic_3 *rx_desc,
- struct idpf_rx_ptype_decoded *decoded)
+ struct libeth_rx_pt decoded)
{
u32 hash;

- if (unlikely(!(rxq->netdev->features & NETIF_F_RXHASH)))
+ if (!libeth_rx_pt_has_hash(rxq->netdev, decoded))
return;

hash = le16_to_cpu(rx_desc->hash1) |
(rx_desc->ff2_mirrid_hash2.hash2 << 16) |
(rx_desc->hash3 << 24);

- skb_set_hash(skb, hash, idpf_ptype_to_htype(decoded));
+ libeth_rx_pt_set_hash(skb, hash, decoded);
}

/**
@@ -2944,55 +2922,43 @@ idpf_rx_hash(const struct idpf_rx_queue *rxq, struct sk_buff *skb,
* skb->protocol must be set before this function is called
*/
static void idpf_rx_csum(struct idpf_rx_queue *rxq, struct sk_buff *skb,
- struct idpf_rx_csum_decoded *csum_bits,
- struct idpf_rx_ptype_decoded *decoded)
+ struct idpf_rx_csum_decoded csum_bits,
+ struct libeth_rx_pt decoded)
{
bool ipv4, ipv6;

/* check if Rx checksum is enabled */
- if (unlikely(!(rxq->netdev->features & NETIF_F_RXCSUM)))
+ if (!libeth_rx_pt_has_checksum(rxq->netdev, decoded))
return;

/* check if HW has decoded the packet and checksum */
- if (!(csum_bits->l3l4p))
+ if (unlikely(!csum_bits.l3l4p))
return;

- ipv4 = IDPF_RX_PTYPE_TO_IPV(decoded, IDPF_RX_PTYPE_OUTER_IPV4);
- ipv6 = IDPF_RX_PTYPE_TO_IPV(decoded, IDPF_RX_PTYPE_OUTER_IPV6);
+ ipv4 = libeth_rx_pt_get_ip_ver(decoded) == LIBETH_RX_PT_OUTER_IPV4;
+ ipv6 = libeth_rx_pt_get_ip_ver(decoded) == LIBETH_RX_PT_OUTER_IPV6;

- if (ipv4 && (csum_bits->ipe || csum_bits->eipe))
+ if (unlikely(ipv4 && (csum_bits.ipe || csum_bits.eipe)))
goto checksum_fail;

- if (ipv6 && csum_bits->ipv6exadd)
+ if (unlikely(ipv6 && csum_bits.ipv6exadd))
return;

/* check for L4 errors and handle packets that were not able to be
* checksummed
*/
- if (csum_bits->l4e)
+ if (unlikely(csum_bits.l4e))
goto checksum_fail;

- /* Only report checksum unnecessary for ICMP, TCP, UDP, or SCTP */
- switch (decoded->inner_prot) {
- case IDPF_RX_PTYPE_INNER_PROT_ICMP:
- case IDPF_RX_PTYPE_INNER_PROT_TCP:
- case IDPF_RX_PTYPE_INNER_PROT_UDP:
- if (!csum_bits->raw_csum_inv) {
- u16 csum = csum_bits->raw_csum;
-
- skb->csum = csum_unfold((__force __sum16)~swab16(csum));
- skb->ip_summed = CHECKSUM_COMPLETE;
- } else {
- skb->ip_summed = CHECKSUM_UNNECESSARY;
- }
- break;
- case IDPF_RX_PTYPE_INNER_PROT_SCTP:
+ if (csum_bits.raw_csum_inv ||
+ decoded.inner_prot == LIBETH_RX_PT_INNER_SCTP) {
skb->ip_summed = CHECKSUM_UNNECESSARY;
- break;
- default:
- break;
+ return;
}

+ skb->csum = csum_unfold((__force __sum16)~swab16(csum_bits.raw_csum));
+ skb->ip_summed = CHECKSUM_COMPLETE;
+
return;

checksum_fail:
@@ -3004,32 +2970,34 @@ static void idpf_rx_csum(struct idpf_rx_queue *rxq, struct sk_buff *skb,
/**
* idpf_rx_splitq_extract_csum_bits - Extract checksum bits from descriptor
* @rx_desc: receive descriptor
- * @csum: structure to extract checksum fields
*
+ * Return: parsed checksum status.
**/
-static void
-idpf_rx_splitq_extract_csum_bits(const struct virtchnl2_rx_flex_desc_adv_nic_3 *rx_desc,
- struct idpf_rx_csum_decoded *csum)
+static struct idpf_rx_csum_decoded
+idpf_rx_splitq_extract_csum_bits(const struct virtchnl2_rx_flex_desc_adv_nic_3 *rx_desc)
{
+ struct idpf_rx_csum_decoded csum = { };
u8 qword0, qword1;

qword0 = rx_desc->status_err0_qw0;
qword1 = rx_desc->status_err0_qw1;

- csum->ipe = FIELD_GET(VIRTCHNL2_RX_FLEX_DESC_ADV_STATUS0_XSUM_IPE_M,
+ csum.ipe = FIELD_GET(VIRTCHNL2_RX_FLEX_DESC_ADV_STATUS0_XSUM_IPE_M,
+ qword1);
+ csum.eipe = FIELD_GET(VIRTCHNL2_RX_FLEX_DESC_ADV_STATUS0_XSUM_EIPE_M,
qword1);
- csum->eipe = FIELD_GET(VIRTCHNL2_RX_FLEX_DESC_ADV_STATUS0_XSUM_EIPE_M,
+ csum.l4e = FIELD_GET(VIRTCHNL2_RX_FLEX_DESC_ADV_STATUS0_XSUM_L4E_M,
+ qword1);
+ csum.l3l4p = FIELD_GET(VIRTCHNL2_RX_FLEX_DESC_ADV_STATUS0_L3L4P_M,
qword1);
- csum->l4e = FIELD_GET(VIRTCHNL2_RX_FLEX_DESC_ADV_STATUS0_XSUM_L4E_M,
- qword1);
- csum->l3l4p = FIELD_GET(VIRTCHNL2_RX_FLEX_DESC_ADV_STATUS0_L3L4P_M,
- qword1);
- csum->ipv6exadd = FIELD_GET(VIRTCHNL2_RX_FLEX_DESC_ADV_STATUS0_IPV6EXADD_M,
- qword0);
- csum->raw_csum_inv =
+ csum.ipv6exadd = FIELD_GET(VIRTCHNL2_RX_FLEX_DESC_ADV_STATUS0_IPV6EXADD_M,
+ qword0);
+ csum.raw_csum_inv =
le16_get_bits(rx_desc->ptype_err_fflags0,
VIRTCHNL2_RX_FLEX_DESC_ADV_RAW_CSUM_INV_M);
- csum->raw_csum = le16_to_cpu(rx_desc->misc.raw_cs);
+ csum.raw_csum = le16_to_cpu(rx_desc->misc.raw_cs);
+
+ return csum;
}

/**
@@ -3046,21 +3014,22 @@ idpf_rx_splitq_extract_csum_bits(const struct virtchnl2_rx_flex_desc_adv_nic_3 *
*/
static int idpf_rx_rsc(struct idpf_rx_queue *rxq, struct sk_buff *skb,
const struct virtchnl2_rx_flex_desc_adv_nic_3 *rx_desc,
- struct idpf_rx_ptype_decoded *decoded)
+ struct libeth_rx_pt decoded)
{
u16 rsc_segments, rsc_seg_len;
bool ipv4, ipv6;
int len;

- if (unlikely(!decoded->outer_ip))
+ if (unlikely(libeth_rx_pt_get_ip_ver(decoded) ==
+ LIBETH_RX_PT_OUTER_L2))
return -EINVAL;

rsc_seg_len = le16_to_cpu(rx_desc->misc.rscseglen);
if (unlikely(!rsc_seg_len))
return -EINVAL;

- ipv4 = IDPF_RX_PTYPE_TO_IPV(decoded, IDPF_RX_PTYPE_OUTER_IPV4);
- ipv6 = IDPF_RX_PTYPE_TO_IPV(decoded, IDPF_RX_PTYPE_OUTER_IPV6);
+ ipv4 = libeth_rx_pt_get_ip_ver(decoded) == LIBETH_RX_PT_OUTER_IPV4;
+ ipv6 = libeth_rx_pt_get_ip_ver(decoded) == LIBETH_RX_PT_OUTER_IPV6;

if (unlikely(!(ipv4 ^ ipv6)))
return -EINVAL;
@@ -3118,8 +3087,8 @@ static int
idpf_rx_process_skb_fields(struct idpf_rx_queue *rxq, struct sk_buff *skb,
const struct virtchnl2_rx_flex_desc_adv_nic_3 *rx_desc)
{
- struct idpf_rx_csum_decoded csum_bits = { };
- struct idpf_rx_ptype_decoded decoded;
+ struct idpf_rx_csum_decoded csum_bits;
+ struct libeth_rx_pt decoded;
u16 rx_ptype;

rx_ptype = le16_get_bits(rx_desc->ptype_err_fflags0,
@@ -3127,16 +3096,16 @@ idpf_rx_process_skb_fields(struct idpf_rx_queue *rxq, struct sk_buff *skb,
decoded = rxq->rx_ptype_lkup[rx_ptype];

/* process RSS/hash */
- idpf_rx_hash(rxq, skb, rx_desc, &decoded);
+ idpf_rx_hash(rxq, skb, rx_desc, decoded);

skb->protocol = eth_type_trans(skb, rxq->netdev);

if (le16_get_bits(rx_desc->hdrlen_flags,
VIRTCHNL2_RX_FLEX_DESC_ADV_RSC_M))
- return idpf_rx_rsc(rxq, skb, rx_desc, &decoded);
+ return idpf_rx_rsc(rxq, skb, rx_desc, decoded);

- idpf_rx_splitq_extract_csum_bits(rx_desc, &csum_bits);
- idpf_rx_csum(rxq, skb, &csum_bits, &decoded);
+ csum_bits = idpf_rx_splitq_extract_csum_bits(rx_desc);
+ idpf_rx_csum(rxq, skb, csum_bits, decoded);

return 0;
}
diff --git a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
index 1aa4770dfe18..83f3543a30e1 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
@@ -1,6 +1,8 @@
// SPDX-License-Identifier: GPL-2.0-only
/* Copyright (C) 2023 Intel Corporation */

+#include <net/libeth/rx.h>
+
#include "idpf.h"
#include "idpf_virtchnl.h"

@@ -2483,39 +2485,52 @@ int idpf_send_get_set_rss_key_msg(struct idpf_vport *vport, bool get)
* @frag: fragmentation allowed
*
*/
-static void idpf_fill_ptype_lookup(struct idpf_rx_ptype_decoded *ptype,
+static void idpf_fill_ptype_lookup(struct libeth_rx_pt *ptype,
struct idpf_ptype_state *pstate,
bool ipv4, bool frag)
{
if (!pstate->outer_ip || !pstate->outer_frag) {
- ptype->outer_ip = IDPF_RX_PTYPE_OUTER_IP;
pstate->outer_ip = true;

if (ipv4)
- ptype->outer_ip_ver = IDPF_RX_PTYPE_OUTER_IPV4;
+ ptype->outer_ip = LIBETH_RX_PT_OUTER_IPV4;
else
- ptype->outer_ip_ver = IDPF_RX_PTYPE_OUTER_IPV6;
+ ptype->outer_ip = LIBETH_RX_PT_OUTER_IPV6;

if (frag) {
- ptype->outer_frag = IDPF_RX_PTYPE_FRAG;
+ ptype->outer_frag = LIBETH_RX_PT_FRAG;
pstate->outer_frag = true;
}
} else {
- ptype->tunnel_type = IDPF_RX_PTYPE_TUNNEL_IP_IP;
+ ptype->tunnel_type = LIBETH_RX_PT_TUNNEL_IP_IP;
pstate->tunnel_state = IDPF_PTYPE_TUNNEL_IP;

if (ipv4)
- ptype->tunnel_end_prot =
- IDPF_RX_PTYPE_TUNNEL_END_IPV4;
+ ptype->tunnel_end_prot = LIBETH_RX_PT_TUNNEL_END_IPV4;
else
- ptype->tunnel_end_prot =
- IDPF_RX_PTYPE_TUNNEL_END_IPV6;
+ ptype->tunnel_end_prot = LIBETH_RX_PT_TUNNEL_END_IPV6;

if (frag)
- ptype->tunnel_end_frag = IDPF_RX_PTYPE_FRAG;
+ ptype->tunnel_end_frag = LIBETH_RX_PT_FRAG;
}
}

+static void idpf_finalize_ptype_lookup(struct libeth_rx_pt *ptype)
+{
+ if (ptype->payload_layer == LIBETH_RX_PT_PAYLOAD_L2 &&
+ ptype->inner_prot)
+ ptype->payload_layer = LIBETH_RX_PT_PAYLOAD_L4;
+ else if (ptype->payload_layer == LIBETH_RX_PT_PAYLOAD_L2 &&
+ ptype->outer_ip)
+ ptype->payload_layer = LIBETH_RX_PT_PAYLOAD_L3;
+ else if (ptype->outer_ip == LIBETH_RX_PT_OUTER_L2)
+ ptype->payload_layer = LIBETH_RX_PT_PAYLOAD_L2;
+ else
+ ptype->payload_layer = LIBETH_RX_PT_PAYLOAD_NONE;
+
+ libeth_rx_pt_gen_hash_type(ptype);
+}
+
/**
* idpf_send_get_rx_ptype_msg - Send virtchnl for ptype info
* @vport: virtual port data structure
@@ -2526,7 +2541,7 @@ int idpf_send_get_rx_ptype_msg(struct idpf_vport *vport)
{
struct virtchnl2_get_ptype_info *get_ptype_info __free(kfree) = NULL;
struct virtchnl2_get_ptype_info *ptype_info __free(kfree) = NULL;
- struct idpf_rx_ptype_decoded *ptype_lkup = vport->rx_ptype_lkup;
+ struct libeth_rx_pt *ptype_lkup __free(kfree) = NULL;
int max_ptype, ptypes_recvd = 0, ptype_offset;
struct idpf_adapter *adapter = vport->adapter;
struct idpf_vc_xn_params xn_params = {};
@@ -2534,12 +2549,17 @@ int idpf_send_get_rx_ptype_msg(struct idpf_vport *vport)
ssize_t reply_sz;
int i, j, k;

+ if (vport->rx_ptype_lkup)
+ return 0;
+
if (idpf_is_queue_model_split(vport->rxq_model))
max_ptype = IDPF_RX_MAX_PTYPE;
else
max_ptype = IDPF_RX_MAX_BASE_PTYPE;

- memset(vport->rx_ptype_lkup, 0, sizeof(vport->rx_ptype_lkup));
+ ptype_lkup = kcalloc(max_ptype, sizeof(*ptype_lkup), GFP_KERNEL);
+ if (!ptype_lkup)
+ return -ENOMEM;

get_ptype_info = kzalloc(sizeof(*get_ptype_info), GFP_KERNEL);
if (!get_ptype_info)
@@ -2597,16 +2617,13 @@ int idpf_send_get_rx_ptype_msg(struct idpf_vport *vport)
/* 0xFFFF indicates end of ptypes */
if (le16_to_cpu(ptype->ptype_id_10) ==
IDPF_INVALID_PTYPE_ID)
- return 0;
+ goto out;

if (idpf_is_queue_model_split(vport->rxq_model))
k = le16_to_cpu(ptype->ptype_id_10);
else
k = ptype->ptype_id_8;

- if (ptype->proto_id_count)
- ptype_lkup[k].known = 1;
-
for (j = 0; j < ptype->proto_id_count; j++) {
id = le16_to_cpu(ptype->proto_id[j]);
switch (id) {
@@ -2614,18 +2631,18 @@ int idpf_send_get_rx_ptype_msg(struct idpf_vport *vport)
if (pstate.tunnel_state ==
IDPF_PTYPE_TUNNEL_IP) {
ptype_lkup[k].tunnel_type =
- IDPF_RX_PTYPE_TUNNEL_IP_GRENAT;
+ LIBETH_RX_PT_TUNNEL_IP_GRENAT;
pstate.tunnel_state |=
IDPF_PTYPE_TUNNEL_IP_GRENAT;
}
break;
case VIRTCHNL2_PROTO_HDR_MAC:
ptype_lkup[k].outer_ip =
- IDPF_RX_PTYPE_OUTER_L2;
+ LIBETH_RX_PT_OUTER_L2;
if (pstate.tunnel_state ==
IDPF_TUN_IP_GRE) {
ptype_lkup[k].tunnel_type =
- IDPF_RX_PTYPE_TUNNEL_IP_GRENAT_MAC;
+ LIBETH_RX_PT_TUNNEL_IP_GRENAT_MAC;
pstate.tunnel_state |=
IDPF_PTYPE_TUNNEL_IP_GRENAT_MAC;
}
@@ -2652,23 +2669,23 @@ int idpf_send_get_rx_ptype_msg(struct idpf_vport *vport)
break;
case VIRTCHNL2_PROTO_HDR_UDP:
ptype_lkup[k].inner_prot =
- IDPF_RX_PTYPE_INNER_PROT_UDP;
+ LIBETH_RX_PT_INNER_UDP;
break;
case VIRTCHNL2_PROTO_HDR_TCP:
ptype_lkup[k].inner_prot =
- IDPF_RX_PTYPE_INNER_PROT_TCP;
+ LIBETH_RX_PT_INNER_TCP;
break;
case VIRTCHNL2_PROTO_HDR_SCTP:
ptype_lkup[k].inner_prot =
- IDPF_RX_PTYPE_INNER_PROT_SCTP;
+ LIBETH_RX_PT_INNER_SCTP;
break;
case VIRTCHNL2_PROTO_HDR_ICMP:
ptype_lkup[k].inner_prot =
- IDPF_RX_PTYPE_INNER_PROT_ICMP;
+ LIBETH_RX_PT_INNER_ICMP;
break;
case VIRTCHNL2_PROTO_HDR_PAY:
ptype_lkup[k].payload_layer =
- IDPF_RX_PTYPE_PAYLOAD_LAYER_PAY2;
+ LIBETH_RX_PT_PAYLOAD_L2;
break;
case VIRTCHNL2_PROTO_HDR_ICMPV6:
case VIRTCHNL2_PROTO_HDR_IPV6_EH:
@@ -2722,9 +2739,14 @@ int idpf_send_get_rx_ptype_msg(struct idpf_vport *vport)
break;
}
}
+
+ idpf_finalize_ptype_lookup(&ptype_lkup[k]);
}
}

+out:
+ vport->rx_ptype_lkup = no_free_ptr(ptype_lkup);
+
return 0;
}

--
2.45.1


2024-05-28 13:51:58

by Alexander Lobakin

[permalink] [raw]
Subject: [PATCH iwl-next 07/12] idpf: compile singleq code only under default-n CONFIG_IDPF_SINGLEQ

Currently, all HW supporting idpf supports the singleq model, but none
of it advertises singleq by default, as splitq is supported and
preferred for multiple reasons. Still, this almost-dead code often adds
hotpath branches and redundant cacheline accesses.
While it can't be removed completely yet, add CONFIG_IDPF_SINGLEQ and
build the singleq code only when it's enabled manually. This shaves
roughly 10 Kb off the object code and removes a good bunch of hotpath
checks.
idpf_is_queue_model_split() works as a gate and compiles out to `true`
when the config option is disabled.
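
For illustration only: a minimal, standalone sketch of that gating
pattern. All names below (queue_model_is_split(), CONFIG_SINGLEQ_ENABLED,
QUEUE_MODEL_*) are hypothetical stand-ins, not the driver code; the point
is that with the knob defined to 0 the helper folds to a constant true,
so an optimizing compiler drops the singleq branch as dead code.

/* hypothetical sketch, not idpf code */
#include <stdbool.h>
#include <stdio.h>

#define CONFIG_SINGLEQ_ENABLED	0	/* stands in for CONFIG_IDPF_SINGLEQ=n */
#define QUEUE_MODEL_SINGLE	1
#define QUEUE_MODEL_SPLIT	2

/* the gate: constant-folds to true when the knob is 0 */
static inline bool queue_model_is_split(unsigned int q_model)
{
	return !CONFIG_SINGLEQ_ENABLED || q_model == QUEUE_MODEL_SPLIT;
}

int main(void)
{
	unsigned int q_model = QUEUE_MODEL_SPLIT;

	if (queue_model_is_split(q_model))
		puts("splitq path");
	else
		puts("singleq path");	/* unreachable, eliminated at -O1+ */

	return 0;
}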

Reviewed-by: Przemek Kitszel <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
drivers/net/ethernet/intel/Kconfig | 13 +--------
drivers/net/ethernet/intel/idpf/Kconfig | 27 +++++++++++++++++++
drivers/net/ethernet/intel/idpf/Makefile | 3 ++-
drivers/net/ethernet/intel/idpf/idpf.h | 3 ++-
drivers/net/ethernet/intel/idpf/idpf_txrx.c | 2 +-
.../net/ethernet/intel/idpf/idpf_virtchnl.c | 15 ++++++++---
6 files changed, 44 insertions(+), 19 deletions(-)
create mode 100644 drivers/net/ethernet/intel/idpf/Kconfig

diff --git a/drivers/net/ethernet/intel/Kconfig b/drivers/net/ethernet/intel/Kconfig
index e0287fbd501d..0375c7448a57 100644
--- a/drivers/net/ethernet/intel/Kconfig
+++ b/drivers/net/ethernet/intel/Kconfig
@@ -384,17 +384,6 @@ config IGC_LEDS
Optional support for controlling the NIC LED's with the netdev
LED trigger.

-config IDPF
- tristate "Intel(R) Infrastructure Data Path Function Support"
- depends on PCI_MSI
- select DIMLIB
- select PAGE_POOL
- select PAGE_POOL_STATS
- help
- This driver supports Intel(R) Infrastructure Data Path Function
- devices.
-
- To compile this driver as a module, choose M here. The module
- will be called idpf.
+source "drivers/net/ethernet/intel/idpf/Kconfig"

endif # NET_VENDOR_INTEL
diff --git a/drivers/net/ethernet/intel/idpf/Kconfig b/drivers/net/ethernet/intel/idpf/Kconfig
new file mode 100644
index 000000000000..9082c16edb7e
--- /dev/null
+++ b/drivers/net/ethernet/intel/idpf/Kconfig
@@ -0,0 +1,27 @@
+# SPDX-License-Identifier: GPL-2.0-only
+# Copyright (C) 2024 Intel Corporation
+
+config IDPF
+ tristate "Intel(R) Infrastructure Data Path Function Support"
+ depends on PCI_MSI
+ select DIMLIB
+ select PAGE_POOL
+ select PAGE_POOL_STATS
+ help
+ This driver supports Intel(R) Infrastructure Data Path Function
+ devices.
+
+ To compile this driver as a module, choose M here. The module
+ will be called idpf.
+
+if IDPF
+
+config IDPF_SINGLEQ
+ bool "idpf singleq support"
+ help
+ This option enables support for legacy single Rx/Tx queues w/no
+ completion and fill queues. Only enable if you have hardware which
+ wants to work in this mode as it increases the driver size and adds
+	  runtime checks on hotpath.
+
+endif # IDPF
diff --git a/drivers/net/ethernet/intel/idpf/Makefile b/drivers/net/ethernet/intel/idpf/Makefile
index 6844ead2f3ac..2ce01a0b5898 100644
--- a/drivers/net/ethernet/intel/idpf/Makefile
+++ b/drivers/net/ethernet/intel/idpf/Makefile
@@ -12,7 +12,8 @@ idpf-y := \
idpf_ethtool.o \
idpf_lib.o \
idpf_main.o \
- idpf_singleq_txrx.o \
idpf_txrx.o \
idpf_virtchnl.o \
idpf_vf_dev.o
+
+idpf-$(CONFIG_IDPF_SINGLEQ) += idpf_singleq_txrx.o
diff --git a/drivers/net/ethernet/intel/idpf/idpf.h b/drivers/net/ethernet/intel/idpf/idpf.h
index f9e43d171f17..5d9529f5b41b 100644
--- a/drivers/net/ethernet/intel/idpf/idpf.h
+++ b/drivers/net/ethernet/intel/idpf/idpf.h
@@ -599,7 +599,8 @@ struct idpf_adapter {
*/
static inline int idpf_is_queue_model_split(u16 q_model)
{
- return q_model == VIRTCHNL2_QUEUE_MODEL_SPLIT;
+ return !IS_ENABLED(CONFIG_IDPF_SINGLEQ) ||
+ q_model == VIRTCHNL2_QUEUE_MODEL_SPLIT;
}

#define idpf_is_cap_ena(adapter, field, flag) \
diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
index 5dd1b1a9e624..e79a8f9dfc40 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
@@ -1309,7 +1309,7 @@ static void idpf_vport_calc_numq_per_grp(struct idpf_vport *vport,
static void idpf_rxq_set_descids(const struct idpf_vport *vport,
struct idpf_rx_queue *q)
{
- if (vport->rxq_model == VIRTCHNL2_QUEUE_MODEL_SPLIT) {
+ if (idpf_is_queue_model_split(vport->rxq_model)) {
q->rxdids = VIRTCHNL2_RXDID_2_FLEX_SPLITQ_M;
} else {
if (vport->base_rxd)
diff --git a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
index 44602b87cd41..1aa4770dfe18 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
@@ -1256,12 +1256,12 @@ int idpf_send_create_vport_msg(struct idpf_adapter *adapter,
vport_msg->vport_type = cpu_to_le16(VIRTCHNL2_VPORT_TYPE_DEFAULT);
vport_msg->vport_index = cpu_to_le16(idx);

- if (adapter->req_tx_splitq)
+ if (adapter->req_tx_splitq || !IS_ENABLED(CONFIG_IDPF_SINGLEQ))
vport_msg->txq_model = cpu_to_le16(VIRTCHNL2_QUEUE_MODEL_SPLIT);
else
vport_msg->txq_model = cpu_to_le16(VIRTCHNL2_QUEUE_MODEL_SINGLE);

- if (adapter->req_rx_splitq)
+ if (adapter->req_rx_splitq || !IS_ENABLED(CONFIG_IDPF_SINGLEQ))
vport_msg->rxq_model = cpu_to_le16(VIRTCHNL2_QUEUE_MODEL_SPLIT);
else
vport_msg->rxq_model = cpu_to_le16(VIRTCHNL2_QUEUE_MODEL_SINGLE);
@@ -1323,10 +1323,17 @@ int idpf_check_supported_desc_ids(struct idpf_vport *vport)

vport_msg = adapter->vport_params_recvd[vport->idx];

+ if (!IS_ENABLED(CONFIG_IDPF_SINGLEQ) &&
+ (vport_msg->rxq_model == VIRTCHNL2_QUEUE_MODEL_SINGLE ||
+ vport_msg->txq_model == VIRTCHNL2_QUEUE_MODEL_SINGLE)) {
+ pci_err(adapter->pdev, "singleq mode requested, but not compiled-in\n");
+ return -EOPNOTSUPP;
+ }
+
rx_desc_ids = le64_to_cpu(vport_msg->rx_desc_ids);
tx_desc_ids = le64_to_cpu(vport_msg->tx_desc_ids);

- if (vport->rxq_model == VIRTCHNL2_QUEUE_MODEL_SPLIT) {
+ if (idpf_is_queue_model_split(vport->rxq_model)) {
if (!(rx_desc_ids & VIRTCHNL2_RXDID_2_FLEX_SPLITQ_M)) {
dev_info(&adapter->pdev->dev, "Minimum RX descriptor support not provided, using the default\n");
vport_msg->rx_desc_ids = cpu_to_le64(VIRTCHNL2_RXDID_2_FLEX_SPLITQ_M);
@@ -1336,7 +1343,7 @@ int idpf_check_supported_desc_ids(struct idpf_vport *vport)
vport->base_rxd = true;
}

- if (vport->txq_model != VIRTCHNL2_QUEUE_MODEL_SPLIT)
+ if (!idpf_is_queue_model_split(vport->txq_model))
return 0;

if ((tx_desc_ids & MIN_SUPPORT_TXDID) != MIN_SUPPORT_TXDID) {
--
2.45.1


2024-05-28 13:52:09

by Alexander Lobakin

[permalink] [raw]
Subject: [PATCH iwl-next 03/12] idpf: split &idpf_queue into 4 strictly-typed queue structures

Currently, sizeof(struct idpf_queue) is 32 Kb.
This is due to the 12-bit hashtable declaration at the end of the queue.
This HT is needed only for Tx queues when the flow scheduling mode is
enabled. But &idpf_queue is unified for all of the queue types,
causing excessive memory usage.
The unified structure also makes the code less efficient due to
suboptimal field placement. You can't avoid that unless you union
nearly every pair of fields, and even then differing field alignment
still doesn't let you optimize the layout fully.
Split &idpf_queue into 4 structures corresponding to the queue types:
RQ (Rx queue), SQ (Tx queue), FQ (buffer queue), and CQ (completion
queue). Place there only the fields each queue type needs, plus
shortcuts handy for the hotpath.
Allocate the abovementioned hashtable dynamically and only when needed,
keeping &idpf_tx_queue relatively short (192 bytes, same as Rx). This HT
is used only for OOO completions, which aren't really hotpath anyway.
Note that this change must be done atomically, otherwise it's really
easy to get lost and miss something.
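
For illustration only: a heavily simplified, standalone sketch of the
size problem and the fix. Every name here (unified_queue, tx_queue,
txq_stash, SCHED_HASH_BITS) is a hypothetical stand-in, not the actual
driver structures; it only shows how moving the 2^12-entry hashtable
out of the common struct and behind a dynamically allocated per-TXQ
stash pointer shrinks the per-queue footprint from ~32 Kb to a few
cachelines.

/* hypothetical sketch, not idpf code */
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

#define SCHED_HASH_BITS	12

struct unified_queue {			/* every queue type pays for the HT */
	void *desc_ring;
	uint16_t desc_count, next_to_use, next_to_clean;
	void *sched_buf_hash[1 << SCHED_HASH_BITS];	/* ~32 Kb on 64-bit */
};

struct txq_stash {			/* allocated only when flow scheduling is on */
	void *sched_buf_hash[1 << SCHED_HASH_BITS];
};

struct tx_queue {			/* slim, hotpath-only fields up front */
	void *desc_ring;
	uint16_t desc_count, next_to_use, next_to_clean;
	struct txq_stash *stash;	/* NULL unless the mode needs it */
};

int main(void)
{
	struct tx_queue txq = { .stash = calloc(1, sizeof(struct txq_stash)) };

	printf("unified: %zu bytes, split tx: %zu bytes\n",
	       sizeof(struct unified_queue), sizeof(struct tx_queue));

	free(txq.stash);
	return 0;
}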

Signed-off-by: Alexander Lobakin <[email protected]>
---
drivers/net/ethernet/intel/idpf/idpf.h | 3 +-
drivers/net/ethernet/intel/idpf/idpf_txrx.h | 440 ++++++---
.../net/ethernet/intel/idpf/idpf_ethtool.c | 125 +--
drivers/net/ethernet/intel/idpf/idpf_lib.c | 46 +-
.../ethernet/intel/idpf/idpf_singleq_txrx.c | 149 +--
drivers/net/ethernet/intel/idpf/idpf_txrx.c | 915 +++++++++++-------
.../net/ethernet/intel/idpf/idpf_virtchnl.c | 73 +-
7 files changed, 1021 insertions(+), 730 deletions(-)

diff --git a/drivers/net/ethernet/intel/idpf/idpf.h b/drivers/net/ethernet/intel/idpf/idpf.h
index 0b26dd9b8a51..f9e43d171f17 100644
--- a/drivers/net/ethernet/intel/idpf/idpf.h
+++ b/drivers/net/ethernet/intel/idpf/idpf.h
@@ -17,7 +17,6 @@ struct idpf_vport_max_q;
#include <linux/sctp.h>
#include <linux/ethtool_netlink.h>
#include <net/gro.h>
-#include <linux/dim.h>

#include "virtchnl2.h"
#include "idpf_txrx.h"
@@ -301,7 +300,7 @@ struct idpf_vport {
u16 num_txq_grp;
struct idpf_txq_group *txq_grps;
u32 txq_model;
- struct idpf_queue **txqs;
+ struct idpf_tx_queue **txqs;
bool crc_enable;

u16 num_rxq;
diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.h b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
index 6dce14483215..ce4ade9c864f 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.h
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
@@ -4,6 +4,8 @@
#ifndef _IDPF_TXRX_H_
#define _IDPF_TXRX_H_

+#include <linux/dim.h>
+
#include <net/page_pool/helpers.h>
#include <net/tcp.h>
#include <net/netdev_queues.h>
@@ -84,7 +86,7 @@
do { \
if (unlikely(++(ntc) == (rxq)->desc_count)) { \
ntc = 0; \
- change_bit(__IDPF_Q_GEN_CHK, (rxq)->flags); \
+ idpf_queue_change(GEN_CHK, rxq); \
} \
} while (0)

@@ -111,10 +113,9 @@ do { \
*/
#define IDPF_TX_SPLITQ_RE_MIN_GAP 64

-#define IDPF_RX_BI_BUFID_S 0
-#define IDPF_RX_BI_BUFID_M GENMASK(14, 0)
-#define IDPF_RX_BI_GEN_S 15
-#define IDPF_RX_BI_GEN_M BIT(IDPF_RX_BI_GEN_S)
+#define IDPF_RX_BI_GEN_M BIT(16)
+#define IDPF_RX_BI_BUFID_M GENMASK(15, 0)
+
#define IDPF_RXD_EOF_SPLITQ VIRTCHNL2_RX_FLEX_DESC_ADV_STATUS0_EOF_M
#define IDPF_RXD_EOF_SINGLEQ VIRTCHNL2_RX_BASE_DESC_STATUS_EOF_M

@@ -122,7 +123,7 @@ do { \
((((txq)->next_to_clean > (txq)->next_to_use) ? 0 : (txq)->desc_count) + \
(txq)->next_to_clean - (txq)->next_to_use - 1)

-#define IDPF_TX_BUF_RSV_UNUSED(txq) ((txq)->buf_stack.top)
+#define IDPF_TX_BUF_RSV_UNUSED(txq) ((txq)->stash->buf_stack.top)
#define IDPF_TX_BUF_RSV_LOW(txq) (IDPF_TX_BUF_RSV_UNUSED(txq) < \
(txq)->desc_count >> 2)

@@ -433,23 +434,37 @@ struct idpf_rx_ptype_decoded {
* to 1 and knows that reading a gen bit of 1 in any
* descriptor on the initial pass of the ring indicates a
* writeback. It also flips on every ring wrap.
- * @__IDPF_RFLQ_GEN_CHK: Refill queues are SW only, so Q_GEN acts as the HW bit
- * and RFLGQ_GEN is the SW bit.
+ * @__IDPF_Q_RFL_GEN_CHK: Refill queues are SW only, so Q_GEN acts as the HW
+ * bit and Q_RFL_GEN is the SW bit.
* @__IDPF_Q_FLOW_SCH_EN: Enable flow scheduling
* @__IDPF_Q_SW_MARKER: Used to indicate TX queue marker completions
* @__IDPF_Q_POLL_MODE: Enable poll mode
+ * @__IDPF_Q_CRC_EN: enable CRC offload in singleq mode
+ * @__IDPF_Q_HSPLIT_EN: enable header split on Rx (splitq)
* @__IDPF_Q_FLAGS_NBITS: Must be last
*/
enum idpf_queue_flags_t {
__IDPF_Q_GEN_CHK,
- __IDPF_RFLQ_GEN_CHK,
+ __IDPF_Q_RFL_GEN_CHK,
__IDPF_Q_FLOW_SCH_EN,
__IDPF_Q_SW_MARKER,
__IDPF_Q_POLL_MODE,
+ __IDPF_Q_CRC_EN,
+ __IDPF_Q_HSPLIT_EN,

__IDPF_Q_FLAGS_NBITS,
};

+#define idpf_queue_set(f, q) __set_bit(__IDPF_Q_##f, (q)->flags)
+#define idpf_queue_clear(f, q) __clear_bit(__IDPF_Q_##f, (q)->flags)
+#define idpf_queue_change(f, q) __change_bit(__IDPF_Q_##f, (q)->flags)
+#define idpf_queue_has(f, q) test_bit(__IDPF_Q_##f, (q)->flags)
+
+#define idpf_queue_has_clear(f, q) \
+ __test_and_clear_bit(__IDPF_Q_##f, (q)->flags)
+#define idpf_queue_assign(f, q, v) \
+ __assign_bit(__IDPF_Q_##f, (q)->flags, v)
+
/**
* struct idpf_vec_regs
* @dyn_ctl_reg: Dynamic control interrupt register offset
@@ -495,7 +510,9 @@ struct idpf_intr_reg {
* @v_idx: Vector index
* @intr_reg: See struct idpf_intr_reg
* @num_txq: Number of TX queues
+ * @num_complq: number of completion queues
* @tx: Array of TX queues to service
+ * @complq: array of completion queues
* @tx_dim: Data for TX net_dim algorithm
* @tx_itr_value: TX interrupt throttling rate
* @tx_intr_mode: Dynamic ITR or not
@@ -519,21 +536,24 @@ struct idpf_q_vector {
struct idpf_intr_reg intr_reg;

u16 num_txq;
- struct idpf_queue **tx;
+ u16 num_complq;
+ struct idpf_tx_queue **tx;
+ struct idpf_compl_queue **complq;
+
struct dim tx_dim;
u16 tx_itr_value;
bool tx_intr_mode;
u32 tx_itr_idx;

u16 num_rxq;
- struct idpf_queue **rx;
+ struct idpf_rx_queue **rx;
struct dim rx_dim;
u16 rx_itr_value;
bool rx_intr_mode;
u32 rx_itr_idx;

u16 num_bufq;
- struct idpf_queue **bufq;
+ struct idpf_buf_queue **bufq;

u16 total_events;
char *name;
@@ -564,11 +584,6 @@ struct idpf_cleaned_stats {
u32 bytes;
};

-union idpf_queue_stats {
- struct idpf_rx_queue_stats rx;
- struct idpf_tx_queue_stats tx;
-};
-
#define IDPF_ITR_DYNAMIC 1
#define IDPF_ITR_MAX 0x1FE0
#define IDPF_ITR_20K 0x0032
@@ -584,39 +599,114 @@ union idpf_queue_stats {
#define IDPF_DIM_DEFAULT_PROFILE_IX 1

/**
- * struct idpf_queue
- * @dev: Device back pointer for DMA mapping
- * @vport: Back pointer to associated vport
- * @txq_grp: See struct idpf_txq_group
- * @rxq_grp: See struct idpf_rxq_group
- * @idx: For buffer queue, it is used as group id, either 0 or 1. On clean,
- * buffer queue uses this index to determine which group of refill queues
- * to clean.
- * For TX queue, it is used as index to map between TX queue group and
- * hot path TX pointers stored in vport. Used in both singleq/splitq.
- * For RX queue, it is used to index to total RX queue across groups and
+ * struct idpf_txq_stash - Tx buffer stash for Flow-based scheduling mode
+ * @buf_stack: Stack of empty buffers to store buffer info for out of order
+ * buffer completions. See struct idpf_buf_lifo.
+ * @sched_buf_hash: Hash table to stores buffers
+ */
+struct idpf_txq_stash {
+ struct idpf_buf_lifo buf_stack;
+ DECLARE_HASHTABLE(sched_buf_hash, 12);
+} ____cacheline_aligned;
+
+/**
+ * struct idpf_rx_queue - software structure representing a receive queue
+ * @rx: universal receive descriptor array
+ * @single_buf: buffer descriptor array in singleq
+ * @desc_ring: virtual descriptor ring address
+ * @bufq_sets: Pointer to the array of buffer queues in splitq mode
+ * @napi: NAPI instance corresponding to this queue (splitq)
+ * @rx_buf: See struct idpf_rx_buf
+ * @pp: Page pool pointer in singleq mode
+ * @netdev: &net_device corresponding to this queue
+ * @tail: Tail offset. Used for both queue models single and split.
+ * @flags: See enum idpf_queue_flags_t
+ * @idx: For RX queue, it is used to index to total RX queue across groups and
* used for skb reporting.
- * @tail: Tail offset. Used for both queue models single and split. In splitq
- * model relevant only for TX queue and RX queue.
- * @tx_buf: See struct idpf_tx_buf
- * @rx_buf: Struct with RX buffer related members
- * @rx_buf.buf: See struct idpf_rx_buf
- * @rx_buf.hdr_buf_pa: DMA handle
- * @rx_buf.hdr_buf_va: Virtual address
- * @pp: Page pool pointer
+ * @desc_count: Number of descriptors
+ * @next_to_use: Next descriptor to use
+ * @next_to_clean: Next descriptor to clean
+ * @next_to_alloc: RX buffer to allocate at
+ * @rxdids: Supported RX descriptor ids
+ * @rx_ptype_lkup: LUT of Rx ptypes
* @skb: Pointer to the skb
- * @q_type: Queue type (TX, RX, TX completion, RX buffer)
+ * @stats_sync: See struct u64_stats_sync
+ * @q_stats: See union idpf_rx_queue_stats
* @q_id: Queue id
- * @desc_count: Number of descriptors
- * @next_to_use: Next descriptor to use. Relevant in both split & single txq
- * and bufq.
- * @next_to_clean: Next descriptor to clean. In split queue model, only
- * relevant to TX completion queue and RX queue.
- * @next_to_alloc: RX buffer to allocate at. Used only for RX. In splitq model
- * only relevant to RX queue.
+ * @size: Length of descriptor ring in bytes
+ * @dma: Physical address of ring
+ * @q_vector: Backreference to associated vector
+ * @rx_buffer_low_watermark: RX buffer low watermark
+ * @rx_hbuf_size: Header buffer size
+ * @rx_buf_size: Buffer size
+ * @rx_max_pkt_size: RX max packet size
+ */
+struct idpf_rx_queue {
+ union {
+ union virtchnl2_rx_desc *rx;
+ struct virtchnl2_singleq_rx_buf_desc *single_buf;
+
+ void *desc_ring;
+ };
+ union {
+ struct {
+ struct idpf_bufq_set *bufq_sets;
+ struct napi_struct *napi;
+ };
+ struct {
+ struct idpf_rx_buf *rx_buf;
+ struct page_pool *pp;
+ };
+ };
+ struct net_device *netdev;
+ void __iomem *tail;
+
+ DECLARE_BITMAP(flags, __IDPF_Q_FLAGS_NBITS);
+ u16 idx;
+ u16 desc_count;
+ u16 next_to_use;
+ u16 next_to_clean;
+ u16 next_to_alloc;
+
+ u32 rxdids;
+
+ const struct idpf_rx_ptype_decoded *rx_ptype_lkup;
+ struct sk_buff *skb;
+
+ struct u64_stats_sync stats_sync;
+ struct idpf_rx_queue_stats q_stats;
+
+ /* Slowpath */
+ u32 q_id;
+ u32 size;
+ dma_addr_t dma;
+
+ struct idpf_q_vector *q_vector;
+
+ u16 rx_buffer_low_watermark;
+ u16 rx_hbuf_size;
+ u16 rx_buf_size;
+ u16 rx_max_pkt_size;
+} ____cacheline_aligned;
+
+/**
+ * struct idpf_tx_queue - software structure representing a transmit queue
+ * @base_tx: base Tx descriptor array
+ * @base_ctx: base Tx context descriptor array
+ * @flex_tx: flex Tx descriptor array
+ * @flex_ctx: flex Tx context descriptor array
+ * @desc_ring: virtual descriptor ring address
+ * @tx_buf: See struct idpf_tx_buf
+ * @txq_grp: See struct idpf_txq_group
+ * @dev: Device back pointer for DMA mapping
+ * @tail: Tail offset. Used for both queue models single and split
* @flags: See enum idpf_queue_flags_t
- * @q_stats: See union idpf_queue_stats
- * @stats_sync: See struct u64_stats_sync
+ * @idx: For TX queue, it is used as index to map between TX queue group and
+ * hot path TX pointers stored in vport. Used in both singleq/splitq.
+ * @desc_count: Number of descriptors
+ * @next_to_use: Next descriptor to use
+ * @next_to_clean: Next descriptor to clean
+ * @netdev: &net_device corresponding to this queue
* @cleaned_bytes: Splitq only, TXQ only: When a TX completion is received on
* the TX completion queue, it can be for any TXQ associated
* with that completion queue. This means we can clean up to
@@ -625,34 +715,10 @@ union idpf_queue_stats {
* that single call to clean the completion queue. By doing so,
* we can update BQL with aggregate cleaned stats for each TXQ
* only once at the end of the cleaning routine.
+ * @clean_budget: singleq only, queue cleaning budget
* @cleaned_pkts: Number of packets cleaned for the above said case
- * @rx_hsplit_en: RX headsplit enable
- * @rx_hbuf_size: Header buffer size
- * @rx_buf_size: Buffer size
- * @rx_max_pkt_size: RX max packet size
- * @rx_buf_stride: RX buffer stride
- * @rx_buffer_low_watermark: RX buffer low watermark
- * @rxdids: Supported RX descriptor ids
- * @q_vector: Backreference to associated vector
- * @size: Length of descriptor ring in bytes
- * @dma: Physical address of ring
- * @rx: universal receive descriptor array
- * @single_buf: Rx buffer descriptor array in singleq
- * @split_buf: Rx buffer descriptor array in splitq
- * @base_tx: basic Tx descriptor array
- * @base_ctx: basic Tx context descriptor array
- * @flex_tx: flex Tx descriptor array
- * @flex_ctx: flex Tx context descriptor array
- * @comp: completion descriptor array
- * @desc_ring: virtual descriptor ring address
* @tx_max_bufs: Max buffers that can be transmitted with scatter-gather
* @tx_min_pkt_len: Min supported packet length
- * @num_completions: Only relevant for TX completion queue. It tracks the
- * number of completions received to compare against the
- * number of completions pending, as accumulated by the
- * TX queues.
- * @buf_stack: Stack of empty buffers to store buffer info for out of order
- * buffer completions. See struct idpf_buf_lifo.
* @compl_tag_bufid_m: Completion tag buffer id mask
* @compl_tag_gen_s: Completion tag generation bit
* The format of the completion tag will change based on the TXQ
@@ -676,120 +742,188 @@ union idpf_queue_stats {
* This gives us 8*8160 = 65280 possible unique values.
* @compl_tag_cur_gen: Used to keep track of current completion tag generation
* @compl_tag_gen_max: To determine when compl_tag_cur_gen should be reset
- * @sched_buf_hash: Hash table to stores buffers
+ * @stash: Tx buffer stash for Flow-based scheduling mode
+ * @stats_sync: See struct u64_stats_sync
+ * @q_stats: See union idpf_tx_queue_stats
+ * @q_id: Queue id
+ * @size: Length of descriptor ring in bytes
+ * @dma: Physical address of ring
+ * @q_vector: Backreference to associated vector
*/
-struct idpf_queue {
- struct device *dev;
- struct idpf_vport *vport;
+struct idpf_tx_queue {
union {
- struct idpf_txq_group *txq_grp;
- struct idpf_rxq_group *rxq_grp;
+ struct idpf_base_tx_desc *base_tx;
+ struct idpf_base_tx_ctx_desc *base_ctx;
+ union idpf_tx_flex_desc *flex_tx;
+ struct idpf_flex_tx_ctx_desc *flex_ctx;
+
+ void *desc_ring;
};
- u16 idx;
+ struct idpf_tx_buf *tx_buf;
+ struct idpf_txq_group *txq_grp;
+ struct device *dev;
void __iomem *tail;
- union {
- struct idpf_tx_buf *tx_buf;
- struct {
- struct idpf_rx_buf *buf;
- dma_addr_t hdr_buf_pa;
- void *hdr_buf_va;
- } rx_buf;
- };
- struct page_pool *pp;
- struct sk_buff *skb;
- u16 q_type;
- u32 q_id;
- u16 desc_count;

+ DECLARE_BITMAP(flags, __IDPF_Q_FLAGS_NBITS);
+ u16 idx;
+ u16 desc_count;
u16 next_to_use;
u16 next_to_clean;
- u16 next_to_alloc;
- DECLARE_BITMAP(flags, __IDPF_Q_FLAGS_NBITS);

- union idpf_queue_stats q_stats;
- struct u64_stats_sync stats_sync;
+ struct net_device *netdev;

- u32 cleaned_bytes;
+ union {
+ u32 cleaned_bytes;
+ u32 clean_budget;
+ };
u16 cleaned_pkts;

- bool rx_hsplit_en;
- u16 rx_hbuf_size;
- u16 rx_buf_size;
- u16 rx_max_pkt_size;
- u16 rx_buf_stride;
- u8 rx_buffer_low_watermark;
- u64 rxdids;
- struct idpf_q_vector *q_vector;
- unsigned int size;
+ u16 tx_max_bufs;
+ u16 tx_min_pkt_len;
+
+ u16 compl_tag_bufid_m;
+ u16 compl_tag_gen_s;
+
+ u16 compl_tag_cur_gen;
+ u16 compl_tag_gen_max;
+
+ struct idpf_txq_stash *stash;
+
+ struct u64_stats_sync stats_sync;
+ struct idpf_tx_queue_stats q_stats;
+
+ /* Slowpath */
+ u32 q_id;
+ u32 size;
dma_addr_t dma;
- union {
- union virtchnl2_rx_desc *rx;

- struct virtchnl2_singleq_rx_buf_desc *single_buf;
- struct virtchnl2_splitq_rx_buf_desc *split_buf;
+ struct idpf_q_vector *q_vector;
+} ____cacheline_aligned;

- struct idpf_base_tx_desc *base_tx;
- struct idpf_base_tx_ctx_desc *base_ctx;
- union idpf_tx_flex_desc *flex_tx;
- struct idpf_flex_tx_ctx_desc *flex_ctx;
+/**
+ * struct idpf_buf_queue - software structure representing a buffer queue
+ * @split_buf: buffer descriptor array
+ * @rx_buf: Struct with RX buffer related members
+ * @rx_buf.buf: See struct idpf_rx_buf
+ * @rx_buf.hdr_buf_pa: DMA handle
+ * @rx_buf.hdr_buf_va: Virtual address
+ * @pp: Page pool pointer
+ * @tail: Tail offset
+ * @flags: See enum idpf_queue_flags_t
+ * @desc_count: Number of descriptors
+ * @next_to_use: Next descriptor to use
+ * @next_to_clean: Next descriptor to clean
+ * @next_to_alloc: RX buffer to allocate at
+ * @q_id: Queue id
+ * @size: Length of descriptor ring in bytes
+ * @dma: Physical address of ring
+ * @q_vector: Backreference to associated vector
+ * @rx_buffer_low_watermark: RX buffer low watermark
+ * @rx_hbuf_size: Header buffer size
+ * @rx_buf_size: Buffer size
+ */
+struct idpf_buf_queue {
+ struct virtchnl2_splitq_rx_buf_desc *split_buf;
+ struct {
+ struct idpf_rx_buf *buf;
+ dma_addr_t hdr_buf_pa;
+ void *hdr_buf_va;
+ } rx_buf;
+ struct page_pool *pp;
+ void __iomem *tail;

- struct idpf_splitq_tx_compl_desc *comp;
+ DECLARE_BITMAP(flags, __IDPF_Q_FLAGS_NBITS);
+ u16 desc_count;
+ u16 next_to_use;
+ u16 next_to_clean;
+ u16 next_to_alloc;

- void *desc_ring;
- };
+ /* Slowpath */
+ u32 q_id;
+ u32 size;
+ dma_addr_t dma;

- u16 tx_max_bufs;
- u8 tx_min_pkt_len;
+ struct idpf_q_vector *q_vector;

- u32 num_completions;
+ u16 rx_buffer_low_watermark;
+ u16 rx_hbuf_size;
+ u16 rx_buf_size;
+} ____cacheline_aligned;

- struct idpf_buf_lifo buf_stack;
+/**
+ * struct idpf_compl_queue - software structure representing a completion queue
+ * @comp: completion descriptor array
+ * @txq_grp: See struct idpf_txq_group
+ * @flags: See enum idpf_queue_flags_t
+ * @desc_count: Number of descriptors
+ * @next_to_use: Next descriptor to use. Relevant in both split & single txq
+ * and bufq.
+ * @next_to_clean: Next descriptor to clean
+ * @netdev: &net_device corresponding to this queue
+ * @clean_budget: queue cleaning budget
+ * @num_completions: Only relevant for TX completion queue. It tracks the
+ * number of completions received to compare against the
+ * number of completions pending, as accumulated by the
+ * TX queues.
+ * @q_id: Queue id
+ * @size: Length of descriptor ring in bytes
+ * @dma: Physical address of ring
+ * @q_vector: Backreference to associated vector
+ */
+struct idpf_compl_queue {
+ struct idpf_splitq_tx_compl_desc *comp;
+ struct idpf_txq_group *txq_grp;

- u16 compl_tag_bufid_m;
- u16 compl_tag_gen_s;
+ DECLARE_BITMAP(flags, __IDPF_Q_FLAGS_NBITS);
+ u16 desc_count;
+ u16 next_to_use;
+ u16 next_to_clean;

- u16 compl_tag_cur_gen;
- u16 compl_tag_gen_max;
+ struct net_device *netdev;
+ u32 clean_budget;
+ u32 num_completions;

- DECLARE_HASHTABLE(sched_buf_hash, 12);
-} ____cacheline_internodealigned_in_smp;
+ /* Slowpath */
+ u32 q_id;
+ u32 size;
+ dma_addr_t dma;
+
+ struct idpf_q_vector *q_vector;
+} ____cacheline_aligned;

/**
* struct idpf_sw_queue
- * @next_to_clean: Next descriptor to clean
- * @next_to_alloc: Buffer to allocate at
- * @flags: See enum idpf_queue_flags_t
* @ring: Pointer to the ring
+ * @flags: See enum idpf_queue_flags_t
* @desc_count: Descriptor count
- * @dev: Device back pointer for DMA mapping
+ * @next_to_use: Buffer to allocate at
+ * @next_to_clean: Next descriptor to clean
*
* Software queues are used in splitq mode to manage buffers between rxq
* producer and the bufq consumer. These are required in order to maintain a
* lockless buffer management system and are strictly software only constructs.
*/
struct idpf_sw_queue {
- u16 next_to_clean;
- u16 next_to_alloc;
+ u32 *ring;
+
DECLARE_BITMAP(flags, __IDPF_Q_FLAGS_NBITS);
- u16 *ring;
u16 desc_count;
- struct device *dev;
-} ____cacheline_internodealigned_in_smp;
+ u16 next_to_use;
+ u16 next_to_clean;
+} ____cacheline_aligned;

/**
* struct idpf_rxq_set
* @rxq: RX queue
- * @refillq0: Pointer to refill queue 0
- * @refillq1: Pointer to refill queue 1
+ * @refillq: pointers to refill queues
*
* Splitq only. idpf_rxq_set associates an rxq with at an array of refillqs.
* Each rxq needs a refillq to return used buffers back to the respective bufq.
* Bufqs then clean these refillqs for buffers to give to hardware.
*/
struct idpf_rxq_set {
- struct idpf_queue rxq;
- struct idpf_sw_queue *refillq0;
- struct idpf_sw_queue *refillq1;
+ struct idpf_rx_queue rxq;
+ struct idpf_sw_queue *refillq[IDPF_MAX_BUFQS_PER_RXQ_GRP];
};

/**
@@ -808,7 +942,7 @@ struct idpf_rxq_set {
* managed by at most two bufqs (depending on performance configuration).
*/
struct idpf_bufq_set {
- struct idpf_queue bufq;
+ struct idpf_buf_queue bufq;
int num_refillqs;
struct idpf_sw_queue *refillqs;
};
@@ -834,7 +968,7 @@ struct idpf_rxq_group {
union {
struct {
u16 num_rxq;
- struct idpf_queue *rxqs[IDPF_LARGE_MAX_Q];
+ struct idpf_rx_queue *rxqs[IDPF_LARGE_MAX_Q];
} singleq;
struct {
u16 num_rxq_sets;
@@ -849,6 +983,7 @@ struct idpf_rxq_group {
* @vport: Vport back pointer
* @num_txq: Number of TX queues associated
* @txqs: Array of TX queue pointers
+ * @stashes: array of OOO stashes for the queues
* @complq: Associated completion queue pointer, split queue only
* @num_completions_pending: Total number of completions pending for the
* completion queue, acculumated for all TX queues
@@ -862,9 +997,10 @@ struct idpf_txq_group {
struct idpf_vport *vport;

u16 num_txq;
- struct idpf_queue *txqs[IDPF_LARGE_MAX_Q];
+ struct idpf_tx_queue *txqs[IDPF_LARGE_MAX_Q];
+ struct idpf_txq_stash *stashes;

- struct idpf_queue *complq;
+ struct idpf_compl_queue *complq;

u32 num_completions_pending;
};
@@ -1001,28 +1137,26 @@ void idpf_deinit_rss(struct idpf_vport *vport);
int idpf_rx_bufs_init_all(struct idpf_vport *vport);
void idpf_rx_add_frag(struct idpf_rx_buf *rx_buf, struct sk_buff *skb,
unsigned int size);
-struct sk_buff *idpf_rx_construct_skb(struct idpf_queue *rxq,
+struct sk_buff *idpf_rx_construct_skb(const struct idpf_rx_queue *rxq,
struct idpf_rx_buf *rx_buf,
unsigned int size);
-bool idpf_init_rx_buf_hw_alloc(struct idpf_queue *rxq, struct idpf_rx_buf *buf);
-void idpf_rx_buf_hw_update(struct idpf_queue *rxq, u32 val);
-void idpf_tx_buf_hw_update(struct idpf_queue *tx_q, u32 val,
+void idpf_tx_buf_hw_update(struct idpf_tx_queue *tx_q, u32 val,
bool xmit_more);
unsigned int idpf_size_to_txd_count(unsigned int size);
-netdev_tx_t idpf_tx_drop_skb(struct idpf_queue *tx_q, struct sk_buff *skb);
-void idpf_tx_dma_map_error(struct idpf_queue *txq, struct sk_buff *skb,
+netdev_tx_t idpf_tx_drop_skb(struct idpf_tx_queue *tx_q, struct sk_buff *skb);
+void idpf_tx_dma_map_error(struct idpf_tx_queue *txq, struct sk_buff *skb,
struct idpf_tx_buf *first, u16 ring_idx);
-unsigned int idpf_tx_desc_count_required(struct idpf_queue *txq,
+unsigned int idpf_tx_desc_count_required(struct idpf_tx_queue *txq,
struct sk_buff *skb);
bool idpf_chk_linearize(struct sk_buff *skb, unsigned int max_bufs,
unsigned int count);
-int idpf_tx_maybe_stop_common(struct idpf_queue *tx_q, unsigned int size);
+int idpf_tx_maybe_stop_common(struct idpf_tx_queue *tx_q, unsigned int size);
void idpf_tx_timeout(struct net_device *netdev, unsigned int txqueue);
netdev_tx_t idpf_tx_splitq_start(struct sk_buff *skb,
struct net_device *netdev);
netdev_tx_t idpf_tx_singleq_start(struct sk_buff *skb,
struct net_device *netdev);
-bool idpf_rx_singleq_buf_hw_alloc_all(struct idpf_queue *rxq,
+bool idpf_rx_singleq_buf_hw_alloc_all(struct idpf_rx_queue *rxq,
u16 cleaned_count);
int idpf_tso(struct sk_buff *skb, struct idpf_tx_offload_params *off);

diff --git a/drivers/net/ethernet/intel/idpf/idpf_ethtool.c b/drivers/net/ethernet/intel/idpf/idpf_ethtool.c
index 1885ba618981..e933fed16c7e 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_ethtool.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_ethtool.c
@@ -437,22 +437,24 @@ struct idpf_stats {
.stat_offset = offsetof(_type, _stat) \
}

-/* Helper macro for defining some statistics related to queues */
-#define IDPF_QUEUE_STAT(_name, _stat) \
- IDPF_STAT(struct idpf_queue, _name, _stat)
+/* Helper macros for defining some statistics related to queues */
+#define IDPF_RX_QUEUE_STAT(_name, _stat) \
+ IDPF_STAT(struct idpf_rx_queue, _name, _stat)
+#define IDPF_TX_QUEUE_STAT(_name, _stat) \
+ IDPF_STAT(struct idpf_tx_queue, _name, _stat)

/* Stats associated with a Tx queue */
static const struct idpf_stats idpf_gstrings_tx_queue_stats[] = {
- IDPF_QUEUE_STAT("pkts", q_stats.tx.packets),
- IDPF_QUEUE_STAT("bytes", q_stats.tx.bytes),
- IDPF_QUEUE_STAT("lso_pkts", q_stats.tx.lso_pkts),
+ IDPF_TX_QUEUE_STAT("pkts", q_stats.packets),
+ IDPF_TX_QUEUE_STAT("bytes", q_stats.bytes),
+ IDPF_TX_QUEUE_STAT("lso_pkts", q_stats.lso_pkts),
};

/* Stats associated with an Rx queue */
static const struct idpf_stats idpf_gstrings_rx_queue_stats[] = {
- IDPF_QUEUE_STAT("pkts", q_stats.rx.packets),
- IDPF_QUEUE_STAT("bytes", q_stats.rx.bytes),
- IDPF_QUEUE_STAT("rx_gro_hw_pkts", q_stats.rx.rsc_pkts),
+ IDPF_RX_QUEUE_STAT("pkts", q_stats.packets),
+ IDPF_RX_QUEUE_STAT("bytes", q_stats.bytes),
+ IDPF_RX_QUEUE_STAT("rx_gro_hw_pkts", q_stats.rsc_pkts),
};

#define IDPF_TX_QUEUE_STATS_LEN ARRAY_SIZE(idpf_gstrings_tx_queue_stats)
@@ -633,7 +635,7 @@ static int idpf_get_sset_count(struct net_device *netdev, int sset)
* Copies the stat data defined by the pointer and stat structure pair into
* the memory supplied as data. If the pointer is null, data will be zero'd.
*/
-static void idpf_add_one_ethtool_stat(u64 *data, void *pstat,
+static void idpf_add_one_ethtool_stat(u64 *data, const void *pstat,
const struct idpf_stats *stat)
{
char *p;
@@ -671,6 +673,7 @@ static void idpf_add_one_ethtool_stat(u64 *data, void *pstat,
* idpf_add_queue_stats - copy queue statistics into supplied buffer
* @data: ethtool stats buffer
* @q: the queue to copy
+ * @type: type of the queue
*
* Queue statistics must be copied while protected by u64_stats_fetch_begin,
* so we can't directly use idpf_add_ethtool_stats. Assumes that queue stats
@@ -681,19 +684,23 @@ static void idpf_add_one_ethtool_stat(u64 *data, void *pstat,
*
* This function expects to be called while under rcu_read_lock().
*/
-static void idpf_add_queue_stats(u64 **data, struct idpf_queue *q)
+static void idpf_add_queue_stats(u64 **data, const void *q,
+ enum virtchnl2_queue_type type)
{
+ const struct u64_stats_sync *stats_sync;
const struct idpf_stats *stats;
unsigned int start;
unsigned int size;
unsigned int i;

- if (q->q_type == VIRTCHNL2_QUEUE_TYPE_RX) {
+ if (type == VIRTCHNL2_QUEUE_TYPE_RX) {
size = IDPF_RX_QUEUE_STATS_LEN;
stats = idpf_gstrings_rx_queue_stats;
+ stats_sync = &((const struct idpf_rx_queue *)q)->stats_sync;
} else {
size = IDPF_TX_QUEUE_STATS_LEN;
stats = idpf_gstrings_tx_queue_stats;
+ stats_sync = &((const struct idpf_tx_queue *)q)->stats_sync;
}

/* To avoid invalid statistics values, ensure that we keep retrying
@@ -701,10 +708,10 @@ static void idpf_add_queue_stats(u64 **data, struct idpf_queue *q)
* u64_stats_fetch_retry.
*/
do {
- start = u64_stats_fetch_begin(&q->stats_sync);
+ start = u64_stats_fetch_begin(stats_sync);
for (i = 0; i < size; i++)
idpf_add_one_ethtool_stat(&(*data)[i], q, &stats[i]);
- } while (u64_stats_fetch_retry(&q->stats_sync, start));
+ } while (u64_stats_fetch_retry(stats_sync, start));

/* Once we successfully copy the stats in, update the data pointer */
*data += size;
@@ -793,7 +800,7 @@ static void idpf_collect_queue_stats(struct idpf_vport *vport)
for (j = 0; j < num_rxq; j++) {
u64 hw_csum_err, hsplit, hsplit_hbo, bad_descs;
struct idpf_rx_queue_stats *stats;
- struct idpf_queue *rxq;
+ struct idpf_rx_queue *rxq;
unsigned int start;

if (idpf_is_queue_model_split(vport->rxq_model))
@@ -807,7 +814,7 @@ static void idpf_collect_queue_stats(struct idpf_vport *vport)
do {
start = u64_stats_fetch_begin(&rxq->stats_sync);

- stats = &rxq->q_stats.rx;
+ stats = &rxq->q_stats;
hw_csum_err = u64_stats_read(&stats->hw_csum_err);
hsplit = u64_stats_read(&stats->hsplit_pkts);
hsplit_hbo = u64_stats_read(&stats->hsplit_buf_ovf);
@@ -828,7 +835,7 @@ static void idpf_collect_queue_stats(struct idpf_vport *vport)

for (j = 0; j < txq_grp->num_txq; j++) {
u64 linearize, qbusy, skb_drops, dma_map_errs;
- struct idpf_queue *txq = txq_grp->txqs[j];
+ struct idpf_tx_queue *txq = txq_grp->txqs[j];
struct idpf_tx_queue_stats *stats;
unsigned int start;

@@ -838,7 +845,7 @@ static void idpf_collect_queue_stats(struct idpf_vport *vport)
do {
start = u64_stats_fetch_begin(&txq->stats_sync);

- stats = &txq->q_stats.tx;
+ stats = &txq->q_stats;
linearize = u64_stats_read(&stats->linearize);
qbusy = u64_stats_read(&stats->q_busy);
skb_drops = u64_stats_read(&stats->skb_drops);
@@ -896,12 +903,12 @@ static void idpf_get_ethtool_stats(struct net_device *netdev,
qtype = VIRTCHNL2_QUEUE_TYPE_TX;

for (j = 0; j < txq_grp->num_txq; j++, total++) {
- struct idpf_queue *txq = txq_grp->txqs[j];
+ struct idpf_tx_queue *txq = txq_grp->txqs[j];

if (!txq)
idpf_add_empty_queue_stats(&data, qtype);
else
- idpf_add_queue_stats(&data, txq);
+ idpf_add_queue_stats(&data, txq, qtype);
}
}

@@ -929,7 +936,7 @@ static void idpf_get_ethtool_stats(struct net_device *netdev,
num_rxq = rxq_grp->singleq.num_rxq;

for (j = 0; j < num_rxq; j++, total++) {
- struct idpf_queue *rxq;
+ struct idpf_rx_queue *rxq;

if (is_splitq)
rxq = &rxq_grp->splitq.rxq_sets[j]->rxq;
@@ -938,7 +945,7 @@ static void idpf_get_ethtool_stats(struct net_device *netdev,
if (!rxq)
idpf_add_empty_queue_stats(&data, qtype);
else
- idpf_add_queue_stats(&data, rxq);
+ idpf_add_queue_stats(&data, rxq, qtype);

/* In splitq mode, don't get page pool stats here since
* the pools are attached to the buffer queues
@@ -953,7 +960,7 @@ static void idpf_get_ethtool_stats(struct net_device *netdev,

for (i = 0; i < vport->num_rxq_grp; i++) {
for (j = 0; j < vport->num_bufqs_per_qgrp; j++) {
- struct idpf_queue *rxbufq =
+ struct idpf_buf_queue *rxbufq =
&vport->rxq_grps[i].splitq.bufq_sets[j].bufq;

page_pool_get_stats(rxbufq->pp, &pp_stats);
@@ -971,60 +978,64 @@ static void idpf_get_ethtool_stats(struct net_device *netdev,
}

/**
- * idpf_find_rxq - find rxq from q index
+ * idpf_find_rxq_vec - find rxq vector from q index
* @vport: virtual port associated to queue
* @q_num: q index used to find queue
*
- * returns pointer to rx queue
+ * returns pointer to rx vector
*/
-static struct idpf_queue *idpf_find_rxq(struct idpf_vport *vport, int q_num)
+static struct idpf_q_vector *idpf_find_rxq_vec(const struct idpf_vport *vport,
+ int q_num)
{
int q_grp, q_idx;

if (!idpf_is_queue_model_split(vport->rxq_model))
- return vport->rxq_grps->singleq.rxqs[q_num];
+ return vport->rxq_grps->singleq.rxqs[q_num]->q_vector;

q_grp = q_num / IDPF_DFLT_SPLITQ_RXQ_PER_GROUP;
q_idx = q_num % IDPF_DFLT_SPLITQ_RXQ_PER_GROUP;

- return &vport->rxq_grps[q_grp].splitq.rxq_sets[q_idx]->rxq;
+ return vport->rxq_grps[q_grp].splitq.rxq_sets[q_idx]->rxq.q_vector;
}

/**
- * idpf_find_txq - find txq from q index
+ * idpf_find_txq_vec - find txq vector from q index
* @vport: virtual port associated to queue
* @q_num: q index used to find queue
*
- * returns pointer to tx queue
+ * returns pointer to tx vector
*/
-static struct idpf_queue *idpf_find_txq(struct idpf_vport *vport, int q_num)
+static struct idpf_q_vector *idpf_find_txq_vec(const struct idpf_vport *vport,
+ int q_num)
{
int q_grp;

if (!idpf_is_queue_model_split(vport->txq_model))
- return vport->txqs[q_num];
+ return vport->txqs[q_num]->q_vector;

q_grp = q_num / IDPF_DFLT_SPLITQ_TXQ_PER_GROUP;

- return vport->txq_grps[q_grp].complq;
+ return vport->txq_grps[q_grp].complq->q_vector;
}

/**
* __idpf_get_q_coalesce - get ITR values for specific queue
* @ec: ethtool structure to fill with driver's coalesce settings
- * @q: quuee of Rx or Tx
+ * @q_vector: queue vector corresponding to this queue
+ * @type: queue type
*/
static void __idpf_get_q_coalesce(struct ethtool_coalesce *ec,
- struct idpf_queue *q)
+ const struct idpf_q_vector *q_vector,
+ enum virtchnl2_queue_type type)
{
- if (q->q_type == VIRTCHNL2_QUEUE_TYPE_RX) {
+ if (type == VIRTCHNL2_QUEUE_TYPE_RX) {
ec->use_adaptive_rx_coalesce =
- IDPF_ITR_IS_DYNAMIC(q->q_vector->rx_intr_mode);
- ec->rx_coalesce_usecs = q->q_vector->rx_itr_value;
+ IDPF_ITR_IS_DYNAMIC(q_vector->rx_intr_mode);
+ ec->rx_coalesce_usecs = q_vector->rx_itr_value;
} else {
ec->use_adaptive_tx_coalesce =
- IDPF_ITR_IS_DYNAMIC(q->q_vector->tx_intr_mode);
- ec->tx_coalesce_usecs = q->q_vector->tx_itr_value;
+ IDPF_ITR_IS_DYNAMIC(q_vector->tx_intr_mode);
+ ec->tx_coalesce_usecs = q_vector->tx_itr_value;
}
}

@@ -1040,8 +1051,8 @@ static int idpf_get_q_coalesce(struct net_device *netdev,
struct ethtool_coalesce *ec,
u32 q_num)
{
- struct idpf_netdev_priv *np = netdev_priv(netdev);
- struct idpf_vport *vport;
+ const struct idpf_netdev_priv *np = netdev_priv(netdev);
+ const struct idpf_vport *vport;
int err = 0;

idpf_vport_ctrl_lock(netdev);
@@ -1056,10 +1067,12 @@ static int idpf_get_q_coalesce(struct net_device *netdev,
}

if (q_num < vport->num_rxq)
- __idpf_get_q_coalesce(ec, idpf_find_rxq(vport, q_num));
+ __idpf_get_q_coalesce(ec, idpf_find_rxq_vec(vport, q_num),
+ VIRTCHNL2_QUEUE_TYPE_RX);

if (q_num < vport->num_txq)
- __idpf_get_q_coalesce(ec, idpf_find_txq(vport, q_num));
+ __idpf_get_q_coalesce(ec, idpf_find_txq_vec(vport, q_num),
+ VIRTCHNL2_QUEUE_TYPE_TX);

unlock_mutex:
idpf_vport_ctrl_unlock(netdev);
@@ -1103,16 +1116,15 @@ static int idpf_get_per_q_coalesce(struct net_device *netdev, u32 q_num,
/**
* __idpf_set_q_coalesce - set ITR values for specific queue
* @ec: ethtool structure from user to update ITR settings
- * @q: queue for which itr values has to be set
+ * @qv: queue vector for which itr values has to be set
* @is_rxq: is queue type rx
*
* Returns 0 on success, negative otherwise.
*/
-static int __idpf_set_q_coalesce(struct ethtool_coalesce *ec,
- struct idpf_queue *q, bool is_rxq)
+static int __idpf_set_q_coalesce(const struct ethtool_coalesce *ec,
+ struct idpf_q_vector *qv, bool is_rxq)
{
u32 use_adaptive_coalesce, coalesce_usecs;
- struct idpf_q_vector *qv = q->q_vector;
bool is_dim_ena = false;
u16 itr_val;

@@ -1128,7 +1140,7 @@ static int __idpf_set_q_coalesce(struct ethtool_coalesce *ec,
itr_val = qv->tx_itr_value;
}
if (coalesce_usecs != itr_val && use_adaptive_coalesce) {
- netdev_err(q->vport->netdev, "Cannot set coalesce usecs if adaptive enabled\n");
+ netdev_err(qv->vport->netdev, "Cannot set coalesce usecs if adaptive enabled\n");

return -EINVAL;
}
@@ -1137,7 +1149,7 @@ static int __idpf_set_q_coalesce(struct ethtool_coalesce *ec,
return 0;

if (coalesce_usecs > IDPF_ITR_MAX) {
- netdev_err(q->vport->netdev,
+ netdev_err(qv->vport->netdev,
"Invalid value, %d-usecs range is 0-%d\n",
coalesce_usecs, IDPF_ITR_MAX);

@@ -1146,7 +1158,7 @@ static int __idpf_set_q_coalesce(struct ethtool_coalesce *ec,

if (coalesce_usecs % 2) {
coalesce_usecs--;
- netdev_info(q->vport->netdev,
+ netdev_info(qv->vport->netdev,
"HW only supports even ITR values, ITR rounded to %d\n",
coalesce_usecs);
}
@@ -1185,15 +1197,16 @@ static int __idpf_set_q_coalesce(struct ethtool_coalesce *ec,
*
* Return 0 on success, and negative on failure
*/
-static int idpf_set_q_coalesce(struct idpf_vport *vport,
- struct ethtool_coalesce *ec,
+static int idpf_set_q_coalesce(const struct idpf_vport *vport,
+ const struct ethtool_coalesce *ec,
int q_num, bool is_rxq)
{
- struct idpf_queue *q;
+ struct idpf_q_vector *qv;

- q = is_rxq ? idpf_find_rxq(vport, q_num) : idpf_find_txq(vport, q_num);
+ qv = is_rxq ? idpf_find_rxq_vec(vport, q_num) :
+ idpf_find_txq_vec(vport, q_num);

- if (q && __idpf_set_q_coalesce(ec, q, is_rxq))
+ if (qv && __idpf_set_q_coalesce(ec, qv, is_rxq))
return -EINVAL;

return 0;
diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c
index b0930303eab3..3e8b24430dd8 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_lib.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c
@@ -1319,14 +1319,14 @@ static void idpf_rx_init_buf_tail(struct idpf_vport *vport)

if (idpf_is_queue_model_split(vport->rxq_model)) {
for (j = 0; j < vport->num_bufqs_per_qgrp; j++) {
- struct idpf_queue *q =
+ const struct idpf_buf_queue *q =
&grp->splitq.bufq_sets[j].bufq;

writel(q->next_to_alloc, q->tail);
}
} else {
for (j = 0; j < grp->singleq.num_rxq; j++) {
- struct idpf_queue *q =
+ const struct idpf_rx_queue *q =
grp->singleq.rxqs[j];

writel(q->next_to_alloc, q->tail);
@@ -1856,7 +1856,7 @@ int idpf_initiate_soft_reset(struct idpf_vport *vport,
enum idpf_vport_state current_state = np->state;
struct idpf_adapter *adapter = vport->adapter;
struct idpf_vport *new_vport;
- int err, i;
+ int err;

/* If the system is low on memory, we can end up in bad state if we
* free all the memory for queue resources and try to allocate them
@@ -1930,46 +1930,6 @@ int idpf_initiate_soft_reset(struct idpf_vport *vport,
*/
memcpy(vport, new_vport, offsetof(struct idpf_vport, link_speed_mbps));

- /* Since idpf_vport_queues_alloc was called with new_port, the queue
- * back pointers are currently pointing to the local new_vport. Reset
- * the backpointers to the original vport here
- */
- for (i = 0; i < vport->num_txq_grp; i++) {
- struct idpf_txq_group *tx_qgrp = &vport->txq_grps[i];
- int j;
-
- tx_qgrp->vport = vport;
- for (j = 0; j < tx_qgrp->num_txq; j++)
- tx_qgrp->txqs[j]->vport = vport;
-
- if (idpf_is_queue_model_split(vport->txq_model))
- tx_qgrp->complq->vport = vport;
- }
-
- for (i = 0; i < vport->num_rxq_grp; i++) {
- struct idpf_rxq_group *rx_qgrp = &vport->rxq_grps[i];
- struct idpf_queue *q;
- u16 num_rxq;
- int j;
-
- rx_qgrp->vport = vport;
- for (j = 0; j < vport->num_bufqs_per_qgrp; j++)
- rx_qgrp->splitq.bufq_sets[j].bufq.vport = vport;
-
- if (idpf_is_queue_model_split(vport->rxq_model))
- num_rxq = rx_qgrp->splitq.num_rxq_sets;
- else
- num_rxq = rx_qgrp->singleq.num_rxq;
-
- for (j = 0; j < num_rxq; j++) {
- if (idpf_is_queue_model_split(vport->rxq_model))
- q = &rx_qgrp->splitq.rxq_sets[j]->rxq;
- else
- q = rx_qgrp->singleq.rxqs[j];
- q->vport = vport;
- }
- }
-
if (reset_cause == IDPF_SR_Q_CHANGE)
idpf_vport_alloc_vec_indexes(vport);

diff --git a/drivers/net/ethernet/intel/idpf/idpf_singleq_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_singleq_txrx.c
index b17d88e15000..b51f7cd6db01 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_singleq_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_singleq_txrx.c
@@ -186,7 +186,7 @@ static int idpf_tx_singleq_csum(struct sk_buff *skb,
* and gets a physical address for each memory location and programs
* it and the length into the transmit base mode descriptor.
*/
-static void idpf_tx_singleq_map(struct idpf_queue *tx_q,
+static void idpf_tx_singleq_map(struct idpf_tx_queue *tx_q,
struct idpf_tx_buf *first,
struct idpf_tx_offload_params *offloads)
{
@@ -210,7 +210,7 @@ static void idpf_tx_singleq_map(struct idpf_queue *tx_q,
dma = dma_map_single(tx_q->dev, skb->data, size, DMA_TO_DEVICE);

/* write each descriptor with CRC bit */
- if (tx_q->vport->crc_enable)
+ if (idpf_queue_has(CRC_EN, tx_q))
td_cmd |= IDPF_TX_DESC_CMD_ICRC;

for (frag = &skb_shinfo(skb)->frags[0];; frag++) {
@@ -285,7 +285,7 @@ static void idpf_tx_singleq_map(struct idpf_queue *tx_q,
/* set next_to_watch value indicating a packet is present */
first->next_to_watch = tx_desc;

- nq = netdev_get_tx_queue(tx_q->vport->netdev, tx_q->idx);
+ nq = netdev_get_tx_queue(tx_q->netdev, tx_q->idx);
netdev_tx_sent_queue(nq, first->bytecount);

idpf_tx_buf_hw_update(tx_q, i, netdev_xmit_more());
@@ -299,7 +299,7 @@ static void idpf_tx_singleq_map(struct idpf_queue *tx_q,
* ring entry to reflect that this index is a context descriptor
*/
static struct idpf_base_tx_ctx_desc *
-idpf_tx_singleq_get_ctx_desc(struct idpf_queue *txq)
+idpf_tx_singleq_get_ctx_desc(struct idpf_tx_queue *txq)
{
struct idpf_base_tx_ctx_desc *ctx_desc;
int ntu = txq->next_to_use;
@@ -320,7 +320,7 @@ idpf_tx_singleq_get_ctx_desc(struct idpf_queue *txq)
* @txq: queue to send buffer on
* @offload: offload parameter structure
**/
-static void idpf_tx_singleq_build_ctx_desc(struct idpf_queue *txq,
+static void idpf_tx_singleq_build_ctx_desc(struct idpf_tx_queue *txq,
struct idpf_tx_offload_params *offload)
{
struct idpf_base_tx_ctx_desc *desc = idpf_tx_singleq_get_ctx_desc(txq);
@@ -333,7 +333,7 @@ static void idpf_tx_singleq_build_ctx_desc(struct idpf_queue *txq,
qw1 |= FIELD_PREP(IDPF_TXD_CTX_QW1_MSS_M, offload->mss);

u64_stats_update_begin(&txq->stats_sync);
- u64_stats_inc(&txq->q_stats.tx.lso_pkts);
+ u64_stats_inc(&txq->q_stats.lso_pkts);
u64_stats_update_end(&txq->stats_sync);
}

@@ -352,7 +352,7 @@ static void idpf_tx_singleq_build_ctx_desc(struct idpf_queue *txq,
* Returns NETDEV_TX_OK if sent, else an error code
*/
static netdev_tx_t idpf_tx_singleq_frame(struct sk_buff *skb,
- struct idpf_queue *tx_q)
+ struct idpf_tx_queue *tx_q)
{
struct idpf_tx_offload_params offload = { };
struct idpf_tx_buf *first;
@@ -419,7 +419,7 @@ netdev_tx_t idpf_tx_singleq_start(struct sk_buff *skb,
struct net_device *netdev)
{
struct idpf_vport *vport = idpf_netdev_to_vport(netdev);
- struct idpf_queue *tx_q;
+ struct idpf_tx_queue *tx_q;

tx_q = vport->txqs[skb_get_queue_mapping(skb)];

@@ -442,16 +442,15 @@ netdev_tx_t idpf_tx_singleq_start(struct sk_buff *skb,
* @cleaned: returns number of packets cleaned
*
*/
-static bool idpf_tx_singleq_clean(struct idpf_queue *tx_q, int napi_budget,
+static bool idpf_tx_singleq_clean(struct idpf_tx_queue *tx_q, int napi_budget,
int *cleaned)
{
- unsigned int budget = tx_q->vport->compln_clean_budget;
unsigned int total_bytes = 0, total_pkts = 0;
struct idpf_base_tx_desc *tx_desc;
+ u32 budget = tx_q->clean_budget;
s16 ntc = tx_q->next_to_clean;
struct idpf_netdev_priv *np;
struct idpf_tx_buf *tx_buf;
- struct idpf_vport *vport;
struct netdev_queue *nq;
bool dont_wake;

@@ -550,16 +549,15 @@ static bool idpf_tx_singleq_clean(struct idpf_queue *tx_q, int napi_budget,
*cleaned += total_pkts;

u64_stats_update_begin(&tx_q->stats_sync);
- u64_stats_add(&tx_q->q_stats.tx.packets, total_pkts);
- u64_stats_add(&tx_q->q_stats.tx.bytes, total_bytes);
+ u64_stats_add(&tx_q->q_stats.packets, total_pkts);
+ u64_stats_add(&tx_q->q_stats.bytes, total_bytes);
u64_stats_update_end(&tx_q->stats_sync);

- vport = tx_q->vport;
- np = netdev_priv(vport->netdev);
- nq = netdev_get_tx_queue(vport->netdev, tx_q->idx);
+ np = netdev_priv(tx_q->netdev);
+ nq = netdev_get_tx_queue(tx_q->netdev, tx_q->idx);

dont_wake = np->state != __IDPF_VPORT_UP ||
- !netif_carrier_ok(vport->netdev);
+ !netif_carrier_ok(tx_q->netdev);
__netif_txq_completed_wake(nq, total_pkts, total_bytes,
IDPF_DESC_UNUSED(tx_q), IDPF_TX_WAKE_THRESH,
dont_wake);
@@ -584,7 +582,7 @@ static bool idpf_tx_singleq_clean_all(struct idpf_q_vector *q_vec, int budget,

budget_per_q = num_txq ? max(budget / num_txq, 1) : 0;
for (i = 0; i < num_txq; i++) {
- struct idpf_queue *q;
+ struct idpf_tx_queue *q;

q = q_vec->tx[i];
clean_complete &= idpf_tx_singleq_clean(q, budget_per_q,
@@ -605,8 +603,9 @@ static bool idpf_tx_singleq_clean_all(struct idpf_q_vector *q_vec, int budget,
* The status_error_ptype_len doesn't need to be shifted because it begins
* at offset zero.
*/
-static bool idpf_rx_singleq_test_staterr(const union virtchnl2_rx_desc *rx_desc,
- const u64 stat_err_bits)
+static bool
+idpf_rx_singleq_test_staterr(const union virtchnl2_rx_desc *rx_desc,
+ const u64 stat_err_bits)
{
return !!(rx_desc->base_wb.qword1.status_error_ptype_len &
cpu_to_le64(stat_err_bits));
@@ -614,14 +613,9 @@ static bool idpf_rx_singleq_test_staterr(const union virtchnl2_rx_desc *rx_desc,

/**
* idpf_rx_singleq_is_non_eop - process handling of non-EOP buffers
- * @rxq: Rx ring being processed
* @rx_desc: Rx descriptor for current buffer
- * @skb: Current socket buffer containing buffer in progress
- * @ntc: next to clean
*/
-static bool idpf_rx_singleq_is_non_eop(struct idpf_queue *rxq,
- union virtchnl2_rx_desc *rx_desc,
- struct sk_buff *skb, u16 ntc)
+static bool idpf_rx_singleq_is_non_eop(const union virtchnl2_rx_desc *rx_desc)
{
/* if we are the last buffer then there is nothing else to do */
if (likely(idpf_rx_singleq_test_staterr(rx_desc, IDPF_RXD_EOF_SINGLEQ)))
@@ -639,7 +633,7 @@ static bool idpf_rx_singleq_is_non_eop(struct idpf_queue *rxq,
*
* skb->protocol must be set before this function is called
*/
-static void idpf_rx_singleq_csum(struct idpf_queue *rxq, struct sk_buff *skb,
+static void idpf_rx_singleq_csum(struct idpf_rx_queue *rxq, struct sk_buff *skb,
struct idpf_rx_csum_decoded *csum_bits,
u16 ptype)
{
@@ -647,14 +641,14 @@ static void idpf_rx_singleq_csum(struct idpf_queue *rxq, struct sk_buff *skb,
bool ipv4, ipv6;

/* check if Rx checksum is enabled */
- if (unlikely(!(rxq->vport->netdev->features & NETIF_F_RXCSUM)))
+ if (unlikely(!(rxq->netdev->features & NETIF_F_RXCSUM)))
return;

/* check if HW has decoded the packet and checksum */
if (unlikely(!(csum_bits->l3l4p)))
return;

- decoded = rxq->vport->rx_ptype_lkup[ptype];
+ decoded = rxq->rx_ptype_lkup[ptype];
if (unlikely(!(decoded.known && decoded.outer_ip)))
return;

@@ -707,7 +701,7 @@ static void idpf_rx_singleq_csum(struct idpf_queue *rxq, struct sk_buff *skb,

checksum_fail:
u64_stats_update_begin(&rxq->stats_sync);
- u64_stats_inc(&rxq->q_stats.rx.hw_csum_err);
+ u64_stats_inc(&rxq->q_stats.hw_csum_err);
u64_stats_update_end(&rxq->stats_sync);
}

@@ -721,9 +715,9 @@ static void idpf_rx_singleq_csum(struct idpf_queue *rxq, struct sk_buff *skb,
* This function only operates on the VIRTCHNL2_RXDID_1_32B_BASE_M base 32byte
* descriptor writeback format.
**/
-static void idpf_rx_singleq_base_csum(struct idpf_queue *rx_q,
+static void idpf_rx_singleq_base_csum(struct idpf_rx_queue *rx_q,
struct sk_buff *skb,
- union virtchnl2_rx_desc *rx_desc,
+ const union virtchnl2_rx_desc *rx_desc,
u16 ptype)
{
struct idpf_rx_csum_decoded csum_bits;
@@ -761,9 +755,9 @@ static void idpf_rx_singleq_base_csum(struct idpf_queue *rx_q,
* This function only operates on the VIRTCHNL2_RXDID_2_FLEX_SQ_NIC flexible
* descriptor writeback format.
**/
-static void idpf_rx_singleq_flex_csum(struct idpf_queue *rx_q,
+static void idpf_rx_singleq_flex_csum(struct idpf_rx_queue *rx_q,
struct sk_buff *skb,
- union virtchnl2_rx_desc *rx_desc,
+ const union virtchnl2_rx_desc *rx_desc,
u16 ptype)
{
struct idpf_rx_csum_decoded csum_bits;
@@ -801,14 +795,14 @@ static void idpf_rx_singleq_flex_csum(struct idpf_queue *rx_q,
* This function only operates on the VIRTCHNL2_RXDID_1_32B_BASE_M base 32byte
* descriptor writeback format.
**/
-static void idpf_rx_singleq_base_hash(struct idpf_queue *rx_q,
+static void idpf_rx_singleq_base_hash(struct idpf_rx_queue *rx_q,
struct sk_buff *skb,
- union virtchnl2_rx_desc *rx_desc,
+ const union virtchnl2_rx_desc *rx_desc,
struct idpf_rx_ptype_decoded *decoded)
{
u64 mask, qw1;

- if (unlikely(!(rx_q->vport->netdev->features & NETIF_F_RXHASH)))
+ if (unlikely(!(rx_q->netdev->features & NETIF_F_RXHASH)))
return;

mask = VIRTCHNL2_RX_BASE_DESC_FLTSTAT_RSS_HASH_M;
@@ -831,12 +825,12 @@ static void idpf_rx_singleq_base_hash(struct idpf_queue *rx_q,
* This function only operates on the VIRTCHNL2_RXDID_2_FLEX_SQ_NIC flexible
* descriptor writeback format.
**/
-static void idpf_rx_singleq_flex_hash(struct idpf_queue *rx_q,
+static void idpf_rx_singleq_flex_hash(struct idpf_rx_queue *rx_q,
struct sk_buff *skb,
- union virtchnl2_rx_desc *rx_desc,
+ const union virtchnl2_rx_desc *rx_desc,
struct idpf_rx_ptype_decoded *decoded)
{
- if (unlikely(!(rx_q->vport->netdev->features & NETIF_F_RXHASH)))
+ if (unlikely(!(rx_q->netdev->features & NETIF_F_RXHASH)))
return;

if (FIELD_GET(VIRTCHNL2_RX_FLEX_DESC_STATUS0_RSS_VALID_M,
@@ -857,16 +851,16 @@ static void idpf_rx_singleq_flex_hash(struct idpf_queue *rx_q,
* order to populate the hash, checksum, VLAN, protocol, and
* other fields within the skb.
*/
-static void idpf_rx_singleq_process_skb_fields(struct idpf_queue *rx_q,
- struct sk_buff *skb,
- union virtchnl2_rx_desc *rx_desc,
- u16 ptype)
+static void
+idpf_rx_singleq_process_skb_fields(struct idpf_rx_queue *rx_q,
+ struct sk_buff *skb,
+ const union virtchnl2_rx_desc *rx_desc,
+ u16 ptype)
{
- struct idpf_rx_ptype_decoded decoded =
- rx_q->vport->rx_ptype_lkup[ptype];
+ struct idpf_rx_ptype_decoded decoded = rx_q->rx_ptype_lkup[ptype];

/* modifies the skb - consumes the enet header */
- skb->protocol = eth_type_trans(skb, rx_q->vport->netdev);
+ skb->protocol = eth_type_trans(skb, rx_q->netdev);

/* Check if we're using base mode descriptor IDs */
if (rx_q->rxdids == VIRTCHNL2_RXDID_1_32B_BASE_M) {
@@ -878,6 +872,22 @@ static void idpf_rx_singleq_process_skb_fields(struct idpf_queue *rx_q,
}
}

+/**
+ * idpf_rx_buf_hw_update - Store the new tail and head values
+ * @rxq: queue to bump
+ * @val: new head index
+ */
+static void idpf_rx_buf_hw_update(struct idpf_rx_queue *rxq, u32 val)
+{
+ rxq->next_to_use = val;
+
+ if (unlikely(!rxq->tail))
+ return;
+
+ /* writel has an implicit memory barrier */
+ writel(val, rxq->tail);
+}
+
/**
* idpf_rx_singleq_buf_hw_alloc_all - Replace used receive buffers
* @rx_q: queue for which the hw buffers are allocated
@@ -885,7 +895,7 @@ static void idpf_rx_singleq_process_skb_fields(struct idpf_queue *rx_q,
*
* Returns false if all allocations were successful, true if any fail
*/
-bool idpf_rx_singleq_buf_hw_alloc_all(struct idpf_queue *rx_q,
+bool idpf_rx_singleq_buf_hw_alloc_all(struct idpf_rx_queue *rx_q,
u16 cleaned_count)
{
struct virtchnl2_singleq_rx_buf_desc *desc;
@@ -896,7 +906,7 @@ bool idpf_rx_singleq_buf_hw_alloc_all(struct idpf_queue *rx_q,
return false;

desc = &rx_q->single_buf[nta];
- buf = &rx_q->rx_buf.buf[nta];
+ buf = &rx_q->rx_buf[nta];

do {
dma_addr_t addr;
@@ -916,7 +926,7 @@ bool idpf_rx_singleq_buf_hw_alloc_all(struct idpf_queue *rx_q,
nta++;
if (unlikely(nta == rx_q->desc_count)) {
desc = &rx_q->single_buf[0];
- buf = rx_q->rx_buf.buf;
+ buf = rx_q->rx_buf;
nta = 0;
}

@@ -933,7 +943,6 @@ bool idpf_rx_singleq_buf_hw_alloc_all(struct idpf_queue *rx_q,

/**
* idpf_rx_singleq_extract_base_fields - Extract fields from the Rx descriptor
- * @rx_q: Rx descriptor queue
* @rx_desc: the descriptor to process
* @fields: storage for extracted values
*
@@ -943,9 +952,9 @@ bool idpf_rx_singleq_buf_hw_alloc_all(struct idpf_queue *rx_q,
* This function only operates on the VIRTCHNL2_RXDID_1_32B_BASE_M base 32byte
* descriptor writeback format.
*/
-static void idpf_rx_singleq_extract_base_fields(struct idpf_queue *rx_q,
- union virtchnl2_rx_desc *rx_desc,
- struct idpf_rx_extracted *fields)
+static void
+idpf_rx_singleq_extract_base_fields(const union virtchnl2_rx_desc *rx_desc,
+ struct idpf_rx_extracted *fields)
{
u64 qword;

@@ -957,7 +966,6 @@ static void idpf_rx_singleq_extract_base_fields(struct idpf_queue *rx_q,

/**
* idpf_rx_singleq_extract_flex_fields - Extract fields from the Rx descriptor
- * @rx_q: Rx descriptor queue
* @rx_desc: the descriptor to process
* @fields: storage for extracted values
*
@@ -967,9 +975,9 @@ static void idpf_rx_singleq_extract_base_fields(struct idpf_queue *rx_q,
* This function only operates on the VIRTCHNL2_RXDID_2_FLEX_SQ_NIC flexible
* descriptor writeback format.
*/
-static void idpf_rx_singleq_extract_flex_fields(struct idpf_queue *rx_q,
- union virtchnl2_rx_desc *rx_desc,
- struct idpf_rx_extracted *fields)
+static void
+idpf_rx_singleq_extract_flex_fields(const union virtchnl2_rx_desc *rx_desc,
+ struct idpf_rx_extracted *fields)
{
fields->size = FIELD_GET(VIRTCHNL2_RX_FLEX_DESC_PKT_LEN_M,
le16_to_cpu(rx_desc->flex_nic_wb.pkt_len));
@@ -984,14 +992,15 @@ static void idpf_rx_singleq_extract_flex_fields(struct idpf_queue *rx_q,
* @fields: storage for extracted values
*
*/
-static void idpf_rx_singleq_extract_fields(struct idpf_queue *rx_q,
- union virtchnl2_rx_desc *rx_desc,
- struct idpf_rx_extracted *fields)
+static void
+idpf_rx_singleq_extract_fields(const struct idpf_rx_queue *rx_q,
+ const union virtchnl2_rx_desc *rx_desc,
+ struct idpf_rx_extracted *fields)
{
if (rx_q->rxdids == VIRTCHNL2_RXDID_1_32B_BASE_M)
- idpf_rx_singleq_extract_base_fields(rx_q, rx_desc, fields);
+ idpf_rx_singleq_extract_base_fields(rx_desc, fields);
else
- idpf_rx_singleq_extract_flex_fields(rx_q, rx_desc, fields);
+ idpf_rx_singleq_extract_flex_fields(rx_desc, fields);
}

/**
@@ -1001,7 +1010,7 @@ static void idpf_rx_singleq_extract_fields(struct idpf_queue *rx_q,
*
* Returns true if there's any budget left (e.g. the clean is finished)
*/
-static int idpf_rx_singleq_clean(struct idpf_queue *rx_q, int budget)
+static int idpf_rx_singleq_clean(struct idpf_rx_queue *rx_q, int budget)
{
unsigned int total_rx_bytes = 0, total_rx_pkts = 0;
struct sk_buff *skb = rx_q->skb;
@@ -1036,7 +1045,7 @@ static int idpf_rx_singleq_clean(struct idpf_queue *rx_q, int budget)

idpf_rx_singleq_extract_fields(rx_q, rx_desc, &fields);

- rx_buf = &rx_q->rx_buf.buf[ntc];
+ rx_buf = &rx_q->rx_buf[ntc];
if (!fields.size) {
idpf_rx_put_page(rx_buf);
goto skip_data;
@@ -1058,7 +1067,7 @@ static int idpf_rx_singleq_clean(struct idpf_queue *rx_q, int budget)
cleaned_count++;

/* skip if it is non EOP desc */
- if (idpf_rx_singleq_is_non_eop(rx_q, rx_desc, skb, ntc))
+ if (idpf_rx_singleq_is_non_eop(rx_desc))
continue;

#define IDPF_RXD_ERR_S FIELD_PREP(VIRTCHNL2_RX_BASE_DESC_QW1_ERROR_M, \
@@ -1084,7 +1093,7 @@ static int idpf_rx_singleq_clean(struct idpf_queue *rx_q, int budget)
rx_desc, fields.rx_ptype);

/* send completed skb up the stack */
- napi_gro_receive(&rx_q->q_vector->napi, skb);
+ napi_gro_receive(rx_q->pp->p.napi, skb);
skb = NULL;

/* update budget accounting */
@@ -1099,8 +1108,8 @@ static int idpf_rx_singleq_clean(struct idpf_queue *rx_q, int budget)
failure = idpf_rx_singleq_buf_hw_alloc_all(rx_q, cleaned_count);

u64_stats_update_begin(&rx_q->stats_sync);
- u64_stats_add(&rx_q->q_stats.rx.packets, total_rx_pkts);
- u64_stats_add(&rx_q->q_stats.rx.bytes, total_rx_bytes);
+ u64_stats_add(&rx_q->q_stats.packets, total_rx_pkts);
+ u64_stats_add(&rx_q->q_stats.bytes, total_rx_bytes);
u64_stats_update_end(&rx_q->stats_sync);

/* guarantee a trip back through this routine if there was a failure */
@@ -1127,7 +1136,7 @@ static bool idpf_rx_singleq_clean_all(struct idpf_q_vector *q_vec, int budget,
*/
budget_per_q = num_rxq ? max(budget / num_rxq, 1) : 0;
for (i = 0; i < num_rxq; i++) {
- struct idpf_queue *rxq = q_vec->rx[i];
+ struct idpf_rx_queue *rxq = q_vec->rx[i];
int pkts_cleaned_per_q;

pkts_cleaned_per_q = idpf_rx_singleq_clean(rxq, budget_per_q);
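
The idpf_queue_set()/_clear()/_change()/_assign()/_has() helpers used all over
the idpf_txrx.c changes below replace the open-coded {set,clear,change,test}_bit()
calls on the queue's flags field. A minimal sketch of the idea, assuming the flag
enum is spelled __IDPF_Q_<FLAG> so the short name can be token-pasted; the real
definitions belong to the header changes of the series:

	#define idpf_queue_set(f, q)		__set_bit(__IDPF_Q_##f, (q)->flags)
	#define idpf_queue_clear(f, q)		__clear_bit(__IDPF_Q_##f, (q)->flags)
	#define idpf_queue_change(f, q)		__change_bit(__IDPF_Q_##f, (q)->flags)
	#define idpf_queue_has(f, q)		test_bit(__IDPF_Q_##f, (q)->flags)
	#define idpf_queue_assign(f, q, v)	__assign_bit(__IDPF_Q_##f, (q)->flags, v)

E.g. idpf_queue_has(GEN_CHK, complq) reads the same bit the removed
test_bit(__IDPF_Q_GEN_CHK, complq->flags) did.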
diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
index 01301e640fba..7be5a723f558 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
@@ -60,7 +60,8 @@ void idpf_tx_timeout(struct net_device *netdev, unsigned int txqueue)
* @tx_q: the queue that owns the buffer
* @tx_buf: the buffer to free
*/
-static void idpf_tx_buf_rel(struct idpf_queue *tx_q, struct idpf_tx_buf *tx_buf)
+static void idpf_tx_buf_rel(struct idpf_tx_queue *tx_q,
+ struct idpf_tx_buf *tx_buf)
{
if (tx_buf->skb) {
if (dma_unmap_len(tx_buf, len))
@@ -86,8 +87,9 @@ static void idpf_tx_buf_rel(struct idpf_queue *tx_q, struct idpf_tx_buf *tx_buf)
* idpf_tx_buf_rel_all - Free any empty Tx buffers
* @txq: queue to be cleaned
*/
-static void idpf_tx_buf_rel_all(struct idpf_queue *txq)
+static void idpf_tx_buf_rel_all(struct idpf_tx_queue *txq)
{
+ struct idpf_buf_lifo *buf_stack;
u16 i;

/* Buffers already cleared, nothing to do */
@@ -101,38 +103,57 @@ static void idpf_tx_buf_rel_all(struct idpf_queue *txq)
kfree(txq->tx_buf);
txq->tx_buf = NULL;

- if (!txq->buf_stack.bufs)
+ if (!idpf_queue_has(FLOW_SCH_EN, txq))
return;

- for (i = 0; i < txq->buf_stack.size; i++)
- kfree(txq->buf_stack.bufs[i]);
+ buf_stack = &txq->stash->buf_stack;
+ if (!buf_stack->bufs)
+ return;
+
+ for (i = 0; i < buf_stack->size; i++)
+ kfree(buf_stack->bufs[i]);

- kfree(txq->buf_stack.bufs);
- txq->buf_stack.bufs = NULL;
+ kfree(buf_stack->bufs);
+ buf_stack->bufs = NULL;
}

/**
* idpf_tx_desc_rel - Free Tx resources per queue
* @txq: Tx descriptor ring for a specific queue
- * @bufq: buffer q or completion q
*
* Free all transmit software resources
*/
-static void idpf_tx_desc_rel(struct idpf_queue *txq, bool bufq)
+static void idpf_tx_desc_rel(struct idpf_tx_queue *txq)
{
- if (bufq)
- idpf_tx_buf_rel_all(txq);
+ idpf_tx_buf_rel_all(txq);

if (!txq->desc_ring)
return;

dmam_free_coherent(txq->dev, txq->size, txq->desc_ring, txq->dma);
txq->desc_ring = NULL;
- txq->next_to_alloc = 0;
txq->next_to_use = 0;
txq->next_to_clean = 0;
}

+/**
+ * idpf_compl_desc_rel - Free completion resources per queue
+ * @complq: completion queue
+ *
+ * Free all completion software resources.
+ */
+static void idpf_compl_desc_rel(struct idpf_compl_queue *complq)
+{
+ if (!complq->comp)
+ return;
+
+ dma_free_coherent(complq->netdev->dev.parent, complq->size,
+ complq->comp, complq->dma);
+ complq->comp = NULL;
+ complq->next_to_use = 0;
+ complq->next_to_clean = 0;
+}
+
/**
* idpf_tx_desc_rel_all - Free Tx Resources for All Queues
* @vport: virtual port structure
@@ -150,10 +171,10 @@ static void idpf_tx_desc_rel_all(struct idpf_vport *vport)
struct idpf_txq_group *txq_grp = &vport->txq_grps[i];

for (j = 0; j < txq_grp->num_txq; j++)
- idpf_tx_desc_rel(txq_grp->txqs[j], true);
+ idpf_tx_desc_rel(txq_grp->txqs[j]);

if (idpf_is_queue_model_split(vport->txq_model))
- idpf_tx_desc_rel(txq_grp->complq, false);
+ idpf_compl_desc_rel(txq_grp->complq);
}
}

@@ -163,8 +184,9 @@ static void idpf_tx_desc_rel_all(struct idpf_vport *vport)
*
* Returns 0 on success, negative on failure
*/
-static int idpf_tx_buf_alloc_all(struct idpf_queue *tx_q)
+static int idpf_tx_buf_alloc_all(struct idpf_tx_queue *tx_q)
{
+ struct idpf_buf_lifo *buf_stack;
int buf_size;
int i;

@@ -180,22 +202,26 @@ static int idpf_tx_buf_alloc_all(struct idpf_queue *tx_q)
for (i = 0; i < tx_q->desc_count; i++)
tx_q->tx_buf[i].compl_tag = IDPF_SPLITQ_TX_INVAL_COMPL_TAG;

+ if (!idpf_queue_has(FLOW_SCH_EN, tx_q))
+ return 0;
+
+ buf_stack = &tx_q->stash->buf_stack;
+
/* Initialize tx buf stack for out-of-order completions if
* flow scheduling offload is enabled
*/
- tx_q->buf_stack.bufs =
- kcalloc(tx_q->desc_count, sizeof(struct idpf_tx_stash *),
- GFP_KERNEL);
- if (!tx_q->buf_stack.bufs)
+ buf_stack->bufs = kcalloc(tx_q->desc_count, sizeof(*buf_stack->bufs),
+ GFP_KERNEL);
+ if (!buf_stack->bufs)
return -ENOMEM;

- tx_q->buf_stack.size = tx_q->desc_count;
- tx_q->buf_stack.top = tx_q->desc_count;
+ buf_stack->size = tx_q->desc_count;
+ buf_stack->top = tx_q->desc_count;

for (i = 0; i < tx_q->desc_count; i++) {
- tx_q->buf_stack.bufs[i] = kzalloc(sizeof(*tx_q->buf_stack.bufs[i]),
- GFP_KERNEL);
- if (!tx_q->buf_stack.bufs[i])
+ buf_stack->bufs[i] = kzalloc(sizeof(*buf_stack->bufs[i]),
+ GFP_KERNEL);
+ if (!buf_stack->bufs[i])
return -ENOMEM;
}

@@ -204,28 +230,22 @@ static int idpf_tx_buf_alloc_all(struct idpf_queue *tx_q)

/**
* idpf_tx_desc_alloc - Allocate the Tx descriptors
+ * @vport: vport to allocate resources for
* @tx_q: the tx ring to set up
- * @bufq: buffer or completion queue
*
* Returns 0 on success, negative on failure
*/
-static int idpf_tx_desc_alloc(struct idpf_queue *tx_q, bool bufq)
+static int idpf_tx_desc_alloc(const struct idpf_vport *vport,
+ struct idpf_tx_queue *tx_q)
{
struct device *dev = tx_q->dev;
- u32 desc_sz;
int err;

- if (bufq) {
- err = idpf_tx_buf_alloc_all(tx_q);
- if (err)
- goto err_alloc;
-
- desc_sz = sizeof(struct idpf_base_tx_desc);
- } else {
- desc_sz = sizeof(struct idpf_splitq_tx_compl_desc);
- }
+ err = idpf_tx_buf_alloc_all(tx_q);
+ if (err)
+ goto err_alloc;

- tx_q->size = tx_q->desc_count * desc_sz;
+ tx_q->size = tx_q->desc_count * sizeof(*tx_q->base_tx);

/* Allocate descriptors also round up to nearest 4K */
tx_q->size = ALIGN(tx_q->size, 4096);
@@ -238,19 +258,43 @@ static int idpf_tx_desc_alloc(struct idpf_queue *tx_q, bool bufq)
goto err_alloc;
}

- tx_q->next_to_alloc = 0;
tx_q->next_to_use = 0;
tx_q->next_to_clean = 0;
- set_bit(__IDPF_Q_GEN_CHK, tx_q->flags);
+ idpf_queue_set(GEN_CHK, tx_q);

return 0;

err_alloc:
- idpf_tx_desc_rel(tx_q, bufq);
+ idpf_tx_desc_rel(tx_q);

return err;
}

+/**
+ * idpf_compl_desc_alloc - allocate completion descriptors
+ * @vport: vport to allocate resources for
+ * @complq: completion queue to set up
+ *
+ * Return: 0 on success, -errno on failure.
+ */
+static int idpf_compl_desc_alloc(const struct idpf_vport *vport,
+ struct idpf_compl_queue *complq)
+{
+ complq->size = array_size(complq->desc_count, sizeof(*complq->comp));
+
+ complq->comp = dma_alloc_coherent(complq->netdev->dev.parent,
+ complq->size, &complq->dma,
+ GFP_KERNEL);
+ if (!complq->comp)
+ return -ENOMEM;
+
+ complq->next_to_use = 0;
+ complq->next_to_clean = 0;
+ idpf_queue_set(GEN_CHK, complq);
+
+ return 0;
+}
+
/**
* idpf_tx_desc_alloc_all - allocate all queues Tx resources
* @vport: virtual port private structure
@@ -259,7 +303,6 @@ static int idpf_tx_desc_alloc(struct idpf_queue *tx_q, bool bufq)
*/
static int idpf_tx_desc_alloc_all(struct idpf_vport *vport)
{
- struct device *dev = &vport->adapter->pdev->dev;
int err = 0;
int i, j;

@@ -268,13 +311,14 @@ static int idpf_tx_desc_alloc_all(struct idpf_vport *vport)
*/
for (i = 0; i < vport->num_txq_grp; i++) {
for (j = 0; j < vport->txq_grps[i].num_txq; j++) {
- struct idpf_queue *txq = vport->txq_grps[i].txqs[j];
+ struct idpf_tx_queue *txq = vport->txq_grps[i].txqs[j];
u8 gen_bits = 0;
u16 bufidx_mask;

- err = idpf_tx_desc_alloc(txq, true);
+ err = idpf_tx_desc_alloc(vport, txq);
if (err) {
- dev_err(dev, "Allocation for Tx Queue %u failed\n",
+ pci_err(vport->adapter->pdev,
+ "Allocation for Tx Queue %u failed\n",
i);
goto err_out;
}
@@ -312,9 +356,10 @@ static int idpf_tx_desc_alloc_all(struct idpf_vport *vport)
continue;

/* Setup completion queues */
- err = idpf_tx_desc_alloc(vport->txq_grps[i].complq, false);
+ err = idpf_compl_desc_alloc(vport, vport->txq_grps[i].complq);
if (err) {
- dev_err(dev, "Allocation for Tx Completion Queue %u failed\n",
+ pci_err(vport->adapter->pdev,
+ "Allocation for Tx Completion Queue %u failed\n",
i);
goto err_out;
}
@@ -329,15 +374,14 @@ static int idpf_tx_desc_alloc_all(struct idpf_vport *vport)

/**
* idpf_rx_page_rel - Release an rx buffer page
- * @rxq: the queue that owns the buffer
* @rx_buf: the buffer to free
*/
-static void idpf_rx_page_rel(struct idpf_queue *rxq, struct idpf_rx_buf *rx_buf)
+static void idpf_rx_page_rel(struct idpf_rx_buf *rx_buf)
{
if (unlikely(!rx_buf->page))
return;

- page_pool_put_full_page(rxq->pp, rx_buf->page, false);
+ page_pool_put_full_page(rx_buf->page->pp, rx_buf->page, false);

rx_buf->page = NULL;
rx_buf->page_offset = 0;
@@ -345,54 +389,72 @@ static void idpf_rx_page_rel(struct idpf_queue *rxq, struct idpf_rx_buf *rx_buf)

/**
* idpf_rx_hdr_buf_rel_all - Release header buffer memory
- * @rxq: queue to use
+ * @bufq: queue to use
+ * @dev: device to free DMA memory
*/
-static void idpf_rx_hdr_buf_rel_all(struct idpf_queue *rxq)
+static void idpf_rx_hdr_buf_rel_all(struct idpf_buf_queue *bufq,
+ struct device *dev)
{
- struct idpf_adapter *adapter = rxq->vport->adapter;
-
- dma_free_coherent(&adapter->pdev->dev,
- rxq->desc_count * IDPF_HDR_BUF_SIZE,
- rxq->rx_buf.hdr_buf_va,
- rxq->rx_buf.hdr_buf_pa);
- rxq->rx_buf.hdr_buf_va = NULL;
+ dma_free_coherent(dev, bufq->desc_count * IDPF_HDR_BUF_SIZE,
+ bufq->rx_buf.hdr_buf_va, bufq->rx_buf.hdr_buf_pa);
+ bufq->rx_buf.hdr_buf_va = NULL;
}

/**
- * idpf_rx_buf_rel_all - Free all Rx buffer resources for a queue
- * @rxq: queue to be cleaned
+ * idpf_rx_buf_rel_bufq - Free all Rx buffer resources for a buffer queue
+ * @bufq: queue to be cleaned
+ * @dev: device to free DMA memory
*/
-static void idpf_rx_buf_rel_all(struct idpf_queue *rxq)
+static void idpf_rx_buf_rel_bufq(struct idpf_buf_queue *bufq,
+ struct device *dev)
{
- u16 i;
-
/* queue already cleared, nothing to do */
- if (!rxq->rx_buf.buf)
+ if (!bufq->rx_buf.buf)
return;

/* Free all the bufs allocated and given to hw on Rx queue */
- for (i = 0; i < rxq->desc_count; i++)
- idpf_rx_page_rel(rxq, &rxq->rx_buf.buf[i]);
+ for (u32 i = 0; i < bufq->desc_count; i++)
+ idpf_rx_page_rel(&bufq->rx_buf.buf[i]);
+
+ if (idpf_queue_has(HSPLIT_EN, bufq))
+ idpf_rx_hdr_buf_rel_all(bufq, dev);
+
+ page_pool_destroy(bufq->pp);
+ bufq->pp = NULL;
+
+ kfree(bufq->rx_buf.buf);
+ bufq->rx_buf.buf = NULL;
+}

- if (rxq->rx_hsplit_en)
- idpf_rx_hdr_buf_rel_all(rxq);
+/**
+ * idpf_rx_buf_rel_all - Free all Rx buffer resources for a receive queue
+ * @rxq: queue to be cleaned
+ */
+static void idpf_rx_buf_rel_all(struct idpf_rx_queue *rxq)
+{
+ if (!rxq->rx_buf)
+ return;
+
+ for (u32 i = 0; i < rxq->desc_count; i++)
+ idpf_rx_page_rel(&rxq->rx_buf[i]);

page_pool_destroy(rxq->pp);
rxq->pp = NULL;

- kfree(rxq->rx_buf.buf);
- rxq->rx_buf.buf = NULL;
+ kfree(rxq->rx_buf);
+ rxq->rx_buf = NULL;
}

/**
* idpf_rx_desc_rel - Free a specific Rx q resources
* @rxq: queue to clean the resources from
- * @bufq: buffer q or completion q
- * @q_model: single or split q model
+ * @dev: device to free DMA memory
+ * @model: single or split queue model
*
* Free a specific rx queue resources
*/
-static void idpf_rx_desc_rel(struct idpf_queue *rxq, bool bufq, s32 q_model)
+static void idpf_rx_desc_rel(struct idpf_rx_queue *rxq, struct device *dev,
+ u32 model)
{
if (!rxq)
return;
@@ -402,7 +464,7 @@ static void idpf_rx_desc_rel(struct idpf_queue *rxq, bool bufq, s32 q_model)
rxq->skb = NULL;
}

- if (bufq || !idpf_is_queue_model_split(q_model))
+ if (!idpf_is_queue_model_split(model))
idpf_rx_buf_rel_all(rxq);

rxq->next_to_alloc = 0;
@@ -411,10 +473,34 @@ static void idpf_rx_desc_rel(struct idpf_queue *rxq, bool bufq, s32 q_model)
if (!rxq->desc_ring)
return;

- dmam_free_coherent(rxq->dev, rxq->size, rxq->desc_ring, rxq->dma);
+ dmam_free_coherent(dev, rxq->size, rxq->desc_ring, rxq->dma);
rxq->desc_ring = NULL;
}

+/**
+ * idpf_rx_desc_rel_bufq - free buffer queue resources
+ * @bufq: buffer queue to clean the resources from
+ * @dev: device to free DMA memory
+ */
+static void idpf_rx_desc_rel_bufq(struct idpf_buf_queue *bufq,
+ struct device *dev)
+{
+ if (!bufq)
+ return;
+
+ idpf_rx_buf_rel_bufq(bufq, dev);
+
+ bufq->next_to_alloc = 0;
+ bufq->next_to_clean = 0;
+ bufq->next_to_use = 0;
+
+ if (!bufq->split_buf)
+ return;
+
+ dma_free_coherent(dev, bufq->size, bufq->split_buf, bufq->dma);
+ bufq->split_buf = NULL;
+}
+
/**
* idpf_rx_desc_rel_all - Free Rx Resources for All Queues
* @vport: virtual port structure
@@ -423,6 +509,7 @@ static void idpf_rx_desc_rel(struct idpf_queue *rxq, bool bufq, s32 q_model)
*/
static void idpf_rx_desc_rel_all(struct idpf_vport *vport)
{
+ struct device *dev = &vport->adapter->pdev->dev;
struct idpf_rxq_group *rx_qgrp;
u16 num_rxq;
int i, j;
@@ -435,15 +522,15 @@ static void idpf_rx_desc_rel_all(struct idpf_vport *vport)

if (!idpf_is_queue_model_split(vport->rxq_model)) {
for (j = 0; j < rx_qgrp->singleq.num_rxq; j++)
- idpf_rx_desc_rel(rx_qgrp->singleq.rxqs[j],
- false, vport->rxq_model);
+ idpf_rx_desc_rel(rx_qgrp->singleq.rxqs[j], dev,
+ VIRTCHNL2_QUEUE_MODEL_SINGLE);
continue;
}

num_rxq = rx_qgrp->splitq.num_rxq_sets;
for (j = 0; j < num_rxq; j++)
idpf_rx_desc_rel(&rx_qgrp->splitq.rxq_sets[j]->rxq,
- false, vport->rxq_model);
+ dev, VIRTCHNL2_QUEUE_MODEL_SPLIT);

if (!rx_qgrp->splitq.bufq_sets)
continue;
@@ -452,44 +539,40 @@ static void idpf_rx_desc_rel_all(struct idpf_vport *vport)
struct idpf_bufq_set *bufq_set =
&rx_qgrp->splitq.bufq_sets[j];

- idpf_rx_desc_rel(&bufq_set->bufq, true,
- vport->rxq_model);
+ idpf_rx_desc_rel_bufq(&bufq_set->bufq, dev);
}
}
}

/**
* idpf_rx_buf_hw_update - Store the new tail and head values
- * @rxq: queue to bump
+ * @bufq: queue to bump
* @val: new head index
*/
-void idpf_rx_buf_hw_update(struct idpf_queue *rxq, u32 val)
+static void idpf_rx_buf_hw_update(struct idpf_buf_queue *bufq, u32 val)
{
- rxq->next_to_use = val;
+ bufq->next_to_use = val;

- if (unlikely(!rxq->tail))
+ if (unlikely(!bufq->tail))
return;

/* writel has an implicit memory barrier */
- writel(val, rxq->tail);
+ writel(val, bufq->tail);
}

/**
* idpf_rx_hdr_buf_alloc_all - Allocate memory for header buffers
- * @rxq: ring to use
+ * @bufq: ring to use
*
* Returns 0 on success, negative on failure.
*/
-static int idpf_rx_hdr_buf_alloc_all(struct idpf_queue *rxq)
+static int idpf_rx_hdr_buf_alloc_all(struct idpf_buf_queue *bufq)
{
- struct idpf_adapter *adapter = rxq->vport->adapter;
-
- rxq->rx_buf.hdr_buf_va =
- dma_alloc_coherent(&adapter->pdev->dev,
- IDPF_HDR_BUF_SIZE * rxq->desc_count,
- &rxq->rx_buf.hdr_buf_pa,
- GFP_KERNEL);
- if (!rxq->rx_buf.hdr_buf_va)
+ bufq->rx_buf.hdr_buf_va =
+ dma_alloc_coherent(bufq->q_vector->vport->netdev->dev.parent,
+ IDPF_HDR_BUF_SIZE * bufq->desc_count,
+ &bufq->rx_buf.hdr_buf_pa, GFP_KERNEL);
+ if (!bufq->rx_buf.hdr_buf_va)
return -ENOMEM;

return 0;
@@ -502,19 +585,20 @@ static int idpf_rx_hdr_buf_alloc_all(struct idpf_queue *rxq)
*/
static void idpf_rx_post_buf_refill(struct idpf_sw_queue *refillq, u16 buf_id)
{
- u16 nta = refillq->next_to_alloc;
+ u32 nta = refillq->next_to_use;

/* store the buffer ID and the SW maintained GEN bit to the refillq */
refillq->ring[nta] =
FIELD_PREP(IDPF_RX_BI_BUFID_M, buf_id) |
FIELD_PREP(IDPF_RX_BI_GEN_M,
- test_bit(__IDPF_Q_GEN_CHK, refillq->flags));
+ idpf_queue_has(GEN_CHK, refillq));

if (unlikely(++nta == refillq->desc_count)) {
nta = 0;
- change_bit(__IDPF_Q_GEN_CHK, refillq->flags);
+ idpf_queue_change(GEN_CHK, refillq);
}
- refillq->next_to_alloc = nta;
+
+ refillq->next_to_use = nta;
}
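
A rough sketch, not from the patch, of how the other end of such a refill ring
can separate fresh entries from stale ones: the consumer keeps its own
generation flag (RFL_GEN_CHK, flipped on every wrap, just as GEN_CHK is flipped
above) and compares it against the GEN bit stored in the entry. The helper
name, the u32 entry type and the bool return are assumptions made for the
example.

	static bool idpf_rx_refill_entry_valid(const struct idpf_sw_queue *refillq,
					       u32 idx, u32 *buf_id)
	{
		u32 entry = refillq->ring[idx];

		/* A generation mismatch means the producer has not rewritten
		 * this slot since the consumer last wrapped around.
		 */
		if (FIELD_GET(IDPF_RX_BI_GEN_M, entry) !=
		    idpf_queue_has(RFL_GEN_CHK, refillq))
			return false;

		*buf_id = FIELD_GET(IDPF_RX_BI_BUFID_M, entry);

		return true;
	}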

/**
@@ -524,7 +608,7 @@ static void idpf_rx_post_buf_refill(struct idpf_sw_queue *refillq, u16 buf_id)
*
* Returns false if buffer could not be allocated, true otherwise.
*/
-static bool idpf_rx_post_buf_desc(struct idpf_queue *bufq, u16 buf_id)
+static bool idpf_rx_post_buf_desc(struct idpf_buf_queue *bufq, u16 buf_id)
{
struct virtchnl2_splitq_rx_buf_desc *splitq_rx_desc = NULL;
u16 nta = bufq->next_to_alloc;
@@ -534,11 +618,10 @@ static bool idpf_rx_post_buf_desc(struct idpf_queue *bufq, u16 buf_id)
splitq_rx_desc = &bufq->split_buf[nta];
buf = &bufq->rx_buf.buf[buf_id];

- if (bufq->rx_hsplit_en) {
+ if (idpf_queue_has(HSPLIT_EN, bufq))
splitq_rx_desc->hdr_addr =
cpu_to_le64(bufq->rx_buf.hdr_buf_pa +
(u32)buf_id * IDPF_HDR_BUF_SIZE);
- }

addr = idpf_alloc_page(bufq->pp, buf, bufq->rx_buf_size);
if (unlikely(addr == DMA_MAPPING_ERROR))
@@ -562,7 +645,8 @@ static bool idpf_rx_post_buf_desc(struct idpf_queue *bufq, u16 buf_id)
*
* Returns true if @working_set bufs were posted successfully, false otherwise.
*/
-static bool idpf_rx_post_init_bufs(struct idpf_queue *bufq, u16 working_set)
+static bool idpf_rx_post_init_bufs(struct idpf_buf_queue *bufq,
+ u16 working_set)
{
int i;

@@ -571,26 +655,28 @@ static bool idpf_rx_post_init_bufs(struct idpf_queue *bufq, u16 working_set)
return false;
}

- idpf_rx_buf_hw_update(bufq,
- bufq->next_to_alloc & ~(bufq->rx_buf_stride - 1));
+ idpf_rx_buf_hw_update(bufq, ALIGN_DOWN(bufq->next_to_alloc,
+ IDPF_RX_BUF_STRIDE));

return true;
}

/**
* idpf_rx_create_page_pool - Create a page pool
- * @rxbufq: RX queue to create page pool for
+ * @napi: NAPI of the associated queue vector
+ * @count: queue descriptor count
*
* Returns &page_pool on success, casted -errno on failure
*/
-static struct page_pool *idpf_rx_create_page_pool(struct idpf_queue *rxbufq)
+static struct page_pool *idpf_rx_create_page_pool(struct napi_struct *napi,
+ u32 count)
{
struct page_pool_params pp = {
.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
.order = 0,
- .pool_size = rxbufq->desc_count,
+ .pool_size = count,
.nid = NUMA_NO_NODE,
- .dev = rxbufq->vport->netdev->dev.parent,
+ .dev = napi->dev->dev.parent,
.max_len = PAGE_SIZE,
.dma_dir = DMA_FROM_DEVICE,
.offset = 0,
@@ -599,15 +685,58 @@ static struct page_pool *idpf_rx_create_page_pool(struct idpf_queue *rxbufq)
return page_pool_create(&pp);
}

+/**
+ * idpf_rx_buf_alloc_singleq - Allocate memory for all buffer resources
+ * @rxq: queue for which the buffers are allocated
+ *
+ * Return: 0 on success, -ENOMEM on failure.
+ */
+static int idpf_rx_buf_alloc_singleq(struct idpf_rx_queue *rxq)
+{
+ rxq->rx_buf = kcalloc(rxq->desc_count, sizeof(*rxq->rx_buf),
+ GFP_KERNEL);
+ if (!rxq->rx_buf)
+ return -ENOMEM;
+
+ if (idpf_rx_singleq_buf_hw_alloc_all(rxq, rxq->desc_count - 1))
+ goto err;
+
+ return 0;
+
+err:
+ idpf_rx_buf_rel_all(rxq);
+
+ return -ENOMEM;
+}
+
+/**
+ * idpf_rx_bufs_init_singleq - Initialize page pool and allocate Rx bufs
+ * @rxq: buffer queue to create page pool for
+ *
+ * Return: 0 on success, -errno on failure.
+ */
+static int idpf_rx_bufs_init_singleq(struct idpf_rx_queue *rxq)
+{
+ struct page_pool *pool;
+
+ pool = idpf_rx_create_page_pool(&rxq->q_vector->napi, rxq->desc_count);
+ if (IS_ERR(pool))
+ return PTR_ERR(pool);
+
+ rxq->pp = pool;
+
+ return idpf_rx_buf_alloc_singleq(rxq);
+}
+
/**
* idpf_rx_buf_alloc_all - Allocate memory for all buffer resources
- * @rxbufq: queue for which the buffers are allocated; equivalent to
- * rxq when operating in singleq mode
+ * @rxbufq: queue for which the buffers are allocated
*
* Returns 0 on success, negative on failure
*/
-static int idpf_rx_buf_alloc_all(struct idpf_queue *rxbufq)
+static int idpf_rx_buf_alloc_all(struct idpf_buf_queue *rxbufq)
{
+ struct device *dev = rxbufq->q_vector->vport->netdev->dev.parent;
int err = 0;

/* Allocate book keeping buffers */
@@ -618,48 +747,41 @@ static int idpf_rx_buf_alloc_all(struct idpf_queue *rxbufq)
goto rx_buf_alloc_all_out;
}

- if (rxbufq->rx_hsplit_en) {
+ if (idpf_queue_has(HSPLIT_EN, rxbufq)) {
err = idpf_rx_hdr_buf_alloc_all(rxbufq);
if (err)
goto rx_buf_alloc_all_out;
}

/* Allocate buffers to be given to HW. */
- if (idpf_is_queue_model_split(rxbufq->vport->rxq_model)) {
- int working_set = IDPF_RX_BUFQ_WORKING_SET(rxbufq);
-
- if (!idpf_rx_post_init_bufs(rxbufq, working_set))
- err = -ENOMEM;
- } else {
- if (idpf_rx_singleq_buf_hw_alloc_all(rxbufq,
- rxbufq->desc_count - 1))
- err = -ENOMEM;
- }
+ if (!idpf_rx_post_init_bufs(rxbufq, IDPF_RX_BUFQ_WORKING_SET(rxbufq)))
+ err = -ENOMEM;

rx_buf_alloc_all_out:
if (err)
- idpf_rx_buf_rel_all(rxbufq);
+ idpf_rx_buf_rel_bufq(rxbufq, dev);

return err;
}

/**
* idpf_rx_bufs_init - Initialize page pool, allocate rx bufs, and post to HW
- * @rxbufq: RX queue to create page pool for
+ * @bufq: buffer queue to create page pool for
*
* Returns 0 on success, negative on failure
*/
-static int idpf_rx_bufs_init(struct idpf_queue *rxbufq)
+static int idpf_rx_bufs_init(struct idpf_buf_queue *bufq)
{
struct page_pool *pool;

- pool = idpf_rx_create_page_pool(rxbufq);
+ pool = idpf_rx_create_page_pool(&bufq->q_vector->napi,
+ bufq->desc_count);
if (IS_ERR(pool))
return PTR_ERR(pool);

- rxbufq->pp = pool;
+ bufq->pp = pool;

- return idpf_rx_buf_alloc_all(rxbufq);
+ return idpf_rx_buf_alloc_all(bufq);
}

/**
@@ -671,7 +793,6 @@ static int idpf_rx_bufs_init(struct idpf_queue *rxbufq)
int idpf_rx_bufs_init_all(struct idpf_vport *vport)
{
struct idpf_rxq_group *rx_qgrp;
- struct idpf_queue *q;
int i, j, err;

for (i = 0; i < vport->num_rxq_grp; i++) {
@@ -682,8 +803,10 @@ int idpf_rx_bufs_init_all(struct idpf_vport *vport)
int num_rxq = rx_qgrp->singleq.num_rxq;

for (j = 0; j < num_rxq; j++) {
+ struct idpf_rx_queue *q;
+
q = rx_qgrp->singleq.rxqs[j];
- err = idpf_rx_bufs_init(q);
+ err = idpf_rx_bufs_init_singleq(q);
if (err)
return err;
}
@@ -693,6 +816,8 @@ int idpf_rx_bufs_init_all(struct idpf_vport *vport)

/* Otherwise, allocate bufs for the buffer queues */
for (j = 0; j < vport->num_bufqs_per_qgrp; j++) {
+ struct idpf_buf_queue *q;
+
q = &rx_qgrp->splitq.bufq_sets[j].bufq;
err = idpf_rx_bufs_init(q);
if (err)
@@ -705,22 +830,17 @@ int idpf_rx_bufs_init_all(struct idpf_vport *vport)

/**
* idpf_rx_desc_alloc - Allocate queue Rx resources
+ * @vport: vport to allocate resources for
* @rxq: Rx queue for which the resources are setup
- * @bufq: buffer or completion queue
- * @q_model: single or split queue model
*
* Returns 0 on success, negative on failure
*/
-static int idpf_rx_desc_alloc(struct idpf_queue *rxq, bool bufq, s32 q_model)
+static int idpf_rx_desc_alloc(const struct idpf_vport *vport,
+ struct idpf_rx_queue *rxq)
{
- struct device *dev = rxq->dev;
+ struct device *dev = &vport->adapter->pdev->dev;

- if (bufq)
- rxq->size = rxq->desc_count *
- sizeof(struct virtchnl2_splitq_rx_buf_desc);
- else
- rxq->size = rxq->desc_count *
- sizeof(union virtchnl2_rx_desc);
+ rxq->size = rxq->desc_count * sizeof(union virtchnl2_rx_desc);

/* Allocate descriptors and also round up to nearest 4K */
rxq->size = ALIGN(rxq->size, 4096);
@@ -735,7 +855,35 @@ static int idpf_rx_desc_alloc(struct idpf_queue *rxq, bool bufq, s32 q_model)
rxq->next_to_alloc = 0;
rxq->next_to_clean = 0;
rxq->next_to_use = 0;
- set_bit(__IDPF_Q_GEN_CHK, rxq->flags);
+ idpf_queue_set(GEN_CHK, rxq);
+
+ return 0;
+}
+
+/**
+ * idpf_bufq_desc_alloc - Allocate buffer queue descriptor ring
+ * @vport: vport to allocate resources for
+ * @bufq: buffer queue for which the resources are set up
+ *
+ * Return: 0 on success, -ENOMEM on failure.
+ */
+static int idpf_bufq_desc_alloc(const struct idpf_vport *vport,
+ struct idpf_buf_queue *bufq)
+{
+ struct device *dev = &vport->adapter->pdev->dev;
+
+ bufq->size = array_size(bufq->desc_count, sizeof(*bufq->split_buf));
+
+ bufq->split_buf = dma_alloc_coherent(dev, bufq->size, &bufq->dma,
+ GFP_KERNEL);
+ if (!bufq->split_buf)
+ return -ENOMEM;
+
+ bufq->next_to_alloc = 0;
+ bufq->next_to_clean = 0;
+ bufq->next_to_use = 0;
+
+ idpf_queue_set(GEN_CHK, bufq);

return 0;
}
@@ -748,9 +896,7 @@ static int idpf_rx_desc_alloc(struct idpf_queue *rxq, bool bufq, s32 q_model)
*/
static int idpf_rx_desc_alloc_all(struct idpf_vport *vport)
{
- struct device *dev = &vport->adapter->pdev->dev;
struct idpf_rxq_group *rx_qgrp;
- struct idpf_queue *q;
int i, j, err;
u16 num_rxq;

@@ -762,13 +908,17 @@ static int idpf_rx_desc_alloc_all(struct idpf_vport *vport)
num_rxq = rx_qgrp->singleq.num_rxq;

for (j = 0; j < num_rxq; j++) {
+ struct idpf_rx_queue *q;
+
if (idpf_is_queue_model_split(vport->rxq_model))
q = &rx_qgrp->splitq.rxq_sets[j]->rxq;
else
q = rx_qgrp->singleq.rxqs[j];
- err = idpf_rx_desc_alloc(q, false, vport->rxq_model);
+
+ err = idpf_rx_desc_alloc(vport, q);
if (err) {
- dev_err(dev, "Memory allocation for Rx Queue %u failed\n",
+ pci_err(vport->adapter->pdev,
+ "Memory allocation for Rx Queue %u failed\n",
i);
goto err_out;
}
@@ -778,10 +928,14 @@ static int idpf_rx_desc_alloc_all(struct idpf_vport *vport)
continue;

for (j = 0; j < vport->num_bufqs_per_qgrp; j++) {
+ struct idpf_buf_queue *q;
+
q = &rx_qgrp->splitq.bufq_sets[j].bufq;
- err = idpf_rx_desc_alloc(q, true, vport->rxq_model);
+
+ err = idpf_bufq_desc_alloc(vport, q);
if (err) {
- dev_err(dev, "Memory allocation for Rx Buffer Queue %u failed\n",
+ pci_err(vport->adapter->pdev,
+ "Memory allocation for Rx Buffer Queue %u failed\n",
i);
goto err_out;
}
@@ -802,11 +956,16 @@ static int idpf_rx_desc_alloc_all(struct idpf_vport *vport)
*/
static void idpf_txq_group_rel(struct idpf_vport *vport)
{
+ bool split, flow_sch_en;
int i, j;

if (!vport->txq_grps)
return;

+ split = idpf_is_queue_model_split(vport->txq_model);
+ flow_sch_en = !idpf_is_cap_ena(vport->adapter, IDPF_OTHER_CAPS,
+ VIRTCHNL2_CAP_SPLITQ_QSCHED);
+
for (i = 0; i < vport->num_txq_grp; i++) {
struct idpf_txq_group *txq_grp = &vport->txq_grps[i];

@@ -814,8 +973,15 @@ static void idpf_txq_group_rel(struct idpf_vport *vport)
kfree(txq_grp->txqs[j]);
txq_grp->txqs[j] = NULL;
}
+
+ if (!split)
+ continue;
+
kfree(txq_grp->complq);
txq_grp->complq = NULL;
+
+ if (flow_sch_en)
+ kfree(txq_grp->stashes);
}
kfree(vport->txq_grps);
vport->txq_grps = NULL;
@@ -919,7 +1085,7 @@ static int idpf_vport_init_fast_path_txqs(struct idpf_vport *vport)
{
int i, j, k = 0;

- vport->txqs = kcalloc(vport->num_txq, sizeof(struct idpf_queue *),
+ vport->txqs = kcalloc(vport->num_txq, sizeof(*vport->txqs),
GFP_KERNEL);

if (!vport->txqs)
@@ -1137,7 +1303,8 @@ static void idpf_vport_calc_numq_per_grp(struct idpf_vport *vport,
* @q: rx queue for which descids are set
*
*/
-static void idpf_rxq_set_descids(struct idpf_vport *vport, struct idpf_queue *q)
+static void idpf_rxq_set_descids(const struct idpf_vport *vport,
+ struct idpf_rx_queue *q)
{
if (vport->rxq_model == VIRTCHNL2_QUEUE_MODEL_SPLIT) {
q->rxdids = VIRTCHNL2_RXDID_2_FLEX_SPLITQ_M;
@@ -1158,20 +1325,22 @@ static void idpf_rxq_set_descids(struct idpf_vport *vport, struct idpf_queue *q)
*/
static int idpf_txq_group_alloc(struct idpf_vport *vport, u16 num_txq)
{
- bool flow_sch_en;
- int err, i;
+ bool split, flow_sch_en;
+ int i;

vport->txq_grps = kcalloc(vport->num_txq_grp,
sizeof(*vport->txq_grps), GFP_KERNEL);
if (!vport->txq_grps)
return -ENOMEM;

+ split = idpf_is_queue_model_split(vport->txq_model);
flow_sch_en = !idpf_is_cap_ena(vport->adapter, IDPF_OTHER_CAPS,
VIRTCHNL2_CAP_SPLITQ_QSCHED);

for (i = 0; i < vport->num_txq_grp; i++) {
struct idpf_txq_group *tx_qgrp = &vport->txq_grps[i];
struct idpf_adapter *adapter = vport->adapter;
+ struct idpf_txq_stash *stashes;
int j;

tx_qgrp->vport = vport;
@@ -1180,45 +1349,62 @@ static int idpf_txq_group_alloc(struct idpf_vport *vport, u16 num_txq)
for (j = 0; j < tx_qgrp->num_txq; j++) {
tx_qgrp->txqs[j] = kzalloc(sizeof(*tx_qgrp->txqs[j]),
GFP_KERNEL);
- if (!tx_qgrp->txqs[j]) {
- err = -ENOMEM;
+ if (!tx_qgrp->txqs[j])
goto err_alloc;
- }
+ }
+
+ if (split && flow_sch_en) {
+ stashes = kcalloc(num_txq, sizeof(*stashes),
+ GFP_KERNEL);
+ if (!stashes)
+ goto err_alloc;
+
+ tx_qgrp->stashes = stashes;
}

for (j = 0; j < tx_qgrp->num_txq; j++) {
- struct idpf_queue *q = tx_qgrp->txqs[j];
+ struct idpf_tx_queue *q = tx_qgrp->txqs[j];

q->dev = &adapter->pdev->dev;
q->desc_count = vport->txq_desc_count;
q->tx_max_bufs = idpf_get_max_tx_bufs(adapter);
q->tx_min_pkt_len = idpf_get_min_tx_pkt_len(adapter);
- q->vport = vport;
+ q->netdev = vport->netdev;
q->txq_grp = tx_qgrp;
- hash_init(q->sched_buf_hash);

- if (flow_sch_en)
- set_bit(__IDPF_Q_FLOW_SCH_EN, q->flags);
+ if (!split) {
+ q->clean_budget = vport->compln_clean_budget;
+ idpf_queue_assign(CRC_EN, q,
+ vport->crc_enable);
+ }
+
+ if (!flow_sch_en)
+ continue;
+
+ if (split) {
+ q->stash = &stashes[j];
+ hash_init(q->stash->sched_buf_hash);
+ }
+
+ idpf_queue_set(FLOW_SCH_EN, q);
}

- if (!idpf_is_queue_model_split(vport->txq_model))
+ if (!split)
continue;

tx_qgrp->complq = kcalloc(IDPF_COMPLQ_PER_GROUP,
sizeof(*tx_qgrp->complq),
GFP_KERNEL);
- if (!tx_qgrp->complq) {
- err = -ENOMEM;
+ if (!tx_qgrp->complq)
goto err_alloc;
- }

- tx_qgrp->complq->dev = &adapter->pdev->dev;
tx_qgrp->complq->desc_count = vport->complq_desc_count;
- tx_qgrp->complq->vport = vport;
tx_qgrp->complq->txq_grp = tx_qgrp;
+ tx_qgrp->complq->netdev = vport->netdev;
+ tx_qgrp->complq->clean_budget = vport->compln_clean_budget;

if (flow_sch_en)
- __set_bit(__IDPF_Q_FLOW_SCH_EN, tx_qgrp->complq->flags);
+ idpf_queue_set(FLOW_SCH_EN, tx_qgrp->complq);
}

return 0;
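
For reference (not part of the diff): the out-of-order completion bookkeeping
that used to be embedded in every &idpf_queue now sits behind q->stash, one
entry per Tx queue in the group and only allocated when flow scheduling is
enabled. Going by the accesses in this file, the stash is roughly the container
below; the hashtable order is a guess and the real definition lives in the
header changes.

	struct idpf_txq_stash {
		struct idpf_buf_lifo	buf_stack;
		DECLARE_HASHTABLE(sched_buf_hash, 12);
	};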
@@ -1226,7 +1412,7 @@ static int idpf_txq_group_alloc(struct idpf_vport *vport, u16 num_txq)
err_alloc:
idpf_txq_group_rel(vport);

- return err;
+ return -ENOMEM;
}

/**
@@ -1238,8 +1424,6 @@ static int idpf_txq_group_alloc(struct idpf_vport *vport, u16 num_txq)
*/
static int idpf_rxq_group_alloc(struct idpf_vport *vport, u16 num_rxq)
{
- struct idpf_adapter *adapter = vport->adapter;
- struct idpf_queue *q;
int i, k, err = 0;
bool hs;

@@ -1292,21 +1476,15 @@ static int idpf_rxq_group_alloc(struct idpf_vport *vport, u16 num_rxq)
struct idpf_bufq_set *bufq_set =
&rx_qgrp->splitq.bufq_sets[j];
int swq_size = sizeof(struct idpf_sw_queue);
+ struct idpf_buf_queue *q;

q = &rx_qgrp->splitq.bufq_sets[j].bufq;
- q->dev = &adapter->pdev->dev;
q->desc_count = vport->bufq_desc_count[j];
- q->vport = vport;
- q->rxq_grp = rx_qgrp;
- q->idx = j;
q->rx_buf_size = vport->bufq_size[j];
q->rx_buffer_low_watermark = IDPF_LOW_WATERMARK;
- q->rx_buf_stride = IDPF_RX_BUF_STRIDE;

- if (hs) {
- q->rx_hsplit_en = true;
- q->rx_hbuf_size = IDPF_HDR_BUF_SIZE;
- }
+ idpf_queue_assign(HSPLIT_EN, q, hs);
+ q->rx_hbuf_size = hs ? IDPF_HDR_BUF_SIZE : 0;

bufq_set->num_refillqs = num_rxq;
bufq_set->refillqs = kcalloc(num_rxq, swq_size,
@@ -1319,13 +1497,12 @@ static int idpf_rxq_group_alloc(struct idpf_vport *vport, u16 num_rxq)
struct idpf_sw_queue *refillq =
&bufq_set->refillqs[k];

- refillq->dev = &vport->adapter->pdev->dev;
refillq->desc_count =
vport->bufq_desc_count[j];
- set_bit(__IDPF_Q_GEN_CHK, refillq->flags);
- set_bit(__IDPF_RFLQ_GEN_CHK, refillq->flags);
+ idpf_queue_set(GEN_CHK, refillq);
+ idpf_queue_set(RFL_GEN_CHK, refillq);
refillq->ring = kcalloc(refillq->desc_count,
- sizeof(u16),
+ sizeof(*refillq->ring),
GFP_KERNEL);
if (!refillq->ring) {
err = -ENOMEM;
@@ -1336,27 +1513,27 @@ static int idpf_rxq_group_alloc(struct idpf_vport *vport, u16 num_rxq)

skip_splitq_rx_init:
for (j = 0; j < num_rxq; j++) {
+ struct idpf_rx_queue *q;
+
if (!idpf_is_queue_model_split(vport->rxq_model)) {
q = rx_qgrp->singleq.rxqs[j];
goto setup_rxq;
}
q = &rx_qgrp->splitq.rxq_sets[j]->rxq;
- rx_qgrp->splitq.rxq_sets[j]->refillq0 =
+ rx_qgrp->splitq.rxq_sets[j]->refillq[0] =
&rx_qgrp->splitq.bufq_sets[0].refillqs[j];
if (vport->num_bufqs_per_qgrp > IDPF_SINGLE_BUFQ_PER_RXQ_GRP)
- rx_qgrp->splitq.rxq_sets[j]->refillq1 =
+ rx_qgrp->splitq.rxq_sets[j]->refillq[1] =
&rx_qgrp->splitq.bufq_sets[1].refillqs[j];

- if (hs) {
- q->rx_hsplit_en = true;
- q->rx_hbuf_size = IDPF_HDR_BUF_SIZE;
- }
+ idpf_queue_assign(HSPLIT_EN, q, hs);
+ q->rx_hbuf_size = hs ? IDPF_HDR_BUF_SIZE : 0;

setup_rxq:
- q->dev = &adapter->pdev->dev;
q->desc_count = vport->rxq_desc_count;
- q->vport = vport;
- q->rxq_grp = rx_qgrp;
+ q->rx_ptype_lkup = vport->rx_ptype_lkup;
+ q->netdev = vport->netdev;
+ q->bufq_sets = rx_qgrp->splitq.bufq_sets;
q->idx = (i * num_rxq) + j;
/* In splitq mode, RXQ buffer size should be
* set to that of the first buffer queue
@@ -1445,12 +1622,13 @@ int idpf_vport_queues_alloc(struct idpf_vport *vport)
* idpf_tx_handle_sw_marker - Handle queue marker packet
* @tx_q: tx queue to handle software marker
*/
-static void idpf_tx_handle_sw_marker(struct idpf_queue *tx_q)
+static void idpf_tx_handle_sw_marker(struct idpf_tx_queue *tx_q)
{
- struct idpf_vport *vport = tx_q->vport;
+ struct idpf_netdev_priv *priv = netdev_priv(tx_q->netdev);
+ struct idpf_vport *vport = priv->vport;
int i;

- clear_bit(__IDPF_Q_SW_MARKER, tx_q->flags);
+ idpf_queue_clear(SW_MARKER, tx_q);
/* Hardware must write marker packets to all queues associated with
* completion queues. So check if all queues received marker packets
*/
@@ -1458,7 +1636,7 @@ static void idpf_tx_handle_sw_marker(struct idpf_queue *tx_q)
/* If we're still waiting on any other TXQ marker completions,
* just return now since we cannot wake up the marker_wq yet.
*/
- if (test_bit(__IDPF_Q_SW_MARKER, vport->txqs[i]->flags))
+ if (idpf_queue_has(SW_MARKER, vport->txqs[i]))
return;

/* Drain complete */
@@ -1474,7 +1652,7 @@ static void idpf_tx_handle_sw_marker(struct idpf_queue *tx_q)
* @cleaned: pointer to stats struct to track cleaned packets/bytes
* @napi_budget: Used to determine if we are in netpoll
*/
-static void idpf_tx_splitq_clean_hdr(struct idpf_queue *tx_q,
+static void idpf_tx_splitq_clean_hdr(struct idpf_tx_queue *tx_q,
struct idpf_tx_buf *tx_buf,
struct idpf_cleaned_stats *cleaned,
int napi_budget)
@@ -1505,7 +1683,8 @@ static void idpf_tx_splitq_clean_hdr(struct idpf_queue *tx_q,
* @cleaned: pointer to stats struct to track cleaned packets/bytes
* @budget: Used to determine if we are in netpoll
*/
-static void idpf_tx_clean_stashed_bufs(struct idpf_queue *txq, u16 compl_tag,
+static void idpf_tx_clean_stashed_bufs(struct idpf_tx_queue *txq,
+ u16 compl_tag,
struct idpf_cleaned_stats *cleaned,
int budget)
{
@@ -1513,7 +1692,7 @@ static void idpf_tx_clean_stashed_bufs(struct idpf_queue *txq, u16 compl_tag,
struct hlist_node *tmp_buf;

/* Buffer completion */
- hash_for_each_possible_safe(txq->sched_buf_hash, stash, tmp_buf,
+ hash_for_each_possible_safe(txq->stash->sched_buf_hash, stash, tmp_buf,
hlist, compl_tag) {
if (unlikely(stash->buf.compl_tag != (int)compl_tag))
continue;
@@ -1530,7 +1709,7 @@ static void idpf_tx_clean_stashed_bufs(struct idpf_queue *txq, u16 compl_tag,
}

/* Push shadow buf back onto stack */
- idpf_buf_lifo_push(&txq->buf_stack, stash);
+ idpf_buf_lifo_push(&txq->stash->buf_stack, stash);

hash_del(&stash->hlist);
}
@@ -1542,7 +1721,7 @@ static void idpf_tx_clean_stashed_bufs(struct idpf_queue *txq, u16 compl_tag,
* @txq: Tx queue to clean
* @tx_buf: buffer to store
*/
-static int idpf_stash_flow_sch_buffers(struct idpf_queue *txq,
+static int idpf_stash_flow_sch_buffers(struct idpf_tx_queue *txq,
struct idpf_tx_buf *tx_buf)
{
struct idpf_tx_stash *stash;
@@ -1551,10 +1730,10 @@ static int idpf_stash_flow_sch_buffers(struct idpf_queue *txq,
!dma_unmap_len(tx_buf, len)))
return 0;

- stash = idpf_buf_lifo_pop(&txq->buf_stack);
+ stash = idpf_buf_lifo_pop(&txq->stash->buf_stack);
if (unlikely(!stash)) {
net_err_ratelimited("%s: No out-of-order TX buffers left!\n",
- txq->vport->netdev->name);
+ netdev_name(txq->netdev));

return -ENOMEM;
}
@@ -1568,7 +1747,8 @@ static int idpf_stash_flow_sch_buffers(struct idpf_queue *txq,
stash->buf.compl_tag = tx_buf->compl_tag;

/* Add buffer to buf_hash table to be freed later */
- hash_add(txq->sched_buf_hash, &stash->hlist, stash->buf.compl_tag);
+ hash_add(txq->stash->sched_buf_hash, &stash->hlist,
+ stash->buf.compl_tag);

memset(tx_buf, 0, sizeof(struct idpf_tx_buf));

@@ -1607,7 +1787,7 @@ do { \
* and the buffers will be cleaned separately. The stats are not updated from
* this function when using flow-based scheduling.
*/
-static void idpf_tx_splitq_clean(struct idpf_queue *tx_q, u16 end,
+static void idpf_tx_splitq_clean(struct idpf_tx_queue *tx_q, u16 end,
int napi_budget,
struct idpf_cleaned_stats *cleaned,
bool descs_only)
@@ -1703,7 +1883,7 @@ do { \
* stashed. Returns the byte/segment count for the cleaned packet associated
* this completion tag.
*/
-static bool idpf_tx_clean_buf_ring(struct idpf_queue *txq, u16 compl_tag,
+static bool idpf_tx_clean_buf_ring(struct idpf_tx_queue *txq, u16 compl_tag,
struct idpf_cleaned_stats *cleaned,
int budget)
{
@@ -1772,14 +1952,14 @@ static bool idpf_tx_clean_buf_ring(struct idpf_queue *txq, u16 compl_tag,
*
* Returns bytes/packets cleaned
*/
-static void idpf_tx_handle_rs_completion(struct idpf_queue *txq,
+static void idpf_tx_handle_rs_completion(struct idpf_tx_queue *txq,
struct idpf_splitq_tx_compl_desc *desc,
struct idpf_cleaned_stats *cleaned,
int budget)
{
u16 compl_tag;

- if (!test_bit(__IDPF_Q_FLOW_SCH_EN, txq->flags)) {
+ if (!idpf_queue_has(FLOW_SCH_EN, txq)) {
u16 head = le16_to_cpu(desc->q_head_compl_tag.q_head);

return idpf_tx_splitq_clean(txq, head, budget, cleaned, false);
@@ -1802,24 +1982,23 @@ static void idpf_tx_handle_rs_completion(struct idpf_queue *txq,
*
* Returns true if there's any budget left (e.g. the clean is finished)
*/
-static bool idpf_tx_clean_complq(struct idpf_queue *complq, int budget,
+static bool idpf_tx_clean_complq(struct idpf_compl_queue *complq, int budget,
int *cleaned)
{
struct idpf_splitq_tx_compl_desc *tx_desc;
- struct idpf_vport *vport = complq->vport;
s16 ntc = complq->next_to_clean;
struct idpf_netdev_priv *np;
unsigned int complq_budget;
bool complq_ok = true;
int i;

- complq_budget = vport->compln_clean_budget;
+ complq_budget = complq->clean_budget;
tx_desc = &complq->comp[ntc];
ntc -= complq->desc_count;

do {
struct idpf_cleaned_stats cleaned_stats = { };
- struct idpf_queue *tx_q;
+ struct idpf_tx_queue *tx_q;
int rel_tx_qid;
u16 hw_head;
u8 ctype; /* completion type */
@@ -1828,7 +2007,7 @@ static bool idpf_tx_clean_complq(struct idpf_queue *complq, int budget,
/* if the descriptor isn't done, no work yet to do */
gen = le16_get_bits(tx_desc->qid_comptype_gen,
IDPF_TXD_COMPLQ_GEN_M);
- if (test_bit(__IDPF_Q_GEN_CHK, complq->flags) != gen)
+ if (idpf_queue_has(GEN_CHK, complq) != gen)
break;

/* Find necessary info of TX queue to clean buffers */
@@ -1836,8 +2015,7 @@ static bool idpf_tx_clean_complq(struct idpf_queue *complq, int budget,
IDPF_TXD_COMPLQ_QID_M);
if (rel_tx_qid >= complq->txq_grp->num_txq ||
!complq->txq_grp->txqs[rel_tx_qid]) {
- dev_err(&complq->vport->adapter->pdev->dev,
- "TxQ not found\n");
+ netdev_err(complq->netdev, "TxQ not found\n");
goto fetch_next_desc;
}
tx_q = complq->txq_grp->txqs[rel_tx_qid];
@@ -1860,15 +2038,14 @@ static bool idpf_tx_clean_complq(struct idpf_queue *complq, int budget,
idpf_tx_handle_sw_marker(tx_q);
break;
default:
- dev_err(&tx_q->vport->adapter->pdev->dev,
- "Unknown TX completion type: %d\n",
- ctype);
+ netdev_err(tx_q->netdev,
+ "Unknown TX completion type: %d\n", ctype);
goto fetch_next_desc;
}

u64_stats_update_begin(&tx_q->stats_sync);
- u64_stats_add(&tx_q->q_stats.tx.packets, cleaned_stats.packets);
- u64_stats_add(&tx_q->q_stats.tx.bytes, cleaned_stats.bytes);
+ u64_stats_add(&tx_q->q_stats.packets, cleaned_stats.packets);
+ u64_stats_add(&tx_q->q_stats.bytes, cleaned_stats.bytes);
tx_q->cleaned_pkts += cleaned_stats.packets;
tx_q->cleaned_bytes += cleaned_stats.bytes;
complq->num_completions++;
@@ -1880,7 +2057,7 @@ static bool idpf_tx_clean_complq(struct idpf_queue *complq, int budget,
if (unlikely(!ntc)) {
ntc -= complq->desc_count;
tx_desc = &complq->comp[0];
- change_bit(__IDPF_Q_GEN_CHK, complq->flags);
+ idpf_queue_change(GEN_CHK, complq);
}

prefetch(tx_desc);
@@ -1896,9 +2073,9 @@ static bool idpf_tx_clean_complq(struct idpf_queue *complq, int budget,
IDPF_TX_COMPLQ_OVERFLOW_THRESH(complq)))
complq_ok = false;

- np = netdev_priv(complq->vport->netdev);
+ np = netdev_priv(complq->netdev);
for (i = 0; i < complq->txq_grp->num_txq; ++i) {
- struct idpf_queue *tx_q = complq->txq_grp->txqs[i];
+ struct idpf_tx_queue *tx_q = complq->txq_grp->txqs[i];
struct netdev_queue *nq;
bool dont_wake;

@@ -1909,11 +2086,11 @@ static bool idpf_tx_clean_complq(struct idpf_queue *complq, int budget,
*cleaned += tx_q->cleaned_pkts;

/* Update BQL */
- nq = netdev_get_tx_queue(tx_q->vport->netdev, tx_q->idx);
+ nq = netdev_get_tx_queue(tx_q->netdev, tx_q->idx);

dont_wake = !complq_ok || IDPF_TX_BUF_RSV_LOW(tx_q) ||
np->state != __IDPF_VPORT_UP ||
- !netif_carrier_ok(tx_q->vport->netdev);
+ !netif_carrier_ok(tx_q->netdev);
/* Check if the TXQ needs to and can be restarted */
__netif_txq_completed_wake(nq, tx_q->cleaned_pkts, tx_q->cleaned_bytes,
IDPF_DESC_UNUSED(tx_q), IDPF_TX_WAKE_THRESH,
@@ -1976,7 +2153,7 @@ void idpf_tx_splitq_build_flow_desc(union idpf_tx_flex_desc *desc,
*
* Returns 0 if stop is not needed
*/
-int idpf_tx_maybe_stop_common(struct idpf_queue *tx_q, unsigned int size)
+int idpf_tx_maybe_stop_common(struct idpf_tx_queue *tx_q, unsigned int size)
{
struct netdev_queue *nq;

@@ -1984,10 +2161,10 @@ int idpf_tx_maybe_stop_common(struct idpf_queue *tx_q, unsigned int size)
return 0;

u64_stats_update_begin(&tx_q->stats_sync);
- u64_stats_inc(&tx_q->q_stats.tx.q_busy);
+ u64_stats_inc(&tx_q->q_stats.q_busy);
u64_stats_update_end(&tx_q->stats_sync);

- nq = netdev_get_tx_queue(tx_q->vport->netdev, tx_q->idx);
+ nq = netdev_get_tx_queue(tx_q->netdev, tx_q->idx);

return netif_txq_maybe_stop(nq, IDPF_DESC_UNUSED(tx_q), size, size);
}
@@ -1999,7 +2176,7 @@ int idpf_tx_maybe_stop_common(struct idpf_queue *tx_q, unsigned int size)
*
* Returns 0 if stop is not needed
*/
-static int idpf_tx_maybe_stop_splitq(struct idpf_queue *tx_q,
+static int idpf_tx_maybe_stop_splitq(struct idpf_tx_queue *tx_q,
unsigned int descs_needed)
{
if (idpf_tx_maybe_stop_common(tx_q, descs_needed))
@@ -2023,9 +2200,9 @@ static int idpf_tx_maybe_stop_splitq(struct idpf_queue *tx_q,

splitq_stop:
u64_stats_update_begin(&tx_q->stats_sync);
- u64_stats_inc(&tx_q->q_stats.tx.q_busy);
+ u64_stats_inc(&tx_q->q_stats.q_busy);
u64_stats_update_end(&tx_q->stats_sync);
- netif_stop_subqueue(tx_q->vport->netdev, tx_q->idx);
+ netif_stop_subqueue(tx_q->netdev, tx_q->idx);

return -EBUSY;
}
@@ -2040,12 +2217,12 @@ static int idpf_tx_maybe_stop_splitq(struct idpf_queue *tx_q,
* to do a register write to update our queue status. We know this can only
* mean tail here as HW should be owning head for TX.
*/
-void idpf_tx_buf_hw_update(struct idpf_queue *tx_q, u32 val,
+void idpf_tx_buf_hw_update(struct idpf_tx_queue *tx_q, u32 val,
bool xmit_more)
{
struct netdev_queue *nq;

- nq = netdev_get_tx_queue(tx_q->vport->netdev, tx_q->idx);
+ nq = netdev_get_tx_queue(tx_q->netdev, tx_q->idx);
tx_q->next_to_use = val;

idpf_tx_maybe_stop_common(tx_q, IDPF_TX_DESC_NEEDED);
@@ -2069,7 +2246,7 @@ void idpf_tx_buf_hw_update(struct idpf_queue *tx_q, u32 val,
*
* Returns number of data descriptors needed for this skb.
*/
-unsigned int idpf_tx_desc_count_required(struct idpf_queue *txq,
+unsigned int idpf_tx_desc_count_required(struct idpf_tx_queue *txq,
struct sk_buff *skb)
{
const struct skb_shared_info *shinfo;
@@ -2102,7 +2279,7 @@ unsigned int idpf_tx_desc_count_required(struct idpf_queue *txq,

count = idpf_size_to_txd_count(skb->len);
u64_stats_update_begin(&txq->stats_sync);
- u64_stats_inc(&txq->q_stats.tx.linearize);
+ u64_stats_inc(&txq->q_stats.linearize);
u64_stats_update_end(&txq->stats_sync);
}

@@ -2116,11 +2293,11 @@ unsigned int idpf_tx_desc_count_required(struct idpf_queue *txq,
* @first: original first buffer info buffer for packet
* @idx: starting point on ring to unwind
*/
-void idpf_tx_dma_map_error(struct idpf_queue *txq, struct sk_buff *skb,
+void idpf_tx_dma_map_error(struct idpf_tx_queue *txq, struct sk_buff *skb,
struct idpf_tx_buf *first, u16 idx)
{
u64_stats_update_begin(&txq->stats_sync);
- u64_stats_inc(&txq->q_stats.tx.dma_map_errs);
+ u64_stats_inc(&txq->q_stats.dma_map_errs);
u64_stats_update_end(&txq->stats_sync);

/* clear dma mappings for failed tx_buf map */
@@ -2159,7 +2336,7 @@ void idpf_tx_dma_map_error(struct idpf_queue *txq, struct sk_buff *skb,
* @txq: the tx ring to wrap
* @ntu: ring index to bump
*/
-static unsigned int idpf_tx_splitq_bump_ntu(struct idpf_queue *txq, u16 ntu)
+static unsigned int idpf_tx_splitq_bump_ntu(struct idpf_tx_queue *txq, u16 ntu)
{
ntu++;

@@ -2181,7 +2358,7 @@ static unsigned int idpf_tx_splitq_bump_ntu(struct idpf_queue *txq, u16 ntu)
* and gets a physical address for each memory location and programs
* it and the length into the transmit flex descriptor.
*/
-static void idpf_tx_splitq_map(struct idpf_queue *tx_q,
+static void idpf_tx_splitq_map(struct idpf_tx_queue *tx_q,
struct idpf_tx_splitq_params *params,
struct idpf_tx_buf *first)
{
@@ -2348,7 +2525,7 @@ static void idpf_tx_splitq_map(struct idpf_queue *tx_q,
tx_q->txq_grp->num_completions_pending++;

/* record bytecount for BQL */
- nq = netdev_get_tx_queue(tx_q->vport->netdev, tx_q->idx);
+ nq = netdev_get_tx_queue(tx_q->netdev, tx_q->idx);
netdev_tx_sent_queue(nq, first->bytecount);

idpf_tx_buf_hw_update(tx_q, i, netdev_xmit_more());
@@ -2544,7 +2721,7 @@ bool idpf_chk_linearize(struct sk_buff *skb, unsigned int max_bufs,
* ring entry to reflect that this index is a context descriptor
*/
static struct idpf_flex_tx_ctx_desc *
-idpf_tx_splitq_get_ctx_desc(struct idpf_queue *txq)
+idpf_tx_splitq_get_ctx_desc(struct idpf_tx_queue *txq)
{
struct idpf_flex_tx_ctx_desc *desc;
int i = txq->next_to_use;
@@ -2564,10 +2741,10 @@ idpf_tx_splitq_get_ctx_desc(struct idpf_queue *txq)
* @tx_q: queue to send buffer on
* @skb: pointer to skb
*/
-netdev_tx_t idpf_tx_drop_skb(struct idpf_queue *tx_q, struct sk_buff *skb)
+netdev_tx_t idpf_tx_drop_skb(struct idpf_tx_queue *tx_q, struct sk_buff *skb)
{
u64_stats_update_begin(&tx_q->stats_sync);
- u64_stats_inc(&tx_q->q_stats.tx.skb_drops);
+ u64_stats_inc(&tx_q->q_stats.skb_drops);
u64_stats_update_end(&tx_q->stats_sync);

idpf_tx_buf_hw_update(tx_q, tx_q->next_to_use, false);
@@ -2585,7 +2762,7 @@ netdev_tx_t idpf_tx_drop_skb(struct idpf_queue *tx_q, struct sk_buff *skb)
* Returns NETDEV_TX_OK if sent, else an error code
*/
static netdev_tx_t idpf_tx_splitq_frame(struct sk_buff *skb,
- struct idpf_queue *tx_q)
+ struct idpf_tx_queue *tx_q)
{
struct idpf_tx_splitq_params tx_params = { };
struct idpf_tx_buf *first;
@@ -2625,7 +2802,7 @@ static netdev_tx_t idpf_tx_splitq_frame(struct sk_buff *skb,
ctx_desc->tso.qw0.hdr_len = tx_params.offload.tso_hdr_len;

u64_stats_update_begin(&tx_q->stats_sync);
- u64_stats_inc(&tx_q->q_stats.tx.lso_pkts);
+ u64_stats_inc(&tx_q->q_stats.lso_pkts);
u64_stats_update_end(&tx_q->stats_sync);
}

@@ -2642,7 +2819,7 @@ static netdev_tx_t idpf_tx_splitq_frame(struct sk_buff *skb,
first->bytecount = max_t(unsigned int, skb->len, ETH_ZLEN);
}

- if (test_bit(__IDPF_Q_FLOW_SCH_EN, tx_q->flags)) {
+ if (idpf_queue_has(FLOW_SCH_EN, tx_q)) {
tx_params.dtype = IDPF_TX_DESC_DTYPE_FLEX_FLOW_SCHE;
tx_params.eop_cmd = IDPF_TXD_FLEX_FLOW_CMD_EOP;
/* Set the RE bit to catch any packets that may have not been
@@ -2682,7 +2859,7 @@ netdev_tx_t idpf_tx_splitq_start(struct sk_buff *skb,
struct net_device *netdev)
{
struct idpf_vport *vport = idpf_netdev_to_vport(netdev);
- struct idpf_queue *tx_q;
+ struct idpf_tx_queue *tx_q;

if (unlikely(skb_get_queue_mapping(skb) >= vport->num_txq)) {
dev_kfree_skb_any(skb);
@@ -2735,13 +2912,14 @@ enum pkt_hash_types idpf_ptype_to_htype(const struct idpf_rx_ptype_decoded *deco
* @rx_desc: Receive descriptor
* @decoded: Decoded Rx packet type related fields
*/
-static void idpf_rx_hash(struct idpf_queue *rxq, struct sk_buff *skb,
- struct virtchnl2_rx_flex_desc_adv_nic_3 *rx_desc,
- struct idpf_rx_ptype_decoded *decoded)
+static void
+idpf_rx_hash(const struct idpf_rx_queue *rxq, struct sk_buff *skb,
+ const struct virtchnl2_rx_flex_desc_adv_nic_3 *rx_desc,
+ struct idpf_rx_ptype_decoded *decoded)
{
u32 hash;

- if (unlikely(!idpf_is_feature_ena(rxq->vport, NETIF_F_RXHASH)))
+ if (unlikely(!(rxq->netdev->features & NETIF_F_RXHASH)))
return;

hash = le16_to_cpu(rx_desc->hash1) |
@@ -2760,14 +2938,14 @@ static void idpf_rx_hash(struct idpf_queue *rxq, struct sk_buff *skb,
*
* skb->protocol must be set before this function is called
*/
-static void idpf_rx_csum(struct idpf_queue *rxq, struct sk_buff *skb,
+static void idpf_rx_csum(struct idpf_rx_queue *rxq, struct sk_buff *skb,
struct idpf_rx_csum_decoded *csum_bits,
struct idpf_rx_ptype_decoded *decoded)
{
bool ipv4, ipv6;

/* check if Rx checksum is enabled */
- if (unlikely(!idpf_is_feature_ena(rxq->vport, NETIF_F_RXCSUM)))
+ if (unlikely(!(rxq->netdev->features & NETIF_F_RXCSUM)))
return;

/* check if HW has decoded the packet and checksum */
@@ -2814,7 +2992,7 @@ static void idpf_rx_csum(struct idpf_queue *rxq, struct sk_buff *skb,

checksum_fail:
u64_stats_update_begin(&rxq->stats_sync);
- u64_stats_inc(&rxq->q_stats.rx.hw_csum_err);
+ u64_stats_inc(&rxq->q_stats.hw_csum_err);
u64_stats_update_end(&rxq->stats_sync);
}

@@ -2824,8 +3002,9 @@ static void idpf_rx_csum(struct idpf_queue *rxq, struct sk_buff *skb,
* @csum: structure to extract checksum fields
*
**/
-static void idpf_rx_splitq_extract_csum_bits(struct virtchnl2_rx_flex_desc_adv_nic_3 *rx_desc,
- struct idpf_rx_csum_decoded *csum)
+static void
+idpf_rx_splitq_extract_csum_bits(const struct virtchnl2_rx_flex_desc_adv_nic_3 *rx_desc,
+ struct idpf_rx_csum_decoded *csum)
{
u8 qword0, qword1;

@@ -2860,8 +3039,8 @@ static void idpf_rx_splitq_extract_csum_bits(struct virtchnl2_rx_flex_desc_adv_n
* Populate the skb fields with the total number of RSC segments, RSC payload
* length and packet type.
*/
-static int idpf_rx_rsc(struct idpf_queue *rxq, struct sk_buff *skb,
- struct virtchnl2_rx_flex_desc_adv_nic_3 *rx_desc,
+static int idpf_rx_rsc(struct idpf_rx_queue *rxq, struct sk_buff *skb,
+ const struct virtchnl2_rx_flex_desc_adv_nic_3 *rx_desc,
struct idpf_rx_ptype_decoded *decoded)
{
u16 rsc_segments, rsc_seg_len;
@@ -2914,7 +3093,7 @@ static int idpf_rx_rsc(struct idpf_queue *rxq, struct sk_buff *skb,
tcp_gro_complete(skb);

u64_stats_update_begin(&rxq->stats_sync);
- u64_stats_inc(&rxq->q_stats.rx.rsc_pkts);
+ u64_stats_inc(&rxq->q_stats.rsc_pkts);
u64_stats_update_end(&rxq->stats_sync);

return 0;
@@ -2930,9 +3109,9 @@ static int idpf_rx_rsc(struct idpf_queue *rxq, struct sk_buff *skb,
* order to populate the hash, checksum, protocol, and
* other fields within the skb.
*/
-static int idpf_rx_process_skb_fields(struct idpf_queue *rxq,
- struct sk_buff *skb,
- struct virtchnl2_rx_flex_desc_adv_nic_3 *rx_desc)
+static int
+idpf_rx_process_skb_fields(struct idpf_rx_queue *rxq, struct sk_buff *skb,
+ const struct virtchnl2_rx_flex_desc_adv_nic_3 *rx_desc)
{
struct idpf_rx_csum_decoded csum_bits = { };
struct idpf_rx_ptype_decoded decoded;
@@ -2940,19 +3119,13 @@ static int idpf_rx_process_skb_fields(struct idpf_queue *rxq,

rx_ptype = le16_get_bits(rx_desc->ptype_err_fflags0,
VIRTCHNL2_RX_FLEX_DESC_ADV_PTYPE_M);
-
- skb->protocol = eth_type_trans(skb, rxq->vport->netdev);
-
- decoded = rxq->vport->rx_ptype_lkup[rx_ptype];
- /* If we don't know the ptype we can't do anything else with it. Just
- * pass it up the stack as-is.
- */
- if (!decoded.known)
- return 0;
+ decoded = rxq->rx_ptype_lkup[rx_ptype];

/* process RSS/hash */
idpf_rx_hash(rxq, skb, rx_desc, &decoded);

+ skb->protocol = eth_type_trans(skb, rxq->netdev);
+
if (le16_get_bits(rx_desc->hdrlen_flags,
VIRTCHNL2_RX_FLEX_DESC_ADV_RSC_M))
return idpf_rx_rsc(rxq, skb, rx_desc, &decoded);
@@ -2992,7 +3165,7 @@ void idpf_rx_add_frag(struct idpf_rx_buf *rx_buf, struct sk_buff *skb,
* data from the current receive descriptor, taking care to set up the
* skb correctly.
*/
-struct sk_buff *idpf_rx_construct_skb(struct idpf_queue *rxq,
+struct sk_buff *idpf_rx_construct_skb(const struct idpf_rx_queue *rxq,
struct idpf_rx_buf *rx_buf,
unsigned int size)
{
@@ -3005,7 +3178,7 @@ struct sk_buff *idpf_rx_construct_skb(struct idpf_queue *rxq,
/* prefetch first cache line of first page */
net_prefetch(va);
/* allocate a skb to store the frags */
- skb = napi_alloc_skb(&rxq->q_vector->napi, IDPF_RX_HDR_SIZE);
+ skb = napi_alloc_skb(rxq->napi, IDPF_RX_HDR_SIZE);
if (unlikely(!skb)) {
idpf_rx_put_page(rx_buf);

@@ -3052,14 +3225,14 @@ struct sk_buff *idpf_rx_construct_skb(struct idpf_queue *rxq,
* the current receive descriptor, taking care to set up the skb correctly.
* This specifically uses a header buffer to start building the skb.
*/
-static struct sk_buff *idpf_rx_hdr_construct_skb(struct idpf_queue *rxq,
- const void *va,
- unsigned int size)
+static struct sk_buff *
+idpf_rx_hdr_construct_skb(const struct idpf_rx_queue *rxq, const void *va,
+ unsigned int size)
{
struct sk_buff *skb;

/* allocate a skb to store the frags */
- skb = napi_alloc_skb(&rxq->q_vector->napi, size);
+ skb = napi_alloc_skb(rxq->napi, size);
if (unlikely(!skb))
return NULL;

@@ -3115,10 +3288,10 @@ static bool idpf_rx_splitq_is_eop(struct virtchnl2_rx_flex_desc_adv_nic_3 *rx_de
*
* Returns amount of work completed
*/
-static int idpf_rx_splitq_clean(struct idpf_queue *rxq, int budget)
+static int idpf_rx_splitq_clean(struct idpf_rx_queue *rxq, int budget)
{
int total_rx_bytes = 0, total_rx_pkts = 0;
- struct idpf_queue *rx_bufq = NULL;
+ struct idpf_buf_queue *rx_bufq = NULL;
struct sk_buff *skb = rxq->skb;
u16 ntc = rxq->next_to_clean;

@@ -3148,7 +3321,7 @@ static int idpf_rx_splitq_clean(struct idpf_queue *rxq, int budget)
gen_id = le16_get_bits(rx_desc->pktlen_gen_bufq_id,
VIRTCHNL2_RX_FLEX_DESC_ADV_GEN_M);

- if (test_bit(__IDPF_Q_GEN_CHK, rxq->flags) != gen_id)
+ if (idpf_queue_has(GEN_CHK, rxq) != gen_id)
break;

rxdid = FIELD_GET(VIRTCHNL2_RX_FLEX_DESC_ADV_RXDID_M,
@@ -3156,7 +3329,7 @@ static int idpf_rx_splitq_clean(struct idpf_queue *rxq, int budget)
if (rxdid != VIRTCHNL2_RXDID_2_FLEX_SPLITQ) {
IDPF_RX_BUMP_NTC(rxq, ntc);
u64_stats_update_begin(&rxq->stats_sync);
- u64_stats_inc(&rxq->q_stats.rx.bad_descs);
+ u64_stats_inc(&rxq->q_stats.bad_descs);
u64_stats_update_end(&rxq->stats_sync);
continue;
}
@@ -3174,7 +3347,7 @@ static int idpf_rx_splitq_clean(struct idpf_queue *rxq, int budget)
* data/payload buffer.
*/
u64_stats_update_begin(&rxq->stats_sync);
- u64_stats_inc(&rxq->q_stats.rx.hsplit_buf_ovf);
+ u64_stats_inc(&rxq->q_stats.hsplit_buf_ovf);
u64_stats_update_end(&rxq->stats_sync);
goto bypass_hsplit;
}
@@ -3187,13 +3360,10 @@ static int idpf_rx_splitq_clean(struct idpf_queue *rxq, int budget)
VIRTCHNL2_RX_FLEX_DESC_ADV_BUFQ_ID_M);

rxq_set = container_of(rxq, struct idpf_rxq_set, rxq);
- if (!bufq_id)
- refillq = rxq_set->refillq0;
- else
- refillq = rxq_set->refillq1;
+ refillq = rxq_set->refillq[bufq_id];

/* retrieve buffer from the rxq */
- rx_bufq = &rxq->rxq_grp->splitq.bufq_sets[bufq_id].bufq;
+ rx_bufq = &rxq->bufq_sets[bufq_id].bufq;

buf_id = le16_to_cpu(rx_desc->buf_id);

@@ -3205,7 +3375,7 @@ static int idpf_rx_splitq_clean(struct idpf_queue *rxq, int budget)

skb = idpf_rx_hdr_construct_skb(rxq, va, hdr_len);
u64_stats_update_begin(&rxq->stats_sync);
- u64_stats_inc(&rxq->q_stats.rx.hsplit_pkts);
+ u64_stats_inc(&rxq->q_stats.hsplit_pkts);
u64_stats_update_end(&rxq->stats_sync);
}

@@ -3248,7 +3418,7 @@ static int idpf_rx_splitq_clean(struct idpf_queue *rxq, int budget)
}

/* send completed skb up the stack */
- napi_gro_receive(&rxq->q_vector->napi, skb);
+ napi_gro_receive(rxq->napi, skb);
skb = NULL;

/* update budget accounting */
@@ -3259,8 +3429,8 @@ static int idpf_rx_splitq_clean(struct idpf_queue *rxq, int budget)

rxq->skb = skb;
u64_stats_update_begin(&rxq->stats_sync);
- u64_stats_add(&rxq->q_stats.rx.packets, total_rx_pkts);
- u64_stats_add(&rxq->q_stats.rx.bytes, total_rx_bytes);
+ u64_stats_add(&rxq->q_stats.packets, total_rx_pkts);
+ u64_stats_add(&rxq->q_stats.bytes, total_rx_bytes);
u64_stats_update_end(&rxq->stats_sync);

/* guarantee a trip back through this routine if there was a failure */
@@ -3270,19 +3440,16 @@ static int idpf_rx_splitq_clean(struct idpf_queue *rxq, int budget)
/**
* idpf_rx_update_bufq_desc - Update buffer queue descriptor
* @bufq: Pointer to the buffer queue
- * @refill_desc: SW Refill queue descriptor containing buffer ID
+ * @buf_id: buffer ID
* @buf_desc: Buffer queue descriptor
*
* Return 0 on success and negative on failure.
*/
-static int idpf_rx_update_bufq_desc(struct idpf_queue *bufq, u16 refill_desc,
+static int idpf_rx_update_bufq_desc(struct idpf_buf_queue *bufq, u32 buf_id,
struct virtchnl2_splitq_rx_buf_desc *buf_desc)
{
struct idpf_rx_buf *buf;
dma_addr_t addr;
- u16 buf_id;
-
- buf_id = FIELD_GET(IDPF_RX_BI_BUFID_M, refill_desc);

buf = &bufq->rx_buf.buf[buf_id];

@@ -3293,7 +3460,7 @@ static int idpf_rx_update_bufq_desc(struct idpf_queue *bufq, u16 refill_desc,
buf_desc->pkt_addr = cpu_to_le64(addr);
buf_desc->qword0.buf_id = cpu_to_le16(buf_id);

- if (!bufq->rx_hsplit_en)
+ if (!idpf_queue_has(HSPLIT_EN, bufq))
return 0;

buf_desc->hdr_addr = cpu_to_le64(bufq->rx_buf.hdr_buf_pa +
@@ -3309,33 +3476,32 @@ static int idpf_rx_update_bufq_desc(struct idpf_queue *bufq, u16 refill_desc,
*
* This function takes care of the buffer refill management
*/
-static void idpf_rx_clean_refillq(struct idpf_queue *bufq,
+static void idpf_rx_clean_refillq(struct idpf_buf_queue *bufq,
struct idpf_sw_queue *refillq)
{
struct virtchnl2_splitq_rx_buf_desc *buf_desc;
u16 bufq_nta = bufq->next_to_alloc;
u16 ntc = refillq->next_to_clean;
int cleaned = 0;
- u16 gen;

buf_desc = &bufq->split_buf[bufq_nta];

/* make sure we stop at ring wrap in the unlikely case ring is full */
while (likely(cleaned < refillq->desc_count)) {
- u16 refill_desc = refillq->ring[ntc];
+ u32 buf_id, refill_desc = refillq->ring[ntc];
bool failure;

- gen = FIELD_GET(IDPF_RX_BI_GEN_M, refill_desc);
- if (test_bit(__IDPF_RFLQ_GEN_CHK, refillq->flags) != gen)
+ if (idpf_queue_has(RFL_GEN_CHK, refillq) !=
+ !!(refill_desc & IDPF_RX_BI_GEN_M))
break;

- failure = idpf_rx_update_bufq_desc(bufq, refill_desc,
- buf_desc);
+ buf_id = FIELD_GET(IDPF_RX_BI_BUFID_M, refill_desc);
+ failure = idpf_rx_update_bufq_desc(bufq, buf_id, buf_desc);
if (failure)
break;

if (unlikely(++ntc == refillq->desc_count)) {
- change_bit(__IDPF_RFLQ_GEN_CHK, refillq->flags);
+ idpf_queue_change(RFL_GEN_CHK, refillq);
ntc = 0;
}

@@ -3374,7 +3540,7 @@ static void idpf_rx_clean_refillq(struct idpf_queue *bufq,
* this vector. Returns true if clean is complete within budget, false
* otherwise.
*/
-static void idpf_rx_clean_refillq_all(struct idpf_queue *bufq)
+static void idpf_rx_clean_refillq_all(struct idpf_buf_queue *bufq)
{
struct idpf_bufq_set *bufq_set;
int i;
@@ -3439,6 +3605,8 @@ void idpf_vport_intr_rel(struct idpf_vport *vport)
for (v_idx = 0; v_idx < vport->num_q_vectors; v_idx++) {
struct idpf_q_vector *q_vector = &vport->q_vectors[v_idx];

+ kfree(q_vector->complq);
+ q_vector->complq = NULL;
kfree(q_vector->bufq);
q_vector->bufq = NULL;
kfree(q_vector->tx);
@@ -3577,13 +3745,13 @@ static void idpf_net_dim(struct idpf_q_vector *q_vector)
goto check_rx_itr;

for (i = 0, packets = 0, bytes = 0; i < q_vector->num_txq; i++) {
- struct idpf_queue *txq = q_vector->tx[i];
+ struct idpf_tx_queue *txq = q_vector->tx[i];
unsigned int start;

do {
start = u64_stats_fetch_begin(&txq->stats_sync);
- packets += u64_stats_read(&txq->q_stats.tx.packets);
- bytes += u64_stats_read(&txq->q_stats.tx.bytes);
+ packets += u64_stats_read(&txq->q_stats.packets);
+ bytes += u64_stats_read(&txq->q_stats.bytes);
} while (u64_stats_fetch_retry(&txq->stats_sync, start));
}

@@ -3596,13 +3764,13 @@ static void idpf_net_dim(struct idpf_q_vector *q_vector)
return;

for (i = 0, packets = 0, bytes = 0; i < q_vector->num_rxq; i++) {
- struct idpf_queue *rxq = q_vector->rx[i];
+ struct idpf_rx_queue *rxq = q_vector->rx[i];
unsigned int start;

do {
start = u64_stats_fetch_begin(&rxq->stats_sync);
- packets += u64_stats_read(&rxq->q_stats.rx.packets);
- bytes += u64_stats_read(&rxq->q_stats.rx.bytes);
+ packets += u64_stats_read(&rxq->q_stats.packets);
+ bytes += u64_stats_read(&rxq->q_stats.bytes);
} while (u64_stats_fetch_retry(&rxq->stats_sync, start));
}

@@ -3844,16 +4012,17 @@ static void idpf_vport_intr_napi_ena_all(struct idpf_vport *vport)
static bool idpf_tx_splitq_clean_all(struct idpf_q_vector *q_vec,
int budget, int *cleaned)
{
- u16 num_txq = q_vec->num_txq;
+ u16 num_complq = q_vec->num_complq;
bool clean_complete = true;
int i, budget_per_q;

- if (unlikely(!num_txq))
+ if (unlikely(!num_complq))
return true;

- budget_per_q = DIV_ROUND_UP(budget, num_txq);
- for (i = 0; i < num_txq; i++)
- clean_complete &= idpf_tx_clean_complq(q_vec->tx[i],
+ budget_per_q = DIV_ROUND_UP(budget, num_complq);
+
+ for (i = 0; i < num_complq; i++)
+ clean_complete &= idpf_tx_clean_complq(q_vec->complq[i],
budget_per_q, cleaned);

return clean_complete;
@@ -3880,7 +4049,7 @@ static bool idpf_rx_splitq_clean_all(struct idpf_q_vector *q_vec, int budget,
*/
budget_per_q = num_rxq ? max(budget / num_rxq, 1) : 0;
for (i = 0; i < num_rxq; i++) {
- struct idpf_queue *rxq = q_vec->rx[i];
+ struct idpf_rx_queue *rxq = q_vec->rx[i];
int pkts_cleaned_per_q;

pkts_cleaned_per_q = idpf_rx_splitq_clean(rxq, budget_per_q);
@@ -3935,8 +4104,8 @@ static int idpf_vport_splitq_napi_poll(struct napi_struct *napi, int budget)
* queues virtchnl message, as the interrupts will be disabled after
* that
*/
- if (unlikely(q_vector->num_txq && test_bit(__IDPF_Q_POLL_MODE,
- q_vector->tx[0]->flags)))
+ if (unlikely(q_vector->num_txq && idpf_queue_has(POLL_MODE,
+ q_vector->tx[0])))
return budget;
else
return work_done;
@@ -3950,27 +4119,28 @@ static int idpf_vport_splitq_napi_poll(struct napi_struct *napi, int budget)
*/
static void idpf_vport_intr_map_vector_to_qs(struct idpf_vport *vport)
{
+ bool split = idpf_is_queue_model_split(vport->rxq_model);
u16 num_txq_grp = vport->num_txq_grp;
- int i, j, qv_idx, bufq_vidx = 0;
struct idpf_rxq_group *rx_qgrp;
struct idpf_txq_group *tx_qgrp;
- struct idpf_queue *q, *bufq;
- u16 q_index;
+ u32 i, qv_idx, q_index;

for (i = 0, qv_idx = 0; i < vport->num_rxq_grp; i++) {
u16 num_rxq;

+ if (qv_idx >= vport->num_q_vectors)
+ qv_idx = 0;
+
rx_qgrp = &vport->rxq_grps[i];
- if (idpf_is_queue_model_split(vport->rxq_model))
+ if (split)
num_rxq = rx_qgrp->splitq.num_rxq_sets;
else
num_rxq = rx_qgrp->singleq.num_rxq;

- for (j = 0; j < num_rxq; j++) {
- if (qv_idx >= vport->num_q_vectors)
- qv_idx = 0;
+ for (u32 j = 0; j < num_rxq; j++) {
+ struct idpf_rx_queue *q;

- if (idpf_is_queue_model_split(vport->rxq_model))
+ if (split)
q = &rx_qgrp->splitq.rxq_sets[j]->rxq;
else
q = rx_qgrp->singleq.rxqs[j];
@@ -3978,52 +4148,53 @@ static void idpf_vport_intr_map_vector_to_qs(struct idpf_vport *vport)
q_index = q->q_vector->num_rxq;
q->q_vector->rx[q_index] = q;
q->q_vector->num_rxq++;
- qv_idx++;
+
+ if (split)
+ q->napi = &q->q_vector->napi;
}

- if (idpf_is_queue_model_split(vport->rxq_model)) {
- for (j = 0; j < vport->num_bufqs_per_qgrp; j++) {
+ if (split) {
+ for (u32 j = 0; j < vport->num_bufqs_per_qgrp; j++) {
+ struct idpf_buf_queue *bufq;
+
bufq = &rx_qgrp->splitq.bufq_sets[j].bufq;
- bufq->q_vector = &vport->q_vectors[bufq_vidx];
+ bufq->q_vector = &vport->q_vectors[qv_idx];
q_index = bufq->q_vector->num_bufq;
bufq->q_vector->bufq[q_index] = bufq;
bufq->q_vector->num_bufq++;
}
- if (++bufq_vidx >= vport->num_q_vectors)
- bufq_vidx = 0;
}
+
+ qv_idx++;
}

+ split = idpf_is_queue_model_split(vport->txq_model);
+
for (i = 0, qv_idx = 0; i < num_txq_grp; i++) {
u16 num_txq;

+ if (qv_idx >= vport->num_q_vectors)
+ qv_idx = 0;
+
tx_qgrp = &vport->txq_grps[i];
num_txq = tx_qgrp->num_txq;

- if (idpf_is_queue_model_split(vport->txq_model)) {
- if (qv_idx >= vport->num_q_vectors)
- qv_idx = 0;
+ for (u32 j = 0; j < num_txq; j++) {
+ struct idpf_tx_queue *q;

- q = tx_qgrp->complq;
+ q = tx_qgrp->txqs[j];
q->q_vector = &vport->q_vectors[qv_idx];
- q_index = q->q_vector->num_txq;
- q->q_vector->tx[q_index] = q;
- q->q_vector->num_txq++;
- qv_idx++;
- } else {
- for (j = 0; j < num_txq; j++) {
- if (qv_idx >= vport->num_q_vectors)
- qv_idx = 0;
+ q->q_vector->tx[q->q_vector->num_txq++] = q;
+ }

- q = tx_qgrp->txqs[j];
- q->q_vector = &vport->q_vectors[qv_idx];
- q_index = q->q_vector->num_txq;
- q->q_vector->tx[q_index] = q;
- q->q_vector->num_txq++;
+ if (split) {
+ struct idpf_compl_queue *q = tx_qgrp->complq;

- qv_idx++;
- }
+ q->q_vector = &vport->q_vectors[qv_idx];
+ q->q_vector->complq[q->q_vector->num_complq++] = q;
}
+
+ qv_idx++;
}
}

@@ -4099,18 +4270,22 @@ int idpf_vport_intr_alloc(struct idpf_vport *vport)
{
u16 txqs_per_vector, rxqs_per_vector, bufqs_per_vector;
struct idpf_q_vector *q_vector;
- int v_idx, err;
+ u32 complqs_per_vector, v_idx;

vport->q_vectors = kcalloc(vport->num_q_vectors,
sizeof(struct idpf_q_vector), GFP_KERNEL);
if (!vport->q_vectors)
return -ENOMEM;

- txqs_per_vector = DIV_ROUND_UP(vport->num_txq, vport->num_q_vectors);
- rxqs_per_vector = DIV_ROUND_UP(vport->num_rxq, vport->num_q_vectors);
+ txqs_per_vector = DIV_ROUND_UP(vport->num_txq_grp,
+ vport->num_q_vectors);
+ rxqs_per_vector = DIV_ROUND_UP(vport->num_rxq_grp,
+ vport->num_q_vectors);
bufqs_per_vector = vport->num_bufqs_per_qgrp *
DIV_ROUND_UP(vport->num_rxq_grp,
vport->num_q_vectors);
+ complqs_per_vector = DIV_ROUND_UP(vport->num_txq_grp,
+ vport->num_q_vectors);

for (v_idx = 0; v_idx < vport->num_q_vectors; v_idx++) {
q_vector = &vport->q_vectors[v_idx];
@@ -4124,32 +4299,30 @@ int idpf_vport_intr_alloc(struct idpf_vport *vport)
q_vector->rx_intr_mode = IDPF_ITR_DYNAMIC;
q_vector->rx_itr_idx = VIRTCHNL2_ITR_IDX_0;

- q_vector->tx = kcalloc(txqs_per_vector,
- sizeof(struct idpf_queue *),
+ q_vector->tx = kcalloc(txqs_per_vector, sizeof(*q_vector->tx),
GFP_KERNEL);
- if (!q_vector->tx) {
- err = -ENOMEM;
+ if (!q_vector->tx)
goto error;
- }

- q_vector->rx = kcalloc(rxqs_per_vector,
- sizeof(struct idpf_queue *),
+ q_vector->rx = kcalloc(rxqs_per_vector, sizeof(*q_vector->rx),
GFP_KERNEL);
- if (!q_vector->rx) {
- err = -ENOMEM;
+ if (!q_vector->rx)
goto error;
- }

if (!idpf_is_queue_model_split(vport->rxq_model))
continue;

q_vector->bufq = kcalloc(bufqs_per_vector,
- sizeof(struct idpf_queue *),
+ sizeof(*q_vector->bufq),
GFP_KERNEL);
- if (!q_vector->bufq) {
- err = -ENOMEM;
+ if (!q_vector->bufq)
+ goto error;
+
+ q_vector->complq = kcalloc(complqs_per_vector,
+ sizeof(*q_vector->complq),
+ GFP_KERNEL);
+ if (!q_vector->complq)
goto error;
- }
}

return 0;
@@ -4157,7 +4330,7 @@ int idpf_vport_intr_alloc(struct idpf_vport *vport)
error:
idpf_vport_intr_rel(vport);

- return err;
+ return -ENOMEM;
}

/**
diff --git a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
index a5f9b7a5effe..44602b87cd41 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
@@ -750,7 +750,7 @@ static int idpf_wait_for_marker_event(struct idpf_vport *vport)
int i;

for (i = 0; i < vport->num_txq; i++)
- set_bit(__IDPF_Q_SW_MARKER, vport->txqs[i]->flags);
+ idpf_queue_set(SW_MARKER, vport->txqs[i]);

event = wait_event_timeout(vport->sw_marker_wq,
test_and_clear_bit(IDPF_VPORT_SW_MARKER,
@@ -758,7 +758,7 @@ static int idpf_wait_for_marker_event(struct idpf_vport *vport)
msecs_to_jiffies(500));

for (i = 0; i < vport->num_txq; i++)
- clear_bit(__IDPF_Q_POLL_MODE, vport->txqs[i]->flags);
+ idpf_queue_clear(POLL_MODE, vport->txqs[i]);

if (event)
return 0;
@@ -1092,7 +1092,6 @@ static int __idpf_queue_reg_init(struct idpf_vport *vport, u32 *reg_vals,
int num_regs, u32 q_type)
{
struct idpf_adapter *adapter = vport->adapter;
- struct idpf_queue *q;
int i, j, k = 0;

switch (q_type) {
@@ -1111,6 +1110,8 @@ static int __idpf_queue_reg_init(struct idpf_vport *vport, u32 *reg_vals,
u16 num_rxq = rx_qgrp->singleq.num_rxq;

for (j = 0; j < num_rxq && k < num_regs; j++, k++) {
+ struct idpf_rx_queue *q;
+
q = rx_qgrp->singleq.rxqs[j];
q->tail = idpf_get_reg_addr(adapter,
reg_vals[k]);
@@ -1123,6 +1124,8 @@ static int __idpf_queue_reg_init(struct idpf_vport *vport, u32 *reg_vals,
u8 num_bufqs = vport->num_bufqs_per_qgrp;

for (j = 0; j < num_bufqs && k < num_regs; j++, k++) {
+ struct idpf_buf_queue *q;
+
q = &rx_qgrp->splitq.bufq_sets[j].bufq;
q->tail = idpf_get_reg_addr(adapter,
reg_vals[k]);
@@ -1449,19 +1452,19 @@ static int idpf_send_config_tx_queues_msg(struct idpf_vport *vport)
qi[k].model =
cpu_to_le16(vport->txq_model);
qi[k].type =
- cpu_to_le32(tx_qgrp->txqs[j]->q_type);
+ cpu_to_le32(VIRTCHNL2_QUEUE_TYPE_TX);
qi[k].ring_len =
cpu_to_le16(tx_qgrp->txqs[j]->desc_count);
qi[k].dma_ring_addr =
cpu_to_le64(tx_qgrp->txqs[j]->dma);
if (idpf_is_queue_model_split(vport->txq_model)) {
- struct idpf_queue *q = tx_qgrp->txqs[j];
+ struct idpf_tx_queue *q = tx_qgrp->txqs[j];

qi[k].tx_compl_queue_id =
cpu_to_le16(tx_qgrp->complq->q_id);
qi[k].relative_queue_id = cpu_to_le16(j);

- if (test_bit(__IDPF_Q_FLOW_SCH_EN, q->flags))
+ if (idpf_queue_has(FLOW_SCH_EN, q))
qi[k].sched_mode =
cpu_to_le16(VIRTCHNL2_TXQ_SCHED_MODE_FLOW);
else
@@ -1478,11 +1481,11 @@ static int idpf_send_config_tx_queues_msg(struct idpf_vport *vport)

qi[k].queue_id = cpu_to_le32(tx_qgrp->complq->q_id);
qi[k].model = cpu_to_le16(vport->txq_model);
- qi[k].type = cpu_to_le32(tx_qgrp->complq->q_type);
+ qi[k].type = cpu_to_le32(VIRTCHNL2_QUEUE_TYPE_TX_COMPLETION);
qi[k].ring_len = cpu_to_le16(tx_qgrp->complq->desc_count);
qi[k].dma_ring_addr = cpu_to_le64(tx_qgrp->complq->dma);

- if (test_bit(__IDPF_Q_FLOW_SCH_EN, tx_qgrp->complq->flags))
+ if (idpf_queue_has(FLOW_SCH_EN, tx_qgrp->complq))
sched_mode = VIRTCHNL2_TXQ_SCHED_MODE_FLOW;
else
sched_mode = VIRTCHNL2_TXQ_SCHED_MODE_QUEUE;
@@ -1567,17 +1570,18 @@ static int idpf_send_config_rx_queues_msg(struct idpf_vport *vport)
goto setup_rxqs;

for (j = 0; j < vport->num_bufqs_per_qgrp; j++, k++) {
- struct idpf_queue *bufq =
+ struct idpf_buf_queue *bufq =
&rx_qgrp->splitq.bufq_sets[j].bufq;

qi[k].queue_id = cpu_to_le32(bufq->q_id);
qi[k].model = cpu_to_le16(vport->rxq_model);
- qi[k].type = cpu_to_le32(bufq->q_type);
+ qi[k].type =
+ cpu_to_le32(VIRTCHNL2_QUEUE_TYPE_RX_BUFFER);
qi[k].desc_ids = cpu_to_le64(VIRTCHNL2_RXDID_2_FLEX_SPLITQ_M);
qi[k].ring_len = cpu_to_le16(bufq->desc_count);
qi[k].dma_ring_addr = cpu_to_le64(bufq->dma);
qi[k].data_buffer_size = cpu_to_le32(bufq->rx_buf_size);
- qi[k].buffer_notif_stride = bufq->rx_buf_stride;
+ qi[k].buffer_notif_stride = IDPF_RX_BUF_STRIDE;
qi[k].rx_buffer_low_watermark =
cpu_to_le16(bufq->rx_buffer_low_watermark);
if (idpf_is_feature_ena(vport, NETIF_F_GRO_HW))
@@ -1591,7 +1595,7 @@ static int idpf_send_config_rx_queues_msg(struct idpf_vport *vport)
num_rxq = rx_qgrp->singleq.num_rxq;

for (j = 0; j < num_rxq; j++, k++) {
- struct idpf_queue *rxq;
+ struct idpf_rx_queue *rxq;

if (!idpf_is_queue_model_split(vport->rxq_model)) {
rxq = rx_qgrp->singleq.rxqs[j];
@@ -1599,11 +1603,11 @@ static int idpf_send_config_rx_queues_msg(struct idpf_vport *vport)
}
rxq = &rx_qgrp->splitq.rxq_sets[j]->rxq;
qi[k].rx_bufq1_id =
- cpu_to_le16(rxq->rxq_grp->splitq.bufq_sets[0].bufq.q_id);
+ cpu_to_le16(rxq->bufq_sets[0].bufq.q_id);
if (vport->num_bufqs_per_qgrp > IDPF_SINGLE_BUFQ_PER_RXQ_GRP) {
qi[k].bufq2_ena = IDPF_BUFQ2_ENA;
qi[k].rx_bufq2_id =
- cpu_to_le16(rxq->rxq_grp->splitq.bufq_sets[1].bufq.q_id);
+ cpu_to_le16(rxq->bufq_sets[1].bufq.q_id);
}
qi[k].rx_buffer_low_watermark =
cpu_to_le16(rxq->rx_buffer_low_watermark);
@@ -1611,7 +1615,7 @@ static int idpf_send_config_rx_queues_msg(struct idpf_vport *vport)
qi[k].qflags |= cpu_to_le16(VIRTCHNL2_RXQ_RSC);

common_qi_fields:
- if (rxq->rx_hsplit_en) {
+ if (idpf_queue_has(HSPLIT_EN, rxq)) {
qi[k].qflags |=
cpu_to_le16(VIRTCHNL2_RXQ_HDR_SPLIT);
qi[k].hdr_buffer_size =
@@ -1619,7 +1623,7 @@ static int idpf_send_config_rx_queues_msg(struct idpf_vport *vport)
}
qi[k].queue_id = cpu_to_le32(rxq->q_id);
qi[k].model = cpu_to_le16(vport->rxq_model);
- qi[k].type = cpu_to_le32(rxq->q_type);
+ qi[k].type = cpu_to_le32(VIRTCHNL2_QUEUE_TYPE_RX);
qi[k].ring_len = cpu_to_le16(rxq->desc_count);
qi[k].dma_ring_addr = cpu_to_le64(rxq->dma);
qi[k].max_pkt_size = cpu_to_le32(rxq->rx_max_pkt_size);
@@ -1706,7 +1710,7 @@ static int idpf_send_ena_dis_queues_msg(struct idpf_vport *vport, bool ena)
struct idpf_txq_group *tx_qgrp = &vport->txq_grps[i];

for (j = 0; j < tx_qgrp->num_txq; j++, k++) {
- qc[k].type = cpu_to_le32(tx_qgrp->txqs[j]->q_type);
+ qc[k].type = cpu_to_le32(VIRTCHNL2_QUEUE_TYPE_TX);
qc[k].start_queue_id = cpu_to_le32(tx_qgrp->txqs[j]->q_id);
qc[k].num_queues = cpu_to_le32(IDPF_NUMQ_PER_CHUNK);
}
@@ -1720,7 +1724,7 @@ static int idpf_send_ena_dis_queues_msg(struct idpf_vport *vport, bool ena)
for (i = 0; i < vport->num_txq_grp; i++, k++) {
struct idpf_txq_group *tx_qgrp = &vport->txq_grps[i];

- qc[k].type = cpu_to_le32(tx_qgrp->complq->q_type);
+ qc[k].type = cpu_to_le32(VIRTCHNL2_QUEUE_TYPE_TX_COMPLETION);
qc[k].start_queue_id = cpu_to_le32(tx_qgrp->complq->q_id);
qc[k].num_queues = cpu_to_le32(IDPF_NUMQ_PER_CHUNK);
}
@@ -1741,12 +1745,12 @@ static int idpf_send_ena_dis_queues_msg(struct idpf_vport *vport, bool ena)
qc[k].start_queue_id =
cpu_to_le32(rx_qgrp->splitq.rxq_sets[j]->rxq.q_id);
qc[k].type =
- cpu_to_le32(rx_qgrp->splitq.rxq_sets[j]->rxq.q_type);
+ cpu_to_le32(VIRTCHNL2_QUEUE_TYPE_RX);
} else {
qc[k].start_queue_id =
cpu_to_le32(rx_qgrp->singleq.rxqs[j]->q_id);
qc[k].type =
- cpu_to_le32(rx_qgrp->singleq.rxqs[j]->q_type);
+ cpu_to_le32(VIRTCHNL2_QUEUE_TYPE_RX);
}
qc[k].num_queues = cpu_to_le32(IDPF_NUMQ_PER_CHUNK);
}
@@ -1761,10 +1765,11 @@ static int idpf_send_ena_dis_queues_msg(struct idpf_vport *vport, bool ena)
struct idpf_rxq_group *rx_qgrp = &vport->rxq_grps[i];

for (j = 0; j < vport->num_bufqs_per_qgrp; j++, k++) {
- struct idpf_queue *q;
+ const struct idpf_buf_queue *q;

q = &rx_qgrp->splitq.bufq_sets[j].bufq;
- qc[k].type = cpu_to_le32(q->q_type);
+ qc[k].type =
+ cpu_to_le32(VIRTCHNL2_QUEUE_TYPE_RX_BUFFER);
qc[k].start_queue_id = cpu_to_le32(q->q_id);
qc[k].num_queues = cpu_to_le32(IDPF_NUMQ_PER_CHUNK);
}
@@ -1849,7 +1854,8 @@ int idpf_send_map_unmap_queue_vector_msg(struct idpf_vport *vport, bool map)
struct idpf_txq_group *tx_qgrp = &vport->txq_grps[i];

for (j = 0; j < tx_qgrp->num_txq; j++, k++) {
- vqv[k].queue_type = cpu_to_le32(tx_qgrp->txqs[j]->q_type);
+ vqv[k].queue_type =
+ cpu_to_le32(VIRTCHNL2_QUEUE_TYPE_TX);
vqv[k].queue_id = cpu_to_le32(tx_qgrp->txqs[j]->q_id);

if (idpf_is_queue_model_split(vport->txq_model)) {
@@ -1879,14 +1885,15 @@ int idpf_send_map_unmap_queue_vector_msg(struct idpf_vport *vport, bool map)
num_rxq = rx_qgrp->singleq.num_rxq;

for (j = 0; j < num_rxq; j++, k++) {
- struct idpf_queue *rxq;
+ struct idpf_rx_queue *rxq;

if (idpf_is_queue_model_split(vport->rxq_model))
rxq = &rx_qgrp->splitq.rxq_sets[j]->rxq;
else
rxq = rx_qgrp->singleq.rxqs[j];

- vqv[k].queue_type = cpu_to_le32(rxq->q_type);
+ vqv[k].queue_type =
+ cpu_to_le32(VIRTCHNL2_QUEUE_TYPE_RX);
vqv[k].queue_id = cpu_to_le32(rxq->q_id);
vqv[k].vector_id = cpu_to_le16(rxq->q_vector->v_idx);
vqv[k].itr_idx = cpu_to_le32(rxq->q_vector->rx_itr_idx);
@@ -1975,7 +1982,7 @@ int idpf_send_disable_queues_msg(struct idpf_vport *vport)
* queues virtchnl message is sent
*/
for (i = 0; i < vport->num_txq; i++)
- set_bit(__IDPF_Q_POLL_MODE, vport->txqs[i]->flags);
+ idpf_queue_set(POLL_MODE, vport->txqs[i]);

/* schedule the napi to receive all the marker packets */
local_bh_disable();
@@ -3242,7 +3249,6 @@ static int __idpf_vport_queue_ids_init(struct idpf_vport *vport,
int num_qids,
u32 q_type)
{
- struct idpf_queue *q;
int i, j, k = 0;

switch (q_type) {
@@ -3250,11 +3256,8 @@ static int __idpf_vport_queue_ids_init(struct idpf_vport *vport,
for (i = 0; i < vport->num_txq_grp; i++) {
struct idpf_txq_group *tx_qgrp = &vport->txq_grps[i];

- for (j = 0; j < tx_qgrp->num_txq && k < num_qids; j++, k++) {
+ for (j = 0; j < tx_qgrp->num_txq && k < num_qids; j++, k++)
tx_qgrp->txqs[j]->q_id = qids[k];
- tx_qgrp->txqs[j]->q_type =
- VIRTCHNL2_QUEUE_TYPE_TX;
- }
}
break;
case VIRTCHNL2_QUEUE_TYPE_RX:
@@ -3268,12 +3271,13 @@ static int __idpf_vport_queue_ids_init(struct idpf_vport *vport,
num_rxq = rx_qgrp->singleq.num_rxq;

for (j = 0; j < num_rxq && k < num_qids; j++, k++) {
+ struct idpf_rx_queue *q;
+
if (idpf_is_queue_model_split(vport->rxq_model))
q = &rx_qgrp->splitq.rxq_sets[j]->rxq;
else
q = rx_qgrp->singleq.rxqs[j];
q->q_id = qids[k];
- q->q_type = VIRTCHNL2_QUEUE_TYPE_RX;
}
}
break;
@@ -3282,8 +3286,6 @@ static int __idpf_vport_queue_ids_init(struct idpf_vport *vport,
struct idpf_txq_group *tx_qgrp = &vport->txq_grps[i];

tx_qgrp->complq->q_id = qids[k];
- tx_qgrp->complq->q_type =
- VIRTCHNL2_QUEUE_TYPE_TX_COMPLETION;
}
break;
case VIRTCHNL2_QUEUE_TYPE_RX_BUFFER:
@@ -3292,9 +3294,10 @@ static int __idpf_vport_queue_ids_init(struct idpf_vport *vport,
u8 num_bufqs = vport->num_bufqs_per_qgrp;

for (j = 0; j < num_bufqs && k < num_qids; j++, k++) {
+ struct idpf_buf_queue *q;
+
q = &rx_qgrp->splitq.bufq_sets[j].bufq;
q->q_id = qids[k];
- q->q_type = VIRTCHNL2_QUEUE_TYPE_RX_BUFFER;
}
}
break;
--
2.45.1


2024-05-28 13:52:18

by Alexander Lobakin

[permalink] [raw]
Subject: [PATCH iwl-next 09/12] idpf: remove legacy Page Pool Ethtool stats

Page Pool Ethtool stats have been deprecated since the introduction of
the Netlink Page Pool interface.
idpf is getting big changes to its Rx buffer management, including the
&page_pool layout, so keeping these deprecated stats only does harm.
On top of that, CONFIG_IDPF selects CONFIG_PAGE_POOL_STATS
unconditionally, while the latter is often turned off for better
performance.
Remove all references to PP stats from the Ethtool code. The stats are
still available in full via the generic Netlink interface.

Reviewed-by: Przemek Kitszel <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
drivers/net/ethernet/intel/idpf/Kconfig | 1 -
.../net/ethernet/intel/idpf/idpf_ethtool.c | 29 +------------------
2 files changed, 1 insertion(+), 29 deletions(-)

diff --git a/drivers/net/ethernet/intel/idpf/Kconfig b/drivers/net/ethernet/intel/idpf/Kconfig
index 638484c5723c..1f071143d992 100644
--- a/drivers/net/ethernet/intel/idpf/Kconfig
+++ b/drivers/net/ethernet/intel/idpf/Kconfig
@@ -7,7 +7,6 @@ config IDPF
select DIMLIB
select LIBETH
select PAGE_POOL
- select PAGE_POOL_STATS
help
This driver supports Intel(R) Infrastructure Data Path Function
devices.
diff --git a/drivers/net/ethernet/intel/idpf/idpf_ethtool.c b/drivers/net/ethernet/intel/idpf/idpf_ethtool.c
index e933fed16c7e..3806ddd3ce4a 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_ethtool.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_ethtool.c
@@ -565,8 +565,6 @@ static void idpf_get_stat_strings(struct net_device *netdev, u8 *data)
for (i = 0; i < vport_config->max_q.max_rxq; i++)
idpf_add_qstat_strings(&data, idpf_gstrings_rx_queue_stats,
"rx", i);
-
- page_pool_ethtool_stats_get_strings(data);
}

/**
@@ -600,7 +598,6 @@ static int idpf_get_sset_count(struct net_device *netdev, int sset)
struct idpf_netdev_priv *np = netdev_priv(netdev);
struct idpf_vport_config *vport_config;
u16 max_txq, max_rxq;
- unsigned int size;

if (sset != ETH_SS_STATS)
return -EINVAL;
@@ -619,11 +616,8 @@ static int idpf_get_sset_count(struct net_device *netdev, int sset)
max_txq = vport_config->max_q.max_txq;
max_rxq = vport_config->max_q.max_rxq;

- size = IDPF_PORT_STATS_LEN + (IDPF_TX_QUEUE_STATS_LEN * max_txq) +
+ return IDPF_PORT_STATS_LEN + (IDPF_TX_QUEUE_STATS_LEN * max_txq) +
(IDPF_RX_QUEUE_STATS_LEN * max_rxq);
- size += page_pool_ethtool_stats_get_count();
-
- return size;
}

/**
@@ -876,7 +870,6 @@ static void idpf_get_ethtool_stats(struct net_device *netdev,
{
struct idpf_netdev_priv *np = netdev_priv(netdev);
struct idpf_vport_config *vport_config;
- struct page_pool_stats pp_stats = { };
struct idpf_vport *vport;
unsigned int total = 0;
unsigned int i, j;
@@ -946,32 +939,12 @@ static void idpf_get_ethtool_stats(struct net_device *netdev,
idpf_add_empty_queue_stats(&data, qtype);
else
idpf_add_queue_stats(&data, rxq, qtype);
-
- /* In splitq mode, don't get page pool stats here since
- * the pools are attached to the buffer queues
- */
- if (is_splitq)
- continue;
-
- if (rxq)
- page_pool_get_stats(rxq->pp, &pp_stats);
- }
- }
-
- for (i = 0; i < vport->num_rxq_grp; i++) {
- for (j = 0; j < vport->num_bufqs_per_qgrp; j++) {
- struct idpf_buf_queue *rxbufq =
- &vport->rxq_grps[i].splitq.bufq_sets[j].bufq;
-
- page_pool_get_stats(rxbufq->pp, &pp_stats);
}
}

for (; total < vport_config->max_q.max_rxq; total++)
idpf_add_empty_queue_stats(&data, VIRTCHNL2_QUEUE_TYPE_RX);

- page_pool_ethtool_stats_get(data, &pp_stats);
-
rcu_read_unlock();

idpf_vport_ctrl_unlock(netdev);
--
2.45.1


2024-05-28 13:52:24

by Alexander Lobakin

[permalink] [raw]
Subject: [PATCH iwl-next 10/12] libeth: support different types of buffers for Rx

Unlike previous generations, idpf requires more buffer types for optimal
performance: header buffers, short buffers, and no-overhead buffers
(w/o headroom and tailroom, for TCP zerocopy when header split is
enabled).
Introduce a libeth Rx buffer type and calculate the page_pool params
accordingly. All the HW-related details like buffer alignment are still
accounted for. For the header buffers, pick 256 bytes as in most places
in the kernel (have you ever seen frames with bigger headers?).
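
For illustration, here is a minimal sketch of how a driver is expected to
request the new buffer types (not part of the patch; the example_rxq
structure and the napi/descs/NUMA choices are hypothetical placeholders,
only the libeth_* names and fields come from this series):

#include <net/libeth/rx.h>

/* hypothetical driver-private queue state, for illustration only */
struct example_rxq {
	struct page_pool *hdr_pp, *pp;
	struct libeth_fqe *hdr_buf, *buf;
	u32 truesize, buf_len;
};

static int example_rx_bufs_init(struct example_rxq *rxq,
				struct napi_struct *napi, u32 descs)
{
	struct libeth_fq hdr = {
		.count	= descs,
		.type	= LIBETH_FQE_HDR,	/* ~256-byte header buffers */
		.nid	= NUMA_NO_NODE,
	};
	struct libeth_fq pl = {
		.count	= descs,
		.type	= LIBETH_FQE_MTU,	/* payload sized from the netdev MTU */
		.hsplit	= true,			/* header split on: no-overhead buffers */
		.nid	= NUMA_NO_NODE,
	};
	int err;

	err = libeth_rx_fq_create(&hdr, napi);
	if (err)
		return err;

	err = libeth_rx_fq_create(&pl, napi);
	if (err) {
		libeth_rx_fq_destroy(&hdr);
		return err;
	}

	/* libeth fills in the pool, the buffer array and the final sizes */
	rxq->hdr_pp = hdr.pp;
	rxq->hdr_buf = hdr.fqes;
	rxq->pp = pl.pp;
	rxq->buf = pl.fqes;
	rxq->truesize = pl.truesize;
	rxq->buf_len = pl.buf_len;

	return 0;
}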

Reviewed-by: Przemek Kitszel <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
include/net/libeth/rx.h | 19 ++++
drivers/net/ethernet/intel/libeth/rx.c | 132 ++++++++++++++++++++++---
2 files changed, 140 insertions(+), 11 deletions(-)

diff --git a/include/net/libeth/rx.h b/include/net/libeth/rx.h
index f29ea3e34c6c..43574bd6612f 100644
--- a/include/net/libeth/rx.h
+++ b/include/net/libeth/rx.h
@@ -17,6 +17,8 @@
#define LIBETH_MAX_HEADROOM LIBETH_SKB_HEADROOM
/* Link layer / L2 overhead: Ethernet, 2 VLAN tags (C + S), FCS */
#define LIBETH_RX_LL_LEN (ETH_HLEN + 2 * VLAN_HLEN + ETH_FCS_LEN)
+/* Maximum supported L2-L4 header length */
+#define LIBETH_MAX_HEAD roundup_pow_of_two(max(MAX_HEADER, 256))

/* Always use order-0 pages */
#define LIBETH_RX_PAGE_ORDER 0
@@ -43,6 +45,18 @@ struct libeth_fqe {
u32 truesize;
} __aligned_largest;

+/**
+ * enum libeth_fqe_type - enum representing types of Rx buffers
+ * @LIBETH_FQE_MTU: buffer size is determined by MTU
+ * @LIBETH_FQE_SHORT: buffer size is smaller than MTU, for short frames
+ * @LIBETH_FQE_HDR: buffer size is ``LIBETH_MAX_HEAD``-sized, for headers
+ */
+enum libeth_fqe_type {
+ LIBETH_FQE_MTU = 0U,
+ LIBETH_FQE_SHORT,
+ LIBETH_FQE_HDR,
+};
+
/**
* struct libeth_fq - structure representing a buffer (fill) queue
* @fp: hotpath part of the structure
@@ -50,6 +64,8 @@ struct libeth_fqe {
* @fqes: array of Rx buffers
* @truesize: size to allocate per buffer, w/overhead
* @count: number of descriptors/buffers the queue has
+ * @type: type of the buffers this queue has
+ * @hsplit: flag whether header split is enabled
* @buf_len: HW-writeable length per each buffer
* @nid: ID of the closest NUMA node with memory
*/
@@ -63,6 +79,9 @@ struct libeth_fq {
);

/* Cold fields */
+ enum libeth_fqe_type type:2;
+ bool hsplit:1;
+
u32 buf_len;
int nid;
};
diff --git a/drivers/net/ethernet/intel/libeth/rx.c b/drivers/net/ethernet/intel/libeth/rx.c
index 6221b88c34ac..d0b158b6e55b 100644
--- a/drivers/net/ethernet/intel/libeth/rx.c
+++ b/drivers/net/ethernet/intel/libeth/rx.c
@@ -6,7 +6,7 @@
/* Rx buffer management */

/**
- * libeth_rx_hw_len - get the actual buffer size to be passed to HW
+ * libeth_rx_hw_len_mtu - get the actual buffer size to be passed to HW
* @pp: &page_pool_params of the netdev to calculate the size for
* @max_len: maximum buffer size for a single descriptor
*
@@ -14,7 +14,7 @@
* MTU the @dev has, HW required alignment, minimum and maximum allowed values,
* and system's page size.
*/
-static u32 libeth_rx_hw_len(const struct page_pool_params *pp, u32 max_len)
+static u32 libeth_rx_hw_len_mtu(const struct page_pool_params *pp, u32 max_len)
{
u32 len;

@@ -26,6 +26,118 @@ static u32 libeth_rx_hw_len(const struct page_pool_params *pp, u32 max_len)
return len;
}

+/**
+ * libeth_rx_hw_len_truesize - get the short buffer size to be passed to HW
+ * @pp: &page_pool_params of the netdev to calculate the size for
+ * @max_len: maximum buffer size for a single descriptor
+ * @truesize: desired truesize for the buffers
+ *
+ * Return: HW-writeable length per one buffer to pass it to the HW ignoring the
+ * MTU and closest to the passed truesize. Can be used for "short" buffer
+ * queues to fragment pages more efficiently.
+ */
+static u32 libeth_rx_hw_len_truesize(const struct page_pool_params *pp,
+ u32 max_len, u32 truesize)
+{
+ u32 min, len;
+
+ min = SKB_HEAD_ALIGN(pp->offset + LIBETH_RX_BUF_STRIDE);
+ truesize = clamp(roundup_pow_of_two(truesize), roundup_pow_of_two(min),
+ PAGE_SIZE << LIBETH_RX_PAGE_ORDER);
+
+ len = SKB_WITH_OVERHEAD(truesize - pp->offset);
+ len = ALIGN_DOWN(len, LIBETH_RX_BUF_STRIDE) ? : LIBETH_RX_BUF_STRIDE;
+ len = min3(len, ALIGN_DOWN(max_len ? : U32_MAX, LIBETH_RX_BUF_STRIDE),
+ pp->max_len);
+
+ return len;
+}
+
+/**
+ * libeth_rx_page_pool_params - calculate params with the stack overhead
+ * @fq: buffer queue to calculate the size for
+ * @pp: &page_pool_params of the netdev
+ *
+ * Set the PP params to account for all needed stack overhead (headroom, tailroom)
+ * and set both the HW buffer length and the truesize for all types of buffers. For
+ * "short" buffers, truesize never exceeds the "wanted" one; for the rest,
+ * it can be up to the page size.
+ *
+ * Return: true on success, false on invalid input params.
+ */
+static bool libeth_rx_page_pool_params(struct libeth_fq *fq,
+ struct page_pool_params *pp)
+{
+ pp->offset = LIBETH_SKB_HEADROOM;
+ /* HW-writeable / syncable length per one page */
+ pp->max_len = LIBETH_RX_PAGE_LEN(pp->offset);
+
+ /* HW-writeable length per buffer */
+ switch (fq->type) {
+ case LIBETH_FQE_MTU:
+ fq->buf_len = libeth_rx_hw_len_mtu(pp, fq->buf_len);
+ break;
+ case LIBETH_FQE_SHORT:
+ fq->buf_len = libeth_rx_hw_len_truesize(pp, fq->buf_len,
+ fq->truesize);
+ break;
+ case LIBETH_FQE_HDR:
+ fq->buf_len = ALIGN(LIBETH_MAX_HEAD, LIBETH_RX_BUF_STRIDE);
+ break;
+ default:
+ return false;
+ }
+
+ /* Buffer size to allocate */
+ fq->truesize = roundup_pow_of_two(SKB_HEAD_ALIGN(pp->offset +
+ fq->buf_len));
+
+ return true;
+}
+
+/**
+ * libeth_rx_page_pool_params_zc - calculate params without the stack overhead
+ * @fq: buffer queue to calculate the size for
+ * @pp: &page_pool_params of the netdev
+ *
+ * Set the PP params to exclude the stack overhead and set both the buffer length
+ * and the truesize, which are equal for the data buffers. Note that this
+ * requires separate header buffers to be always active and to account for the
+ * overhead.
+ * With the MTU == ``PAGE_SIZE``, this allows the kernel to enable the zerocopy
+ * mode.
+ *
+ * Return: true on success, false on invalid input params.
+ */
+static bool libeth_rx_page_pool_params_zc(struct libeth_fq *fq,
+ struct page_pool_params *pp)
+{
+ u32 mtu, max;
+
+ pp->offset = 0;
+ pp->max_len = PAGE_SIZE << LIBETH_RX_PAGE_ORDER;
+
+ switch (fq->type) {
+ case LIBETH_FQE_MTU:
+ mtu = READ_ONCE(pp->netdev->mtu);
+ break;
+ case LIBETH_FQE_SHORT:
+ mtu = fq->truesize;
+ break;
+ default:
+ return false;
+ }
+
+ mtu = roundup_pow_of_two(mtu);
+ max = min(rounddown_pow_of_two(fq->buf_len ? : U32_MAX),
+ pp->max_len);
+
+ fq->buf_len = clamp(mtu, LIBETH_RX_BUF_STRIDE, max);
+ fq->truesize = fq->buf_len;
+
+ return true;
+}
+
/**
* libeth_rx_fq_create - create a PP with the default libeth settings
* @fq: buffer queue struct to fill
@@ -44,19 +156,17 @@ int libeth_rx_fq_create(struct libeth_fq *fq, struct napi_struct *napi)
.netdev = napi->dev,
.napi = napi,
.dma_dir = DMA_FROM_DEVICE,
- .offset = LIBETH_SKB_HEADROOM,
};
struct libeth_fqe *fqes;
struct page_pool *pool;
+ bool ret;

- /* HW-writeable / syncable length per one page */
- pp.max_len = LIBETH_RX_PAGE_LEN(pp.offset);
-
- /* HW-writeable length per buffer */
- fq->buf_len = libeth_rx_hw_len(&pp, fq->buf_len);
- /* Buffer size to allocate */
- fq->truesize = roundup_pow_of_two(SKB_HEAD_ALIGN(pp.offset +
- fq->buf_len));
+ if (!fq->hsplit)
+ ret = libeth_rx_page_pool_params(fq, &pp);
+ else
+ ret = libeth_rx_page_pool_params_zc(fq, &pp);
+ if (!ret)
+ return -EINVAL;

pool = page_pool_create(&pp);
if (IS_ERR(pool))
--
2.45.1


2024-05-28 13:53:16

by Alexander Lobakin

[permalink] [raw]
Subject: [PATCH iwl-next 12/12] idpf: use libeth Rx buffer management for payload buffer

idpf uses Page Pool for data buffers with hardcoded buffer lengths of
4k for "classic" buffers and 2k for "short" ones. This is not flexible
and does not ensure optimal memory usage. Why would you need 4k buffers
when the MTU is 1500?
Use libeth for the data buffers and don't hardcode any buffer sizes. Let
them be calculated from the MTU for "classics" and then divide the
truesize by 2 for "short" ones. The memory usage is now greatly reduced
and 2 buffer queues start to make sense: on frames <= 1024, you'll
recycle (and resync) a page only after 4 HW writes rather than two.
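
Condensed, the buffer queue sizing scheme looks roughly like this (a sketch
assuming two buffer queues per group; bufq_desc_count, napi and the place
where the results are stashed are hypothetical placeholders, only the
libeth_* names come from this series):

#include <net/libeth/rx.h>

/* bufq0 gets MTU-sized buffers, bufq1 gets "short" ones whose wanted
 * truesize is half of what bufq0 actually got.  With a 1500-byte MTU,
 * bufq0 typically ends up with a 2K truesize, so bufq1 asks for 1K,
 * i.e. four short buffers per 4K page.
 */
static int example_bufqs_init(struct napi_struct *napi, u32 bufq_desc_count)
{
	u32 truesize = 0;
	int err = 0;

	for (u32 i = 0; i < 2; i++) {
		struct libeth_fq fq = {
			.truesize = truesize,	/* only used for LIBETH_FQE_SHORT */
			.count	  = bufq_desc_count,
			.type	  = truesize ? LIBETH_FQE_SHORT : LIBETH_FQE_MTU,
			.nid	  = NUMA_NO_NODE,
		};

		err = libeth_rx_fq_create(&fq, napi);
		if (err)
			break;

		/* stash fq.pp, fq.fqes, fq.buf_len, fq.truesize in bufq[i] */

		truesize = fq.truesize >> 1;	/* the next queue wants half */
	}

	return err;
}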

Signed-off-by: Alexander Lobakin <[email protected]>
---
drivers/net/ethernet/intel/idpf/Kconfig | 1 -
drivers/net/ethernet/intel/idpf/idpf.h | 2 -
drivers/net/ethernet/intel/idpf/idpf_txrx.h | 86 +------
.../ethernet/intel/idpf/idpf_singleq_txrx.c | 25 +-
drivers/net/ethernet/intel/idpf/idpf_txrx.c | 240 ++++++------------
.../net/ethernet/intel/idpf/idpf_virtchnl.c | 8 +-
6 files changed, 118 insertions(+), 244 deletions(-)

diff --git a/drivers/net/ethernet/intel/idpf/Kconfig b/drivers/net/ethernet/intel/idpf/Kconfig
index 1f071143d992..1addd663acad 100644
--- a/drivers/net/ethernet/intel/idpf/Kconfig
+++ b/drivers/net/ethernet/intel/idpf/Kconfig
@@ -6,7 +6,6 @@ config IDPF
depends on PCI_MSI
select DIMLIB
select LIBETH
- select PAGE_POOL
help
This driver supports Intel(R) Infrastructure Data Path Function
devices.
diff --git a/drivers/net/ethernet/intel/idpf/idpf.h b/drivers/net/ethernet/intel/idpf/idpf.h
index 078340a01757..2c31ad87587a 100644
--- a/drivers/net/ethernet/intel/idpf/idpf.h
+++ b/drivers/net/ethernet/intel/idpf/idpf.h
@@ -264,7 +264,6 @@ struct idpf_port_stats {
* the worst case.
* @num_bufqs_per_qgrp: Buffer queues per RX queue in a given grouping
* @bufq_desc_count: Buffer queue descriptor count
- * @bufq_size: Size of buffers in ring (e.g. 2K, 4K, etc)
* @num_rxq_grp: Number of RX queues in a group
* @rxq_grps: Total number of RX groups. Number of groups * number of RX per
* group will yield total number of RX queues.
@@ -308,7 +307,6 @@ struct idpf_vport {
u32 rxq_desc_count;
u8 num_bufqs_per_qgrp;
u32 bufq_desc_count[IDPF_MAX_BUFQS_PER_RXQ_GRP];
- u32 bufq_size[IDPF_MAX_BUFQS_PER_RXQ_GRP];
u16 num_rxq_grp;
struct idpf_rxq_group *rxq_grps;
u32 rxq_model;
diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.h b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
index 86a1efc24caf..e0a2ad1edfc6 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.h
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
@@ -7,7 +7,6 @@
#include <linux/dim.h>

#include <net/libeth/cache.h>
-#include <net/libeth/rx.h>
#include <net/tcp.h>
#include <net/netdev_queues.h>

@@ -97,14 +96,10 @@ do { \
idx = 0; \
} while (0)

-#define IDPF_RX_HDR_SIZE 256
-#define IDPF_RX_BUF_2048 2048
-#define IDPF_RX_BUF_4096 4096
#define IDPF_RX_BUF_STRIDE 32
#define IDPF_RX_BUF_POST_STRIDE 16
#define IDPF_LOW_WATERMARK 64
-#define IDPF_PACKET_HDR_PAD \
- (ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN * 2)
+
#define IDPF_TX_TSO_MIN_MSS 88

/* Minimum number of descriptors between 2 descriptors with the RE bit set;
@@ -540,7 +535,7 @@ struct idpf_txq_stash {
* @desc_ring: virtual descriptor ring address
* @bufq_sets: Pointer to the array of buffer queues in splitq mode
* @napi: NAPI instance corresponding to this queue (splitq)
- * @rx_buf: See struct idpf_rx_buf
+ * @rx_buf: See struct &libeth_fqe
* @pp: Page pool pointer in singleq mode
* @netdev: &net_device corresponding to this queue
* @tail: Tail offset. Used for both queue models single and split.
@@ -555,6 +550,7 @@ struct idpf_txq_stash {
* @next_to_clean: Next descriptor to clean
* @next_to_alloc: RX buffer to allocate at
* @skb: Pointer to the skb
+ * @truesize: data buffer truesize in singleq
* @stats_sync: See struct u64_stats_sync
* @q_stats: See union idpf_rx_queue_stats
* @cold: CL group with fields no touched on hotpath
@@ -581,7 +577,7 @@ struct idpf_rx_queue {
struct napi_struct *napi;
};
struct {
- struct idpf_rx_buf *rx_buf;
+ struct libeth_fqe *rx_buf;
struct page_pool *pp;
};
};
@@ -601,6 +597,7 @@ struct idpf_rx_queue {
u16 next_to_alloc;

struct sk_buff *skb;
+ u32 truesize;

struct u64_stats_sync stats_sync;
struct idpf_rx_queue_stats q_stats;
@@ -619,7 +616,7 @@ struct idpf_rx_queue {
);
};
libeth_cacheline_set_assert(struct idpf_rx_queue, 64,
- 72 + sizeof(struct u64_stats_sync),
+ 80 + sizeof(struct u64_stats_sync),
32);

/**
@@ -748,16 +745,17 @@ libeth_cacheline_set_assert(struct idpf_tx_queue, 64,
* @split_buf: buffer descriptor array
* @hdr_buf: &libeth_fqe for header buffers
* @hdr_pp: &page_pool for header buffers
- * @buf: &idpf_rx_buf for data buffers
+ * @buf: &libeth_fqe for data buffers
* @pp: &page_pool for data buffers
* @tail: Tail offset
* @flags: See enum idpf_queue_flags_t
* @desc_count: Number of descriptors
- * @hdr_truesize: truesize for buffer headers
* @read_write: CL group with both read/write hot fields
* @next_to_use: Next descriptor to use
* @next_to_clean: Next descriptor to clean
* @next_to_alloc: RX buffer to allocate at
+ * @hdr_truesize: truesize for buffer headers
+ * @truesize: truesize for data buffers
* @cold: CL group with fields no touched on hotpath
* @q_id: Queue id
* @size: Length of descriptor ring in bytes
@@ -772,19 +770,20 @@ struct idpf_buf_queue {
struct virtchnl2_splitq_rx_buf_desc *split_buf;
struct libeth_fqe *hdr_buf;
struct page_pool *hdr_pp;
- struct idpf_rx_buf *buf;
+ struct libeth_fqe *buf;
struct page_pool *pp;
void __iomem *tail;

DECLARE_BITMAP(flags, __IDPF_Q_FLAGS_NBITS);
u32 desc_count;
-
- u32 hdr_truesize;
);
libeth_cacheline_group(read_write,
u32 next_to_use;
u32 next_to_clean;
u32 next_to_alloc;
+
+ u32 hdr_truesize;
+ u32 truesize;
);
libeth_cacheline_group(cold,
u32 q_id;
@@ -798,7 +797,7 @@ struct idpf_buf_queue {
u16 rx_buf_size;
);
};
-libeth_cacheline_set_assert(struct idpf_buf_queue, 64, 16, 32);
+libeth_cacheline_set_assert(struct idpf_buf_queue, 64, 24, 32);

/**
* struct idpf_compl_queue - software structure representing a completion queue
@@ -1040,60 +1039,6 @@ static inline void idpf_tx_splitq_build_desc(union idpf_tx_flex_desc *desc,
idpf_tx_splitq_build_flow_desc(desc, params, td_cmd, size);
}

-/**
- * idpf_alloc_page - Allocate a new RX buffer from the page pool
- * @pool: page_pool to allocate from
- * @buf: metadata struct to populate with page info
- * @buf_size: 2K or 4K
- *
- * Returns &dma_addr_t to be passed to HW for Rx, %DMA_MAPPING_ERROR otherwise.
- */
-static inline dma_addr_t idpf_alloc_page(struct page_pool *pool,
- struct idpf_rx_buf *buf,
- unsigned int buf_size)
-{
- if (buf_size == IDPF_RX_BUF_2048)
- buf->page = page_pool_dev_alloc_frag(pool, &buf->offset,
- buf_size);
- else
- buf->page = page_pool_dev_alloc_pages(pool);
-
- if (!buf->page)
- return DMA_MAPPING_ERROR;
-
- buf->truesize = buf_size;
-
- return page_pool_get_dma_addr(buf->page) + buf->offset +
- pool->p.offset;
-}
-
-/**
- * idpf_rx_put_page - Return RX buffer page to pool
- * @rx_buf: RX buffer metadata struct
- */
-static inline void idpf_rx_put_page(struct idpf_rx_buf *rx_buf)
-{
- page_pool_put_page(rx_buf->page->pp, rx_buf->page,
- rx_buf->truesize, true);
- rx_buf->page = NULL;
-}
-
-/**
- * idpf_rx_sync_for_cpu - Synchronize DMA buffer
- * @rx_buf: RX buffer metadata struct
- * @len: frame length from descriptor
- */
-static inline void idpf_rx_sync_for_cpu(struct idpf_rx_buf *rx_buf, u32 len)
-{
- struct page *page = rx_buf->page;
- struct page_pool *pp = page->pp;
-
- dma_sync_single_range_for_cpu(pp->p.dev,
- page_pool_get_dma_addr(page),
- rx_buf->offset + pp->p.offset, len,
- page_pool_get_dma_dir(pp));
-}
-
int idpf_vport_singleq_napi_poll(struct napi_struct *napi, int budget);
void idpf_vport_init_num_qs(struct idpf_vport *vport,
struct virtchnl2_create_vport *vport_msg);
@@ -1116,9 +1061,6 @@ void idpf_deinit_rss(struct idpf_vport *vport);
int idpf_rx_bufs_init_all(struct idpf_vport *vport);
void idpf_rx_add_frag(struct idpf_rx_buf *rx_buf, struct sk_buff *skb,
unsigned int size);
-struct sk_buff *idpf_rx_construct_skb(const struct idpf_rx_queue *rxq,
- struct idpf_rx_buf *rx_buf,
- unsigned int size);
struct sk_buff *idpf_rx_build_skb(const struct libeth_fqe *buf, u32 size);
void idpf_tx_buf_hw_update(struct idpf_tx_queue *tx_q, u32 val,
bool xmit_more);
diff --git a/drivers/net/ethernet/intel/idpf/idpf_singleq_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_singleq_txrx.c
index cde768082fc4..631a882c563a 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_singleq_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_singleq_txrx.c
@@ -858,20 +858,24 @@ bool idpf_rx_singleq_buf_hw_alloc_all(struct idpf_rx_queue *rx_q,
u16 cleaned_count)
{
struct virtchnl2_singleq_rx_buf_desc *desc;
+ const struct libeth_fq_fp fq = {
+ .pp = rx_q->pp,
+ .fqes = rx_q->rx_buf,
+ .truesize = rx_q->truesize,
+ .count = rx_q->desc_count,
+ };
u16 nta = rx_q->next_to_alloc;
- struct idpf_rx_buf *buf;

if (!cleaned_count)
return false;

desc = &rx_q->single_buf[nta];
- buf = &rx_q->rx_buf[nta];

do {
dma_addr_t addr;

- addr = idpf_alloc_page(rx_q->pp, buf, rx_q->rx_buf_size);
- if (unlikely(addr == DMA_MAPPING_ERROR))
+ addr = libeth_rx_alloc(&fq, nta);
+ if (addr == DMA_MAPPING_ERROR)
break;

/* Refresh the desc even if buffer_addrs didn't change
@@ -881,11 +885,9 @@ bool idpf_rx_singleq_buf_hw_alloc_all(struct idpf_rx_queue *rx_q,
desc->hdr_addr = 0;
desc++;

- buf++;
nta++;
if (unlikely(nta == rx_q->desc_count)) {
desc = &rx_q->single_buf[0];
- buf = rx_q->rx_buf;
nta = 0;
}

@@ -1005,24 +1007,22 @@ static int idpf_rx_singleq_clean(struct idpf_rx_queue *rx_q, int budget)
idpf_rx_singleq_extract_fields(rx_q, rx_desc, &fields);

rx_buf = &rx_q->rx_buf[ntc];
- if (!fields.size) {
- idpf_rx_put_page(rx_buf);
+ if (!libeth_rx_sync_for_cpu(rx_buf, fields.size))
goto skip_data;
- }

- idpf_rx_sync_for_cpu(rx_buf, fields.size);
if (skb)
idpf_rx_add_frag(rx_buf, skb, fields.size);
else
- skb = idpf_rx_construct_skb(rx_q, rx_buf, fields.size);
+ skb = idpf_rx_build_skb(rx_buf, fields.size);

/* exit if we failed to retrieve a buffer */
if (!skb)
break;

skip_data:
- IDPF_SINGLEQ_BUMP_RING_IDX(rx_q, ntc);
+ rx_buf->page = NULL;

+ IDPF_SINGLEQ_BUMP_RING_IDX(rx_q, ntc);
cleaned_count++;

/* skip if it is non EOP desc */
@@ -1063,6 +1063,7 @@ static int idpf_rx_singleq_clean(struct idpf_rx_queue *rx_q, int budget)

rx_q->next_to_clean = ntc;

+ page_pool_nid_changed(rx_q->pp, numa_mem_id());
if (cleaned_count)
failure = idpf_rx_singleq_buf_hw_alloc_all(rx_q, cleaned_count);

diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
index 689668e0ca51..097926f7a10a 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
@@ -417,6 +417,11 @@ static void idpf_rx_hdr_buf_rel_all(struct idpf_buf_queue *bufq)
*/
static void idpf_rx_buf_rel_bufq(struct idpf_buf_queue *bufq)
{
+ struct libeth_fq fq = {
+ .fqes = bufq->buf,
+ .pp = bufq->pp,
+ };
+
/* queue already cleared, nothing to do */
if (!bufq->buf)
return;
@@ -428,11 +433,9 @@ static void idpf_rx_buf_rel_bufq(struct idpf_buf_queue *bufq)
if (idpf_queue_has(HSPLIT_EN, bufq))
idpf_rx_hdr_buf_rel_all(bufq);

- page_pool_destroy(bufq->pp);
- bufq->pp = NULL;
-
- kfree(bufq->buf);
+ libeth_rx_fq_destroy(&fq);
bufq->buf = NULL;
+ bufq->pp = NULL;
}

/**
@@ -441,17 +444,20 @@ static void idpf_rx_buf_rel_bufq(struct idpf_buf_queue *bufq)
*/
static void idpf_rx_buf_rel_all(struct idpf_rx_queue *rxq)
{
+ struct libeth_fq fq = {
+ .fqes = rxq->rx_buf,
+ .pp = rxq->pp,
+ };
+
if (!rxq->rx_buf)
return;

for (u32 i = 0; i < rxq->desc_count; i++)
idpf_rx_page_rel(&rxq->rx_buf[i]);

- page_pool_destroy(rxq->pp);
- rxq->pp = NULL;
-
- kfree(rxq->rx_buf);
+ libeth_rx_fq_destroy(&fq);
rxq->rx_buf = NULL;
+ rxq->pp = NULL;
}

/**
@@ -633,11 +639,9 @@ static bool idpf_rx_post_buf_desc(struct idpf_buf_queue *bufq, u16 buf_id)
.count = bufq->desc_count,
};
u16 nta = bufq->next_to_alloc;
- struct idpf_rx_buf *buf;
dma_addr_t addr;

splitq_rx_desc = &bufq->split_buf[nta];
- buf = &bufq->buf[buf_id];

if (idpf_queue_has(HSPLIT_EN, bufq)) {
fq.pp = bufq->hdr_pp;
@@ -651,8 +655,12 @@ static bool idpf_rx_post_buf_desc(struct idpf_buf_queue *bufq, u16 buf_id)
splitq_rx_desc->hdr_addr = cpu_to_le64(addr);
}

- addr = idpf_alloc_page(bufq->pp, buf, bufq->rx_buf_size);
- if (unlikely(addr == DMA_MAPPING_ERROR))
+ fq.pp = bufq->pp;
+ fq.fqes = bufq->buf;
+ fq.truesize = bufq->truesize;
+
+ addr = libeth_rx_alloc(&fq, buf_id);
+ if (addr == DMA_MAPPING_ERROR)
return false;

splitq_rx_desc->pkt_addr = cpu_to_le64(addr);
@@ -689,30 +697,6 @@ static bool idpf_rx_post_init_bufs(struct idpf_buf_queue *bufq,
return true;
}

-/**
- * idpf_rx_create_page_pool - Create a page pool
- * @napi: NAPI of the associated queue vector
- * @count: queue descriptor count
- *
- * Returns &page_pool on success, casted -errno on failure
- */
-static struct page_pool *idpf_rx_create_page_pool(struct napi_struct *napi,
- u32 count)
-{
- struct page_pool_params pp = {
- .flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
- .order = 0,
- .pool_size = count,
- .nid = NUMA_NO_NODE,
- .dev = napi->dev->dev.parent,
- .max_len = PAGE_SIZE,
- .dma_dir = DMA_FROM_DEVICE,
- .offset = 0,
- };
-
- return page_pool_create(&pp);
-}
-
/**
* idpf_rx_buf_alloc_singleq - Allocate memory for all buffer resources
* @rxq: queue for which the buffers are allocated
@@ -721,11 +705,6 @@ static struct page_pool *idpf_rx_create_page_pool(struct napi_struct *napi,
*/
static int idpf_rx_buf_alloc_singleq(struct idpf_rx_queue *rxq)
{
- rxq->rx_buf = kcalloc(rxq->desc_count, sizeof(*rxq->rx_buf),
- GFP_KERNEL);
- if (!rxq->rx_buf)
- return -ENOMEM;
-
if (idpf_rx_singleq_buf_hw_alloc_all(rxq, rxq->desc_count - 1))
goto err;

@@ -745,13 +724,21 @@ static int idpf_rx_buf_alloc_singleq(struct idpf_rx_queue *rxq)
*/
static int idpf_rx_bufs_init_singleq(struct idpf_rx_queue *rxq)
{
- struct page_pool *pool;
+ struct libeth_fq fq = {
+ .count = rxq->desc_count,
+ .type = LIBETH_FQE_MTU,
+ .nid = idpf_q_vector_to_mem(rxq->q_vector),
+ };
+ int ret;

- pool = idpf_rx_create_page_pool(&rxq->q_vector->napi, rxq->desc_count);
- if (IS_ERR(pool))
- return PTR_ERR(pool);
+ ret = libeth_rx_fq_create(&fq, &rxq->q_vector->napi);
+ if (ret)
+ return ret;

- rxq->pp = pool;
+ rxq->pp = fq.pp;
+ rxq->rx_buf = fq.fqes;
+ rxq->truesize = fq.truesize;
+ rxq->rx_buf_size = fq.buf_len;

return idpf_rx_buf_alloc_singleq(rxq);
}
@@ -766,14 +753,6 @@ static int idpf_rx_buf_alloc_all(struct idpf_buf_queue *rxbufq)
{
int err = 0;

- /* Allocate book keeping buffers */
- rxbufq->buf = kcalloc(rxbufq->desc_count, sizeof(*rxbufq->buf),
- GFP_KERNEL);
- if (!rxbufq->buf) {
- err = -ENOMEM;
- goto rx_buf_alloc_all_out;
- }
-
if (idpf_queue_has(HSPLIT_EN, rxbufq)) {
err = idpf_rx_hdr_buf_alloc_all(rxbufq);
if (err)
@@ -794,19 +773,30 @@ static int idpf_rx_buf_alloc_all(struct idpf_buf_queue *rxbufq)
/**
* idpf_rx_bufs_init - Initialize page pool, allocate rx bufs, and post to HW
* @bufq: buffer queue to create page pool for
+ * @type: type of Rx buffers to allocate
*
* Returns 0 on success, negative on failure
*/
-static int idpf_rx_bufs_init(struct idpf_buf_queue *bufq)
+static int idpf_rx_bufs_init(struct idpf_buf_queue *bufq,
+ enum libeth_fqe_type type)
{
- struct page_pool *pool;
+ struct libeth_fq fq = {
+ .truesize = bufq->truesize,
+ .count = bufq->desc_count,
+ .type = type,
+ .hsplit = idpf_queue_has(HSPLIT_EN, bufq),
+ .nid = idpf_q_vector_to_mem(bufq->q_vector),
+ };
+ int ret;

- pool = idpf_rx_create_page_pool(&bufq->q_vector->napi,
- bufq->desc_count);
- if (IS_ERR(pool))
- return PTR_ERR(pool);
+ ret = libeth_rx_fq_create(&fq, &bufq->q_vector->napi);
+ if (ret)
+ return ret;

- bufq->pp = pool;
+ bufq->pp = fq.pp;
+ bufq->buf = fq.fqes;
+ bufq->truesize = fq.truesize;
+ bufq->rx_buf_size = fq.buf_len;

return idpf_rx_buf_alloc_all(bufq);
}
@@ -819,14 +809,15 @@ static int idpf_rx_bufs_init(struct idpf_buf_queue *bufq)
*/
int idpf_rx_bufs_init_all(struct idpf_vport *vport)
{
- struct idpf_rxq_group *rx_qgrp;
+ bool split = idpf_is_queue_model_split(vport->rxq_model);
int i, j, err;

for (i = 0; i < vport->num_rxq_grp; i++) {
- rx_qgrp = &vport->rxq_grps[i];
+ struct idpf_rxq_group *rx_qgrp = &vport->rxq_grps[i];
+ u32 truesize = 0;

/* Allocate bufs for the rxq itself in singleq */
- if (!idpf_is_queue_model_split(vport->rxq_model)) {
+ if (!split) {
int num_rxq = rx_qgrp->singleq.num_rxq;

for (j = 0; j < num_rxq; j++) {
@@ -843,12 +834,19 @@ int idpf_rx_bufs_init_all(struct idpf_vport *vport)

/* Otherwise, allocate bufs for the buffer queues */
for (j = 0; j < vport->num_bufqs_per_qgrp; j++) {
+ enum libeth_fqe_type type;
struct idpf_buf_queue *q;

q = &rx_qgrp->splitq.bufq_sets[j].bufq;
- err = idpf_rx_bufs_init(q);
+ q->truesize = truesize;
+
+ type = truesize ? LIBETH_FQE_SHORT : LIBETH_FQE_MTU;
+
+ err = idpf_rx_bufs_init(q, type);
if (err)
return err;
+
+ truesize = q->truesize >> 1;
}
}

@@ -1160,17 +1158,11 @@ void idpf_vport_init_num_qs(struct idpf_vport *vport,
/* Adjust number of buffer queues per Rx queue group. */
if (!idpf_is_queue_model_split(vport->rxq_model)) {
vport->num_bufqs_per_qgrp = 0;
- vport->bufq_size[0] = IDPF_RX_BUF_2048;

return;
}

vport->num_bufqs_per_qgrp = IDPF_MAX_BUFQS_PER_RXQ_GRP;
- /* Bufq[0] default buffer size is 4K
- * Bufq[1] default buffer size is 2K
- */
- vport->bufq_size[0] = IDPF_RX_BUF_4096;
- vport->bufq_size[1] = IDPF_RX_BUF_2048;
}

/**
@@ -1507,7 +1499,6 @@ static int idpf_rxq_group_alloc(struct idpf_vport *vport, u16 num_rxq)

q = &rx_qgrp->splitq.bufq_sets[j].bufq;
q->desc_count = vport->bufq_desc_count[j];
- q->rx_buf_size = vport->bufq_size[j];
q->rx_buffer_low_watermark = IDPF_LOW_WATERMARK;

idpf_queue_assign(HSPLIT_EN, q, hs);
@@ -1560,14 +1551,9 @@ static int idpf_rxq_group_alloc(struct idpf_vport *vport, u16 num_rxq)
q->netdev = vport->netdev;
q->bufq_sets = rx_qgrp->splitq.bufq_sets;
q->idx = (i * num_rxq) + j;
- /* In splitq mode, RXQ buffer size should be
- * set to that of the first buffer queue
- * associated with this RXQ
- */
- q->rx_buf_size = vport->bufq_size[0];
q->rx_buffer_low_watermark = IDPF_LOW_WATERMARK;
q->rx_max_pkt_size = vport->netdev->mtu +
- IDPF_PACKET_HDR_PAD;
+ LIBETH_RX_LL_LEN;
idpf_rxq_set_descids(vport, q);
}
}
@@ -3145,69 +3131,10 @@ idpf_rx_process_skb_fields(struct idpf_rx_queue *rxq, struct sk_buff *skb,
void idpf_rx_add_frag(struct idpf_rx_buf *rx_buf, struct sk_buff *skb,
unsigned int size)
{
- skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, rx_buf->page,
- rx_buf->offset, size, rx_buf->truesize);
-
- rx_buf->page = NULL;
-}
-
-/**
- * idpf_rx_construct_skb - Allocate skb and populate it
- * @rxq: Rx descriptor queue
- * @rx_buf: Rx buffer to pull data from
- * @size: the length of the packet
- *
- * This function allocates an skb. It then populates it with the page
- * data from the current receive descriptor, taking care to set up the
- * skb correctly.
- */
-struct sk_buff *idpf_rx_construct_skb(const struct idpf_rx_queue *rxq,
- struct idpf_rx_buf *rx_buf,
- unsigned int size)
-{
- unsigned int headlen;
- struct sk_buff *skb;
- void *va;
-
- va = page_address(rx_buf->page) + rx_buf->offset;
-
- /* prefetch first cache line of first page */
- net_prefetch(va);
- /* allocate a skb to store the frags */
- skb = napi_alloc_skb(rxq->napi, IDPF_RX_HDR_SIZE);
- if (unlikely(!skb)) {
- idpf_rx_put_page(rx_buf);
-
- return NULL;
- }
+ u32 hr = rx_buf->page->pp->p.offset;

- skb_mark_for_recycle(skb);
-
- /* Determine available headroom for copy */
- headlen = size;
- if (headlen > IDPF_RX_HDR_SIZE)
- headlen = eth_get_headlen(skb->dev, va, IDPF_RX_HDR_SIZE);
-
- /* align pull length to size of long to optimize memcpy performance */
- memcpy(__skb_put(skb, headlen), va, ALIGN(headlen, sizeof(long)));
-
- /* if we exhaust the linear part then add what is left as a frag */
- size -= headlen;
- if (!size) {
- idpf_rx_put_page(rx_buf);
-
- return skb;
- }
-
- skb_add_rx_frag(skb, 0, rx_buf->page, rx_buf->offset + headlen,
- size, rx_buf->truesize);
-
- /* Since we're giving the page to the stack, clear our reference to it.
- * We'll get a new one during buffer posting.
- */
- rx_buf->page = NULL;
-
- return skb;
+ skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, rx_buf->page,
+ rx_buf->offset + hr, size, rx_buf->truesize);
}

/**
@@ -3413,24 +3340,24 @@ static int idpf_rx_splitq_clean(struct idpf_rx_queue *rxq, int budget)
hdr->page = NULL;

payload:
- if (pkt_len) {
- idpf_rx_sync_for_cpu(rx_buf, pkt_len);
- if (skb)
- idpf_rx_add_frag(rx_buf, skb, pkt_len);
- else
- skb = idpf_rx_construct_skb(rxq, rx_buf,
- pkt_len);
- } else {
- idpf_rx_put_page(rx_buf);
- }
+ if (!libeth_rx_sync_for_cpu(rx_buf, pkt_len))
+ goto skip_data;
+
+ if (skb)
+ idpf_rx_add_frag(rx_buf, skb, pkt_len);
+ else
+ skb = idpf_rx_build_skb(rx_buf, pkt_len);

/* exit if we failed to retrieve a buffer */
if (!skb)
break;

- idpf_rx_post_buf_refill(refillq, buf_id);
+skip_data:
+ rx_buf->page = NULL;

+ idpf_rx_post_buf_refill(refillq, buf_id);
IDPF_RX_BUMP_NTC(rxq, ntc);
+
/* skip if it is non EOP desc */
if (!idpf_rx_splitq_is_eop(rx_desc))
continue;
@@ -3483,15 +3410,15 @@ static int idpf_rx_update_bufq_desc(struct idpf_buf_queue *bufq, u32 buf_id,
struct virtchnl2_splitq_rx_buf_desc *buf_desc)
{
struct libeth_fq_fp fq = {
+ .pp = bufq->pp,
+ .fqes = bufq->buf,
+ .truesize = bufq->truesize,
.count = bufq->desc_count,
};
- struct idpf_rx_buf *buf;
dma_addr_t addr;

- buf = &bufq->buf[buf_id];
-
- addr = idpf_alloc_page(bufq->pp, buf, bufq->rx_buf_size);
- if (unlikely(addr == DMA_MAPPING_ERROR))
+ addr = libeth_rx_alloc(&fq, buf_id);
+ if (addr == DMA_MAPPING_ERROR)
return -ENOMEM;

buf_desc->pkt_addr = cpu_to_le64(addr);
@@ -3590,6 +3517,7 @@ static void idpf_rx_clean_refillq_all(struct idpf_buf_queue *bufq, int nid)
struct idpf_bufq_set *bufq_set;
int i;

+ page_pool_nid_changed(bufq->pp, nid);
if (bufq->hdr_pp)
page_pool_nid_changed(bufq->hdr_pp, nid);

diff --git a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
index 50dcb3ab02b1..70986e12da28 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
@@ -1615,6 +1615,12 @@ static int idpf_send_config_rx_queues_msg(struct idpf_vport *vport)
rxq = &rx_qgrp->splitq.rxq_sets[j]->rxq;
sets = rxq->bufq_sets;

+ /* In splitq mode, RXQ buffer size should be
+ * set to that of the first buffer queue
+ * associated with this RXQ.
+ */
+ rxq->rx_buf_size = sets[0].bufq.rx_buf_size;
+
qi[k].rx_bufq1_id = cpu_to_le16(sets[0].bufq.q_id);
if (vport->num_bufqs_per_qgrp > IDPF_SINGLE_BUFQ_PER_RXQ_GRP) {
qi[k].bufq2_ena = IDPF_BUFQ2_ENA;
@@ -3167,7 +3173,7 @@ void idpf_vport_init(struct idpf_vport *vport, struct idpf_vport_max_q *max_q)
rss_data->rss_lut_size = le16_to_cpu(vport_msg->rss_lut_size);

ether_addr_copy(vport->default_mac_addr, vport_msg->default_mac_addr);
- vport->max_mtu = le16_to_cpu(vport_msg->max_mtu) - IDPF_PACKET_HDR_PAD;
+ vport->max_mtu = le16_to_cpu(vport_msg->max_mtu) - LIBETH_RX_LL_LEN;

/* Initialize Tx and Rx profiles for Dynamic Interrupt Moderation */
memcpy(vport->rx_itr_profile, rx_itr, IDPF_DIM_PROFILE_SLOTS);
--
2.45.1


2024-05-28 13:55:21

by Alexander Lobakin

[permalink] [raw]
Subject: [PATCH iwl-next 05/12] idpf: strictly assert cachelines of queue and queue vector structures

Now that the queue and queue vector structures are separated and laid
out optimally, group the fields as read-mostly, read-write, and cold
cachelines and add size assertions to make sure new features won't push
something out of its place and provoke a perf regression.
Despite looking innocent, this gives up to a 2% perf bump on Rx.

Reviewed-by: Przemek Kitszel <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
drivers/net/ethernet/intel/idpf/idpf_txrx.h | 443 +++++++++++---------
1 file changed, 250 insertions(+), 193 deletions(-)
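
For readers unfamiliar with the pattern, the idea behind such assertions can
be sketched without any of the libeth helpers; everything below is
illustrative only (hypothetical names, plain zero-size markers) and is not
what <net/libeth/cache.h> actually defines:

/* assumes kernel headers: <linux/cache.h>, <linux/build_bug.h>, <linux/types.h> */
struct example_queue {
	/* read_mostly: written once at init, only read on the hotpath */
	u8	read_mostly_begin[0] __aligned(SMP_CACHE_BYTES);
	void	*desc_ring;
	u32	desc_count;
	u8	read_mostly_end[0];

	/* read_write: bumped on every processed descriptor */
	u8	read_write_begin[0] __aligned(SMP_CACHE_BYTES);
	u32	next_to_use;
	u32	next_to_clean;
	u8	read_write_end[0];
};

/* A new field spilling a hot group onto the next cacheline now fails the
 * build instead of silently costing performance.
 */
static_assert(offsetof(struct example_queue, read_mostly_end) -
	      offsetof(struct example_queue, read_mostly_begin) <= SMP_CACHE_BYTES);
static_assert(offsetof(struct example_queue, read_write_end) -
	      offsetof(struct example_queue, read_write_begin) <= SMP_CACHE_BYTES);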

diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.h b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
index 9e4ba0aaf3ab..731e2a73def5 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.h
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
@@ -6,6 +6,7 @@

#include <linux/dim.h>

+#include <net/libeth/cache.h>
#include <net/page_pool/helpers.h>
#include <net/tcp.h>
#include <net/netdev_queues.h>
@@ -504,59 +505,70 @@ struct idpf_intr_reg {

/**
* struct idpf_q_vector
+ * @read_mostly: CL group with rarely written hot fields
* @vport: Vport back pointer
- * @napi: napi handler
- * @v_idx: Vector index
- * @intr_reg: See struct idpf_intr_reg
+ * @num_rxq: Number of RX queues
* @num_txq: Number of TX queues
+ * @num_bufq: Number of buffer queues
* @num_complq: number of completion queues
+ * @rx: Array of RX queues to service
* @tx: Array of TX queues to service
+ * @bufq: Array of buffer queues to service
* @complq: array of completion queues
+ * @intr_reg: See struct idpf_intr_reg
+ * @read_write: CL group with both read/write hot fields
+ * @napi: napi handler
+ * @total_events: Number of interrupts processed
* @tx_dim: Data for TX net_dim algorithm
* @tx_itr_value: TX interrupt throttling rate
* @tx_intr_mode: Dynamic ITR or not
* @tx_itr_idx: TX ITR index
- * @num_rxq: Number of RX queues
- * @rx: Array of RX queues to service
* @rx_dim: Data for RX net_dim algorithm
* @rx_itr_value: RX interrupt throttling rate
* @rx_intr_mode: Dynamic ITR or not
* @rx_itr_idx: RX ITR index
- * @num_bufq: Number of buffer queues
- * @bufq: Array of buffer queues to service
- * @total_events: Number of interrupts processed
+ * @cold: CL group with fields not touched on hotpath
+ * @v_idx: Vector index
* @affinity_mask: CPU affinity mask
*/
struct idpf_q_vector {
- struct idpf_vport *vport;
- struct napi_struct napi;
- u16 v_idx;
- struct idpf_intr_reg intr_reg;
-
- u16 num_txq;
- u16 num_complq;
- struct idpf_tx_queue **tx;
- struct idpf_compl_queue **complq;
-
- struct dim tx_dim;
- u16 tx_itr_value;
- bool tx_intr_mode;
- u32 tx_itr_idx;
-
- u16 num_rxq;
- struct idpf_rx_queue **rx;
- struct dim rx_dim;
- u16 rx_itr_value;
- bool rx_intr_mode;
- u32 rx_itr_idx;
-
- u16 num_bufq;
- struct idpf_buf_queue **bufq;
-
- u16 total_events;
-
- cpumask_var_t affinity_mask;
+ libeth_cacheline_group(read_mostly,
+ struct idpf_vport *vport;
+
+ u16 num_rxq;
+ u16 num_txq;
+ u16 num_bufq;
+ u16 num_complq;
+ struct idpf_rx_queue **rx;
+ struct idpf_tx_queue **tx;
+ struct idpf_buf_queue **bufq;
+ struct idpf_compl_queue **complq;
+
+ struct idpf_intr_reg intr_reg;
+ );
+ libeth_cacheline_group(read_write,
+ struct napi_struct napi;
+ u16 total_events;
+
+ struct dim tx_dim;
+ u16 tx_itr_value;
+ bool tx_intr_mode;
+ u32 tx_itr_idx;
+
+ struct dim rx_dim;
+ u16 rx_itr_value;
+ bool rx_intr_mode;
+ u32 rx_itr_idx;
+ );
+ libeth_cacheline_group(cold,
+ u16 v_idx;
+
+ cpumask_var_t affinity_mask;
+ );
};
+libeth_cacheline_set_assert(struct idpf_q_vector, 104,
+ 424 + 2 * sizeof(struct dim),
+ 8 + sizeof(cpumask_var_t));

struct idpf_rx_queue_stats {
u64_stats_t packets;
@@ -610,6 +622,7 @@ struct idpf_txq_stash {

/**
* struct idpf_rx_queue - software structure representing a receive queue
+ * @read_mostly: CL group with rarely written hot fields
* @rx: universal receive descriptor array
* @single_buf: buffer descriptor array in singleq
* @desc_ring: virtual descriptor ring address
@@ -623,14 +636,16 @@ struct idpf_txq_stash {
* @idx: For RX queue, it is used to index to total RX queue across groups and
* used for skb reporting.
* @desc_count: Number of descriptors
+ * @rxdids: Supported RX descriptor ids
+ * @rx_ptype_lkup: LUT of Rx ptypes
+ * @read_write: CL group with both read/write hot fields
* @next_to_use: Next descriptor to use
* @next_to_clean: Next descriptor to clean
* @next_to_alloc: RX buffer to allocate at
- * @rxdids: Supported RX descriptor ids
- * @rx_ptype_lkup: LUT of Rx ptypes
* @skb: Pointer to the skb
* @stats_sync: See struct u64_stats_sync
* @q_stats: See union idpf_rx_queue_stats
+ * @cold: CL group with fields not touched on hotpath
* @q_id: Queue id
* @size: Length of descriptor ring in bytes
* @dma: Physical address of ring
@@ -641,55 +656,63 @@ struct idpf_txq_stash {
* @rx_max_pkt_size: RX max packet size
*/
struct idpf_rx_queue {
- union {
- union virtchnl2_rx_desc *rx;
- struct virtchnl2_singleq_rx_buf_desc *single_buf;
+ libeth_cacheline_group(read_mostly,
+ union {
+ union virtchnl2_rx_desc *rx;
+ struct virtchnl2_singleq_rx_buf_desc *single_buf;

- void *desc_ring;
- };
- union {
- struct {
- struct idpf_bufq_set *bufq_sets;
- struct napi_struct *napi;
+ void *desc_ring;
};
- struct {
- struct idpf_rx_buf *rx_buf;
- struct page_pool *pp;
+ union {
+ struct {
+ struct idpf_bufq_set *bufq_sets;
+ struct napi_struct *napi;
+ };
+ struct {
+ struct idpf_rx_buf *rx_buf;
+ struct page_pool *pp;
+ };
};
- };
- struct net_device *netdev;
- void __iomem *tail;
-
- DECLARE_BITMAP(flags, __IDPF_Q_FLAGS_NBITS);
- u16 idx;
- u16 desc_count;
- u16 next_to_use;
- u16 next_to_clean;
- u16 next_to_alloc;
-
- u32 rxdids;
-
- const struct idpf_rx_ptype_decoded *rx_ptype_lkup;
- struct sk_buff *skb;
-
- struct u64_stats_sync stats_sync;
- struct idpf_rx_queue_stats q_stats;
-
- /* Slowpath */
- u32 q_id;
- u32 size;
- dma_addr_t dma;
-
- struct idpf_q_vector *q_vector;
-
- u16 rx_buffer_low_watermark;
- u16 rx_hbuf_size;
- u16 rx_buf_size;
- u16 rx_max_pkt_size;
-} ____cacheline_aligned;
+ struct net_device *netdev;
+ void __iomem *tail;
+
+ DECLARE_BITMAP(flags, __IDPF_Q_FLAGS_NBITS);
+ u16 idx;
+ u16 desc_count;
+
+ u32 rxdids;
+ const struct idpf_rx_ptype_decoded *rx_ptype_lkup;
+ );
+ libeth_cacheline_group(read_write,
+ u16 next_to_use;
+ u16 next_to_clean;
+ u16 next_to_alloc;
+
+ struct sk_buff *skb;
+
+ struct u64_stats_sync stats_sync;
+ struct idpf_rx_queue_stats q_stats;
+ );
+ libeth_cacheline_group(cold,
+ u32 q_id;
+ u32 size;
+ dma_addr_t dma;
+
+ struct idpf_q_vector *q_vector;
+
+ u16 rx_buffer_low_watermark;
+ u16 rx_hbuf_size;
+ u16 rx_buf_size;
+ u16 rx_max_pkt_size;
+ );
+};
+libeth_cacheline_set_assert(struct idpf_rx_queue, 64,
+ 72 + sizeof(struct u64_stats_sync),
+ 32);

/**
* struct idpf_tx_queue - software structure representing a transmit queue
+ * @read_mostly: CL group with rarely written hot fields
* @base_tx: base Tx descriptor array
* @base_ctx: base Tx context descriptor array
* @flex_tx: flex Tx descriptor array
@@ -703,22 +726,7 @@ struct idpf_rx_queue {
* @idx: For TX queue, it is used as index to map between TX queue group and
* hot path TX pointers stored in vport. Used in both singleq/splitq.
* @desc_count: Number of descriptors
- * @next_to_use: Next descriptor to use
- * @next_to_clean: Next descriptor to clean
- * @netdev: &net_device corresponding to this queue
- * @cleaned_bytes: Splitq only, TXQ only: When a TX completion is received on
- * the TX completion queue, it can be for any TXQ associated
- * with that completion queue. This means we can clean up to
- * N TXQs during a single call to clean the completion queue.
- * cleaned_bytes|pkts tracks the clean stats per TXQ during
- * that single call to clean the completion queue. By doing so,
- * we can update BQL with aggregate cleaned stats for each TXQ
- * only once at the end of the cleaning routine.
- * @clean_budget: singleq only, queue cleaning budget
- * @cleaned_pkts: Number of packets cleaned for the above said case
- * @tx_max_bufs: Max buffers that can be transmitted with scatter-gather
* @tx_min_pkt_len: Min supported packet length
- * @compl_tag_bufid_m: Completion tag buffer id mask
* @compl_tag_gen_s: Completion tag generation bit
* The format of the completion tag will change based on the TXQ
* descriptor ring size so that we can maintain roughly the same level
@@ -739,68 +747,92 @@ struct idpf_rx_queue {
* --------------------------------
*
* This gives us 8*8160 = 65280 possible unique values.
+ * @netdev: &net_device corresponding to this queue
+ * @read_write: CL group with both read/write hot fields
+ * @next_to_use: Next descriptor to use
+ * @next_to_clean: Next descriptor to clean
+ * @cleaned_bytes: Splitq only, TXQ only: When a TX completion is received on
+ * the TX completion queue, it can be for any TXQ associated
+ * with that completion queue. This means we can clean up to
+ * N TXQs during a single call to clean the completion queue.
+ * cleaned_bytes|pkts tracks the clean stats per TXQ during
+ * that single call to clean the completion queue. By doing so,
+ * we can update BQL with aggregate cleaned stats for each TXQ
+ * only once at the end of the cleaning routine.
+ * @clean_budget: singleq only, queue cleaning budget
+ * @cleaned_pkts: Number of packets cleaned for the above said case
+ * @tx_max_bufs: Max buffers that can be transmitted with scatter-gather
+ * @stash: Tx buffer stash for Flow-based scheduling mode
+ * @compl_tag_bufid_m: Completion tag buffer id mask
* @compl_tag_cur_gen: Used to keep track of current completion tag generation
* @compl_tag_gen_max: To determine when compl_tag_cur_gen should be reset
- * @stash: Tx buffer stash for Flow-based scheduling mode
* @stats_sync: See struct u64_stats_sync
* @q_stats: See union idpf_tx_queue_stats
+ * @cold: CL group with fields not touched on hotpath
* @q_id: Queue id
* @size: Length of descriptor ring in bytes
* @dma: Physical address of ring
* @q_vector: Backreference to associated vector
*/
struct idpf_tx_queue {
- union {
- struct idpf_base_tx_desc *base_tx;
- struct idpf_base_tx_ctx_desc *base_ctx;
- union idpf_tx_flex_desc *flex_tx;
- struct idpf_flex_tx_ctx_desc *flex_ctx;
-
- void *desc_ring;
- };
- struct idpf_tx_buf *tx_buf;
- struct idpf_txq_group *txq_grp;
- struct device *dev;
- void __iomem *tail;
-
- DECLARE_BITMAP(flags, __IDPF_Q_FLAGS_NBITS);
- u16 idx;
- u16 desc_count;
- u16 next_to_use;
- u16 next_to_clean;
-
- struct net_device *netdev;
-
- union {
- u32 cleaned_bytes;
- u32 clean_budget;
- };
- u16 cleaned_pkts;
-
- u16 tx_max_bufs;
- u16 tx_min_pkt_len;
-
- u16 compl_tag_bufid_m;
- u16 compl_tag_gen_s;
-
- u16 compl_tag_cur_gen;
- u16 compl_tag_gen_max;
+ libeth_cacheline_group(read_mostly,
+ union {
+ struct idpf_base_tx_desc *base_tx;
+ struct idpf_base_tx_ctx_desc *base_ctx;
+ union idpf_tx_flex_desc *flex_tx;
+ struct idpf_flex_tx_ctx_desc *flex_ctx;
+
+ void *desc_ring;
+ };
+ struct idpf_tx_buf *tx_buf;
+ struct idpf_txq_group *txq_grp;
+ struct device *dev;
+ void __iomem *tail;
+
+ DECLARE_BITMAP(flags, __IDPF_Q_FLAGS_NBITS);
+ u16 idx;
+ u16 desc_count;
+
+ u16 tx_min_pkt_len;
+ u16 compl_tag_gen_s;
+
+ struct net_device *netdev;
+ );
+ libeth_cacheline_group(read_write,
+ u16 next_to_use;
+ u16 next_to_clean;
+
+ union {
+ u32 cleaned_bytes;
+ u32 clean_budget;
+ };
+ u16 cleaned_pkts;

- struct idpf_txq_stash *stash;
+ u16 tx_max_bufs;
+ struct idpf_txq_stash *stash;

- struct u64_stats_sync stats_sync;
- struct idpf_tx_queue_stats q_stats;
+ u16 compl_tag_bufid_m;
+ u16 compl_tag_cur_gen;
+ u16 compl_tag_gen_max;

- /* Slowpath */
- u32 q_id;
- u32 size;
- dma_addr_t dma;
+ struct u64_stats_sync stats_sync;
+ struct idpf_tx_queue_stats q_stats;
+ );
+ libeth_cacheline_group(cold,
+ u32 q_id;
+ u32 size;
+ dma_addr_t dma;

- struct idpf_q_vector *q_vector;
-} ____cacheline_aligned;
+ struct idpf_q_vector *q_vector;
+ );
+};
+libeth_cacheline_set_assert(struct idpf_tx_queue, 64,
+ 88 + sizeof(struct u64_stats_sync),
+ 24);

/**
* struct idpf_buf_queue - software structure representing a buffer queue
+ * @read_mostly: CL group with rarely written hot fields
* @split_buf: buffer descriptor array
* @rx_buf: Struct with RX buffer related members
* @rx_buf.buf: See struct idpf_rx_buf
@@ -810,9 +842,11 @@ struct idpf_tx_queue {
* @tail: Tail offset
* @flags: See enum idpf_queue_flags_t
* @desc_count: Number of descriptors
+ * @read_write: CL group with both read/write hot fields
* @next_to_use: Next descriptor to use
* @next_to_clean: Next descriptor to clean
* @next_to_alloc: RX buffer to allocate at
+ * @cold: CL group with fields not touched on hotpath
* @q_id: Queue id
* @size: Length of descriptor ring in bytes
* @dma: Physical address of ring
@@ -822,79 +856,95 @@ struct idpf_tx_queue {
* @rx_buf_size: Buffer size
*/
struct idpf_buf_queue {
- struct virtchnl2_splitq_rx_buf_desc *split_buf;
- struct {
- struct idpf_rx_buf *buf;
- dma_addr_t hdr_buf_pa;
- void *hdr_buf_va;
- } rx_buf;
- struct page_pool *pp;
- void __iomem *tail;
-
- DECLARE_BITMAP(flags, __IDPF_Q_FLAGS_NBITS);
- u16 desc_count;
- u16 next_to_use;
- u16 next_to_clean;
- u16 next_to_alloc;
-
- /* Slowpath */
- u32 q_id;
- u32 size;
- dma_addr_t dma;
-
- struct idpf_q_vector *q_vector;
-
- u16 rx_buffer_low_watermark;
- u16 rx_hbuf_size;
- u16 rx_buf_size;
-} ____cacheline_aligned;
+ libeth_cacheline_group(read_mostly,
+ struct virtchnl2_splitq_rx_buf_desc *split_buf;
+ struct {
+ struct idpf_rx_buf *buf;
+ dma_addr_t hdr_buf_pa;
+ void *hdr_buf_va;
+ } rx_buf;
+ struct page_pool *pp;
+ void __iomem *tail;
+
+ DECLARE_BITMAP(flags, __IDPF_Q_FLAGS_NBITS);
+ u32 desc_count;
+ );
+ libeth_cacheline_group(read_write,
+ u32 next_to_use;
+ u32 next_to_clean;
+ u32 next_to_alloc;
+ );
+ libeth_cacheline_group(cold,
+ u32 q_id;
+ u32 size;
+ dma_addr_t dma;
+
+ struct idpf_q_vector *q_vector;
+
+ u16 rx_buffer_low_watermark;
+ u16 rx_hbuf_size;
+ u16 rx_buf_size;
+ );
+};
+libeth_cacheline_set_assert(struct idpf_buf_queue, 64, 16, 32);

/**
* struct idpf_compl_queue - software structure representing a completion queue
+ * @read_mostly: CL group with rarely written hot fields
* @comp: completion descriptor array
* @txq_grp: See struct idpf_txq_group
* @flags: See enum idpf_queue_flags_t
* @desc_count: Number of descriptors
+ * @clean_budget: queue cleaning budget
+ * @netdev: &net_device corresponding to this queue
+ * @read_write: CL group with both read/write hot fields
* @next_to_use: Next descriptor to use. Relevant in both split & single txq
* and bufq.
* @next_to_clean: Next descriptor to clean
- * @netdev: &net_device corresponding to this queue
- * @clean_budget: queue cleaning budget
* @num_completions: Only relevant for TX completion queue. It tracks the
* number of completions received to compare against the
* number of completions pending, as accumulated by the
* TX queues.
+ * @cold: CL group with fields not touched on hotpath
* @q_id: Queue id
* @size: Length of descriptor ring in bytes
* @dma: Physical address of ring
* @q_vector: Backreference to associated vector
*/
struct idpf_compl_queue {
- struct idpf_splitq_tx_compl_desc *comp;
- struct idpf_txq_group *txq_grp;
-
- DECLARE_BITMAP(flags, __IDPF_Q_FLAGS_NBITS);
- u16 desc_count;
- u16 next_to_use;
- u16 next_to_clean;
-
- struct net_device *netdev;
- u32 clean_budget;
- u32 num_completions;
+ libeth_cacheline_group(read_mostly,
+ struct idpf_splitq_tx_compl_desc *comp;
+ struct idpf_txq_group *txq_grp;

- /* Slowpath */
- u32 q_id;
- u32 size;
- dma_addr_t dma;
+ DECLARE_BITMAP(flags, __IDPF_Q_FLAGS_NBITS);
+ u32 desc_count;

- struct idpf_q_vector *q_vector;
-} ____cacheline_aligned;
+ u32 clean_budget;
+ struct net_device *netdev;
+ );
+ libeth_cacheline_group(read_write,
+ u32 next_to_use;
+ u32 next_to_clean;
+
+ u32 num_completions;
+ );
+ libeth_cacheline_group(cold,
+ u32 q_id;
+ u32 size;
+ dma_addr_t dma;
+
+ struct idpf_q_vector *q_vector;
+ );
+};
+libeth_cacheline_set_assert(struct idpf_compl_queue, 40, 16, 24);

/**
* struct idpf_sw_queue
+ * @read_mostly: CL group with rarely written hot fields
* @ring: Pointer to the ring
* @flags: See enum idpf_queue_flags_t
* @desc_count: Descriptor count
+ * @read_write: CL group with both read/write hot fields
* @next_to_use: Buffer to allocate at
* @next_to_clean: Next descriptor to clean
*
@@ -903,13 +953,20 @@ struct idpf_compl_queue {
* lockless buffer management system and are strictly software only constructs.
*/
struct idpf_sw_queue {
- u32 *ring;
-
- DECLARE_BITMAP(flags, __IDPF_Q_FLAGS_NBITS);
- u16 desc_count;
- u16 next_to_use;
- u16 next_to_clean;
-} ____cacheline_aligned;
+ libeth_cacheline_group(read_mostly,
+ u32 *ring;
+
+ DECLARE_BITMAP(flags, __IDPF_Q_FLAGS_NBITS);
+ u32 desc_count;
+ );
+ libeth_cacheline_group(read_write,
+ u32 next_to_use;
+ u32 next_to_clean;
+ );
+};
+libeth_cacheline_group_assert(struct idpf_sw_queue, read_mostly, 24);
+libeth_cacheline_group_assert(struct idpf_sw_queue, read_write, 8);
+libeth_cacheline_struct_assert(struct idpf_sw_queue, 24, 8);

/**
* struct idpf_rxq_set
--
2.45.1


2024-05-28 13:58:18

by Alexander Lobakin

[permalink] [raw]
Subject: [PATCH iwl-next 11/12] idpf: convert header split mode to libeth + napi_build_skb()

Currently, idpf uses the following model for the header buffers:

* buffers are allocated via dma_alloc_coherent();
* when receiving, napi_alloc_skb() is called and then the header is
copied to the newly allocated linear part.

This is far from optimal as DMA coherent zone is slow on many systems
and memcpy() neutralizes the idea and benefits of the header split. Not
speaking of that XDP can't be run on DMA coherent buffers, but at the
same time the idea of allocating an skb to run XDP program is ill.
Instead, use libeth to create page_pools for the header buffers, allocate
them dynamically and then build an skb via napi_build_skb() around them
with no memory copy. With one exception...
When you enable header split, you except you'll always have a separate
header buffer, so that you could reserve headroom and tailroom only
there and then use full buffers for the data. For example, this is how
TCP zerocopy works -- you have to have the payload aligned to PAGE_SIZE.
The current hardware running idpf does *not* guarantee that you'll
always have headers placed separately. For example, on my setup, even
ICMP packets are written as one piece to the data buffers. You can't
build a valid skb around a data buffer in this case.
To not complicate things and not lose TCP zerocopy etc., when such thing
happens, use the empty header buffer and pull either full frame (if it's
short) or the Ethernet header there and build an skb around it. GRO
layer will pull more from the data buffer later. This W/A will hopefully
be removed one day.

Signed-off-by: Alexander Lobakin <[email protected]>
---
drivers/net/ethernet/intel/idpf/idpf_txrx.h | 52 ++--
.../ethernet/intel/idpf/idpf_singleq_txrx.c | 1 +
drivers/net/ethernet/intel/idpf/idpf_txrx.c | 253 ++++++++++++------
.../net/ethernet/intel/idpf/idpf_virtchnl.c | 14 +-
4 files changed, 204 insertions(+), 116 deletions(-)
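
At its core, the new Rx model is "reserve the headroom in the Page Pool and
wrap the buffer with napi_build_skb()". A minimal sketch of that idea follows,
with illustrative names that are not the driver's actual functions, assuming a
pool created with the skb headroom in its ->p.offset:

/* assumes <net/page_pool/helpers.h> and <linux/skbuff.h> */
static struct sk_buff *rx_wrap_pp_buf(struct page *page, u32 offset,
				      u32 truesize, u32 len)
{
	u32 headroom = page->pp->p.offset;	/* reserved when the pool was created */
	void *va = page_address(page) + offset;
	struct sk_buff *skb;

	/* Build the skb directly around the Page Pool buffer: no linear-part
	 * allocation and no memcpy() of the headers.
	 */
	skb = napi_build_skb(va, truesize);
	if (unlikely(!skb))
		return NULL;

	skb_mark_for_recycle(skb);	/* let the page return to its pool on free */
	skb_reserve(skb, headroom);
	__skb_put(skb, len);

	return skb;
}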

diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.h b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
index a6f5916bee8e..86a1efc24caf 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.h
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
@@ -7,7 +7,7 @@
#include <linux/dim.h>

#include <net/libeth/cache.h>
-#include <net/page_pool/helpers.h>
+#include <net/libeth/rx.h>
#include <net/tcp.h>
#include <net/netdev_queues.h>

@@ -103,8 +103,6 @@ do { \
#define IDPF_RX_BUF_STRIDE 32
#define IDPF_RX_BUF_POST_STRIDE 16
#define IDPF_LOW_WATERMARK 64
-/* Size of header buffer specifically for header split */
-#define IDPF_HDR_BUF_SIZE 256
#define IDPF_PACKET_HDR_PAD \
(ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN * 2)
#define IDPF_TX_TSO_MIN_MSS 88
@@ -300,14 +298,7 @@ struct idpf_rx_extracted {
#define IDPF_TX_MAX_DESC_DATA_ALIGNED \
ALIGN_DOWN(IDPF_TX_MAX_DESC_DATA, IDPF_TX_MAX_READ_REQ_SIZE)

-#define IDPF_RX_DMA_ATTR \
- (DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_WEAK_ORDERING)
-
-struct idpf_rx_buf {
- struct page *page;
- unsigned int page_offset;
- u16 truesize;
-};
+#define idpf_rx_buf libeth_fqe

#define IDPF_RX_MAX_PTYPE_PROTO_IDS 32
#define IDPF_RX_MAX_PTYPE_SZ (sizeof(struct virtchnl2_ptype) + \
@@ -755,14 +746,14 @@ libeth_cacheline_set_assert(struct idpf_tx_queue, 64,
* struct idpf_buf_queue - software structure representing a buffer queue
* @read_mostly: CL group with rarely written hot fields
* @split_buf: buffer descriptor array
- * @rx_buf: Struct with RX buffer related members
- * @rx_buf.buf: See struct idpf_rx_buf
- * @rx_buf.hdr_buf_pa: DMA handle
- * @rx_buf.hdr_buf_va: Virtual address
- * @pp: Page pool pointer
+ * @hdr_buf: &libeth_fqe for header buffers
+ * @hdr_pp: &page_pool for header buffers
+ * @buf: &idpf_rx_buf for data buffers
+ * @pp: &page_pool for data buffers
* @tail: Tail offset
* @flags: See enum idpf_queue_flags_t
* @desc_count: Number of descriptors
+ * @hdr_truesize: truesize for header buffers
* @read_write: CL group with both read/write hot fields
* @next_to_use: Next descriptor to use
* @next_to_clean: Next descriptor to clean
@@ -779,16 +770,16 @@ libeth_cacheline_set_assert(struct idpf_tx_queue, 64,
struct idpf_buf_queue {
libeth_cacheline_group(read_mostly,
struct virtchnl2_splitq_rx_buf_desc *split_buf;
- struct {
- struct idpf_rx_buf *buf;
- dma_addr_t hdr_buf_pa;
- void *hdr_buf_va;
- } rx_buf;
+ struct libeth_fqe *hdr_buf;
+ struct page_pool *hdr_pp;
+ struct idpf_rx_buf *buf;
struct page_pool *pp;
void __iomem *tail;

DECLARE_BITMAP(flags, __IDPF_Q_FLAGS_NBITS);
u32 desc_count;
+
+ u32 hdr_truesize;
);
libeth_cacheline_group(read_write,
u32 next_to_use;
@@ -982,6 +973,18 @@ struct idpf_txq_group {
u32 num_completions_pending;
};

+static inline int idpf_q_vector_to_mem(const struct idpf_q_vector *q_vector)
+{
+ u32 cpu;
+
+ if (!q_vector)
+ return NUMA_NO_NODE;
+
+ cpu = cpumask_first(q_vector->affinity_mask);
+
+ return cpu < nr_cpu_ids ? cpu_to_mem(cpu) : NUMA_NO_NODE;
+}
+
/**
* idpf_size_to_txd_count - Get number of descriptors needed for large Tx frag
* @size: transmit request size in bytes
@@ -1050,7 +1053,7 @@ static inline dma_addr_t idpf_alloc_page(struct page_pool *pool,
unsigned int buf_size)
{
if (buf_size == IDPF_RX_BUF_2048)
- buf->page = page_pool_dev_alloc_frag(pool, &buf->page_offset,
+ buf->page = page_pool_dev_alloc_frag(pool, &buf->offset,
buf_size);
else
buf->page = page_pool_dev_alloc_pages(pool);
@@ -1060,7 +1063,7 @@ static inline dma_addr_t idpf_alloc_page(struct page_pool *pool,

buf->truesize = buf_size;

- return page_pool_get_dma_addr(buf->page) + buf->page_offset +
+ return page_pool_get_dma_addr(buf->page) + buf->offset +
pool->p.offset;
}

@@ -1087,7 +1090,7 @@ static inline void idpf_rx_sync_for_cpu(struct idpf_rx_buf *rx_buf, u32 len)

dma_sync_single_range_for_cpu(pp->p.dev,
page_pool_get_dma_addr(page),
- rx_buf->page_offset + pp->p.offset, len,
+ rx_buf->offset + pp->p.offset, len,
page_pool_get_dma_dir(pp));
}

@@ -1116,6 +1119,7 @@ void idpf_rx_add_frag(struct idpf_rx_buf *rx_buf, struct sk_buff *skb,
struct sk_buff *idpf_rx_construct_skb(const struct idpf_rx_queue *rxq,
struct idpf_rx_buf *rx_buf,
unsigned int size);
+struct sk_buff *idpf_rx_build_skb(const struct libeth_fqe *buf, u32 size);
void idpf_tx_buf_hw_update(struct idpf_tx_queue *tx_q, u32 val,
bool xmit_more);
unsigned int idpf_size_to_txd_count(unsigned int size);
diff --git a/drivers/net/ethernet/intel/idpf/idpf_singleq_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_singleq_txrx.c
index 7c71c72b814f..cde768082fc4 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_singleq_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_singleq_txrx.c
@@ -828,6 +828,7 @@ idpf_rx_singleq_process_skb_fields(struct idpf_rx_queue *rx_q,
}

idpf_rx_singleq_csum(rx_q, skb, csum_bits, decoded);
+ skb_record_rx_queue(skb, rx_q->idx);
}

/**
diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
index 2af9a482652a..689668e0ca51 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
@@ -381,7 +381,7 @@ static int idpf_tx_desc_alloc_all(struct idpf_vport *vport)
* idpf_rx_page_rel - Release an rx buffer page
* @rx_buf: the buffer to free
*/
-static void idpf_rx_page_rel(struct idpf_rx_buf *rx_buf)
+static void idpf_rx_page_rel(struct libeth_fqe *rx_buf)
{
if (unlikely(!rx_buf->page))
return;
@@ -389,46 +389,50 @@ static void idpf_rx_page_rel(struct idpf_rx_buf *rx_buf)
page_pool_put_full_page(rx_buf->page->pp, rx_buf->page, false);

rx_buf->page = NULL;
- rx_buf->page_offset = 0;
+ rx_buf->offset = 0;
}

/**
* idpf_rx_hdr_buf_rel_all - Release header buffer memory
* @bufq: queue to use
- * @dev: device to free DMA memory
*/
-static void idpf_rx_hdr_buf_rel_all(struct idpf_buf_queue *bufq,
- struct device *dev)
+static void idpf_rx_hdr_buf_rel_all(struct idpf_buf_queue *bufq)
{
- dma_free_coherent(dev, bufq->desc_count * IDPF_HDR_BUF_SIZE,
- bufq->rx_buf.hdr_buf_va, bufq->rx_buf.hdr_buf_pa);
- bufq->rx_buf.hdr_buf_va = NULL;
+ struct libeth_fq fq = {
+ .fqes = bufq->hdr_buf,
+ .pp = bufq->hdr_pp,
+ };
+
+ for (u32 i = 0; i < bufq->desc_count; i++)
+ idpf_rx_page_rel(&bufq->hdr_buf[i]);
+
+ libeth_rx_fq_destroy(&fq);
+ bufq->hdr_buf = NULL;
+ bufq->hdr_pp = NULL;
}

/**
* idpf_rx_buf_rel_bufq - Free all Rx buffer resources for a buffer queue
* @bufq: queue to be cleaned
- * @dev: device to free DMA memory
*/
-static void idpf_rx_buf_rel_bufq(struct idpf_buf_queue *bufq,
- struct device *dev)
+static void idpf_rx_buf_rel_bufq(struct idpf_buf_queue *bufq)
{
/* queue already cleared, nothing to do */
- if (!bufq->rx_buf.buf)
+ if (!bufq->buf)
return;

/* Free all the bufs allocated and given to hw on Rx queue */
for (u32 i = 0; i < bufq->desc_count; i++)
- idpf_rx_page_rel(&bufq->rx_buf.buf[i]);
+ idpf_rx_page_rel(&bufq->buf[i]);

if (idpf_queue_has(HSPLIT_EN, bufq))
- idpf_rx_hdr_buf_rel_all(bufq, dev);
+ idpf_rx_hdr_buf_rel_all(bufq);

page_pool_destroy(bufq->pp);
bufq->pp = NULL;

- kfree(bufq->rx_buf.buf);
- bufq->rx_buf.buf = NULL;
+ kfree(bufq->buf);
+ bufq->buf = NULL;
}

/**
@@ -493,7 +497,7 @@ static void idpf_rx_desc_rel_bufq(struct idpf_buf_queue *bufq,
if (!bufq)
return;

- idpf_rx_buf_rel_bufq(bufq, dev);
+ idpf_rx_buf_rel_bufq(bufq);

bufq->next_to_alloc = 0;
bufq->next_to_clean = 0;
@@ -573,12 +577,21 @@ static void idpf_rx_buf_hw_update(struct idpf_buf_queue *bufq, u32 val)
*/
static int idpf_rx_hdr_buf_alloc_all(struct idpf_buf_queue *bufq)
{
- bufq->rx_buf.hdr_buf_va =
- dma_alloc_coherent(bufq->q_vector->vport->netdev->dev.parent,
- IDPF_HDR_BUF_SIZE * bufq->desc_count,
- &bufq->rx_buf.hdr_buf_pa, GFP_KERNEL);
- if (!bufq->rx_buf.hdr_buf_va)
- return -ENOMEM;
+ struct libeth_fq fq = {
+ .count = bufq->desc_count,
+ .type = LIBETH_FQE_HDR,
+ .nid = idpf_q_vector_to_mem(bufq->q_vector),
+ };
+ int ret;
+
+ ret = libeth_rx_fq_create(&fq, &bufq->q_vector->napi);
+ if (ret)
+ return ret;
+
+ bufq->hdr_pp = fq.pp;
+ bufq->hdr_buf = fq.fqes;
+ bufq->hdr_truesize = fq.truesize;
+ bufq->rx_hbuf_size = fq.buf_len;

return 0;
}
@@ -616,17 +629,27 @@ static void idpf_rx_post_buf_refill(struct idpf_sw_queue *refillq, u16 buf_id)
static bool idpf_rx_post_buf_desc(struct idpf_buf_queue *bufq, u16 buf_id)
{
struct virtchnl2_splitq_rx_buf_desc *splitq_rx_desc = NULL;
+ struct libeth_fq_fp fq = {
+ .count = bufq->desc_count,
+ };
u16 nta = bufq->next_to_alloc;
struct idpf_rx_buf *buf;
dma_addr_t addr;

splitq_rx_desc = &bufq->split_buf[nta];
- buf = &bufq->rx_buf.buf[buf_id];
+ buf = &bufq->buf[buf_id];

- if (idpf_queue_has(HSPLIT_EN, bufq))
- splitq_rx_desc->hdr_addr =
- cpu_to_le64(bufq->rx_buf.hdr_buf_pa +
- (u32)buf_id * IDPF_HDR_BUF_SIZE);
+ if (idpf_queue_has(HSPLIT_EN, bufq)) {
+ fq.pp = bufq->hdr_pp;
+ fq.fqes = bufq->hdr_buf;
+ fq.truesize = bufq->hdr_truesize;
+
+ addr = libeth_rx_alloc(&fq, buf_id);
+ if (addr == DMA_MAPPING_ERROR)
+ return false;
+
+ splitq_rx_desc->hdr_addr = cpu_to_le64(addr);
+ }

addr = idpf_alloc_page(bufq->pp, buf, bufq->rx_buf_size);
if (unlikely(addr == DMA_MAPPING_ERROR))
@@ -741,13 +764,12 @@ static int idpf_rx_bufs_init_singleq(struct idpf_rx_queue *rxq)
*/
static int idpf_rx_buf_alloc_all(struct idpf_buf_queue *rxbufq)
{
- struct device *dev = rxbufq->q_vector->vport->netdev->dev.parent;
int err = 0;

/* Allocate book keeping buffers */
- rxbufq->rx_buf.buf = kcalloc(rxbufq->desc_count,
- sizeof(struct idpf_rx_buf), GFP_KERNEL);
- if (!rxbufq->rx_buf.buf) {
+ rxbufq->buf = kcalloc(rxbufq->desc_count, sizeof(*rxbufq->buf),
+ GFP_KERNEL);
+ if (!rxbufq->buf) {
err = -ENOMEM;
goto rx_buf_alloc_all_out;
}
@@ -764,7 +786,7 @@ static int idpf_rx_buf_alloc_all(struct idpf_buf_queue *rxbufq)

rx_buf_alloc_all_out:
if (err)
- idpf_rx_buf_rel_bufq(rxbufq, dev);
+ idpf_rx_buf_rel_bufq(rxbufq);

return err;
}
@@ -1489,7 +1511,6 @@ static int idpf_rxq_group_alloc(struct idpf_vport *vport, u16 num_rxq)
q->rx_buffer_low_watermark = IDPF_LOW_WATERMARK;

idpf_queue_assign(HSPLIT_EN, q, hs);
- q->rx_hbuf_size = hs ? IDPF_HDR_BUF_SIZE : 0;

bufq_set->num_refillqs = num_rxq;
bufq_set->refillqs = kcalloc(num_rxq, swq_size,
@@ -1532,7 +1553,6 @@ static int idpf_rxq_group_alloc(struct idpf_vport *vport, u16 num_rxq)
&rx_qgrp->splitq.bufq_sets[1].refillqs[j];

idpf_queue_assign(HSPLIT_EN, q, hs);
- q->rx_hbuf_size = hs ? IDPF_HDR_BUF_SIZE : 0;

setup_rxq:
q->desc_count = vport->rxq_desc_count;
@@ -3107,6 +3127,8 @@ idpf_rx_process_skb_fields(struct idpf_rx_queue *rxq, struct sk_buff *skb,
csum_bits = idpf_rx_splitq_extract_csum_bits(rx_desc);
idpf_rx_csum(rxq, skb, csum_bits, decoded);

+ skb_record_rx_queue(skb, rxq->idx);
+
return 0;
}

@@ -3124,7 +3146,7 @@ void idpf_rx_add_frag(struct idpf_rx_buf *rx_buf, struct sk_buff *skb,
unsigned int size)
{
skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, rx_buf->page,
- rx_buf->page_offset, size, rx_buf->truesize);
+ rx_buf->offset, size, rx_buf->truesize);

rx_buf->page = NULL;
}
@@ -3147,7 +3169,7 @@ struct sk_buff *idpf_rx_construct_skb(const struct idpf_rx_queue *rxq,
struct sk_buff *skb;
void *va;

- va = page_address(rx_buf->page) + rx_buf->page_offset;
+ va = page_address(rx_buf->page) + rx_buf->offset;

/* prefetch first cache line of first page */
net_prefetch(va);
@@ -3159,7 +3181,6 @@ struct sk_buff *idpf_rx_construct_skb(const struct idpf_rx_queue *rxq,
return NULL;
}

- skb_record_rx_queue(skb, rxq->idx);
skb_mark_for_recycle(skb);

/* Determine available headroom for copy */
@@ -3178,7 +3199,7 @@ struct sk_buff *idpf_rx_construct_skb(const struct idpf_rx_queue *rxq,
return skb;
}

- skb_add_rx_frag(skb, 0, rx_buf->page, rx_buf->page_offset + headlen,
+ skb_add_rx_frag(skb, 0, rx_buf->page, rx_buf->offset + headlen,
size, rx_buf->truesize);

/* Since we're giving the page to the stack, clear our reference to it.
@@ -3190,36 +3211,66 @@ struct sk_buff *idpf_rx_construct_skb(const struct idpf_rx_queue *rxq,
}

/**
- * idpf_rx_hdr_construct_skb - Allocate skb and populate it from header buffer
- * @rxq: Rx descriptor queue
- * @va: Rx buffer to pull data from
+ * idpf_rx_hsplit_wa - handle header buffer overflows and split errors
+ * @hdr: Rx buffer for the headers
+ * @buf: Rx buffer for the payload
+ * @data_len: number of bytes received to the payload buffer
+ *
+ * When a header buffer overflow occurs or the HW was unable to parse the
+ * packet type to perform header split, the whole frame gets placed to the
+ * payload buffer. We can't build a valid skb around a payload buffer when
+ * the header split is active since it doesn't reserve any head- or tailroom.
+ * In that case, copy either the whole frame when it's short or just the
+ * Ethernet header to the header buffer to be able to build an skb and adjust
+ * the data offset in the payload buffer, IOW emulate the header split.
+ *
+ * Return: number of bytes copied to the header buffer.
+ */
+static u32 idpf_rx_hsplit_wa(const struct libeth_fqe *hdr,
+ struct libeth_fqe *buf, u32 data_len)
+{
+ u32 copy = data_len <= L1_CACHE_BYTES ? data_len : ETH_HLEN;
+ const void *src;
+ void *dst;
+
+ if (!libeth_rx_sync_for_cpu(buf, copy))
+ return 0;
+
+ dst = page_address(hdr->page) + hdr->offset + hdr->page->pp->p.offset;
+ src = page_address(buf->page) + buf->offset + buf->page->pp->p.offset;
+ memcpy(dst, src, ALIGN(copy, sizeof(long)));
+
+ buf->offset += copy;
+
+ return copy;
+}
+
+/**
+ * idpf_rx_build_skb - Allocate skb and populate it from header buffer
+ * @buf: Rx buffer to pull data from
* @size: the length of the packet
*
* This function allocates an skb. It then populates it with the page data from
* the current receive descriptor, taking care to set up the skb correctly.
- * This specifically uses a header buffer to start building the skb.
*/
-static struct sk_buff *
-idpf_rx_hdr_construct_skb(const struct idpf_rx_queue *rxq, const void *va,
- unsigned int size)
+struct sk_buff *idpf_rx_build_skb(const struct libeth_fqe *buf, u32 size)
{
+ u32 hr = buf->page->pp->p.offset;
struct sk_buff *skb;
+ void *va;

- /* allocate a skb to store the frags */
- skb = napi_alloc_skb(rxq->napi, size);
+ va = page_address(buf->page) + buf->offset;
+ prefetch(va + hr);
+
+ skb = napi_build_skb(va, buf->truesize);
if (unlikely(!skb))
return NULL;

- skb_record_rx_queue(skb, rxq->idx);
-
- memcpy(__skb_put(skb, size), va, ALIGN(size, sizeof(long)));
-
- /* More than likely, a payload fragment, which will use a page from
- * page_pool will be added to the SKB so mark it for recycle
- * preemptively. And if not, it's inconsequential.
- */
skb_mark_for_recycle(skb);

+ skb_reserve(skb, hr);
+ __skb_put(skb, size);
+
return skb;
}

@@ -3272,14 +3323,12 @@ static int idpf_rx_splitq_clean(struct idpf_rx_queue *rxq, int budget)
/* Process Rx packets bounded by budget */
while (likely(total_rx_pkts < budget)) {
struct virtchnl2_rx_flex_desc_adv_nic_3 *rx_desc;
+ struct libeth_fqe *hdr, *rx_buf = NULL;
struct idpf_sw_queue *refillq = NULL;
struct idpf_rxq_set *rxq_set = NULL;
- struct idpf_rx_buf *rx_buf = NULL;
unsigned int pkt_len = 0;
unsigned int hdr_len = 0;
u16 gen_id, buf_id = 0;
- /* Header buffer overflow only valid for header split */
- bool hbo = false;
int bufq_id;
u8 rxdid;

@@ -3311,25 +3360,6 @@ static int idpf_rx_splitq_clean(struct idpf_rx_queue *rxq, int budget)
pkt_len = le16_get_bits(rx_desc->pktlen_gen_bufq_id,
VIRTCHNL2_RX_FLEX_DESC_ADV_LEN_PBUF_M);

- hbo = FIELD_GET(VIRTCHNL2_RX_FLEX_DESC_ADV_STATUS0_HBO_M,
- rx_desc->status_err0_qw1);
-
- if (unlikely(hbo)) {
- /* If a header buffer overflow, occurs, i.e. header is
- * too large to fit in the header split buffer, HW will
- * put the entire packet, including headers, in the
- * data/payload buffer.
- */
- u64_stats_update_begin(&rxq->stats_sync);
- u64_stats_inc(&rxq->q_stats.hsplit_buf_ovf);
- u64_stats_update_end(&rxq->stats_sync);
- goto bypass_hsplit;
- }
-
- hdr_len = le16_get_bits(rx_desc->hdrlen_flags,
- VIRTCHNL2_RX_FLEX_DESC_ADV_LEN_HDR_M);
-
-bypass_hsplit:
bufq_id = le16_get_bits(rx_desc->pktlen_gen_bufq_id,
VIRTCHNL2_RX_FLEX_DESC_ADV_BUFQ_ID_M);

@@ -3341,18 +3371,48 @@ static int idpf_rx_splitq_clean(struct idpf_rx_queue *rxq, int budget)

buf_id = le16_to_cpu(rx_desc->buf_id);

- rx_buf = &rx_bufq->rx_buf.buf[buf_id];
+ rx_buf = &rx_bufq->buf[buf_id];

- if (hdr_len) {
- const void *va = (u8 *)rx_bufq->rx_buf.hdr_buf_va +
- (u32)buf_id * IDPF_HDR_BUF_SIZE;
+ if (!rx_bufq->hdr_pp)
+ goto payload;
+
+#define __HBO_BIT VIRTCHNL2_RX_FLEX_DESC_ADV_STATUS0_HBO_M
+#define __HDR_LEN_MASK VIRTCHNL2_RX_FLEX_DESC_ADV_LEN_HDR_M
+ if (likely(!(rx_desc->status_err0_qw1 & __HBO_BIT)))
+ /* If a header buffer overflow, occurs, i.e. header is
+ * too large to fit in the header split buffer, HW will
+ * put the entire packet, including headers, in the
+ * data/payload buffer.
+ */
+ hdr_len = le16_get_bits(rx_desc->hdrlen_flags,
+ __HDR_LEN_MASK);
+#undef __HDR_LEN_MASK
+#undef __HBO_BIT
+
+ hdr = &rx_bufq->hdr_buf[buf_id];
+
+ if (unlikely(!hdr_len && !skb)) {
+ hdr_len = idpf_rx_hsplit_wa(hdr, rx_buf, pkt_len);
+ pkt_len -= hdr_len;
+
+ u64_stats_update_begin(&rxq->stats_sync);
+ u64_stats_inc(&rxq->q_stats.hsplit_buf_ovf);
+ u64_stats_update_end(&rxq->stats_sync);
+ }
+
+ if (libeth_rx_sync_for_cpu(hdr, hdr_len)) {
+ skb = idpf_rx_build_skb(hdr, hdr_len);
+ if (!skb)
+ break;

- skb = idpf_rx_hdr_construct_skb(rxq, va, hdr_len);
u64_stats_update_begin(&rxq->stats_sync);
u64_stats_inc(&rxq->q_stats.hsplit_pkts);
u64_stats_update_end(&rxq->stats_sync);
}

+ hdr->page = NULL;
+
+payload:
if (pkt_len) {
idpf_rx_sync_for_cpu(rx_buf, pkt_len);
if (skb)
@@ -3422,10 +3482,13 @@ static int idpf_rx_splitq_clean(struct idpf_rx_queue *rxq, int budget)
static int idpf_rx_update_bufq_desc(struct idpf_buf_queue *bufq, u32 buf_id,
struct virtchnl2_splitq_rx_buf_desc *buf_desc)
{
+ struct libeth_fq_fp fq = {
+ .count = bufq->desc_count,
+ };
struct idpf_rx_buf *buf;
dma_addr_t addr;

- buf = &bufq->rx_buf.buf[buf_id];
+ buf = &bufq->buf[buf_id];

addr = idpf_alloc_page(bufq->pp, buf, bufq->rx_buf_size);
if (unlikely(addr == DMA_MAPPING_ERROR))
@@ -3437,8 +3500,15 @@ static int idpf_rx_update_bufq_desc(struct idpf_buf_queue *bufq, u32 buf_id,
if (!idpf_queue_has(HSPLIT_EN, bufq))
return 0;

- buf_desc->hdr_addr = cpu_to_le64(bufq->rx_buf.hdr_buf_pa +
- (u32)buf_id * IDPF_HDR_BUF_SIZE);
+ fq.pp = bufq->hdr_pp;
+ fq.fqes = bufq->hdr_buf;
+ fq.truesize = bufq->hdr_truesize;
+
+ addr = libeth_rx_alloc(&fq, buf_id);
+ if (addr == DMA_MAPPING_ERROR)
+ return -ENOMEM;
+
+ buf_desc->hdr_addr = cpu_to_le64(addr);

return 0;
}
@@ -3509,16 +3579,20 @@ static void idpf_rx_clean_refillq(struct idpf_buf_queue *bufq,
/**
* idpf_rx_clean_refillq_all - Clean all refill queues
* @bufq: buffer queue with refill queues
+ * @nid: ID of the closest NUMA node with memory
*
* Iterates through all refill queues assigned to the buffer queue assigned to
* this vector. Returns true if clean is complete within budget, false
* otherwise.
*/
-static void idpf_rx_clean_refillq_all(struct idpf_buf_queue *bufq)
+static void idpf_rx_clean_refillq_all(struct idpf_buf_queue *bufq, int nid)
{
struct idpf_bufq_set *bufq_set;
int i;

+ if (bufq->hdr_pp)
+ page_pool_nid_changed(bufq->hdr_pp, nid);
+
bufq_set = container_of(bufq, struct idpf_bufq_set, bufq);
for (i = 0; i < bufq_set->num_refillqs; i++)
idpf_rx_clean_refillq(bufq, &bufq_set->refillqs[i]);
@@ -4020,6 +4094,7 @@ static bool idpf_rx_splitq_clean_all(struct idpf_q_vector *q_vec, int budget,
bool clean_complete = true;
int pkts_cleaned = 0;
int i, budget_per_q;
+ int nid;

/* We attempt to distribute budget to each Rx queue fairly, but don't
* allow the budget to go below 1 because that would exit polling early.
@@ -4037,8 +4112,10 @@ static bool idpf_rx_splitq_clean_all(struct idpf_q_vector *q_vec, int budget,
}
*cleaned = pkts_cleaned;

+ nid = numa_mem_id();
+
for (i = 0; i < q_vec->num_bufq; i++)
- idpf_rx_clean_refillq_all(q_vec->bufq[i]);
+ idpf_rx_clean_refillq_all(q_vec->bufq[i], nid);

return clean_complete;
}
diff --git a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
index 83f3543a30e1..50dcb3ab02b1 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
@@ -1604,32 +1604,38 @@ static int idpf_send_config_rx_queues_msg(struct idpf_vport *vport)
num_rxq = rx_qgrp->singleq.num_rxq;

for (j = 0; j < num_rxq; j++, k++) {
+ const struct idpf_bufq_set *sets;
struct idpf_rx_queue *rxq;

if (!idpf_is_queue_model_split(vport->rxq_model)) {
rxq = rx_qgrp->singleq.rxqs[j];
goto common_qi_fields;
}
+
rxq = &rx_qgrp->splitq.rxq_sets[j]->rxq;
- qi[k].rx_bufq1_id =
- cpu_to_le16(rxq->bufq_sets[0].bufq.q_id);
+ sets = rxq->bufq_sets;
+
+ qi[k].rx_bufq1_id = cpu_to_le16(sets[0].bufq.q_id);
if (vport->num_bufqs_per_qgrp > IDPF_SINGLE_BUFQ_PER_RXQ_GRP) {
qi[k].bufq2_ena = IDPF_BUFQ2_ENA;
qi[k].rx_bufq2_id =
- cpu_to_le16(rxq->bufq_sets[1].bufq.q_id);
+ cpu_to_le16(sets[1].bufq.q_id);
}
qi[k].rx_buffer_low_watermark =
cpu_to_le16(rxq->rx_buffer_low_watermark);
if (idpf_is_feature_ena(vport, NETIF_F_GRO_HW))
qi[k].qflags |= cpu_to_le16(VIRTCHNL2_RXQ_RSC);

-common_qi_fields:
+ rxq->rx_hbuf_size = sets[0].bufq.rx_hbuf_size;
+
if (idpf_queue_has(HSPLIT_EN, rxq)) {
qi[k].qflags |=
cpu_to_le16(VIRTCHNL2_RXQ_HDR_SPLIT);
qi[k].hdr_buffer_size =
cpu_to_le16(rxq->rx_hbuf_size);
}
+
+common_qi_fields:
qi[k].queue_id = cpu_to_le32(rxq->q_id);
qi[k].model = cpu_to_le16(vport->rxq_model);
qi[k].type = cpu_to_le32(VIRTCHNL2_QUEUE_TYPE_RX);
--
2.45.1


2024-05-30 01:34:27

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH iwl-next 01/12] libeth: add cacheline / struct alignment helpers

On Tue, 28 May 2024 15:48:35 +0200 Alexander Lobakin wrote:
> diff --git a/scripts/kernel-doc b/scripts/kernel-doc
> index 95a59ac78f82..d0cf9a2d82de 100755
> --- a/scripts/kernel-doc
> +++ b/scripts/kernel-doc
> @@ -1155,6 +1155,7 @@ sub dump_struct($$) {
> $members =~ s/\bstruct_group_attr\s*\(([^,]*,){2}/STRUCT_GROUP(/gos;
> $members =~ s/\bstruct_group_tagged\s*\(([^,]*),([^,]*),/struct $1 $2; STRUCT_GROUP(/gos;
> $members =~ s/\b__struct_group\s*\(([^,]*,){3}/STRUCT_GROUP(/gos;
> + $members =~ s/\blibeth_cacheline_group\s*\(([^,]*,)/struct { } $1; STRUCT_GROUP(/gos;
> $members =~ s/\bSTRUCT_GROUP(\(((?:(?>[^)(]+)|(?1))*)\))[^;]*;/$2/gos;
>
> my $args = qr{([^,)]+)};

Having per-driver grouping defines is a no-go.
Do you need the defines in the first place?
Are you sure the asserts you're adding are not going to explode
on some weird arch? Honestly, patch 5 feels like a little too
much for a driver..

2024-05-30 01:40:23

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH iwl-next 11/12] idpf: convert header split mode to libeth + napi_build_skb()

On Tue, 28 May 2024 15:48:45 +0200 Alexander Lobakin wrote:
> Currently, idpf uses the following model for the header buffers:
>
> * buffers are allocated via dma_alloc_coherent();
> * when receiving, napi_alloc_skb() is called and then the header is
> copied to the newly allocated linear part.
>
> This is far from optimal as DMA coherent zone is slow on many systems
> and memcpy() neutralizes the idea and benefits of the header split. Not
> speaking of that XDP can't be run on DMA coherent buffers, but at the
> same time the idea of allocating an skb to run XDP program is ill.
> Instead, use libeth to create page_pools for the header buffers, allocate
> them dynamically and then build an skb via napi_build_skb() around them
> with no memory copy. With one exception...
> When you enable header split, you except you'll always have a separate

accept

> header buffer, so that you could reserve headroom and tailroom only
> there and then use full buffers for the data. For example, this is how
> TCP zerocopy works -- you have to have the payload aligned to PAGE_SIZE.
> The current hardware running idpf does *not* guarantee that you'll
> always have headers placed separately. For example, on my setup, even
> ICMP packets are written as one piece to the data buffers. You can't
> build a valid skb around a data buffer in this case.
> To not complicate things and not lose TCP zerocopy etc., when such thing
> happens, use the empty header buffer and pull either full frame (if it's
> short) or the Ethernet header there and build an skb around it. GRO
> layer will pull more from the data buffer later. This W/A will hopefully
> be removed one day.

Hopefully soon, cause it will prevent you from mapping data buffers to
user space or using DMABUF memory :(

2024-05-30 13:49:39

by Willem de Bruijn

[permalink] [raw]
Subject: Re: [PATCH iwl-next 11/12] idpf: convert header split mode to libeth + napi_build_skb()

Alexander Lobakin wrote:
> Currently, idpf uses the following model for the header buffers:
>
> * buffers are allocated via dma_alloc_coherent();
> * when receiving, napi_alloc_skb() is called and then the header is
> copied to the newly allocated linear part.
>
> This is far from optimal as DMA coherent zone is slow on many systems
> and memcpy() neutralizes the idea and benefits of the header split.

In the previous revision this assertion was called out, as we have
lots of experience with the existing implementation and a previous one
based on dynamic allocation that performed much worse. You said you
would share performance numbers in the next revision:

https://lore.kernel.org/netdev/[email protected]/T/#me85d509365aba9279275e9b181248247e1f01bb0

This may be so integral to this patch series that asking to back it
out now sets back the whole effort. That is not my intent.

And I appreciate that in principle there are many potential
optimizations.

But this (OOT) driver is already in use and regressions in existing
workloads are a serious headache. As is significant code churn wrt
other still-OOT feature patch series.

This series (of series) modifies the driver significantly, beyond the
narrow scope of adding XDP and AF_XDP.

> Not
> speaking of that XDP can't be run on DMA coherent buffers, but at the
> same time the idea of allocating an skb to run XDP program is ill.
> Instead, use libeth to create page_pools for the header buffers, allocate
> them dynamically and then build an skb via napi_build_skb() around them
> with no memory copy. With one exception...
> When you enable header split, you except you'll always have a separate
> header buffer, so that you could reserve headroom and tailroom only
> there and then use full buffers for the data. For example, this is how
> TCP zerocopy works -- you have to have the payload aligned to PAGE_SIZE.
> The current hardware running idpf does *not* guarantee that you'll
> always have headers placed separately. For example, on my setup, even
> ICMP packets are written as one piece to the data buffers. You can't
> build a valid skb around a data buffer in this case.
> To not complicate things and not lose TCP zerocopy etc., when such thing
> happens, use the empty header buffer and pull either full frame (if it's
> short) or the Ethernet header there and build an skb around it. GRO
> layer will pull more from the data buffer later. This W/A will hopefully
> be removed one day.
>
> Signed-off-by: Alexander Lobakin <[email protected]>

2024-06-01 08:55:35

by Simon Horman

[permalink] [raw]
Subject: Re: [PATCH iwl-next 03/12] idpf: split &idpf_queue into 4 strictly-typed queue structures

On Tue, May 28, 2024 at 03:48:37PM +0200, Alexander Lobakin wrote:
> Currently, sizeof(struct idpf_queue) is 32 Kb.
> This is due to the 12-bit hashtable declaration at the end of the queue.
> This HT is needed only for Tx queues when the flow scheduling mode is
> enabled. But &idpf_queue is unified for all of the queue types,
> provoking excessive memory usage.
> The unified structure in general makes the code less effective via
> suboptimal fields placement. You can't avoid that unless you make unions
> each 2 fields. Even then, different field alignment etc., doesn't allow
> you to optimize things to the limit.
> Split &idpf_queue into 4 structures corresponding to the queue types:
> RQ (Rx queue), SQ (Tx queue), FQ (buffer queue), and CQ (completion
> queue). Place only needed fields there and shortcuts handy for hotpath.
> Allocate the abovementioned hashtable dynamically and only when needed,
> keeping &idpf_tx_queue relatively short (192 bytes, same as Rx). This HT
> is used only for OOO completions, which aren't really hotpath anyway.
> Note that this change must be done atomically, otherwise it's really
> easy to get lost and miss something.
>
> Signed-off-by: Alexander Lobakin <[email protected]>

...

> diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c

...

> @@ -1158,20 +1325,22 @@ static void idpf_rxq_set_descids(struct idpf_vport *vport, struct idpf_queue *q)
> */
> static int idpf_txq_group_alloc(struct idpf_vport *vport, u16 num_txq)
> {
> - bool flow_sch_en;
> - int err, i;
> + bool split, flow_sch_en;
> + int i;
>
> vport->txq_grps = kcalloc(vport->num_txq_grp,
> sizeof(*vport->txq_grps), GFP_KERNEL);
> if (!vport->txq_grps)
> return -ENOMEM;
>
> + split = idpf_is_queue_model_split(vport->txq_model);
> flow_sch_en = !idpf_is_cap_ena(vport->adapter, IDPF_OTHER_CAPS,
> VIRTCHNL2_CAP_SPLITQ_QSCHED);
>
> for (i = 0; i < vport->num_txq_grp; i++) {
> struct idpf_txq_group *tx_qgrp = &vport->txq_grps[i];
> struct idpf_adapter *adapter = vport->adapter;
> + struct idpf_txq_stash *stashes;
> int j;
>
> tx_qgrp->vport = vport;
> @@ -1180,45 +1349,62 @@ static int idpf_txq_group_alloc(struct idpf_vport *vport, u16 num_txq)
> for (j = 0; j < tx_qgrp->num_txq; j++) {
> tx_qgrp->txqs[j] = kzalloc(sizeof(*tx_qgrp->txqs[j]),
> GFP_KERNEL);
> - if (!tx_qgrp->txqs[j]) {
> - err = -ENOMEM;
> + if (!tx_qgrp->txqs[j])
> goto err_alloc;
> - }
> + }
> +
> + if (split && flow_sch_en) {
> + stashes = kcalloc(num_txq, sizeof(*stashes),
> + GFP_KERNEL);

Hi Alexander,

Here stashes is assigned a memory allocation and
then assigned to tx_qgrp->stashes a few lines below...

> + if (!stashes)
> + goto err_alloc;
> +
> + tx_qgrp->stashes = stashes;
> }
>
> for (j = 0; j < tx_qgrp->num_txq; j++) {
> - struct idpf_queue *q = tx_qgrp->txqs[j];
> + struct idpf_tx_queue *q = tx_qgrp->txqs[j];
>
> q->dev = &adapter->pdev->dev;
> q->desc_count = vport->txq_desc_count;
> q->tx_max_bufs = idpf_get_max_tx_bufs(adapter);
> q->tx_min_pkt_len = idpf_get_min_tx_pkt_len(adapter);
> - q->vport = vport;
> + q->netdev = vport->netdev;
> q->txq_grp = tx_qgrp;
> - hash_init(q->sched_buf_hash);
>
> - if (flow_sch_en)
> - set_bit(__IDPF_Q_FLOW_SCH_EN, q->flags);
> + if (!split) {
> + q->clean_budget = vport->compln_clean_budget;
> + idpf_queue_assign(CRC_EN, q,
> + vport->crc_enable);
> + }
> +
> + if (!flow_sch_en)
> + continue;
> +
> + if (split) {

... but here elements of stashes seem to be assigned to q->stash
without stashes having been initialised.

Flagged by Smatch

> + q->stash = &stashes[j];
> + hash_init(q->stash->sched_buf_hash);
> + }
> +
> + idpf_queue_set(FLOW_SCH_EN, q);
> }
>
> - if (!idpf_is_queue_model_split(vport->txq_model))
> + if (!split)
> continue;
>
> tx_qgrp->complq = kcalloc(IDPF_COMPLQ_PER_GROUP,
> sizeof(*tx_qgrp->complq),
> GFP_KERNEL);
> - if (!tx_qgrp->complq) {
> - err = -ENOMEM;
> + if (!tx_qgrp->complq)
> goto err_alloc;
> - }
>
> - tx_qgrp->complq->dev = &adapter->pdev->dev;
> tx_qgrp->complq->desc_count = vport->complq_desc_count;
> - tx_qgrp->complq->vport = vport;
> tx_qgrp->complq->txq_grp = tx_qgrp;
> + tx_qgrp->complq->netdev = vport->netdev;
> + tx_qgrp->complq->clean_budget = vport->compln_clean_budget;
>
> if (flow_sch_en)
> - __set_bit(__IDPF_Q_FLOW_SCH_EN, tx_qgrp->complq->flags);
> + idpf_queue_set(FLOW_SCH_EN, tx_qgrp->complq);
> }
>
> return 0;
> @@ -1226,7 +1412,7 @@ static int idpf_txq_group_alloc(struct idpf_vport *vport, u16 num_txq)
> err_alloc:
> idpf_txq_group_rel(vport);
>
> - return err;
> + return -ENOMEM;
> }
>
> /**

...

2024-06-01 09:08:59

by Simon Horman

[permalink] [raw]
Subject: Re: [PATCH iwl-next 12/12] idpf: use libeth Rx buffer management for payload buffer

+ Dan Carpenter

On Tue, May 28, 2024 at 03:48:46PM +0200, Alexander Lobakin wrote:
> idpf uses Page Pool for data buffers with hardcoded buffer lengths of
> 4k for "classic" buffers and 2k for "short" ones. This is not flexible
> and does not ensure optimal memory usage. Why would you need 4k buffers
> when the MTU is 1500?
> Use libeth for the data buffers and don't hardcode any buffer sizes. Let
> them be calculated from the MTU for "classics" and then divide the
> truesize by 2 for "short" ones. The memory usage is now greatly reduced
> and 2 buffer queues starts make sense: on frames <= 1024, you'll recycle
> (and resync) a page only after 4 HW writes rather than two.
>
> Signed-off-by: Alexander Lobakin <[email protected]>

...

> diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c

...

Hi Alexander,

The code above the hunk below, starting at line 3321, is:

	if (unlikely(!hdr_len && !skb)) {
		hdr_len = idpf_rx_hsplit_wa(hdr, rx_buf, pkt_len);
		pkt_len -= hdr_len;
		u64_stats_update_begin(&rxq->stats_sync);
		u64_stats_inc(&rxq->q_stats.hsplit_buf_ovf);
		u64_stats_update_end(&rxq->stats_sync);
	}
	if (libeth_rx_sync_for_cpu(hdr, hdr_len)) {
		skb = idpf_rx_build_skb(hdr, hdr_len);
		if (!skb)
			break;
		u64_stats_update_begin(&rxq->stats_sync);
		u64_stats_inc(&rxq->q_stats.hsplit_pkts);
		u64_stats_update_end(&rxq->stats_sync);
	}

> @@ -3413,24 +3340,24 @@ static int idpf_rx_splitq_clean(struct idpf_rx_queue *rxq, int budget)
> hdr->page = NULL;
>
> payload:
> - if (pkt_len) {
> - idpf_rx_sync_for_cpu(rx_buf, pkt_len);
> - if (skb)
> - idpf_rx_add_frag(rx_buf, skb, pkt_len);
> - else
> - skb = idpf_rx_construct_skb(rxq, rx_buf,
> - pkt_len);
> - } else {
> - idpf_rx_put_page(rx_buf);
> - }
> + if (!libeth_rx_sync_for_cpu(rx_buf, pkt_len))
> + goto skip_data;
> +
> + if (skb)
> + idpf_rx_add_frag(rx_buf, skb, pkt_len);
> + else
> + skb = idpf_rx_build_skb(rx_buf, pkt_len);
>
> /* exit if we failed to retrieve a buffer */
> if (!skb)
> break;
>
> - idpf_rx_post_buf_refill(refillq, buf_id);
> +skip_data:
> + rx_buf->page = NULL;
>
> + idpf_rx_post_buf_refill(refillq, buf_id);
> IDPF_RX_BUMP_NTC(rxq, ntc);
> +
> /* skip if it is non EOP desc */
> if (!idpf_rx_splitq_is_eop(rx_desc))
> continue;

The code following this hunk, ending at line 3372, looks like this:

	/* pad skb if needed (to make valid ethernet frame) */
	if (eth_skb_pad(skb)) {
		skb = NULL;
		continue;
	}
	/* probably a little skewed due to removing CRC */
	total_rx_bytes += skb->len;

Smatch warns that:
.../idpf_txrx.c:3372 idpf_rx_splitq_clean() error: we previously assumed 'skb' could be null (see line 3321)

I think, but am not sure, that this is because it thinks skb might
be NULL at the point where the "goto skip_data;" above is now taken.

Could you look into this?

...

2024-06-12 10:22:02

by Przemek Kitszel

[permalink] [raw]
Subject: Re: [PATCH iwl-next 01/12] libeth: add cacheline / struct alignment helpers

On 5/30/24 03:34, Jakub Kicinski wrote:
> On Tue, 28 May 2024 15:48:35 +0200 Alexander Lobakin wrote:
>> diff --git a/scripts/kernel-doc b/scripts/kernel-doc
>> index 95a59ac78f82..d0cf9a2d82de 100755
>> --- a/scripts/kernel-doc
>> +++ b/scripts/kernel-doc
>> @@ -1155,6 +1155,7 @@ sub dump_struct($$) {
>> $members =~ s/\bstruct_group_attr\s*\(([^,]*,){2}/STRUCT_GROUP(/gos;
>> $members =~ s/\bstruct_group_tagged\s*\(([^,]*),([^,]*),/struct $1 $2; STRUCT_GROUP(/gos;
>> $members =~ s/\b__struct_group\s*\(([^,]*,){3}/STRUCT_GROUP(/gos;
>> + $members =~ s/\blibeth_cacheline_group\s*\(([^,]*,)/struct { } $1; STRUCT_GROUP(/gos;
>> $members =~ s/\bSTRUCT_GROUP(\(((?:(?>[^)(]+)|(?1))*)\))[^;]*;/$2/gos;
>>
>> my $args = qr{([^,)]+)};
>
> Having per-driver grouping defines is a no-go.

[1]

> Do you need the defines in the first place?

this patch was a tough one for me too, but I find the idea promising

> Are you sure the assert you're adding are not going to explode
> on some weird arch? Honestly, patch 5 feels like a little too
> much for a driver..
>

definitely some of patch 5 should be added here as doc/example,
but it would be even better to simplify it a bit

--

I think that "mark this struct as explicit split into cachelines" is
a hard hard C problem in general, especially in the kernel context,
*but* I think that this could be simplified for your use case - split
into exactly 3 (possibly empty) sections: mostly-Read, RW, COLD?

Given that, it will be a generic solution (it would fix [1] above)
and also be easier to use, like:

CACHELINE_STRUCT_GROUP(idpf_q_vector,
	CACHELINE_STRUCT_GROUP_RD(/* read mostly */
		struct idpf_vport *vport;
		u16 num_rxq;
		u16 num_txq;
		u16 num_bufq;
		u16 num_complq;
		struct idpf_rx_queue **rx;
		struct idpf_tx_queue **tx;
		struct idpf_buf_queue **bufq;
		struct idpf_compl_queue **complq;
		struct idpf_intr_reg intr_reg;
	),
	CACHELINE_STRUCT_GROUP_RW(
		struct napi_struct napi;
		u16 total_events;
		struct dim tx_dim;
		u16 tx_itr_value;
		bool tx_intr_mode;
		u32 tx_itr_idx;
		struct dim rx_dim;
		u16 rx_itr_value;
		bool rx_intr_mode;
		u32 rx_itr_idx;
	),
	CACHELINE_STRUCT_GROUP_COLD(
		u16 v_idx;
		cpumask_var_t affinity_mask;
	)
);

Note that those three inner macros have distinct meaningful names not to
make this work, but to aid the human reader and then checkpatch/check-kdoc.
Technically they could all be the same CACHELINE_GROUP().

I'm not sure if (at most) 3 cacheline groups are fine for the general
case, but it would be best to have just one variant of the
CACHELINE_STRUCT_GROUP(), perhaps as a vararg.

2024-06-12 20:56:38

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH iwl-next 01/12] libeth: add cacheline / struct alignment helpers

On Wed, 12 Jun 2024 12:07:05 +0200 Przemek Kitszel wrote:
> Given that it will be a generic solution (would fix the [1] above),
> and be also easier to use, like:
>
> CACHELINE_STRUCT_GROUP(idpf_q_vector,
> CACHELINE_STRUCT_GROUP_RD(/* read mostly */
> struct idpf_vport *vport;
> u16 num_rxq;
> u16 num_txq;
> u16 num_bufq;
> u16 num_complq;
> struct idpf_rx_queue **rx;
> struct idpf_tx_queue **tx;
> struct idpf_buf_queue **bufq;
> struct idpf_compl_queue **complq;
> struct idpf_intr_reg intr_reg;
> ),
> CACHELINE_STRUCT_GROUP_RW(
> struct napi_struct napi;
> u16 total_events;
> struct dim tx_dim;
> u16 tx_itr_value;
> bool tx_intr_mode;
> u32 tx_itr_idx;
> struct dim rx_dim;
> u16 rx_itr_value;
> bool rx_intr_mode;
> u32 rx_itr_idx;
> ),
> CACHELINE_STRUCT_GROUP_COLD(
> u16 v_idx;
> cpumask_var_t affinity_mask;
> )
> );
>
> Note that those three inner macros have distinct meaningful names not to
> have this working, but to aid human reader, then checkpatch/check-kdoc.
> Technically could be all the same CACHELINE_GROUP().
>
> I'm not sure if (at most) 3 cacheline groups are fine for the general
> case, but it would be best to have just one variant of the
> CACHELINE_STRUCT_GROUP(), perhaps as a vararg.

I almost want to CC Linus on this because I think it's mostly about
personal preferences. I dislike the struct_group()-style macros. They
don't scale (imagine having to define two partially overlapping groups)
and don't look like C to my eyes. Kees really had to do this for his
memory safety work because we need to communicate a "real struct" type
to the compiler, but if you're just doing this to fail the build and
make the developer stop to think - it's not worth the ugliness.

Can we not extend __cacheline_group_begin() and __cacheline_group_end()
-style markings?
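
For reference, that existing style from <linux/cache.h> looks roughly like
this (struct and field names below are illustrative only, not taken from
idpf):

struct example_q_vector {
	__cacheline_group_begin(example_read_mostly);
	void			*vport;
	u16			num_rxq;
	__cacheline_group_end(example_read_mostly);

	__cacheline_group_begin(example_read_write);
	u16			total_events;
	u32			tx_itr_idx;
	__cacheline_group_end(example_read_write);
};

/* ... with the layout then pinned down in a function, e.g.: */
static void example_struct_check(void)
{
	CACHELINE_ASSERT_GROUP_MEMBER(struct example_q_vector,
				      example_read_mostly, num_rxq);
}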

2024-06-13 10:48:32

by Alexander Lobakin

[permalink] [raw]
Subject: Re: [PATCH iwl-next 01/12] libeth: add cacheline / struct alignment helpers

From: Jakub Kicinski <[email protected]>
Date: Wed, 29 May 2024 18:34:09 -0700

> On Tue, 28 May 2024 15:48:35 +0200 Alexander Lobakin wrote:
>> diff --git a/scripts/kernel-doc b/scripts/kernel-doc
>> index 95a59ac78f82..d0cf9a2d82de 100755
>> --- a/scripts/kernel-doc
>> +++ b/scripts/kernel-doc
>> @@ -1155,6 +1155,7 @@ sub dump_struct($$) {
>> $members =~ s/\bstruct_group_attr\s*\(([^,]*,){2}/STRUCT_GROUP(/gos;
>> $members =~ s/\bstruct_group_tagged\s*\(([^,]*),([^,]*),/struct $1 $2; STRUCT_GROUP(/gos;
>> $members =~ s/\b__struct_group\s*\(([^,]*,){3}/STRUCT_GROUP(/gos;
>> + $members =~ s/\blibeth_cacheline_group\s*\(([^,]*,)/struct { } $1; STRUCT_GROUP(/gos;
>> $members =~ s/\bSTRUCT_GROUP(\(((?:(?>[^)(]+)|(?1))*)\))[^;]*;/$2/gos;
>>
>> my $args = qr{([^,)]+)};
>
> Having per-driver grouping defines is a no-go.

Without it, kdoc warns when I want to describe group fields =\

> Do you need the defines in the first place?

They allow describing CLs w/o repeating boilerplate like

cacheline_group_begin(blah) __aligned(blah)
fields
cacheline_group_end(blah)
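
(a sketch of what such a wrapper could look like; not necessarily the
exact definition from the patch:)

#define libeth_cacheline_group(grp, ...)				\
	__cacheline_group_begin(grp) __aligned(SMP_CACHE_BYTES);	\
	__VA_ARGS__							\
	__cacheline_group_end(grp)

so a structure can then be written as

struct example_rx_queue {
	libeth_cacheline_group(read_mostly,
		u32	desc_count;
		u32	rxdids;
	);
	/* ... */
};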

> Are you sure the assert you're adding are not going to explode
> on some weird arch? Honestly, patch 5 feels like a little too

I was adjusting and testing it a lot and CI finally started building
every arch with no issues some time ago, so yes, I'm sure.
A 64-byte CL on a 64-bit arch behaves the same everywhere, so the assertions
for it can be stricter. On other arches, the behaviour is the same as
how Eric asserts netdev cachelines in the core code.

> much for a driver..

We had multiple situations where our team was optimizing the structure
layout and then someone added a new field and messed up the layout
again. So I ended up with strict assertions.
Why is it too much if we have the same stuff for the netdev core?
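
(The flavour of assertion in question; the struct, group name and size
budget below are purely illustrative, not the actual checks from patch 5,
and assume the struct uses the __cacheline_group markers:)

/* fail the build if a CL group silently outgrows its budget */
#define cl_group_assert_sketch(type, grp, sz)				\
	static_assert(offsetof(type, __cacheline_group_end__##grp) -	\
		      offsetof(type, __cacheline_group_begin__##grp) <=	\
		      (sz))

cl_group_assert_sketch(struct example_q_vector, example_read_mostly,
		       SMP_CACHE_BYTES);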

Thanks,
Olek

2024-06-13 10:59:36

by Alexander Lobakin

[permalink] [raw]
Subject: Re: [PATCH iwl-next 11/12] idpf: convert header split mode to libeth + napi_build_skb()

From: Jakub Kicinski <[email protected]>
Date: Wed, 29 May 2024 18:40:12 -0700

> On Tue, 28 May 2024 15:48:45 +0200 Alexander Lobakin wrote:
>> Currently, idpf uses the following model for the header buffers:
>>
>> * buffers are allocated via dma_alloc_coherent();
>> * when receiving, napi_alloc_skb() is called and then the header is
>> copied to the newly allocated linear part.
>>
>> This is far from optimal as DMA coherent zone is slow on many systems
>> and memcpy() neutralizes the idea and benefits of the header split. Not
>> speaking of that XDP can't be run on DMA coherent buffers, but at the
>> same time the idea of allocating an skb to run XDP program is ill.
>> Instead, use libeth to create page_pools for the header buffers, allocate
>> them dynamically and then build an skb via napi_build_skb() around them
>> with no memory copy. With one exception...
>> When you enable header split, you except you'll always have a separate
>
> accept

"expect" :D Thanks for spotting, nice catch.

>
>> header buffer, so that you could reserve headroom and tailroom only
>> there and then use full buffers for the data. For example, this is how
>> TCP zerocopy works -- you have to have the payload aligned to PAGE_SIZE.
>> The current hardware running idpf does *not* guarantee that you'll
>> always have headers placed separately. For example, on my setup, even
>> ICMP packets are written as one piece to the data buffers. You can't
>> build a valid skb around a data buffer in this case.
>> To not complicate things and not lose TCP zerocopy etc., when such thing
>> happens, use the empty header buffer and pull either full frame (if it's
>> short) or the Ethernet header there and build an skb around it. GRO
>> layer will pull more from the data buffer later. This W/A will hopefully
>> be removed one day.
>
> Hopefully soon, cause it will prevent you from mapping data buffers to
> user space or using DMABUF memory :(

Correct. The HW team is informed and some work on it is happening right
now. I told them that stuff like devmem etc., i.e. when the kernel
doesn't have access to the data buffers, will just choke on this.

I mean, I don't care about unknown packet types, since they're very
unlikely and almost always some garbage, nor about headers that don't fit
into 256 bytes (I can't imagine such a situation with known protocols,
but if it can happen, I can bump it to 512 or so). But currently, and
it's a shame, idpf does header split only for TCP/UDP/SCTP; on my setup
it didn't want to split even regular ICMP, although it parsed the type
correctly =\ I dunno why they did it this way.
There are some configuration bits which in theory allow enabling hsplit
for other types of frames, but enabling them didn't change anything,
unfortunately.

Thanks,
Olek

2024-06-13 11:03:46

by Alexander Lobakin

[permalink] [raw]
Subject: Re: [PATCH iwl-next 03/12] idpf: split &idpf_queue into 4 strictly-typed queue structures

From: Simon Horman <[email protected]>
Date: Sat, 1 Jun 2024 09:53:08 +0100

> On Tue, May 28, 2024 at 03:48:37PM +0200, Alexander Lobakin wrote:
>> Currently, sizeof(struct idpf_queue) is 32 Kb.
>> This is due to the 12-bit hashtable declaration at the end of the queue.
>> This HT is needed only for Tx queues when the flow scheduling mode is
>> enabled. But &idpf_queue is unified for all of the queue types,
>> provoking excessive memory usage.
>> The unified structure in general makes the code less effective via
>> suboptimal fields placement. You can't avoid that unless you make unions
>> each 2 fields. Even then, different field alignment etc., doesn't allow
>> you to optimize things to the limit.
>> Split &idpf_queue into 4 structures corresponding to the queue types:
>> RQ (Rx queue), SQ (Tx queue), FQ (buffer queue), and CQ (completion
>> queue). Place only needed fields there and shortcuts handy for hotpath.
>> Allocate the abovementioned hashtable dynamically and only when needed,
>> keeping &idpf_tx_queue relatively short (192 bytes, same as Rx). This HT
>> is used only for OOO completions, which aren't really hotpath anyway.
>> Note that this change must be done atomically, otherwise it's really
>> easy to get lost and miss something.
>>
>> Signed-off-by: Alexander Lobakin <[email protected]>
>
> ...
>
>> diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
>
> ...
>
>> @@ -1158,20 +1325,22 @@ static void idpf_rxq_set_descids(struct idpf_vport *vport, struct idpf_queue *q)
>> */
>> static int idpf_txq_group_alloc(struct idpf_vport *vport, u16 num_txq)
>> {
>> - bool flow_sch_en;
>> - int err, i;
>> + bool split, flow_sch_en;
>> + int i;
>>
>> vport->txq_grps = kcalloc(vport->num_txq_grp,
>> sizeof(*vport->txq_grps), GFP_KERNEL);
>> if (!vport->txq_grps)
>> return -ENOMEM;
>>
>> + split = idpf_is_queue_model_split(vport->txq_model);
>> flow_sch_en = !idpf_is_cap_ena(vport->adapter, IDPF_OTHER_CAPS,
>> VIRTCHNL2_CAP_SPLITQ_QSCHED);
>>
>> for (i = 0; i < vport->num_txq_grp; i++) {
>> struct idpf_txq_group *tx_qgrp = &vport->txq_grps[i];
>> struct idpf_adapter *adapter = vport->adapter;
>> + struct idpf_txq_stash *stashes;
>> int j;
>>
>> tx_qgrp->vport = vport;
>> @@ -1180,45 +1349,62 @@ static int idpf_txq_group_alloc(struct idpf_vport *vport, u16 num_txq)
>> for (j = 0; j < tx_qgrp->num_txq; j++) {
>> tx_qgrp->txqs[j] = kzalloc(sizeof(*tx_qgrp->txqs[j]),
>> GFP_KERNEL);
>> - if (!tx_qgrp->txqs[j]) {
>> - err = -ENOMEM;
>> + if (!tx_qgrp->txqs[j])
>> goto err_alloc;
>> - }
>> + }
>> +
>> + if (split && flow_sch_en) {
>> + stashes = kcalloc(num_txq, sizeof(*stashes),
>> + GFP_KERNEL);
>
> Hi Alexander,
>
> Here stashes is assigned a memory allocation and
> then then assigned to tx_qgrp->stashes a few lines below...
>
>> + if (!stashes)
>> + goto err_alloc;
>> +
>> + tx_qgrp->stashes = stashes;
>> }
>>
>> for (j = 0; j < tx_qgrp->num_txq; j++) {
>> - struct idpf_queue *q = tx_qgrp->txqs[j];
>> + struct idpf_tx_queue *q = tx_qgrp->txqs[j];
>>
>> q->dev = &adapter->pdev->dev;
>> q->desc_count = vport->txq_desc_count;
>> q->tx_max_bufs = idpf_get_max_tx_bufs(adapter);
>> q->tx_min_pkt_len = idpf_get_min_tx_pkt_len(adapter);
>> - q->vport = vport;
>> + q->netdev = vport->netdev;
>> q->txq_grp = tx_qgrp;
>> - hash_init(q->sched_buf_hash);
>>
>> - if (flow_sch_en)
>> - set_bit(__IDPF_Q_FLOW_SCH_EN, q->flags);
>> + if (!split) {
>> + q->clean_budget = vport->compln_clean_budget;
>> + idpf_queue_assign(CRC_EN, q,
>> + vport->crc_enable);
>> + }
>> +
>> + if (!flow_sch_en)
>> + continue;
>> +
>> + if (split) {
>
> ... but here elements of stashes seem to be assigned to q->stash
> without stashes having being initialised.
>
> Flagged by Smatch

Hi! Yes, I saw the report, but isn't it a false positive?

Allocation happens when `split && flow_sch_en`, and here we have

	if (!flow_sch_en)
		continue;

	if (split)
		// assign

IOW the assignment can't happen without the allocation?

>
>> + q->stash = &stashes[j];
>> + hash_init(q->stash->sched_buf_hash);
>> + }
>> +
>> + idpf_queue_set(FLOW_SCH_EN, q);

Thanks,
Olek

2024-06-13 11:06:44

by Alexander Lobakin

[permalink] [raw]
Subject: Re: [PATCH iwl-next 12/12] idpf: use libeth Rx buffer management for payload buffer

From: Simon Horman <[email protected]>
Date: Sat, 1 Jun 2024 10:08:42 +0100

> + Dan Carpenter
>
> On Tue, May 28, 2024 at 03:48:46PM +0200, Alexander Lobakin wrote:
>> idpf uses Page Pool for data buffers with hardcoded buffer lengths of
>> 4k for "classic" buffers and 2k for "short" ones. This is not flexible
>> and does not ensure optimal memory usage. Why would you need 4k buffers
>> when the MTU is 1500?
>> Use libeth for the data buffers and don't hardcode any buffer sizes. Let
>> them be calculated from the MTU for "classics" and then divide the
>> truesize by 2 for "short" ones. The memory usage is now greatly reduced
>> and 2 buffer queues starts make sense: on frames <= 1024, you'll recycle
>> (and resync) a page only after 4 HW writes rather than two.
>>
>> Signed-off-by: Alexander Lobakin <[email protected]>
>
> ...
>
>> diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
>
> ...
>
> Hi Alexander,
>
> The code above the hunk below, starting at line 3321, is:
>
> if (unlikely(!hdr_len && !skb)) {
> hdr_len = idpf_rx_hsplit_wa(hdr, rx_buf, pkt_len);
> pkt_len -= hdr_len;
> u64_stats_update_begin(&rxq->stats_sync);
> u64_stats_inc(&rxq->q_stats.hsplit_buf_ovf);
> u64_stats_update_end(&rxq->stats_sync);
> }
> if (libeth_rx_sync_for_cpu(hdr, hdr_len)) {
> skb = idpf_rx_build_skb(hdr, hdr_len);
> if (!skb)
> break;
> u64_stats_update_begin(&rxq->stats_sync);
> u64_stats_inc(&rxq->q_stats.hsplit_pkts);
> u64_stats_update_end(&rxq->stats_sync);
> }
>
>> @@ -3413,24 +3340,24 @@ static int idpf_rx_splitq_clean(struct idpf_rx_queue *rxq, int budget)
>> hdr->page = NULL;
>>
>> payload:
>> - if (pkt_len) {
>> - idpf_rx_sync_for_cpu(rx_buf, pkt_len);
>> - if (skb)
>> - idpf_rx_add_frag(rx_buf, skb, pkt_len);
>> - else
>> - skb = idpf_rx_construct_skb(rxq, rx_buf,
>> - pkt_len);
>> - } else {
>> - idpf_rx_put_page(rx_buf);
>> - }
>> + if (!libeth_rx_sync_for_cpu(rx_buf, pkt_len))
>> + goto skip_data;
>> +
>> + if (skb)
>> + idpf_rx_add_frag(rx_buf, skb, pkt_len);
>> + else
>> + skb = idpf_rx_build_skb(rx_buf, pkt_len);
>>
>> /* exit if we failed to retrieve a buffer */
>> if (!skb)
>> break;
>>
>> - idpf_rx_post_buf_refill(refillq, buf_id);
>> +skip_data:
>> + rx_buf->page = NULL;
>>
>> + idpf_rx_post_buf_refill(refillq, buf_id);
>> IDPF_RX_BUMP_NTC(rxq, ntc);
>> +
>> /* skip if it is non EOP desc */
>> if (!idpf_rx_splitq_is_eop(rx_desc))
>> continue;
>
> The code following this hunk, ending at line 3372, looks like this:
>
> /* pad skb if needed (to make valid ethernet frame) */
> if (eth_skb_pad(skb)) {
> skb = NULL;
> continue;
> }
> /* probably a little skewed due to removing CRC */
> total_rx_bytes += skb->len;
>
> Smatch warns that:
> .../idpf_txrx.c:3372 idpf_rx_splitq_clean() error: we previously assumed 'skb' could be null (see line 3321)
>
> I think, but am not sure, this is because it thinks skb might
> be NULL at the point where "goto skip_data;" is now called above.
>
> Could you look into this?

This is actually a good catch. skb indeed could be NULL, and we need to
check it in the same condition where !eop is checked.
Already fixed in my tree, so it will be in v2. Thanks for catching it!

(BTW I fixed that in iavf when submitting the libeth series, but forgot
to fix that here lol >_<)
(Also, it was implicitly fixed in the later commits where I convert skb
to xdp_buff here, so I didn't catch this one)
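
(Presumably something along these lines in v2, though the exact diff may
differ:)

		/* skip if it is non EOP desc, or if we never got an skb
		 * for this descriptor chain
		 */
		if (!idpf_rx_splitq_is_eop(rx_desc) || unlikely(!skb))
			continue;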

Thanks,
Olek

2024-06-13 13:47:29

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH iwl-next 01/12] libeth: add cacheline / struct alignment helpers

On Thu, 13 Jun 2024 12:47:33 +0200 Alexander Lobakin wrote:
> > Having per-driver grouping defines is a no-go.
>
> Without it, kdoc warns when I want to describe group fields =\
>
> > Do you need the defines in the first place?
>
> They allow to describe CLs w/o repeating boilerplates like
>
> cacheline_group_begin(blah) __aligned(blah)
> fields
> cacheline_group_end(blah)

And you assert that your boilerplate is somehow nicer than this?
See my reply to Przemek: I don't think so, and neither do other
maintainers, judging by how the socket grouping was done.
You can add new markers to include the alignment automatically too, etc.

> > Are you sure the assert you're adding are not going to explode
> > on some weird arch? Honestly, patch 5 feels like a little too
>
> I was adjusting and testing it a lot and CI finally started building
> every arch with no issues some time ago, so yes, I'm sure.
> 64-byte CL on 64-bit arch behaves the same everywhere, so the assertions
> for it can be more strict. On other arches, the behaviour is the same as
> how Eric asserts netdev cachelines in the core code.
>
> > much for a driver..
>
> We had multiple situations when our team were optimizing the structure
> layout and then someone added a new field and messed up the layout
> again. So I ended up with strict assertions.

I understand. Not 100% sure I agree but depends on the team, so okay.

> Why is it too much if we have the same stuff for the netdev core?

But we didn't add tcp_* macros and sock_* macros etc.
Improve the stuff in cache.h if you think it's worth it.
And no struct_groups() please.

2024-06-15 07:33:15

by Simon Horman

[permalink] [raw]
Subject: Re: [PATCH iwl-next 03/12] idpf: split &idpf_queue into 4 strictly-typed queue structures

On Thu, Jun 13, 2024 at 01:03:00PM +0200, Alexander Lobakin wrote:
> From: Simon Horman <[email protected]>
> Date: Sat, 1 Jun 2024 09:53:08 +0100
>
> > On Tue, May 28, 2024 at 03:48:37PM +0200, Alexander Lobakin wrote:
> >> Currently, sizeof(struct idpf_queue) is 32 Kb.
> >> This is due to the 12-bit hashtable declaration at the end of the queue.
> >> This HT is needed only for Tx queues when the flow scheduling mode is
> >> enabled. But &idpf_queue is unified for all of the queue types,
> >> provoking excessive memory usage.
> >> The unified structure in general makes the code less effective via
> >> suboptimal fields placement. You can't avoid that unless you make unions
> >> each 2 fields. Even then, different field alignment etc., doesn't allow
> >> you to optimize things to the limit.
> >> Split &idpf_queue into 4 structures corresponding to the queue types:
> >> RQ (Rx queue), SQ (Tx queue), FQ (buffer queue), and CQ (completion
> >> queue). Place only needed fields there and shortcuts handy for hotpath.
> >> Allocate the abovementioned hashtable dynamically and only when needed,
> >> keeping &idpf_tx_queue relatively short (192 bytes, same as Rx). This HT
> >> is used only for OOO completions, which aren't really hotpath anyway.
> >> Note that this change must be done atomically, otherwise it's really
> >> easy to get lost and miss something.
> >>
> >> Signed-off-by: Alexander Lobakin <[email protected]>
> >
> > ...
> >
> >> diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
> >
> > ...
> >
> >> @@ -1158,20 +1325,22 @@ static void idpf_rxq_set_descids(struct idpf_vport *vport, struct idpf_queue *q)
> >> */
> >> static int idpf_txq_group_alloc(struct idpf_vport *vport, u16 num_txq)
> >> {
> >> - bool flow_sch_en;
> >> - int err, i;
> >> + bool split, flow_sch_en;
> >> + int i;
> >>
> >> vport->txq_grps = kcalloc(vport->num_txq_grp,
> >> sizeof(*vport->txq_grps), GFP_KERNEL);
> >> if (!vport->txq_grps)
> >> return -ENOMEM;
> >>
> >> + split = idpf_is_queue_model_split(vport->txq_model);
> >> flow_sch_en = !idpf_is_cap_ena(vport->adapter, IDPF_OTHER_CAPS,
> >> VIRTCHNL2_CAP_SPLITQ_QSCHED);
> >>
> >> for (i = 0; i < vport->num_txq_grp; i++) {
> >> struct idpf_txq_group *tx_qgrp = &vport->txq_grps[i];
> >> struct idpf_adapter *adapter = vport->adapter;
> >> + struct idpf_txq_stash *stashes;
> >> int j;
> >>
> >> tx_qgrp->vport = vport;
> >> @@ -1180,45 +1349,62 @@ static int idpf_txq_group_alloc(struct idpf_vport *vport, u16 num_txq)
> >> for (j = 0; j < tx_qgrp->num_txq; j++) {
> >> tx_qgrp->txqs[j] = kzalloc(sizeof(*tx_qgrp->txqs[j]),
> >> GFP_KERNEL);
> >> - if (!tx_qgrp->txqs[j]) {
> >> - err = -ENOMEM;
> >> + if (!tx_qgrp->txqs[j])
> >> goto err_alloc;
> >> - }
> >> + }
> >> +
> >> + if (split && flow_sch_en) {
> >> + stashes = kcalloc(num_txq, sizeof(*stashes),
> >> + GFP_KERNEL);
> >
> > Hi Alexander,
> >
> > Here stashes is assigned a memory allocation and
> > then then assigned to tx_qgrp->stashes a few lines below...
> >
> >> + if (!stashes)
> >> + goto err_alloc;
> >> +
> >> + tx_qgrp->stashes = stashes;
> >> }
> >>
> >> for (j = 0; j < tx_qgrp->num_txq; j++) {
> >> - struct idpf_queue *q = tx_qgrp->txqs[j];
> >> + struct idpf_tx_queue *q = tx_qgrp->txqs[j];
> >>
> >> q->dev = &adapter->pdev->dev;
> >> q->desc_count = vport->txq_desc_count;
> >> q->tx_max_bufs = idpf_get_max_tx_bufs(adapter);
> >> q->tx_min_pkt_len = idpf_get_min_tx_pkt_len(adapter);
> >> - q->vport = vport;
> >> + q->netdev = vport->netdev;
> >> q->txq_grp = tx_qgrp;
> >> - hash_init(q->sched_buf_hash);
> >>
> >> - if (flow_sch_en)
> >> - set_bit(__IDPF_Q_FLOW_SCH_EN, q->flags);
> >> + if (!split) {
> >> + q->clean_budget = vport->compln_clean_budget;
> >> + idpf_queue_assign(CRC_EN, q,
> >> + vport->crc_enable);
> >> + }
> >> +
> >> + if (!flow_sch_en)
> >> + continue;
> >> +
> >> + if (split) {
> >
> > ... but here elements of stashes seem to be assigned to q->stash
> > without stashes having being initialised.
> >
> > Flagged by Smatch
>
> Hi! Yes, I saw the report, but isn't it a false positive?
>
> Allocation happens when `split && flow_sch_en`, and here we have
>
> if (!flow_sch_en)
> continue;
>
> if (split)
> // assign
>
> IOW the assignment can't happen without the allocation?

Thanks, and sorry for missing the points you highlight above.
I agree that this is a false positive.

>
> >
> >> + q->stash = &stashes[j];
> >> + hash_init(q->stash->sched_buf_hash);
> >> + }
> >> +
> >> + idpf_queue_set(FLOW_SCH_EN, q);
>
> Thanks,
> Olek
>

2024-06-15 07:35:57

by Simon Horman

[permalink] [raw]
Subject: Re: [PATCH iwl-next 12/12] idpf: use libeth Rx buffer management for payload buffer

On Thu, Jun 13, 2024 at 01:05:58PM +0200, Alexander Lobakin wrote:
> From: Simon Horman <[email protected]>
> Date: Sat, 1 Jun 2024 10:08:42 +0100
>
> > + Dan Carpenter
> >
> > On Tue, May 28, 2024 at 03:48:46PM +0200, Alexander Lobakin wrote:
> >> idpf uses Page Pool for data buffers with hardcoded buffer lengths of
> >> 4k for "classic" buffers and 2k for "short" ones. This is not flexible
> >> and does not ensure optimal memory usage. Why would you need 4k buffers
> >> when the MTU is 1500?
> >> Use libeth for the data buffers and don't hardcode any buffer sizes. Let
> >> them be calculated from the MTU for "classics" and then divide the
> >> truesize by 2 for "short" ones. The memory usage is now greatly reduced
> >> and 2 buffer queues starts make sense: on frames <= 1024, you'll recycle
> >> (and resync) a page only after 4 HW writes rather than two.
> >>
> >> Signed-off-by: Alexander Lobakin <[email protected]>
> >
> > ...
> >
> >> diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
> >
> > ...
> >
> > Hi Alexander,
> >
> > The code above the hunk below, starting at line 3321, is:
> >
> > if (unlikely(!hdr_len && !skb)) {
> > hdr_len = idpf_rx_hsplit_wa(hdr, rx_buf, pkt_len);
> > pkt_len -= hdr_len;
> > u64_stats_update_begin(&rxq->stats_sync);
> > u64_stats_inc(&rxq->q_stats.hsplit_buf_ovf);
> > u64_stats_update_end(&rxq->stats_sync);
> > }
> > if (libeth_rx_sync_for_cpu(hdr, hdr_len)) {
> > skb = idpf_rx_build_skb(hdr, hdr_len);
> > if (!skb)
> > break;
> > u64_stats_update_begin(&rxq->stats_sync);
> > u64_stats_inc(&rxq->q_stats.hsplit_pkts);
> > u64_stats_update_end(&rxq->stats_sync);
> > }
> >
> >> @@ -3413,24 +3340,24 @@ static int idpf_rx_splitq_clean(struct idpf_rx_queue *rxq, int budget)
> >> hdr->page = NULL;
> >>
> >> payload:
> >> - if (pkt_len) {
> >> - idpf_rx_sync_for_cpu(rx_buf, pkt_len);
> >> - if (skb)
> >> - idpf_rx_add_frag(rx_buf, skb, pkt_len);
> >> - else
> >> - skb = idpf_rx_construct_skb(rxq, rx_buf,
> >> - pkt_len);
> >> - } else {
> >> - idpf_rx_put_page(rx_buf);
> >> - }
> >> + if (!libeth_rx_sync_for_cpu(rx_buf, pkt_len))
> >> + goto skip_data;
> >> +
> >> + if (skb)
> >> + idpf_rx_add_frag(rx_buf, skb, pkt_len);
> >> + else
> >> + skb = idpf_rx_build_skb(rx_buf, pkt_len);
> >>
> >> /* exit if we failed to retrieve a buffer */
> >> if (!skb)
> >> break;
> >>
> >> - idpf_rx_post_buf_refill(refillq, buf_id);
> >> +skip_data:
> >> + rx_buf->page = NULL;
> >>
> >> + idpf_rx_post_buf_refill(refillq, buf_id);
> >> IDPF_RX_BUMP_NTC(rxq, ntc);
> >> +
> >> /* skip if it is non EOP desc */
> >> if (!idpf_rx_splitq_is_eop(rx_desc))
> >> continue;
> >
> > The code following this hunk, ending at line 3372, looks like this:
> >
> > /* pad skb if needed (to make valid ethernet frame) */
> > if (eth_skb_pad(skb)) {
> > skb = NULL;
> > continue;
> > }
> > /* probably a little skewed due to removing CRC */
> > total_rx_bytes += skb->len;
> >
> > Smatch warns that:
> > .../idpf_txrx.c:3372 idpf_rx_splitq_clean() error: we previously assumed 'skb' could be null (see line 3321)
> >
> > I think, but am not sure, this is because it thinks skb might
> > be NULL at the point where "goto skip_data;" is now called above.
> >
> > Could you look into this?
>
> This is actually a good catch. skb indeed could be NULL and we needed to
> check that in the same condition where !eop is checked.
> Fixed already in my tree, so it will be fixed in v2. Thanks for catching!
>
> (BTW I fixed that in iavf when submitting the libeth series, but forgot
> to fix that here lol >_<)
> (Also, it was implicitly fixed in the later commits where I convert skb
> to xdp_buff here, so I didn't catch this one)

Thanks, much appreciated.
As I mentioned above, I wasn't sure about this one.