2023-03-29 16:45:33

by Jesper Dangaard Brouer

Subject: [PATCH bpf RFC-V2 0/5] XDP-hints: API change for RX-hash kfunc bpf_xdp_metadata_rx_hash

Note: this targets the 6.3-rc kernel via the bpf git tree.

The current API for bpf_xdp_metadata_rx_hash() (part of 6.3-rc) returns the
raw RSS hash value, but provides no information on the RSS hash type.

This patchset proposes using the return value of
bpf_xdp_metadata_rx_hash() to provide the RSS hash type.

Alternatively, we could disable bpf_xdp_metadata_rx_hash() in 6.3-rc and
have more time to nitpick the RSS hash-type bits.

---

Jesper Dangaard Brouer (5):
xdp: rss hash types representation
igc: bpf_xdp_metadata_rx_hash return xdp rss hash type
veth: bpf_xdp_metadata_rx_hash return xdp rss hash type
mlx5: bpf_xdp_metadata_rx_hash return xdp rss hash type
mlx4: bpf_xdp_metadata_rx_hash return xdp rss hash type


drivers/net/ethernet/intel/igc/igc_main.c | 22 +++++-
drivers/net/ethernet/mellanox/mlx4/en_rx.c | 20 ++++-
.../net/ethernet/mellanox/mlx5/core/en/xdp.c | 61 +++++++++++++-
drivers/net/veth.c | 7 +-
include/linux/mlx5/device.h | 14 +++-
include/net/xdp.h | 79 +++++++++++++++++++
net/core/xdp.c | 4 +-
7 files changed, 196 insertions(+), 11 deletions(-)

--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Sr. Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer


2023-03-29 16:46:56

by Jesper Dangaard Brouer

Subject: [PATCH bpf RFC-V2 3/5] veth: bpf_xdp_metadata_rx_hash return xdp rss hash type

Update API for bpf_xdp_metadata_rx_hash() by returning xdp rss hash type.

The veth driver currently only supports XDP-hints based on the SKB code path.
The SKB has lost information about the RSS hash type, by compressing
the information down to a single bitfield skb->l4_hash, which only knows
whether this was an L4 hash value.

In preparation for veth, the xdp_rss_hash_type has an L4 indication
bit that allows us to return a meaningful L4 indication when working
with SKB based packets.
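
A minimal user-space sketch of the fallback described above, assuming the
enum values from patch 1/5 (XDP_RSS_TYPE_NONE = 0, XDP_RSS_TYPE_L4_ANY = 1).
struct fake_skb and fake_veth_rx_hash() are hypothetical stand-ins for
illustration, not kernel code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values mirror patch 1/5; redefined here so the sketch builds on its own. */
enum {
	XDP_RSS_TYPE_NONE   = 0,
	XDP_RSS_TYPE_L4_ANY = 1,
};

/* Hypothetical stand-in for the two sk_buff fields that matter here. */
struct fake_skb {
	uint32_t hash;
	unsigned int l4_hash : 1; /* set when the hash covered L4 fields */
};

/* Store the hash and return the only type information the SKB kept. */
int fake_veth_rx_hash(const struct fake_skb *skb, uint32_t *hash)
{
	assert(skb != NULL && hash != NULL);
	*hash = skb->hash;
	return skb->l4_hash ? XDP_RSS_TYPE_L4_ANY : XDP_RSS_TYPE_NONE;
}
```

The caller can tell "L4 hash" from "not an L4 hash", but cannot recover
whether the L4 protocol was TCP, UDP or something else.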

Signed-off-by: Jesper Dangaard Brouer <[email protected]>
---
drivers/net/veth.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 046461ee42ea..770eee664b4c 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -1619,12 +1619,13 @@ static int veth_xdp_rx_timestamp(const struct xdp_md *ctx, u64 *timestamp)
static int veth_xdp_rx_hash(const struct xdp_md *ctx, u32 *hash)
{
struct veth_xdp_buff *_ctx = (void *)ctx;
+ struct sk_buff *skb = _ctx->skb;

- if (!_ctx->skb)
+ if (!skb)
return -ENODATA;

- *hash = skb_get_hash(_ctx->skb);
- return 0;
+ *hash = skb_get_hash(skb);
+ return skb->l4_hash ? XDP_RSS_TYPE_L4_ANY : XDP_RSS_TYPE_NONE;
}

static const struct net_device_ops veth_netdev_ops = {


2023-03-29 16:48:02

by Jesper Dangaard Brouer

Subject: [PATCH bpf RFC-V2 5/5] mlx4: bpf_xdp_metadata_rx_hash return xdp rss hash type

Update API for bpf_xdp_metadata_rx_hash() by returning xdp rss hash type
via matching individual Completion Queue Entry (CQE) status bits.
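
A rough user-space sketch of this bit-matching approach, assuming the
encoding from patch 1/5. The ST_* flags and status_to_rss_type() are
hypothetical and work in CPU byte order; the real MLX4_CQE_STATUS_* bits sit
in a big-endian CQE field and have different positions:

```c
#include <assert.h>
#include <stdint.h>

/* Encoding from patch 1/5, redefined so the sketch builds on its own. */
enum {
	XDP_RSS_L4_TCP  = 2,
	XDP_RSS_L4_UDP  = 3,
	XDP_RSS_L3_IPV4 = 1 << 4,
	XDP_RSS_L3_IPV6 = 2 << 4,
	XDP_RSS_BIT_EX  = 1 << 6,
};

/* L4 codes must stay below the L3 field so the two can be OR-ed together. */
static_assert(XDP_RSS_L4_UDP < XDP_RSS_L3_IPV4, "L4 code spills into L3 bits");

/* Hypothetical status flags (not the real MLX4_CQE_STATUS_* values). */
enum {
	ST_IPV4  = 1 << 0,
	ST_IPV4F = 1 << 1,
	ST_IPV6  = 1 << 2,
	ST_TCP   = 1 << 3,
	ST_UDP   = 1 << 4,
};

/* Build the hash type by testing each status bit on its own. */
int status_to_rss_type(uint16_t status, int has_ipv6_ext)
{
	int xht = 0;

	if (status & ST_TCP)
		xht = XDP_RSS_L4_TCP;
	if (status & ST_UDP)
		xht = XDP_RSS_L4_UDP;
	if (status & (ST_IPV4 | ST_IPV4F))
		xht |= XDP_RSS_L3_IPV4;
	if (status & ST_IPV6) {
		xht |= XDP_RSS_L3_IPV6;
		if (has_ipv6_ext)
			xht |= XDP_RSS_BIT_EX;
	}
	return xht;
}
```

Unlike the table-driven drivers, no lookup table is needed here because the
hardware already reports L3 and L4 as separate flags.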

Signed-off-by: Jesper Dangaard Brouer <[email protected]>
---
drivers/net/ethernet/mellanox/mlx4/en_rx.c | 20 ++++++++++++++++++--
1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 4b5e459b6d49..f3b7351daa05 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -684,12 +684,28 @@ int mlx4_en_xdp_rx_timestamp(const struct xdp_md *ctx, u64 *timestamp)
int mlx4_en_xdp_rx_hash(const struct xdp_md *ctx, u32 *hash)
{
struct mlx4_en_xdp_buff *_ctx = (void *)ctx;
+ struct mlx4_cqe *cqe = _ctx->cqe;
+ enum xdp_rss_hash_type xht = 0;
+ __be16 status;

if (unlikely(!(_ctx->dev->features & NETIF_F_RXHASH)))
return -ENODATA;

- *hash = be32_to_cpu(_ctx->cqe->immed_rss_invalid);
- return 0;
+ status = cqe->status;
+ if (status & cpu_to_be16(MLX4_CQE_STATUS_TCP))
+ xht = XDP_RSS_L4_TCP;
+ if (status & cpu_to_be16(MLX4_CQE_STATUS_UDP))
+ xht = XDP_RSS_L4_UDP;
+ if (status & cpu_to_be16(MLX4_CQE_STATUS_IPV4|MLX4_CQE_STATUS_IPV4F))
+ xht |= XDP_RSS_L3_IPV4;
+ if (status & cpu_to_be16(MLX4_CQE_STATUS_IPV6)) {
+ xht |= XDP_RSS_L3_IPV6;
+ if (cqe->ipv6_ext_mask)
+ xht |= XDP_RSS_BIT_EX;
+ }
+
+ *hash = be32_to_cpu(cqe->immed_rss_invalid);
+ return xht;
}

int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int budget)


2023-03-29 16:50:29

by Jesper Dangaard Brouer

Subject: [PATCH bpf RFC-V2 1/5] xdp: rss hash types representation

The RSS hash type specifies what portion of the packet data the NIC hardware
used when calculating the RSS hash value. The RSS types are focused on
Internet traffic protocols at OSI layers L3 and L4. L2 traffic (e.g. ARP)
often gets hash value zero and no RSS type. For L3 the focus is IPv4 vs.
IPv6, and for L4 primarily TCP vs. UDP, though some hardware supports SCTP.

Hardware RSS types are encoded differently by each NIC. Most hardware
represents the RSS hash type as a number. Determining L3 vs. L4 often
requires a mapping table, as there often isn't a pattern or sorting
according to OSI layer.

This patch introduces an XDP RSS hash type (xdp_rss_hash_type) that can both
be seen as a number that is ordered by OSI layer, and be bit-masked to
separate IPv4 and IPv6 types for L4 protocols. Room is available for
extending later while keeping these properties. This maps and unifies the
differing hardware-specific hash types.

This proposal changes the kfunc API bpf_xdp_metadata_rx_hash() to return
this RSS hash type on success.
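
As a worked example of the bit layout, here is a user-space sketch of how a
consumer might take a returned type apart. The masks follow the enums in this
patch (L4 in bits 3:0, L3 in bits 5:4, flag bits in 7:6); the helper names
rss_l4(), rss_l3(), rss_is_l4() and rss_has_ipv6_ext() are made up for
illustration and are not part of the patch:

```c
#include <assert.h>
#include <stdint.h>

#define RSS_L4_MASK  0x0fu /* GENMASK(3,0) */
#define RSS_L3_MASK  0x30u /* GENMASK(5,4) */
#define RSS_BIT_MASK 0xc0u /* GENMASK(7,6) */

/* The three fields must not overlap, or masking would be ambiguous. */
static_assert((RSS_L4_MASK & RSS_L3_MASK) == 0, "L4/L3 overlap");
static_assert((RSS_L3_MASK & RSS_BIT_MASK) == 0, "L3/flag overlap");

enum { L4_NONE, L4_ANY, L4_TCP, L4_UDP, L4_SCTP, L4_IPSEC };
enum { L3_IPV4 = 1, L3_IPV6 = 2 };

/* L4 protocol code stored in the low nibble. */
unsigned int rss_l4(unsigned int type)
{
	return type & RSS_L4_MASK;
}

/* L3 protocol code, shifted back down to 1 (IPv4) or 2 (IPv6). */
unsigned int rss_l3(unsigned int type)
{
	return (type & RSS_L3_MASK) >> 4;
}

/* Non-zero when the hash covered L4 fields. */
int rss_is_l4(unsigned int type)
{
	return rss_l4(type) != L4_NONE;
}

/* Non-zero when the IPv6 extension-header flag (bit 6) is set. */
int rss_has_ipv6_ext(unsigned int type)
{
	return ((type & RSS_BIT_MASK) >> 6) & 1u;
}
```

For instance, XDP_RSS_TYPE_L4_IPV6_TCP_EX works out to
(2 << 4) | 2 | (1 << 6) = 0x62, which these helpers split into L3 = IPv6,
L4 = TCP and the extension-header flag set.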

Signed-off-by: Jesper Dangaard Brouer <[email protected]>
---
include/net/xdp.h | 76 +++++++++++++++++++++++++++++++++++++++++++++++++++++
net/core/xdp.c | 4 ++-
2 files changed, 79 insertions(+), 1 deletion(-)

diff --git a/include/net/xdp.h b/include/net/xdp.h
index 5393b3ebe56e..1b2b17625c26 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -8,6 +8,7 @@

#include <linux/skbuff.h> /* skb_shared_info */
#include <uapi/linux/netdev.h>
+#include <linux/bitfield.h>

/**
* DOC: XDP RX-queue information
@@ -396,6 +397,81 @@ XDP_METADATA_KFUNC_xxx
MAX_XDP_METADATA_KFUNC,
};

+/* For partitioning of xdp_rss_hash_type */
+#define RSS_L3 GENMASK(2,0) /* 3-bits = values between 1-7 */
+#define L4_BIT BIT(3) /* 1-bit - L4 indication */
+#define RSS_L4_IPV4 GENMASK(6,4) /* 3-bits */
+#define RSS_L4_IPV6 GENMASK(9,7) /* 3-bits */
+#define RSS_L4 GENMASK(9,3) /* = 7-bits - covering L4 IPV4+IPV6 */
+#define L4_IPV6_EX_BIT BIT(9) /* 1-bit - L4 IPv6 with Extension hdr */
+ /* 11-bits in total */
+
+/* Lower 4-bits value of xdp_rss_hash_type */
+enum xdp_rss_L4 {
+ XDP_RSS_L4_MASK = GENMASK(3,0), /* 4-bits = values between 0-15 */
+ XDP_RSS_L4_NONE = 0, /* Not L4 based hash */
+ XDP_RSS_L4_ANY = 1, /* L4 based hash but protocol unknown */
+ XDP_RSS_L4_TCP = 2,
+ XDP_RSS_L4_UDP = 3,
+ XDP_RSS_L4_SCTP = 4,
+ XDP_RSS_L4_IPSEC = 5, /* L4 based hash include IPSEC SPI */
+/*
+ RFC: If we don't care about wasting space, then we could just store the
+ protocol number (8-bits) directly. See /etc/protocols
+ XDP_RSS_L4_TCP = 6,
+ XDP_RSS_L4_UDP = 17,
+ XDP_RSS_L4_SCTP = 132,
+ XDP_RSS_L4_IPSEC_ESP = 50, // Issue: mlx5 didn't say ESP or AH
+ XDP_RSS_L4_IPSEC_AH = 51, // both ESP+AH just include SPI in hash
+ */
+};
+
+/* Values shifted for use in xdp_rss_hash_type */
+enum xdp_rss_L3 {
+ XDP_RSS_L3_MASK = GENMASK(5,4), /* 2-bits = values between 1-3 */
+ XDP_RSS_L3_IPV4 = FIELD_PREP_CONST(XDP_RSS_L3_MASK, 1),
+ XDP_RSS_L3_IPV6 = FIELD_PREP_CONST(XDP_RSS_L3_MASK, 2),
+};
+
+/* Bits shifted for use in xdp_rss_hash_type */
+enum xdp_rss_bit {
+ XDP_RSS_BIT_MASK = GENMASK(7,6), /* 2-bits */
+ /* IPv6 Extension Hdr */
+ XDP_RSS_BIT_EX = FIELD_PREP_CONST(XDP_RSS_BIT_MASK, BIT(0)),
+ /* XDP_RSS_BIT_VLAN ??? = FIELD_PREP_CONST(XDP_RSS_BIT_MASK, BIT(1)), */
+};
+
+/* RSS hash type combinations used for driver HW mapping */
+enum xdp_rss_hash_type {
+ XDP_RSS_TYPE_NONE = 0,
+ XDP_RSS_TYPE_L2 = XDP_RSS_TYPE_NONE,
+
+ XDP_RSS_TYPE_L3_MASK = XDP_RSS_L3_MASK,
+ XDP_RSS_TYPE_L3_IPV4 = XDP_RSS_L3_IPV4,
+ XDP_RSS_TYPE_L3_IPV6 = XDP_RSS_L3_IPV6,
+ XDP_RSS_TYPE_L3_IPV6_EX = XDP_RSS_L3_IPV6 | XDP_RSS_BIT_EX,
+
+ XDP_RSS_TYPE_L4_MASK = XDP_RSS_L4_MASK,
+ XDP_RSS_TYPE_L4_ANY = XDP_RSS_L4_ANY,
+ XDP_RSS_TYPE_L4_IPV4_TCP = XDP_RSS_L3_IPV4 | XDP_RSS_L4_TCP,
+ XDP_RSS_TYPE_L4_IPV4_UDP = XDP_RSS_L3_IPV4 | XDP_RSS_L4_UDP,
+ XDP_RSS_TYPE_L4_IPV4_SCTP = XDP_RSS_L3_IPV4 | XDP_RSS_L4_SCTP,
+
+ XDP_RSS_TYPE_L4_IPV6_TCP = XDP_RSS_L3_IPV6 | XDP_RSS_L4_TCP,
+ XDP_RSS_TYPE_L4_IPV6_UDP = XDP_RSS_L3_IPV6 | XDP_RSS_L4_UDP,
+ XDP_RSS_TYPE_L4_IPV6_SCTP = XDP_RSS_L3_IPV6 | XDP_RSS_L4_SCTP,
+
+ XDP_RSS_TYPE_L4_IPV6_TCP_EX = XDP_RSS_TYPE_L4_IPV6_TCP |XDP_RSS_BIT_EX,
+ XDP_RSS_TYPE_L4_IPV6_UDP_EX = XDP_RSS_TYPE_L4_IPV6_UDP |XDP_RSS_BIT_EX,
+ XDP_RSS_TYPE_L4_IPV6_SCTP_EX = XDP_RSS_TYPE_L4_IPV6_SCTP|XDP_RSS_BIT_EX,
+};
+#undef RSS_L3
+#undef L4_BIT
+#undef RSS_L4_IPV4
+#undef RSS_L4_IPV6
+#undef RSS_L4
+#undef L4_IPV6_EX_BIT
+
#ifdef CONFIG_NET
u32 bpf_xdp_metadata_kfunc_id(int id);
bool bpf_dev_bound_kfunc_id(u32 btf_id);
diff --git a/net/core/xdp.c b/net/core/xdp.c
index 7133017bcd74..81d41df30695 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -721,12 +721,14 @@ __bpf_kfunc int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx, u64 *tim
* @hash: Return value pointer.
*
* Return:
- * * Returns 0 on success or ``-errno`` on error.
+ * * Returns (positive) RSS hash **type** on success or ``-errno`` on error.
+ * * ``enum xdp_rss_hash_type`` : RSS hash type
* * ``-EOPNOTSUPP`` : means device driver doesn't implement kfunc
* * ``-ENODATA`` : means no RX-hash available for this frame
*/
__bpf_kfunc int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
{
+ BTF_TYPE_EMIT(enum xdp_rss_hash_type);
return -EOPNOTSUPP;
}



2023-03-29 16:54:16

by Jesper Dangaard Brouer

Subject: [PATCH bpf RFC-V2 2/5] igc: bpf_xdp_metadata_rx_hash return xdp rss hash type

Update API for bpf_xdp_metadata_rx_hash() by returning xdp rss hash type
via mapping table.
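
A small user-space sketch of the mapping-table idea, assuming a 4-bit
hardware type field. The HW_* codes, hw_to_xdp[] and hw_rss_to_xdp() are
hypothetical and only cover a few entries; the XDP_* values follow the
encoding from patch 1/5:

```c
#include <assert.h>
#include <stdint.h>

/* Subset of the patch 1/5 encoding, redefined so the sketch builds alone. */
enum {
	XDP_NONE        = 0x00,
	XDP_L3_IPV4     = 0x10,
	XDP_L4_IPV4_TCP = 0x12,
	XDP_L4_IPV4_UDP = 0x13,
};

/* Hypothetical hardware codes; note they are not sorted by layer. */
enum {
	HW_NO_HASH  = 0,
	HW_TCP_IPV4 = 1,
	HW_IPV4     = 2,
	HW_UDP_IPV4 = 7,
};

#define HW_TYPE_MASK 0xfu

/* Sized to the full mask range; entries not listed default to XDP_NONE. */
static const uint8_t hw_to_xdp[HW_TYPE_MASK + 1] = {
	[HW_NO_HASH]  = XDP_NONE,
	[HW_TCP_IPV4] = XDP_L4_IPV4_TCP,
	[HW_IPV4]     = XDP_L3_IPV4,
	[HW_UDP_IPV4] = XDP_L4_IPV4_UDP,
};

static_assert(sizeof(hw_to_xdp) == 16, "table must cover every masked index");

/* Masking the index keeps the lookup in bounds without a range check. */
int hw_rss_to_xdp(uint32_t desc_word)
{
	return hw_to_xdp[desc_word & HW_TYPE_MASK];
}
```

Sizing the table to the mask is what lets reserved or future hardware codes
fall through to "no type" instead of indexing out of bounds.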

Signed-off-by: Jesper Dangaard Brouer <[email protected]>
---
drivers/net/ethernet/intel/igc/igc_main.c | 22 +++++++++++++++++++++-
1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index b382476f347c..a14f0597524a 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -6496,6 +6496,26 @@ static int igc_xdp_rx_timestamp(const struct xdp_md *_ctx, u64 *timestamp)
return -ENODATA;
}

+/* Mapping HW RSS Type to enum xdp_rss_hash_type */
+enum xdp_rss_hash_type igc_xdp_rss_type[IGC_RSS_TYPE_MAX_TABLE] = {
+ [IGC_RSS_TYPE_NO_HASH] = XDP_RSS_TYPE_L2,
+ [IGC_RSS_TYPE_HASH_TCP_IPV4] = XDP_RSS_TYPE_L4_IPV4_TCP,
+ [IGC_RSS_TYPE_HASH_IPV4] = XDP_RSS_TYPE_L3_IPV4,
+ [IGC_RSS_TYPE_HASH_TCP_IPV6] = XDP_RSS_TYPE_L4_IPV6_TCP,
+ [IGC_RSS_TYPE_HASH_IPV6_EX] = XDP_RSS_TYPE_L3_IPV6_EX,
+ [IGC_RSS_TYPE_HASH_IPV6] = XDP_RSS_TYPE_L3_IPV6,
+ [IGC_RSS_TYPE_HASH_TCP_IPV6_EX] = XDP_RSS_TYPE_L4_IPV6_TCP_EX,
+ [IGC_RSS_TYPE_HASH_UDP_IPV4] = XDP_RSS_TYPE_L4_IPV4_UDP,
+ [IGC_RSS_TYPE_HASH_UDP_IPV6] = XDP_RSS_TYPE_L4_IPV6_UDP,
+ [IGC_RSS_TYPE_HASH_UDP_IPV6_EX] = XDP_RSS_TYPE_L4_IPV6_UDP_EX,
+ [10] = XDP_RSS_TYPE_NONE, /* RSS Type above 9 "Reserved" by HW */
+ [11] = XDP_RSS_TYPE_NONE, /* keep array sized for SW bit-mask */
+ [12] = XDP_RSS_TYPE_NONE, /* to handle future HW revisions */
+ [13] = XDP_RSS_TYPE_NONE,
+ [14] = XDP_RSS_TYPE_NONE,
+ [15] = XDP_RSS_TYPE_NONE,
+};
+
static int igc_xdp_rx_hash(const struct xdp_md *_ctx, u32 *hash)
{
const struct igc_xdp_buff *ctx = (void *)_ctx;
@@ -6505,7 +6525,7 @@ static int igc_xdp_rx_hash(const struct xdp_md *_ctx, u32 *hash)

*hash = le32_to_cpu(ctx->rx_desc->wb.lower.hi_dword.rss);

- return 0;
+ return igc_xdp_rss_type[igc_rss_type(ctx->rx_desc)];
}

const struct xdp_metadata_ops igc_xdp_metadata_ops = {


2023-03-29 16:57:21

by Jesper Dangaard Brouer

Subject: [PATCH bpf RFC-V2 4/5] mlx5: bpf_xdp_metadata_rx_hash return xdp rss hash type

Update API for bpf_xdp_metadata_rx_hash() by returning xdp rss hash type
via mapping table.

The mlx5 hardware can also identify and RSS-hash IPSEC traffic. This
indicates that the hash includes the SPI (Security Parameters Index) as
part of the IPSEC hash.

Extend xdp core enum xdp_rss_hash_type with IPSEC hash type.
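
To illustrate how the two CQE fields collapse into one table index, here is
a user-space sketch. It assumes the masks from this patch (IP type in bits
3:2, L4 type in bits 7:6 of rss_hash_type); HTYPE_IP, HTYPE_L4 and
fake_mlx5_lookup_index() are names invented for the example:

```c
#include <assert.h>
#include <stdint.h>

#define HTYPE_IP 0x0cu /* GENMASK(3,2), same role as CQE_RSS_HTYPE_IP */
#define HTYPE_L4 0xc0u /* GENMASK(7,6), same role as CQE_RSS_HTYPE_L4 */

/* After shifting L4 down to bits 1:0 it must not collide with the IP bits. */
static_assert((HTYPE_IP & (HTYPE_L4 >> 6)) == 0, "index fields overlap");

/* Pack IP type (left in place) and L4 type (shifted down) into 4 bits. */
unsigned int fake_mlx5_lookup_index(uint8_t hash_type)
{
	unsigned int ip = hash_type & HTYPE_IP;        /* stays at bits 3:2 */
	unsigned int l4 = (hash_type & HTYPE_L4) >> 6; /* moved to bits 1:0 */

	return ip | l4;
}
```

For example, IPv4 (01) with TCP (01) gives rss_hash_type 0x44 and index 5,
while IPv6 (10) with IPSEC (11) gives 0xc8 and index 11. Any input yields an
index in 0..15, so a 16-entry table needs no range check.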

Signed-off-by: Jesper Dangaard Brouer <[email protected]>
---
drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c | 61 +++++++++++++++++++++-
include/linux/mlx5/device.h | 14 ++++-
include/net/xdp.h | 3 +
3 files changed, 74 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
index c5dae48b7932..d3dfe11f4d50 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
@@ -34,6 +34,7 @@
#include <net/xdp_sock_drv.h>
#include "en/xdp.h"
#include "en/params.h"
+#include <linux/bitfield.h>

int mlx5e_xdp_max_mtu(struct mlx5e_params *params, struct mlx5e_xsk_param *xsk)
{
@@ -169,15 +170,71 @@ static int mlx5e_xdp_rx_timestamp(const struct xdp_md *ctx, u64 *timestamp)
return 0;
}

+/* Mapping HW RSS Type bits CQE_RSS_HTYPE_IP + CQE_RSS_HTYPE_L4 into 4-bits */
+#define RSS_TYPE_MAX_TABLE 16 /* 4-bits max 16 entries */
+#define RSS_L4 GENMASK(1,0)
+#define RSS_L3 GENMASK(3,2) /* Same as CQE_RSS_HTYPE_IP */
+
+/* Valid combinations of CQE_RSS_HTYPE_IP + CQE_RSS_HTYPE_L4 sorted numerically */
+enum mlx5_rss_hash_type {
+ RSS_TYPE_NO_HASH = (FIELD_PREP_CONST(RSS_L3, CQE_RSS_IP_NONE)| \
+ FIELD_PREP_CONST(RSS_L4, CQE_RSS_L4_NONE)),
+ RSS_TYPE_L3_IPV4 = (FIELD_PREP_CONST(RSS_L3, CQE_RSS_IPV4)| \
+ FIELD_PREP_CONST(RSS_L4, CQE_RSS_L4_NONE)),
+ RSS_TYPE_L4_IPV4_TCP = (FIELD_PREP_CONST(RSS_L3, CQE_RSS_IPV4)| \
+ FIELD_PREP_CONST(RSS_L4, CQE_RSS_L4_TCP)),
+ RSS_TYPE_L4_IPV4_UDP = (FIELD_PREP_CONST(RSS_L3, CQE_RSS_IPV4)| \
+ FIELD_PREP_CONST(RSS_L4, CQE_RSS_L4_UDP)),
+ RSS_TYPE_L4_IPV4_IPSEC = (FIELD_PREP_CONST(RSS_L3, CQE_RSS_IPV4)| \
+ FIELD_PREP_CONST(RSS_L4, CQE_RSS_L4_IPSEC)),
+ RSS_TYPE_L3_IPV6 = (FIELD_PREP_CONST(RSS_L3, CQE_RSS_IPV6)| \
+ FIELD_PREP_CONST(RSS_L4, CQE_RSS_L4_NONE)),
+ RSS_TYPE_L4_IPV6_TCP = (FIELD_PREP_CONST(RSS_L3, CQE_RSS_IPV6)| \
+ FIELD_PREP_CONST(RSS_L4, CQE_RSS_L4_TCP)),
+ RSS_TYPE_L4_IPV6_UDP = (FIELD_PREP_CONST(RSS_L3, CQE_RSS_IPV6)| \
+ FIELD_PREP_CONST(RSS_L4, CQE_RSS_L4_UDP)),
+ RSS_TYPE_L4_IPV6_IPSEC = (FIELD_PREP_CONST(RSS_L3, CQE_RSS_IPV6)| \
+ FIELD_PREP_CONST(RSS_L4, CQE_RSS_L4_IPSEC)),
+};
+
+/* Invalid combinations simply return zero, which avoids boundary checks */
+static const enum xdp_rss_hash_type mlx5_xdp_rss_type[RSS_TYPE_MAX_TABLE] = {
+ [RSS_TYPE_NO_HASH] = XDP_RSS_TYPE_NONE,
+ [1] = XDP_RSS_TYPE_NONE, /* Implicit zero */
+ [2] = XDP_RSS_TYPE_NONE, /* Implicit zero */
+ [3] = XDP_RSS_TYPE_NONE, /* Implicit zero */
+ [RSS_TYPE_L3_IPV4] = XDP_RSS_TYPE_L3_IPV4,
+ [RSS_TYPE_L4_IPV4_TCP] = XDP_RSS_TYPE_L4_IPV4_TCP,
+ [RSS_TYPE_L4_IPV4_UDP] = XDP_RSS_TYPE_L4_IPV4_UDP,
+ [RSS_TYPE_L4_IPV4_IPSEC]= XDP_RSS_TYPE_L4_IPV4_IPSEC,
+ [RSS_TYPE_L3_IPV6] = XDP_RSS_TYPE_L3_IPV6,
+ [RSS_TYPE_L4_IPV6_TCP] = XDP_RSS_TYPE_L4_IPV6_TCP,
+ [RSS_TYPE_L4_IPV6_UDP] = XDP_RSS_TYPE_L4_IPV6_UDP,
+ [RSS_TYPE_L4_IPV6_IPSEC]= XDP_RSS_TYPE_L4_IPV6_IPSEC,
+ [12] = XDP_RSS_TYPE_NONE, /* Implicit zero */
+ [13] = XDP_RSS_TYPE_NONE, /* Implicit zero */
+ [14] = XDP_RSS_TYPE_NONE, /* Implicit zero */
+ [15] = XDP_RSS_TYPE_NONE, /* Implicit zero */
+};
+
static int mlx5e_xdp_rx_hash(const struct xdp_md *ctx, u32 *hash)
{
const struct mlx5e_xdp_buff *_ctx = (void *)ctx;
+ const struct mlx5_cqe64 *cqe = _ctx->cqe;
+ u32 hash_type, l4_type, ip_type, lookup;

if (unlikely(!(_ctx->xdp.rxq->dev->features & NETIF_F_RXHASH)))
return -ENODATA;

- *hash = be32_to_cpu(_ctx->cqe->rss_hash_result);
- return 0;
+ *hash = be32_to_cpu(cqe->rss_hash_result);
+
+ hash_type = cqe->rss_hash_type;
+ BUILD_BUG_ON(CQE_RSS_HTYPE_IP != RSS_L3); /* same mask */
+ ip_type = hash_type & CQE_RSS_HTYPE_IP;
+ l4_type = FIELD_GET(CQE_RSS_HTYPE_L4, hash_type);
+ lookup = ip_type | l4_type;
+
+ return mlx5_xdp_rss_type[lookup];
}

const struct xdp_metadata_ops mlx5e_xdp_metadata_ops = {
diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h
index 71b06ebad402..27aa9ae10996 100644
--- a/include/linux/mlx5/device.h
+++ b/include/linux/mlx5/device.h
@@ -36,6 +36,7 @@
#include <linux/types.h>
#include <rdma/ib_verbs.h>
#include <linux/mlx5/mlx5_ifc.h>
+#include <linux/bitfield.h>

#if defined(__LITTLE_ENDIAN)
#define MLX5_SET_HOST_ENDIANNESS 0
@@ -980,14 +981,23 @@ enum {
};

enum {
- CQE_RSS_HTYPE_IP = 0x3 << 2,
+ CQE_RSS_HTYPE_IP = GENMASK(3,2),
/* cqe->rss_hash_type[3:2] - IP destination selected for hash
* (00 = none, 01 = IPv4, 10 = IPv6, 11 = Reserved)
*/
- CQE_RSS_HTYPE_L4 = 0x3 << 6,
+ CQE_RSS_IP_NONE = 0x0,
+ CQE_RSS_IPV4 = 0x1,
+ CQE_RSS_IPV6 = 0x2,
+ CQE_RSS_RESERVED = 0x3,
+
+ CQE_RSS_HTYPE_L4 = GENMASK(7,6),
/* cqe->rss_hash_type[7:6] - L4 destination selected for hash
* (00 = none, 01 = TCP. 10 = UDP, 11 = IPSEC.SPI
*/
+ CQE_RSS_L4_NONE = 0x0,
+ CQE_RSS_L4_TCP = 0x1,
+ CQE_RSS_L4_UDP = 0x2,
+ CQE_RSS_L4_IPSEC = 0x3,
};

enum {
diff --git a/include/net/xdp.h b/include/net/xdp.h
index 1b2b17625c26..b9837cb378b2 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -456,14 +456,17 @@ enum xdp_rss_hash_type {
XDP_RSS_TYPE_L4_IPV4_TCP = XDP_RSS_L3_IPV4 | XDP_RSS_L4_TCP,
XDP_RSS_TYPE_L4_IPV4_UDP = XDP_RSS_L3_IPV4 | XDP_RSS_L4_UDP,
XDP_RSS_TYPE_L4_IPV4_SCTP = XDP_RSS_L3_IPV4 | XDP_RSS_L4_SCTP,
+ XDP_RSS_TYPE_L4_IPV4_IPSEC = XDP_RSS_L3_IPV4 | XDP_RSS_L4_IPSEC,

XDP_RSS_TYPE_L4_IPV6_TCP = XDP_RSS_L3_IPV6 | XDP_RSS_L4_TCP,
XDP_RSS_TYPE_L4_IPV6_UDP = XDP_RSS_L3_IPV6 | XDP_RSS_L4_UDP,
XDP_RSS_TYPE_L4_IPV6_SCTP = XDP_RSS_L3_IPV6 | XDP_RSS_L4_SCTP,
+ XDP_RSS_TYPE_L4_IPV6_IPSEC = XDP_RSS_L3_IPV6 | XDP_RSS_L4_IPSEC,

XDP_RSS_TYPE_L4_IPV6_TCP_EX = XDP_RSS_TYPE_L4_IPV6_TCP |XDP_RSS_BIT_EX,
XDP_RSS_TYPE_L4_IPV6_UDP_EX = XDP_RSS_TYPE_L4_IPV6_UDP |XDP_RSS_BIT_EX,
XDP_RSS_TYPE_L4_IPV6_SCTP_EX = XDP_RSS_TYPE_L4_IPV6_SCTP|XDP_RSS_BIT_EX,
+ XDP_RSS_TYPE_L4_IPV6_IPSEC_EX= XDP_RSS_TYPE_L4_IPV6_IPSEC|XDP_RSS_BIT_EX,
};
#undef RSS_L3
#undef L4_BIT


2023-03-29 18:04:00

by Jesper Dangaard Brouer

Subject: Re: [PATCH bpf RFC-V2 1/5] xdp: rss hash types representation



On 29/03/2023 18.29, Jesper Dangaard Brouer wrote:
> The RSS hash type specifies what portion of packet data NIC hardware used
> when calculating RSS hash value. The RSS types are focused on Internet
> traffic protocols at OSI layers L3 and L4. L2 (e.g. ARP) often get hash
> value zero and no RSS type. For L3 focused on IPv4 vs. IPv6, and L4
> primarily TCP vs UDP, but some hardware supports SCTP.
>
> Hardware RSS types are differently encoded for each hardware NIC. Most
> hardware represent RSS hash type as a number. Determining L3 vs L4 often
> requires a mapping table as there often isn't a pattern or sorting
> according to ISO layer.
>
> The patch introduce a XDP RSS hash type (xdp_rss_hash_type) that can both
> be seen as a number that is ordered according by ISO layer, and can be bit
> masked to separate IPv4 and IPv6 types for L4 protocols. Room is available
> for extending later while keeping these properties. This maps and unifies
> difference to hardware specific hashes.
>
> This proposal change the kfunc API bpf_xdp_metadata_rx_hash() to return
> this RSS hash type on success.
>
> Signed-off-by: Jesper Dangaard Brouer <[email protected]>
> ---
> include/net/xdp.h | 76 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> net/core/xdp.c | 4 ++-
> 2 files changed, 79 insertions(+), 1 deletion(-)
>
> diff --git a/include/net/xdp.h b/include/net/xdp.h
> index 5393b3ebe56e..1b2b17625c26 100644
> --- a/include/net/xdp.h
> +++ b/include/net/xdp.h
> @@ -8,6 +8,7 @@
>
> #include <linux/skbuff.h> /* skb_shared_info */
> #include <uapi/linux/netdev.h>
> +#include <linux/bitfield.h>
>
> /**
> * DOC: XDP RX-queue information
> @@ -396,6 +397,81 @@ XDP_METADATA_KFUNC_xxx
> MAX_XDP_METADATA_KFUNC,
> };
>
> +/* For partitioning of xdp_rss_hash_type */
> +#define RSS_L3 GENMASK(2,0) /* 3-bits = values between 1-7 */
> +#define L4_BIT BIT(3) /* 1-bit - L4 indication */
> +#define RSS_L4_IPV4 GENMASK(6,4) /* 3-bits */
> +#define RSS_L4_IPV6 GENMASK(9,7) /* 3-bits */
> +#define RSS_L4 GENMASK(9,3) /* = 7-bits - covering L4 IPV4+IPV6 */
> +#define L4_IPV6_EX_BIT BIT(9) /* 1-bit - L4 IPv6 with Extension hdr */
> + /* 11-bits in total */

Please ignore the above lines in review; they should have been deleted.
The new partitioning uses the enums/defines below.

> +
> +/* Lower 4-bits value of xdp_rss_hash_type */
> +enum xdp_rss_L4 {
> + XDP_RSS_L4_MASK = GENMASK(3,0), /* 4-bits = values between 0-15 */
> + XDP_RSS_L4_NONE = 0, /* Not L4 based hash */
> + XDP_RSS_L4_ANY = 1, /* L4 based hash but protocol unknown */
> + XDP_RSS_L4_TCP = 2,
> + XDP_RSS_L4_UDP = 3,
> + XDP_RSS_L4_SCTP = 4,
> + XDP_RSS_L4_IPSEC = 5, /* L4 based hash include IPSEC SPI */
> +/*
> + RFC: We don't care about vasting space, then we could just store the
> + protocol number (8-bits) directly. See /etc/protocols
> + XDP_RSS_L4_TCP = 6,
> + XDP_RSS_L4_UDP = 17,
> + XDP_RSS_L4_SCTP = 132,
> + XDP_RSS_L4_IPSEC_ESP = 50, // Issue: mlx5 didn't say ESP or AH
> + XDP_RSS_L4_IPSEC_AH = 51, // both ESP+AH just include SPI in hash
> + */
> +};
> +
> +/* Values shifted for use in xdp_rss_hash_type */
> +enum xdp_rss_L3 {
> + XDP_RSS_L3_MASK = GENMASK(5,4), /* 2-bits = values between 1-3 */
> + XDP_RSS_L3_IPV4 = FIELD_PREP_CONST(XDP_RSS_L3_MASK, 1),
> + XDP_RSS_L3_IPV6 = FIELD_PREP_CONST(XDP_RSS_L3_MASK, 2),
> +};
> +
> +/* Bits shifted for use in xdp_rss_hash_type */
> +enum xdp_rss_bit {
> + XDP_RSS_BIT_MASK = GENMASK(7,6), /* 2-bits */
> + /* IPv6 Extension Hdr */
> + XDP_RSS_BIT_EX = FIELD_PREP_CONST(XDP_RSS_BIT_MASK, BIT(0)),
> + /* XDP_RSS_BIT_VLAN ??? = FIELD_PREP_CONST(XDP_RSS_BIT_MASK, BIT(1)), */
> +};
> +
> +/* RSS hash type combinations used for driver HW mapping */
> +enum xdp_rss_hash_type {
> + XDP_RSS_TYPE_NONE = 0,
> + XDP_RSS_TYPE_L2 = XDP_RSS_TYPE_NONE,
> +
> + XDP_RSS_TYPE_L3_MASK = XDP_RSS_L3_MASK,
> + XDP_RSS_TYPE_L3_IPV4 = XDP_RSS_L3_IPV4,
> + XDP_RSS_TYPE_L3_IPV6 = XDP_RSS_L3_IPV6,
> + XDP_RSS_TYPE_L3_IPV6_EX = XDP_RSS_L3_IPV6 | XDP_RSS_BIT_EX,
> +
> + XDP_RSS_TYPE_L4_MASK = XDP_RSS_L4_MASK,
> + XDP_RSS_TYPE_L4_ANY = XDP_RSS_L4_ANY,
> + XDP_RSS_TYPE_L4_IPV4_TCP = XDP_RSS_L3_IPV4 | XDP_RSS_L4_TCP,
> + XDP_RSS_TYPE_L4_IPV4_UDP = XDP_RSS_L3_IPV4 | XDP_RSS_L4_UDP,
> + XDP_RSS_TYPE_L4_IPV4_SCTP = XDP_RSS_L3_IPV4 | XDP_RSS_L4_SCTP,
> +
> + XDP_RSS_TYPE_L4_IPV6_TCP = XDP_RSS_L3_IPV6 | XDP_RSS_L4_TCP,
> + XDP_RSS_TYPE_L4_IPV6_UDP = XDP_RSS_L3_IPV6 | XDP_RSS_L4_UDP,
> + XDP_RSS_TYPE_L4_IPV6_SCTP = XDP_RSS_L3_IPV6 | XDP_RSS_L4_UDP,
> +
> + XDP_RSS_TYPE_L4_IPV6_TCP_EX = XDP_RSS_TYPE_L4_IPV6_TCP |XDP_RSS_BIT_EX,
> + XDP_RSS_TYPE_L4_IPV6_UDP_EX = XDP_RSS_TYPE_L4_IPV6_UDP |XDP_RSS_BIT_EX,
> + XDP_RSS_TYPE_L4_IPV6_SCTP_EX = XDP_RSS_TYPE_L4_IPV6_SCTP|XDP_RSS_BIT_EX,
> +};
> +#undef RSS_L3
> +#undef L4_BIT
> +#undef RSS_L4_IPV4
> +#undef RSS_L4_IPV6
> +#undef RSS_L4
> +#undef L4_IPV6_EX_BIT

All the #undefs are also unnecessary now.

> +
> #ifdef CONFIG_NET
> u32 bpf_xdp_metadata_kfunc_id(int id);
> bool bpf_dev_bound_kfunc_id(u32 btf_id);
> diff --git a/net/core/xdp.c b/net/core/xdp.c
> index 7133017bcd74..81d41df30695 100644
> --- a/net/core/xdp.c
> +++ b/net/core/xdp.c
> @@ -721,12 +721,14 @@ __bpf_kfunc int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx, u64 *tim
> * @hash: Return value pointer.
> *
> * Return:
> - * * Returns 0 on success or ``-errno`` on error.
> + * * Returns (positive) RSS hash **type** on success or ``-errno`` on error.
> + * * ``enum xdp_rss_hash_type`` : RSS hash type
> * * ``-EOPNOTSUPP`` : means device driver doesn't implement kfunc
> * * ``-ENODATA`` : means no RX-hash available for this frame
> */
> __bpf_kfunc int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
> {
> + BTF_TYPE_EMIT(enum xdp_rss_hash_type);
> return -EOPNOTSUPP;
> }
>
>
>

2023-03-29 21:50:09

by Toke Høiland-Jørgensen

Subject: Re: [xdp-hints] [PATCH bpf RFC-V2 1/5] xdp: rss hash types representation

Jesper Dangaard Brouer <[email protected]> writes:

> diff --git a/net/core/xdp.c b/net/core/xdp.c
> index 7133017bcd74..81d41df30695 100644
> --- a/net/core/xdp.c
> +++ b/net/core/xdp.c
> @@ -721,12 +721,14 @@ __bpf_kfunc int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx, u64 *tim
> * @hash: Return value pointer.
> *
> * Return:
> - * * Returns 0 on success or ``-errno`` on error.
> + * * Returns (positive) RSS hash **type** on success or ``-errno`` on error.

This change is going to break any BPF program that does:

if (!bpf_xdp_metadata_rx_hash(ctx, &hash))
/* do something with hash */


so I think adding a second argument is better; that way, at least
breakage will be explicit instead of being a hidden change in semantics
(and the CO-RE style checking for kfuncs Alexei introduced should
trigger correctly).
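
To make the concern concrete, here is a user-space sketch with mocked
functions. rx_hash_current(), rx_hash_rfc_v2(), rx_hash_extra_arg() and
caller_uses_hash() are all hypothetical stand-ins, and the extra-argument
signature is only one possible shape, not something the series defines:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAKE_TYPE_L4_IPV4_TCP 0x12 /* XDP_RSS_L3_IPV4 | XDP_RSS_L4_TCP */

/* 6.3-rc behaviour: 0 on success. */
int rx_hash_current(uint32_t *hash)
{
	assert(hash != NULL);
	*hash = 0x1234abcd;
	return 0;
}

/* RFC-V2 behaviour: positive hash type on success. */
int rx_hash_rfc_v2(uint32_t *hash)
{
	assert(hash != NULL);
	*hash = 0x1234abcd;
	return FAKE_TYPE_L4_IPV4_TCP;
}

/* Alternative: still 0 on success, type reported via an out-parameter. */
int rx_hash_extra_arg(uint32_t *hash, int *rss_type)
{
	assert(hash != NULL && rss_type != NULL);
	*hash = 0x1234abcd;
	*rss_type = FAKE_TYPE_L4_IPV4_TCP;
	return 0;
}

/* The caller pattern from above: only a return value of 0 counts as success. */
int caller_uses_hash(int ret)
{
	return !ret;
}
```

With the RFC-V2 semantics the existing `!ret` check silently skips a valid
hash, whereas a changed signature would make old programs fail to build or
load rather than misbehave at run time.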

But really, what we should do anyway is merge this during the -rc phase
to minimise any breakage :)

-Toke