2023-12-01 06:24:57

by Song, Yoong Siang

[permalink] [raw]
Subject: [PATCH bpf-next v2 0/3] xsk: TX metadata txtime support

This series expands XDP TX metadata framework to include ETF HW offload.

Changes since v1:
- rename Time-Based Scheduling (TBS) to Earliest TxTime First (ETF)
- rename launch-time to txtime

v1: https://patchwork.kernel.org/project/netdevbpf/cover/[email protected]/

Song Yoong Siang (3):
xsk: add ETF support to XDP Tx metadata
net: stmmac: Add txtime support to XDP ZC
selftests/bpf: Add txtime to xdp_hw_metadata

Documentation/netlink/specs/netdev.yaml | 4 ++++
Documentation/networking/xsk-tx-metadata.rst | 5 +++++
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 ++
.../net/ethernet/stmicro/stmmac/stmmac_main.c | 13 +++++++++++++
include/net/xdp_sock.h | 9 +++++++++
include/net/xdp_sock_drv.h | 1 +
include/uapi/linux/if_xdp.h | 9 +++++++++
include/uapi/linux/netdev.h | 3 +++
net/core/netdev-genl.c | 2 ++
net/xdp/xsk.c | 3 +++
tools/include/uapi/linux/if_xdp.h | 9 +++++++++
tools/include/uapi/linux/netdev.h | 3 +++
tools/net/ynl/generated/netdev-user.c | 1 +
tools/testing/selftests/bpf/xdp_hw_metadata.c | 18 +++++++++++++++++-
14 files changed, 81 insertions(+), 1 deletion(-)

--
2.34.1


2023-12-01 06:25:25

by Song, Yoong Siang

[permalink] [raw]
Subject: [PATCH bpf-next v2 1/3] xsk: add ETF support to XDP Tx metadata

This patch extends the XDP Tx metadata framework to include Earliest
TxTime First (ETF) HW offload support where the NIC will schedule the
packet for transmission at a pre-determined time called txtime. The value
of txtime is communicated from user space to Ethernet driver via txtime
field of struct xsk_tx_metadata.

Suggested-by: Stanislav Fomichev <[email protected]>
Signed-off-by: Song Yoong Siang <[email protected]>
---
Documentation/netlink/specs/netdev.yaml | 4 ++++
Documentation/networking/xsk-tx-metadata.rst | 5 +++++
include/net/xdp_sock.h | 9 +++++++++
include/net/xdp_sock_drv.h | 1 +
include/uapi/linux/if_xdp.h | 9 +++++++++
include/uapi/linux/netdev.h | 3 +++
net/core/netdev-genl.c | 2 ++
net/xdp/xsk.c | 3 +++
tools/include/uapi/linux/if_xdp.h | 9 +++++++++
tools/include/uapi/linux/netdev.h | 3 +++
tools/net/ynl/generated/netdev-user.c | 1 +
11 files changed, 49 insertions(+)

diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
index 00439bcbd2e3..339cdcddbfc0 100644
--- a/Documentation/netlink/specs/netdev.yaml
+++ b/Documentation/netlink/specs/netdev.yaml
@@ -66,6 +66,10 @@ definitions:
name: tx-checksum
doc:
L3 checksum HW offload is supported by the driver.
+ -
+ name: txtime
+ doc:
+ Earliest TxTime First (ETF) HW offload is supported by the driver.

attribute-sets:
-
diff --git a/Documentation/networking/xsk-tx-metadata.rst b/Documentation/networking/xsk-tx-metadata.rst
index 97ecfa480d00..654ff692062a 100644
--- a/Documentation/networking/xsk-tx-metadata.rst
+++ b/Documentation/networking/xsk-tx-metadata.rst
@@ -44,6 +44,10 @@ The flags field enables the particular offload:
checksum. ``csum_start`` specifies byte offset of where the checksumming
should start and ``csum_offset`` specifies byte offset where the
device should store the computed checksum.
+- ``XDP_TXMD_FLAGS_TXTIME``: requests Earliest TxTime First (ETF) HW
+ offload to launch the packet at a pre-determined time. ``txtime``
+ indicates the time which the NIC should schedule the packet for
+ transmission.

Besides the flags above, in order to trigger the offloads, the first
packet's ``struct xdp_desc`` descriptor should set ``XDP_TX_METADATA``
@@ -68,6 +72,7 @@ Refer to ``xsk-flags`` features bitmask in

- ``tx-timestamp``: device supports ``XDP_TXMD_FLAGS_TIMESTAMP``
- ``tx-checksum``: device supports ``XDP_TXMD_FLAGS_CHECKSUM``
+- ``txtime``: device supports ``XDP_TXMD_FLAGS_TXTIME``

See ``tools/net/ynl/samples/netdev.c`` on how to query this information.

diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h
index 3cb4dc9bd70e..0651b5264b72 100644
--- a/include/net/xdp_sock.h
+++ b/include/net/xdp_sock.h
@@ -110,11 +110,16 @@ struct xdp_sock {
* indicates position where checksumming should start.
* csum_offset indicates position where checksum should be stored.
*
+ * void (*tmo_request_txtime)(u64 txtime, void *priv)
+ * Called when AF_XDP frame requested Earliest TxTime First (ETF) HW offload
+ * support. txtime indicates the time which the NIC should schedule the
+ * packet for transmission.
*/
struct xsk_tx_metadata_ops {
void (*tmo_request_timestamp)(void *priv);
u64 (*tmo_fill_timestamp)(void *priv);
void (*tmo_request_checksum)(u16 csum_start, u16 csum_offset, void *priv);
+ void (*tmo_request_txtime)(u64 txtime, void *priv);
};

#ifdef CONFIG_XDP_SOCKETS
@@ -170,6 +175,10 @@ static inline void xsk_tx_metadata_request(const struct xsk_tx_metadata *meta,
if (meta->flags & XDP_TXMD_FLAGS_CHECKSUM)
ops->tmo_request_checksum(meta->request.csum_start,
meta->request.csum_offset, priv);
+
+ if (ops->tmo_request_txtime)
+ if (meta->flags & XDP_TXMD_FLAGS_TXTIME)
+ ops->tmo_request_txtime(meta->request.txtime, priv);
}

/**
diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h
index 81e02de3f453..e66d597e7079 100644
--- a/include/net/xdp_sock_drv.h
+++ b/include/net/xdp_sock_drv.h
@@ -168,6 +168,7 @@ static inline void *xsk_buff_raw_get_data(struct xsk_buff_pool *pool, u64 addr)
#define XDP_TXMD_FLAGS_VALID ( \
XDP_TXMD_FLAGS_TIMESTAMP | \
XDP_TXMD_FLAGS_CHECKSUM | \
+ XDP_TXMD_FLAGS_TXTIME | \
0)

static inline bool xsk_buff_valid_tx_metadata(struct xsk_tx_metadata *meta)
diff --git a/include/uapi/linux/if_xdp.h b/include/uapi/linux/if_xdp.h
index d31698410410..24d123bce7f3 100644
--- a/include/uapi/linux/if_xdp.h
+++ b/include/uapi/linux/if_xdp.h
@@ -123,6 +123,12 @@ struct xdp_options {
*/
#define XDP_TXMD_FLAGS_CHECKSUM (1 << 1)

+/* Request Earliest TxTime First (ETF) HW offload to launch the packet at a
+ * pre-determined time. The time which the NIC should schedule the packet for
+ * transmission is communicated via txtime field of struct xsk_tx_metadata.
+ */
+#define XDP_TXMD_FLAGS_TXTIME (1 << 2)
+
/* AF_XDP offloads request. 'request' union member is consumed by the driver
* when the packet is being transmitted. 'completion' union member is
* filled by the driver when the transmit completion arrives.
@@ -138,6 +144,9 @@ struct xsk_tx_metadata {
__u16 csum_start;
/* Offset from csum_start where checksum should be stored. */
__u16 csum_offset;
+
+ /* XDP_TXMD_FLAGS_TXTIME */
+ __u64 txtime;
} request;

struct {
diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h
index 48d5477a668c..03b913757e1c 100644
--- a/include/uapi/linux/netdev.h
+++ b/include/uapi/linux/netdev.h
@@ -59,10 +59,13 @@ enum netdev_xdp_rx_metadata {
* by the driver.
* @NETDEV_XSK_FLAGS_TX_CHECKSUM: L3 checksum HW offload is supported by the
* driver.
+ * @NETDEV_XSK_FLAGS_TXTIME: Earliest TxTime First (ETF) HW offload is supported
+ * by the driver.
*/
enum netdev_xsk_flags {
NETDEV_XSK_FLAGS_TX_TIMESTAMP = 1,
NETDEV_XSK_FLAGS_TX_CHECKSUM = 2,
+ NETDEV_XSK_FLAGS_TXTIME = 3,

/* private: */
NETDEV_XSK_FLAGS_MASK = 3,
diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c
index 10f2124e9e23..aecc6e26f839 100644
--- a/net/core/netdev-genl.c
+++ b/net/core/netdev-genl.c
@@ -33,6 +33,8 @@ XDP_METADATA_KFUNC_xxx
xsk_features |= NETDEV_XSK_FLAGS_TX_TIMESTAMP;
if (netdev->xsk_tx_metadata_ops->tmo_request_checksum)
xsk_features |= NETDEV_XSK_FLAGS_TX_CHECKSUM;
+ if (netdev->xsk_tx_metadata_ops->tmo_request_txtime)
+ xsk_features |= NETDEV_XSK_FLAGS_TXTIME;
}

if (nla_put_u32(rsp, NETDEV_A_DEV_IFINDEX, netdev->ifindex) ||
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 281d49b4fca4..4d48fc6caf7c 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -751,6 +751,9 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
goto free_err;
}
}
+
+ if (meta->flags & XDP_TXMD_FLAGS_TXTIME)
+ skb->skb_mstamp_ns = meta->request.txtime;
}
}

diff --git a/tools/include/uapi/linux/if_xdp.h b/tools/include/uapi/linux/if_xdp.h
index 638c606dfa74..abcf79e8a879 100644
--- a/tools/include/uapi/linux/if_xdp.h
+++ b/tools/include/uapi/linux/if_xdp.h
@@ -123,6 +123,12 @@ struct xdp_options {
*/
#define XDP_TXMD_FLAGS_CHECKSUM (1 << 1)

+/* Request Earliest TxTime First (ETF) HW offload to launch the packet at a
+ * pre-determined time. The time which the NIC should schedule the packet for
+ * transmission is communicated via txtime field of struct xsk_tx_metadata.
+ */
+#define XDP_TXMD_FLAGS_TXTIME (1 << 2)
+
/* AF_XDP offloads request. 'request' union member is consumed by the driver
* when the packet is being transmitted. 'completion' union member is
* filled by the driver when the transmit completion arrives.
@@ -138,6 +144,9 @@ struct xsk_tx_metadata {
__u16 csum_start;
/* Offset from csum_start where checksum should be stored. */
__u16 csum_offset;
+
+ /* XDP_TXMD_FLAGS_TXTIME */
+ __u64 txtime;
} request;

struct {
diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h
index 48d5477a668c..03b913757e1c 100644
--- a/tools/include/uapi/linux/netdev.h
+++ b/tools/include/uapi/linux/netdev.h
@@ -59,10 +59,13 @@ enum netdev_xdp_rx_metadata {
* by the driver.
* @NETDEV_XSK_FLAGS_TX_CHECKSUM: L3 checksum HW offload is supported by the
* driver.
+ * @NETDEV_XSK_FLAGS_TXTIME: Earliest TxTime First (ETF) HW offload is supported
+ * by the driver.
*/
enum netdev_xsk_flags {
NETDEV_XSK_FLAGS_TX_TIMESTAMP = 1,
NETDEV_XSK_FLAGS_TX_CHECKSUM = 2,
+ NETDEV_XSK_FLAGS_TXTIME = 3,

/* private: */
NETDEV_XSK_FLAGS_MASK = 3,
diff --git a/tools/net/ynl/generated/netdev-user.c b/tools/net/ynl/generated/netdev-user.c
index 6283d87dad37..02f1bd4cd97b 100644
--- a/tools/net/ynl/generated/netdev-user.c
+++ b/tools/net/ynl/generated/netdev-user.c
@@ -61,6 +61,7 @@ const char *netdev_xdp_rx_metadata_str(enum netdev_xdp_rx_metadata value)
static const char * const netdev_xsk_flags_strmap[] = {
[0] = "tx-timestamp",
[1] = "tx-checksum",
+ [2] = "txtime"
};

const char *netdev_xsk_flags_str(enum netdev_xsk_flags value)
--
2.34.1

2023-12-01 06:25:36

by Song, Yoong Siang

[permalink] [raw]
Subject: [PATCH bpf-next v2 2/3] net: stmmac: Add txtime support to XDP ZC

This patch enables txtime support to XDP zero copy via XDP Tx
metadata framework.

Signed-off-by: Song Yoong Siang <[email protected]>
---
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 ++
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 13 +++++++++++++
2 files changed, 15 insertions(+)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index 686c94c2e8a7..e8538af6e207 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -105,6 +105,8 @@ struct stmmac_metadata_request {
struct stmmac_priv *priv;
struct dma_desc *tx_desc;
bool *set_ic;
+ struct dma_edesc *edesc;
+ int tbs;
};

struct stmmac_xsk_tx_complete {
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index c2ac88aaffed..c7b9338be9e0 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2465,9 +2465,20 @@ static u64 stmmac_xsk_fill_timestamp(void *_priv)
return 0;
}

+static void stmmac_xsk_request_txtime(u64 txtime, void *_priv)
+{
+ struct stmmac_metadata_request *meta_req = _priv;
+ struct timespec64 ts = ns_to_timespec64(txtime);
+
+ if (meta_req->tbs & STMMAC_TBS_EN)
+ stmmac_set_desc_tbs(meta_req->priv, meta_req->edesc, ts.tv_sec,
+ ts.tv_nsec);
+}
+
static const struct xsk_tx_metadata_ops stmmac_xsk_tx_metadata_ops = {
.tmo_request_timestamp = stmmac_xsk_request_timestamp,
.tmo_fill_timestamp = stmmac_xsk_fill_timestamp,
+ .tmo_request_txtime = stmmac_xsk_request_txtime,
};

static bool stmmac_xdp_xmit_zc(struct stmmac_priv *priv, u32 queue, u32 budget)
@@ -2545,6 +2556,8 @@ static bool stmmac_xdp_xmit_zc(struct stmmac_priv *priv, u32 queue, u32 budget)
meta_req.priv = priv;
meta_req.tx_desc = tx_desc;
meta_req.set_ic = &set_ic;
+ meta_req.tbs = tx_q->tbs;
+ meta_req.edesc = &tx_q->dma_entx[entry];
xsk_tx_metadata_request(meta, &stmmac_xsk_tx_metadata_ops,
&meta_req);
if (set_ic) {
--
2.34.1

2023-12-01 06:25:51

by Song, Yoong Siang

[permalink] [raw]
Subject: [PATCH bpf-next v2 3/3] selftests/bpf: Add txtime to xdp_hw_metadata

This patch adds txtime support to xdp_hw_metadata. User can configure the
delta of HW txtime to HW RX-time by using "-l" argument. The default delta
is set to 1 second.

This patch is tested with stmmac on Intel Tiger Lake platform. Refer to
result below, the delta between pre-determined ETF txtime and actual HW
transmit complete time is around 24 us.

$ sudo ./xdp_hw_metadata eth0
...
xsk_ring_cons__peek: 1
0x55fcb80ce7a8: rx_desc[0]->addr=80100 addr=80100 comp_addr=80100 EoP
No rx_hash err=-95
HW RX-time: 1677764507059055964 (sec:1677764507.0591) delta to User RX-time sec:0.0002 (237.548 usec)
XDP RX-time: 1677764507059280741 (sec:1677764507.0593) delta to User RX-time sec:0.0000 (12.771 usec)
0x55fcb80ce7a8: ping-pong with csum=5619 (want 8626) csum_start=34 csum_offset=6
HW RX-time: 1677764507059055964 (sec:1677764507.0591) delta to HW Txtime sec:1.0000 (1000000.000 usec)
0x55fcb80ce7a8: complete tx idx=0 addr=18
HW Txtime: 1677764508059055964 (sec:1677764508.0591) delta to HW TX-complete-time sec:0.0000 (24.235 usec)
HW TX-complete-time: 1677764508059080199 (sec:1677764508.0591) delta to User TX-complete-time sec:0.0054 (5423.263 usec)
XDP RX-time: 1677764507059280741 (sec:1677764507.0593) delta to User TX-complete-time sec:1.0052 (1005222.721 usec)
HW RX-time: 1677764507059055964 (sec:1677764507.0591) delta to HW TX-complete-time sec:1.0000 (1000024.235 usec)
0x55fcb80ce7a8: complete rx idx=128 addr=80100

$ sudo ./xdp_hw_metadata eth0 -l 10000000
...
xsk_ring_cons__peek: 1
0x5626d54de7a8: rx_desc[0]->addr=80100 addr=80100 comp_addr=80100 EoP
No rx_hash err=-95
HW RX-time: 1677764655807717783 (sec:1677764655.8077) delta to User RX-time sec:0.0002 (240.571 usec)
XDP RX-time: 1677764655807942983 (sec:1677764655.8079) delta to User RX-time sec:0.0000 (15.371 usec)
0x5626d54de7a8: ping-pong with csum=5619 (want 8626) csum_start=34 csum_offset=6
HW RX-time: 1677764655807717783 (sec:1677764655.8077) delta to HW Txtime sec:0.0100 (10000.000 usec)
0x5626d54de7a8: complete tx idx=0 addr=18
HW Txtime: 1677764655817717783 (sec:1677764655.8177) delta to HW TX-complete-time sec:0.0000 (23.965 usec)
HW TX-complete-time: 1677764655817741748 (sec:1677764655.8177) delta to User TX-complete-time sec:0.0003 (291.792 usec)
XDP RX-time: 1677764655807942983 (sec:1677764655.8079) delta to User TX-complete-time sec:0.0101 (10090.557 usec)
HW RX-time: 1677764655807717783 (sec:1677764655.8077) delta to HW TX-complete-time sec:0.0100 (10023.965 usec)
0x5626d54de7a8: complete rx idx=128 addr=80100

Signed-off-by: Song Yoong Siang <[email protected]>
---
tools/testing/selftests/bpf/xdp_hw_metadata.c | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c
index 3291625ba4fb..e9c3e29dc538 100644
--- a/tools/testing/selftests/bpf/xdp_hw_metadata.c
+++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c
@@ -13,6 +13,7 @@
* - UDP 9091 packets trigger TX reply
* - TX HW timestamp is requested and reported back upon completion
* - TX checksum is requested
+ * - HW txtime is set for transmission
*/

#include <test_progs.h>
@@ -61,6 +62,8 @@ int rxq;
bool skip_tx;
__u64 last_hw_rx_timestamp;
__u64 last_xdp_rx_timestamp;
+__u64 last_txtime;
+__u64 txtime_delta_to_hw_rx_timestamp = 1000000000; /* 1 second */

void test__fail(void) { /* for network_helpers.c */ }

@@ -274,6 +277,8 @@ static bool complete_tx(struct xsk *xsk, clockid_t clock_id)
if (meta->completion.tx_timestamp) {
__u64 ref_tstamp = gettime(clock_id);

+ print_tstamp_delta("HW Txtime", "HW TX-complete-time",
+ last_txtime, meta->completion.tx_timestamp);
print_tstamp_delta("HW TX-complete-time", "User TX-complete-time",
meta->completion.tx_timestamp, ref_tstamp);
print_tstamp_delta("XDP RX-time", "User TX-complete-time",
@@ -371,6 +376,13 @@ static void ping_pong(struct xsk *xsk, void *rx_packet, clockid_t clock_id)
xsk, ntohs(udph->check), ntohs(want_csum),
meta->request.csum_start, meta->request.csum_offset);

+ /* Set txtime for Earliest TxTime First (ETF) */
+ meta->flags |= XDP_TXMD_FLAGS_TXTIME;
+ meta->request.txtime = last_hw_rx_timestamp + txtime_delta_to_hw_rx_timestamp;
+ last_txtime = meta->request.txtime;
+ print_tstamp_delta("HW RX-time", "HW Txtime", last_hw_rx_timestamp,
+ meta->request.txtime);
+
memcpy(data, rx_packet, len); /* don't share umem chunk for simplicity */
tx_desc->options |= XDP_TX_METADATA;
tx_desc->len = len;
@@ -595,6 +607,7 @@ static void print_usage(void)
" -h Display this help and exit\n\n"
" -m Enable multi-buffer XDP for larger MTU\n"
" -r Don't generate AF_XDP reply (rx metadata only)\n"
+ " -l Delta of HW Txtime to HW RX-time in ns (default: 1s)\n"
"Generate test packets on the other machine with:\n"
" echo -n xdp | nc -u -q1 <dst_ip> 9091\n";

@@ -605,7 +618,7 @@ static void read_args(int argc, char *argv[])
{
int opt;

- while ((opt = getopt(argc, argv, "chmr")) != -1) {
+ while ((opt = getopt(argc, argv, "chmrl:")) != -1) {
switch (opt) {
case 'c':
bind_flags &= ~XDP_USE_NEED_WAKEUP;
@@ -621,6 +634,9 @@ static void read_args(int argc, char *argv[])
case 'r':
skip_tx = true;
break;
+ case 'l':
+ txtime_delta_to_hw_rx_timestamp = atoll(optarg);
+ break;
case '?':
if (isprint(optopt))
fprintf(stderr, "Unknown option: -%c\n", optopt);
--
2.34.1

2023-12-01 10:47:07

by Jesper Dangaard Brouer

[permalink] [raw]
Subject: Re: [PATCH bpf-next v2 0/3] xsk: TX metadata txtime support



On 12/1/23 07:24, Song Yoong Siang wrote:
> This series expands XDP TX metadata framework to include ETF HW offload.
>
> Changes since v1:
> - rename Time-Based Scheduling (TBS) to Earliest TxTime First (ETF)
> - rename launch-time to txtime
>

I strongly disagree with this renaming (sorry to disagree with Willem).

The i210 and i225 chips call this LaunchTime in their programmers
datasheets, and even in the driver code[1].

Using this "txtime" name in the code is also confusing, because how can
people reading the code know the difference between:
- tmo_request_timestamp and tmo_request_txtime


[1]
https://github.com/xdp-project/xdp-project/blob/master/areas/tsn/code01_follow_qdisc_TSN_offload.org

> v1: https://patchwork.kernel.org/project/netdevbpf/cover/[email protected]/
>
> Song Yoong Siang (3):
> xsk: add ETF support to XDP Tx metadata
> net: stmmac: Add txtime support to XDP ZC
> selftests/bpf: Add txtime to xdp_hw_metadata
>
> Documentation/netlink/specs/netdev.yaml | 4 ++++
> Documentation/networking/xsk-tx-metadata.rst | 5 +++++
> drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 ++
> .../net/ethernet/stmicro/stmmac/stmmac_main.c | 13 +++++++++++++
> include/net/xdp_sock.h | 9 +++++++++
> include/net/xdp_sock_drv.h | 1 +
> include/uapi/linux/if_xdp.h | 9 +++++++++
> include/uapi/linux/netdev.h | 3 +++
> net/core/netdev-genl.c | 2 ++
> net/xdp/xsk.c | 3 +++
> tools/include/uapi/linux/if_xdp.h | 9 +++++++++
> tools/include/uapi/linux/netdev.h | 3 +++
> tools/net/ynl/generated/netdev-user.c | 1 +
> tools/testing/selftests/bpf/xdp_hw_metadata.c | 18 +++++++++++++++++-
> 14 files changed, 81 insertions(+), 1 deletion(-)
>

2023-12-01 13:43:28

by Song, Yoong Siang

[permalink] [raw]
Subject: RE: [PATCH bpf-next v2 0/3] xsk: TX metadata txtime support

On Friday, December 1, 2023 6:46 PM, Jesper Dangaard Brouer <[email protected]> wrote:
>On 12/1/23 07:24, Song Yoong Siang wrote:
>> This series expands XDP TX metadata framework to include ETF HW offload.
>>
>> Changes since v1:
>> - rename Time-Based Scheduling (TBS) to Earliest TxTime First (ETF)
>> - rename launch-time to txtime
>>
>
>I strongly disagree with this renaming (sorry to disagree with Willem).
>
>The i210 and i225 chips call this LaunchTime in their programmers
>datasheets, and even in the driver code[1].
>
>Using this "txtime" name in the code is also confusing, because how can
>people reading the code know the difference between:
> - tmo_request_timestamp and tmo_request_txtime
>

Hi Jesper and Willem,

How about using "launch_time" for the flag/variable and
"Earliest TxTime First" for the description/comments?

Thanks & Regards
Siang

>
>[1]
>https://github.com/xdp-project/xdp-
>project/blob/master/areas/tsn/code01_follow_qdisc_TSN_offload.org
>
>> v1:
>https://patchwork.kernel.org/project/netdevbpf/cover/20231130162028.852006-1-
>[email protected]/
>>
>> Song Yoong Siang (3):
>> xsk: add ETF support to XDP Tx metadata
>> net: stmmac: Add txtime support to XDP ZC
>> selftests/bpf: Add txtime to xdp_hw_metadata
>>
>> Documentation/netlink/specs/netdev.yaml | 4 ++++
>> Documentation/networking/xsk-tx-metadata.rst | 5 +++++
>> drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 ++
>> .../net/ethernet/stmicro/stmmac/stmmac_main.c | 13 +++++++++++++
>> include/net/xdp_sock.h | 9 +++++++++
>> include/net/xdp_sock_drv.h | 1 +
>> include/uapi/linux/if_xdp.h | 9 +++++++++
>> include/uapi/linux/netdev.h | 3 +++
>> net/core/netdev-genl.c | 2 ++
>> net/xdp/xsk.c | 3 +++
>> tools/include/uapi/linux/if_xdp.h | 9 +++++++++
>> tools/include/uapi/linux/netdev.h | 3 +++
>> tools/net/ynl/generated/netdev-user.c | 1 +
>> tools/testing/selftests/bpf/xdp_hw_metadata.c | 18 +++++++++++++++++-
>> 14 files changed, 81 insertions(+), 1 deletion(-)
>>

2023-12-01 15:03:00

by Jesper Dangaard Brouer

[permalink] [raw]
Subject: Re: [PATCH bpf-next v2 2/3] net: stmmac: Add txtime support to XDP ZC



On 12/1/23 07:24, Song Yoong Siang wrote:
> This patch enables txtime support to XDP zero copy via XDP Tx
> metadata framework.
>
> Signed-off-by: Song Yoong Siang<[email protected]>
> ---
> drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 ++
> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 13 +++++++++++++
> 2 files changed, 15 insertions(+)

I think we need to see other drivers using this new feature to evaluate
if API is sane.

I suggest implementing this for igc driver (chip i225) and also for igb
(i210 chip) that both support this kind of LaunchTime feature in HW.

The API and stmmac driver takes a u64 as time.
I'm wondering how this applies to i210 that[1] have 25-bit for
LaunchTime (with 32 nanosec granularity) limiting LaunchTime max 0.5
second into the future.
And i225 that [1] have 30-bit max 1 second into the future.


[1]
https://github.com/xdp-project/xdp-project/blob/master/areas/tsn/code01_follow_qdisc_TSN_offload.org

2023-12-01 15:09:58

by Willem de Bruijn

[permalink] [raw]
Subject: RE: [PATCH bpf-next v2 0/3] xsk: TX metadata txtime support

Song, Yoong Siang wrote:
> On Friday, December 1, 2023 6:46 PM, Jesper Dangaard Brouer <[email protected]> wrote:
> >On 12/1/23 07:24, Song Yoong Siang wrote:
> >> This series expands XDP TX metadata framework to include ETF HW offload.
> >>
> >> Changes since v1:
> >> - rename Time-Based Scheduling (TBS) to Earliest TxTime First (ETF)
> >> - rename launch-time to txtime
> >>
> >
> >I strongly disagree with this renaming (sorry to disagree with Willem).
> >
> >The i210 and i225 chips call this LaunchTime in their programmers
> >datasheets, and even in the driver code[1].
> >
> >Using this "txtime" name in the code is also confusing, because how can
> >people reading the code know the difference between:
> > - tmo_request_timestamp and tmo_request_txtime
> >
>
> Hi Jesper and Willem,
>
> How about using "launch_time" for the flag/variable and
> "Earliest TxTime First" for the description/comments?

I don't particularly care which term we use, as long as we're
consistent. Especially, don't keep introducing new synonyms.

The fact that one happens to be one vendor's marketing term does not
make it preferable, IMHO. On the contrary.

SO_TXTIME is in the ABI, and EDT has been used publicly in kernel
patches and conference talks, e.g., Van Jacobson's Netdev 0x12
keynote. Those are vendor agnostic commonly used terms.

But as long as Launch Time is not an Intel only trademark, fine to
select that.

2023-12-01 15:27:14

by Willem de Bruijn

[permalink] [raw]
Subject: Re: [PATCH bpf-next v2 2/3] net: stmmac: Add txtime support to XDP ZC

Jesper Dangaard Brouer wrote:
>
>
> On 12/1/23 07:24, Song Yoong Siang wrote:
> > This patch enables txtime support to XDP zero copy via XDP Tx
> > metadata framework.
> >
> > Signed-off-by: Song Yoong Siang<[email protected]>
> > ---
> > drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 ++
> > drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 13 +++++++++++++
> > 2 files changed, 15 insertions(+)
>
> I think we need to see other drivers using this new feature to evaluate
> if API is sane.
>
> I suggest implementing this for igc driver (chip i225) and also for igb
> (i210 chip) that both support this kind of LaunchTime feature in HW.
>
> The API and stmmac driver takes a u64 as time.
> I'm wondering how this applies to i210 that[1] have 25-bit for
> LaunchTime (with 32 nanosec granularity) limiting LaunchTime max 0.5
> second into the future.
> And i225 that [1] have 30-bit max 1 second into the future.
>
>
> [1]
> https://github.com/xdp-project/xdp-project/blob/master/areas/tsn/code01_follow_qdisc_TSN_offload.org

Good point Jesper.

Can we also explicitly document what the type of the field is?
Nanoseconds against the NIC hardware clock, it sounds like.

We have some experience with this, too. Something needs to do the
conversion from host clock to NIC clock. It is not sufficent to just
assume that the host clock is synced against the NIC clock by PTP.

2023-12-01 15:40:12

by Jesper Dangaard Brouer

[permalink] [raw]
Subject: Re: [PATCH bpf-next v2 0/3] xsk: TX metadata txtime support



On 12/1/23 16:09, Willem de Bruijn wrote:
> Song, Yoong Siang wrote:
>> On Friday, December 1, 2023 6:46 PM, Jesper Dangaard Brouer <[email protected]> wrote:
>>> On 12/1/23 07:24, Song Yoong Siang wrote:
>>>> This series expands XDP TX metadata framework to include ETF HW offload.
>>>>
>>>> Changes since v1:
>>>> - rename Time-Based Scheduling (TBS) to Earliest TxTime First (ETF)
>>>> - rename launch-time to txtime
>>>>
>>>
>>> I strongly disagree with this renaming (sorry to disagree with Willem).
>>>
>>> The i210 and i225 chips call this LaunchTime in their programmers
>>> datasheets, and even in the driver code[1].
>>>
>>> Using this "txtime" name in the code is also confusing, because how can
>>> people reading the code know the difference between:
>>> - tmo_request_timestamp and tmo_request_txtime
>>>
>>
>> Hi Jesper and Willem,
>>
>> How about using "launch_time" for the flag/variable and
>> "Earliest TxTime First" for the description/comments?
>

I don't follow why you are calling the feature:
- "Earliest TxTime First" (ETF).
- AFAIK this just reference an qdisc name (that most don't know exists)


> I don't particularly care which term we use, as long as we're
> consistent. Especially, don't keep introducing new synonyms.
>
> The fact that one happens to be one vendor's marketing term does not
> make it preferable, IMHO. On the contrary.
>

These kind of hardware features are defined as part of Time Sensitive
Networking (TSN).
I believe these TSN features are defined as part of IEEE 802.1Qbv (2015)
and according to Wikipedia[2] incorporated into IEEE 802.1Q.

[2] https://en.wikipedia.org/wiki/Time-Sensitive_Networking


> SO_TXTIME is in the ABI, and EDT has been used publicly in kernel
> patches and conference talks, e.g., Van Jacobson's Netdev 0x12
> keynote. Those are vendor agnostic commonly used terms.
>

I agree that EDT (Earliest Departure Time) have become a thing and term
in our community.
We could associate this feature with this.
I do fear what hardware behavior will be it if I e.g. ask it to send a
packet 2 sec in the future on i225 which max support 1 sec.
Will hardware send it at 1 sec?
Because then I'm violating the *Earliest* Departure Time.


> But as long as Launch Time is not an Intel only trademark, fine to
> select that.

The IEEE 802.1Qbv is sometimes called Time-Aware Shaper (TAS), but I
don't like to for us to name this after this. This features is simply
taking advantage of exposing one of the hardware building blocks
(controlling/setting packet "launch time") that can be used for
implementing a TAS.

I like the name "launch time" because it doesn't get easily confused
with other timestamps, and intuitively describes packet will be send at
a specific time (likely in future).

--Jesper

2023-12-02 14:16:14

by Willem de Bruijn

[permalink] [raw]
Subject: Re: [PATCH bpf-next v2 0/3] xsk: TX metadata txtime support

Jesper Dangaard Brouer wrote:
>
>
> On 12/1/23 16:09, Willem de Bruijn wrote:
> > Song, Yoong Siang wrote:
> >> On Friday, December 1, 2023 6:46 PM, Jesper Dangaard Brouer <[email protected]> wrote:
> >>> On 12/1/23 07:24, Song Yoong Siang wrote:
> >>>> This series expands XDP TX metadata framework to include ETF HW offload.
> >>>>
> >>>> Changes since v1:
> >>>> - rename Time-Based Scheduling (TBS) to Earliest TxTime First (ETF)
> >>>> - rename launch-time to txtime
> >>>>
> >>>
> >>> I strongly disagree with this renaming (sorry to disagree with Willem).
> >>>
> >>> The i210 and i225 chips call this LaunchTime in their programmers
> >>> datasheets, and even in the driver code[1].
> >>>
> >>> Using this "txtime" name in the code is also confusing, because how can
> >>> people reading the code know the difference between:
> >>> - tmo_request_timestamp and tmo_request_txtime
> >>>
> >>
> >> Hi Jesper and Willem,
> >>
> >> How about using "launch_time" for the flag/variable and
> >> "Earliest TxTime First" for the description/comments?
> >
>
> I don't follow why you are calling the feature:
> - "Earliest TxTime First" (ETF).
> - AFAIK this just reference an qdisc name (that most don't know exists)
>
>
> > I don't particularly care which term we use, as long as we're
> > consistent. Especially, don't keep introducing new synonyms.
> >
> > The fact that one happens to be one vendor's marketing term does not
> > make it preferable, IMHO. On the contrary.
> >
>
> These kind of hardware features are defined as part of Time Sensitive
> Networking (TSN).
> I believe these TSN features are defined as part of IEEE 802.1Qbv (2015)
> and according to Wikipedia[2] incorporated into IEEE 802.1Q.
>
> [2] https://en.wikipedia.org/wiki/Time-Sensitive_Networking
>
>
> > SO_TXTIME is in the ABI, and EDT has been used publicly in kernel
> > patches and conference talks, e.g., Van Jacobson's Netdev 0x12
> > keynote. Those are vendor agnostic commonly used terms.
> >
>
> I agree that EDT (Earliest Departure Time) have become a thing and term
> in our community.
> We could associate this feature with this.
> I do fear what hardware behavior will be it if I e.g. ask it to send a
> packet 2 sec in the future on i225 which max support 1 sec.
> Will hardware send it at 1 sec?
> Because then I'm violating the *Earliest* Departure Time.

That should definitely not happen. At least not on a device that
implements EDT semantics.

This relates to Jakub's question in the previous thread on whether
this mechanism allows out-of-order transmission or maintains FIFO
behavior. That really is device specific.

Older devices only support this for low rate (PTP) and with a small
fixed number of outstanding requests. For pacing offload, devices need
to support up to linerate and out-of-order.

I don't think we want to enforce either in software, as the hardware
is already out there. But it would be good if drivers can somehow
label these capabilities. Including programmable horizon.

It is up to the qdisc to ensure that it does not pass packets to the
device beyond its horizon.

ETF and FQ already have a concept of horizon. And a way to queue
errors for packets out of bound (SO_EE_CODE_TXTIME_..).

>
> > But as long as Launch Time is not an Intel only trademark, fine to
> > select that.
>
> The IEEE 802.1Qbv is sometimes called Time-Aware Shaper (TAS), but I
> don't like to for us to name this after this. This features is simply
> taking advantage of exposing one of the hardware building blocks
> (controlling/setting packet "launch time") that can be used for
> implementing a TAS.
>
> I like the name "launch time" because it doesn't get easily confused
> with other timestamps, and intuitively describes packet will be send at
> a specific time (likely in future).
>
> --Jesper

Understood on your point that txtime and tx_timestamp are too similar.
As said, I don't care strongly. Launch time sounds fine to me. Others
can speak up if they disagree.

I take launch time as a less strict than EDT: it is a request to send
at a certain time, with no strict definition on uncertainty. While EDT
more strictly ensures that a packet is not sent before the timestamp.

2023-12-03 09:17:17

by Song, Yoong Siang

[permalink] [raw]
Subject: RE: [PATCH bpf-next v2 0/3] xsk: TX metadata txtime support

On Saturday, December 2, 2023 10:16 PM, Willem de Bruijn wrote:
>Jesper Dangaard Brouer wrote:
>>
>>
>> On 12/1/23 16:09, Willem de Bruijn wrote:
>> > Song, Yoong Siang wrote:
>> >> On Friday, December 1, 2023 6:46 PM, Jesper Dangaard Brouer
><[email protected]> wrote:
>> >>> On 12/1/23 07:24, Song Yoong Siang wrote:
>> >>>> This series expands XDP TX metadata framework to include ETF HW offload.
>> >>>>
>> >>>> Changes since v1:
>> >>>> - rename Time-Based Scheduling (TBS) to Earliest TxTime First (ETF)
>> >>>> - rename launch-time to txtime
>> >>>>
>> >>>
>> >>> I strongly disagree with this renaming (sorry to disagree with Willem).
>> >>>
>> >>> The i210 and i225 chips call this LaunchTime in their programmers
>> >>> datasheets, and even in the driver code[1].
>> >>>
>> >>> Using this "txtime" name in the code is also confusing, because how can
>> >>> people reading the code know the difference between:
>> >>> - tmo_request_timestamp and tmo_request_txtime
>> >>>
>> >>
>> >> Hi Jesper and Willem,
>> >>
>> >> How about using "launch_time" for the flag/variable and
>> >> "Earliest TxTime First" for the description/comments?
>> >
>>
>> I don't follow why you are calling the feature:
>> - "Earliest TxTime First" (ETF).
>> - AFAIK this just reference an qdisc name (that most don't know exists)
>>
>>
>> > I don't particularly care which term we use, as long as we're
>> > consistent. Especially, don't keep introducing new synonyms.
>> >
>> > The fact that one happens to be one vendor's marketing term does not
>> > make it preferable, IMHO. On the contrary.
>> >
>>
>> These kind of hardware features are defined as part of Time Sensitive
>> Networking (TSN).
>> I believe these TSN features are defined as part of IEEE 802.1Qbv (2015)
>> and according to Wikipedia[2] incorporated into IEEE 802.1Q.
>>
>> [2] https://en.wikipedia.org/wiki/Time-Sensitive_Networking
>>
>>
>> > SO_TXTIME is in the ABI, and EDT has been used publicly in kernel
>> > patches and conference talks, e.g., Van Jacobson's Netdev 0x12
>> > keynote. Those are vendor agnostic commonly used terms.
>> >
>>
>> I agree that EDT (Earliest Departure Time) have become a thing and term
>> in our community.
>> We could associate this feature with this.
>> I do fear what hardware behavior will be it if I e.g. ask it to send a
>> packet 2 sec in the future on i225 which max support 1 sec.
>> Will hardware send it at 1 sec?
>> Because then I'm violating the *Earliest* Departure Time.
>
>That should definitely not happen. At least not on a device that
>implements EDT semantics.
>
>This relates to Jakub's question in the previous thread on whether
>this mechanism allows out-of-order transmission or maintains FIFO
>behavior. That really is device specific.
>
>Older devices only support this for low rate (PTP) and with a small
>fixed number of outstanding requests. For pacing offload, devices need
>to support up to linerate and out-of-order.
>
>I don't think we want to enforce either in software, as the hardware
>is already out there. But it would be good if drivers can somehow
>label these capabilities. Including programmable horizon.
>
>It is up to the qdisc to ensure that it does not pass packets to the
>device beyond its horizon.
>
>ETF and FQ already have a concept of horizon. And a way to queue
>errors for packets out of bound (SO_EE_CODE_TXTIME_..).
>
>>
>> > But as long as Launch Time is not an Intel only trademark, fine to
>> > select that.
>>
>> The IEEE 802.1Qbv is sometimes called Time-Aware Shaper (TAS), but I
>> don't like to for us to name this after this. This features is simply
>> taking advantage of exposing one of the hardware building blocks
>> (controlling/setting packet "launch time") that can be used for
>> implementing a TAS.
>>
>> I like the name "launch time" because it doesn't get easily confused
>> with other timestamps, and intuitively describes packet will be send at
>> a specific time (likely in future).
>>
>> --Jesper
>
>Understood on your point that txtime and tx_timestamp are too similar.
>As said, I don't care strongly. Launch time sounds fine to me. Others
>can speak up if they disagree.
>
>I take launch time as a less strict than EDT: it is a request to send
>at a certain time, with no strict definition on uncertainty. While EDT
>more strictly ensures that a packet is not sent before the timestamp.

Thanks for the deep discussion and information. I agree with launch time too.
I will submit v3 with launch time so that others can review on the
new naming and provide their feedback.

2023-12-03 10:11:21

by Song, Yoong Siang

[permalink] [raw]
Subject: RE: [PATCH bpf-next v2 2/3] net: stmmac: Add txtime support to XDP ZC

On Friday, December 1, 2023 11:02 PM, Jesper Dangaard Brouer wrote:
>On 12/1/23 07:24, Song Yoong Siang wrote:
>> This patch enables txtime support to XDP zero copy via XDP Tx
>> metadata framework.
>>
>> Signed-off-by: Song Yoong Siang<[email protected]>
>> ---
>> drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 ++
>> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 13 +++++++++++++
>> 2 files changed, 15 insertions(+)
>
>I think we need to see other drivers using this new feature to evaluate
>if API is sane.
>
>I suggest implementing this for igc driver (chip i225) and also for igb
>(i210 chip) that both support this kind of LaunchTime feature in HW.
>
>The API and stmmac driver takes a u64 as time.
>I'm wondering how this applies to i210 that[1] have 25-bit for
>LaunchTime (with 32 nanosec granularity) limiting LaunchTime max 0.5
>second into the future.
>And i225 that [1] have 30-bit max 1 second into the future.
>
>
>[1]
>https://github.com/xdp-project/xdp-
>project/blob/master/areas/tsn/code01_follow_qdisc_TSN_offload.org

I am using u64 for launch time because existing EDT framework is using it.
Refer to struct sk_buff below. Both u64 and ktime_t can be used as launch time.
I choose u64 because ktime_t often requires additional type conversion and
we didn't expect negative value of time.

include/linux/skbuff.h-744- * @tstamp: Time we arrived/left
include/linux/skbuff.h:745- * @skb_mstamp_ns: (aka @tstamp) earliest departure time; start point
include/linux/skbuff.h-746- * for retransmit timer
--
include/linux/skbuff.h-880- union {
include/linux/skbuff.h-881- ktime_t tstamp;
include/linux/skbuff.h:882- u64 skb_mstamp_ns; /* earliest departure time */
include/linux/skbuff.h-883- };

tstamp/skb_mstamp_ns are used by various drivers for launch time support
on normal packet, so I think u64 should be "friendly" to all the drivers. For an
example, igc driver will take launch time from tstamp and recalculate it
accordingly (i225 expect user to program "delta time" instead of "time" into
HW register).

drivers/net/ethernet/intel/igc/igc_main.c-1602- txtime = skb->tstamp;
drivers/net/ethernet/intel/igc/igc_main.c-1603- skb->tstamp = ktime_set(0, 0);
drivers/net/ethernet/intel/igc/igc_main.c:1604- launch_time = igc_tx_launchtime(tx_ring, txtime, &first_flag, &insert_empty);

Do you think this is enough to say the API is sane?

2023-12-04 14:58:57

by Willem de Bruijn

[permalink] [raw]
Subject: RE: [PATCH bpf-next v2 2/3] net: stmmac: Add txtime support to XDP ZC

Song, Yoong Siang wrote:
> On Friday, December 1, 2023 11:02 PM, Jesper Dangaard Brouer wrote:
> >On 12/1/23 07:24, Song Yoong Siang wrote:
> >> This patch enables txtime support to XDP zero copy via XDP Tx
> >> metadata framework.
> >>
> >> Signed-off-by: Song Yoong Siang<[email protected]>
> >> ---
> >> drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 ++
> >> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 13 +++++++++++++
> >> 2 files changed, 15 insertions(+)
> >
> >I think we need to see other drivers using this new feature to evaluate
> >if API is sane.
> >
> >I suggest implementing this for igc driver (chip i225) and also for igb
> >(i210 chip) that both support this kind of LaunchTime feature in HW.
> >
> >The API and stmmac driver takes a u64 as time.
> >I'm wondering how this applies to i210 that[1] have 25-bit for
> >LaunchTime (with 32 nanosec granularity) limiting LaunchTime max 0.5
> >second into the future.
> >And i225 that [1] have 30-bit max 1 second into the future.
> >
> >
> >[1]
> >https://github.com/xdp-project/xdp-
> >project/blob/master/areas/tsn/code01_follow_qdisc_TSN_offload.org
>
> I am using u64 for launch time because existing EDT framework is using it.
> Refer to struct sk_buff below. Both u64 and ktime_t can be used as launch time.
> I choose u64 because ktime_t often requires additional type conversion and
> we didn't expect negative value of time.
>
> include/linux/skbuff.h-744- * @tstamp: Time we arrived/left
> include/linux/skbuff.h:745- * @skb_mstamp_ns: (aka @tstamp) earliest departure time; start point
> include/linux/skbuff.h-746- * for retransmit timer
> --
> include/linux/skbuff.h-880- union {
> include/linux/skbuff.h-881- ktime_t tstamp;
> include/linux/skbuff.h:882- u64 skb_mstamp_ns; /* earliest departure time */
> include/linux/skbuff.h-883- };
>
> tstamp/skb_mstamp_ns are used by various drivers for launch time support
> on normal packet, so I think u64 should be "friendly" to all the drivers. For an
> example, igc driver will take launch time from tstamp and recalculate it
> accordingly (i225 expect user to program "delta time" instead of "time" into
> HW register).
>
> drivers/net/ethernet/intel/igc/igc_main.c-1602- txtime = skb->tstamp;
> drivers/net/ethernet/intel/igc/igc_main.c-1603- skb->tstamp = ktime_set(0, 0);
> drivers/net/ethernet/intel/igc/igc_main.c:1604- launch_time = igc_tx_launchtime(tx_ring, txtime, &first_flag, &insert_empty);
>
> Do you think this is enough to say the API is sane?

u64 nsec sounds sane to be. It must be made explicit with clock source
it is against.

Some applications could want to do the conversion from a clock source
to raw NIC cycle counter in userspace or BPF and program the raw
value. So it may be worthwhile to add an clock source argument -- even
if initially only CLOCK_MONOTONIC is supported.

See tools/testing/selftests/net/so_txtime.sh for how the FQ and ETF
qdiscs already disagree on the clock source that they use.


2023-12-05 14:44:07

by Song, Yoong Siang

[permalink] [raw]
Subject: RE: [PATCH bpf-next v2 2/3] net: stmmac: Add txtime support to XDP ZC

On Monday, December 4, 2023 10:58 PM, Willem de Bruijn wrote:
>Song, Yoong Siang wrote:
>> On Friday, December 1, 2023 11:02 PM, Jesper Dangaard Brouer wrote:
>> >On 12/1/23 07:24, Song Yoong Siang wrote:
>> >> This patch enables txtime support to XDP zero copy via XDP Tx
>> >> metadata framework.
>> >>
>> >> Signed-off-by: Song Yoong Siang<[email protected]>
>> >> ---
>> >> drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 ++
>> >> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 13 +++++++++++++
>> >> 2 files changed, 15 insertions(+)
>> >
>> >I think we need to see other drivers using this new feature to evaluate
>> >if API is sane.
>> >
>> >I suggest implementing this for igc driver (chip i225) and also for igb
>> >(i210 chip) that both support this kind of LaunchTime feature in HW.
>> >
>> >The API and stmmac driver takes a u64 as time.
>> >I'm wondering how this applies to i210 that[1] have 25-bit for
>> >LaunchTime (with 32 nanosec granularity) limiting LaunchTime max 0.5
>> >second into the future.
>> >And i225 that [1] have 30-bit max 1 second into the future.
>> >
>> >
>> >[1]
>> >https://github.com/xdp-project/xdp-
>> >project/blob/master/areas/tsn/code01_follow_qdisc_TSN_offload.org
>>
>> I am using u64 for launch time because existing EDT framework is using it.
>> Refer to struct sk_buff below. Both u64 and ktime_t can be used as launch time.
>> I choose u64 because ktime_t often requires additional type conversion and
>> we didn't expect negative value of time.
>>
>> include/linux/skbuff.h-744- * @tstamp: Time we arrived/left
>> include/linux/skbuff.h:745- * @skb_mstamp_ns: (aka @tstamp) earliest departure
>time; start point
>> include/linux/skbuff.h-746- * for retransmit timer
>> --
>> include/linux/skbuff.h-880- union {
>> include/linux/skbuff.h-881- ktime_t tstamp;
>> include/linux/skbuff.h:882- u64 skb_mstamp_ns; /* earliest departure
>time */
>> include/linux/skbuff.h-883- };
>>
>> tstamp/skb_mstamp_ns are used by various drivers for launch time support
>> on normal packet, so I think u64 should be "friendly" to all the drivers. For an
>> example, igc driver will take launch time from tstamp and recalculate it
>> accordingly (i225 expect user to program "delta time" instead of "time" into
>> HW register).
>>
>> drivers/net/ethernet/intel/igc/igc_main.c-1602- txtime = skb->tstamp;
>> drivers/net/ethernet/intel/igc/igc_main.c-1603- skb->tstamp = ktime_set(0, 0);
>> drivers/net/ethernet/intel/igc/igc_main.c:1604- launch_time =
>igc_tx_launchtime(tx_ring, txtime, &first_flag, &insert_empty);
>>
>> Do you think this is enough to say the API is sane?
>
>u64 nsec sounds sane to be. It must be made explicit with clock source
>it is against.
>

The u64 launch time should base on NIC PTP hardware clock (PHC).
I will add documentation saying which clock source it is against

>Some applications could want to do the conversion from a clock source
>to raw NIC cycle counter in userspace or BPF and program the raw
>value. So it may be worthwhile to add an clock source argument -- even
>if initially only CLOCK_MONOTONIC is supported.

Sorry, not so understand your suggestion on adding clock source argument.
Are you suggesting to add clock source for the selftest xdp_hw_metadata apps?
IMHO, no need to add clock source as the clock source for launch time
should always base on NIC PHC.

>
>See tools/testing/selftests/net/so_txtime.sh for how the FQ and ETF
>qdiscs already disagree on the clock source that they use.
>



2023-12-05 14:56:09

by Willem de Bruijn

[permalink] [raw]
Subject: RE: [PATCH bpf-next v2 2/3] net: stmmac: Add txtime support to XDP ZC

Song, Yoong Siang wrote:
> On Monday, December 4, 2023 10:58 PM, Willem de Bruijn wrote:
> >Song, Yoong Siang wrote:
> >> On Friday, December 1, 2023 11:02 PM, Jesper Dangaard Brouer wrote:
> >> >On 12/1/23 07:24, Song Yoong Siang wrote:
> >> >> This patch enables txtime support to XDP zero copy via XDP Tx
> >> >> metadata framework.
> >> >>
> >> >> Signed-off-by: Song Yoong Siang<[email protected]>
> >> >> ---
> >> >> drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 ++
> >> >> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 13 +++++++++++++
> >> >> 2 files changed, 15 insertions(+)
> >> >
> >> >I think we need to see other drivers using this new feature to evaluate
> >> >if API is sane.
> >> >
> >> >I suggest implementing this for igc driver (chip i225) and also for igb
> >> >(i210 chip) that both support this kind of LaunchTime feature in HW.
> >> >
> >> >The API and stmmac driver takes a u64 as time.
> >> >I'm wondering how this applies to i210 that[1] have 25-bit for
> >> >LaunchTime (with 32 nanosec granularity) limiting LaunchTime max 0.5
> >> >second into the future.
> >> >And i225 that [1] have 30-bit max 1 second into the future.
> >> >
> >> >
> >> >[1]
> >> >https://github.com/xdp-project/xdp-
> >> >project/blob/master/areas/tsn/code01_follow_qdisc_TSN_offload.org
> >>
> >> I am using u64 for launch time because existing EDT framework is using it.
> >> Refer to struct sk_buff below. Both u64 and ktime_t can be used as launch time.
> >> I choose u64 because ktime_t often requires additional type conversion and
> >> we didn't expect negative value of time.
> >>
> >> include/linux/skbuff.h-744- * @tstamp: Time we arrived/left
> >> include/linux/skbuff.h:745- * @skb_mstamp_ns: (aka @tstamp) earliest departure
> >time; start point
> >> include/linux/skbuff.h-746- * for retransmit timer
> >> --
> >> include/linux/skbuff.h-880- union {
> >> include/linux/skbuff.h-881- ktime_t tstamp;
> >> include/linux/skbuff.h:882- u64 skb_mstamp_ns; /* earliest departure
> >time */
> >> include/linux/skbuff.h-883- };
> >>
> >> tstamp/skb_mstamp_ns are used by various drivers for launch time support
> >> on normal packet, so I think u64 should be "friendly" to all the drivers. For an
> >> example, igc driver will take launch time from tstamp and recalculate it
> >> accordingly (i225 expect user to program "delta time" instead of "time" into
> >> HW register).
> >>
> >> drivers/net/ethernet/intel/igc/igc_main.c-1602- txtime = skb->tstamp;
> >> drivers/net/ethernet/intel/igc/igc_main.c-1603- skb->tstamp = ktime_set(0, 0);
> >> drivers/net/ethernet/intel/igc/igc_main.c:1604- launch_time =
> >igc_tx_launchtime(tx_ring, txtime, &first_flag, &insert_empty);
> >>
> >> Do you think this is enough to say the API is sane?
> >
> >u64 nsec sounds sane to be. It must be made explicit with clock source
> >it is against.
> >
>
> The u64 launch time should base on NIC PTP hardware clock (PHC).
> I will add documentation saying which clock source it is against

It's not that obvious to me that that is the right and only choice.
See below.

> >Some applications could want to do the conversion from a clock source
> >to raw NIC cycle counter in userspace or BPF and program the raw
> >value. So it may be worthwhile to add an clock source argument -- even
> >if initially only CLOCK_MONOTONIC is supported.
>
> Sorry, not so understand your suggestion on adding clock source argument.
> Are you suggesting to add clock source for the selftest xdp_hw_metadata apps?
> IMHO, no need to add clock source as the clock source for launch time
> should always base on NIC PHC.

This is not how FQ and ETF qdiscs pass timestamps to drivers today.

Those are in CLOCK_MONOTONIC or CLOCK_TAI. The driver is expected to
convert from that to its descriptor format, both to the reduced bit
width and the NIC PHC.

See also for instance how sch_etf has an explicit q->clock_id match,
and SO_TXTIME added an sk_clock_id for the same purpose: to agree on
which clock source is being used.

2023-12-05 15:28:55

by Song, Yoong Siang

[permalink] [raw]
Subject: RE: [PATCH bpf-next v2 2/3] net: stmmac: Add txtime support to XDP ZC

On Tuesday, December 5, 2023 10:55 PM, Willem de Bruijn wrote:
>Song, Yoong Siang wrote:
>> On Monday, December 4, 2023 10:58 PM, Willem de Bruijn wrote:
>> >Song, Yoong Siang wrote:
>> >> On Friday, December 1, 2023 11:02 PM, Jesper Dangaard Brouer wrote:
>> >> >On 12/1/23 07:24, Song Yoong Siang wrote:
>> >> >> This patch enables txtime support to XDP zero copy via XDP Tx
>> >> >> metadata framework.
>> >> >>
>> >> >> Signed-off-by: Song Yoong Siang<[email protected]>
>> >> >> ---
>> >> >> drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 ++
>> >> >> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 13 +++++++++++++
>> >> >> 2 files changed, 15 insertions(+)
>> >> >
>> >> >I think we need to see other drivers using this new feature to evaluate
>> >> >if API is sane.
>> >> >
>> >> >I suggest implementing this for igc driver (chip i225) and also for igb
>> >> >(i210 chip) that both support this kind of LaunchTime feature in HW.
>> >> >
>> >> >The API and stmmac driver takes a u64 as time.
>> >> >I'm wondering how this applies to i210 that[1] have 25-bit for
>> >> >LaunchTime (with 32 nanosec granularity) limiting LaunchTime max 0.5
>> >> >second into the future.
>> >> >And i225 that [1] have 30-bit max 1 second into the future.
>> >> >
>> >> >
>> >> >[1]
>> >> >https://github.com/xdp-project/xdp-
>> >> >project/blob/master/areas/tsn/code01_follow_qdisc_TSN_offload.org
>> >>
>> >> I am using u64 for launch time because existing EDT framework is using it.
>> >> Refer to struct sk_buff below. Both u64 and ktime_t can be used as launch time.
>> >> I choose u64 because ktime_t often requires additional type conversion and
>> >> we didn't expect negative value of time.
>> >>
>> >> include/linux/skbuff.h-744- * @tstamp: Time we arrived/left
>> >> include/linux/skbuff.h:745- * @skb_mstamp_ns: (aka @tstamp) earliest
>departure
>> >time; start point
>> >> include/linux/skbuff.h-746- * for retransmit timer
>> >> --
>> >> include/linux/skbuff.h-880- union {
>> >> include/linux/skbuff.h-881- ktime_t tstamp;
>> >> include/linux/skbuff.h:882- u64 skb_mstamp_ns; /* earliest
>departure
>> >time */
>> >> include/linux/skbuff.h-883- };
>> >>
>> >> tstamp/skb_mstamp_ns are used by various drivers for launch time support
>> >> on normal packet, so I think u64 should be "friendly" to all the drivers. For an
>> >> example, igc driver will take launch time from tstamp and recalculate it
>> >> accordingly (i225 expect user to program "delta time" instead of "time" into
>> >> HW register).
>> >>
>> >> drivers/net/ethernet/intel/igc/igc_main.c-1602- txtime = skb->tstamp;
>> >> drivers/net/ethernet/intel/igc/igc_main.c-1603- skb->tstamp = ktime_set(0, 0);
>> >> drivers/net/ethernet/intel/igc/igc_main.c:1604- launch_time =
>> >igc_tx_launchtime(tx_ring, txtime, &first_flag, &insert_empty);
>> >>
>> >> Do you think this is enough to say the API is sane?
>> >
>> >u64 nsec sounds sane to be. It must be made explicit with clock source
>> >it is against.
>> >
>>
>> The u64 launch time should base on NIC PTP hardware clock (PHC).
>> I will add documentation saying which clock source it is against
>
>It's not that obvious to me that that is the right and only choice.
>See below.
>
>> >Some applications could want to do the conversion from a clock source
>> >to raw NIC cycle counter in userspace or BPF and program the raw
>> >value. So it may be worthwhile to add an clock source argument -- even
>> >if initially only CLOCK_MONOTONIC is supported.
>>
>> Sorry, not so understand your suggestion on adding clock source argument.
>> Are you suggesting to add clock source for the selftest xdp_hw_metadata apps?
>> IMHO, no need to add clock source as the clock source for launch time
>> should always base on NIC PHC.
>
>This is not how FQ and ETF qdiscs pass timestamps to drivers today.
>
>Those are in CLOCK_MONOTONIC or CLOCK_TAI. The driver is expected to
>convert from that to its descriptor format, both to the reduced bit
>width and the NIC PHC.
>
>See also for instance how sch_etf has an explicit q->clock_id match,
>and SO_TXTIME added an sk_clock_id for the same purpose: to agree on
>which clock source is being used.

I see. Thank for the explanation. I will try to add clock source arguments
In next version.