2022-09-14 15:36:35

by Vladimir Oltean

[permalink] [raw]
Subject: [PATCH net-next 00/13] Add tc-taprio support for queueMaxSDU

Michael and Xiaoliang will probably be aware that the tc-taprio offload
mode supported by the Felix DSA driver has limitations surrounding its
guard bands.

The initial discussion was at:
https://lore.kernel.org/netdev/[email protected]/

with the latest status being that we now have a vsc9959_tas_guard_bands_update()
method which makes a best-guess attempt at how much useful space to
reserve for packet scheduling in a taprio interval, and how much to
reserve for guard bands.

IEEE 802.1Q actually does offer a tunable variable (queueMaxSDU) which
can determine the max MTU supported per traffic class. In turn we can
determine the size we need for the guard bands, depending on the
queueMaxSDU. This way we can make the guard band of small taprio
intervals smaller than one full MTU worth of transmission time, if we
know that said traffic class will transport only smaller packets.

Allow input of queueMaxSDU through netlink into tc-taprio, offload it to
the hardware I have access to (LS1028A), and deny non-default values to
everyone else.

First 3 patches are some cleanups I made while figuring out what exactly
gets called for taprio software mode, and what gets called for offload
mode.

Vladimir Oltean (13):
net/sched: taprio: remove redundant FULL_OFFLOAD_IS_ENABLED check in
taprio_enqueue
net/sched: taprio: stop going through private ops for dequeue and peek
net/sched: taprio: add extack messages in taprio_init
net/sched: taprio: allow user input of per-tc max SDU
net: dsa: felix: offload per-tc max SDU from tc-taprio
net: enetc: cache accesses to &priv->si->hw
net: enetc: offload per-tc max SDU from tc-taprio
net: dsa: hellcreek: deny tc-taprio changes to per-tc max SDU
net: dsa: sja1105: deny tc-taprio changes to per-tc max SDU
tsnep: deny tc-taprio changes to per-tc max SDU
igc: deny tc-taprio changes to per-tc max SDU
net: stmmac: deny tc-taprio changes to per-tc max SDU
net: am65-cpsw: deny tc-taprio changes to per-tc max SDU

drivers/net/dsa/hirschmann/hellcreek.c | 5 +
drivers/net/dsa/ocelot/felix_vsc9959.c | 20 +-
drivers/net/dsa/sja1105/sja1105_tas.c | 6 +-
drivers/net/ethernet/engleder/tsnep_tc.c | 6 +-
drivers/net/ethernet/freescale/enetc/enetc.c | 28 ++-
drivers/net/ethernet/freescale/enetc/enetc.h | 12 +-
.../net/ethernet/freescale/enetc/enetc_pf.c | 25 ++-
.../net/ethernet/freescale/enetc/enetc_qos.c | 70 +++----
drivers/net/ethernet/intel/igc/igc_main.c | 6 +-
.../net/ethernet/stmicro/stmmac/stmmac_tc.c | 6 +-
drivers/net/ethernet/ti/am65-cpsw-qos.c | 6 +-
include/net/pkt_sched.h | 1 +
include/uapi/linux/pkt_sched.h | 11 +
net/sched/sch_taprio.c | 194 +++++++++++++-----
14 files changed, 283 insertions(+), 113 deletions(-)

--
2.34.1


2022-09-14 15:36:53

by Vladimir Oltean

[permalink] [raw]
Subject: [PATCH net-next 03/13] net/sched: taprio: add extack messages in taprio_init

Stop contributing to the proverbial user unfriendliness of tc, and tell
the user what is wrong wherever possible.

Signed-off-by: Vladimir Oltean <[email protected]>
---
net/sched/sch_taprio.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
index 226aa6efb365..2a4b8f59f444 100644
--- a/net/sched/sch_taprio.c
+++ b/net/sched/sch_taprio.c
@@ -1675,11 +1675,15 @@ static int taprio_init(struct Qdisc *sch, struct nlattr *opt,
list_add(&q->taprio_list, &taprio_list);
spin_unlock(&taprio_list_lock);

- if (sch->parent != TC_H_ROOT)
+ if (sch->parent != TC_H_ROOT) {
+ NL_SET_ERR_MSG_MOD(extack, "Can only be attached as root qdisc");
return -EOPNOTSUPP;
+ }

- if (!netif_is_multiqueue(dev))
+ if (!netif_is_multiqueue(dev)) {
+ NL_SET_ERR_MSG_MOD(extack, "Multi-queue device is required");
return -EOPNOTSUPP;
+ }

/* pre-allocate qdisc, attachment can't fail */
q->qdiscs = kcalloc(dev->num_tx_queues,
--
2.34.1

2022-09-14 15:37:55

by Vladimir Oltean

[permalink] [raw]
Subject: [PATCH net-next 04/13] net/sched: taprio: allow user input of per-tc max SDU

IEEE 802.1Q clause 12.29.1.1 "The queueMaxSDUTable structure and data
types" and 8.6.8.4 "Enhancements for scheduled traffic" talk about the
existence of a per traffic class limitation of maximum frame sizes, with
a fallback on the port-based MTU.

As far as I am able to understand, the 802.1Q Service Data Unit (SDU)
represents the MAC Service Data Unit (MSDU, i.e. L2 payload), excluding
any number of prepended VLAN headers which may be otherwise present in
the MSDU. Therefore, the queueMaxSDU is directly comparable to the
device MTU (1500 means L2 payload sizes are accepted, or frame sizes of
1518 octets, or 1522 plus one VLAN header). Drivers which offload this
are directly responsible of translating into other units of measurement.

Signed-off-by: Vladimir Oltean <[email protected]>
---
include/net/pkt_sched.h | 1 +
include/uapi/linux/pkt_sched.h | 11 +++
net/sched/sch_taprio.c | 122 ++++++++++++++++++++++++++++++++-
3 files changed, 133 insertions(+), 1 deletion(-)

diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
index 29f65632ebc5..88080998557b 100644
--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -168,6 +168,7 @@ struct tc_taprio_qopt_offload {
ktime_t base_time;
u64 cycle_time;
u64 cycle_time_extension;
+ u32 max_sdu[TC_MAX_QUEUE];

size_t num_entries;
struct tc_taprio_sched_entry entries[];
diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
index f292b467b27f..000eec106856 100644
--- a/include/uapi/linux/pkt_sched.h
+++ b/include/uapi/linux/pkt_sched.h
@@ -1232,6 +1232,16 @@ enum {
#define TCA_TAPRIO_ATTR_FLAG_TXTIME_ASSIST _BITUL(0)
#define TCA_TAPRIO_ATTR_FLAG_FULL_OFFLOAD _BITUL(1)

+enum {
+ TCA_TAPRIO_TC_ENTRY_UNSPEC,
+ TCA_TAPRIO_TC_ENTRY_INDEX, /* u32 */
+ TCA_TAPRIO_TC_ENTRY_MAX_SDU, /* u32 */
+
+ /* add new constants above here */
+ __TCA_TAPRIO_TC_ENTRY_CNT,
+ TCA_TAPRIO_TC_ENTRY_MAX = (__TCA_TAPRIO_TC_ENTRY_CNT - 1)
+};
+
enum {
TCA_TAPRIO_ATTR_UNSPEC,
TCA_TAPRIO_ATTR_PRIOMAP, /* struct tc_mqprio_qopt */
@@ -1245,6 +1255,7 @@ enum {
TCA_TAPRIO_ATTR_SCHED_CYCLE_TIME_EXTENSION, /* s64 */
TCA_TAPRIO_ATTR_FLAGS, /* u32 */
TCA_TAPRIO_ATTR_TXTIME_DELAY, /* u32 */
+ TCA_TAPRIO_ATTR_TC_ENTRY, /* nest */
__TCA_TAPRIO_ATTR_MAX,
};

diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
index 2a4b8f59f444..834cbed88e4f 100644
--- a/net/sched/sch_taprio.c
+++ b/net/sched/sch_taprio.c
@@ -79,6 +79,7 @@ struct taprio_sched {
struct sched_gate_list __rcu *admin_sched;
struct hrtimer advance_timer;
struct list_head taprio_list;
+ u32 max_sdu[TC_MAX_QUEUE];
u32 txtime_delay;
};

@@ -416,6 +417,9 @@ static int taprio_enqueue_one(struct sk_buff *skb, struct Qdisc *sch,
struct Qdisc *child, struct sk_buff **to_free)
{
struct taprio_sched *q = qdisc_priv(sch);
+ struct net_device *dev = qdisc_dev(sch);
+ int prio = skb->priority;
+ u8 tc;

/* sk_flags are only safe to use on full sockets. */
if (skb->sk && sk_fullsock(skb->sk) && sock_flag(skb->sk, SOCK_TXTIME)) {
@@ -427,6 +431,12 @@ static int taprio_enqueue_one(struct sk_buff *skb, struct Qdisc *sch,
return qdisc_drop(skb, sch, to_free);
}

+ /* Devices with full offload are expected to honor this in hardware */
+ tc = netdev_get_prio_tc_map(dev, prio);
+ if (q->max_sdu[tc] &&
+ q->max_sdu[tc] < max_t(int, 0, skb->len - skb_mac_header_len(skb)))
+ return qdisc_drop(skb, sch, to_free);
+
qdisc_qstats_backlog_inc(sch, skb);
sch->q.qlen++;

@@ -761,6 +771,11 @@ static const struct nla_policy entry_policy[TCA_TAPRIO_SCHED_ENTRY_MAX + 1] = {
[TCA_TAPRIO_SCHED_ENTRY_INTERVAL] = { .type = NLA_U32 },
};

+static const struct nla_policy taprio_tc_policy[TCA_TAPRIO_TC_ENTRY_MAX + 1] = {
+ [TCA_TAPRIO_TC_ENTRY_INDEX] = { .type = NLA_U32 },
+ [TCA_TAPRIO_TC_ENTRY_MAX_SDU] = { .type = NLA_U32 },
+};
+
static const struct nla_policy taprio_policy[TCA_TAPRIO_ATTR_MAX + 1] = {
[TCA_TAPRIO_ATTR_PRIOMAP] = {
.len = sizeof(struct tc_mqprio_qopt)
@@ -773,6 +788,7 @@ static const struct nla_policy taprio_policy[TCA_TAPRIO_ATTR_MAX + 1] = {
[TCA_TAPRIO_ATTR_SCHED_CYCLE_TIME_EXTENSION] = { .type = NLA_S64 },
[TCA_TAPRIO_ATTR_FLAGS] = { .type = NLA_U32 },
[TCA_TAPRIO_ATTR_TXTIME_DELAY] = { .type = NLA_U32 },
+ [TCA_TAPRIO_ATTR_TC_ENTRY] = { .type = NLA_NESTED },
};

static int fill_sched_entry(struct taprio_sched *q, struct nlattr **tb,
@@ -1236,7 +1252,7 @@ static int taprio_enable_offload(struct net_device *dev,
{
const struct net_device_ops *ops = dev->netdev_ops;
struct tc_taprio_qopt_offload *offload;
- int err = 0;
+ int tc, err = 0;

if (!ops->ndo_setup_tc) {
NL_SET_ERR_MSG(extack,
@@ -1253,6 +1269,9 @@ static int taprio_enable_offload(struct net_device *dev,
offload->enable = 1;
taprio_sched_to_offload(dev, sched, offload);

+ for (tc = 0; tc < TC_MAX_QUEUE; tc++)
+ offload->max_sdu[tc] = q->max_sdu[tc];
+
err = ops->ndo_setup_tc(dev, TC_SETUP_QDISC_TAPRIO, offload);
if (err < 0) {
NL_SET_ERR_MSG(extack,
@@ -1387,6 +1406,73 @@ static int taprio_parse_clockid(struct Qdisc *sch, struct nlattr **tb,
return err;
}

+static int taprio_parse_tc_entry(struct Qdisc *sch,
+ struct nlattr *opt,
+ unsigned long *seen_tcs,
+ struct netlink_ext_ack *extack)
+{
+ struct nlattr *tb[TCA_TAPRIO_TC_ENTRY_MAX + 1] = { };
+ struct taprio_sched *q = qdisc_priv(sch);
+ struct net_device *dev = qdisc_dev(sch);
+ u32 max_sdu = 0;
+ int err, tc;
+
+ err = nla_parse_nested(tb, TCA_TAPRIO_TC_ENTRY_MAX, opt,
+ taprio_tc_policy, extack);
+ if (err < 0)
+ return err;
+
+ if (!tb[TCA_TAPRIO_TC_ENTRY_INDEX]) {
+ NL_SET_ERR_MSG_MOD(extack, "TC entry index missing");
+ return -EINVAL;
+ }
+
+ tc = nla_get_u32(tb[TCA_TAPRIO_TC_ENTRY_INDEX]);
+ if (tc >= TC_QOPT_MAX_QUEUE) {
+ NL_SET_ERR_MSG_MOD(extack, "TC entry index out of range");
+ return -ERANGE;
+ }
+
+ if (*seen_tcs & BIT(tc)) {
+ NL_SET_ERR_MSG_MOD(extack, "Duplicate TC entry");
+ return -EINVAL;
+ }
+
+ *seen_tcs |= BIT(tc);
+
+ if (tb[TCA_TAPRIO_TC_ENTRY_MAX_SDU])
+ max_sdu = nla_get_u32(tb[TCA_TAPRIO_TC_ENTRY_MAX_SDU]);
+
+ if (max_sdu > dev->max_mtu) {
+ NL_SET_ERR_MSG_MOD(extack, "TC max SDU exceeds device max MTU");
+ return -ERANGE;
+ }
+
+ q->max_sdu[tc] = max_sdu;
+
+ return 0;
+}
+
+static int taprio_parse_tc_entries(struct Qdisc *sch,
+ struct nlattr *opt,
+ struct netlink_ext_ack *extack)
+{
+ unsigned long seen_tcs = 0;
+ struct nlattr *n;
+ int err = 0, rem;
+
+ nla_for_each_nested(n, opt, rem) {
+ if (nla_type(n) != TCA_TAPRIO_ATTR_TC_ENTRY)
+ continue;
+
+ err = taprio_parse_tc_entry(sch, n, &seen_tcs, extack);
+ if (err)
+ break;
+ }
+
+ return err;
+}
+
static int taprio_mqprio_cmp(const struct net_device *dev,
const struct tc_mqprio_qopt *mqprio)
{
@@ -1465,6 +1551,10 @@ static int taprio_change(struct Qdisc *sch, struct nlattr *opt,
if (err < 0)
return err;

+ err = taprio_parse_tc_entries(sch, opt, extack);
+ if (err)
+ return err;
+
new_admin = kzalloc(sizeof(*new_admin), GFP_KERNEL);
if (!new_admin) {
NL_SET_ERR_MSG(extack, "Not enough memory for a new schedule");
@@ -1855,6 +1945,33 @@ static int dump_schedule(struct sk_buff *msg,
return -1;
}

+static int taprio_dump_tc_entries(struct taprio_sched *q, struct sk_buff *skb)
+{
+ struct nlattr *n;
+ int tc;
+
+ for (tc = 0; tc < TC_MAX_QUEUE; tc++) {
+ n = nla_nest_start(skb, TCA_TAPRIO_ATTR_TC_ENTRY);
+ if (!n)
+ return -EMSGSIZE;
+
+ if (nla_put_u32(skb, TCA_TAPRIO_TC_ENTRY_INDEX, tc))
+ goto nla_put_failure;
+
+ if (nla_put_u32(skb, TCA_TAPRIO_TC_ENTRY_MAX_SDU,
+ q->max_sdu[tc]))
+ goto nla_put_failure;
+
+ nla_nest_end(skb, n);
+ }
+
+ return 0;
+
+nla_put_failure:
+ nla_nest_cancel(skb, n);
+ return -EMSGSIZE;
+}
+
static int taprio_dump(struct Qdisc *sch, struct sk_buff *skb)
{
struct taprio_sched *q = qdisc_priv(sch);
@@ -1894,6 +2011,9 @@ static int taprio_dump(struct Qdisc *sch, struct sk_buff *skb)
nla_put_u32(skb, TCA_TAPRIO_ATTR_TXTIME_DELAY, q->txtime_delay))
goto options_error;

+ if (taprio_dump_tc_entries(q, skb))
+ goto options_error;
+
if (oper && dump_schedule(skb, oper))
goto options_error;

--
2.34.1

2022-09-14 15:38:03

by Vladimir Oltean

[permalink] [raw]
Subject: [PATCH net-next 07/13] net: enetc: offload per-tc max SDU from tc-taprio

The driver currently sets the PTCMSDUR register statically to the max
MTU supported by the interface. Keep this logic if tc-taprio is absent
or if the max_sdu for a traffic class is 0, and follow the requested max
SDU size otherwise.

Signed-off-by: Vladimir Oltean <[email protected]>
---
drivers/net/ethernet/freescale/enetc/enetc.h | 3 +++
.../net/ethernet/freescale/enetc/enetc_pf.c | 25 ++++++++++++++++---
.../net/ethernet/freescale/enetc/enetc_qos.c | 10 +++++---
3 files changed, 32 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h
index f4cf12c743fe..4d0bdfef51b7 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.h
+++ b/drivers/net/ethernet/freescale/enetc/enetc.h
@@ -455,6 +455,9 @@ static inline void enetc_cbd_free_data_mem(struct enetc_si *si, int size,
data, *dma);
}

+void enetc_reset_ptcmsdur(struct enetc_hw *hw);
+void enetc_set_ptcmsdur(struct enetc_hw *hw, u32 *queue_max_sdu);
+
#ifdef CONFIG_FSL_ENETC_QOS
int enetc_setup_tc_taprio(struct net_device *ndev, void *type_data);
void enetc_sched_speed_set(struct enetc_ndev_priv *priv, int speed);
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_pf.c b/drivers/net/ethernet/freescale/enetc/enetc_pf.c
index c4a0e836d4f0..03275846f416 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_pf.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc_pf.c
@@ -516,15 +516,34 @@ static void enetc_port_si_configure(struct enetc_si *si)
enetc_port_wr(hw, ENETC_PSIVLANFMR, ENETC_PSIVLANFMR_VS);
}

-static void enetc_configure_port_mac(struct enetc_hw *hw)
+void enetc_set_ptcmsdur(struct enetc_hw *hw, u32 *max_sdu)
{
int tc;

- enetc_port_wr(hw, ENETC_PM0_MAXFRM,
- ENETC_SET_MAXFRM(ENETC_RX_MAXFRM_SIZE));
+ for (tc = 0; tc < 8; tc++) {
+ u32 val = ENETC_MAC_MAXFRM_SIZE;
+
+ if (max_sdu[tc])
+ val = max_sdu[tc] + VLAN_ETH_HLEN;
+
+ enetc_port_wr(hw, ENETC_PTCMSDUR(tc), val);
+ }
+}
+
+void enetc_reset_ptcmsdur(struct enetc_hw *hw)
+{
+ int tc;

for (tc = 0; tc < 8; tc++)
enetc_port_wr(hw, ENETC_PTCMSDUR(tc), ENETC_MAC_MAXFRM_SIZE);
+}
+
+static void enetc_configure_port_mac(struct enetc_hw *hw)
+{
+ enetc_port_wr(hw, ENETC_PM0_MAXFRM,
+ ENETC_SET_MAXFRM(ENETC_RX_MAXFRM_SIZE));
+
+ enetc_reset_ptcmsdur(hw);

enetc_port_wr(hw, ENETC_PM0_CMD_CFG, ENETC_PM0_CMD_PHY_TX_EN |
ENETC_PM0_CMD_TXP | ENETC_PM0_PROMISC);
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_qos.c b/drivers/net/ethernet/freescale/enetc/enetc_qos.c
index 29be8a1ecee1..a412f27bc4a4 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_qos.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc_qos.c
@@ -68,6 +68,7 @@ static int enetc_setup_taprio(struct net_device *ndev,
tge = enetc_rd(hw, ENETC_QBV_PTGCR_OFFSET);
if (!admin_conf->enable) {
enetc_wr(hw, ENETC_QBV_PTGCR_OFFSET, tge & ~ENETC_QBV_TGE);
+ enetc_reset_ptcmsdur(hw);

priv->active_offloads &= ~ENETC_F_QBV;

@@ -123,10 +124,13 @@ static int enetc_setup_taprio(struct net_device *ndev,

enetc_cbd_free_data_mem(priv->si, data_size, tmp, &dma);

- if (!err)
- priv->active_offloads |= ENETC_F_QBV;
+ if (err)
+ return err;

- return err;
+ enetc_set_ptcmsdur(hw, admin_conf->max_sdu);
+ priv->active_offloads |= ENETC_F_QBV;
+
+ return 0;
}

int enetc_setup_tc_taprio(struct net_device *ndev, void *type_data)
--
2.34.1

2022-09-14 15:38:09

by Vladimir Oltean

[permalink] [raw]
Subject: [PATCH net-next 08/13] net: dsa: hellcreek: deny tc-taprio changes to per-tc max SDU

Since the driver does not act upon the max_sdu argument (which it
should, in full offload mode), deny any other values except the default
all-zeroes, which means that all traffic classes should use the same MTU
as the port itself.

Signed-off-by: Vladimir Oltean <[email protected]>
---
drivers/net/dsa/hirschmann/hellcreek.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/drivers/net/dsa/hirschmann/hellcreek.c b/drivers/net/dsa/hirschmann/hellcreek.c
index ea8bbfce0f1f..8ef7816627b6 100644
--- a/drivers/net/dsa/hirschmann/hellcreek.c
+++ b/drivers/net/dsa/hirschmann/hellcreek.c
@@ -1814,10 +1814,15 @@ static int hellcreek_port_setup_tc(struct dsa_switch *ds, int port,
{
struct tc_taprio_qopt_offload *taprio = type_data;
struct hellcreek *hellcreek = ds->priv;
+ int tc;

if (type != TC_SETUP_QDISC_TAPRIO)
return -EOPNOTSUPP;

+ for (tc = 0; tc < TC_MAX_QUEUE; tc++)
+ if (taprio->max_sdu[tc])
+ return -EOPNOTSUPP;
+
if (!hellcreek_validate_schedule(hellcreek, taprio))
return -EOPNOTSUPP;

--
2.34.1

2022-09-14 15:40:05

by Vladimir Oltean

[permalink] [raw]
Subject: [PATCH net-next 12/13] net: stmmac: deny tc-taprio changes to per-tc max SDU

Since the driver does not act upon the max_sdu argument, deny any other
values except the default all-zeroes, which means that all traffic
classes should use the same MTU as the port itself.

Signed-off-by: Vladimir Oltean <[email protected]>
---
drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c
index 773e415cc2de..8e9c7d54fb4c 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c
@@ -922,13 +922,17 @@ static int tc_setup_taprio(struct stmmac_priv *priv,
struct plat_stmmacenet_data *plat = priv->plat;
struct timespec64 time, current_time, qopt_time;
ktime_t current_time_ns;
+ int i, tc, ret = 0;
bool fpe = false;
- int i, ret = 0;
u64 ctr;

if (!priv->dma_cap.estsel)
return -EOPNOTSUPP;

+ for (tc = 0; tc < TC_MAX_QUEUE; tc++)
+ if (qopt->max_sdu[tc])
+ return -EOPNOTSUPP;
+
switch (wid) {
case 0x1:
wid = 16;
--
2.34.1

2022-09-14 15:40:06

by Vladimir Oltean

[permalink] [raw]
Subject: [PATCH net-next 13/13] net: am65-cpsw: deny tc-taprio changes to per-tc max SDU

Since the driver does not act upon the max_sdu argument, deny any other
values except the default all-zeroes, which means that all traffic
classes should use the same MTU as the port itself.

Signed-off-by: Vladimir Oltean <[email protected]>
---
drivers/net/ethernet/ti/am65-cpsw-qos.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/ti/am65-cpsw-qos.c b/drivers/net/ethernet/ti/am65-cpsw-qos.c
index e162771893af..bd3b425083aa 100644
--- a/drivers/net/ethernet/ti/am65-cpsw-qos.c
+++ b/drivers/net/ethernet/ti/am65-cpsw-qos.c
@@ -503,7 +503,11 @@ static int am65_cpsw_set_taprio(struct net_device *ndev, void *type_data)
struct am65_cpsw_port *port = am65_ndev_to_port(ndev);
struct tc_taprio_qopt_offload *taprio = type_data;
struct am65_cpsw_est *est_new;
- int ret = 0;
+ int tc, ret = 0;
+
+ for (tc = 0; tc < TC_MAX_QUEUE; tc++)
+ if (taprio->max_sdu[tc])
+ return -EOPNOTSUPP;

if (taprio->cycle_time_extension) {
dev_err(&ndev->dev, "Failed to set cycle time extension");
--
2.34.1

2022-09-14 16:06:24

by Vladimir Oltean

[permalink] [raw]
Subject: [PATCH net-next 06/13] net: enetc: cache accesses to &priv->si->hw

The &priv->si->hw construct dereferences 2 pointers and makes lines
longer than they need to be, in turn making the code harder to read.

Replace &priv->si->hw accesses with a "hw" variable when there are 2 or
more accesses within a function that dereference this. This includes
loops, since &priv->si->hw is a loop invariant.

Signed-off-by: Vladimir Oltean <[email protected]>
---
drivers/net/ethernet/freescale/enetc/enetc.c | 28 +++++----
drivers/net/ethernet/freescale/enetc/enetc.h | 9 +--
.../net/ethernet/freescale/enetc/enetc_qos.c | 60 +++++++++----------
3 files changed, 49 insertions(+), 48 deletions(-)

diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
index 4470a4a3e4c3..850312f00684 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -2116,13 +2116,14 @@ static void enetc_setup_rxbdr(struct enetc_hw *hw, struct enetc_bdr *rx_ring)

static void enetc_setup_bdrs(struct enetc_ndev_priv *priv)
{
+ struct enetc_hw *hw = &priv->si->hw;
int i;

for (i = 0; i < priv->num_tx_rings; i++)
- enetc_setup_txbdr(&priv->si->hw, priv->tx_ring[i]);
+ enetc_setup_txbdr(hw, priv->tx_ring[i]);

for (i = 0; i < priv->num_rx_rings; i++)
- enetc_setup_rxbdr(&priv->si->hw, priv->rx_ring[i]);
+ enetc_setup_rxbdr(hw, priv->rx_ring[i]);
}

static void enetc_clear_rxbdr(struct enetc_hw *hw, struct enetc_bdr *rx_ring)
@@ -2155,13 +2156,14 @@ static void enetc_clear_txbdr(struct enetc_hw *hw, struct enetc_bdr *tx_ring)

static void enetc_clear_bdrs(struct enetc_ndev_priv *priv)
{
+ struct enetc_hw *hw = &priv->si->hw;
int i;

for (i = 0; i < priv->num_tx_rings; i++)
- enetc_clear_txbdr(&priv->si->hw, priv->tx_ring[i]);
+ enetc_clear_txbdr(hw, priv->tx_ring[i]);

for (i = 0; i < priv->num_rx_rings; i++)
- enetc_clear_rxbdr(&priv->si->hw, priv->rx_ring[i]);
+ enetc_clear_rxbdr(hw, priv->rx_ring[i]);

udelay(1);
}
@@ -2169,13 +2171,13 @@ static void enetc_clear_bdrs(struct enetc_ndev_priv *priv)
static int enetc_setup_irqs(struct enetc_ndev_priv *priv)
{
struct pci_dev *pdev = priv->si->pdev;
+ struct enetc_hw *hw = &priv->si->hw;
int i, j, err;

for (i = 0; i < priv->bdr_int_num; i++) {
int irq = pci_irq_vector(pdev, ENETC_BDR_INT_BASE_IDX + i);
struct enetc_int_vector *v = priv->int_vector[i];
int entry = ENETC_BDR_INT_BASE_IDX + i;
- struct enetc_hw *hw = &priv->si->hw;

snprintf(v->name, sizeof(v->name), "%s-rxtx%d",
priv->ndev->name, i);
@@ -2263,13 +2265,14 @@ static void enetc_setup_interrupts(struct enetc_ndev_priv *priv)

static void enetc_clear_interrupts(struct enetc_ndev_priv *priv)
{
+ struct enetc_hw *hw = &priv->si->hw;
int i;

for (i = 0; i < priv->num_tx_rings; i++)
- enetc_txbdr_wr(&priv->si->hw, i, ENETC_TBIER, 0);
+ enetc_txbdr_wr(hw, i, ENETC_TBIER, 0);

for (i = 0; i < priv->num_rx_rings; i++)
- enetc_rxbdr_wr(&priv->si->hw, i, ENETC_RBIER, 0);
+ enetc_rxbdr_wr(hw, i, ENETC_RBIER, 0);
}

static int enetc_phylink_connect(struct net_device *ndev)
@@ -2436,6 +2439,7 @@ static int enetc_setup_tc_mqprio(struct net_device *ndev, void *type_data)
{
struct enetc_ndev_priv *priv = netdev_priv(ndev);
struct tc_mqprio_qopt *mqprio = type_data;
+ struct enetc_hw *hw = &priv->si->hw;
struct enetc_bdr *tx_ring;
int num_stack_tx_queues;
u8 num_tc;
@@ -2452,7 +2456,7 @@ static int enetc_setup_tc_mqprio(struct net_device *ndev, void *type_data)
/* Reset all ring priorities to 0 */
for (i = 0; i < priv->num_tx_rings; i++) {
tx_ring = priv->tx_ring[i];
- enetc_set_bdr_prio(&priv->si->hw, tx_ring->index, 0);
+ enetc_set_bdr_prio(hw, tx_ring->index, 0);
}

return 0;
@@ -2471,7 +2475,7 @@ static int enetc_setup_tc_mqprio(struct net_device *ndev, void *type_data)
*/
for (i = 0; i < num_tc; i++) {
tx_ring = priv->tx_ring[i];
- enetc_set_bdr_prio(&priv->si->hw, tx_ring->index, i);
+ enetc_set_bdr_prio(hw, tx_ring->index, i);
}

/* Reset the number of netdev queues based on the TC count */
@@ -2626,19 +2630,21 @@ static int enetc_set_psfp(struct net_device *ndev, int en)
static void enetc_enable_rxvlan(struct net_device *ndev, bool en)
{
struct enetc_ndev_priv *priv = netdev_priv(ndev);
+ struct enetc_hw *hw = &priv->si->hw;
int i;

for (i = 0; i < priv->num_rx_rings; i++)
- enetc_bdr_enable_rxvlan(&priv->si->hw, i, en);
+ enetc_bdr_enable_rxvlan(hw, i, en);
}

static void enetc_enable_txvlan(struct net_device *ndev, bool en)
{
struct enetc_ndev_priv *priv = netdev_priv(ndev);
+ struct enetc_hw *hw = &priv->si->hw;
int i;

for (i = 0; i < priv->num_tx_rings; i++)
- enetc_bdr_enable_txvlan(&priv->si->hw, i, en);
+ enetc_bdr_enable_txvlan(hw, i, en);
}

int enetc_set_features(struct net_device *ndev,
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h
index 29922c20531f..f4cf12c743fe 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.h
+++ b/drivers/net/ethernet/freescale/enetc/enetc.h
@@ -468,19 +468,20 @@ int enetc_psfp_clean(struct enetc_ndev_priv *priv);

static inline void enetc_get_max_cap(struct enetc_ndev_priv *priv)
{
+ struct enetc_hw *hw = &priv->si->hw;
u32 reg;

- reg = enetc_port_rd(&priv->si->hw, ENETC_PSIDCAPR);
+ reg = enetc_port_rd(hw, ENETC_PSIDCAPR);
priv->psfp_cap.max_streamid = reg & ENETC_PSIDCAPR_MSK;
/* Port stream filter capability */
- reg = enetc_port_rd(&priv->si->hw, ENETC_PSFCAPR);
+ reg = enetc_port_rd(hw, ENETC_PSFCAPR);
priv->psfp_cap.max_psfp_filter = reg & ENETC_PSFCAPR_MSK;
/* Port stream gate capability */
- reg = enetc_port_rd(&priv->si->hw, ENETC_PSGCAPR);
+ reg = enetc_port_rd(hw, ENETC_PSGCAPR);
priv->psfp_cap.max_psfp_gate = (reg & ENETC_PSGCAPR_SGIT_MSK);
priv->psfp_cap.max_psfp_gatelist = (reg & ENETC_PSGCAPR_GCL_MSK) >> 16;
/* Port flow meter capability */
- reg = enetc_port_rd(&priv->si->hw, ENETC_PFMCAPR);
+ reg = enetc_port_rd(hw, ENETC_PFMCAPR);
priv->psfp_cap.max_psfp_meter = reg & ENETC_PFMCAPR_MSK;
}

diff --git a/drivers/net/ethernet/freescale/enetc/enetc_qos.c b/drivers/net/ethernet/freescale/enetc/enetc_qos.c
index 582a663ed0ba..29be8a1ecee1 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_qos.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc_qos.c
@@ -17,8 +17,9 @@ static u16 enetc_get_max_gcl_len(struct enetc_hw *hw)

void enetc_sched_speed_set(struct enetc_ndev_priv *priv, int speed)
{
+ struct enetc_hw *hw = &priv->si->hw;
u32 old_speed = priv->speed;
- u32 pspeed;
+ u32 pspeed, tmp;

if (speed == old_speed)
return;
@@ -39,16 +40,15 @@ void enetc_sched_speed_set(struct enetc_ndev_priv *priv, int speed)
}

priv->speed = speed;
- enetc_port_wr(&priv->si->hw, ENETC_PMR,
- (enetc_port_rd(&priv->si->hw, ENETC_PMR)
- & (~ENETC_PMR_PSPEED_MASK))
- | pspeed);
+ tmp = enetc_port_rd(hw, ENETC_PMR);
+ enetc_port_wr(hw, ENETC_PMR, (tmp & ~ENETC_PMR_PSPEED_MASK) | pspeed);
}

static int enetc_setup_taprio(struct net_device *ndev,
struct tc_taprio_qopt_offload *admin_conf)
{
struct enetc_ndev_priv *priv = netdev_priv(ndev);
+ struct enetc_hw *hw = &priv->si->hw;
struct enetc_cbd cbd = {.cmd = 0};
struct tgs_gcl_conf *gcl_config;
struct tgs_gcl_data *gcl_data;
@@ -61,15 +61,13 @@ static int enetc_setup_taprio(struct net_device *ndev,
int err;
int i;

- if (admin_conf->num_entries > enetc_get_max_gcl_len(&priv->si->hw))
+ if (admin_conf->num_entries > enetc_get_max_gcl_len(hw))
return -EINVAL;
gcl_len = admin_conf->num_entries;

- tge = enetc_rd(&priv->si->hw, ENETC_QBV_PTGCR_OFFSET);
+ tge = enetc_rd(hw, ENETC_QBV_PTGCR_OFFSET);
if (!admin_conf->enable) {
- enetc_wr(&priv->si->hw,
- ENETC_QBV_PTGCR_OFFSET,
- tge & (~ENETC_QBV_TGE));
+ enetc_wr(hw, ENETC_QBV_PTGCR_OFFSET, tge & ~ENETC_QBV_TGE);

priv->active_offloads &= ~ENETC_F_QBV;

@@ -117,14 +115,11 @@ static int enetc_setup_taprio(struct net_device *ndev,
cbd.cls = BDCR_CMD_PORT_GCL;
cbd.status_flags = 0;

- enetc_wr(&priv->si->hw, ENETC_QBV_PTGCR_OFFSET,
- tge | ENETC_QBV_TGE);
+ enetc_wr(hw, ENETC_QBV_PTGCR_OFFSET, tge | ENETC_QBV_TGE);

err = enetc_send_cmd(priv->si, &cbd);
if (err)
- enetc_wr(&priv->si->hw,
- ENETC_QBV_PTGCR_OFFSET,
- tge & (~ENETC_QBV_TGE));
+ enetc_wr(hw, ENETC_QBV_PTGCR_OFFSET, tge & ~ENETC_QBV_TGE);

enetc_cbd_free_data_mem(priv->si, data_size, tmp, &dma);

@@ -138,6 +133,7 @@ int enetc_setup_tc_taprio(struct net_device *ndev, void *type_data)
{
struct tc_taprio_qopt_offload *taprio = type_data;
struct enetc_ndev_priv *priv = netdev_priv(ndev);
+ struct enetc_hw *hw = &priv->si->hw;
int err;
int i;

@@ -147,16 +143,14 @@ int enetc_setup_tc_taprio(struct net_device *ndev, void *type_data)
return -EBUSY;

for (i = 0; i < priv->num_tx_rings; i++)
- enetc_set_bdr_prio(&priv->si->hw,
- priv->tx_ring[i]->index,
+ enetc_set_bdr_prio(hw, priv->tx_ring[i]->index,
taprio->enable ? i : 0);

err = enetc_setup_taprio(ndev, taprio);

if (err)
for (i = 0; i < priv->num_tx_rings; i++)
- enetc_set_bdr_prio(&priv->si->hw,
- priv->tx_ring[i]->index,
+ enetc_set_bdr_prio(hw, priv->tx_ring[i]->index,
taprio->enable ? 0 : i);

return err;
@@ -178,7 +172,7 @@ int enetc_setup_tc_cbs(struct net_device *ndev, void *type_data)
struct tc_cbs_qopt_offload *cbs = type_data;
u32 port_transmit_rate = priv->speed;
u8 tc_nums = netdev_get_num_tc(ndev);
- struct enetc_si *si = priv->si;
+ struct enetc_hw *hw = &priv->si->hw;
u32 hi_credit_bit, hi_credit_reg;
u32 max_interference_size;
u32 port_frame_max_size;
@@ -199,15 +193,15 @@ int enetc_setup_tc_cbs(struct net_device *ndev, void *type_data)
* lower than this TC have been disabled.
*/
if (tc == prio_top &&
- enetc_get_cbs_enable(&si->hw, prio_next)) {
+ enetc_get_cbs_enable(hw, prio_next)) {
dev_err(&ndev->dev,
"Disable TC%d before disable TC%d\n",
prio_next, tc);
return -EINVAL;
}

- enetc_port_wr(&si->hw, ENETC_PTCCBSR1(tc), 0);
- enetc_port_wr(&si->hw, ENETC_PTCCBSR0(tc), 0);
+ enetc_port_wr(hw, ENETC_PTCCBSR1(tc), 0);
+ enetc_port_wr(hw, ENETC_PTCCBSR0(tc), 0);

return 0;
}
@@ -224,13 +218,13 @@ int enetc_setup_tc_cbs(struct net_device *ndev, void *type_data)
* higher than this TC have been enabled.
*/
if (tc == prio_next) {
- if (!enetc_get_cbs_enable(&si->hw, prio_top)) {
+ if (!enetc_get_cbs_enable(hw, prio_top)) {
dev_err(&ndev->dev,
"Enable TC%d first before enable TC%d\n",
prio_top, prio_next);
return -EINVAL;
}
- bw_sum += enetc_get_cbs_bw(&si->hw, prio_top);
+ bw_sum += enetc_get_cbs_bw(hw, prio_top);
}

if (bw_sum + bw >= 100) {
@@ -239,7 +233,7 @@ int enetc_setup_tc_cbs(struct net_device *ndev, void *type_data)
return -EINVAL;
}

- enetc_port_rd(&si->hw, ENETC_PTCMSDUR(tc));
+ enetc_port_rd(hw, ENETC_PTCMSDUR(tc));

/* For top prio TC, the max_interfrence_size is maxSizedFrame.
*
@@ -259,8 +253,8 @@ int enetc_setup_tc_cbs(struct net_device *ndev, void *type_data)
u32 m0, ma, r0, ra;

m0 = port_frame_max_size * 8;
- ma = enetc_port_rd(&si->hw, ENETC_PTCMSDUR(prio_top)) * 8;
- ra = enetc_get_cbs_bw(&si->hw, prio_top) *
+ ma = enetc_port_rd(hw, ENETC_PTCMSDUR(prio_top)) * 8;
+ ra = enetc_get_cbs_bw(hw, prio_top) *
port_transmit_rate * 10000ULL;
r0 = port_transmit_rate * 1000000ULL;
max_interference_size = m0 + ma +
@@ -280,10 +274,10 @@ int enetc_setup_tc_cbs(struct net_device *ndev, void *type_data)
hi_credit_reg = (u32)div_u64((ENETC_CLK * 100ULL) * hi_credit_bit,
port_transmit_rate * 1000000ULL);

- enetc_port_wr(&si->hw, ENETC_PTCCBSR1(tc), hi_credit_reg);
+ enetc_port_wr(hw, ENETC_PTCCBSR1(tc), hi_credit_reg);

/* Set bw register and enable this traffic class */
- enetc_port_wr(&si->hw, ENETC_PTCCBSR0(tc), bw | ENETC_CBSE);
+ enetc_port_wr(hw, ENETC_PTCCBSR0(tc), bw | ENETC_CBSE);

return 0;
}
@@ -293,6 +287,7 @@ int enetc_setup_tc_txtime(struct net_device *ndev, void *type_data)
struct enetc_ndev_priv *priv = netdev_priv(ndev);
struct tc_etf_qopt_offload *qopt = type_data;
u8 tc_nums = netdev_get_num_tc(ndev);
+ struct enetc_hw *hw = &priv->si->hw;
int tc;

if (!tc_nums)
@@ -304,12 +299,11 @@ int enetc_setup_tc_txtime(struct net_device *ndev, void *type_data)
return -EINVAL;

/* TSD and Qbv are mutually exclusive in hardware */
- if (enetc_rd(&priv->si->hw, ENETC_QBV_PTGCR_OFFSET) & ENETC_QBV_TGE)
+ if (enetc_rd(hw, ENETC_QBV_PTGCR_OFFSET) & ENETC_QBV_TGE)
return -EBUSY;

priv->tx_ring[tc]->tsd_enable = qopt->enable;
- enetc_port_wr(&priv->si->hw, ENETC_PTCTSDR(tc),
- qopt->enable ? ENETC_TSDE : 0);
+ enetc_port_wr(hw, ENETC_PTCTSDR(tc), qopt->enable ? ENETC_TSDE : 0);

return 0;
}
--
2.34.1

2022-09-14 16:08:37

by Vladimir Oltean

[permalink] [raw]
Subject: [PATCH net-next 02/13] net/sched: taprio: stop going through private ops for dequeue and peek

Since commit 13511704f8d7 ("net: taprio offload: enforce qdisc to netdev
queue mapping"), taprio_dequeue_soft() and taprio_peek_soft() are de
facto the only implementations for Qdisc_ops :: dequeue and Qdisc_ops ::
peek that taprio provides.

This is because in full offload mode, __dev_queue_xmit() will select a
txq->qdisc which is never root taprio qdisc. So if nothing is enqueued
in the root qdisc, it will never be run and nothing will get dequeued
from it.

Therefore, we can remove the private indirection from taprio, and always
point Qdisc_ops :: dequeue to taprio_dequeue_soft (now simply named
taprio_dequeue) and Qdisc_ops :: peek to taprio_peek_soft (now simply
named taprio_peek).

Signed-off-by: Vladimir Oltean <[email protected]>
---
net/sched/sch_taprio.c | 58 +++++++++---------------------------------
1 file changed, 12 insertions(+), 46 deletions(-)

diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
index a172c1eba995..226aa6efb365 100644
--- a/net/sched/sch_taprio.c
+++ b/net/sched/sch_taprio.c
@@ -79,8 +79,6 @@ struct taprio_sched {
struct sched_gate_list __rcu *admin_sched;
struct hrtimer advance_timer;
struct list_head taprio_list;
- struct sk_buff *(*dequeue)(struct Qdisc *sch);
- struct sk_buff *(*peek)(struct Qdisc *sch);
u32 txtime_delay;
};

@@ -492,7 +490,7 @@ static int taprio_enqueue(struct sk_buff *skb, struct Qdisc *sch,
return taprio_enqueue_one(skb, sch, child, to_free);
}

-static struct sk_buff *taprio_peek_soft(struct Qdisc *sch)
+static struct sk_buff *taprio_peek(struct Qdisc *sch)
{
struct taprio_sched *q = qdisc_priv(sch);
struct net_device *dev = qdisc_dev(sch);
@@ -501,6 +499,11 @@ static struct sk_buff *taprio_peek_soft(struct Qdisc *sch)
u32 gate_mask;
int i;

+ if (unlikely(FULL_OFFLOAD_IS_ENABLED(q->flags))) {
+ WARN_ONCE(1, "Trying to peek into the root of a taprio qdisc configured with full offload\n");
+ return NULL;
+ }
+
rcu_read_lock();
entry = rcu_dereference(q->current_entry);
gate_mask = entry ? entry->gate_mask : TAPRIO_ALL_GATES_OPEN;
@@ -536,20 +539,6 @@ static struct sk_buff *taprio_peek_soft(struct Qdisc *sch)
return NULL;
}

-static struct sk_buff *taprio_peek_offload(struct Qdisc *sch)
-{
- WARN_ONCE(1, "Trying to peek into the root of a taprio qdisc configured with full offload\n");
-
- return NULL;
-}
-
-static struct sk_buff *taprio_peek(struct Qdisc *sch)
-{
- struct taprio_sched *q = qdisc_priv(sch);
-
- return q->peek(sch);
-}
-
static void taprio_set_budget(struct taprio_sched *q, struct sched_entry *entry)
{
atomic_set(&entry->budget,
@@ -557,7 +546,7 @@ static void taprio_set_budget(struct taprio_sched *q, struct sched_entry *entry)
atomic64_read(&q->picos_per_byte)));
}

-static struct sk_buff *taprio_dequeue_soft(struct Qdisc *sch)
+static struct sk_buff *taprio_dequeue(struct Qdisc *sch)
{
struct taprio_sched *q = qdisc_priv(sch);
struct net_device *dev = qdisc_dev(sch);
@@ -566,6 +555,11 @@ static struct sk_buff *taprio_dequeue_soft(struct Qdisc *sch)
u32 gate_mask;
int i;

+ if (unlikely(FULL_OFFLOAD_IS_ENABLED(q->flags))) {
+ WARN_ONCE(1, "Trying to dequeue from the root of a taprio qdisc configured with full offload\n");
+ return NULL;
+ }
+
rcu_read_lock();
entry = rcu_dereference(q->current_entry);
/* if there's no entry, it means that the schedule didn't
@@ -645,20 +639,6 @@ static struct sk_buff *taprio_dequeue_soft(struct Qdisc *sch)
return skb;
}

-static struct sk_buff *taprio_dequeue_offload(struct Qdisc *sch)
-{
- WARN_ONCE(1, "Trying to dequeue from the root of a taprio qdisc configured with full offload\n");
-
- return NULL;
-}
-
-static struct sk_buff *taprio_dequeue(struct Qdisc *sch)
-{
- struct taprio_sched *q = qdisc_priv(sch);
-
- return q->dequeue(sch);
-}
-
static bool should_restart_cycle(const struct sched_gate_list *oper,
const struct sched_entry *entry)
{
@@ -1565,17 +1545,6 @@ static int taprio_change(struct Qdisc *sch, struct nlattr *opt,
q->advance_timer.function = advance_sched;
}

- if (FULL_OFFLOAD_IS_ENABLED(q->flags)) {
- q->dequeue = taprio_dequeue_offload;
- q->peek = taprio_peek_offload;
- } else {
- /* Be sure to always keep the function pointers
- * in a consistent state.
- */
- q->dequeue = taprio_dequeue_soft;
- q->peek = taprio_peek_soft;
- }
-
err = taprio_get_start_time(sch, new_admin, &start);
if (err < 0) {
NL_SET_ERR_MSG(extack, "Internal error: failed get start time");
@@ -1694,9 +1663,6 @@ static int taprio_init(struct Qdisc *sch, struct nlattr *opt,
hrtimer_init(&q->advance_timer, CLOCK_TAI, HRTIMER_MODE_ABS);
q->advance_timer.function = advance_sched;

- q->dequeue = taprio_dequeue_soft;
- q->peek = taprio_peek_soft;
-
q->root = sch;

/* We only support static clockids. Use an invalid value as default
--
2.34.1

2022-09-14 16:09:19

by Vladimir Oltean

[permalink] [raw]
Subject: [PATCH net-next 09/13] net: dsa: sja1105: deny tc-taprio changes to per-tc max SDU

Since the driver does not act upon the max_sdu argument (which it
should, in full offload mode), deny any other values except the default
all-zeroes, which means that all traffic classes should use the same MTU
as the port itself.

Signed-off-by: Vladimir Oltean <[email protected]>
---
drivers/net/dsa/sja1105/sja1105_tas.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/dsa/sja1105/sja1105_tas.c b/drivers/net/dsa/sja1105/sja1105_tas.c
index e6153848a950..607f4714fb01 100644
--- a/drivers/net/dsa/sja1105/sja1105_tas.c
+++ b/drivers/net/dsa/sja1105/sja1105_tas.c
@@ -511,7 +511,7 @@ int sja1105_setup_tc_taprio(struct dsa_switch *ds, int port,
{
struct sja1105_private *priv = ds->priv;
struct sja1105_tas_data *tas_data = &priv->tas_data;
- int other_port, rc, i;
+ int other_port, rc, i, tc;

/* Can't change an already configured port (must delete qdisc first).
* Can't delete the qdisc from an unconfigured port.
@@ -519,6 +519,10 @@ int sja1105_setup_tc_taprio(struct dsa_switch *ds, int port,
if (!!tas_data->offload[port] == admin->enable)
return -EINVAL;

+ for (tc = 0; tc < TC_MAX_QUEUE; tc++)
+ if (admin->max_sdu[tc])
+ return -EOPNOTSUPP;
+
if (!admin->enable) {
taprio_offload_free(tas_data->offload[port]);
tas_data->offload[port] = NULL;
--
2.34.1

2022-09-14 16:09:57

by Vladimir Oltean

[permalink] [raw]
Subject: [PATCH net-next 11/13] igc: deny tc-taprio changes to per-tc max SDU

Since the driver does not act upon the max_sdu argument, deny any other
values except the default all-zeroes, which means that all traffic
classes should use the same MTU as the port itself.

Signed-off-by: Vladimir Oltean <[email protected]>
---
drivers/net/ethernet/intel/igc/igc_main.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index bf6c461e1a2a..47fae443066c 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -5965,11 +5965,15 @@ static int igc_tsn_enable_qbv_scheduling(struct igc_adapter *adapter,
struct tc_taprio_qopt_offload *qopt)
{
struct igc_hw *hw = &adapter->hw;
- int err;
+ int tc, err;

if (hw->mac.type != igc_i225)
return -EOPNOTSUPP;

+ for (tc = 0; tc < TC_MAX_QUEUE; tc++)
+ if (qopt->max_sdu[tc])
+ return -EOPNOTSUPP;
+
err = igc_save_qbv_schedule(adapter, qopt);
if (err)
return err;
--
2.34.1

2022-09-14 16:11:20

by Vladimir Oltean

[permalink] [raw]
Subject: [PATCH net-next 05/13] net: dsa: felix: offload per-tc max SDU from tc-taprio

Our current vsc9959_tas_guard_bands_update() algorithm has a limitation
imposed by the hardware design. To avoid packet overruns between one
gate interval and the next (which would add jitter for scheduled traffic
in the next gate), we configure the switch to use guard bands. These are
as large as the largest packet which is possible to be transmitted.

The problem is that at tc-taprio intervals of sizes comparable to a
guard band, there isn't an obvious place in which to split the interval
between the useful portion (for scheduling) and the guard band portion
(where scheduling is blocked).

For example, a 10 us interval at 1Gbps allows 1225 octets to be
transmitted. We currently split the interval between the bare minimum of
33 ns useful time (required to schedule a single packet) and the rest as
guard band.

But 33 ns of useful scheduling time will only allow a single packet to
be sent, be that packet 1200 octets in size, or 60 octets in size. It is
impossible to send 2 60 octets frames in the 10 us window. Except that
if we reduced the guard band (and therefore the maximum allowable SDU
size) to 5 us, the useful time for scheduling is now also 5 us, so more
packets could be scheduled.

The hardware inflexibility of not scheduling according to individual
packet lengths must unfortunately propagate to the user, who needs to
tune the queueMaxSDU values if he wants to fit more small packets into a
10 us interval, rather than one large packet.

Signed-off-by: Vladimir Oltean <[email protected]>
---
drivers/net/dsa/ocelot/felix_vsc9959.c | 20 ++++++++++++++++++--
1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/net/dsa/ocelot/felix_vsc9959.c b/drivers/net/dsa/ocelot/felix_vsc9959.c
index 459288d6222c..3c9a15dc7f59 100644
--- a/drivers/net/dsa/ocelot/felix_vsc9959.c
+++ b/drivers/net/dsa/ocelot/felix_vsc9959.c
@@ -1248,6 +1248,14 @@ static u32 vsc9959_port_qmaxsdu_get(struct ocelot *ocelot, int port, int tc)
}
}

+static u32 vsc9959_tas_tc_max_sdu(struct tc_taprio_qopt_offload *taprio, int tc)
+{
+ if (!taprio || !taprio->max_sdu[tc])
+ return 0;
+
+ return taprio->max_sdu[tc] + ETH_HLEN + 2 * VLAN_HLEN + ETH_FCS_LEN;
+}
+
/* Update QSYS_PORT_MAX_SDU to make sure the static guard bands added by the
* switch (see the ALWAYS_GUARD_BAND_SCH_Q comment) are correct at all MTU
* values (the default value is 1518). Also, for traffic class windows smaller
@@ -1257,6 +1265,7 @@ static u32 vsc9959_port_qmaxsdu_get(struct ocelot *ocelot, int port, int tc)
static void vsc9959_tas_guard_bands_update(struct ocelot *ocelot, int port)
{
struct ocelot_port *ocelot_port = ocelot->ports[port];
+ struct tc_taprio_qopt_offload *taprio;
u64 min_gate_len[OCELOT_NUM_TC];
int speed, picos_per_byte;
u64 needed_bit_time_ps;
@@ -1266,6 +1275,8 @@ static void vsc9959_tas_guard_bands_update(struct ocelot *ocelot, int port)

lockdep_assert_held(&ocelot->tas_lock);

+ taprio = ocelot_port->taprio;
+
val = ocelot_read_rix(ocelot, QSYS_TAG_CONFIG, port);
tas_speed = QSYS_TAG_CONFIG_LINK_SPEED_X(val);

@@ -1302,11 +1313,12 @@ static void vsc9959_tas_guard_bands_update(struct ocelot *ocelot, int port)
"port %d: max frame size %d needs %llu ps at speed %d\n",
port, maxlen, needed_bit_time_ps, speed);

- vsc9959_tas_min_gate_lengths(ocelot_port->taprio, min_gate_len);
+ vsc9959_tas_min_gate_lengths(taprio, min_gate_len);

mutex_lock(&ocelot->fwd_domain_lock);

for (tc = 0; tc < OCELOT_NUM_TC; tc++) {
+ u32 requested_max_sdu = vsc9959_tas_tc_max_sdu(taprio, tc);
u64 remaining_gate_len_ps;
u32 max_sdu;

@@ -1317,7 +1329,7 @@ static void vsc9959_tas_guard_bands_update(struct ocelot *ocelot, int port)
/* Setting QMAXSDU_CFG to 0 disables oversized frame
* dropping.
*/
- max_sdu = 0;
+ max_sdu = requested_max_sdu;
dev_dbg(ocelot->dev,
"port %d tc %d min gate len %llu"
", sending all frames\n",
@@ -1348,6 +1360,10 @@ static void vsc9959_tas_guard_bands_update(struct ocelot *ocelot, int port)
*/
if (max_sdu > 20)
max_sdu -= 20;
+
+ if (requested_max_sdu && requested_max_sdu < max_sdu)
+ max_sdu = requested_max_sdu;
+
dev_info(ocelot->dev,
"port %d tc %d min gate length %llu"
" ns not enough for max frame size %d at %d"
--
2.34.1

2022-09-14 16:20:43

by Vladimir Oltean

[permalink] [raw]
Subject: [PATCH net-next 10/13] tsnep: deny tc-taprio changes to per-tc max SDU

Since the driver does not act upon the max_sdu argument, deny any other
values except the default all-zeroes, which means that all traffic
classes should use the same MTU as the port itself.

Signed-off-by: Vladimir Oltean <[email protected]>
---
drivers/net/ethernet/engleder/tsnep_tc.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/engleder/tsnep_tc.c b/drivers/net/ethernet/engleder/tsnep_tc.c
index c4c6e1357317..82df837ffc54 100644
--- a/drivers/net/ethernet/engleder/tsnep_tc.c
+++ b/drivers/net/ethernet/engleder/tsnep_tc.c
@@ -320,11 +320,15 @@ static int tsnep_taprio(struct tsnep_adapter *adapter,
{
struct tsnep_gcl *gcl;
struct tsnep_gcl *curr;
- int retval;
+ int retval, tc;

if (!adapter->gate_control)
return -EOPNOTSUPP;

+ for (tc = 0; tc < TC_MAX_QUEUE; tc++)
+ if (qopt->max_sdu[tc])
+ return -EOPNOTSUPP;
+
if (!qopt->enable) {
/* disable gate control if active */
mutex_lock(&adapter->gate_control_lock);
--
2.34.1

2022-09-14 18:37:51

by Kurt Kanzenbach

[permalink] [raw]
Subject: Re: [PATCH net-next 08/13] net: dsa: hellcreek: deny tc-taprio changes to per-tc max SDU

On Wed Sep 14 2022, Vladimir Oltean wrote:
> Since the driver does not act upon the max_sdu argument (which it
> should, in full offload mode), deny any other values except the default
> all-zeroes, which means that all traffic classes should use the same MTU
> as the port itself.
>
> Signed-off-by: Vladimir Oltean <[email protected]>
> ---
> drivers/net/dsa/hirschmann/hellcreek.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/drivers/net/dsa/hirschmann/hellcreek.c b/drivers/net/dsa/hirschmann/hellcreek.c
> index ea8bbfce0f1f..8ef7816627b6 100644
> --- a/drivers/net/dsa/hirschmann/hellcreek.c
> +++ b/drivers/net/dsa/hirschmann/hellcreek.c
> @@ -1814,10 +1814,15 @@ static int hellcreek_port_setup_tc(struct dsa_switch *ds, int port,
> {
> struct tc_taprio_qopt_offload *taprio = type_data;
> struct hellcreek *hellcreek = ds->priv;
> + int tc;
>
> if (type != TC_SETUP_QDISC_TAPRIO)
> return -EOPNOTSUPP;
>
> + for (tc = 0; tc < TC_MAX_QUEUE; tc++)
> + if (taprio->max_sdu[tc])
> + return -EOPNOTSUPP;

I'd rather like to see that feature implemented :-). Something like below.

From 3d44683979bf50960125fa3005b1142af61525c7 Mon Sep 17 00:00:00 2001
From: Kurt Kanzenbach <[email protected]>
Date: Wed, 14 Sep 2022 19:51:40 +0200
Subject: [PATCH] net: dsa: hellcreek: Offload per-tc max SDU from tc-taprio

Add support for configuring the max SDU per priority and per port. If not
specified, keep the default.

Signed-off-by: Kurt Kanzenbach <[email protected]>
---
drivers/net/dsa/hirschmann/hellcreek.c | 61 +++++++++++++++++++++++---
drivers/net/dsa/hirschmann/hellcreek.h | 7 +++
2 files changed, 61 insertions(+), 7 deletions(-)

diff --git a/drivers/net/dsa/hirschmann/hellcreek.c b/drivers/net/dsa/hirschmann/hellcreek.c
index 5ceee71d9a25..1b22710e1641 100644
--- a/drivers/net/dsa/hirschmann/hellcreek.c
+++ b/drivers/net/dsa/hirschmann/hellcreek.c
@@ -128,6 +128,16 @@ static void hellcreek_select_prio(struct hellcreek *hellcreek, int prio)
hellcreek_write(hellcreek, val, HR_PSEL);
}

+static void hellcreek_select_port_prio(struct hellcreek *hellcreek, int port,
+ int prio)
+{
+ u16 val = port << HR_PSEL_PTWSEL_SHIFT;
+
+ val |= prio << HR_PSEL_PRTCWSEL_SHIFT;
+
+ hellcreek_write(hellcreek, val, HR_PSEL);
+}
+
static void hellcreek_select_counter(struct hellcreek *hellcreek, int counter)
{
u16 val = counter << HR_CSEL_SHIFT;
@@ -1537,6 +1547,42 @@ hellcreek_port_prechangeupper(struct dsa_switch *ds, int port,
return ret;
}

+static void hellcreek_setup_maxsdu(struct hellcreek *hellcreek, int port,
+ const struct tc_taprio_qopt_offload *schedule)
+{
+ int tc;
+
+ for (tc = 0; tc < 8; ++tc) {
+ u16 val;
+
+ if (!schedule->max_sdu[tc])
+ continue;
+
+ hellcreek_select_port_prio(hellcreek, port, tc);
+
+ val = (schedule->max_sdu[tc] & HR_PTPRTCCFG_MAXSDU_MASK)
+ << HR_PTPRTCCFG_MAXSDU_SHIFT;
+
+ hellcreek_write(hellcreek, val, HR_PTPRTCCFG);
+ }
+}
+
+static void hellcreek_reset_maxsdu(struct hellcreek *hellcreek, int port)
+{
+ int tc;
+
+ for (tc = 0; tc < 8; ++tc) {
+ u16 val;
+
+ hellcreek_select_port_prio(hellcreek, port, tc);
+
+ val = (HELLCREEK_DEFAULT_MAX_SDU & HR_PTPRTCCFG_MAXSDU_MASK)
+ << HR_PTPRTCCFG_MAXSDU_SHIFT;
+
+ hellcreek_write(hellcreek, val, HR_PTPRTCCFG);
+ }
+}
+
static void hellcreek_setup_gcl(struct hellcreek *hellcreek, int port,
const struct tc_taprio_qopt_offload *schedule)
{
@@ -1720,7 +1766,10 @@ static int hellcreek_port_set_schedule(struct dsa_switch *ds, int port,
}
hellcreek_port->current_schedule = taprio_offload_get(taprio);

- /* Then select port */
+ /* Configure max sdu */
+ hellcreek_setup_maxsdu(hellcreek, port, hellcreek_port->current_schedule);
+
+ /* Select tdg */
hellcreek_select_tgd(hellcreek, port);

/* Enable gating and keep defaults */
@@ -1772,7 +1821,10 @@ static int hellcreek_port_del_schedule(struct dsa_switch *ds, int port)
hellcreek_port->current_schedule = NULL;
}

- /* Then select port */
+ /* Reset max sdu */
+ hellcreek_reset_maxsdu(hellcreek, port);
+
+ /* Select tgd */
hellcreek_select_tgd(hellcreek, port);

/* Disable gating and return to regular switching flow */
@@ -1814,15 +1866,10 @@ static int hellcreek_port_setup_tc(struct dsa_switch *ds, int port,
{
struct tc_taprio_qopt_offload *taprio = type_data;
struct hellcreek *hellcreek = ds->priv;
- int tc;

if (type != TC_SETUP_QDISC_TAPRIO)
return -EOPNOTSUPP;

- for (tc = 0; tc < TC_MAX_QUEUE; tc++)
- if (taprio->max_sdu[tc])
- return -EOPNOTSUPP;
-
if (!hellcreek_validate_schedule(hellcreek, taprio))
return -EOPNOTSUPP;

diff --git a/drivers/net/dsa/hirschmann/hellcreek.h b/drivers/net/dsa/hirschmann/hellcreek.h
index c96b8c278904..66b989d946e4 100644
--- a/drivers/net/dsa/hirschmann/hellcreek.h
+++ b/drivers/net/dsa/hirschmann/hellcreek.h
@@ -37,6 +37,7 @@
#define HELLCREEK_VLAN_UNTAGGED_MEMBER 0x1
#define HELLCREEK_VLAN_TAGGED_MEMBER 0x3
#define HELLCREEK_NUM_EGRESS_QUEUES 8
+#define HELLCREEK_DEFAULT_MAX_SDU 1536

/* Register definitions */
#define HR_MODID_C (0 * 2)
@@ -72,6 +73,12 @@
#define HR_PRTCCFG_PCP_TC_MAP_SHIFT 0
#define HR_PRTCCFG_PCP_TC_MAP_MASK GENMASK(2, 0)

+#define HR_PTPRTCCFG (0xa9 * 2)
+#define HR_PTPRTCCFG_SET_QTRACK BIT(15)
+#define HR_PTPRTCCFG_REJECT BIT(14)
+#define HR_PTPRTCCFG_MAXSDU_SHIFT 0
+#define HR_PTPRTCCFG_MAXSDU_MASK GENMASK(10, 0)
+
#define HR_CSEL (0x8d * 2)
#define HR_CSEL_SHIFT 0
#define HR_CSEL_MASK GENMASK(7, 0)
--
2.30.2


Attachments:
signature.asc (877.00 B)

2022-09-14 19:13:59

by Vladimir Oltean

[permalink] [raw]
Subject: Re: [PATCH net-next 08/13] net: dsa: hellcreek: deny tc-taprio changes to per-tc max SDU

On Wed, Sep 14, 2022 at 08:13:53PM +0200, Kurt Kanzenbach wrote:
> I'd rather like to see that feature implemented :-). Something like below.
>
> From 3d44683979bf50960125fa3005b1142af61525c7 Mon Sep 17 00:00:00 2001
> From: Kurt Kanzenbach <[email protected]>
> Date: Wed, 14 Sep 2022 19:51:40 +0200
> Subject: [PATCH] net: dsa: hellcreek: Offload per-tc max SDU from tc-taprio
>
> Add support for configuring the max SDU per priority and per port. If not
> specified, keep the default.
>
> Signed-off-by: Kurt Kanzenbach <[email protected]>
> ---

Nice :) Do you also want the iproute2 patch, so you can test it?

> drivers/net/dsa/hirschmann/hellcreek.c | 61 +++++++++++++++++++++++---
> drivers/net/dsa/hirschmann/hellcreek.h | 7 +++
> 2 files changed, 61 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/net/dsa/hirschmann/hellcreek.c b/drivers/net/dsa/hirschmann/hellcreek.c
> index 5ceee71d9a25..1b22710e1641 100644
> --- a/drivers/net/dsa/hirschmann/hellcreek.c
> +++ b/drivers/net/dsa/hirschmann/hellcreek.c
> @@ -128,6 +128,16 @@ static void hellcreek_select_prio(struct hellcreek *hellcreek, int prio)
> hellcreek_write(hellcreek, val, HR_PSEL);
> }
>
> +static void hellcreek_select_port_prio(struct hellcreek *hellcreek, int port,
> + int prio)
> +{
> + u16 val = port << HR_PSEL_PTWSEL_SHIFT;
> +
> + val |= prio << HR_PSEL_PRTCWSEL_SHIFT;
> +
> + hellcreek_write(hellcreek, val, HR_PSEL);
> +}
> +
> static void hellcreek_select_counter(struct hellcreek *hellcreek, int counter)
> {
> u16 val = counter << HR_CSEL_SHIFT;
> @@ -1537,6 +1547,42 @@ hellcreek_port_prechangeupper(struct dsa_switch *ds, int port,
> return ret;
> }
>
> +static void hellcreek_setup_maxsdu(struct hellcreek *hellcreek, int port,
> + const struct tc_taprio_qopt_offload *schedule)
> +{
> + int tc;
> +
> + for (tc = 0; tc < 8; ++tc) {
> + u16 val;
> +
> + if (!schedule->max_sdu[tc])
> + continue;
> +
> + hellcreek_select_port_prio(hellcreek, port, tc);
> +
> + val = (schedule->max_sdu[tc] & HR_PTPRTCCFG_MAXSDU_MASK)
> + << HR_PTPRTCCFG_MAXSDU_SHIFT;
> +
> + hellcreek_write(hellcreek, val, HR_PTPRTCCFG);

So the maxSDU hardware register tracks exactly the L2 payload size, like
the software variable does, or does it include the Ethernet header size
and/or FCS?

> + }
> +}

2022-09-14 20:49:11

by Vladimir Oltean

[permalink] [raw]
Subject: Re: [PATCH net-next 08/13] net: dsa: hellcreek: deny tc-taprio changes to per-tc max SDU

On Wed, Sep 14, 2022 at 09:40:51PM +0300, Vladimir Oltean wrote:
> Nice :) Do you also want the iproute2 patch, so you can test it?

Need it or not, here it is:
https://patchwork.kernel.org/project/netdevbpf/patch/[email protected]/

2022-09-14 21:48:58

by Vinicius Costa Gomes

[permalink] [raw]
Subject: Re: [PATCH net-next 04/13] net/sched: taprio: allow user input of per-tc max SDU

Vladimir Oltean <[email protected]> writes:

> IEEE 802.1Q clause 12.29.1.1 "The queueMaxSDUTable structure and data
> types" and 8.6.8.4 "Enhancements for scheduled traffic" talk about the
> existence of a per traffic class limitation of maximum frame sizes, with
> a fallback on the port-based MTU.
>
> As far as I am able to understand, the 802.1Q Service Data Unit (SDU)
> represents the MAC Service Data Unit (MSDU, i.e. L2 payload), excluding
> any number of prepended VLAN headers which may be otherwise present in
> the MSDU. Therefore, the queueMaxSDU is directly comparable to the
> device MTU (1500 means L2 payload sizes are accepted, or frame sizes of
> 1518 octets, or 1522 plus one VLAN header). Drivers which offload this
> are directly responsible of translating into other units of measurement.
>
> Signed-off-by: Vladimir Oltean <[email protected]>
> ---
> include/net/pkt_sched.h | 1 +
> include/uapi/linux/pkt_sched.h | 11 +++
> net/sched/sch_taprio.c | 122 ++++++++++++++++++++++++++++++++-
> 3 files changed, 133 insertions(+), 1 deletion(-)
>
> diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
> index 29f65632ebc5..88080998557b 100644
> --- a/include/net/pkt_sched.h
> +++ b/include/net/pkt_sched.h
> @@ -168,6 +168,7 @@ struct tc_taprio_qopt_offload {
> ktime_t base_time;
> u64 cycle_time;
> u64 cycle_time_extension;
> + u32 max_sdu[TC_MAX_QUEUE];
>
> size_t num_entries;
> struct tc_taprio_sched_entry entries[];
> diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
> index f292b467b27f..000eec106856 100644
> --- a/include/uapi/linux/pkt_sched.h
> +++ b/include/uapi/linux/pkt_sched.h
> @@ -1232,6 +1232,16 @@ enum {
> #define TCA_TAPRIO_ATTR_FLAG_TXTIME_ASSIST _BITUL(0)
> #define TCA_TAPRIO_ATTR_FLAG_FULL_OFFLOAD _BITUL(1)
>
> +enum {
> + TCA_TAPRIO_TC_ENTRY_UNSPEC,
> + TCA_TAPRIO_TC_ENTRY_INDEX, /* u32 */
> + TCA_TAPRIO_TC_ENTRY_MAX_SDU, /* u32 */
> +
> + /* add new constants above here */
> + __TCA_TAPRIO_TC_ENTRY_CNT,
> + TCA_TAPRIO_TC_ENTRY_MAX = (__TCA_TAPRIO_TC_ENTRY_CNT - 1)
> +};
> +
> enum {
> TCA_TAPRIO_ATTR_UNSPEC,
> TCA_TAPRIO_ATTR_PRIOMAP, /* struct tc_mqprio_qopt */
> @@ -1245,6 +1255,7 @@ enum {
> TCA_TAPRIO_ATTR_SCHED_CYCLE_TIME_EXTENSION, /* s64 */
> TCA_TAPRIO_ATTR_FLAGS, /* u32 */
> TCA_TAPRIO_ATTR_TXTIME_DELAY, /* u32 */
> + TCA_TAPRIO_ATTR_TC_ENTRY, /* nest */
> __TCA_TAPRIO_ATTR_MAX,
> };
>
> diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
> index 2a4b8f59f444..834cbed88e4f 100644
> --- a/net/sched/sch_taprio.c
> +++ b/net/sched/sch_taprio.c
> @@ -79,6 +79,7 @@ struct taprio_sched {
> struct sched_gate_list __rcu *admin_sched;
> struct hrtimer advance_timer;
> struct list_head taprio_list;
> + u32 max_sdu[TC_MAX_QUEUE];
> u32 txtime_delay;
> };
>
> @@ -416,6 +417,9 @@ static int taprio_enqueue_one(struct sk_buff *skb, struct Qdisc *sch,
> struct Qdisc *child, struct sk_buff **to_free)
> {
> struct taprio_sched *q = qdisc_priv(sch);
> + struct net_device *dev = qdisc_dev(sch);
> + int prio = skb->priority;
> + u8 tc;
>
> /* sk_flags are only safe to use on full sockets. */
> if (skb->sk && sk_fullsock(skb->sk) && sock_flag(skb->sk, SOCK_TXTIME)) {
> @@ -427,6 +431,12 @@ static int taprio_enqueue_one(struct sk_buff *skb, struct Qdisc *sch,
> return qdisc_drop(skb, sch, to_free);
> }
>
> + /* Devices with full offload are expected to honor this in hardware */
> + tc = netdev_get_prio_tc_map(dev, prio);
> + if (q->max_sdu[tc] &&
> + q->max_sdu[tc] < max_t(int, 0, skb->len - skb_mac_header_len(skb)))
> + return qdisc_drop(skb, sch, to_free);
> +

One minor idea, perhaps if you initialize q->max_sdu[] with a value that
you could use to compare here (2^32 - 1), this comparison could be
simplified. The issue is that that value would become invalid for a
maximum SDU, not a problem for ethernet.

> qdisc_qstats_backlog_inc(sch, skb);
> sch->q.qlen++;
>
> @@ -761,6 +771,11 @@ static const struct nla_policy entry_policy[TCA_TAPRIO_SCHED_ENTRY_MAX + 1] = {
> [TCA_TAPRIO_SCHED_ENTRY_INTERVAL] = { .type = NLA_U32 },
> };
>
> +static const struct nla_policy taprio_tc_policy[TCA_TAPRIO_TC_ENTRY_MAX + 1] = {
> + [TCA_TAPRIO_TC_ENTRY_INDEX] = { .type = NLA_U32 },
> + [TCA_TAPRIO_TC_ENTRY_MAX_SDU] = { .type = NLA_U32 },
> +};
> +
> static const struct nla_policy taprio_policy[TCA_TAPRIO_ATTR_MAX + 1] = {
> [TCA_TAPRIO_ATTR_PRIOMAP] = {
> .len = sizeof(struct tc_mqprio_qopt)
> @@ -773,6 +788,7 @@ static const struct nla_policy taprio_policy[TCA_TAPRIO_ATTR_MAX + 1] = {
> [TCA_TAPRIO_ATTR_SCHED_CYCLE_TIME_EXTENSION] = { .type = NLA_S64 },
> [TCA_TAPRIO_ATTR_FLAGS] = { .type = NLA_U32 },
> [TCA_TAPRIO_ATTR_TXTIME_DELAY] = { .type = NLA_U32 },
> + [TCA_TAPRIO_ATTR_TC_ENTRY] = { .type = NLA_NESTED },
> };
>
> static int fill_sched_entry(struct taprio_sched *q, struct nlattr **tb,
> @@ -1236,7 +1252,7 @@ static int taprio_enable_offload(struct net_device *dev,
> {
> const struct net_device_ops *ops = dev->netdev_ops;
> struct tc_taprio_qopt_offload *offload;
> - int err = 0;
> + int tc, err = 0;
>
> if (!ops->ndo_setup_tc) {
> NL_SET_ERR_MSG(extack,
> @@ -1253,6 +1269,9 @@ static int taprio_enable_offload(struct net_device *dev,
> offload->enable = 1;
> taprio_sched_to_offload(dev, sched, offload);
>
> + for (tc = 0; tc < TC_MAX_QUEUE; tc++)
> + offload->max_sdu[tc] = q->max_sdu[tc];
> +
> err = ops->ndo_setup_tc(dev, TC_SETUP_QDISC_TAPRIO, offload);
> if (err < 0) {
> NL_SET_ERR_MSG(extack,
> @@ -1387,6 +1406,73 @@ static int taprio_parse_clockid(struct Qdisc *sch, struct nlattr **tb,
> return err;
> }
>
> +static int taprio_parse_tc_entry(struct Qdisc *sch,
> + struct nlattr *opt,
> + unsigned long *seen_tcs,
> + struct netlink_ext_ack *extack)
> +{
> + struct nlattr *tb[TCA_TAPRIO_TC_ENTRY_MAX + 1] = { };
> + struct taprio_sched *q = qdisc_priv(sch);
> + struct net_device *dev = qdisc_dev(sch);
> + u32 max_sdu = 0;
> + int err, tc;
> +
> + err = nla_parse_nested(tb, TCA_TAPRIO_TC_ENTRY_MAX, opt,
> + taprio_tc_policy, extack);
> + if (err < 0)
> + return err;
> +
> + if (!tb[TCA_TAPRIO_TC_ENTRY_INDEX]) {
> + NL_SET_ERR_MSG_MOD(extack, "TC entry index missing");
> + return -EINVAL;
> + }
> +
> + tc = nla_get_u32(tb[TCA_TAPRIO_TC_ENTRY_INDEX]);
> + if (tc >= TC_QOPT_MAX_QUEUE) {
> + NL_SET_ERR_MSG_MOD(extack, "TC entry index out of range");
> + return -ERANGE;
> + }
> +
> + if (*seen_tcs & BIT(tc)) {
> + NL_SET_ERR_MSG_MOD(extack, "Duplicate TC entry");
> + return -EINVAL;
> + }
> +
> + *seen_tcs |= BIT(tc);
> +
> + if (tb[TCA_TAPRIO_TC_ENTRY_MAX_SDU])
> + max_sdu = nla_get_u32(tb[TCA_TAPRIO_TC_ENTRY_MAX_SDU]);
> +
> + if (max_sdu > dev->max_mtu) {
> + NL_SET_ERR_MSG_MOD(extack, "TC max SDU exceeds device max MTU");
> + return -ERANGE;
> + }
> +
> + q->max_sdu[tc] = max_sdu;
> +
> + return 0;
> +}
> +
> +static int taprio_parse_tc_entries(struct Qdisc *sch,
> + struct nlattr *opt,
> + struct netlink_ext_ack *extack)
> +{
> + unsigned long seen_tcs = 0;
> + struct nlattr *n;
> + int err = 0, rem;
> +
> + nla_for_each_nested(n, opt, rem) {
> + if (nla_type(n) != TCA_TAPRIO_ATTR_TC_ENTRY)
> + continue;
> +
> + err = taprio_parse_tc_entry(sch, n, &seen_tcs, extack);
> + if (err)
> + break;
> + }
> +
> + return err;
> +}
> +
> static int taprio_mqprio_cmp(const struct net_device *dev,
> const struct tc_mqprio_qopt *mqprio)
> {
> @@ -1465,6 +1551,10 @@ static int taprio_change(struct Qdisc *sch, struct nlattr *opt,
> if (err < 0)
> return err;
>
> + err = taprio_parse_tc_entries(sch, opt, extack);
> + if (err)
> + return err;
> +
> new_admin = kzalloc(sizeof(*new_admin), GFP_KERNEL);
> if (!new_admin) {
> NL_SET_ERR_MSG(extack, "Not enough memory for a new schedule");
> @@ -1855,6 +1945,33 @@ static int dump_schedule(struct sk_buff *msg,
> return -1;
> }
>
> +static int taprio_dump_tc_entries(struct taprio_sched *q, struct sk_buff *skb)
> +{
> + struct nlattr *n;
> + int tc;
> +
> + for (tc = 0; tc < TC_MAX_QUEUE; tc++) {
> + n = nla_nest_start(skb, TCA_TAPRIO_ATTR_TC_ENTRY);
> + if (!n)
> + return -EMSGSIZE;
> +
> + if (nla_put_u32(skb, TCA_TAPRIO_TC_ENTRY_INDEX, tc))
> + goto nla_put_failure;
> +
> + if (nla_put_u32(skb, TCA_TAPRIO_TC_ENTRY_MAX_SDU,
> + q->max_sdu[tc]))
> + goto nla_put_failure;
> +
> + nla_nest_end(skb, n);
> + }
> +
> + return 0;
> +
> +nla_put_failure:
> + nla_nest_cancel(skb, n);
> + return -EMSGSIZE;
> +}
> +
> static int taprio_dump(struct Qdisc *sch, struct sk_buff *skb)
> {
> struct taprio_sched *q = qdisc_priv(sch);
> @@ -1894,6 +2011,9 @@ static int taprio_dump(struct Qdisc *sch, struct sk_buff *skb)
> nla_put_u32(skb, TCA_TAPRIO_ATTR_TXTIME_DELAY, q->txtime_delay))
> goto options_error;
>
> + if (taprio_dump_tc_entries(q, skb))
> + goto options_error;
> +
> if (oper && dump_schedule(skb, oper))
> goto options_error;
>
> --
> 2.34.1
>

--
Vinicius

2022-09-14 22:01:19

by Vinicius Costa Gomes

[permalink] [raw]
Subject: Re: [PATCH net-next 00/13] Add tc-taprio support for queueMaxSDU

Vladimir Oltean <[email protected]> writes:

> Michael and Xiaoliang will probably be aware that the tc-taprio offload
> mode supported by the Felix DSA driver has limitations surrounding its
> guard bands.
>
> The initial discussion was at:
> https://lore.kernel.org/netdev/[email protected]/
>
> with the latest status being that we now have a vsc9959_tas_guard_bands_update()
> method which makes a best-guess attempt at how much useful space to
> reserve for packet scheduling in a taprio interval, and how much to
> reserve for guard bands.
>
> IEEE 802.1Q actually does offer a tunable variable (queueMaxSDU) which
> can determine the max MTU supported per traffic class. In turn we can
> determine the size we need for the guard bands, depending on the
> queueMaxSDU. This way we can make the guard band of small taprio
> intervals smaller than one full MTU worth of transmission time, if we
> know that said traffic class will transport only smaller packets.
>
> Allow input of queueMaxSDU through netlink into tc-taprio, offload it to
> the hardware I have access to (LS1028A), and deny non-default values to
> everyone else.
>
> First 3 patches are some cleanups I made while figuring out what exactly
> gets called for taprio software mode, and what gets called for offload
> mode.
>
> Vladimir Oltean (13):
> net/sched: taprio: remove redundant FULL_OFFLOAD_IS_ENABLED check in
> taprio_enqueue
> net/sched: taprio: stop going through private ops for dequeue and peek
> net/sched: taprio: add extack messages in taprio_init

Indeed. I think the first three patches can be in a separate series.

> net/sched: taprio: allow user input of per-tc max SDU
> net: dsa: felix: offload per-tc max SDU from tc-taprio
> net: enetc: cache accesses to &priv->si->hw
> net: enetc: offload per-tc max SDU from tc-taprio
> net: dsa: hellcreek: deny tc-taprio changes to per-tc max SDU
> net: dsa: sja1105: deny tc-taprio changes to per-tc max SDU
> tsnep: deny tc-taprio changes to per-tc max SDU
> igc: deny tc-taprio changes to per-tc max SDU
> net: stmmac: deny tc-taprio changes to per-tc max SDU
> net: am65-cpsw: deny tc-taprio changes to per-tc max SDU
>
> drivers/net/dsa/hirschmann/hellcreek.c | 5 +
> drivers/net/dsa/ocelot/felix_vsc9959.c | 20 +-
> drivers/net/dsa/sja1105/sja1105_tas.c | 6 +-
> drivers/net/ethernet/engleder/tsnep_tc.c | 6 +-
> drivers/net/ethernet/freescale/enetc/enetc.c | 28 ++-
> drivers/net/ethernet/freescale/enetc/enetc.h | 12 +-
> .../net/ethernet/freescale/enetc/enetc_pf.c | 25 ++-
> .../net/ethernet/freescale/enetc/enetc_qos.c | 70 +++----
> drivers/net/ethernet/intel/igc/igc_main.c | 6 +-
> .../net/ethernet/stmicro/stmmac/stmmac_tc.c | 6 +-
> drivers/net/ethernet/ti/am65-cpsw-qos.c | 6 +-
> include/net/pkt_sched.h | 1 +
> include/uapi/linux/pkt_sched.h | 11 +
> net/sched/sch_taprio.c | 194 +++++++++++++-----
> 14 files changed, 283 insertions(+), 113 deletions(-)
>
> --
> 2.34.1
>


Cheers,
--
Vinicius

2022-09-14 22:21:04

by Vladimir Oltean

[permalink] [raw]
Subject: Re: [PATCH net-next 04/13] net/sched: taprio: allow user input of per-tc max SDU

On Wed, Sep 14, 2022 at 02:43:02PM -0700, Vinicius Costa Gomes wrote:
> > @@ -416,6 +417,9 @@ static int taprio_enqueue_one(struct sk_buff *skb, struct Qdisc *sch,
> > struct Qdisc *child, struct sk_buff **to_free)
> > {
> > struct taprio_sched *q = qdisc_priv(sch);
> > + struct net_device *dev = qdisc_dev(sch);
> > + int prio = skb->priority;
> > + u8 tc;
> >
> > /* sk_flags are only safe to use on full sockets. */
> > if (skb->sk && sk_fullsock(skb->sk) && sock_flag(skb->sk, SOCK_TXTIME)) {
> > @@ -427,6 +431,12 @@ static int taprio_enqueue_one(struct sk_buff *skb, struct Qdisc *sch,
> > return qdisc_drop(skb, sch, to_free);
> > }
> >
> > + /* Devices with full offload are expected to honor this in hardware */
> > + tc = netdev_get_prio_tc_map(dev, prio);
> > + if (q->max_sdu[tc] &&
> > + q->max_sdu[tc] < max_t(int, 0, skb->len - skb_mac_header_len(skb)))
> > + return qdisc_drop(skb, sch, to_free);
> > +
>
> One minor idea, perhaps if you initialize q->max_sdu[] with a value that
> you could use to compare here (2^32 - 1), this comparison could be
> simplified. The issue is that that value would become invalid for a
> maximum SDU, not a problem for ethernet.

Could do (and the fact that U32_MAX becomes a reserved value shouldn't
be a problem for any linklayer), but if I optimize the code for this one
place, I need, in turn, to increase the complexity in the netlink dump
and in the offload procedures, to hide what I've done.

If I look at the difference in generated code, maybe it's worth it
(I get rid of a "cbz" instruction). Maybe it's worth simply creating a
shadow array of q->max_sdu[], but which is also adjusted for something
like dev->hard_header_len (also a fast path invariant)? This way, we
could only check for q->max_frm_len[tc] > skb->len and save even more
checks in the fast path.

2022-09-14 23:43:18

by Vladimir Oltean

[permalink] [raw]
Subject: Re: [PATCH net-next 04/13] net/sched: taprio: allow user input of per-tc max SDU

On Wed, Sep 14, 2022 at 04:00:07PM -0700, Vinicius Costa Gomes wrote:
> Hm, I just noticed something.
>
> During parse the user only sets the max-sdu for the traffic classes she
> is interested on. During dump you are showing all of them, the unset
> ones will be shown as zero, that seems a bit confusing, which could mean
> that you would have to add some checks anyway.
>
> For the offload side, you could just document that U32_MAX means unset.

Yes, choosing '0' rather than other value, to mean 'default to port MTU'
was intentional. It is also in line with what other places, like the
YANG models, expect to see:
https://github.com/YangModels/yang/blob/main/standard/ieee/draft/802.1/Qcw/ieee802-dot1q-sched.yang#L128

2022-09-14 23:47:31

by Vladimir Oltean

[permalink] [raw]
Subject: Re: [PATCH net-next 04/13] net/sched: taprio: allow user input of per-tc max SDU

On Wed, Sep 14, 2022 at 04:03:26PM -0700, Vinicius Costa Gomes wrote:
> Could you please add an example of the 'tc' command syntax you are
> thinking about?

I was working with this, as a matter of fact:

#!/bin/bash

h1=eno0
h2=eno2
swp1=swp0
swp2=swp4

ip link set $h2 address 00:01:02:03:04:05
ip link set $h1 up
ip link set $h2 up
tc qdisc del dev $swp1 root || :
ip link del br0 || :

ip link add br0 type bridge && ip link set br0 up
ip link set $swp1 master br0 && ip link set $swp1 up
ip link set $swp2 master br0 && ip link set $swp2 up
tc qdisc replace dev $swp1 root taprio \
num_tc 8 \
map 0 1 2 3 4 5 6 7 \
queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
base-time 0 \
sched-entry S 0x7f 990000 \
sched-entry S 0x80 10000 \
max-sdu 0 0 0 0 0 0 0 200 \
flags 0x2

echo "Run:"
echo "isochron rcv --interface $h1 &"
echo "isochron send --interface $h2 --client 127.0.0.1 --cycle-time 0.001 --frame-size 60 --omit-sync --num-frames 10 --priority 7 --vid 0"

I've also tested it in software mode, and at least on my system, a 10 us
interval is not really enough for the qdisc to make forward progress and
dequeue any packet. My application freezes until I destroy the qdisc and
re-create it using a larger interval for TC7. I'll debug that some more.
I was thinking, after the basic queueMaxSDU feature is merged, maybe
taprio could automatically further limit queueMaxSDU based on the
smallest intervals from the schedule. Something like a generalization of
vsc9959_tas_guard_bands_update(), basically.

> Another point to think about: does it make sense to allow 'only' the
> max-sdu to be changed, i.e. the user doesn't set an a schedule, nor a
> map, only the max-sdu information.

Maybe, I haven't tried. I think iproute2 doesn't currently support
skipping TCA_TAPRIO_ATTR_SCHED_ENTRY_LIST.

2022-09-14 23:48:25

by Vinicius Costa Gomes

[permalink] [raw]
Subject: Re: [PATCH net-next 04/13] net/sched: taprio: allow user input of per-tc max SDU

Vladimir Oltean <[email protected]> writes:

> On Wed, Sep 14, 2022 at 02:43:02PM -0700, Vinicius Costa Gomes wrote:
>> > @@ -416,6 +417,9 @@ static int taprio_enqueue_one(struct sk_buff *skb, struct Qdisc *sch,
>> > struct Qdisc *child, struct sk_buff **to_free)
>> > {
>> > struct taprio_sched *q = qdisc_priv(sch);
>> > + struct net_device *dev = qdisc_dev(sch);
>> > + int prio = skb->priority;
>> > + u8 tc;
>> >
>> > /* sk_flags are only safe to use on full sockets. */
>> > if (skb->sk && sk_fullsock(skb->sk) && sock_flag(skb->sk, SOCK_TXTIME)) {
>> > @@ -427,6 +431,12 @@ static int taprio_enqueue_one(struct sk_buff *skb, struct Qdisc *sch,
>> > return qdisc_drop(skb, sch, to_free);
>> > }
>> >
>> > + /* Devices with full offload are expected to honor this in hardware */
>> > + tc = netdev_get_prio_tc_map(dev, prio);
>> > + if (q->max_sdu[tc] &&
>> > + q->max_sdu[tc] < max_t(int, 0, skb->len - skb_mac_header_len(skb)))
>> > + return qdisc_drop(skb, sch, to_free);
>> > +
>>
>> One minor idea, perhaps if you initialize q->max_sdu[] with a value that
>> you could use to compare here (2^32 - 1), this comparison could be
>> simplified. The issue is that that value would become invalid for a
>> maximum SDU, not a problem for ethernet.
>
> Could do (and the fact that U32_MAX becomes a reserved value shouldn't
> be a problem for any linklayer), but if I optimize the code for this one
> place, I need, in turn, to increase the complexity in the netlink dump
> and in the offload procedures, to hide what I've done.

Hm, I just noticed something.

During parse the user only sets the max-sdu for the traffic classes she
is interested on. During dump you are showing all of them, the unset
ones will be shown as zero, that seems a bit confusing, which could mean
that you would have to add some checks anyway.

For the offload side, you could just document that U32_MAX means unset.

>
> If I look at the difference in generated code, maybe it's worth it
> (I get rid of a "cbz" instruction). Maybe it's worth simply creating a
> shadow array of q->max_sdu[], but which is also adjusted for something
> like dev->hard_header_len (also a fast path invariant)? This way, we
> could only check for q->max_frm_len[tc] > skb->len and save even more
> checks in the fast path.

--
Vinicius

2022-09-14 23:48:36

by Vinicius Costa Gomes

[permalink] [raw]
Subject: Re: [PATCH net-next 04/13] net/sched: taprio: allow user input of per-tc max SDU

Vladimir Oltean <[email protected]> writes:

> On Wed, Sep 14, 2022 at 04:00:07PM -0700, Vinicius Costa Gomes wrote:
>> Hm, I just noticed something.
>>
>> During parse the user only sets the max-sdu for the traffic classes she
>> is interested on. During dump you are showing all of them, the unset
>> ones will be shown as zero, that seems a bit confusing, which could mean
>> that you would have to add some checks anyway.
>>
>> For the offload side, you could just document that U32_MAX means unset.
>
> Yes, choosing '0' rather than other value, to mean 'default to port MTU'
> was intentional. It is also in line with what other places, like the
> YANG models, expect to see:
> https://github.com/YangModels/yang/blob/main/standard/ieee/draft/802.1/Qcw/ieee802-dot1q-sched.yang#L128

Oh, I see. My bad. So, only that comment about thinking about making the
comparison simpler is still valid.


Cheers,
--
Vinicius

2022-09-14 23:57:41

by Vinicius Costa Gomes

[permalink] [raw]
Subject: Re: [PATCH net-next 04/13] net/sched: taprio: allow user input of per-tc max SDU

Vladimir Oltean <[email protected]> writes:

> IEEE 802.1Q clause 12.29.1.1 "The queueMaxSDUTable structure and data
> types" and 8.6.8.4 "Enhancements for scheduled traffic" talk about the
> existence of a per traffic class limitation of maximum frame sizes, with
> a fallback on the port-based MTU.
>
> As far as I am able to understand, the 802.1Q Service Data Unit (SDU)
> represents the MAC Service Data Unit (MSDU, i.e. L2 payload), excluding
> any number of prepended VLAN headers which may be otherwise present in
> the MSDU. Therefore, the queueMaxSDU is directly comparable to the
> device MTU (1500 means L2 payload sizes are accepted, or frame sizes of
> 1518 octets, or 1522 plus one VLAN header). Drivers which offload this
> are directly responsible of translating into other units of measurement.
>
> Signed-off-by: Vladimir Oltean <[email protected]>
> ---

Could you please add an example of the 'tc' command syntax you are
thinking about?

Another point to think about: does it make sense to allow 'only' the
max-sdu to be changed, i.e. the user doesn't set an a schedule, nor a
map, only the max-sdu information.

> include/net/pkt_sched.h | 1 +
> include/uapi/linux/pkt_sched.h | 11 +++
> net/sched/sch_taprio.c | 122 ++++++++++++++++++++++++++++++++-
> 3 files changed, 133 insertions(+), 1 deletion(-)
>
> diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
> index 29f65632ebc5..88080998557b 100644
> --- a/include/net/pkt_sched.h
> +++ b/include/net/pkt_sched.h
> @@ -168,6 +168,7 @@ struct tc_taprio_qopt_offload {
> ktime_t base_time;
> u64 cycle_time;
> u64 cycle_time_extension;
> + u32 max_sdu[TC_MAX_QUEUE];
>
> size_t num_entries;
> struct tc_taprio_sched_entry entries[];
> diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
> index f292b467b27f..000eec106856 100644
> --- a/include/uapi/linux/pkt_sched.h
> +++ b/include/uapi/linux/pkt_sched.h
> @@ -1232,6 +1232,16 @@ enum {
> #define TCA_TAPRIO_ATTR_FLAG_TXTIME_ASSIST _BITUL(0)
> #define TCA_TAPRIO_ATTR_FLAG_FULL_OFFLOAD _BITUL(1)
>
> +enum {
> + TCA_TAPRIO_TC_ENTRY_UNSPEC,
> + TCA_TAPRIO_TC_ENTRY_INDEX, /* u32 */
> + TCA_TAPRIO_TC_ENTRY_MAX_SDU, /* u32 */
> +
> + /* add new constants above here */
> + __TCA_TAPRIO_TC_ENTRY_CNT,
> + TCA_TAPRIO_TC_ENTRY_MAX = (__TCA_TAPRIO_TC_ENTRY_CNT - 1)
> +};
> +
> enum {
> TCA_TAPRIO_ATTR_UNSPEC,
> TCA_TAPRIO_ATTR_PRIOMAP, /* struct tc_mqprio_qopt */
> @@ -1245,6 +1255,7 @@ enum {
> TCA_TAPRIO_ATTR_SCHED_CYCLE_TIME_EXTENSION, /* s64 */
> TCA_TAPRIO_ATTR_FLAGS, /* u32 */
> TCA_TAPRIO_ATTR_TXTIME_DELAY, /* u32 */
> + TCA_TAPRIO_ATTR_TC_ENTRY, /* nest */
> __TCA_TAPRIO_ATTR_MAX,
> };
>
> diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
> index 2a4b8f59f444..834cbed88e4f 100644
> --- a/net/sched/sch_taprio.c
> +++ b/net/sched/sch_taprio.c
> @@ -79,6 +79,7 @@ struct taprio_sched {
> struct sched_gate_list __rcu *admin_sched;
> struct hrtimer advance_timer;
> struct list_head taprio_list;
> + u32 max_sdu[TC_MAX_QUEUE];
> u32 txtime_delay;
> };
>
> @@ -416,6 +417,9 @@ static int taprio_enqueue_one(struct sk_buff *skb, struct Qdisc *sch,
> struct Qdisc *child, struct sk_buff **to_free)
> {
> struct taprio_sched *q = qdisc_priv(sch);
> + struct net_device *dev = qdisc_dev(sch);
> + int prio = skb->priority;
> + u8 tc;
>
> /* sk_flags are only safe to use on full sockets. */
> if (skb->sk && sk_fullsock(skb->sk) && sock_flag(skb->sk, SOCK_TXTIME)) {
> @@ -427,6 +431,12 @@ static int taprio_enqueue_one(struct sk_buff *skb, struct Qdisc *sch,
> return qdisc_drop(skb, sch, to_free);
> }
>
> + /* Devices with full offload are expected to honor this in hardware */
> + tc = netdev_get_prio_tc_map(dev, prio);
> + if (q->max_sdu[tc] &&
> + q->max_sdu[tc] < max_t(int, 0, skb->len - skb_mac_header_len(skb)))
> + return qdisc_drop(skb, sch, to_free);
> +
> qdisc_qstats_backlog_inc(sch, skb);
> sch->q.qlen++;
>
> @@ -761,6 +771,11 @@ static const struct nla_policy entry_policy[TCA_TAPRIO_SCHED_ENTRY_MAX + 1] = {
> [TCA_TAPRIO_SCHED_ENTRY_INTERVAL] = { .type = NLA_U32 },
> };
>
> +static const struct nla_policy taprio_tc_policy[TCA_TAPRIO_TC_ENTRY_MAX + 1] = {
> + [TCA_TAPRIO_TC_ENTRY_INDEX] = { .type = NLA_U32 },
> + [TCA_TAPRIO_TC_ENTRY_MAX_SDU] = { .type = NLA_U32 },
> +};
> +
> static const struct nla_policy taprio_policy[TCA_TAPRIO_ATTR_MAX + 1] = {
> [TCA_TAPRIO_ATTR_PRIOMAP] = {
> .len = sizeof(struct tc_mqprio_qopt)
> @@ -773,6 +788,7 @@ static const struct nla_policy taprio_policy[TCA_TAPRIO_ATTR_MAX + 1] = {
> [TCA_TAPRIO_ATTR_SCHED_CYCLE_TIME_EXTENSION] = { .type = NLA_S64 },
> [TCA_TAPRIO_ATTR_FLAGS] = { .type = NLA_U32 },
> [TCA_TAPRIO_ATTR_TXTIME_DELAY] = { .type = NLA_U32 },
> + [TCA_TAPRIO_ATTR_TC_ENTRY] = { .type = NLA_NESTED },
> };
>
> static int fill_sched_entry(struct taprio_sched *q, struct nlattr **tb,
> @@ -1236,7 +1252,7 @@ static int taprio_enable_offload(struct net_device *dev,
> {
> const struct net_device_ops *ops = dev->netdev_ops;
> struct tc_taprio_qopt_offload *offload;
> - int err = 0;
> + int tc, err = 0;
>
> if (!ops->ndo_setup_tc) {
> NL_SET_ERR_MSG(extack,
> @@ -1253,6 +1269,9 @@ static int taprio_enable_offload(struct net_device *dev,
> offload->enable = 1;
> taprio_sched_to_offload(dev, sched, offload);
>
> + for (tc = 0; tc < TC_MAX_QUEUE; tc++)
> + offload->max_sdu[tc] = q->max_sdu[tc];
> +
> err = ops->ndo_setup_tc(dev, TC_SETUP_QDISC_TAPRIO, offload);
> if (err < 0) {
> NL_SET_ERR_MSG(extack,
> @@ -1387,6 +1406,73 @@ static int taprio_parse_clockid(struct Qdisc *sch, struct nlattr **tb,
> return err;
> }
>
> +static int taprio_parse_tc_entry(struct Qdisc *sch,
> + struct nlattr *opt,
> + unsigned long *seen_tcs,
> + struct netlink_ext_ack *extack)
> +{
> + struct nlattr *tb[TCA_TAPRIO_TC_ENTRY_MAX + 1] = { };
> + struct taprio_sched *q = qdisc_priv(sch);
> + struct net_device *dev = qdisc_dev(sch);
> + u32 max_sdu = 0;
> + int err, tc;
> +
> + err = nla_parse_nested(tb, TCA_TAPRIO_TC_ENTRY_MAX, opt,
> + taprio_tc_policy, extack);
> + if (err < 0)
> + return err;
> +
> + if (!tb[TCA_TAPRIO_TC_ENTRY_INDEX]) {
> + NL_SET_ERR_MSG_MOD(extack, "TC entry index missing");
> + return -EINVAL;
> + }
> +
> + tc = nla_get_u32(tb[TCA_TAPRIO_TC_ENTRY_INDEX]);
> + if (tc >= TC_QOPT_MAX_QUEUE) {
> + NL_SET_ERR_MSG_MOD(extack, "TC entry index out of range");
> + return -ERANGE;
> + }
> +
> + if (*seen_tcs & BIT(tc)) {
> + NL_SET_ERR_MSG_MOD(extack, "Duplicate TC entry");
> + return -EINVAL;
> + }
> +
> + *seen_tcs |= BIT(tc);
> +
> + if (tb[TCA_TAPRIO_TC_ENTRY_MAX_SDU])
> + max_sdu = nla_get_u32(tb[TCA_TAPRIO_TC_ENTRY_MAX_SDU]);
> +
> + if (max_sdu > dev->max_mtu) {
> + NL_SET_ERR_MSG_MOD(extack, "TC max SDU exceeds device max MTU");
> + return -ERANGE;
> + }
> +
> + q->max_sdu[tc] = max_sdu;
> +
> + return 0;
> +}
> +
> +static int taprio_parse_tc_entries(struct Qdisc *sch,
> + struct nlattr *opt,
> + struct netlink_ext_ack *extack)
> +{
> + unsigned long seen_tcs = 0;
> + struct nlattr *n;
> + int err = 0, rem;
> +
> + nla_for_each_nested(n, opt, rem) {
> + if (nla_type(n) != TCA_TAPRIO_ATTR_TC_ENTRY)
> + continue;
> +
> + err = taprio_parse_tc_entry(sch, n, &seen_tcs, extack);
> + if (err)
> + break;
> + }
> +
> + return err;
> +}
> +
> static int taprio_mqprio_cmp(const struct net_device *dev,
> const struct tc_mqprio_qopt *mqprio)
> {
> @@ -1465,6 +1551,10 @@ static int taprio_change(struct Qdisc *sch, struct nlattr *opt,
> if (err < 0)
> return err;
>
> + err = taprio_parse_tc_entries(sch, opt, extack);
> + if (err)
> + return err;
> +
> new_admin = kzalloc(sizeof(*new_admin), GFP_KERNEL);
> if (!new_admin) {
> NL_SET_ERR_MSG(extack, "Not enough memory for a new schedule");
> @@ -1855,6 +1945,33 @@ static int dump_schedule(struct sk_buff *msg,
> return -1;
> }
>
> +static int taprio_dump_tc_entries(struct taprio_sched *q, struct sk_buff *skb)
> +{
> + struct nlattr *n;
> + int tc;
> +
> + for (tc = 0; tc < TC_MAX_QUEUE; tc++) {
> + n = nla_nest_start(skb, TCA_TAPRIO_ATTR_TC_ENTRY);
> + if (!n)
> + return -EMSGSIZE;
> +
> + if (nla_put_u32(skb, TCA_TAPRIO_TC_ENTRY_INDEX, tc))
> + goto nla_put_failure;
> +
> + if (nla_put_u32(skb, TCA_TAPRIO_TC_ENTRY_MAX_SDU,
> + q->max_sdu[tc]))
> + goto nla_put_failure;
> +
> + nla_nest_end(skb, n);
> + }
> +
> + return 0;
> +
> +nla_put_failure:
> + nla_nest_cancel(skb, n);
> + return -EMSGSIZE;
> +}
> +
> static int taprio_dump(struct Qdisc *sch, struct sk_buff *skb)
> {
> struct taprio_sched *q = qdisc_priv(sch);
> @@ -1894,6 +2011,9 @@ static int taprio_dump(struct Qdisc *sch, struct sk_buff *skb)
> nla_put_u32(skb, TCA_TAPRIO_ATTR_TXTIME_DELAY, q->txtime_delay))
> goto options_error;
>
> + if (taprio_dump_tc_entries(q, skb))
> + goto options_error;
> +
> if (oper && dump_schedule(skb, oper))
> goto options_error;
>
> --
> 2.34.1
>

--
Vinicius

2022-09-15 06:27:30

by Kurt Kanzenbach

[permalink] [raw]
Subject: Re: [PATCH net-next 08/13] net: dsa: hellcreek: deny tc-taprio changes to per-tc max SDU

On Wed Sep 14 2022, Vladimir Oltean wrote:
> On Wed, Sep 14, 2022 at 08:13:53PM +0200, Kurt Kanzenbach wrote:
>> I'd rather like to see that feature implemented :-). Something like below.
>>
>> From 3d44683979bf50960125fa3005b1142af61525c7 Mon Sep 17 00:00:00 2001
>> From: Kurt Kanzenbach <[email protected]>
>> Date: Wed, 14 Sep 2022 19:51:40 +0200
>> Subject: [PATCH] net: dsa: hellcreek: Offload per-tc max SDU from tc-taprio
>>
>> Add support for configuring the max SDU per priority and per port. If not
>> specified, keep the default.
>>
>> Signed-off-by: Kurt Kanzenbach <[email protected]>
>> ---
>
> Nice :) Do you also want the iproute2 patch, so you can test it?

Sure. I see you posted that patch already.

>
>> drivers/net/dsa/hirschmann/hellcreek.c | 61 +++++++++++++++++++++++---
>> drivers/net/dsa/hirschmann/hellcreek.h | 7 +++
>> 2 files changed, 61 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/net/dsa/hirschmann/hellcreek.c b/drivers/net/dsa/hirschmann/hellcreek.c
>> index 5ceee71d9a25..1b22710e1641 100644
>> --- a/drivers/net/dsa/hirschmann/hellcreek.c
>> +++ b/drivers/net/dsa/hirschmann/hellcreek.c
>> @@ -128,6 +128,16 @@ static void hellcreek_select_prio(struct hellcreek *hellcreek, int prio)
>> hellcreek_write(hellcreek, val, HR_PSEL);
>> }
>>
>> +static void hellcreek_select_port_prio(struct hellcreek *hellcreek, int port,
>> + int prio)
>> +{
>> + u16 val = port << HR_PSEL_PTWSEL_SHIFT;
>> +
>> + val |= prio << HR_PSEL_PRTCWSEL_SHIFT;
>> +
>> + hellcreek_write(hellcreek, val, HR_PSEL);
>> +}
>> +
>> static void hellcreek_select_counter(struct hellcreek *hellcreek, int counter)
>> {
>> u16 val = counter << HR_CSEL_SHIFT;
>> @@ -1537,6 +1547,42 @@ hellcreek_port_prechangeupper(struct dsa_switch *ds, int port,
>> return ret;
>> }
>>
>> +static void hellcreek_setup_maxsdu(struct hellcreek *hellcreek, int port,
>> + const struct tc_taprio_qopt_offload *schedule)
>> +{
>> + int tc;
>> +
>> + for (tc = 0; tc < 8; ++tc) {
>> + u16 val;
>> +
>> + if (!schedule->max_sdu[tc])
>> + continue;
>> +
>> + hellcreek_select_port_prio(hellcreek, port, tc);
>> +
>> + val = (schedule->max_sdu[tc] & HR_PTPRTCCFG_MAXSDU_MASK)
>> + << HR_PTPRTCCFG_MAXSDU_SHIFT;
>> +
>> + hellcreek_write(hellcreek, val, HR_PTPRTCCFG);
>
> So the maxSDU hardware register tracks exactly the L2 payload size, like
> the software variable does, or does it include the Ethernet header size
> and/or FCS?

This is something I'm not sure about. I'll ask the HW engineer when he's
back from vacation.

Thanks,
Kurt


Attachments:
signature.asc (877.00 B)

2022-09-15 12:30:36

by Vladimir Oltean

[permalink] [raw]
Subject: Re: [PATCH net-next 08/13] net: dsa: hellcreek: deny tc-taprio changes to per-tc max SDU

On Thu, Sep 15, 2022 at 08:15:54AM +0200, Kurt Kanzenbach wrote:
> > So the maxSDU hardware register tracks exactly the L2 payload size, like
> > the software variable does, or does it include the Ethernet header size
> > and/or FCS?
>
> This is something I'm not sure about. I'll ask the HW engineer when he's
> back from vacation.

You can also probably figure this out by limiting the max-sdu to a value
like 200 and seeing what frame sizes pass through.

2022-09-15 20:00:46

by Gerhard Engleder

[permalink] [raw]
Subject: Re: [PATCH net-next 10/13] tsnep: deny tc-taprio changes to per-tc max SDU

> Since the driver does not act upon the max_sdu argument, deny any other
> values except the default all-zeroes, which means that all traffic
> classes should use the same MTU as the port itself.
>
> Signed-off-by: Vladimir Oltean <[email protected]>
> ---
> drivers/net/ethernet/engleder/tsnep_tc.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/engleder/tsnep_tc.c b/drivers/net/ethernet/engleder/tsnep_tc.c
> index c4c6e1357317..82df837ffc54 100644
> --- a/drivers/net/ethernet/engleder/tsnep_tc.c
> +++ b/drivers/net/ethernet/engleder/tsnep_tc.c
> @@ -320,11 +320,15 @@ static int tsnep_taprio(struct tsnep_adapter *adapter,
> {
> struct tsnep_gcl *gcl;
> struct tsnep_gcl *curr;
> - int retval;
> + int retval, tc;
>
> if (!adapter->gate_control)
> return -EOPNOTSUPP;
>
> + for (tc = 0; tc < TC_MAX_QUEUE; tc++)
> + if (qopt->max_sdu[tc])
> + return -EOPNOTSUPP;

Does it make any difference if the MAC already guarantees that too
long frames, which would violate the next taprio interval, will not
be transmitted? MACs are able to do that, switches not.

The user could be informed, that the MAC is considering the length of
the frames by accepting any max_sdu value lower than the MTU of the netdev.

2022-09-16 14:03:34

by Vladimir Oltean

[permalink] [raw]
Subject: Re: [PATCH net-next 10/13] tsnep: deny tc-taprio changes to per-tc max SDU

On Thu, Sep 15, 2022 at 09:01:19PM +0200, Gerhard Engleder wrote:
> Does it make any difference if the MAC already guarantees that too
> long frames, which would violate the next taprio interval, will not
> be transmitted?

This is not, in essence, about gate overruns, but about transmitting
packets larger than X bytes for a traffic class. It also becomes about
overruns and guard bands to avoid them, only in relation to certain
hardware implementations.

But it could also be useful for reducing latency in a given time slot
with mixed traffic classes where you don't have frame preemption.

> MACs are able to do that, switches not.

Switches could in principle be able to do that too, but just in store
and forward mode (not cut-through).

> The user could be informed, that the MAC is considering the length of the
> frames by accepting any max_sdu value lower than the MTU of the netdev.

By accepting any max_sdu lower than the MTU of the netdev, I would
expect that the observable behavior is that the netdev will not send any
frame for this traffic class that is larger than max_sdu. But if you
accept the config as valid and not act on it, this will not be enforced
by anyone. This is because of the way in which the taprio qdisc works in
full offload mode.

2022-09-16 20:02:11

by Gerhard Engleder

[permalink] [raw]
Subject: Re: [PATCH net-next 10/13] tsnep: deny tc-taprio changes to per-tc max SDU

On 16.09.22 15:57, Vladimir Oltean wrote:
> On Thu, Sep 15, 2022 at 09:01:19PM +0200, Gerhard Engleder wrote:
>> Does it make any difference if the MAC already guarantees that too
>> long frames, which would violate the next taprio interval, will not
>> be transmitted?
>
> This is not, in essence, about gate overruns, but about transmitting
> packets larger than X bytes for a traffic class. It also becomes about
> overruns and guard bands to avoid them, only in relation to certain
> hardware implementations.
>
> But it could also be useful for reducing latency in a given time slot
> with mixed traffic classes where you don't have frame preemption.
>
>> MACs are able to do that, switches not.
>
> Switches could in principle be able to do that too, but just in store
> and forward mode (not cut-through).
>
>> The user could be informed, that the MAC is considering the length of the
>> frames by accepting any max_sdu value lower than the MTU of the netdev.
>
> By accepting any max_sdu lower than the MTU of the netdev, I would
> expect that the observable behavior is that the netdev will not send any
> frame for this traffic class that is larger than max_sdu. But if you
> accept the config as valid and not act on it, this will not be enforced
> by anyone. This is because of the way in which the taprio qdisc works in
> full offload mode.

Ok, denying max_sdu makes sense. It can be used to prevent too large
frames for a traffic class by discarding them.

In my opinion for MACs a software implementation would make sense even
in full offload mode, because it would prevent copying frames from RAM
to MAC which are discarded anyway. But that should be decided driver
specific.

Thanks for your explanations!

2022-09-16 22:37:28

by Vladimir Oltean

[permalink] [raw]
Subject: Re: [PATCH net-next 10/13] tsnep: deny tc-taprio changes to per-tc max SDU

On Fri, Sep 16, 2022 at 09:53:56PM +0200, Gerhard Engleder wrote:
> Ok, denying max_sdu makes sense. It can be used to prevent too large
> frames for a traffic class by discarding them.
>
> In my opinion for MACs a software implementation would make sense even
> in full offload mode, because it would prevent copying frames from RAM
> to MAC which are discarded anyway. But that should be decided driver
> specific.
>
> Thanks for your explanations!

Depending on how your hardware architecture looks like, discarding the
oversized frames in the driver might or might not be possible. If you
have for example queues exposed to user space, you're at user space's
mercy. Since for example enetc can have virtual station interfaces
(essentially a set of queues with a way to reach them, by MAC address)
exported using vfio-pci to a guest OS running arbitrary software,
dropping them in the physical port MAC is pretty much the only way you
can ensure you meet your latency target.

2022-09-19 13:49:52

by Kurt Kanzenbach

[permalink] [raw]
Subject: Re: [PATCH net-next 08/13] net: dsa: hellcreek: deny tc-taprio changes to per-tc max SDU

On Thu Sep 15 2022, Vladimir Oltean wrote:
> On Thu, Sep 15, 2022 at 08:15:54AM +0200, Kurt Kanzenbach wrote:
>> > So the maxSDU hardware register tracks exactly the L2 payload size, like
>> > the software variable does, or does it include the Ethernet header size
>> > and/or FCS?
>>
>> This is something I'm not sure about. I'll ask the HW engineer when he's
>> back from vacation.
>
> You can also probably figure this out by limiting the max-sdu to a value
> like 200 and seeing what frame sizes pass through.

Oh, yes. I'll try that.

Thanks,
Kurt


Attachments:
signature.asc (877.00 B)

2022-09-21 11:28:26

by Kurt Kanzenbach

[permalink] [raw]
Subject: Re: [PATCH net-next 08/13] net: dsa: hellcreek: deny tc-taprio changes to per-tc max SDU

On Thu Sep 15 2022, Vladimir Oltean wrote:
> On Thu, Sep 15, 2022 at 08:15:54AM +0200, Kurt Kanzenbach wrote:
>> > So the maxSDU hardware register tracks exactly the L2 payload size, like
>> > the software variable does, or does it include the Ethernet header size
>> > and/or FCS?
>>
>> This is something I'm not sure about. I'll ask the HW engineer when he's
>> back from vacation.
>
> You can also probably figure this out by limiting the max-sdu to a value
> like 200 and seeing what frame sizes pass through.

So, configured to 128 and 132 bytes (including VLAN Ethernet header) is
the maximum frame size which passes through.

Thanks,
Kurt


Attachments:
signature.asc (877.00 B)

2022-09-21 11:32:10

by Vladimir Oltean

[permalink] [raw]
Subject: Re: [PATCH net-next 08/13] net: dsa: hellcreek: deny tc-taprio changes to per-tc max SDU

On Wed, Sep 21, 2022 at 01:23:24PM +0200, Kurt Kanzenbach wrote:
> On Thu Sep 15 2022, Vladimir Oltean wrote:
> > On Thu, Sep 15, 2022 at 08:15:54AM +0200, Kurt Kanzenbach wrote:
> >> > So the maxSDU hardware register tracks exactly the L2 payload size, like
> >> > the software variable does, or does it include the Ethernet header size
> >> > and/or FCS?
> >>
> >> This is something I'm not sure about. I'll ask the HW engineer when he's
> >> back from vacation.
> >
> > You can also probably figure this out by limiting the max-sdu to a value
> > like 200 and seeing what frame sizes pass through.
>
> So, configured to 128 and 132 bytes (including VLAN Ethernet header) is
> the maximum frame size which passes through.

Frame size means MAC DA + MAC SA + VLAN header + Ethertype + L2 payload,
and without FCS, right?

Because max_sdu 128 only counts the L2 payload, so the maximum frame
size that passes should be 142 octets, or 146 octets with VLAN.

The same is true with the port MTU. When it's 1500, the maximum frame
size is 1514, or 1518 with VLAN.

2022-09-21 12:01:28

by Kurt Kanzenbach

[permalink] [raw]
Subject: Re: [PATCH net-next 08/13] net: dsa: hellcreek: deny tc-taprio changes to per-tc max SDU

On Wed Sep 21 2022, Vladimir Oltean wrote:
> On Wed, Sep 21, 2022 at 01:23:24PM +0200, Kurt Kanzenbach wrote:
>> On Thu Sep 15 2022, Vladimir Oltean wrote:
>> > On Thu, Sep 15, 2022 at 08:15:54AM +0200, Kurt Kanzenbach wrote:
>> >> > So the maxSDU hardware register tracks exactly the L2 payload size, like
>> >> > the software variable does, or does it include the Ethernet header size
>> >> > and/or FCS?
>> >>
>> >> This is something I'm not sure about. I'll ask the HW engineer when he's
>> >> back from vacation.
>> >
>> > You can also probably figure this out by limiting the max-sdu to a value
>> > like 200 and seeing what frame sizes pass through.
>>
>> So, configured to 128 and 132 bytes (including VLAN Ethernet header) is
>> the maximum frame size which passes through.
>
> Frame size means MAC DA + MAC SA + VLAN header + Ethertype + L2 payload,
> and without FCS, right?

Yes.

>
> Because max_sdu 128 only counts the L2 payload, so the maximum frame
> size that passes should be 142 octets, or 146 octets with VLAN.

Ok, i see. So, for 128 max-sdu we should end up with something like this
in the prio config register:

schedule->max_sdu[tc] + VLAN_ETH_HLEN - 4

Thanks,
Kurt


Attachments:
signature.asc (877.00 B)

2022-09-21 14:27:06

by Vladimir Oltean

[permalink] [raw]
Subject: Re: [PATCH net-next 08/13] net: dsa: hellcreek: deny tc-taprio changes to per-tc max SDU

On Wed, Sep 21, 2022 at 01:46:22PM +0200, Kurt Kanzenbach wrote:
> >> So, configured to 128 and 132 bytes (including VLAN Ethernet header) is
> >> the maximum frame size which passes through.
> >
> > Frame size means MAC DA + MAC SA + VLAN header + Ethertype + L2 payload,
> > and without FCS, right?
>
> Yes.
>
> >
> > Because max_sdu 128 only counts the L2 payload, so the maximum frame
> > size that passes should be 142 octets, or 146 octets with VLAN.
>
> Ok, i see. So, for 128 max-sdu we should end up with something like this
> in the prio config register:
>
> schedule->max_sdu[tc] + VLAN_ETH_HLEN - 4

What does 4 represent? ETH_FCS_LEN?

So when schedule->max_sdu[tc] is 128, you write to HR_PTPRTCCFG_MAXSDU
the value of 128 + 18 - 4 = 142, and this will pass packets (VLAN-tagged
or not) with an skb->len (on xmit) <= 142, right?

2022-09-21 14:33:18

by Vladimir Oltean

[permalink] [raw]
Subject: Re: [PATCH net-next 08/13] net: dsa: hellcreek: deny tc-taprio changes to per-tc max SDU

On Wed, Sep 21, 2022 at 05:12:06PM +0300, Vladimir Oltean wrote:
> On Wed, Sep 21, 2022 at 01:46:22PM +0200, Kurt Kanzenbach wrote:
> > >> So, configured to 128 and 132 bytes (including VLAN Ethernet header) is
> > >> the maximum frame size which passes through.
> > >
> > > Frame size means MAC DA + MAC SA + VLAN header + Ethertype + L2 payload,
> > > and without FCS, right?
> >
> > Yes.
> >
> > >
> > > Because max_sdu 128 only counts the L2 payload, so the maximum frame
> > > size that passes should be 142 octets, or 146 octets with VLAN.
> >
> > Ok, i see. So, for 128 max-sdu we should end up with something like this
> > in the prio config register:
> >
> > schedule->max_sdu[tc] + VLAN_ETH_HLEN - 4
>
> What does 4 represent? ETH_FCS_LEN?
>
> So when schedule->max_sdu[tc] is 128, you write to HR_PTPRTCCFG_MAXSDU
> the value of 128 + 18 - 4 = 142, and this will pass packets (VLAN-tagged
> or not) with an skb->len (on xmit) <= 142, right?

Mistake, I meant skb->len <= 146 (max_sdu[tc] + VLAN_ETH_HLEN). So the
hardware calculates the max frame len to include FCS, and skb->len is
without FCS, hence the 4 discrepancy.

2022-09-22 08:19:26

by Kurt Kanzenbach

[permalink] [raw]
Subject: Re: [PATCH net-next 08/13] net: dsa: hellcreek: deny tc-taprio changes to per-tc max SDU

On Wed Sep 21 2022, Vladimir Oltean wrote:
> On Wed, Sep 21, 2022 at 05:12:06PM +0300, Vladimir Oltean wrote:
>> On Wed, Sep 21, 2022 at 01:46:22PM +0200, Kurt Kanzenbach wrote:
>> > >> So, configured to 128 and 132 bytes (including VLAN Ethernet header) is
>> > >> the maximum frame size which passes through.
>> > >
>> > > Frame size means MAC DA + MAC SA + VLAN header + Ethertype + L2 payload,
>> > > and without FCS, right?
>> >
>> > Yes.
>> >
>> > >
>> > > Because max_sdu 128 only counts the L2 payload, so the maximum frame
>> > > size that passes should be 142 octets, or 146 octets with VLAN.
>> >
>> > Ok, i see. So, for 128 max-sdu we should end up with something like this
>> > in the prio config register:
>> >
>> > schedule->max_sdu[tc] + VLAN_ETH_HLEN - 4
>>
>> What does 4 represent? ETH_FCS_LEN?
>>
>> So when schedule->max_sdu[tc] is 128, you write to HR_PTPRTCCFG_MAXSDU
>> the value of 128 + 18 - 4 = 142, and this will pass packets (VLAN-tagged
>> or not) with an skb->len (on xmit) <= 142, right?
>
> Mistake, I meant skb->len <= 146 (max_sdu[tc] + VLAN_ETH_HLEN). So the
> hardware calculates the max frame len to include FCS, and skb->len is
> without FCS, hence the 4 discrepancy.

Yes, seems like it.

In case you repost this series, can you replace your patch with the one
below?

From 6488e0f979f3ea8c6130beb1b3e661cdb7796916 Mon Sep 17 00:00:00 2001
From: Kurt Kanzenbach <[email protected]>
Date: Wed, 14 Sep 2022 19:51:40 +0200
Subject: [PATCH] net: dsa: hellcreek: Offload per-tc max SDU from tc-taprio

Add support for configuring the max SDU per priority and per port. If not
specified, keep the default.

Signed-off-by: Kurt Kanzenbach <[email protected]>
---
drivers/net/dsa/hirschmann/hellcreek.c | 64 +++++++++++++++++++++++---
drivers/net/dsa/hirschmann/hellcreek.h | 7 +++
2 files changed, 64 insertions(+), 7 deletions(-)

diff --git a/drivers/net/dsa/hirschmann/hellcreek.c b/drivers/net/dsa/hirschmann/hellcreek.c
index 5ceee71d9a25..c4b76b1e7a63 100644
--- a/drivers/net/dsa/hirschmann/hellcreek.c
+++ b/drivers/net/dsa/hirschmann/hellcreek.c
@@ -128,6 +128,16 @@ static void hellcreek_select_prio(struct hellcreek *hellcreek, int prio)
hellcreek_write(hellcreek, val, HR_PSEL);
}

+static void hellcreek_select_port_prio(struct hellcreek *hellcreek, int port,
+ int prio)
+{
+ u16 val = port << HR_PSEL_PTWSEL_SHIFT;
+
+ val |= prio << HR_PSEL_PRTCWSEL_SHIFT;
+
+ hellcreek_write(hellcreek, val, HR_PSEL);
+}
+
static void hellcreek_select_counter(struct hellcreek *hellcreek, int counter)
{
u16 val = counter << HR_CSEL_SHIFT;
@@ -1537,6 +1547,45 @@ hellcreek_port_prechangeupper(struct dsa_switch *ds, int port,
return ret;
}

+static void hellcreek_setup_maxsdu(struct hellcreek *hellcreek, int port,
+ const struct tc_taprio_qopt_offload *schedule)
+{
+ int tc;
+
+ for (tc = 0; tc < 8; ++tc) {
+ u32 max_sdu = schedule->max_sdu[tc] + VLAN_ETH_HLEN - ETH_FCS_LEN;
+ u16 val;
+
+ if (!schedule->max_sdu[tc])
+ continue;
+
+ dev_dbg(hellcreek->dev, "Configure max-sdu %u for tc %d on port %d\n",
+ max_sdu, tc, port);
+
+ hellcreek_select_port_prio(hellcreek, port, tc);
+
+ val = (max_sdu & HR_PTPRTCCFG_MAXSDU_MASK) << HR_PTPRTCCFG_MAXSDU_SHIFT;
+
+ hellcreek_write(hellcreek, val, HR_PTPRTCCFG);
+ }
+}
+
+static void hellcreek_reset_maxsdu(struct hellcreek *hellcreek, int port)
+{
+ int tc;
+
+ for (tc = 0; tc < 8; ++tc) {
+ u16 val;
+
+ hellcreek_select_port_prio(hellcreek, port, tc);
+
+ val = (HELLCREEK_DEFAULT_MAX_SDU & HR_PTPRTCCFG_MAXSDU_MASK)
+ << HR_PTPRTCCFG_MAXSDU_SHIFT;
+
+ hellcreek_write(hellcreek, val, HR_PTPRTCCFG);
+ }
+}
+
static void hellcreek_setup_gcl(struct hellcreek *hellcreek, int port,
const struct tc_taprio_qopt_offload *schedule)
{
@@ -1720,7 +1769,10 @@ static int hellcreek_port_set_schedule(struct dsa_switch *ds, int port,
}
hellcreek_port->current_schedule = taprio_offload_get(taprio);

- /* Then select port */
+ /* Configure max sdu */
+ hellcreek_setup_maxsdu(hellcreek, port, hellcreek_port->current_schedule);
+
+ /* Select tdg */
hellcreek_select_tgd(hellcreek, port);

/* Enable gating and keep defaults */
@@ -1772,7 +1824,10 @@ static int hellcreek_port_del_schedule(struct dsa_switch *ds, int port)
hellcreek_port->current_schedule = NULL;
}

- /* Then select port */
+ /* Reset max sdu */
+ hellcreek_reset_maxsdu(hellcreek, port);
+
+ /* Select tgd */
hellcreek_select_tgd(hellcreek, port);

/* Disable gating and return to regular switching flow */
@@ -1814,15 +1869,10 @@ static int hellcreek_port_setup_tc(struct dsa_switch *ds, int port,
{
struct tc_taprio_qopt_offload *taprio = type_data;
struct hellcreek *hellcreek = ds->priv;
- int tc;

if (type != TC_SETUP_QDISC_TAPRIO)
return -EOPNOTSUPP;

- for (tc = 0; tc < TC_MAX_QUEUE; tc++)
- if (taprio->max_sdu[tc])
- return -EOPNOTSUPP;
-
if (!hellcreek_validate_schedule(hellcreek, taprio))
return -EOPNOTSUPP;

diff --git a/drivers/net/dsa/hirschmann/hellcreek.h b/drivers/net/dsa/hirschmann/hellcreek.h
index c96b8c278904..66b989d946e4 100644
--- a/drivers/net/dsa/hirschmann/hellcreek.h
+++ b/drivers/net/dsa/hirschmann/hellcreek.h
@@ -37,6 +37,7 @@
#define HELLCREEK_VLAN_UNTAGGED_MEMBER 0x1
#define HELLCREEK_VLAN_TAGGED_MEMBER 0x3
#define HELLCREEK_NUM_EGRESS_QUEUES 8
+#define HELLCREEK_DEFAULT_MAX_SDU 1536

/* Register definitions */
#define HR_MODID_C (0 * 2)
@@ -72,6 +73,12 @@
#define HR_PRTCCFG_PCP_TC_MAP_SHIFT 0
#define HR_PRTCCFG_PCP_TC_MAP_MASK GENMASK(2, 0)

+#define HR_PTPRTCCFG (0xa9 * 2)
+#define HR_PTPRTCCFG_SET_QTRACK BIT(15)
+#define HR_PTPRTCCFG_REJECT BIT(14)
+#define HR_PTPRTCCFG_MAXSDU_SHIFT 0
+#define HR_PTPRTCCFG_MAXSDU_MASK GENMASK(10, 0)
+
#define HR_CSEL (0x8d * 2)
#define HR_CSEL_SHIFT 0
#define HR_CSEL_MASK GENMASK(7, 0)
--
2.30.2


Attachments:
signature.asc (877.00 B)