2020-11-14 04:00:45

by Grygorii Strashko

Subject: [PATCH net-next 0/3] net: ethernet: ti: cpsw: enable broadcast/multicast rate limit support

Hi

This series first adds support for the ALE feature to rate limit the number of
ingress broadcast (BC)/multicast (MC) packets per second, the main purpose of
which is BC/MC storm prevention.

It then enables the corresponding ingress broadcast (BC)/multicast (MC)
rate limiting support in the TI CPSW switchdev and AM65x/J721E CPSW_NUSS
drivers by implementing HW offload for a simple tc-flower policer with matches
on dst_mac:
- ff:ff:ff:ff:ff:ff has to be used for BC rate limiting
- the fixed value 01:00:00:00:00:00 has to be used for MC rate limiting

Since the tc policer defines the rate limit in terms of bits per second, while
the ALE supports limiting in terms of packets per second, the rate limit in
bits/sec is converted to a number of packets per second assuming the minimum
Ethernet packet size ETH_ZLEN=60 bytes.
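
In other words (a sketch of the conversion, mirroring the div_u64() call the
drivers below actually use):

/* tc police hands the driver a rate in bytes/sec (rate_bytes_ps);
 * dividing by the minimum packet size yields packets/sec, e.g.
 * 480 kbit/s = 60000 bytes/s, 60000 / ETH_ZLEN (60) = 1000 pps
 */
pps = div_u64(rate_bytes_ps, ETH_ZLEN);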

The solution is inspired by a patch from Vladimir Oltean [1].

Examples:
- BC rate limit to 1000pps:
tc qdisc add dev eth0 clsact
tc filter add dev eth0 ingress flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \
action police rate 480kbit burst 64k

rate 480kbit = 1000pps * 60 bytes * 8; burst is not used.

- MC rate limit to 20000pps:
tc qdisc add dev eth0 clsact
tc filter add dev eth0 ingress flower skip_sw dst_mac 01:00:00:00:00:00 \
action police rate 9600kbit burst 64k

rate 9600kbit = 20000pps * 60 bytes * 8; burst is not used.

- show: tc filter show dev eth0 ingress
filter protocol all pref 49151 flower chain 0
filter protocol all pref 49151 flower chain 0 handle 0x1
dst_mac ff:ff:ff:ff:ff:ff
skip_sw
in_hw in_hw_count 1
action order 1: police 0x2 rate 480Kbit burst 64Kb mtu 2Kb action reclassify overhead 0b
ref 1 bind 1

filter protocol all pref 49152 flower chain 0
filter protocol all pref 49152 flower chain 0 handle 0x1
dst_mac 01:00:00:00:00:00
skip_sw
in_hw in_hw_count 1
action order 1: police 0x1 rate 9600Kbit burst 64Kb mtu 2Kb action reclassify overhead 0b
ref 1 bind 1

Testing MC with iperf:
- receiver (DUT)
-- setup tc-flower as per above
route add -host 239.255.1.3 eth0
iperf -s -B 239.255.1.3 -u -f m &
cat /sys/class/net/eth0/statistics/rx_packets

- sender
route add -host 239.255.1.3 eth0
iperf -c 239.255.1.3 -u -f m -i 5 -t 30 -l1472 -b121760000 -t1 //~10000pps

[1] https://lore.kernel.org/patchwork/patch/1217254/

Grygorii Strashko (3):
drivers: net: cpsw: ale: add broadcast/multicast rate limit support
net: ethernet: ti: cpsw_new: enable broadcast/multicast rate limit
support
net: ethernet: ti: am65-cpsw: enable broadcast/multicast rate limit
support

drivers/net/ethernet/ti/am65-cpsw-qos.c | 148 ++++++++++++++++++++
drivers/net/ethernet/ti/am65-cpsw-qos.h | 8 ++
drivers/net/ethernet/ti/cpsw_ale.c | 66 +++++++++
drivers/net/ethernet/ti/cpsw_ale.h | 2 +
drivers/net/ethernet/ti/cpsw_new.c | 4 +-
drivers/net/ethernet/ti/cpsw_priv.c | 171 ++++++++++++++++++++++++
drivers/net/ethernet/ti/cpsw_priv.h | 8 ++
7 files changed, 406 insertions(+), 1 deletion(-)

--
2.17.1


2020-11-14 04:00:45

by Grygorii Strashko

Subject: [PATCH net-next 1/3] drivers: net: cpsw: ale: add broadcast/multicast rate limit support

The CPSW ALE supports a feature to rate limit the number of ingress
broadcast (BC)/multicast (MC) packets per second, the main purpose of which is
BC/MC storm prevention.

The ALE BC/MC packet rate limit configuration consists of two parts:
- global:
ALE_CONTROL.ENABLE_RATE_LIMIT bit 0, which enables rate limiting globally;
ALE_PRESCALE.PRESCALE, which specifies the rate limiting interval.
- per-port:
ALE_PORTCTLx.BCAST/MCAST_LIMIT, which specifies the number of BC/MC packets
allowed per rate limiting interval.
When port.BCAST/MCAST_LIMIT is 0, rate limiting is disabled for the port.

When BC/MC packet rate limiting is enabled, the number of allowed packets
per second is defined as:
number_of_packets/sec = (Fclk / ALE_PRESCALE) * port.BCAST/MCAST_LIMIT

Since the ALE_PRESCALE configuration is common for all ports, a 1ms
interval is selected and configured during ALE initialization, while
port.BCAST/MCAST_LIMIT is configured per port.
This allows achieving:
- min number_of_packets = 1000 when port.BCAST/MCAST_LIMIT = 1
- max number_of_packets = 1000 * 255 = 255000
when port.BCAST/MCAST_LIMIT = 0xFF
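
A worked example (assuming, purely for illustration, a 125MHz ALE functional
clock; the real value comes from ale_params.bus_freq):

/* sketch of the init-time math for an assumed Fclk = 125MHz */
ale_prescale = 125000000 / 1000;  /* / ALE_RATE_LIMIT_MIN_PPS -> 125000, i.e. 1ms */
/* then, with port.BCAST_LIMIT = 20:
 * number_of_packets/sec = (125000000 / 125000) * 20 = 1000 * 20 = 20000 pps
 */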

ALE_CONTROL.ENABLE_RATE_LIMIT can also be enabled once during ALE
initialization, since rate limiting is effectively enabled by non-zero
port.BCAST/MCAST_LIMIT values.

This patch implements the above logic in the ALE code and adds new ALE APIs:
cpsw_ale_rx_ratelimit_bc();
cpsw_ale_rx_ratelimit_mc();

Signed-off-by: Grygorii Strashko <[email protected]>
---
drivers/net/ethernet/ti/cpsw_ale.c | 66 ++++++++++++++++++++++++++++++
drivers/net/ethernet/ti/cpsw_ale.h | 2 +
2 files changed, 68 insertions(+)

diff --git a/drivers/net/ethernet/ti/cpsw_ale.c b/drivers/net/ethernet/ti/cpsw_ale.c
index cdc308a2aa3e..771e4d9f98ab 100644
--- a/drivers/net/ethernet/ti/cpsw_ale.c
+++ b/drivers/net/ethernet/ti/cpsw_ale.c
@@ -50,6 +50,8 @@
/* ALE_AGING_TIMER */
#define ALE_AGING_TIMER_MASK GENMASK(23, 0)

+#define ALE_RATE_LIMIT_MIN_PPS 1000
+
/**
* struct ale_entry_fld - The ALE tbl entry field description
* @start_bit: field start bit
@@ -1136,6 +1138,50 @@ int cpsw_ale_control_get(struct cpsw_ale *ale, int port, int control)
return tmp & BITMASK(info->bits);
}

+int cpsw_ale_rx_ratelimit_mc(struct cpsw_ale *ale, int port, unsigned int ratelimit_pps)
+
+{
+ int val = ratelimit_pps / ALE_RATE_LIMIT_MIN_PPS;
+ u32 remainder = ratelimit_pps % ALE_RATE_LIMIT_MIN_PPS;
+
+ if (ratelimit_pps && !val) {
+ dev_err(ale->params.dev, "ALE MC port:%d ratelimit min value 1000pps\n", port);
+ return -EINVAL;
+ }
+
+ if (remainder)
+ dev_info(ale->params.dev, "ALE port:%d MC ratelimit set to %dpps (requested %d)\n",
+ port, ratelimit_pps - remainder, ratelimit_pps);
+
+ cpsw_ale_control_set(ale, port, ALE_PORT_MCAST_LIMIT, val);
+
+ dev_dbg(ale->params.dev, "ALE port:%d MC ratelimit set %d\n",
+ port, val * ALE_RATE_LIMIT_MIN_PPS);
+ return 0;
+}
+
+int cpsw_ale_rx_ratelimit_bc(struct cpsw_ale *ale, int port, unsigned int ratelimit_pps)
+
+{
+ int val = ratelimit_pps / ALE_RATE_LIMIT_MIN_PPS;
+ u32 remainder = ratelimit_pps % ALE_RATE_LIMIT_MIN_PPS;
+
+ if (ratelimit_pps && !val) {
+ dev_err(ale->params.dev, "ALE port:%d BC ratelimit min value 1000pps\n", port);
+ return -EINVAL;
+ }
+
+ if (remainder)
+ dev_info(ale->params.dev, "ALE port:%d BC ratelimit set to %dpps (requested %d)\n",
+ port, ratelimit_pps - remainder, ratelimit_pps);
+
+ cpsw_ale_control_set(ale, port, ALE_PORT_BCAST_LIMIT, val);
+
+ dev_dbg(ale->params.dev, "ALE port:%d BC ratelimit set %d\n",
+ port, val * ALE_RATE_LIMIT_MIN_PPS);
+ return 0;
+}
+
static void cpsw_ale_timer(struct timer_list *t)
{
struct cpsw_ale *ale = from_timer(ale, t, timer);
@@ -1199,6 +1245,26 @@ static void cpsw_ale_aging_stop(struct cpsw_ale *ale)

void cpsw_ale_start(struct cpsw_ale *ale)
{
+ unsigned long ale_prescale;
+
+ /* configure Broadcast and Multicast Rate Limit
+ * number_of_packets = (Fclk / ALE_PRESCALE) * port.BCAST/MCAST_LIMIT
+ * ALE_PRESCALE width is 19bit and min value 0x10
+ * port.BCAST/MCAST_LIMIT is 8bit
+ *
+ * To support multi-port configurations the ALE_PRESCALE is set to a 1ms interval,
+ * which allows configuring port.BCAST/MCAST_LIMIT per port to achieve:
+ * min number_of_packets = 1000 when port.BCAST/MCAST_LIMIT = 1
+ * max number_of_packets = 1000 * 255 = 255000 when port.BCAST/MCAST_LIMIT = 0xFF
+ */
+ ale_prescale = ale->params.bus_freq / ALE_RATE_LIMIT_MIN_PPS;
+ writel((u32)ale_prescale, ale->params.ale_regs + ALE_PRESCALE);
+
+ /* Allow MC/BC rate limiting globally.
+ * The actual rate limit cfg is enabled per-port by port.BCAST/MCAST_LIMIT
+ */
+ cpsw_ale_control_set(ale, 0, ALE_RATE_LIMIT, 1);
+
cpsw_ale_control_set(ale, 0, ALE_ENABLE, 1);
cpsw_ale_control_set(ale, 0, ALE_CLEAR, 1);

diff --git a/drivers/net/ethernet/ti/cpsw_ale.h b/drivers/net/ethernet/ti/cpsw_ale.h
index 13fe47687fde..aba4572cfa3b 100644
--- a/drivers/net/ethernet/ti/cpsw_ale.h
+++ b/drivers/net/ethernet/ti/cpsw_ale.h
@@ -120,6 +120,8 @@ int cpsw_ale_add_vlan(struct cpsw_ale *ale, u16 vid, int port, int untag,
int reg_mcast, int unreg_mcast);
int cpsw_ale_del_vlan(struct cpsw_ale *ale, u16 vid, int port);
void cpsw_ale_set_allmulti(struct cpsw_ale *ale, int allmulti, int port);
+int cpsw_ale_rx_ratelimit_bc(struct cpsw_ale *ale, int port, unsigned int ratelimit_pps);
+int cpsw_ale_rx_ratelimit_mc(struct cpsw_ale *ale, int port, unsigned int ratelimit_pps);

int cpsw_ale_control_get(struct cpsw_ale *ale, int port, int control);
int cpsw_ale_control_set(struct cpsw_ale *ale, int port,
--
2.17.1

2020-11-14 04:01:14

by Grygorii Strashko

Subject: [PATCH net-next 2/3] net: ethernet: ti: cpsw_new: enable broadcast/multicast rate limit support

This patch enables support for ingress broadcast (BC)/multicast (MC) rate
limiting in the TI CPSW switchdev driver (the corresponding ALE support was
added in the previous patch) by implementing HW offload for a simple tc-flower
policer with matches on dst_mac:
- ff:ff:ff:ff:ff:ff has to be used for BC rate limiting
- the fixed value 01:00:00:00:00:00 has to be used for MC rate limiting

Since the tc policer defines the rate limit in terms of bits per second, while
the ALE supports limiting in terms of packets per second, the rate limit in
bits/sec is converted to a number of packets per second assuming the minimum
Ethernet packet size ETH_ZLEN=60 bytes.

Examples:
- BC rate limit to 1000pps:
tc qdisc add dev eth0 clsact
tc filter add dev eth0 ingress flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \
action police rate 480kbit burst 64k

rate 480kbit = 1000pps * 60 bytes * 8; burst is not used.

- MC rate limit to 20000pps:
tc qdisc add dev eth0 clsact
tc filter add dev eth0 ingress flower skip_sw dst_mac 01:00:00:00:00:00 \
action police rate 9600kbit burst 64k

rate 9600kbit = 20000pps * 60 bytes * 8; burst is not used.

Signed-off-by: Grygorii Strashko <[email protected]>
---
drivers/net/ethernet/ti/cpsw_new.c | 4 +-
drivers/net/ethernet/ti/cpsw_priv.c | 171 ++++++++++++++++++++++++++++
drivers/net/ethernet/ti/cpsw_priv.h | 8 ++
3 files changed, 182 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/ti/cpsw_new.c b/drivers/net/ethernet/ti/cpsw_new.c
index 2f5e0ad23ad7..6fad5a5461f6 100644
--- a/drivers/net/ethernet/ti/cpsw_new.c
+++ b/drivers/net/ethernet/ti/cpsw_new.c
@@ -505,6 +505,8 @@ static void cpsw_restore(struct cpsw_priv *priv)

/* restore CBS offload */
cpsw_cbs_resume(&cpsw->slaves[priv->emac_port - 1], priv);
+
+ cpsw_qos_clsflower_resume(priv);
}

static void cpsw_init_stp_ale_entry(struct cpsw_common *cpsw)
@@ -1418,7 +1420,7 @@ static int cpsw_create_ports(struct cpsw_common *cpsw)
cpsw->slaves[i].ndev = ndev;

ndev->features |= NETIF_F_HW_VLAN_CTAG_FILTER |
- NETIF_F_HW_VLAN_CTAG_RX | NETIF_F_NETNS_LOCAL;
+ NETIF_F_HW_VLAN_CTAG_RX | NETIF_F_NETNS_LOCAL | NETIF_F_HW_TC;

ndev->netdev_ops = &cpsw_netdev_ops;
ndev->ethtool_ops = &cpsw_ethtool_ops;
diff --git a/drivers/net/ethernet/ti/cpsw_priv.c b/drivers/net/ethernet/ti/cpsw_priv.c
index 31c5e36ff706..0908d476b854 100644
--- a/drivers/net/ethernet/ti/cpsw_priv.c
+++ b/drivers/net/ethernet/ti/cpsw_priv.c
@@ -502,6 +502,7 @@ int cpsw_init_common(struct cpsw_common *cpsw, void __iomem *ss_regs,
ale_params.ale_ageout = ale_ageout;
ale_params.ale_ports = CPSW_ALE_PORTS_NUM;
ale_params.dev_id = "cpsw";
+ ale_params.bus_freq = cpsw->bus_freq_mhz * 1000000;

cpsw->ale = cpsw_ale_create(&ale_params);
if (IS_ERR(cpsw->ale)) {
@@ -1046,6 +1047,8 @@ static int cpsw_set_mqprio(struct net_device *ndev, void *type_data)
return 0;
}

+static int cpsw_qos_setup_tc_block(struct net_device *ndev, struct flow_block_offload *f);
+
int cpsw_ndo_setup_tc(struct net_device *ndev, enum tc_setup_type type,
void *type_data)
{
@@ -1056,6 +1059,9 @@ int cpsw_ndo_setup_tc(struct net_device *ndev, enum tc_setup_type type,
case TC_SETUP_QDISC_MQPRIO:
return cpsw_set_mqprio(ndev, type_data);

+ case TC_SETUP_BLOCK:
+ return cpsw_qos_setup_tc_block(ndev, type_data);
+
default:
return -EOPNOTSUPP;
}
@@ -1383,3 +1389,168 @@ int cpsw_run_xdp(struct cpsw_priv *priv, int ch, struct xdp_buff *xdp,
page_pool_recycle_direct(cpsw->page_pool[ch], page);
return ret;
}
+
+static int cpsw_qos_clsflower_add_policer(struct cpsw_priv *priv,
+ struct netlink_ext_ack *extack,
+ struct flow_cls_offload *cls,
+ u64 rate_bytes_ps)
+{
+ struct flow_rule *rule = flow_cls_offload_flow_rule(cls);
+ struct flow_dissector *dissector = rule->match.dissector;
+ u8 null_mac[] = {0x00, 0x00, 0x00, 0x00, 0x00, 0x00};
+ u8 bc_mac[] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};
+ u8 mc_mac[] = {0x01, 0x00, 0x00, 0x00, 0x00, 0x00};
+ struct flow_match_eth_addrs match;
+ u32 pps, port_id;
+ int ret;
+
+ if (dissector->used_keys &
+ ~(BIT(FLOW_DISSECTOR_KEY_BASIC) |
+ BIT(FLOW_DISSECTOR_KEY_CONTROL) |
+ BIT(FLOW_DISSECTOR_KEY_ETH_ADDRS))) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "Unsupported keys used");
+ return -EOPNOTSUPP;
+ }
+
+ if (!flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_ETH_ADDRS)) {
+ NL_SET_ERR_MSG_MOD(extack, "Not matching on eth address");
+ return -EOPNOTSUPP;
+ }
+
+ flow_rule_match_eth_addrs(rule, &match);
+
+ if (!ether_addr_equal_masked(match.key->src, null_mac,
+ match.mask->src)) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "Matching on source MAC not supported");
+ return -EOPNOTSUPP;
+ }
+
+ port_id = cpsw_slave_index(priv->cpsw, priv) + 1;
+ /* Calculate number of packets per second for given bps
+ * assuming min ethernet packet size
+ */
+ pps = div_u64(rate_bytes_ps, ETH_ZLEN);
+
+ if (ether_addr_equal(match.key->dst, bc_mac)) {
+ ret = cpsw_ale_rx_ratelimit_bc(priv->cpsw->ale, port_id, pps);
+ if (ret)
+ return ret;
+
+ priv->ale_bc_ratelimit.cookie = cls->cookie;
+ priv->ale_bc_ratelimit.rate_packet_ps = pps;
+ }
+
+ if (ether_addr_equal(match.key->dst, mc_mac)) {
+ ret = cpsw_ale_rx_ratelimit_mc(priv->cpsw->ale, port_id, pps);
+ if (ret)
+ return ret;
+
+ priv->ale_mc_ratelimit.cookie = cls->cookie;
+ priv->ale_mc_ratelimit.rate_packet_ps = pps;
+ }
+
+ return 0;
+}
+
+static int cpsw_qos_configure_clsflower(struct cpsw_priv *priv, struct flow_cls_offload *cls)
+{
+ struct flow_rule *rule = flow_cls_offload_flow_rule(cls);
+ struct netlink_ext_ack *extack = cls->common.extack;
+ const struct flow_action_entry *act;
+ int i;
+
+ flow_action_for_each(i, act, &rule->action) {
+ switch (act->id) {
+ case FLOW_ACTION_POLICE:
+ return cpsw_qos_clsflower_add_policer(priv, extack, cls,
+ act->police.rate_bytes_ps);
+ default:
+ NL_SET_ERR_MSG_MOD(extack, "Action not supported");
+ return -EOPNOTSUPP;
+ }
+ }
+ return -EOPNOTSUPP;
+}
+
+static int cpsw_qos_delete_clsflower(struct cpsw_priv *priv, struct flow_cls_offload *cls)
+{
+ u32 port_id = cpsw_slave_index(priv->cpsw, priv) + 1;
+
+ if (cls->cookie == priv->ale_bc_ratelimit.cookie) {
+ priv->ale_bc_ratelimit.cookie = 0;
+ priv->ale_bc_ratelimit.rate_packet_ps = 0;
+ cpsw_ale_rx_ratelimit_bc(priv->cpsw->ale, port_id, 0);
+ }
+
+ if (cls->cookie == priv->ale_mc_ratelimit.cookie) {
+ priv->ale_mc_ratelimit.cookie = 0;
+ priv->ale_mc_ratelimit.rate_packet_ps = 0;
+ cpsw_ale_rx_ratelimit_mc(priv->cpsw->ale, port_id, 0);
+ }
+
+ return 0;
+}
+
+static int cpsw_qos_setup_tc_clsflower(struct cpsw_priv *priv, struct flow_cls_offload *cls_flower)
+{
+ switch (cls_flower->command) {
+ case FLOW_CLS_REPLACE:
+ return cpsw_qos_configure_clsflower(priv, cls_flower);
+ case FLOW_CLS_DESTROY:
+ return cpsw_qos_delete_clsflower(priv, cls_flower);
+ default:
+ return -EOPNOTSUPP;
+ }
+}
+
+static int cpsw_qos_setup_tc_block_cb(enum tc_setup_type type, void *type_data, void *cb_priv)
+{
+ struct cpsw_priv *priv = cb_priv;
+ int ret;
+
+ if (!tc_cls_can_offload_and_chain0(priv->ndev, type_data))
+ return -EOPNOTSUPP;
+
+ ret = pm_runtime_get_sync(priv->dev);
+ if (ret < 0) {
+ pm_runtime_put_noidle(priv->dev);
+ return ret;
+ }
+
+ switch (type) {
+ case TC_SETUP_CLSFLOWER:
+ ret = cpsw_qos_setup_tc_clsflower(priv, type_data);
+ break;
+ default:
+ ret = -EOPNOTSUPP;
+ }
+
+ pm_runtime_put(priv->dev);
+ return ret;
+}
+
+static LIST_HEAD(cpsw_qos_block_cb_list);
+
+static int cpsw_qos_setup_tc_block(struct net_device *ndev, struct flow_block_offload *f)
+{
+ struct cpsw_priv *priv = netdev_priv(ndev);
+
+ return flow_block_cb_setup_simple(f, &cpsw_qos_block_cb_list,
+ cpsw_qos_setup_tc_block_cb,
+ priv, priv, true);
+}
+
+void cpsw_qos_clsflower_resume(struct cpsw_priv *priv)
+{
+ u32 port_id = cpsw_slave_index(priv->cpsw, priv) + 1;
+
+ if (priv->ale_bc_ratelimit.cookie)
+ cpsw_ale_rx_ratelimit_bc(priv->cpsw->ale, port_id,
+ priv->ale_bc_ratelimit.rate_packet_ps);
+
+ if (priv->ale_mc_ratelimit.cookie)
+ cpsw_ale_rx_ratelimit_mc(priv->cpsw->ale, port_id,
+ priv->ale_mc_ratelimit.rate_packet_ps);
+}
diff --git a/drivers/net/ethernet/ti/cpsw_priv.h b/drivers/net/ethernet/ti/cpsw_priv.h
index 7b7f3596b20d..9e2fac0873c9 100644
--- a/drivers/net/ethernet/ti/cpsw_priv.h
+++ b/drivers/net/ethernet/ti/cpsw_priv.h
@@ -361,6 +361,11 @@ struct cpsw_common {
u8 base_mac[ETH_ALEN];
};

+struct cpsw_ale_ratelimit {
+ unsigned long cookie;
+ u64 rate_packet_ps;
+};
+
struct cpsw_priv {
struct net_device *ndev;
struct device *dev;
@@ -380,6 +385,8 @@ struct cpsw_priv {
u32 emac_port;
struct cpsw_common *cpsw;
int offload_fwd_mark;
+ struct cpsw_ale_ratelimit ale_bc_ratelimit;
+ struct cpsw_ale_ratelimit ale_mc_ratelimit;
};

#define ndev_to_cpsw(ndev) (((struct cpsw_priv *)netdev_priv(ndev))->cpsw)
@@ -458,6 +465,7 @@ int cpsw_ndo_setup_tc(struct net_device *ndev, enum tc_setup_type type,
bool cpsw_shp_is_off(struct cpsw_priv *priv);
void cpsw_cbs_resume(struct cpsw_slave *slave, struct cpsw_priv *priv);
void cpsw_mqprio_resume(struct cpsw_slave *slave, struct cpsw_priv *priv);
+void cpsw_qos_clsflower_resume(struct cpsw_priv *priv);

/* ethtool */
u32 cpsw_get_msglevel(struct net_device *ndev);
--
2.17.1

2020-11-14 04:03:08

by Grygorii Strashko

Subject: [PATCH net-next 3/3] net: ethernet: ti: am65-cpsw: enable broadcast/multicast rate limit support

This patch enables support for ingress broadcast (BC)/multicast (MC) rate
limiting in the TI AM65x CPSW driver (the corresponding ALE support was added
in an earlier patch) by implementing HW offload for a simple tc-flower policer
with matches on dst_mac:
- ff:ff:ff:ff:ff:ff has to be used for BC rate limiting
- the fixed value 01:00:00:00:00:00 has to be used for MC rate limiting

Since the tc policer defines the rate limit in terms of bits per second, while
the ALE supports limiting in terms of packets per second, the rate limit in
bits/sec is converted to a number of packets per second assuming the minimum
Ethernet packet size ETH_ZLEN=60 bytes.

Examples:
- BC rate limit to 1000pps:
tc qdisc add dev eth0 clsact
tc filter add dev eth0 ingress flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \
action police rate 480kbit burst 64k

rate 480kbit = 1000pps * 60 bytes * 8; burst is not used.

- MC rate limit to 20000pps:
tc qdisc add dev eth0 clsact
tc filter add dev eth0 ingress flower skip_sw dst_mac 01:00:00:00:00:00 \
action police rate 9600kbit burst 64k

rate 9600kbit = 20000pps * 60 bytes * 8; burst is not used.

Signed-off-by: Grygorii Strashko <[email protected]>
---
drivers/net/ethernet/ti/am65-cpsw-qos.c | 148 ++++++++++++++++++++++++
drivers/net/ethernet/ti/am65-cpsw-qos.h | 8 ++
2 files changed, 156 insertions(+)

diff --git a/drivers/net/ethernet/ti/am65-cpsw-qos.c b/drivers/net/ethernet/ti/am65-cpsw-qos.c
index 3bdd4dbcd2ff..a06207233cd5 100644
--- a/drivers/net/ethernet/ti/am65-cpsw-qos.c
+++ b/drivers/net/ethernet/ti/am65-cpsw-qos.c
@@ -8,10 +8,12 @@

#include <linux/pm_runtime.h>
#include <linux/time.h>
+#include <net/pkt_cls.h>

#include "am65-cpsw-nuss.h"
#include "am65-cpsw-qos.h"
#include "am65-cpts.h"
+#include "cpsw_ale.h"

#define AM65_CPSW_REG_CTL 0x004
#define AM65_CPSW_PN_REG_CTL 0x004
@@ -588,12 +590,158 @@ static int am65_cpsw_setup_taprio(struct net_device *ndev, void *type_data)
return am65_cpsw_set_taprio(ndev, type_data);
}

+static int am65_cpsw_qos_clsflower_add_policer(struct am65_cpsw_port *port,
+ struct netlink_ext_ack *extack,
+ struct flow_cls_offload *cls,
+ u64 rate_bytes_ps)
+{
+ struct flow_rule *rule = flow_cls_offload_flow_rule(cls);
+ struct flow_dissector *dissector = rule->match.dissector;
+ u8 null_mac[] = {0x00, 0x00, 0x00, 0x00, 0x00, 0x00};
+ u8 bc_mac[] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};
+ u8 mc_mac[] = {0x01, 0x00, 0x00, 0x00, 0x00, 0x00};
+ struct am65_cpsw_qos *qos = &port->qos;
+ struct flow_match_eth_addrs match;
+ u32 pps;
+ int ret;
+
+ if (dissector->used_keys &
+ ~(BIT(FLOW_DISSECTOR_KEY_BASIC) |
+ BIT(FLOW_DISSECTOR_KEY_CONTROL) |
+ BIT(FLOW_DISSECTOR_KEY_ETH_ADDRS))) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "Unsupported keys used");
+ return -EOPNOTSUPP;
+ }
+
+ if (!flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_ETH_ADDRS)) {
+ NL_SET_ERR_MSG_MOD(extack, "Not matching on eth address");
+ return -EOPNOTSUPP;
+ }
+
+ flow_rule_match_eth_addrs(rule, &match);
+
+ if (!ether_addr_equal_masked(match.key->src, null_mac,
+ match.mask->src)) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "Matching on source MAC not supported");
+ return -EOPNOTSUPP;
+ }
+
+ /* Calculate number of packets per second for given bps
+ * assuming min ethernet packet size
+ */
+ pps = div_u64(rate_bytes_ps, ETH_ZLEN);
+
+ if (ether_addr_equal(match.key->dst, bc_mac)) {
+ ret = cpsw_ale_rx_ratelimit_bc(port->common->ale, port->port_id, pps);
+ if (ret)
+ return ret;
+
+ qos->ale_bc_ratelimit.cookie = cls->cookie;
+ qos->ale_bc_ratelimit.rate_packet_ps = pps;
+ }
+
+ if (ether_addr_equal(match.key->dst, mc_mac)) {
+ ret = cpsw_ale_rx_ratelimit_mc(port->common->ale, port->port_id, pps);
+ if (ret)
+ return ret;
+
+ qos->ale_mc_ratelimit.cookie = cls->cookie;
+ qos->ale_mc_ratelimit.rate_packet_ps = pps;
+ }
+
+ return 0;
+}
+
+static int am65_cpsw_qos_configure_clsflower(struct am65_cpsw_port *port,
+ struct flow_cls_offload *cls)
+{
+ struct flow_rule *rule = flow_cls_offload_flow_rule(cls);
+ struct netlink_ext_ack *extack = cls->common.extack;
+ const struct flow_action_entry *act;
+ int i;
+
+ flow_action_for_each(i, act, &rule->action) {
+ switch (act->id) {
+ case FLOW_ACTION_POLICE:
+ return am65_cpsw_qos_clsflower_add_policer(port, extack, cls,
+ act->police.rate_bytes_ps);
+ default:
+ NL_SET_ERR_MSG_MOD(extack,
+ "Action not supported");
+ return -EOPNOTSUPP;
+ }
+ }
+ return -EOPNOTSUPP;
+}
+
+static int am65_cpsw_qos_delete_clsflower(struct am65_cpsw_port *port, struct flow_cls_offload *cls)
+{
+ struct am65_cpsw_qos *qos = &port->qos;
+
+ if (cls->cookie == qos->ale_bc_ratelimit.cookie) {
+ qos->ale_bc_ratelimit.cookie = 0;
+ qos->ale_bc_ratelimit.rate_packet_ps = 0;
+ cpsw_ale_rx_ratelimit_bc(port->common->ale, port->port_id, 0);
+ }
+
+ if (cls->cookie == qos->ale_mc_ratelimit.cookie) {
+ qos->ale_mc_ratelimit.cookie = 0;
+ qos->ale_mc_ratelimit.rate_packet_ps = 0;
+ cpsw_ale_rx_ratelimit_mc(port->common->ale, port->port_id, 0);
+ }
+
+ return 0;
+}
+
+static int am65_cpsw_qos_setup_tc_clsflower(struct am65_cpsw_port *port,
+ struct flow_cls_offload *cls_flower)
+{
+ switch (cls_flower->command) {
+ case FLOW_CLS_REPLACE:
+ return am65_cpsw_qos_configure_clsflower(port, cls_flower);
+ case FLOW_CLS_DESTROY:
+ return am65_cpsw_qos_delete_clsflower(port, cls_flower);
+ default:
+ return -EOPNOTSUPP;
+ }
+}
+
+static int am65_cpsw_qos_setup_tc_block_cb(enum tc_setup_type type, void *type_data, void *cb_priv)
+{
+ struct am65_cpsw_port *port = cb_priv;
+
+ if (!tc_cls_can_offload_and_chain0(port->ndev, type_data))
+ return -EOPNOTSUPP;
+
+ switch (type) {
+ case TC_SETUP_CLSFLOWER:
+ return am65_cpsw_qos_setup_tc_clsflower(port, type_data);
+ default:
+ return -EOPNOTSUPP;
+ }
+}
+
+static LIST_HEAD(am65_cpsw_qos_block_cb_list);
+
+static int am65_cpsw_qos_setup_tc_block(struct net_device *ndev, struct flow_block_offload *f)
+{
+ struct am65_cpsw_port *port = am65_ndev_to_port(ndev);
+
+ return flow_block_cb_setup_simple(f, &am65_cpsw_qos_block_cb_list,
+ am65_cpsw_qos_setup_tc_block_cb,
+ port, port, true);
+}
+
int am65_cpsw_qos_ndo_setup_tc(struct net_device *ndev, enum tc_setup_type type,
void *type_data)
{
switch (type) {
case TC_SETUP_QDISC_TAPRIO:
return am65_cpsw_setup_taprio(ndev, type_data);
+ case TC_SETUP_BLOCK:
+ return am65_cpsw_qos_setup_tc_block(ndev, type_data);
default:
return -EOPNOTSUPP;
}
diff --git a/drivers/net/ethernet/ti/am65-cpsw-qos.h b/drivers/net/ethernet/ti/am65-cpsw-qos.h
index e8f1b6b59e93..fb223b43b196 100644
--- a/drivers/net/ethernet/ti/am65-cpsw-qos.h
+++ b/drivers/net/ethernet/ti/am65-cpsw-qos.h
@@ -14,11 +14,19 @@ struct am65_cpsw_est {
struct tc_taprio_qopt_offload taprio;
};

+struct am65_cpsw_ale_ratelimit {
+ unsigned long cookie;
+ u64 rate_packet_ps;
+};
+
struct am65_cpsw_qos {
struct am65_cpsw_est *est_admin;
struct am65_cpsw_est *est_oper;
ktime_t link_down_time;
int link_speed;
+
+ struct am65_cpsw_ale_ratelimit ale_bc_ratelimit;
+ struct am65_cpsw_ale_ratelimit ale_mc_ratelimit;
};

int am65_cpsw_qos_ndo_setup_tc(struct net_device *ndev, enum tc_setup_type type,
--
2.17.1

2020-11-14 19:11:40

by Vladimir Oltean

Subject: Re: [PATCH net-next 2/3] net: ethernet: ti: cpsw_new: enable broadcast/multicast rate limit support

On Sat, Nov 14, 2020 at 05:56:53AM +0200, Grygorii Strashko wrote:
> This patch enables support for ingress broadcast(BC)/multicast(MC) rate limiting
> in TI CPSW switchdev driver (the corresponding ALE support was added in previous
> patch) by implementing HW offload for simple tc-flower policer with matches
> on dst_mac:
> - ff:ff:ff:ff:ff:ff has to be used for BC rate limiting
> - 01:00:00:00:00:00 fixed value has to be used for MC rate limiting
>
> Hence tc policer defines rate limit in terms of bits per second, but the
> ALE supports limiting in terms of packets per second - the rate limit
> bits/sec is converted to number of packets per second assuming minimum
> Ethernet packet size ETH_ZLEN=60 bytes.
>
> Examples:
> - BC rate limit to 1000pps:
> tc qdisc add dev eth0 clsact
> tc filter add dev eth0 ingress flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \
> action police rate 480kbit burst 64k
>
> rate 480kbit - 1000pps * 60 bytes * 8, burst - not used.
>
> - MC rate limit to 20000pps:
> tc qdisc add dev eth0 clsact
> tc filter add dev eth0 ingress flower skip_sw dst_mac 01:00:00:00:00:00 \
> action police rate 9600kbit burst 64k
>
> rate 9600kbit - 20000pps * 60 bytes * 8, burst - not used.
>
> Signed-off-by: Grygorii Strashko <[email protected]>
> ---

Your example for multicast would actually be correct if you specified
the mask as well. Like this:

tc filter add dev eth0 ingress flower skip_sw \
dst_mac 01:00:00:00:00:00/01:00:00:00:00:00 \
action police rate 9600kbit burst 64k

But as things stand, the flow rule would have a certain meaning in
software (rate-limit only that particular multicast MAC address) and a
different meaning in hardware. Please modify the driver code to also
match on the mask.
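
For illustration, the requested check could look roughly like this (a
hypothetical sketch against the driver code above, not a tested change):

/* accept the MC rule only when the mask also carries the multicast bit,
 * so the rule means the same thing in hardware and in software
 */
if (ether_addr_equal(match.key->dst, mc_mac) &&
    ether_addr_equal(match.mask->dst, mc_mac)) {
	/* offload the MC rate limit as before */
}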

2020-11-14 19:20:47

by Vladimir Oltean

Subject: Re: [PATCH net-next 3/3] net: ethernet: ti: am65-cpsw: enable broadcast/multicast rate limit support

On Sat, Nov 14, 2020 at 05:56:54AM +0200, Grygorii Strashko wrote:
> This patch enables support for ingress broadcast(BC)/multicast(MC) rate limiting
> in TI AM65x CPSW driver (the corresponding ALE support was added in previous
> patch) by implementing HW offload for simple tc-flower policer with matches
> on dst_mac:
> - ff:ff:ff:ff:ff:ff has to be used for BC rate limiting
> - 01:00:00:00:00:00 fixed value has to be used for MC rate limiting
>
> Hence tc policer defines rate limit in terms of bits per second, but the
> ALE supports limiting in terms of packets per second - the rate limit
> bits/sec is converted to number of packets per second assuming minimum
> Ethernet packet size ETH_ZLEN=60 bytes.
>
> Examples:
> - BC rate limit to 1000pps:
> tc qdisc add dev eth0 clsact
> tc filter add dev eth0 ingress flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \
> action police rate 480kbit burst 64k
>
> rate 480kbit - 1000pps * 60 bytes * 8, burst - not used.
>
> - MC rate limit to 20000pps:
> tc qdisc add dev eth0 clsact
> tc filter add dev eth0 ingress flower skip_sw dst_mac 01:00:00:00:00:00 \
> action police rate 9600kbit burst 64k
>
> rate 9600kbit - 20000pps * 60 bytes * 8, burst - not used.
>
> Signed-off-by: Grygorii Strashko <[email protected]>
> ---

I understand this is unpleasant feedback, but don't you want to extend
tc-police to have an option to rate-limit based on packet count and not
based on byte count? The assumption you make in the driver that the
packets are all going to be minimum-sized is not a great one. I can
imagine that the user's policer budget is vastly exceeded if they enable
jumbo frames and they put a policer at 9.6 Mbps, and this is not at all
according to their expectation. 20Kpps assuming 60 bytes per packet
might be 9.6 Mbps, and the user will assume this bandwidth profile is
not exceeded, that's the whole point. But 20Kpps assuming 9KB per packet
is 1.44Gbps. Weird.

2020-11-16 13:05:26

by Grygorii Strashko

Subject: Re: [PATCH net-next 2/3] net: ethernet: ti: cpsw_new: enable broadcast/multicast rate limit support



On 14/11/2020 21:09, Vladimir Oltean wrote:
> On Sat, Nov 14, 2020 at 05:56:53AM +0200, Grygorii Strashko wrote:
>> This patch enables support for ingress broadcast(BC)/multicast(MC) rate limiting
>> in TI CPSW switchdev driver (the corresponding ALE support was added in previous
>> patch) by implementing HW offload for simple tc-flower policer with matches
>> on dst_mac:
>> - ff:ff:ff:ff:ff:ff has to be used for BC rate limiting
>> - 01:00:00:00:00:00 fixed value has to be used for MC rate limiting
>>
>> Hence tc policer defines rate limit in terms of bits per second, but the
>> ALE supports limiting in terms of packets per second - the rate limit
>> bits/sec is converted to number of packets per second assuming minimum
>> Ethernet packet size ETH_ZLEN=60 bytes.
>>
>> Examples:
>> - BC rate limit to 1000pps:
>> tc qdisc add dev eth0 clsact
>> tc filter add dev eth0 ingress flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \
>> action police rate 480kbit burst 64k
>>
>> rate 480kbit - 1000pps * 60 bytes * 8, burst - not used.
>>
>> - MC rate limit to 20000pps:
>> tc qdisc add dev eth0 clsact
>> tc filter add dev eth0 ingress flower skip_sw dst_mac 01:00:00:00:00:00 \
>> action police rate 9600kbit burst 64k
>>
>> rate 9600kbit - 20000pps * 60 bytes * 8, burst - not used.
>>
>> Signed-off-by: Grygorii Strashko <[email protected]>
>> ---
>
> Your example for multicast would actually be correct if you specified
> the mask as well. Like this:
>
> tc filter add dev eth0 ingress flower skip_sw \
> dst_mac 01:00:00:00:00:00/01:00:00:00:00:00 \
> action police rate 9600kbit burst 64k
>
> But as things stand, the flow rule would have a certain meaning in
> software (rate-limit only that particular multicast MAC address) and a
> different meaning in hardware. Please modify the driver code to also
> match on the mask.
>

ok.

--
Best regards,
grygorii

2020-11-16 18:42:16

by Grygorii Strashko

Subject: Re: [PATCH net-next 3/3] net: ethernet: ti: am65-cpsw: enable broadcast/multicast rate limit support



On 14/11/2020 21:17, Vladimir Oltean wrote:
> On Sat, Nov 14, 2020 at 05:56:54AM +0200, Grygorii Strashko wrote:
>> This patch enables support for ingress broadcast(BC)/multicast(MC) rate limiting
>> in TI AM65x CPSW driver (the corresponding ALE support was added in previous
>> patch) by implementing HW offload for simple tc-flower policer with matches
>> on dst_mac:
>> - ff:ff:ff:ff:ff:ff has to be used for BC rate limiting
>> - 01:00:00:00:00:00 fixed value has to be used for MC rate limiting
>>
>> Hence tc policer defines rate limit in terms of bits per second, but the
>> ALE supports limiting in terms of packets per second - the rate limit
>> bits/sec is converted to number of packets per second assuming minimum
>> Ethernet packet size ETH_ZLEN=60 bytes.
>>
>> Examples:
>> - BC rate limit to 1000pps:
>> tc qdisc add dev eth0 clsact
>> tc filter add dev eth0 ingress flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \
>> action police rate 480kbit burst 64k
>>
>> rate 480kbit - 1000pps * 60 bytes * 8, burst - not used.
>>
>> - MC rate limit to 20000pps:
>> tc qdisc add dev eth0 clsact
>> tc filter add dev eth0 ingress flower skip_sw dst_mac 01:00:00:00:00:00 \
>> action police rate 9600kbit burst 64k
>>
>> rate 9600kbit - 20000pps * 60 bytes * 8, burst - not used.
>>
>> Signed-off-by: Grygorii Strashko <[email protected]>
>> ---
>
> I understand this is unpleasant feedback, but don't you want to extend
> tc-police to have an option to rate-limit based on packet count and not
> based on byte count?

Huh.
I'd appreciate it if you could give a more detailed opinion on how that could look.
Sorry, it's my first experience with tc.

> The assumption you make in the driver that the
> packets are all going to be minimum-sized is not a great one.
> I can
> imagine that the user's policer budget is vastly exceeded if they enable
> jumbo frames and they put a policer at 9.6 Mbps, and this is not at all
> according to their expectation. 20Kpps assuming 60 bytes per packet
> might be 9.6 Mbps, and the user will assume this bandwidth profile is
> not exceeded, that's the whole point. But 20Kpps assuming 9KB per packet
> is 1.44Gbps. Weird.

Sorry, I'm not sure I completely understood the above. This is a specific HW
feature, which can limit packet rate only. And it is expected to be applied by
an admin who knows what they are doing.
The main purpose is to preserve CPU resources, which are first of all affected
by packet rate.
So, I see it not as an "assumption", but as a requirement/agreement which will
be reflected in docs and which works for a specific use case determined by the
presence of:
- the skip_sw flag
- a specific dst_mac/dst_mac_mask pair
in which case the rate determines pps and not K/Mbps.


Also, some references to previous discussion: [1] [2]
[1] https://www.spinics.net/lists/netdev/msg494630.html
[2] https://lore.kernel.org/patchwork/patch/481285/

--
Best regards,
grygorii

2020-11-16 21:22:24

by Vladimir Oltean

Subject: Re: [PATCH net-next 3/3] net: ethernet: ti: am65-cpsw: enable broadcast/multicast rate limit support

On Mon, Nov 16, 2020 at 08:39:54PM +0200, Grygorii Strashko wrote:
>
>
> On 14/11/2020 21:17, Vladimir Oltean wrote:
> > On Sat, Nov 14, 2020 at 05:56:54AM +0200, Grygorii Strashko wrote:
> > > This patch enables support for ingress broadcast(BC)/multicast(MC) rate limiting
> > > in TI AM65x CPSW driver (the corresponding ALE support was added in previous
> > > patch) by implementing HW offload for simple tc-flower policer with matches
> > > on dst_mac:
> > > - ff:ff:ff:ff:ff:ff has to be used for BC rate limiting
> > > - 01:00:00:00:00:00 fixed value has to be used for MC rate limiting
> > >
> > > Hence tc policer defines rate limit in terms of bits per second, but the
> > > ALE supports limiting in terms of packets per second - the rate limit
> > > bits/sec is converted to number of packets per second assuming minimum
> > > Ethernet packet size ETH_ZLEN=60 bytes.
> > >
> > > Examples:
> > > - BC rate limit to 1000pps:
> > > tc qdisc add dev eth0 clsact
> > > tc filter add dev eth0 ingress flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \
> > > action police rate 480kbit burst 64k
> > >
> > > rate 480kbit - 1000pps * 60 bytes * 8, burst - not used.
> > >
> > > - MC rate limit to 20000pps:
> > > tc qdisc add dev eth0 clsact
> > > tc filter add dev eth0 ingress flower skip_sw dst_mac 01:00:00:00:00:00 \
> > > action police rate 9600kbit burst 64k
> > >
> > > rate 9600kbit - 20000pps * 60 bytes * 8, burst - not used.
> > >
> > > Signed-off-by: Grygorii Strashko <[email protected]>
> > > ---
> >
> > I understand this is unpleasant feedback, but don't you want to extend
> > tc-police to have an option to rate-limit based on packet count and not
> > based on byte count?
>
> Huh.
> I'd be appreciated if you could provide more detailed opinion of how it can look like?
> Sry, it's my first experience with tc.

Same as above, just in packets per second.

tc qdisc add dev eth0 clsact
tc filter add dev eth0 ingress flower skip_sw \
dst_mac 01:00:00:00:00:00/01:00:00:00:00:00 \
action police rate 20kpps

> > The assumption you make in the driver that the
> > packets are all going to be minimum-sized is not a great one.
> > I can
> > imagine that the user's policer budget is vastly exceeded if they enable
> > jumbo frames and they put a policer at 9.6 Mbps, and this is not at all
> > according to their expectation. 20Kpps assuming 60 bytes per packet
> > might be 9.6 Mbps, and the user will assume this bandwidth profile is
> > not exceeded, that's the whole point. But 20Kpps assuming 9KB per packet
> > is 1.44Gbps. Weird.
>
> Sry, not sure I completely understood above. This is specific HW feature, which can
> imit packet rate only. And it is expected to be applied by admin who know what he is doing.

Yes but you're not helping the admin to "know what he's doing" if you're
asking them to translate apples into oranges. A policer that counts
packets is not equivalent to a policer that counts bytes, unless all
packets are guaranteed to be of equal length, something which you cannot
ensure.

> The main purpose is to preserve CPU resource, which first of all affected by packet rate.
> So, I see it as not "assumption", but requirement/agreement which will be reflected
> in docs and works for a specific use case which is determined by presence of:
> - skip_sw flag
> - specific dst_mac/dst_mac_mask pair
> in which case rate determines pps and not K/Mbps.
>
>
> Also some ref on previous discussion [1] [2]
> [1] https://www.spinics.net/lists/netdev/msg494630.html
> [2] https://lore.kernel.org/patchwork/patch/481285/

ethtool coalescing as a tool to configure a policer makes zero sense.

You are definitely on the right path with the tc-police action. This was
just trying to be constructive feedback that the software implementation
of tc-police needs more work before your hardware could offload its job
in a way that would not violate its semantics.

2020-11-16 23:34:10

by Andrew Lunn

Subject: Re: [PATCH net-next 3/3] net: ethernet: ti: am65-cpsw: enable broadcast/multicast rate limit support

> Same as above, just in packets per second.
>
> tc qdisc add dev eth0 clsact
> tc filter add dev eth0 ingress flower skip_sw \
> dst_mac 01:00:00:00:00:00/01:00:00:00:00:00 \
> action police rate 20kpps

I agree with Vladimir here. Since the hardware does PPS limits, the TC
API should also be PPS limit based. And as you said, CPU load is more
a factor of PPS than BPS, so it is a useful feature in general to
have. You just need to implement the software version first, before
you offload it to the hardware.

Andrew
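
A software pps policer would conceptually be the same token bucket tc-police
already uses, just counting packets instead of bytes. A hypothetical sketch
(assumed names, not the actual tc-police code):

/* hypothetical packet-count token bucket */
struct pps_policer {
	u64 rate_pps;    /* allowed packets per second */
	u64 burst_pkts;  /* bucket depth, in packets */
	u64 tokens;      /* currently available packets */
	u64 last_ns;     /* last refill timestamp */
};

static bool pps_policer_conform(struct pps_policer *p, u64 now_ns)
{
	u64 delta_ns = now_ns - p->last_ns;

	/* refill proportionally to elapsed time, capped at the burst size */
	p->tokens = min(p->burst_pkts,
			p->tokens + div_u64(delta_ns * p->rate_pps, NSEC_PER_SEC));
	p->last_ns = now_ns;

	if (!p->tokens)
		return false;	/* over the limit: drop/reclassify */
	p->tokens--;
	return true;
}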