From: Vladimir Oltean <[email protected]>
This series adds support for 2 types of policers:
- port policers, via tc matchall filter
- flow policers, via tc flower filter
for 2 DSA drivers:
- sja1105
- felix/ocelot
First we start with ocelot/felix. Prior to this patch set, the ocelot
core library only supported:
- Port policers
- Flow-based dropping and trapping
But the felix wrapper could not actually use the port policers due to
missing linkage and support in the DSA core. So one of the patches
addresses exactly that limitation by adding the missing support to the
DSA core. The other patch for felix flow policers (via the VCAP IS2
engine) is actually in the ocelot library itself, since the linkage with
the ocelot flower classifier has already been done in an earlier patch
set.
Then with the newly added .port_policer_add and .port_policer_del, we
can also start supporting the L2 policers on sja1105.
Then, for full functionality of these L2 policers on sja1105, we also
implement a more limited set of flow-based policing keys for this
switch, namely for broadcast and VLAN PCP.
Vladimir Oltean (5):
net: dsa: refactor matchall mirred action to separate function
net: dsa: add port policers
net: dsa: felix: add port policers
net: dsa: sja1105: add configuration of port policers
net: dsa: sja1105: add broadcast and per-traffic class policers
Xiaoliang Yang (1):
net: mscc: ocelot: add action of police on vcap_is2
drivers/net/dsa/ocelot/felix.c | 24 ++
drivers/net/dsa/sja1105/Makefile | 1 +
drivers/net/dsa/sja1105/sja1105.h | 40 +++
drivers/net/dsa/sja1105/sja1105_flower.c | 340 ++++++++++++++++++++++
drivers/net/dsa/sja1105/sja1105_main.c | 136 +++++++--
drivers/net/ethernet/mscc/ocelot_ace.c | 64 +++-
drivers/net/ethernet/mscc/ocelot_ace.h | 4 +
drivers/net/ethernet/mscc/ocelot_flower.c | 9 +
drivers/net/ethernet/mscc/ocelot_police.c | 27 ++
drivers/net/ethernet/mscc/ocelot_police.h | 11 +-
drivers/net/ethernet/mscc/ocelot_tc.c | 2 +-
include/net/dsa.h | 13 +-
include/soc/mscc/ocelot.h | 9 +
net/dsa/slave.c | 145 ++++++---
14 files changed, 742 insertions(+), 83 deletions(-)
create mode 100644 drivers/net/dsa/sja1105/sja1105_flower.c
--
2.17.1
From: Xiaoliang Yang <[email protected]>
Ocelot has 384 policers that can be allocated to ingress ports,
QoS classes per port, and VCAP IS2 entries. ocelot_police.c
supports configuring policers that can be assigned to the police
action of VCAP IS2. We allocate policers starting from the maximum
pol_id, and decrement the pol_id each time a new vcap_is2 entry
with a police action is added.
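As a rough illustration of the allocation scheme described above, the
standalone sketch below models a pool that hands out indices from the
top downwards and keeps the used range contiguous on release. The names
pol_pool, pool_alloc, pool_release and the constant POL_TOP are
hypothetical and not part of the driver; the real code additionally
reprograms the hardware policer and the VCAP IS2 entry of every rule
that is moved.

```c
#include <stddef.h>

/* Hypothetical ceiling: indices at or above this value are never handed out. */
#define POL_TOP 383

struct pol_pool {
	int lpr;	/* lowest policer index in use (POL_TOP if none) */
};

static void pool_init(struct pol_pool *p)
{
	p->lpr = POL_TOP;
}

/* Hand out the next index below the current lowest one (no exhaustion check). */
static int pool_alloc(struct pol_pool *p)
{
	return --p->lpr;
}

/*
 * Release index ix. in_use[] holds the indices of the n rules that stay
 * installed. Every rule below ix moves up by one so that the used range
 * stays contiguous, then the lowest slot is returned to the pool.
 */
static void pool_release(struct pol_pool *p, int *in_use, size_t n, int ix)
{
	size_t i;

	if (ix < p->lpr)
		return;		/* not an allocated index */

	for (i = 0; i < n; i++)
		if (in_use[i] < ix)
			in_use[i]++;

	p->lpr++;
}
```

Compacting on release keeps the bookkeeping down to a single integer, at
the cost of rewriting the rules that sit below the released index.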
Signed-off-by: Xiaoliang Yang <[email protected]>
Signed-off-by: Vladimir Oltean <[email protected]>
---
Patch taken from Xiaoliang's submission here:
https://patchwork.ozlabs.org/patch/1263213/
with the compilation error fixed.
drivers/net/ethernet/mscc/ocelot_ace.c | 64 ++++++++++++++++++++---
drivers/net/ethernet/mscc/ocelot_ace.h | 4 ++
drivers/net/ethernet/mscc/ocelot_flower.c | 9 ++++
drivers/net/ethernet/mscc/ocelot_police.c | 24 +++++++++
drivers/net/ethernet/mscc/ocelot_police.h | 5 ++
include/soc/mscc/ocelot.h | 1 +
6 files changed, 100 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/mscc/ocelot_ace.c b/drivers/net/ethernet/mscc/ocelot_ace.c
index 906b54025b17..3bd286044480 100644
--- a/drivers/net/ethernet/mscc/ocelot_ace.c
+++ b/drivers/net/ethernet/mscc/ocelot_ace.c
@@ -7,6 +7,7 @@
#include <linux/proc_fs.h>
#include <soc/mscc/ocelot_vcap.h>
+#include "ocelot_police.h"
#include "ocelot_ace.h"
#include "ocelot_s2.h"
@@ -299,9 +300,9 @@ static void vcap_action_set(struct ocelot *ocelot, struct vcap_data *data,
}
static void is2_action_set(struct ocelot *ocelot, struct vcap_data *data,
- enum ocelot_ace_action action)
+ struct ocelot_ace_rule *ace)
{
- switch (action) {
+ switch (ace->action) {
case OCELOT_ACL_ACTION_DROP:
vcap_action_set(ocelot, data, VCAP_IS2_ACT_PORT_MASK, 0);
vcap_action_set(ocelot, data, VCAP_IS2_ACT_MASK_MODE, 1);
@@ -319,6 +320,15 @@ static void is2_action_set(struct ocelot *ocelot, struct vcap_data *data,
vcap_action_set(ocelot, data, VCAP_IS2_ACT_CPU_QU_NUM, 0);
vcap_action_set(ocelot, data, VCAP_IS2_ACT_CPU_COPY_ENA, 1);
break;
+ case OCELOT_ACL_ACTION_POLICE:
+ vcap_action_set(ocelot, data, VCAP_IS2_ACT_PORT_MASK, 0);
+ vcap_action_set(ocelot, data, VCAP_IS2_ACT_MASK_MODE, 0);
+ vcap_action_set(ocelot, data, VCAP_IS2_ACT_POLICE_ENA, 1);
+ vcap_action_set(ocelot, data, VCAP_IS2_ACT_POLICE_IDX,
+ ace->pol_ix);
+ vcap_action_set(ocelot, data, VCAP_IS2_ACT_CPU_QU_NUM, 0);
+ vcap_action_set(ocelot, data, VCAP_IS2_ACT_CPU_COPY_ENA, 0);
+ break;
}
}
@@ -611,7 +621,7 @@ static void is2_entry_set(struct ocelot *ocelot, int ix,
}
vcap_key_set(ocelot, &data, VCAP_IS2_TYPE, type, type_mask);
- is2_action_set(ocelot, &data, ace->action);
+ is2_action_set(ocelot, &data, ace);
vcap_data_set(data.counter, data.counter_offset,
vcap_is2->counter_width, ace->stats.pkts);
@@ -639,12 +649,19 @@ static void is2_entry_get(struct ocelot *ocelot, struct ocelot_ace_rule *rule,
rule->stats.pkts = cnt;
}
-static void ocelot_ace_rule_add(struct ocelot_acl_block *block,
+static void ocelot_ace_rule_add(struct ocelot *ocelot,
+ struct ocelot_acl_block *block,
struct ocelot_ace_rule *rule)
{
struct ocelot_ace_rule *tmp;
struct list_head *pos, *n;
+ if (rule->action == OCELOT_ACL_ACTION_POLICE) {
+ block->pol_lpr--;
+ rule->pol_ix = block->pol_lpr;
+ ocelot_ace_policer_add(ocelot, rule->pol_ix, &rule->pol);
+ }
+
block->count++;
if (list_empty(&block->rules)) {
@@ -697,7 +714,7 @@ int ocelot_ace_rule_offload_add(struct ocelot *ocelot,
int i, index;
/* Add rule to the linked list */
- ocelot_ace_rule_add(block, rule);
+ ocelot_ace_rule_add(ocelot, block, rule);
/* Get the index of the inserted rule */
index = ocelot_ace_rule_get_index_id(block, rule);
@@ -713,7 +730,33 @@ int ocelot_ace_rule_offload_add(struct ocelot *ocelot,
return 0;
}
-static void ocelot_ace_rule_del(struct ocelot_acl_block *block,
+static void ocelot_ace_police_del(struct ocelot *ocelot,
+ struct ocelot_acl_block *block,
+ u32 ix)
+{
+ struct ocelot_ace_rule *ace;
+ int index = -1;
+
+ if (ix < block->pol_lpr)
+ return;
+
+ list_for_each_entry(ace, &block->rules, list) {
+ index++;
+ if (ace->action == OCELOT_ACL_ACTION_POLICE &&
+ ace->pol_ix < ix) {
+ ace->pol_ix += 1;
+ ocelot_ace_policer_add(ocelot, ace->pol_ix,
+ &ace->pol);
+ is2_entry_set(ocelot, index, ace);
+ }
+ }
+
+ ocelot_ace_policer_del(ocelot, block->pol_lpr);
+ block->pol_lpr++;
+}
+
+static void ocelot_ace_rule_del(struct ocelot *ocelot,
+ struct ocelot_acl_block *block,
struct ocelot_ace_rule *rule)
{
struct ocelot_ace_rule *tmp;
@@ -722,6 +765,10 @@ static void ocelot_ace_rule_del(struct ocelot_acl_block *block,
list_for_each_safe(pos, q, &block->rules) {
tmp = list_entry(pos, struct ocelot_ace_rule, list);
if (tmp->id == rule->id) {
+ if (tmp->action == OCELOT_ACL_ACTION_POLICE)
+ ocelot_ace_police_del(ocelot, block,
+ tmp->pol_ix);
+
list_del(pos);
kfree(tmp);
}
@@ -744,7 +791,7 @@ int ocelot_ace_rule_offload_del(struct ocelot *ocelot,
index = ocelot_ace_rule_get_index_id(block, rule);
/* Delete rule */
- ocelot_ace_rule_del(block, rule);
+ ocelot_ace_rule_del(ocelot, block, rule);
/* Move up all the blocks over the deleted rule */
for (i = index; i < block->count; i++) {
@@ -779,6 +826,7 @@ int ocelot_ace_rule_stats_update(struct ocelot *ocelot,
int ocelot_ace_init(struct ocelot *ocelot)
{
const struct vcap_props *vcap_is2 = &ocelot->vcap[VCAP_IS2];
+ struct ocelot_acl_block *block = &ocelot->acl_block;
struct vcap_data data;
memset(&data, 0, sizeof(data));
@@ -807,6 +855,8 @@ int ocelot_ace_init(struct ocelot *ocelot)
ocelot_write_gix(ocelot, 0x3fffff, ANA_POL_CIR_STATE,
OCELOT_POLICER_DISCARD);
+ block->pol_lpr = OCELOT_POLICER_DISCARD - 1;
+
INIT_LIST_HEAD(&ocelot->acl_block.rules);
return 0;
diff --git a/drivers/net/ethernet/mscc/ocelot_ace.h b/drivers/net/ethernet/mscc/ocelot_ace.h
index b9a5868e3f15..29d22c566786 100644
--- a/drivers/net/ethernet/mscc/ocelot_ace.h
+++ b/drivers/net/ethernet/mscc/ocelot_ace.h
@@ -7,6 +7,7 @@
#define _MSCC_OCELOT_ACE_H_
#include "ocelot.h"
+#include "ocelot_police.h"
#include <net/sch_generic.h>
#include <net/pkt_cls.h>
@@ -176,6 +177,7 @@ struct ocelot_ace_frame_ipv6 {
enum ocelot_ace_action {
OCELOT_ACL_ACTION_DROP,
OCELOT_ACL_ACTION_TRAP,
+ OCELOT_ACL_ACTION_POLICE,
};
struct ocelot_ace_stats {
@@ -208,6 +210,8 @@ struct ocelot_ace_rule {
struct ocelot_ace_frame_ipv4 ipv4;
struct ocelot_ace_frame_ipv6 ipv6;
} frame;
+ struct ocelot_policer pol;
+ u32 pol_ix;
};
int ocelot_ace_rule_offload_add(struct ocelot *ocelot,
diff --git a/drivers/net/ethernet/mscc/ocelot_flower.c b/drivers/net/ethernet/mscc/ocelot_flower.c
index 6cbca9b05520..bf0b04775dda 100644
--- a/drivers/net/ethernet/mscc/ocelot_flower.c
+++ b/drivers/net/ethernet/mscc/ocelot_flower.c
@@ -12,6 +12,8 @@ static int ocelot_flower_parse_action(struct flow_cls_offload *f,
struct ocelot_ace_rule *ace)
{
const struct flow_action_entry *a;
+ s64 burst;
+ u64 rate;
int i;
if (!flow_offload_has_one_action(&f->rule->action))
@@ -29,6 +31,13 @@ static int ocelot_flower_parse_action(struct flow_cls_offload *f,
case FLOW_ACTION_TRAP:
ace->action = OCELOT_ACL_ACTION_TRAP;
break;
+ case FLOW_ACTION_POLICE:
+ ace->action = OCELOT_ACL_ACTION_POLICE;
+ rate = a->police.rate_bytes_ps;
+ ace->pol.rate = div_u64(rate, 1000) * 8;
+ burst = rate * PSCHED_NS2TICKS(a->police.burst);
+ ace->pol.burst = div_u64(burst, PSCHED_TICKS_PER_SEC);
+ break;
default:
return -EOPNOTSUPP;
}
diff --git a/drivers/net/ethernet/mscc/ocelot_police.c b/drivers/net/ethernet/mscc/ocelot_police.c
index faddce43f2e3..8d25b2706ff0 100644
--- a/drivers/net/ethernet/mscc/ocelot_police.c
+++ b/drivers/net/ethernet/mscc/ocelot_police.c
@@ -225,3 +225,27 @@ int ocelot_port_policer_del(struct ocelot *ocelot, int port)
return 0;
}
+
+int ocelot_ace_policer_add(struct ocelot *ocelot, u32 pol_ix,
+ struct ocelot_policer *pol)
+{
+ struct qos_policer_conf pp = { 0 };
+
+ if (!pol)
+ return -EINVAL;
+
+ pp.mode = MSCC_QOS_RATE_MODE_DATA;
+ pp.pir = pol->rate;
+ pp.pbs = pol->burst;
+
+ return qos_policer_conf_set(ocelot, 0, pol_ix, &pp);
+}
+
+int ocelot_ace_policer_del(struct ocelot *ocelot, u32 pol_ix)
+{
+ struct qos_policer_conf pp = { 0 };
+
+ pp.mode = MSCC_QOS_RATE_MODE_DISABLED;
+
+ return qos_policer_conf_set(ocelot, 0, pol_ix, &pp);
+}
diff --git a/drivers/net/ethernet/mscc/ocelot_police.h b/drivers/net/ethernet/mscc/ocelot_police.h
index ae9509229463..22025cce0a6a 100644
--- a/drivers/net/ethernet/mscc/ocelot_police.h
+++ b/drivers/net/ethernet/mscc/ocelot_police.h
@@ -19,4 +19,9 @@ int ocelot_port_policer_add(struct ocelot *ocelot, int port,
int ocelot_port_policer_del(struct ocelot *ocelot, int port);
+int ocelot_ace_policer_add(struct ocelot *ocelot, u32 pol_ix,
+ struct ocelot_policer *pol);
+
+int ocelot_ace_policer_del(struct ocelot *ocelot, u32 pol_ix);
+
#endif /* _MSCC_OCELOT_POLICE_H_ */
diff --git a/include/soc/mscc/ocelot.h b/include/soc/mscc/ocelot.h
index 23a78d927838..3db66638a3b2 100644
--- a/include/soc/mscc/ocelot.h
+++ b/include/soc/mscc/ocelot.h
@@ -473,6 +473,7 @@ struct ocelot_ops {
struct ocelot_acl_block {
struct list_head rules;
int count;
+ int pol_lpr;
};
struct ocelot_port {
--
2.17.1
From: Vladimir Oltean <[email protected]>
This patch adds complete support for manipulating the L2 Policing Table
of this switch. There are 45 table entries: one entry for each
combination of port and traffic class, and one dedicated entry for
broadcast traffic for each ingress port.
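The layout of those 45 entries can be sketched as two index helpers.
This is only an illustration: NUM_PORTS and NUM_TC are local stand-ins
assumed to match SJA1105_NUM_PORTS (5) and SJA1105_NUM_TC (8), and the
helper names are hypothetical.

```c
#define NUM_PORTS	5	/* assumed to match SJA1105_NUM_PORTS */
#define NUM_TC		8	/* assumed to match SJA1105_NUM_TC */

/* Entry looked up for non-broadcast frames on @port classified as @tc. */
static int l2_pol_tc_entry(int port, int tc)
{
	return port * NUM_TC + tc;
}

/* Entry looked up for broadcast frames received on @port. */
static int l2_pol_bcast_entry(int port)
{
	return NUM_PORTS * NUM_TC + port;
}
```

The per-TC entries occupy indices 0-39 and the broadcast entries occupy
indices 40-44.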
Policing entries are shareable, and we use this functionality to support
shared block filters.
We are modeling broadcast policers as simple tc-flower matches on
dst_mac. As for the traffic class policers, the switch only deduces the
traffic class from the VLAN PCP field, so it makes sense to model this
as a tc-flower match on vlan_prio.
How to limit broadcast traffic coming from all front-panel ports to a
cumulative total of 10 Mbit/s:
tc qdisc add dev sw0p0 ingress_block 1 clsact
tc qdisc add dev sw0p1 ingress_block 1 clsact
tc qdisc add dev sw0p2 ingress_block 1 clsact
tc qdisc add dev sw0p3 ingress_block 1 clsact
tc filter add block 1 flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \
action police rate 10mbit burst 64k
How to limit traffic with VLAN PCP 0 (also includes untagged traffic) to
100 Mbit/s on port 0 only:
tc filter add dev sw0p0 ingress protocol 802.1Q flower skip_sw \
vlan_prio 0 action police rate 100mbit burst 64k
The broadcast, VLAN PCP and port policers are compatible with one
another (can be installed at the same time on a port).
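For reference, the rates in the tc examples above are programmed into
the RATE field of the policer, where the hardware adds RATE / 64 bytes
of credit every 8 microseconds. The userspace sketch below reproduces
that arithmetic; the function names are hypothetical and the code is
not taken from the driver.

```c
#include <stdint.h>

/* tc rates such as "10mbit" are in bits per second; the driver sees bytes. */
static uint64_t mbit_to_bytes_per_sec(uint64_t mbit)
{
	return mbit * 1000000 / 8;
}

/*
 * RATE / 64 bytes every 8 us is RATE * 1953.125 bytes per second, so
 * RATE = bytes_per_sec * 64 / 125000 = bytes_per_sec * 512 / 1000000.
 */
static uint64_t rate_to_hw_units(uint64_t rate_bytes_per_sec)
{
	return rate_bytes_per_sec * 512 / 1000000;
}
```

With this scaling, each Mbit/s corresponds to 64 RATE units, which is
the same factor the SJA1105_RATE_MBPS() macro applies.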
Signed-off-by: Vladimir Oltean <[email protected]>
---
drivers/net/dsa/sja1105/Makefile | 1 +
drivers/net/dsa/sja1105/sja1105.h | 40 +++
drivers/net/dsa/sja1105/sja1105_flower.c | 340 +++++++++++++++++++++++
drivers/net/dsa/sja1105/sja1105_main.c | 4 +
4 files changed, 385 insertions(+)
create mode 100644 drivers/net/dsa/sja1105/sja1105_flower.c
diff --git a/drivers/net/dsa/sja1105/Makefile b/drivers/net/dsa/sja1105/Makefile
index 66161e874344..8943d8d66f2b 100644
--- a/drivers/net/dsa/sja1105/Makefile
+++ b/drivers/net/dsa/sja1105/Makefile
@@ -4,6 +4,7 @@ obj-$(CONFIG_NET_DSA_SJA1105) += sja1105.o
sja1105-objs := \
sja1105_spi.o \
sja1105_main.o \
+ sja1105_flower.o \
sja1105_ethtool.o \
sja1105_clocking.o \
sja1105_static_config.o \
diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h
index d97d4699104e..8b60dbd567f2 100644
--- a/drivers/net/dsa/sja1105/sja1105.h
+++ b/drivers/net/dsa/sja1105/sja1105.h
@@ -19,6 +19,7 @@
* The passed parameter is in multiples of 1 ms.
*/
#define SJA1105_AGEING_TIME_MS(ms) ((ms) / 10)
+#define SJA1105_NUM_L2_POLICERS 45
typedef enum {
SPI_READ = 0,
@@ -95,6 +96,36 @@ struct sja1105_info {
const char *name;
};
+enum sja1105_rule_type {
+ SJA1105_RULE_BCAST_POLICER,
+ SJA1105_RULE_TC_POLICER,
+};
+
+struct sja1105_rule {
+ struct list_head list;
+ unsigned long cookie;
+ unsigned long port_mask;
+ enum sja1105_rule_type type;
+
+ union {
+ /* SJA1105_RULE_BCAST_POLICER */
+ struct {
+ int sharindx;
+ } bcast_pol;
+
+ /* SJA1105_RULE_TC_POLICER */
+ struct {
+ int sharindx;
+ int tc;
+ } tc_pol;
+ };
+};
+
+struct sja1105_flow_block {
+ struct list_head rules;
+ bool l2_policer_used[SJA1105_NUM_L2_POLICERS];
+};
+
struct sja1105_private {
struct sja1105_static_config static_config;
bool rgmii_rx_delay[SJA1105_NUM_PORTS];
@@ -103,6 +134,7 @@ struct sja1105_private {
struct gpio_desc *reset_gpio;
struct spi_device *spidev;
struct dsa_switch *ds;
+ struct sja1105_flow_block flow_block;
struct sja1105_port ports[SJA1105_NUM_PORTS];
/* Serializes transmission of management frames so that
* the switch doesn't confuse them with one another.
@@ -222,4 +254,12 @@ size_t sja1105pqrs_mac_config_entry_packing(void *buf, void *entry_ptr,
size_t sja1105pqrs_avb_params_entry_packing(void *buf, void *entry_ptr,
enum packing_op op);
+/* From sja1105_flower.c */
+int sja1105_cls_flower_del(struct dsa_switch *ds, int port,
+ struct flow_cls_offload *cls, bool ingress);
+int sja1105_cls_flower_add(struct dsa_switch *ds, int port,
+ struct flow_cls_offload *cls, bool ingress);
+void sja1105_flower_setup(struct dsa_switch *ds);
+void sja1105_flower_teardown(struct dsa_switch *ds);
+
#endif
diff --git a/drivers/net/dsa/sja1105/sja1105_flower.c b/drivers/net/dsa/sja1105/sja1105_flower.c
new file mode 100644
index 000000000000..5288a722e625
--- /dev/null
+++ b/drivers/net/dsa/sja1105/sja1105_flower.c
@@ -0,0 +1,340 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright 2020, NXP Semiconductors
+ */
+#include "sja1105.h"
+
+static struct sja1105_rule *sja1105_rule_find(struct sja1105_private *priv,
+ unsigned long cookie)
+{
+ struct sja1105_rule *rule;
+
+ list_for_each_entry(rule, &priv->flow_block.rules, list)
+ if (rule->cookie == cookie)
+ return rule;
+
+ return NULL;
+}
+
+static int sja1105_find_free_l2_policer(struct sja1105_private *priv)
+{
+ int i;
+
+ for (i = 0; i < SJA1105_NUM_L2_POLICERS; i++)
+ if (!priv->flow_block.l2_policer_used[i])
+ return i;
+
+ return -1;
+}
+
+static int sja1105_setup_bcast_policer(struct sja1105_private *priv,
+ struct netlink_ext_ack *extack,
+ unsigned long cookie, int port,
+ u64 rate_bytes_per_sec,
+ s64 burst)
+{
+ struct sja1105_rule *rule = sja1105_rule_find(priv, cookie);
+ struct sja1105_l2_policing_entry *policing;
+ bool new_rule = false;
+ unsigned long p;
+ int rc;
+
+ if (!rule) {
+ rule = kzalloc(sizeof(*rule), GFP_KERNEL);
+ if (!rule)
+ return -ENOMEM;
+
+ rule->cookie = cookie;
+ rule->type = SJA1105_RULE_BCAST_POLICER;
+ rule->bcast_pol.sharindx = sja1105_find_free_l2_policer(priv);
+ new_rule = true;
+ }
+
+ if (rule->bcast_pol.sharindx == -1) {
+ NL_SET_ERR_MSG_MOD(extack, "No more L2 policers free");
+ rc = -ENOSPC;
+ goto out;
+ }
+
+ policing = priv->static_config.tables[BLK_IDX_L2_POLICING].entries;
+
+ if (policing[(SJA1105_NUM_PORTS * SJA1105_NUM_TC) + port].sharindx != port) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "Port already has a broadcast policer");
+ rc = -EEXIST;
+ goto out;
+ }
+
+ rule->port_mask |= BIT(port);
+
+ /* Make the broadcast policers of all ports attached to this block
+ * point to the newly allocated policer
+ */
+ for_each_set_bit(p, &rule->port_mask, SJA1105_NUM_PORTS) {
+ int bcast = (SJA1105_NUM_PORTS * SJA1105_NUM_TC) + p;
+
+ policing[bcast].sharindx = rule->bcast_pol.sharindx;
+ }
+
+ policing[rule->bcast_pol.sharindx].rate = div_u64(rate_bytes_per_sec *
+ 512, 1000000);
+ policing[rule->bcast_pol.sharindx].smax = div_u64(rate_bytes_per_sec *
+ PSCHED_NS2TICKS(burst),
+ PSCHED_TICKS_PER_SEC);
+ /* TODO: support per-flow MTU */
+ policing[rule->bcast_pol.sharindx].maxlen = VLAN_ETH_FRAME_LEN +
+ ETH_FCS_LEN;
+
+ rc = sja1105_static_config_reload(priv, SJA1105_BEST_EFFORT_POLICING);
+
+out:
+ if (rc == 0 && new_rule) {
+ priv->flow_block.l2_policer_used[rule->bcast_pol.sharindx] = true;
+ list_add(&rule->list, &priv->flow_block.rules);
+ } else if (new_rule) {
+ kfree(rule);
+ }
+
+ return rc;
+}
+
+static int sja1105_setup_tc_policer(struct sja1105_private *priv,
+ struct netlink_ext_ack *extack,
+ unsigned long cookie, int port, int tc,
+ u64 rate_bytes_per_sec,
+ s64 burst)
+{
+ struct sja1105_rule *rule = sja1105_rule_find(priv, cookie);
+ struct sja1105_l2_policing_entry *policing;
+ bool new_rule = false;
+ unsigned long p;
+ int rc;
+
+ if (!rule) {
+ rule = kzalloc(sizeof(*rule), GFP_KERNEL);
+ if (!rule)
+ return -ENOMEM;
+
+ rule->cookie = cookie;
+ rule->type = SJA1105_RULE_TC_POLICER;
+ rule->tc_pol.sharindx = sja1105_find_free_l2_policer(priv);
+ rule->tc_pol.tc = tc;
+ new_rule = true;
+ }
+
+ if (rule->tc_pol.sharindx == -1) {
+ NL_SET_ERR_MSG_MOD(extack, "No more L2 policers free");
+ rc = -ENOSPC;
+ goto out;
+ }
+
+ policing = priv->static_config.tables[BLK_IDX_L2_POLICING].entries;
+
+ if (policing[(port * SJA1105_NUM_TC) + tc].sharindx != port) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "Port-TC pair already has an L2 policer");
+ rc = -EEXIST;
+ goto out;
+ }
+
+ rule->port_mask |= BIT(port);
+
+ /* Make the policers for traffic class @tc of all ports attached to
+ * this block point to the newly allocated policer
+ */
+ for_each_set_bit(p, &rule->port_mask, SJA1105_NUM_PORTS) {
+ int index = (p * SJA1105_NUM_TC) + tc;
+
+ policing[index].sharindx = rule->tc_pol.sharindx;
+ }
+
+ policing[rule->tc_pol.sharindx].rate = div_u64(rate_bytes_per_sec *
+ 512, 1000000);
+ policing[rule->tc_pol.sharindx].smax = div_u64(rate_bytes_per_sec *
+ PSCHED_NS2TICKS(burst),
+ PSCHED_TICKS_PER_SEC);
+ /* TODO: support per-flow MTU */
+ policing[rule->tc_pol.sharindx].maxlen = VLAN_ETH_FRAME_LEN +
+ ETH_FCS_LEN;
+
+ rc = sja1105_static_config_reload(priv, SJA1105_BEST_EFFORT_POLICING);
+
+out:
+ if (rc == 0 && new_rule) {
+ priv->flow_block.l2_policer_used[rule->tc_pol.sharindx] = true;
+ list_add(&rule->list, &priv->flow_block.rules);
+ } else if (new_rule) {
+ kfree(rule);
+ }
+
+ return rc;
+}
+
+static int sja1105_flower_parse_policer(struct sja1105_private *priv, int port,
+ struct netlink_ext_ack *extack,
+ struct flow_cls_offload *cls,
+ u64 rate_bytes_per_sec,
+ s64 burst)
+{
+ struct flow_rule *rule = flow_cls_offload_flow_rule(cls);
+ struct flow_dissector *dissector = rule->match.dissector;
+
+ if (dissector->used_keys &
+ ~(BIT(FLOW_DISSECTOR_KEY_BASIC) |
+ BIT(FLOW_DISSECTOR_KEY_CONTROL) |
+ BIT(FLOW_DISSECTOR_KEY_VLAN) |
+ BIT(FLOW_DISSECTOR_KEY_ETH_ADDRS))) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "Unsupported keys used");
+ return -EOPNOTSUPP;
+ }
+
+ if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_BASIC)) {
+ struct flow_match_basic match;
+
+ flow_rule_match_basic(rule, &match);
+ if (match.key->n_proto) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "Matching on protocol not supported");
+ return -EOPNOTSUPP;
+ }
+ }
+
+ if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_ETH_ADDRS)) {
+ u8 bcast[] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};
+ u8 null[] = {0x00, 0x00, 0x00, 0x00, 0x00, 0x00};
+ struct flow_match_eth_addrs match;
+
+ flow_rule_match_eth_addrs(rule, &match);
+
+ if (!ether_addr_equal_masked(match.key->src, null,
+ match.mask->src)) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "Matching on source MAC not supported");
+ return -EOPNOTSUPP;
+ }
+
+ if (!ether_addr_equal_masked(match.key->dst, bcast,
+ match.mask->dst)) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "Only matching on broadcast DMAC is supported");
+ return -EOPNOTSUPP;
+ }
+
+ return sja1105_setup_bcast_policer(priv, extack, cls->cookie,
+ port, rate_bytes_per_sec,
+ burst);
+ }
+
+ if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_VLAN)) {
+ struct flow_match_vlan match;
+
+ flow_rule_match_vlan(rule, &match);
+
+ if (match.key->vlan_id & match.mask->vlan_id) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "Matching on VID is not supported");
+ return -EOPNOTSUPP;
+ }
+
+ if (match.mask->vlan_priority != 0x7) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "Masked matching on PCP is not supported");
+ return -EOPNOTSUPP;
+ }
+
+ return sja1105_setup_tc_policer(priv, extack, cls->cookie, port,
+ match.key->vlan_priority,
+ rate_bytes_per_sec,
+ burst);
+ }
+
+ NL_SET_ERR_MSG_MOD(extack, "Not matching on any known key");
+ return -EOPNOTSUPP;
+}
+
+int sja1105_cls_flower_add(struct dsa_switch *ds, int port,
+ struct flow_cls_offload *cls, bool ingress)
+{
+ struct flow_rule *rule = flow_cls_offload_flow_rule(cls);
+ struct netlink_ext_ack *extack = cls->common.extack;
+ struct sja1105_private *priv = ds->priv;
+ const struct flow_action_entry *act;
+ int rc = -EOPNOTSUPP, i;
+
+ flow_action_for_each(i, act, &rule->action) {
+ switch (act->id) {
+ case FLOW_ACTION_POLICE:
+ rc = sja1105_flower_parse_policer(priv, port, extack, cls,
+ act->police.rate_bytes_ps,
+ act->police.burst);
+ break;
+ default:
+ NL_SET_ERR_MSG_MOD(extack,
+ "Action not supported");
+ break;
+ }
+ }
+
+ return rc;
+}
+
+int sja1105_cls_flower_del(struct dsa_switch *ds, int port,
+ struct flow_cls_offload *cls, bool ingress)
+{
+ struct sja1105_private *priv = ds->priv;
+ struct sja1105_rule *rule = sja1105_rule_find(priv, cls->cookie);
+ struct sja1105_l2_policing_entry *policing;
+ int old_sharindx;
+
+ if (!rule)
+ return 0;
+
+ policing = priv->static_config.tables[BLK_IDX_L2_POLICING].entries;
+
+ if (rule->type == SJA1105_RULE_BCAST_POLICER) {
+ int bcast = (SJA1105_NUM_PORTS * SJA1105_NUM_TC) + port;
+
+ old_sharindx = policing[bcast].sharindx;
+ policing[bcast].sharindx = port;
+ } else if (rule->type == SJA1105_RULE_TC_POLICER) {
+ int index = (port * SJA1105_NUM_TC) + rule->tc_pol.tc;
+
+ old_sharindx = policing[index].sharindx;
+ policing[index].sharindx = port;
+ } else {
+ return -EINVAL;
+ }
+
+ rule->port_mask &= ~BIT(port);
+ if (!rule->port_mask) {
+ priv->flow_block.l2_policer_used[old_sharindx] = false;
+ list_del(&rule->list);
+ kfree(rule);
+ }
+
+ return sja1105_static_config_reload(priv, SJA1105_BEST_EFFORT_POLICING);
+}
+
+void sja1105_flower_setup(struct dsa_switch *ds)
+{
+ struct sja1105_private *priv = ds->priv;
+ int port;
+
+ INIT_LIST_HEAD(&priv->flow_block.rules);
+
+ for (port = 0; port < SJA1105_NUM_PORTS; port++)
+ priv->flow_block.l2_policer_used[port] = true;
+}
+
+void sja1105_flower_teardown(struct dsa_switch *ds)
+{
+ struct sja1105_private *priv = ds->priv;
+ struct sja1105_rule *rule;
+ struct list_head *pos, *n;
+
+ list_for_each_safe(pos, n, &priv->flow_block.rules) {
+ rule = list_entry(pos, struct sja1105_rule, list);
+ list_del(&rule->list);
+ kfree(rule);
+ }
+}
diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index 81d2e5e5ce96..472f4eb20c49 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -2021,6 +2021,7 @@ static void sja1105_teardown(struct dsa_switch *ds)
kthread_destroy_worker(sp->xmit_worker);
}
+ sja1105_flower_teardown(ds);
sja1105_tas_teardown(ds);
sja1105_ptp_clock_unregister(ds);
sja1105_static_config_free(&priv->static_config);
@@ -2356,6 +2357,8 @@ static const struct dsa_switch_ops sja1105_switch_ops = {
.port_mirror_del = sja1105_mirror_del,
.port_policer_add = sja1105_port_policer_add,
.port_policer_del = sja1105_port_policer_del,
+ .cls_flower_add = sja1105_cls_flower_add,
+ .cls_flower_del = sja1105_cls_flower_del,
};
static int sja1105_check_device_id(struct sja1105_private *priv)
@@ -2459,6 +2462,7 @@ static int sja1105_probe(struct spi_device *spi)
mutex_init(&priv->mgmt_lock);
sja1105_tas_setup(ds);
+ sja1105_flower_setup(ds);
rc = dsa_register_switch(priv->ds);
if (rc)
--
2.17.1
From: Vladimir Oltean <[email protected]>
This adds partial configuration support for the L2 Policing Table. Out
of the 45 policing entries, only 5 are used (one for each port), in a
shared manner. All 8 traffic classes, and the broadcast policer, are
redirected to a common instance which belongs to the ingress port.
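The sharing can be pictured with a small model of the two-stage lookup.
This is a simplified sketch, not driver code: struct l2_entry, the
helper names and the NUM_* constants are hypothetical stand-ins for the
static config table and for SJA1105_NUM_PORTS / SJA1105_NUM_TC.

```c
#define NUM_PORTS	5	/* assumed to match SJA1105_NUM_PORTS */
#define NUM_TC		8	/* assumed to match SJA1105_NUM_TC */
#define NUM_ENTRIES	(NUM_PORTS * NUM_TC + NUM_PORTS)

struct l2_entry {
	int sharindx;	/* stage 1: which policer this lookup resolves to */
};

/* Point all 9 lookups of each port (8 TCs + broadcast) at policer @port. */
static void init_port_policers(struct l2_entry *table)
{
	int port, tc;

	for (port = 0; port < NUM_PORTS; port++) {
		for (tc = 0; tc < NUM_TC; tc++)
			table[port * NUM_TC + tc].sharindx = port;

		table[NUM_PORTS * NUM_TC + port].sharindx = port;
	}
}

/* Resolve the policer index used for a frame received on @port. */
static int resolve_policer(const struct l2_entry *table, int port, int tc,
			   int is_bcast)
{
	int idx = is_bcast ? NUM_PORTS * NUM_TC + port : port * NUM_TC + tc;

	return table[idx].sharindx;
}
```

After init_port_policers(), every frame received on a given port
resolves to the same policer, regardless of its traffic class or of
whether it is a broadcast frame.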
Signed-off-by: Vladimir Oltean <[email protected]>
---
drivers/net/dsa/sja1105/sja1105_main.c | 132 +++++++++++++++++++------
1 file changed, 100 insertions(+), 32 deletions(-)
diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index 763ae1d3bca8..81d2e5e5ce96 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -516,23 +516,56 @@ static int sja1105_init_avb_params(struct sja1105_private *priv)
return 0;
}
+/* The L2 policing table is 2-stage. The table is looked up for each frame
+ * according to the ingress port, whether it was broadcast or not, and the
+ * classified traffic class (given by VLAN PCP). This portion of the lookup is
+ * fixed, and gives access to the SHARINDX, an indirection register pointing
+ * within the policing table itself, which is used to resolve the policer that
+ * will be used for this frame.
+ *
+ * Stage 1 Stage 2
+ * +------------+--------+ +---------------------------------+
+ * |Port 0 TC 0 |SHARINDX| | Policer 0: Rate, Burst, MTU |
+ * +------------+--------+ +---------------------------------+
+ * |Port 0 TC 1 |SHARINDX| | Policer 1: Rate, Burst, MTU |
+ * +------------+--------+ +---------------------------------+
+ * ... | Policer 2: Rate, Burst, MTU |
+ * +------------+--------+ +---------------------------------+
+ * |Port 0 TC 7 |SHARINDX| | Policer 3: Rate, Burst, MTU |
+ * +------------+--------+ +---------------------------------+
+ * |Port 1 TC 0 |SHARINDX| | Policer 4: Rate, Burst, MTU |
+ * +------------+--------+ +---------------------------------+
+ * ... | Policer 5: Rate, Burst, MTU |
+ * +------------+--------+ +---------------------------------+
+ * |Port 1 TC 7 |SHARINDX| | Policer 6: Rate, Burst, MTU |
+ * +------------+--------+ +---------------------------------+
+ * ... | Policer 7: Rate, Burst, MTU |
+ * +------------+--------+ +---------------------------------+
+ * |Port 4 TC 7 |SHARINDX| ...
+ * +------------+--------+
+ * |Port 0 BCAST|SHARINDX| ...
+ * +------------+--------+
+ * |Port 1 BCAST|SHARINDX| ...
+ * +------------+--------+
+ * ... ...
+ * +------------+--------+ +---------------------------------+
+ * |Port 4 BCAST|SHARINDX| | Policer 44: Rate, Burst, MTU |
+ * +------------+--------+ +---------------------------------+
+ *
+ * In this driver, we shall use policers 0-4 as statically allocated port
+ * (matchall) policers. So we need to make the SHARINDX for all lookups
+ * corresponding to this ingress port (8 VLAN PCP lookups and 1 broadcast
+ * lookup) equal.
+ * The remaining policers (40) shall be dynamically allocated for flower
+ * policers, where the key is either vlan_prio or dst_mac ff:ff:ff:ff:ff:ff.
+ */
#define SJA1105_RATE_MBPS(speed) (((speed) * 64000) / 1000)
-static void sja1105_setup_policer(struct sja1105_l2_policing_entry *policing,
- int index, int mtu)
-{
- policing[index].sharindx = index;
- policing[index].smax = 65535; /* Burst size in bytes */
- policing[index].rate = SJA1105_RATE_MBPS(1000);
- policing[index].maxlen = mtu;
- policing[index].partition = 0;
-}
-
static int sja1105_init_l2_policing(struct sja1105_private *priv)
{
struct sja1105_l2_policing_entry *policing;
struct sja1105_table *table;
- int i, j, k;
+ int port, tc;
table = &priv->static_config.tables[BLK_IDX_L2_POLICING];
@@ -551,22 +584,29 @@ static int sja1105_init_l2_policing(struct sja1105_private *priv)
policing = table->entries;
- /* k sweeps through all unicast policers (0-39).
- * bcast sweeps through policers 40-44.
- */
- for (i = 0, k = 0; i < SJA1105_NUM_PORTS; i++) {
- int bcast = (SJA1105_NUM_PORTS * SJA1105_NUM_TC) + i;
+ /* Setup shared indices for the matchall policers */
+ for (port = 0; port < SJA1105_NUM_PORTS; port++) {
+ int bcast = (SJA1105_NUM_PORTS * SJA1105_NUM_TC) + port;
+
+ for (tc = 0; tc < SJA1105_NUM_TC; tc++)
+ policing[port * SJA1105_NUM_TC + tc].sharindx = port;
+
+ policing[bcast].sharindx = port;
+ }
+
+ /* Setup the matchall policer parameters */
+ for (port = 0; port < SJA1105_NUM_PORTS; port++) {
int mtu = VLAN_ETH_FRAME_LEN + ETH_FCS_LEN;
- if (dsa_is_cpu_port(priv->ds, i))
+ if (dsa_is_cpu_port(priv->ds, port))
mtu += VLAN_HLEN;
- for (j = 0; j < SJA1105_NUM_TC; j++, k++)
- sja1105_setup_policer(policing, k, mtu);
-
- /* Set up this port's policer for broadcast traffic */
- sja1105_setup_policer(policing, bcast, mtu);
+ policing[port].smax = 65535; /* Burst size in bytes */
+ policing[port].rate = SJA1105_RATE_MBPS(1000);
+ policing[port].maxlen = mtu;
+ policing[port].partition = 0;
}
+
return 0;
}
@@ -2129,10 +2169,8 @@ static int sja1105_set_ageing_time(struct dsa_switch *ds,
static int sja1105_change_mtu(struct dsa_switch *ds, int port, int new_mtu)
{
- int bcast = (SJA1105_NUM_PORTS * SJA1105_NUM_TC) + port;
struct sja1105_l2_policing_entry *policing;
struct sja1105_private *priv = ds->priv;
- int tc;
new_mtu += VLAN_ETH_HLEN + ETH_FCS_LEN;
@@ -2141,16 +2179,10 @@ static int sja1105_change_mtu(struct dsa_switch *ds, int port, int new_mtu)
policing = priv->static_config.tables[BLK_IDX_L2_POLICING].entries;
- /* We set all 9 port policers to the same value, so just checking the
- * broadcast one is fine.
- */
- if (policing[bcast].maxlen == new_mtu)
+ if (policing[port].maxlen == new_mtu)
return 0;
- for (tc = 0; tc < SJA1105_NUM_TC; tc++)
- policing[port * SJA1105_NUM_TC + tc].maxlen = new_mtu;
-
- policing[bcast].maxlen = new_mtu;
+ policing[port].maxlen = new_mtu;
return sja1105_static_config_reload(priv, SJA1105_BEST_EFFORT_POLICING);
}
@@ -2250,6 +2282,40 @@ static void sja1105_mirror_del(struct dsa_switch *ds, int port,
mirror->ingress, false);
}
+static int sja1105_port_policer_add(struct dsa_switch *ds, int port,
+ struct dsa_mall_policer_tc_entry *policer)
+{
+ struct sja1105_l2_policing_entry *policing;
+ struct sja1105_private *priv = ds->priv;
+
+ policing = priv->static_config.tables[BLK_IDX_L2_POLICING].entries;
+
+ /* In hardware, every 8 microseconds the credit level is incremented by
+ * the value of RATE bytes divided by 64, up to a maximum of SMAX
+ * bytes.
+ */
+ policing[port].rate = div_u64(512 * policer->rate_bytes_per_sec,
+ 1000000);
+ policing[port].smax = div_u64(policer->rate_bytes_per_sec *
+ PSCHED_NS2TICKS(policer->burst),
+ PSCHED_TICKS_PER_SEC);
+
+ return sja1105_static_config_reload(priv, SJA1105_BEST_EFFORT_POLICING);
+}
+
+static void sja1105_port_policer_del(struct dsa_switch *ds, int port)
+{
+ struct sja1105_l2_policing_entry *policing;
+ struct sja1105_private *priv = ds->priv;
+
+ policing = priv->static_config.tables[BLK_IDX_L2_POLICING].entries;
+
+ policing[port].rate = SJA1105_RATE_MBPS(1000);
+ policing[port].smax = 65535;
+
+ sja1105_static_config_reload(priv, SJA1105_BEST_EFFORT_POLICING);
+}
+
static const struct dsa_switch_ops sja1105_switch_ops = {
.get_tag_protocol = sja1105_get_tag_protocol,
.setup = sja1105_setup,
@@ -2288,6 +2354,8 @@ static const struct dsa_switch_ops sja1105_switch_ops = {
.port_setup_tc = sja1105_port_setup_tc,
.port_mirror_add = sja1105_mirror_add,
.port_mirror_del = sja1105_mirror_del,
+ .port_policer_add = sja1105_port_policer_add,
+ .port_policer_del = sja1105_port_policer_del,
};
static int sja1105_check_device_id(struct sja1105_private *priv)
--
2.17.1
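For anyone checking the arithmetic in sja1105_port_policer_add() above, here is a minimal userspace sketch of the rate conversion. It assumes only what the code comment states: every 8 microseconds the credit grows by RATE / 64 bytes. The function names are hypothetical and exist only for this illustration; they are not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Credit grows by RATE / 64 bytes every 8 us, so:
 *   bytes per 8 us = rate_bytes_per_sec * 8 / 1000000
 *   RATE           = 64 * (bytes per 8 us)
 *                  = 512 * rate_bytes_per_sec / 1000000
 */
static uint64_t sja1105_rate_from_bytes_per_sec(uint64_t rate_bytes_per_sec)
{
	/* Guard against overflow of the 64-bit intermediate product */
	assert(rate_bytes_per_sec <= UINT64_MAX / 512);

	return 512 * rate_bytes_per_sec / 1000000;
}

/* Inverse mapping, useful for sanity-checking a RATE value by hand */
static uint64_t sja1105_bytes_per_sec_from_rate(uint64_t rate)
{
	assert(rate <= UINT64_MAX / 1000000);

	return rate * 1000000 / 512;
}
```

With this formula, 10 Mbit/s (1,250,000 bytes/s) maps to RATE = 640, i.e. 10 bytes of credit per 8 us, and 1 Gbit/s (125,000,000 bytes/s) maps to RATE = 64000.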
From: Vladimir Oltean <[email protected]>
This patch is a trivial passthrough towards the ocelot library, which
has supported port policers since commit 2c1d029a017f ("net: mscc:
ocelot: Implement port policers via tc command").
The policer parameters need some conversion between the data structures
of the DSA core and those of the Ocelot library.
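As a rough illustration of that conversion, here is a userspace sketch under stated assumptions. It assumes the DSA core hands over the rate in bytes per second and the burst as a duration in nanoseconds, which is how the code in this patch treats them. The PSCHED_* macros are copies of the kernel definitions from include/net/pkt_sched.h (one psched tick is 64 ns); ocelot_policer_model and felix_policer_convert are hypothetical names used only here.

```c
#include <assert.h>
#include <stdint.h>

/* Copies of the kernel psched helpers: 1 tick = 64 ns */
#define PSCHED_SHIFT		6
#define PSCHED_NS2TICKS(x)	((x) >> PSCHED_SHIFT)
#define PSCHED_TICKS_PER_SEC	PSCHED_NS2TICKS(1000000000ULL)

/* Hypothetical stand-in for struct ocelot_policer */
struct ocelot_policer_model {
	uint32_t rate;	/* kilobit per second */
	uint32_t burst;	/* bytes */
};

static struct ocelot_policer_model
felix_policer_convert(uint64_t rate_bytes_per_sec, int64_t burst_ns)
{
	struct ocelot_policer_model pol;

	assert(burst_ns >= 0);

	/* bytes/s -> kbit/s: divide by 1000 first, then multiply by 8 */
	pol.rate = (uint32_t)(rate_bytes_per_sec / 1000 * 8);
	/* burst duration -> bytes: rate * (ticks / ticks-per-second) */
	pol.burst = (uint32_t)(rate_bytes_per_sec *
			       PSCHED_NS2TICKS((uint64_t)burst_ns) /
			       PSCHED_TICKS_PER_SEC);
	return pol;
}
```

For example, a 10 Mbit/s policer (1,250,000 bytes/s) whose burst arrives as 52,428,800 ns (the time needed to send 64 KiB at that rate) comes out as rate = 10000 kbit/s and burst = 65536 bytes.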
Signed-off-by: Vladimir Oltean <[email protected]>
---
drivers/net/dsa/ocelot/felix.c | 24 +++++++++++++++++++++++
drivers/net/ethernet/mscc/ocelot_police.c | 3 +++
drivers/net/ethernet/mscc/ocelot_police.h | 10 ----------
drivers/net/ethernet/mscc/ocelot_tc.c | 2 +-
include/soc/mscc/ocelot.h | 8 ++++++++
5 files changed, 36 insertions(+), 11 deletions(-)
diff --git a/drivers/net/dsa/ocelot/felix.c b/drivers/net/dsa/ocelot/felix.c
index eef9fa812a3c..7f7dd6736051 100644
--- a/drivers/net/dsa/ocelot/felix.c
+++ b/drivers/net/dsa/ocelot/felix.c
@@ -14,6 +14,7 @@
#include <linux/of_net.h>
#include <linux/pci.h>
#include <linux/of.h>
+#include <net/pkt_sched.h>
#include <net/dsa.h>
#include "felix.h"
@@ -651,6 +652,27 @@ static int felix_cls_flower_stats(struct dsa_switch *ds, int port,
return ocelot_cls_flower_stats(ocelot, port, cls, ingress);
}
+static int felix_port_policer_add(struct dsa_switch *ds, int port,
+ struct dsa_mall_policer_tc_entry *policer)
+{
+ struct ocelot *ocelot = ds->priv;
+ struct ocelot_policer pol = {
+ .rate = div_u64(policer->rate_bytes_per_sec, 1000) * 8,
+ .burst = div_u64(policer->rate_bytes_per_sec *
+ PSCHED_NS2TICKS(policer->burst),
+ PSCHED_TICKS_PER_SEC),
+ };
+
+ return ocelot_port_policer_add(ocelot, port, &pol);
+}
+
+static void felix_port_policer_del(struct dsa_switch *ds, int port)
+{
+ struct ocelot *ocelot = ds->priv;
+
+ ocelot_port_policer_del(ocelot, port);
+}
+
static const struct dsa_switch_ops felix_switch_ops = {
.get_tag_protocol = felix_get_tag_protocol,
.setup = felix_setup,
@@ -684,6 +706,8 @@ static const struct dsa_switch_ops felix_switch_ops = {
.port_txtstamp = felix_txtstamp,
.port_change_mtu = felix_change_mtu,
.port_max_mtu = felix_get_max_mtu,
+ .port_policer_add = felix_port_policer_add,
+ .port_policer_del = felix_port_policer_del,
.cls_flower_add = felix_cls_flower_add,
.cls_flower_del = felix_cls_flower_del,
.cls_flower_stats = felix_cls_flower_stats,
diff --git a/drivers/net/ethernet/mscc/ocelot_police.c b/drivers/net/ethernet/mscc/ocelot_police.c
index 8d25b2706ff0..2e1d8e187332 100644
--- a/drivers/net/ethernet/mscc/ocelot_police.c
+++ b/drivers/net/ethernet/mscc/ocelot_police.c
@@ -4,6 +4,7 @@
* Copyright (c) 2019 Microsemi Corporation
*/
+#include <soc/mscc/ocelot.h>
#include "ocelot_police.h"
enum mscc_qos_rate_mode {
@@ -203,6 +204,7 @@ int ocelot_port_policer_add(struct ocelot *ocelot, int port,
return 0;
}
+EXPORT_SYMBOL(ocelot_port_policer_add);
int ocelot_port_policer_del(struct ocelot *ocelot, int port)
{
@@ -225,6 +227,7 @@ int ocelot_port_policer_del(struct ocelot *ocelot, int port)
return 0;
}
+EXPORT_SYMBOL(ocelot_port_policer_del);
int ocelot_ace_policer_add(struct ocelot *ocelot, u32 pol_ix,
struct ocelot_policer *pol)
diff --git a/drivers/net/ethernet/mscc/ocelot_police.h b/drivers/net/ethernet/mscc/ocelot_police.h
index 22025cce0a6a..792abd28010a 100644
--- a/drivers/net/ethernet/mscc/ocelot_police.h
+++ b/drivers/net/ethernet/mscc/ocelot_police.h
@@ -9,16 +9,6 @@
#include "ocelot.h"
-struct ocelot_policer {
- u32 rate; /* kilobit per second */
- u32 burst; /* bytes */
-};
-
-int ocelot_port_policer_add(struct ocelot *ocelot, int port,
- struct ocelot_policer *pol);
-
-int ocelot_port_policer_del(struct ocelot *ocelot, int port);
-
int ocelot_ace_policer_add(struct ocelot *ocelot, u32 pol_ix,
struct ocelot_policer *pol);
diff --git a/drivers/net/ethernet/mscc/ocelot_tc.c b/drivers/net/ethernet/mscc/ocelot_tc.c
index 3ff5ef41eccf..d326e231f0ad 100644
--- a/drivers/net/ethernet/mscc/ocelot_tc.c
+++ b/drivers/net/ethernet/mscc/ocelot_tc.c
@@ -4,8 +4,8 @@
* Copyright (c) 2019 Microsemi Corporation
*/
+#include <soc/mscc/ocelot.h>
#include "ocelot_tc.h"
-#include "ocelot_police.h"
#include "ocelot_ace.h"
#include <net/pkt_cls.h>
diff --git a/include/soc/mscc/ocelot.h b/include/soc/mscc/ocelot.h
index 3db66638a3b2..ca49f7a114de 100644
--- a/include/soc/mscc/ocelot.h
+++ b/include/soc/mscc/ocelot.h
@@ -555,6 +555,11 @@ struct ocelot {
struct ptp_pin_desc ptp_pins[OCELOT_PTP_PINS_NUM];
};
+struct ocelot_policer {
+ u32 rate; /* kilobit per second */
+ u32 burst; /* bytes */
+};
+
#define ocelot_read_ix(ocelot, reg, gi, ri) __ocelot_read_ix(ocelot, reg, reg##_GSZ * (gi) + reg##_RSZ * (ri))
#define ocelot_read_gix(ocelot, reg, gi) __ocelot_read_ix(ocelot, reg, reg##_GSZ * (gi))
#define ocelot_read_rix(ocelot, reg, ri) __ocelot_read_ix(ocelot, reg, reg##_RSZ * (ri))
@@ -624,6 +629,9 @@ int ocelot_port_add_txtstamp_skb(struct ocelot_port *ocelot_port,
void ocelot_get_txtstamp(struct ocelot *ocelot);
void ocelot_port_set_maxlen(struct ocelot *ocelot, int port, size_t sdu);
int ocelot_get_max_mtu(struct ocelot *ocelot, int port);
+int ocelot_port_policer_add(struct ocelot *ocelot, int port,
+ struct ocelot_policer *pol);
+int ocelot_port_policer_del(struct ocelot *ocelot, int port);
int ocelot_cls_flower_replace(struct ocelot *ocelot, int port,
struct flow_cls_offload *f, bool ingress);
int ocelot_cls_flower_destroy(struct ocelot *ocelot, int port,
--
2.17.1
From: Vladimir Oltean <[email protected]>
The approach taken to pass the port policer methods on to drivers is
pragmatic. It is similar to the port mirroring implementation (in that
the DSA core does all of the filter block interaction and only passes
simple operations for the driver to implement) and dissimilar to how
flow-based policers are going to be implemented (where the driver has
full control over the flow_cls_offload data structure).
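To make the split of responsibilities concrete, here is a simplified userspace model of the idea, not the kernel code itself. The core validates the request and keeps the bookkeeping, and the driver only sees two simple callbacks. All names below (switch_ops, port_state, core_add_policer, the toy_* driver) are hypothetical and merely echo the ones in this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NUM_PORTS	4

struct policer_entry {
	int64_t burst;
	uint64_t rate_bytes_per_sec;
};

/* What a driver implements: two simple operations */
struct switch_ops {
	int (*port_policer_add)(int port, const struct policer_entry *policer);
	void (*port_policer_del)(int port);
};

/* What the core tracks per port (the real code uses a list of entries) */
struct port_state {
	bool has_policer;
	struct policer_entry policer;
};

static int core_add_policer(const struct switch_ops *ops,
			    struct port_state *ps, int port, bool ingress,
			    uint64_t rate_bytes_per_sec, int64_t burst)
{
	struct policer_entry policer = {
		.burst = burst,
		.rate_bytes_per_sec = rate_bytes_per_sec,
	};
	int err;

	if (!ops->port_policer_add)
		return -EOPNOTSUPP;	/* driver has no policer support */
	if (!ingress)
		return -EOPNOTSUPP;	/* only ingress policing is modeled */
	if (ps->has_policer)
		return -EEXIST;		/* only one port policer allowed */

	err = ops->port_policer_add(port, &policer);
	if (err)
		return err;

	ps->policer = policer;
	ps->has_policer = true;
	return 0;
}

static void core_del_policer(const struct switch_ops *ops,
			     struct port_state *ps, int port)
{
	if (!ps->has_policer)
		return;
	if (ops->port_policer_del)
		ops->port_policer_del(port);
	ps->has_policer = false;
}

/* Toy driver: just records the programmed rate per port */
static uint64_t toy_hw_rate[TOY_NUM_PORTS];

static int toy_policer_add(int port, const struct policer_entry *policer)
{
	assert(port >= 0 && port < TOY_NUM_PORTS);
	toy_hw_rate[port] = policer->rate_bytes_per_sec;
	return 0;
}

static void toy_policer_del(int port)
{
	assert(port >= 0 && port < TOY_NUM_PORTS);
	toy_hw_rate[port] = 0;
}

static const struct switch_ops toy_ops = {
	.port_policer_add = toy_policer_add,
	.port_policer_del = toy_policer_del,
};
```

The driver side never parses the filter; it receives an already validated rate and burst, which is the same trade-off the mirroring callbacks make.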
Signed-off-by: Vladimir Oltean <[email protected]>
---
include/net/dsa.h | 13 +++++++-
net/dsa/slave.c | 79 +++++++++++++++++++++++++++++++++++++++++++----
2 files changed, 85 insertions(+), 7 deletions(-)
diff --git a/include/net/dsa.h b/include/net/dsa.h
index aeb411e77b9a..fb3f9222f2a1 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -130,9 +130,10 @@ struct dsa_switch_tree {
struct list_head rtable;
};
-/* TC matchall action types, only mirroring for now */
+/* TC matchall action types */
enum dsa_port_mall_action_type {
DSA_PORT_MALL_MIRROR,
+ DSA_PORT_MALL_POLICER,
};
/* TC mirroring entry */
@@ -141,6 +142,12 @@ struct dsa_mall_mirror_tc_entry {
bool ingress;
};
+/* TC port policer entry */
+struct dsa_mall_policer_tc_entry {
+ s64 burst;
+ u64 rate_bytes_per_sec;
+};
+
/* TC matchall entry */
struct dsa_mall_tc_entry {
struct list_head list;
@@ -148,6 +155,7 @@ struct dsa_mall_tc_entry {
enum dsa_port_mall_action_type type;
union {
struct dsa_mall_mirror_tc_entry mirror;
+ struct dsa_mall_policer_tc_entry policer;
};
};
@@ -557,6 +565,9 @@ struct dsa_switch_ops {
bool ingress);
void (*port_mirror_del)(struct dsa_switch *ds, int port,
struct dsa_mall_mirror_tc_entry *mirror);
+ int (*port_policer_add)(struct dsa_switch *ds, int port,
+ struct dsa_mall_policer_tc_entry *policer);
+ void (*port_policer_del)(struct dsa_switch *ds, int port);
int (*port_setup_tc)(struct dsa_switch *ds, int port,
enum tc_setup_type type, void *type_data);
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index e6040a11bd83..9692a726f2ed 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -859,14 +859,14 @@ dsa_slave_add_cls_matchall_mirred(struct net_device *dev,
act = &cls->rule->action.entries[0];
if (!ds->ops->port_mirror_add)
- return err;
+ return -EOPNOTSUPP;
if (!act->dev)
return -EINVAL;
if (!flow_action_basic_hw_stats_check(&cls->rule->action,
cls->common.extack))
- return err;
+ return -EOPNOTSUPP;
act = &cls->rule->action.entries[0];
@@ -897,6 +897,67 @@ dsa_slave_add_cls_matchall_mirred(struct net_device *dev,
return err;
}
+static int
+dsa_slave_add_cls_matchall_police(struct net_device *dev,
+ struct tc_cls_matchall_offload *cls,
+ bool ingress)
+{
+ struct netlink_ext_ack *extack = cls->common.extack;
+ struct dsa_port *dp = dsa_slave_to_port(dev);
+ struct dsa_slave_priv *p = netdev_priv(dev);
+ struct dsa_mall_policer_tc_entry *policer;
+ struct dsa_mall_tc_entry *mall_tc_entry;
+ struct dsa_switch *ds = dp->ds;
+ struct flow_action_entry *act;
+ int err;
+
+ if (!ds->ops->port_policer_add) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "Policing offload not implemented");
+ return -EOPNOTSUPP;
+ }
+
+ if (!ingress) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "Only supported on ingress qdisc");
+ return -EOPNOTSUPP;
+ }
+
+ if (!flow_action_basic_hw_stats_check(&cls->rule->action,
+ cls->common.extack))
+ return -EOPNOTSUPP;
+
+ list_for_each_entry(mall_tc_entry, &p->mall_tc_list, list) {
+ if (mall_tc_entry->type == DSA_PORT_MALL_POLICER) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "Only one port policer allowed");
+ return -EEXIST;
+ }
+ }
+
+ act = &cls->rule->action.entries[0];
+
+ mall_tc_entry = kzalloc(sizeof(*mall_tc_entry), GFP_KERNEL);
+ if (!mall_tc_entry)
+ return -ENOMEM;
+
+ mall_tc_entry->cookie = cls->cookie;
+ mall_tc_entry->type = DSA_PORT_MALL_POLICER;
+ policer = &mall_tc_entry->policer;
+ policer->rate_bytes_per_sec = act->police.rate_bytes_ps;
+ policer->burst = act->police.burst;
+
+ err = ds->ops->port_policer_add(ds, dp->index, policer);
+ if (err) {
+ kfree(mall_tc_entry);
+ return err;
+ }
+
+ list_add_tail(&mall_tc_entry->list, &p->mall_tc_list);
+
+ return err;
+}
+
static int dsa_slave_add_cls_matchall(struct net_device *dev,
struct tc_cls_matchall_offload *cls,
bool ingress)
@@ -907,6 +968,9 @@ static int dsa_slave_add_cls_matchall(struct net_device *dev,
flow_offload_has_one_action(&cls->rule->action) &&
cls->rule->action.entries[0].id == FLOW_ACTION_MIRRED)
err = dsa_slave_add_cls_matchall_mirred(dev, cls, ingress);
+ else if (flow_offload_has_one_action(&cls->rule->action) &&
+ cls->rule->action.entries[0].id == FLOW_ACTION_POLICE)
+ err = dsa_slave_add_cls_matchall_police(dev, cls, ingress);
return err;
}
@@ -918,9 +982,6 @@ static void dsa_slave_del_cls_matchall(struct net_device *dev,
struct dsa_mall_tc_entry *mall_tc_entry;
struct dsa_switch *ds = dp->ds;
- if (!ds->ops->port_mirror_del)
- return;
-
mall_tc_entry = dsa_slave_mall_tc_entry_find(dev, cls->cookie);
if (!mall_tc_entry)
return;
@@ -929,7 +990,13 @@ static void dsa_slave_del_cls_matchall(struct net_device *dev,
switch (mall_tc_entry->type) {
case DSA_PORT_MALL_MIRROR:
- ds->ops->port_mirror_del(ds, dp->index, &mall_tc_entry->mirror);
+ if (ds->ops->port_mirror_del)
+ ds->ops->port_mirror_del(ds, dp->index,
+ &mall_tc_entry->mirror);
+ break;
+ case DSA_PORT_MALL_POLICER:
+ if (ds->ops->port_policer_del)
+ ds->ops->port_policer_del(ds, dp->index);
break;
default:
WARN_ON(1);
--
2.17.1
From: Vladimir Oltean <[email protected]>
Make room for other actions for the matchall filter by keeping the
mirred argument parsing self-contained in its own function.
Signed-off-by: Vladimir Oltean <[email protected]>
---
The diff for this patch does not look amazing..
net/dsa/slave.c | 70 ++++++++++++++++++++++++++++---------------------
1 file changed, 40 insertions(+), 30 deletions(-)
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 8ced165a7908..e6040a11bd83 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -842,24 +842,27 @@ dsa_slave_mall_tc_entry_find(struct net_device *dev, unsigned long cookie)
return NULL;
}
-static int dsa_slave_add_cls_matchall(struct net_device *dev,
- struct tc_cls_matchall_offload *cls,
- bool ingress)
+static int
+dsa_slave_add_cls_matchall_mirred(struct net_device *dev,
+ struct tc_cls_matchall_offload *cls,
+ bool ingress)
{
struct dsa_port *dp = dsa_slave_to_port(dev);
struct dsa_slave_priv *p = netdev_priv(dev);
+ struct dsa_mall_mirror_tc_entry *mirror;
struct dsa_mall_tc_entry *mall_tc_entry;
- __be16 protocol = cls->common.protocol;
struct dsa_switch *ds = dp->ds;
struct flow_action_entry *act;
struct dsa_port *to_dp;
- int err = -EOPNOTSUPP;
+ int err;
+
+ act = &cls->rule->action.entries[0];
if (!ds->ops->port_mirror_add)
return err;
- if (!flow_offload_has_one_action(&cls->rule->action))
- return err;
+ if (!act->dev)
+ return -EINVAL;
if (!flow_action_basic_hw_stats_check(&cls->rule->action,
cls->common.extack))
@@ -867,38 +870,45 @@ static int dsa_slave_add_cls_matchall(struct net_device *dev,
act = &cls->rule->action.entries[0];
- if (act->id == FLOW_ACTION_MIRRED && protocol == htons(ETH_P_ALL)) {
- struct dsa_mall_mirror_tc_entry *mirror;
+ if (!dsa_slave_dev_check(act->dev))
+ return -EOPNOTSUPP;
- if (!act->dev)
- return -EINVAL;
+ mall_tc_entry = kzalloc(sizeof(*mall_tc_entry), GFP_KERNEL);
+ if (!mall_tc_entry)
+ return -ENOMEM;
- if (!dsa_slave_dev_check(act->dev))
- return -EOPNOTSUPP;
+ mall_tc_entry->cookie = cls->cookie;
+ mall_tc_entry->type = DSA_PORT_MALL_MIRROR;
+ mirror = &mall_tc_entry->mirror;
- mall_tc_entry = kzalloc(sizeof(*mall_tc_entry), GFP_KERNEL);
- if (!mall_tc_entry)
- return -ENOMEM;
+ to_dp = dsa_slave_to_port(act->dev);
- mall_tc_entry->cookie = cls->cookie;
- mall_tc_entry->type = DSA_PORT_MALL_MIRROR;
- mirror = &mall_tc_entry->mirror;
+ mirror->to_local_port = to_dp->index;
+ mirror->ingress = ingress;
- to_dp = dsa_slave_to_port(act->dev);
+ err = ds->ops->port_mirror_add(ds, dp->index, mirror, ingress);
+ if (err) {
+ kfree(mall_tc_entry);
+ return err;
+ }
- mirror->to_local_port = to_dp->index;
- mirror->ingress = ingress;
+ list_add_tail(&mall_tc_entry->list, &p->mall_tc_list);
- err = ds->ops->port_mirror_add(ds, dp->index, mirror, ingress);
- if (err) {
- kfree(mall_tc_entry);
- return err;
- }
+ return err;
+}
- list_add_tail(&mall_tc_entry->list, &p->mall_tc_list);
- }
+static int dsa_slave_add_cls_matchall(struct net_device *dev,
+ struct tc_cls_matchall_offload *cls,
+ bool ingress)
+{
+ int err = -EOPNOTSUPP;
- return 0;
+ if (cls->common.protocol == htons(ETH_P_ALL) &&
+ flow_offload_has_one_action(&cls->rule->action) &&
+ cls->rule->action.entries[0].id == FLOW_ACTION_MIRRED)
+ err = dsa_slave_add_cls_matchall_mirred(dev, cls, ingress);
+
+ return err;
}
static void dsa_slave_del_cls_matchall(struct net_device *dev,
--
2.17.1
+ Nik, Roopa
On Sun, Mar 29, 2020 at 02:52:02AM +0200, Vladimir Oltean wrote:
> From: Vladimir Oltean <[email protected]>
>
> This patch adds complete support for manipulating the L2 Policing Tables
> from this switch. There are 45 table entries, one entry per each port
> and traffic class, and one dedicated entry for broadcast traffic for
> each ingress port.
>
> Policing entries are shareable, and we use this functionality to support
> shared block filters.
>
> We are modeling broadcast policers as simple tc-flower matches on
> dst_mac. As for the traffic class policers, the switch only deduces the
> traffic class from the VLAN PCP field, so it makes sense to model this
> as a tc-flower match on vlan_prio.
>
> How to limit broadcast traffic coming from all front-panel ports to a
> cumulated total of 10 Mbit/s:
>
> tc qdisc add dev sw0p0 ingress_block 1 clsact
> tc qdisc add dev sw0p1 ingress_block 1 clsact
> tc qdisc add dev sw0p2 ingress_block 1 clsact
> tc qdisc add dev sw0p3 ingress_block 1 clsact
> tc filter add block 1 flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \
> action police rate 10mbit burst 64k
>
> How to limit traffic with VLAN PCP 0 (also includes untagged traffic) to
> 100 Mbit/s on port 0 only:
>
> tc filter add dev sw0p0 ingress protocol 802.1Q flower skip_sw \
> vlan_prio 0 action police rate 100mbit burst 64k
>
> The broadcast, VLAN PCP and port policers are compatible with one
> another (can be installed at the same time on a port).
Hi Vladimir,
Some switches have a feature called "storm control". It allows one to
police incoming BUM traffic. See this entry from Cumulus Linux
documentation:
https://docs.cumulusnetworks.com/cumulus-linux-40/Layer-2/Spanning-Tree-and-Rapid-Spanning-Tree/#storm-control
In the past I was thinking about ways to implement this in Linux. The
only place in the pipeline where packets are actually classified to
broadcast / unknown unicast / multicast is at bridge ingress. Therefore,
my thinking was to implement these storm control policers as a
"bridge_slave" operation. It can then be offloaded to capable drivers
via the switchdev framework.
I think that if we have this implemented in the Linux bridge, then your
patch can be used to support the policing of broadcast packets while
returning an error if the user tries to police unknown unicast or
multicast packets. Or maybe the hardware you are working with supports
these types as well?
WDYT?
>
> Signed-off-by: Vladimir Oltean <[email protected]>
> ---
> drivers/net/dsa/sja1105/Makefile | 1 +
> drivers/net/dsa/sja1105/sja1105.h | 40 +++
> drivers/net/dsa/sja1105/sja1105_flower.c | 340 +++++++++++++++++++++++
> drivers/net/dsa/sja1105/sja1105_main.c | 4 +
> 4 files changed, 385 insertions(+)
> create mode 100644 drivers/net/dsa/sja1105/sja1105_flower.c
>
> diff --git a/drivers/net/dsa/sja1105/Makefile b/drivers/net/dsa/sja1105/Makefile
> index 66161e874344..8943d8d66f2b 100644
> --- a/drivers/net/dsa/sja1105/Makefile
> +++ b/drivers/net/dsa/sja1105/Makefile
> @@ -4,6 +4,7 @@ obj-$(CONFIG_NET_DSA_SJA1105) += sja1105.o
> sja1105-objs := \
> sja1105_spi.o \
> sja1105_main.o \
> + sja1105_flower.o \
> sja1105_ethtool.o \
> sja1105_clocking.o \
> sja1105_static_config.o \
> diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h
> index d97d4699104e..8b60dbd567f2 100644
> --- a/drivers/net/dsa/sja1105/sja1105.h
> +++ b/drivers/net/dsa/sja1105/sja1105.h
> @@ -19,6 +19,7 @@
> * The passed parameter is in multiples of 1 ms.
> */
> #define SJA1105_AGEING_TIME_MS(ms) ((ms) / 10)
> +#define SJA1105_NUM_L2_POLICERS 45
>
> typedef enum {
> SPI_READ = 0,
> @@ -95,6 +96,36 @@ struct sja1105_info {
> const char *name;
> };
>
> +enum sja1105_rule_type {
> + SJA1105_RULE_BCAST_POLICER,
> + SJA1105_RULE_TC_POLICER,
> +};
> +
> +struct sja1105_rule {
> + struct list_head list;
> + unsigned long cookie;
> + unsigned long port_mask;
> + enum sja1105_rule_type type;
> +
> + union {
> + /* SJA1105_RULE_BCAST_POLICER */
> + struct {
> + int sharindx;
> + } bcast_pol;
> +
> + /* SJA1105_RULE_TC_POLICER */
> + struct {
> + int sharindx;
> + int tc;
> + } tc_pol;
> + };
> +};
> +
> +struct sja1105_flow_block {
> + struct list_head rules;
> + bool l2_policer_used[SJA1105_NUM_L2_POLICERS];
> +};
> +
> struct sja1105_private {
> struct sja1105_static_config static_config;
> bool rgmii_rx_delay[SJA1105_NUM_PORTS];
> @@ -103,6 +134,7 @@ struct sja1105_private {
> struct gpio_desc *reset_gpio;
> struct spi_device *spidev;
> struct dsa_switch *ds;
> + struct sja1105_flow_block flow_block;
> struct sja1105_port ports[SJA1105_NUM_PORTS];
> /* Serializes transmission of management frames so that
> * the switch doesn't confuse them with one another.
> @@ -222,4 +254,12 @@ size_t sja1105pqrs_mac_config_entry_packing(void *buf, void *entry_ptr,
> size_t sja1105pqrs_avb_params_entry_packing(void *buf, void *entry_ptr,
> enum packing_op op);
>
> +/* From sja1105_flower.c */
> +int sja1105_cls_flower_del(struct dsa_switch *ds, int port,
> + struct flow_cls_offload *cls, bool ingress);
> +int sja1105_cls_flower_add(struct dsa_switch *ds, int port,
> + struct flow_cls_offload *cls, bool ingress);
> +void sja1105_flower_setup(struct dsa_switch *ds);
> +void sja1105_flower_teardown(struct dsa_switch *ds);
> +
> #endif
> diff --git a/drivers/net/dsa/sja1105/sja1105_flower.c b/drivers/net/dsa/sja1105/sja1105_flower.c
> new file mode 100644
> index 000000000000..5288a722e625
> --- /dev/null
> +++ b/drivers/net/dsa/sja1105/sja1105_flower.c
> @@ -0,0 +1,340 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright 2020, NXP Semiconductors
> + */
> +#include "sja1105.h"
> +
> +static struct sja1105_rule *sja1105_rule_find(struct sja1105_private *priv,
> + unsigned long cookie)
> +{
> + struct sja1105_rule *rule;
> +
> + list_for_each_entry(rule, &priv->flow_block.rules, list)
> + if (rule->cookie == cookie)
> + return rule;
> +
> + return NULL;
> +}
> +
> +static int sja1105_find_free_l2_policer(struct sja1105_private *priv)
> +{
> + int i;
> +
> + for (i = 0; i < SJA1105_NUM_L2_POLICERS; i++)
> + if (!priv->flow_block.l2_policer_used[i])
> + return i;
> +
> + return -1;
> +}
> +
> +static int sja1105_setup_bcast_policer(struct sja1105_private *priv,
> + struct netlink_ext_ack *extack,
> + unsigned long cookie, int port,
> + u64 rate_bytes_per_sec,
> + s64 burst)
> +{
> + struct sja1105_rule *rule = sja1105_rule_find(priv, cookie);
> + struct sja1105_l2_policing_entry *policing;
> + bool new_rule = false;
> + unsigned long p;
> + int rc;
> +
> + if (!rule) {
> + rule = kzalloc(sizeof(*rule), GFP_KERNEL);
> + if (!rule)
> + return -ENOMEM;
> +
> + rule->cookie = cookie;
> + rule->type = SJA1105_RULE_BCAST_POLICER;
> + rule->bcast_pol.sharindx = sja1105_find_free_l2_policer(priv);
> + new_rule = true;
> + }
> +
> + if (rule->bcast_pol.sharindx == -1) {
> + NL_SET_ERR_MSG_MOD(extack, "No more L2 policers free");
> + rc = -ENOSPC;
> + goto out;
> + }
> +
> + policing = priv->static_config.tables[BLK_IDX_L2_POLICING].entries;
> +
> + if (policing[(SJA1105_NUM_PORTS * SJA1105_NUM_TC) + port].sharindx != port) {
> + NL_SET_ERR_MSG_MOD(extack,
> + "Port already has a broadcast policer");
> + rc = -EEXIST;
> + goto out;
> + }
> +
> + rule->port_mask |= BIT(port);
> +
> + /* Make the broadcast policers of all ports attached to this block
> + * point to the newly allocated policer
> + */
> + for_each_set_bit(p, &rule->port_mask, SJA1105_NUM_PORTS) {
> + int bcast = (SJA1105_NUM_PORTS * SJA1105_NUM_TC) + p;
> +
> + policing[bcast].sharindx = rule->bcast_pol.sharindx;
> + }
> +
> + policing[rule->bcast_pol.sharindx].rate = div_u64(rate_bytes_per_sec *
> + 512, 1000000);
> + policing[rule->bcast_pol.sharindx].smax = div_u64(rate_bytes_per_sec *
> + PSCHED_NS2TICKS(burst),
> + PSCHED_TICKS_PER_SEC);
> + /* TODO: support per-flow MTU */
> + policing[rule->bcast_pol.sharindx].maxlen = VLAN_ETH_FRAME_LEN +
> + ETH_FCS_LEN;
> +
> + rc = sja1105_static_config_reload(priv, SJA1105_BEST_EFFORT_POLICING);
> +
> +out:
> + if (rc == 0 && new_rule) {
> + priv->flow_block.l2_policer_used[rule->bcast_pol.sharindx] = true;
> + list_add(&rule->list, &priv->flow_block.rules);
> + } else if (new_rule) {
> + kfree(rule);
> + }
> +
> + return rc;
> +}
> +
> +static int sja1105_setup_tc_policer(struct sja1105_private *priv,
> + struct netlink_ext_ack *extack,
> + unsigned long cookie, int port, int tc,
> + u64 rate_bytes_per_sec,
> + s64 burst)
> +{
> + struct sja1105_rule *rule = sja1105_rule_find(priv, cookie);
> + struct sja1105_l2_policing_entry *policing;
> + bool new_rule = false;
> + unsigned long p;
> + int rc;
> +
> + if (!rule) {
> + rule = kzalloc(sizeof(*rule), GFP_KERNEL);
> + if (!rule)
> + return -ENOMEM;
> +
> + rule->cookie = cookie;
> + rule->type = SJA1105_RULE_TC_POLICER;
> + rule->tc_pol.sharindx = sja1105_find_free_l2_policer(priv);
> + rule->tc_pol.tc = tc;
> + new_rule = true;
> + }
> +
> + if (rule->tc_pol.sharindx == -1) {
> + NL_SET_ERR_MSG_MOD(extack, "No more L2 policers free");
> + rc = -ENOSPC;
> + goto out;
> + }
> +
> + policing = priv->static_config.tables[BLK_IDX_L2_POLICING].entries;
> +
> + if (policing[(port * SJA1105_NUM_TC) + tc].sharindx != port) {
> + NL_SET_ERR_MSG_MOD(extack,
> + "Port-TC pair already has an L2 policer");
> + rc = -EEXIST;
> + goto out;
> + }
> +
> + rule->port_mask |= BIT(port);
> +
> + /* Make the policers for traffic class @tc of all ports attached to
> + * this block point to the newly allocated policer
> + */
> + for_each_set_bit(p, &rule->port_mask, SJA1105_NUM_PORTS) {
> + int index = (p * SJA1105_NUM_TC) + tc;
> +
> + policing[index].sharindx = rule->tc_pol.sharindx;
> + }
> +
> + policing[rule->tc_pol.sharindx].rate = div_u64(rate_bytes_per_sec *
> + 512, 1000000);
> + policing[rule->tc_pol.sharindx].smax = div_u64(rate_bytes_per_sec *
> + PSCHED_NS2TICKS(burst),
> + PSCHED_TICKS_PER_SEC);
> + /* TODO: support per-flow MTU */
> + policing[rule->tc_pol.sharindx].maxlen = VLAN_ETH_FRAME_LEN +
> + ETH_FCS_LEN;
> +
> + rc = sja1105_static_config_reload(priv, SJA1105_BEST_EFFORT_POLICING);
> +
> +out:
> + if (rc == 0 && new_rule) {
> + priv->flow_block.l2_policer_used[rule->tc_pol.sharindx] = true;
> + list_add(&rule->list, &priv->flow_block.rules);
> + } else if (new_rule) {
> + kfree(rule);
> + }
> +
> + return rc;
> +}
> +
> +static int sja1105_flower_parse_policer(struct sja1105_private *priv, int port,
> + struct netlink_ext_ack *extack,
> + struct flow_cls_offload *cls,
> + u64 rate_bytes_per_sec,
> + s64 burst)
> +{
> + struct flow_rule *rule = flow_cls_offload_flow_rule(cls);
> + struct flow_dissector *dissector = rule->match.dissector;
> +
> + if (dissector->used_keys &
> + ~(BIT(FLOW_DISSECTOR_KEY_BASIC) |
> + BIT(FLOW_DISSECTOR_KEY_CONTROL) |
> + BIT(FLOW_DISSECTOR_KEY_VLAN) |
> + BIT(FLOW_DISSECTOR_KEY_ETH_ADDRS))) {
> + NL_SET_ERR_MSG_MOD(extack,
> + "Unsupported keys used");
> + return -EOPNOTSUPP;
> + }
> +
> + if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_BASIC)) {
> + struct flow_match_basic match;
> +
> + flow_rule_match_basic(rule, &match);
> + if (match.key->n_proto) {
> + NL_SET_ERR_MSG_MOD(extack,
> + "Matching on protocol not supported");
> + return -EOPNOTSUPP;
> + }
> + }
> +
> + if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_ETH_ADDRS)) {
> + u8 bcast[] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};
> + u8 null[] = {0x00, 0x00, 0x00, 0x00, 0x00, 0x00};
> + struct flow_match_eth_addrs match;
> +
> + flow_rule_match_eth_addrs(rule, &match);
> +
> + if (!ether_addr_equal_masked(match.key->src, null,
> + match.mask->src)) {
> + NL_SET_ERR_MSG_MOD(extack,
> + "Matching on source MAC not supported");
> + return -EOPNOTSUPP;
> + }
> +
> + if (!ether_addr_equal_masked(match.key->dst, bcast,
> + match.mask->dst)) {
> + NL_SET_ERR_MSG_MOD(extack,
> + "Only matching on broadcast DMAC is supported");
> + return -EOPNOTSUPP;
> + }
> +
> + return sja1105_setup_bcast_policer(priv, extack, cls->cookie,
> + port, rate_bytes_per_sec,
> + burst);
> + }
> +
> + if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_VLAN)) {
> + struct flow_match_vlan match;
> +
> + flow_rule_match_vlan(rule, &match);
> +
> + if (match.key->vlan_id & match.mask->vlan_id) {
> + NL_SET_ERR_MSG_MOD(extack,
> + "Matching on VID is not supported");
> + return -EOPNOTSUPP;
> + }
> +
> + if (match.mask->vlan_priority != 0x7) {
> + NL_SET_ERR_MSG_MOD(extack,
> + "Masked matching on PCP is not supported");
> + return -EOPNOTSUPP;
> + }
> +
> + return sja1105_setup_tc_policer(priv, extack, cls->cookie, port,
> + match.key->vlan_priority,
> + rate_bytes_per_sec,
> + burst);
> + }
> +
> + NL_SET_ERR_MSG_MOD(extack, "Not matching on any known key");
> + return -EOPNOTSUPP;
> +}
> +
> +int sja1105_cls_flower_add(struct dsa_switch *ds, int port,
> + struct flow_cls_offload *cls, bool ingress)
> +{
> + struct flow_rule *rule = flow_cls_offload_flow_rule(cls);
> + struct netlink_ext_ack *extack = cls->common.extack;
> + struct sja1105_private *priv = ds->priv;
> + const struct flow_action_entry *act;
> + int rc = -EOPNOTSUPP, i;
> +
> + flow_action_for_each(i, act, &rule->action) {
> + switch (act->id) {
> + case FLOW_ACTION_POLICE:
> + rc = sja1105_flower_parse_policer(priv, port, extack, cls,
> + act->police.rate_bytes_ps,
> + act->police.burst);
> + break;
> + default:
> + NL_SET_ERR_MSG_MOD(extack,
> + "Action not supported");
> + break;
> + }
> + }
> +
> + return rc;
> +}
> +
> +int sja1105_cls_flower_del(struct dsa_switch *ds, int port,
> + struct flow_cls_offload *cls, bool ingress)
> +{
> + struct sja1105_private *priv = ds->priv;
> + struct sja1105_rule *rule = sja1105_rule_find(priv, cls->cookie);
> + struct sja1105_l2_policing_entry *policing;
> + int old_sharindx;
> +
> + if (!rule)
> + return 0;
> +
> + policing = priv->static_config.tables[BLK_IDX_L2_POLICING].entries;
> +
> + if (rule->type == SJA1105_RULE_BCAST_POLICER) {
> + int bcast = (SJA1105_NUM_PORTS * SJA1105_NUM_TC) + port;
> +
> + old_sharindx = policing[bcast].sharindx;
> + policing[bcast].sharindx = port;
> + } else if (rule->type == SJA1105_RULE_TC_POLICER) {
> + int index = (port * SJA1105_NUM_TC) + rule->tc_pol.tc;
> +
> + old_sharindx = policing[index].sharindx;
> + policing[index].sharindx = port;
> + } else {
> + return -EINVAL;
> + }
> +
> + rule->port_mask &= ~BIT(port);
> + if (!rule->port_mask) {
> + priv->flow_block.l2_policer_used[old_sharindx] = false;
> + list_del(&rule->list);
> + kfree(rule);
> + }
> +
> + return sja1105_static_config_reload(priv, SJA1105_BEST_EFFORT_POLICING);
> +}
> +
> +void sja1105_flower_setup(struct dsa_switch *ds)
> +{
> + struct sja1105_private *priv = ds->priv;
> + int port;
> +
> + INIT_LIST_HEAD(&priv->flow_block.rules);
> +
> + for (port = 0; port < SJA1105_NUM_PORTS; port++)
> + priv->flow_block.l2_policer_used[port] = true;
> +}
> +
> +void sja1105_flower_teardown(struct dsa_switch *ds)
> +{
> + struct sja1105_private *priv = ds->priv;
> + struct sja1105_rule *rule;
> + struct list_head *pos, *n;
> +
> + list_for_each_safe(pos, n, &priv->flow_block.rules) {
> + rule = list_entry(pos, struct sja1105_rule, list);
> + list_del(&rule->list);
> + kfree(rule);
> + }
> +}
> diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
> index 81d2e5e5ce96..472f4eb20c49 100644
> --- a/drivers/net/dsa/sja1105/sja1105_main.c
> +++ b/drivers/net/dsa/sja1105/sja1105_main.c
> @@ -2021,6 +2021,7 @@ static void sja1105_teardown(struct dsa_switch *ds)
> kthread_destroy_worker(sp->xmit_worker);
> }
>
> + sja1105_flower_teardown(ds);
> sja1105_tas_teardown(ds);
> sja1105_ptp_clock_unregister(ds);
> sja1105_static_config_free(&priv->static_config);
> @@ -2356,6 +2357,8 @@ static const struct dsa_switch_ops sja1105_switch_ops = {
> .port_mirror_del = sja1105_mirror_del,
> .port_policer_add = sja1105_port_policer_add,
> .port_policer_del = sja1105_port_policer_del,
> + .cls_flower_add = sja1105_cls_flower_add,
> + .cls_flower_del = sja1105_cls_flower_del,
> };
>
> static int sja1105_check_device_id(struct sja1105_private *priv)
> @@ -2459,6 +2462,7 @@ static int sja1105_probe(struct spi_device *spi)
> mutex_init(&priv->mgmt_lock);
>
> sja1105_tas_setup(ds);
> + sja1105_flower_setup(ds);
>
> rc = dsa_register_switch(priv->ds);
> if (rc)
> --
> 2.17.1
>
On Sun, 29 Mar 2020 at 02:53, Vladimir Oltean <[email protected]> wrote:
>
> From: Vladimir Oltean <[email protected]>
>
> This patch is a trivial passthrough towards the ocelot library, which
> support port policers since commit 2c1d029a017f ("net: mscc: ocelot:
> Implement port policers via tc command").
>
> Some data structure conversion between the DSA core and the Ocelot
> library is necessary, for policer parameters.
>
> Signed-off-by: Vladimir Oltean <[email protected]>
> ---
> drivers/net/dsa/ocelot/felix.c | 24 +++++++++++++++++++++++
> drivers/net/ethernet/mscc/ocelot_police.c | 3 +++
> drivers/net/ethernet/mscc/ocelot_police.h | 10 ----------
> drivers/net/ethernet/mscc/ocelot_tc.c | 2 +-
> include/soc/mscc/ocelot.h | 8 ++++++++
> 5 files changed, 36 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/net/dsa/ocelot/felix.c b/drivers/net/dsa/ocelot/felix.c
> index eef9fa812a3c..7f7dd6736051 100644
> --- a/drivers/net/dsa/ocelot/felix.c
> +++ b/drivers/net/dsa/ocelot/felix.c
> @@ -14,6 +14,7 @@
> #include <linux/of_net.h>
> #include <linux/pci.h>
> #include <linux/of.h>
> +#include <net/pkt_sched.h>
> #include <net/dsa.h>
> #include "felix.h"
>
> @@ -651,6 +652,27 @@ static int felix_cls_flower_stats(struct dsa_switch *ds, int port,
> return ocelot_cls_flower_stats(ocelot, port, cls, ingress);
> }
>
> +static int felix_port_policer_add(struct dsa_switch *ds, int port,
> + struct dsa_mall_policer_tc_entry *policer)
> +{
> + struct ocelot *ocelot = ds->priv;
> + struct ocelot_policer pol = {
> + .rate = div_u64(policer->rate_bytes_per_sec, 1000) * 8,
> + .burst = div_u64(policer->rate_bytes_per_sec *
> + PSCHED_NS2TICKS(policer->burst),
> + PSCHED_TICKS_PER_SEC),
> + };
> +
> + return ocelot_port_policer_add(ocelot, port, &pol);
> +}
> +
> +static void felix_port_policer_del(struct dsa_switch *ds, int port)
> +{
> + struct ocelot *ocelot = ds->priv;
> +
> + ocelot_port_policer_del(ocelot, port);
> +}
> +
> static const struct dsa_switch_ops felix_switch_ops = {
> .get_tag_protocol = felix_get_tag_protocol,
> .setup = felix_setup,
> @@ -684,6 +706,8 @@ static const struct dsa_switch_ops felix_switch_ops = {
> .port_txtstamp = felix_txtstamp,
> .port_change_mtu = felix_change_mtu,
> .port_max_mtu = felix_get_max_mtu,
> + .port_policer_add = felix_port_policer_add,
> + .port_policer_del = felix_port_policer_del,
> .cls_flower_add = felix_cls_flower_add,
> .cls_flower_del = felix_cls_flower_del,
> .cls_flower_stats = felix_cls_flower_stats,
> diff --git a/drivers/net/ethernet/mscc/ocelot_police.c b/drivers/net/ethernet/mscc/ocelot_police.c
> index 8d25b2706ff0..2e1d8e187332 100644
> --- a/drivers/net/ethernet/mscc/ocelot_police.c
> +++ b/drivers/net/ethernet/mscc/ocelot_police.c
> @@ -4,6 +4,7 @@
> * Copyright (c) 2019 Microsemi Corporation
> */
>
> +#include <soc/mscc/ocelot.h>
> #include "ocelot_police.h"
>
> enum mscc_qos_rate_mode {
> @@ -203,6 +204,7 @@ int ocelot_port_policer_add(struct ocelot *ocelot, int port,
>
> return 0;
> }
> +EXPORT_SYMBOL(ocelot_port_policer_add);
>
> int ocelot_port_policer_del(struct ocelot *ocelot, int port)
> {
> @@ -225,6 +227,7 @@ int ocelot_port_policer_del(struct ocelot *ocelot, int port)
>
> return 0;
> }
> +EXPORT_SYMBOL(ocelot_port_policer_del);
>
> int ocelot_ace_policer_add(struct ocelot *ocelot, u32 pol_ix,
> struct ocelot_policer *pol)
> diff --git a/drivers/net/ethernet/mscc/ocelot_police.h b/drivers/net/ethernet/mscc/ocelot_police.h
> index 22025cce0a6a..792abd28010a 100644
> --- a/drivers/net/ethernet/mscc/ocelot_police.h
> +++ b/drivers/net/ethernet/mscc/ocelot_police.h
> @@ -9,16 +9,6 @@
>
> #include "ocelot.h"
>
> -struct ocelot_policer {
> - u32 rate; /* kilobit per second */
> - u32 burst; /* bytes */
> -};
> -
> -int ocelot_port_policer_add(struct ocelot *ocelot, int port,
> - struct ocelot_policer *pol);
> -
> -int ocelot_port_policer_del(struct ocelot *ocelot, int port);
> -
> int ocelot_ace_policer_add(struct ocelot *ocelot, u32 pol_ix,
> struct ocelot_policer *pol);
>
> diff --git a/drivers/net/ethernet/mscc/ocelot_tc.c b/drivers/net/ethernet/mscc/ocelot_tc.c
> index 3ff5ef41eccf..d326e231f0ad 100644
> --- a/drivers/net/ethernet/mscc/ocelot_tc.c
> +++ b/drivers/net/ethernet/mscc/ocelot_tc.c
> @@ -4,8 +4,8 @@
> * Copyright (c) 2019 Microsemi Corporation
> */
>
> +#include <soc/mscc/ocelot.h>
> #include "ocelot_tc.h"
> -#include "ocelot_police.h"
> #include "ocelot_ace.h"
> #include <net/pkt_cls.h>
>
> diff --git a/include/soc/mscc/ocelot.h b/include/soc/mscc/ocelot.h
> index 3db66638a3b2..ca49f7a114de 100644
> --- a/include/soc/mscc/ocelot.h
> +++ b/include/soc/mscc/ocelot.h
> @@ -555,6 +555,11 @@ struct ocelot {
> struct ptp_pin_desc ptp_pins[OCELOT_PTP_PINS_NUM];
> };
>
Oops, it looks like I had Yangbo's patch in my tree, and the ptp_pins
are messing with this patch's context, causing it to fail to apply
cleanly:
https://patchwork.ozlabs.org/patch/1258827/
I think I need to send a v2 for this. Sorry.
> +struct ocelot_policer {
> + u32 rate; /* kilobit per second */
> + u32 burst; /* bytes */
> +};
> +
> #define ocelot_read_ix(ocelot, reg, gi, ri) __ocelot_read_ix(ocelot, reg, reg##_GSZ * (gi) + reg##_RSZ * (ri))
> #define ocelot_read_gix(ocelot, reg, gi) __ocelot_read_ix(ocelot, reg, reg##_GSZ * (gi))
> #define ocelot_read_rix(ocelot, reg, ri) __ocelot_read_ix(ocelot, reg, reg##_RSZ * (ri))
> @@ -624,6 +629,9 @@ int ocelot_port_add_txtstamp_skb(struct ocelot_port *ocelot_port,
> void ocelot_get_txtstamp(struct ocelot *ocelot);
> void ocelot_port_set_maxlen(struct ocelot *ocelot, int port, size_t sdu);
> int ocelot_get_max_mtu(struct ocelot *ocelot, int port);
> +int ocelot_port_policer_add(struct ocelot *ocelot, int port,
> + struct ocelot_policer *pol);
> +int ocelot_port_policer_del(struct ocelot *ocelot, int port);
> int ocelot_cls_flower_replace(struct ocelot *ocelot, int port,
> struct flow_cls_offload *f, bool ingress);
> int ocelot_cls_flower_destroy(struct ocelot *ocelot, int port,
> --
> 2.17.1
>
Thanks,
-Vladimir
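
For reference, the unit conversion done in felix_port_policer_add above
can be sketched in user space as follows. This is only an illustration
under stated assumptions: policer_params and policer_convert are
hypothetical names, and the burst is assumed to arrive as a duration in
nanoseconds at the configured rate. The patch scales the burst through
the PSCHED tick macros; the sketch works in nanoseconds directly, on
the assumption that the tick scaling cancels out up to rounding.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical mirror of the two fields of struct ocelot_policer. */
struct policer_params {
	uint32_t rate_kbps;   /* kilobit per second */
	uint32_t burst_bytes; /* bytes */
};

/*
 * rate_bytes_per_sec: policer rate in bytes per second.
 * burst_ns: burst expressed as a duration at that rate (assumption).
 */
static struct policer_params policer_convert(uint64_t rate_bytes_per_sec,
					     uint64_t burst_ns)
{
	struct policer_params p;
	uint64_t burst;

	/* bytes/s -> kbit/s: divide by 1000 first, then multiply by 8,
	 * in the same order as the patch. */
	p.rate_kbps = (uint32_t)(rate_bytes_per_sec / 1000 * 8);

	/* duration * rate = number of bytes that fit in the burst window */
	burst = rate_bytes_per_sec * burst_ns / NSEC_PER_SEC;
	assert(burst <= UINT32_MAX); /* the target field is 32 bits wide */
	p.burst_bytes = (uint32_t)burst;

	return p;
}
```

With "rate 10mbit" (1,250,000 bytes/s) and "burst 64k" (65,536 bytes,
which lasts 52,428,800 ns at that rate), the sketch yields 10,000 kbit/s
and 65,536 bytes. Because the division by 1000 happens before the
multiplication by 8, rates that are not a multiple of 1000 bytes/s are
rounded down.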
On Sun, 29 Mar 2020 at 12:57, Ido Schimmel <[email protected]> wrote:
>
> + Nik, Roopa
>
> On Sun, Mar 29, 2020 at 02:52:02AM +0200, Vladimir Oltean wrote:
> > From: Vladimir Oltean <[email protected]>
> >
> > This patch adds complete support for manipulating the L2 Policing Tables
> > from this switch. There are 45 table entries, one entry per each port
> > and traffic class, and one dedicated entry for broadcast traffic for
> > each ingress port.
> >
> > Policing entries are shareable, and we use this functionality to support
> > shared block filters.
> >
> > We are modeling broadcast policers as simple tc-flower matches on
> > dst_mac. As for the traffic class policers, the switch only deduces the
> > traffic class from the VLAN PCP field, so it makes sense to model this
> > as a tc-flower match on vlan_prio.
> >
> > How to limit broadcast traffic coming from all front-panel ports to a
> > cumulated total of 10 Mbit/s:
> >
> > tc qdisc add dev sw0p0 ingress_block 1 clsact
> > tc qdisc add dev sw0p1 ingress_block 1 clsact
> > tc qdisc add dev sw0p2 ingress_block 1 clsact
> > tc qdisc add dev sw0p3 ingress_block 1 clsact
> > tc filter add block 1 flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \
> > action police rate 10mbit burst 64k
> >
> > How to limit traffic with VLAN PCP 0 (also includes untagged traffic) to
> > 100 Mbit/s on port 0 only:
> >
> > tc filter add dev sw0p0 ingress protocol 802.1Q flower skip_sw \
> > vlan_prio 0 action police rate 100mbit burst 64k
> >
> > The broadcast, VLAN PCP and port policers are compatible with one
> > another (can be installed at the same time on a port).
>
> Hi Vladimir,
>
> Some switches have a feature called "storm control". It allows one to
> police incoming BUM traffic.
Yes, I am aware.
DPAA2 switches have a single (as far as I am aware) knob for 'flood
policers', and Ocelot has individual 'storm policers' for unknown
unicast, for multicast, broadcast and for 'learn frames'.
> See this entry from Cumulus Linux
> documentation:
>
> https://docs.cumulusnetworks.com/cumulus-linux-40/Layer-2/Spanning-Tree-and-Rapid-Spanning-Tree/#storm-control
>
> In the past I was thinking about ways to implement this in Linux. The
> only place in the pipeline where packets are actually classified to
> broadcast / unknown unicast / multicast is at bridge ingress. Therefore,
Actually I think only 'unknown unicast' is tricky here, and indeed the
bridge driver is the only place in the software datapath that would
know that.
I know very little about frame classification in the Linux network
stack, but would it be possible to introduce a match key in tc-flower
for whether packets have a known destination or not?
> my thinking was to implement these storm control policers as a
> "bridge_slave" operation. It can then be offloaded to capable drivers
> via the switchdev framework.
>
I think it would be a bit odd to duplicate tc functionality in the
bridge sysfs. I don't have a better suggestion though.
> I think that if we have this implemented in the Linux bridge, then your
> patch can be used to support the policing of broadcast packets while
> returning an error if user tries to police unknown unicast or multicast
> packets.
So even if the Linux bridge gains these knobs for flood policers, would
we still have dst_mac ff:ff:ff:ff:ff:ff as a valid way to configure one
of those knobs?
> Or maybe the hardware you are working with supports these types
> as well?
Nope, on this hardware it's just broadcast, I just checked that. Which
simplifies things quite a bit.
>
> WDYT?
>
I don't know.
Thanks,
-Vladimir
On Sun, 29 Mar 2020 at 14:37, Vladimir Oltean <[email protected]> wrote:
>
> On Sun, 29 Mar 2020 at 12:57, Ido Schimmel <[email protected]> wrote:
> >
> > + Nik, Roopa
> >
> > On Sun, Mar 29, 2020 at 02:52:02AM +0200, Vladimir Oltean wrote:
> > > From: Vladimir Oltean <[email protected]>
> > >
> > > This patch adds complete support for manipulating the L2 Policing Tables
> > > from this switch. There are 45 table entries, one entry per each port
> > > and traffic class, and one dedicated entry for broadcast traffic for
> > > each ingress port.
> > >
> > > Policing entries are shareable, and we use this functionality to support
> > > shared block filters.
> > >
> > > We are modeling broadcast policers as simple tc-flower matches on
> > > dst_mac. As for the traffic class policers, the switch only deduces the
> > > traffic class from the VLAN PCP field, so it makes sense to model this
> > > as a tc-flower match on vlan_prio.
> > >
> > > How to limit broadcast traffic coming from all front-panel ports to a
> > > cumulated total of 10 Mbit/s:
> > >
> > > tc qdisc add dev sw0p0 ingress_block 1 clsact
> > > tc qdisc add dev sw0p1 ingress_block 1 clsact
> > > tc qdisc add dev sw0p2 ingress_block 1 clsact
> > > tc qdisc add dev sw0p3 ingress_block 1 clsact
> > > tc filter add block 1 flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \
> > > action police rate 10mbit burst 64k
> > >
> > > How to limit traffic with VLAN PCP 0 (also includes untagged traffic) to
> > > 100 Mbit/s on port 0 only:
> > >
> > > tc filter add dev sw0p0 ingress protocol 802.1Q flower skip_sw \
> > > vlan_prio 0 action police rate 100mbit burst 64k
> > >
> > > The broadcast, VLAN PCP and port policers are compatible with one
> > > another (can be installed at the same time on a port).
> >
> > Hi Vladimir,
> >
> > Some switches have a feature called "storm control". It allows one to
> > police incoming BUM traffic.
>
> Yes, I am aware.
> DPAA2 switches have a single (as far as I am aware) knob for 'flood
> policers', and Ocelot has individual 'storm policers' for unknown
> unicast, for multicast, broadcast and for 'learn frames'.
>
> > See this entry from Cumulus Linux
> > documentation:
> >
> > https://docs.cumulusnetworks.com/cumulus-linux-40/Layer-2/Spanning-Tree-and-Rapid-Spanning-Tree/#storm-control
> >
> > In the past I was thinking about ways to implement this in Linux. The
> > only place in the pipeline where packets are actually classified to
> > broadcast / unknown unicast / multicast is at bridge ingress. Therefore,
>
> Actually I think only 'unknown unicast' is tricky here, and indeed the
> bridge driver is the only place in the software datapath that would
> know that.
> I know very little about frame classification in the Linux network
> stack, but would it be possible to introduce a match key in tc-flower
> for whether packets have a known destination or not?
>
> > my thinking was to implement these storm control policers as a
> > "bridge_slave" operation. It can then be offloaded to capable drivers
> > via the switchdev framework.
> >
>
> I think it would be a bit odd to duplicate tc functionality in the
> bridge sysfs. I don't have a better suggestion though.
>
Not to mention that for hardware like this, to have the same level of
flexibility via a switchdev control would mean to duplicate quite a
lot of tc functionality. On this 5-port switch I can put a shared
broadcast policer on 2 ports (via the ingress_block functionality),
and individual policers on the other 3, and the bandwidth budgeting is
separate. I can only assume that there are more switches out there
that allow this.
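
To make the shared versus individual budgeting concrete, here is a
rough user-space model. It is a sketch, not the driver code: the index
layout (40 per-port/per-class entries followed by 5 broadcast entries),
the 8 traffic classes (one per VLAN PCP value) and all names below are
assumptions for illustration, and there is no token refill over time.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PORTS    5 /* 5-port switch */
#define NUM_TC       8 /* one class per VLAN PCP value (assumption) */
#define NUM_POLICERS (NUM_PORTS * NUM_TC + NUM_PORTS) /* 45 entries */

struct policer_entry {
	int shared_index;      /* entry whose budget this one draws from */
	uint64_t budget_bytes; /* remaining budget, used on the owner only */
};

static struct policer_entry table[NUM_POLICERS];

/* Assumed layout: per-port/per-class entries first, broadcast last. */
static int tc_policer_index(int port, int tc)
{
	assert(port >= 0 && port < NUM_PORTS && tc >= 0 && tc < NUM_TC);
	return port * NUM_TC + tc;
}

static int bcast_policer_index(int port)
{
	assert(port >= 0 && port < NUM_PORTS);
	return NUM_PORTS * NUM_TC + port;
}

/* Every entry starts out unshared, with its own budget. */
static void policers_init(uint64_t budget_bytes)
{
	for (int i = 0; i < NUM_POLICERS; i++) {
		table[i].shared_index = i;
		table[i].budget_bytes = budget_bytes;
	}
}

/* Make `port` draw its broadcast budget from `owner_port`'s entry. */
static void bcast_policer_share(int port, int owner_port)
{
	table[bcast_policer_index(port)].shared_index =
		bcast_policer_index(owner_port);
}

/* Returns 1 if the frame fits in the budget, 0 if it would be dropped. */
static int bcast_policer_consume(int port, uint64_t len)
{
	int idx = table[bcast_policer_index(port)].shared_index;

	if (table[idx].budget_bytes < len)
		return 0;
	table[idx].budget_bytes -= len;
	return 1;
}
```

After policers_init(1000) and bcast_policer_share(1, 0), a 600-byte
broadcast frame on port 0 leaves too little budget for another 600-byte
frame on port 1, while port 2 still has its full, separate budget.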
> > I think that if we have this implemented in the Linux bridge, then your
> > patch can be used to support the policing of broadcast packets while
> > returning an error if user tries to police unknown unicast or multicast
> > packets.
>
> So even if the Linux bridge gains these knobs for flood policers, would
> we still have dst_mac ff:ff:ff:ff:ff:ff as a valid way to configure one
> of those knobs?
>
> > Or maybe the hardware you are working with supports these types
> > as well?
>
> Nope, on this hardware it's just broadcast, I just checked that. Which
> simplifies things quite a bit.
>
> >
> > WDYT?
> >
>
> I don't know.
>
> Thanks,
> -Vladimir
-Vladimir
On 29/03/2020 14:46, Vladimir Oltean wrote:
> On Sun, 29 Mar 2020 at 14:37, Vladimir Oltean <[email protected]> wrote:
>>
>> On Sun, 29 Mar 2020 at 12:57, Ido Schimmel <[email protected]> wrote:
>>>
>>> + Nik, Roopa
>>>
>>> On Sun, Mar 29, 2020 at 02:52:02AM +0200, Vladimir Oltean wrote:
>>>> From: Vladimir Oltean <[email protected]>
[snip]
>>> In the past I was thinking about ways to implement this in Linux. The
>>> only place in the pipeline where packets are actually classified to
>>> broadcast / unknown unicast / multicast is at bridge ingress. Therefore,
>>
>> Actually I think only 'unknown unicast' is tricky here, and indeed the
>> bridge driver is the only place in the software datapath that would
>> know that.
Yep, unknown unicast is hard to pass outside of the bridge, especially at ingress
where the bridge hasn't been hit yet. One possible solution is to expose a function
from the bridge which can make such a decision at the cost of 1 more fdb hash lookup,
but if the packet is going to hit the bridge anyway that cost won't be that high
since it will have to do the same. We already have some internal bridge functionality
exposed for netfilter, tc and some drivers so it would be in line with that.
I haven't looked into how feasible the above is, so I'm open to other ideas (the
bridge_slave functions for example, we've discussed such extensions before in other
contexts). But I think this can be much simpler if we just expose the unknown unicast
information; the mcast/bcast can be decided by the classifier already or with very
little change. I think such an exposed function can be useful to netfilter as well.
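
As a rough illustration of that split (not existing bridge code):
broadcast and multicast can be recognized from the destination address
alone, while known versus unknown unicast needs one table lookup. The
toy_fdb type and both function names are hypothetical stand-ins; a real
FDB lookup would also take the VLAN into account.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

enum dst_class {
	DST_BROADCAST,
	DST_MULTICAST,
	DST_KNOWN_UNICAST,
	DST_UNKNOWN_UNICAST,
};

/* Hypothetical stand-in for the bridge FDB: a flat array of MACs. */
struct toy_fdb {
	const uint8_t (*addrs)[ETH_ALEN];
	size_t count;
};

/* Linear search standing in for the extra fdb hash lookup. */
static bool toy_fdb_has(const struct toy_fdb *fdb, const uint8_t *mac)
{
	for (size_t i = 0; i < fdb->count; i++)
		if (memcmp(fdb->addrs[i], mac, ETH_ALEN) == 0)
			return true;
	return false;
}

static enum dst_class classify_dst(const struct toy_fdb *fdb,
				   const uint8_t *mac)
{
	static const uint8_t bcast[ETH_ALEN] = {
		0xff, 0xff, 0xff, 0xff, 0xff, 0xff
	};

	assert(fdb != NULL && mac != NULL);

	/* Decidable from the address alone, no lookup needed. */
	if (memcmp(mac, bcast, ETH_ALEN) == 0)
		return DST_BROADCAST;
	if (mac[0] & 0x01) /* group bit set */
		return DST_MULTICAST;

	/* Only unicast needs the forwarding table. */
	return toy_fdb_has(fdb, mac) ? DST_KNOWN_UNICAST
				     : DST_UNKNOWN_UNICAST;
}
```

The sketch treats multicast as a single class; telling known from
unknown multicast would need a second lookup, which is left out here.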
>> I know very little about frame classification in the Linux network
>> stack, but would it be possible to introduce a match key in tc-flower
>> for whether packets have a known destination or not?
>>
>>> my thinking was to implement these storm control policers as a
>>> "bridge_slave" operation. It can then be offloaded to capable drivers
>>> via the switchdev framework.
>>>
>>
>> I think it would be a bit odd to duplicate tc functionality in the
>> bridge sysfs. I don't have a better suggestion though.
>>
>
> Not to mention that for hardware like this, to have the same level of
> flexibility via a switchdev control would mean to duplicate quite a
> lot of tc functionality. On this 5-port switch I can put a shared
> broadcast policer on 2 ports (via the ingress_block functionality),
> and individual policers on the other 3, and the bandwidth budgeting is
> separate. I can only assume that there are more switches out there
> that allow this.
>
>>> I think that if we have this implemented in the Linux bridge, then your
>>> patch can be used to support the policing of broadcast packets while
>>> returning an error if user tries to police unknown unicast or multicast
>>> packets.
>>
>> So even if the Linux bridge gains these knobs for flood policers, would
>> we still have dst_mac ff:ff:ff:ff:ff:ff as a valid way to configure one
>> of those knobs?
>>
>>> Or maybe the hardware you are working with supports these types
>>> as well?
>>
>> Nope, on this hardware it's just broadcast, I just checked that. Which
>> simplifies things quite a bit.
>>
>>>
>>> WDYT?
>>>
>>
>> I don't know.
>>
>> Thanks,
>> -Vladimir
>
> -Vladimir
>
On 29/03/2020 15:02, Nikolay Aleksandrov wrote:
> On 29/03/2020 14:46, Vladimir Oltean wrote:
>> On Sun, 29 Mar 2020 at 14:37, Vladimir Oltean <[email protected]> wrote:
>>>
>>> On Sun, 29 Mar 2020 at 12:57, Ido Schimmel <[email protected]> wrote:
>>>>
>>>> + Nik, Roopa
>>>>
>>>> On Sun, Mar 29, 2020 at 02:52:02AM +0200, Vladimir Oltean wrote:
>>>>> From: Vladimir Oltean <[email protected]>
> [snip]
>>>> In the past I was thinking about ways to implement this in Linux. The
>>>> only place in the pipeline where packets are actually classified to
>>>> broadcast / unknown unicast / multicast is at bridge ingress. Therefore,
>>>
>>> Actually I think only 'unknown unicast' is tricky here, and indeed the
>>> bridge driver is the only place in the software datapath that would
>>> know that.
>
> Yep, unknown unicast is hard to pass outside of the bridge, especially at ingress
> where the bridge hasn't been hit yet. One possible solution is to expose a function
> from the bridge which can make such a decision at the cost of 1 more fdb hash lookup,
> but if the packet is going to hit the bridge anyway that cost won't be that high
> since it will have to do the same. We already have some internal bridge functionality
> exposed for netfilter, tc and some drivers so it would be in line with that.
> I haven't looked into how feasible the above is, so I'm open to other ideas (the
> bridge_slave functions for example, we've discussed such extensions before in other
> contexts). But I think this can be much simpler if we just expose the unknown unicast
> information; the mcast/bcast can be decided by the classifier already or with very
> little change. I think such an exposed function can be useful to netfilter as well.
>
Of course along with the unknown unicast, we should include unknown multicast.
>>> I know very little about frame classification in the Linux network
>>> stack, but would it be possible to introduce a match key in tc-flower
>>> for whether packets have a known destination or not?
>>>
>>>> my thinking was to implement these storm control policers as a
>>>> "bridge_slave" operation. It can then be offloaded to capable drivers
>>>> via the switchdev framework.
>>>>
>>>
>>> I think it would be a bit odd to duplicate tc functionality in the
>>> bridge sysfs. I don't have a better suggestion though.
>>>
>>
>> Not to mention that for hardware like this, to have the same level of
>> flexibility via a switchdev control would mean to duplicate quite a
>> lot of tc functionality. On this 5-port switch I can put a shared
>> broadcast policer on 2 ports (via the ingress_block functionality),
>> and individual policers on the other 3, and the bandwidth budgeting is
>> separate. I can only assume that there are more switches out there
>> that allow this.
>>
>>>> I think that if we have this implemented in the Linux bridge, then your
>>>> patch can be used to support the policing of broadcast packets while
>>>> returning an error if user tries to police unknown unicast or multicast
>>>> packets.
>>>
>>> So even if the Linux bridge gains these knobs for flood policers, would
>>> we still have dst_mac ff:ff:ff:ff:ff:ff as a valid way to configure one
>>> of those knobs?
>>>
>>>> Or maybe the hardware you are working with supports these types
>>>> as well?
>>>
>>> Nope, on this hardware it's just broadcast, I just checked that. Which
>>> simplifies things quite a bit.
>>>
>>>>
>>>> WDYT?
>>>>
>>>
>>> I don't know.
>>>
>>> Thanks,
>>> -Vladimir
>>
>> -Vladimir
>>
>