2020-06-02 05:28:15

by Xiaoliang Yang

[permalink] [raw]
Subject: [PATCH v2 net-next 00/10] net: ocelot: VCAP IS1 and ES0 support

This series patches adds support for VCAP IS1 and ES0 module, each VCAP
correspond to a flow chain to offload.

VCAP IS1 supports FLOW_ACTION_VLAN_MANGLE action to filter MAC, IP,
VLAN, protocol, and TCP/UDP ports keys and retag vlian tag,
FLOW_ACTION_PRIORITY action to classify packages to different Qos in hw.

VCAP ES0 supports FLOW_ACTION_VLAN_PUSH action to filter vlan keys
and push a specific vlan tag to frames.

Changes since v1->v2:
- Use different chain to assign rules to different hardware VCAP, and
use action goto chain to express flow order.
- Add FLOW_ACTION_PRIORITY to add Qos classification on VCAP IS1.
- Multiple actions support.
- Fix some code issues.

Vladimir Oltean (3):
net: mscc: ocelot: introduce a new ocelot_target_{read,write} API
net: mscc: ocelot: generalize existing code for VCAP
net: dsa: tag_ocelot: use VLAN information from tagging header when
available

Xiaoliang Yang (7):
net: mscc: ocelot: allocated rules to different hardware VCAP TCAMs by
chain index
net: mscc: ocelot: change vcap to be compatible with full and quad
entry
net: mscc: ocelot: VCAP IS1 support
net: mscc: ocelot: VCAP ES0 support
net: mscc: ocelot: multiple actions support
net: ocelot: return error if rule is not found
net: dsa: felix: correct VCAP IS2 keys offset

drivers/net/dsa/ocelot/felix.c | 2 -
drivers/net/dsa/ocelot/felix.h | 2 -
drivers/net/dsa/ocelot/felix_vsc9959.c | 202 +++++-
drivers/net/ethernet/mscc/ocelot.c | 11 +
drivers/net/ethernet/mscc/ocelot_ace.c | 729 ++++++++++++++++------
drivers/net/ethernet/mscc/ocelot_ace.h | 26 +-
drivers/net/ethernet/mscc/ocelot_board.c | 5 +-
drivers/net/ethernet/mscc/ocelot_flower.c | 95 ++-
drivers/net/ethernet/mscc/ocelot_io.c | 17 +
drivers/net/ethernet/mscc/ocelot_regs.c | 21 +-
drivers/net/ethernet/mscc/ocelot_s2.h | 64 --
include/soc/mscc/ocelot.h | 39 +-
include/soc/mscc/ocelot_vcap.h | 199 +++++-
net/dsa/tag_ocelot.c | 29 +
14 files changed, 1105 insertions(+), 336 deletions(-)
delete mode 100644 drivers/net/ethernet/mscc/ocelot_s2.h

--
2.17.1


2020-06-02 05:28:17

by Xiaoliang Yang

[permalink] [raw]
Subject: [PATCH v2 net-next 03/10] net: mscc: ocelot: allocated rules to different hardware VCAP TCAMs by chain index

There are three hardware TCAMs for ocelot chips: IS1, IS2 and ES0. Each
one supports different actions. The hardware flow order is: IS1->IS2->ES0.

This patch add three blocks to store rules according to chain index.
chain 0 is offloaded to IS1, chain 1 is offloaded to IS2, and egress chain
0 is offloaded to ES0.

Using action goto chain to express flow order as follows:
tc filter add dev swp0 chain 0 parent ffff: flower skip_sw \
action goto chain 1

Signed-off-by: Xiaoliang Yang <[email protected]>
---
drivers/net/ethernet/mscc/ocelot_ace.c | 51 +++++++++++++++--------
drivers/net/ethernet/mscc/ocelot_ace.h | 7 ++--
drivers/net/ethernet/mscc/ocelot_flower.c | 46 +++++++++++++++++---
include/soc/mscc/ocelot.h | 2 +-
include/soc/mscc/ocelot_vcap.h | 4 +-
5 files changed, 81 insertions(+), 29 deletions(-)

diff --git a/drivers/net/ethernet/mscc/ocelot_ace.c b/drivers/net/ethernet/mscc/ocelot_ace.c
index 748c618db7d8..b76593b40097 100644
--- a/drivers/net/ethernet/mscc/ocelot_ace.c
+++ b/drivers/net/ethernet/mscc/ocelot_ace.c
@@ -341,6 +341,8 @@ static void is2_action_set(struct ocelot *ocelot, struct vcap_data *data,
vcap_action_set(vcap, data, VCAP_IS2_ACT_CPU_QU_NUM, 0);
vcap_action_set(vcap, data, VCAP_IS2_ACT_CPU_COPY_ENA, 0);
break;
+ default:
+ break;
}
}

@@ -644,9 +646,9 @@ static void is2_entry_set(struct ocelot *ocelot, int ix,
}

static void vcap_entry_get(struct ocelot *ocelot, struct ocelot_ace_rule *rule,
- int ix)
+ int ix, int block_id)
{
- const struct vcap_props *vcap = &ocelot->vcap[VCAP_IS2];
+ const struct vcap_props *vcap = &ocelot->vcap[block_id];
struct vcap_data data;
int row, count;
u32 cnt;
@@ -663,6 +665,19 @@ static void vcap_entry_get(struct ocelot *ocelot, struct ocelot_ace_rule *rule,
rule->stats.pkts = cnt;
}

+static void vcap_entry_set(struct ocelot *ocelot, int ix,
+ struct ocelot_ace_rule *ace,
+ int block_id)
+{
+ switch (block_id) {
+ case VCAP_IS2:
+ is2_entry_set(ocelot, ix, ace);
+ break;
+ default:
+ break;
+ }
+}
+
static void ocelot_ace_rule_add(struct ocelot *ocelot,
struct ocelot_acl_block *block,
struct ocelot_ace_rule *rule)
@@ -790,7 +805,7 @@ static bool ocelot_ace_is_problematic_non_mac_etype(struct ocelot_ace_rule *ace)
static bool ocelot_exclusive_mac_etype_ace_rules(struct ocelot *ocelot,
struct ocelot_ace_rule *ace)
{
- struct ocelot_acl_block *block = &ocelot->acl_block;
+ struct ocelot_acl_block *block = &ocelot->acl_block[VCAP_IS2];
struct ocelot_ace_rule *tmp;
unsigned long port;
int i;
@@ -824,15 +839,16 @@ static bool ocelot_exclusive_mac_etype_ace_rules(struct ocelot *ocelot,
return true;
}

-int ocelot_ace_rule_offload_add(struct ocelot *ocelot,
+int ocelot_ace_rule_offload_add(struct ocelot *ocelot, int block_id,
struct ocelot_ace_rule *rule,
struct netlink_ext_ack *extack)
{
- struct ocelot_acl_block *block = &ocelot->acl_block;
+ struct ocelot_acl_block *block = &ocelot->acl_block[block_id];
struct ocelot_ace_rule *ace;
int i, index;

- if (!ocelot_exclusive_mac_etype_ace_rules(ocelot, rule)) {
+ if (block_id == VCAP_IS2 &&
+ !ocelot_exclusive_mac_etype_ace_rules(ocelot, rule)) {
NL_SET_ERR_MSG_MOD(extack,
"Cannot mix MAC_ETYPE with non-MAC_ETYPE rules");
return -EBUSY;
@@ -847,11 +863,11 @@ int ocelot_ace_rule_offload_add(struct ocelot *ocelot,
/* Move down the rules to make place for the new rule */
for (i = block->count - 1; i > index; i--) {
ace = ocelot_ace_rule_get_rule_index(block, i);
- is2_entry_set(ocelot, i, ace);
+ vcap_entry_set(ocelot, i, ace, block_id);
}

/* Now insert the new rule */
- is2_entry_set(ocelot, index, rule);
+ vcap_entry_set(ocelot, index, rule, block_id);
return 0;
}

@@ -902,10 +918,10 @@ static void ocelot_ace_rule_del(struct ocelot *ocelot,
block->count--;
}

-int ocelot_ace_rule_offload_del(struct ocelot *ocelot,
+int ocelot_ace_rule_offload_del(struct ocelot *ocelot, int block_id,
struct ocelot_ace_rule *rule)
{
- struct ocelot_acl_block *block = &ocelot->acl_block;
+ struct ocelot_acl_block *block = &ocelot->acl_block[block_id];
struct ocelot_ace_rule del_ace;
struct ocelot_ace_rule *ace;
int i, index;
@@ -921,29 +937,29 @@ int ocelot_ace_rule_offload_del(struct ocelot *ocelot,
/* Move up all the blocks over the deleted rule */
for (i = index; i < block->count; i++) {
ace = ocelot_ace_rule_get_rule_index(block, i);
- is2_entry_set(ocelot, i, ace);
+ vcap_entry_set(ocelot, i, ace, block_id);
}

/* Now delete the last rule, because it is duplicated */
- is2_entry_set(ocelot, block->count, &del_ace);
+ vcap_entry_set(ocelot, block->count, &del_ace, block_id);

return 0;
}

-int ocelot_ace_rule_stats_update(struct ocelot *ocelot,
+int ocelot_ace_rule_stats_update(struct ocelot *ocelot, int block_id,
struct ocelot_ace_rule *rule)
{
- struct ocelot_acl_block *block = &ocelot->acl_block;
+ struct ocelot_acl_block *block = &ocelot->acl_block[block_id];
struct ocelot_ace_rule *tmp;
int index;

index = ocelot_ace_rule_get_index_id(block, rule);
- vcap_entry_get(ocelot, rule, index);
+ vcap_entry_get(ocelot, rule, index, block_id);

/* After we get the result we need to clear the counters */
tmp = ocelot_ace_rule_get_rule_index(block, index);
tmp->stats.pkts = 0;
- is2_entry_set(ocelot, index, tmp);
+ vcap_entry_set(ocelot, index, tmp, block_id);

return 0;
}
@@ -968,7 +984,7 @@ static void vcap_init(struct ocelot *ocelot, const struct vcap_props *vcap)

int ocelot_ace_init(struct ocelot *ocelot)
{
- struct ocelot_acl_block *block = &ocelot->acl_block;
+ struct ocelot_acl_block *block;

vcap_init(ocelot, &ocelot->vcap[VCAP_IS2]);

@@ -987,6 +1003,7 @@ int ocelot_ace_init(struct ocelot *ocelot)
ocelot_write_gix(ocelot, 0x3fffff, ANA_POL_CIR_STATE,
OCELOT_POLICER_DISCARD);

+ block = &ocelot->acl_block[VCAP_IS2];
block->pol_lpr = OCELOT_POLICER_DISCARD - 1;
INIT_LIST_HEAD(&block->rules);

diff --git a/drivers/net/ethernet/mscc/ocelot_ace.h b/drivers/net/ethernet/mscc/ocelot_ace.h
index 099e177f2617..a9fd99401a65 100644
--- a/drivers/net/ethernet/mscc/ocelot_ace.h
+++ b/drivers/net/ethernet/mscc/ocelot_ace.h
@@ -175,6 +175,7 @@ struct ocelot_ace_frame_ipv6 {
};

enum ocelot_ace_action {
+ OCELOT_ACL_ACTION_NULL,
OCELOT_ACL_ACTION_DROP,
OCELOT_ACL_ACTION_TRAP,
OCELOT_ACL_ACTION_POLICE,
@@ -214,12 +215,12 @@ struct ocelot_ace_rule {
u32 pol_ix;
};

-int ocelot_ace_rule_offload_add(struct ocelot *ocelot,
+int ocelot_ace_rule_offload_add(struct ocelot *ocelot, int block_id,
struct ocelot_ace_rule *rule,
struct netlink_ext_ack *extack);
-int ocelot_ace_rule_offload_del(struct ocelot *ocelot,
+int ocelot_ace_rule_offload_del(struct ocelot *ocelot, int block_id,
struct ocelot_ace_rule *rule);
-int ocelot_ace_rule_stats_update(struct ocelot *ocelot,
+int ocelot_ace_rule_stats_update(struct ocelot *ocelot, int block_id,
struct ocelot_ace_rule *rule);

int ocelot_ace_init(struct ocelot *ocelot);
diff --git a/drivers/net/ethernet/mscc/ocelot_flower.c b/drivers/net/ethernet/mscc/ocelot_flower.c
index 891925f73cbc..a1f7b6b28170 100644
--- a/drivers/net/ethernet/mscc/ocelot_flower.c
+++ b/drivers/net/ethernet/mscc/ocelot_flower.c
@@ -9,13 +9,26 @@

#include "ocelot_ace.h"

+static int ocelot_block_id_get(int chain, bool ingress)
+{
+ /* Select TCAM blocks by using chain index. Rules in chain 0 are
+ * implemented on IS1, chain 1 are implemented on IS2, and egress
+ * chain corresponds to ES0 block.
+ */
+ if (ingress)
+ return chain ? VCAP_IS2 : VCAP_IS1;
+ else
+ return VCAP_ES0;
+}
+
static int ocelot_flower_parse_action(struct flow_cls_offload *f,
struct ocelot_ace_rule *ace)
{
+ struct netlink_ext_ack *extack = f->common.extack;
const struct flow_action_entry *a;
+ int i, allowed_chain = 0;
s64 burst;
u64 rate;
- int i;

if (!flow_offload_has_one_action(&f->rule->action))
return -EOPNOTSUPP;
@@ -28,9 +41,11 @@ static int ocelot_flower_parse_action(struct flow_cls_offload *f,
switch (a->id) {
case FLOW_ACTION_DROP:
ace->action = OCELOT_ACL_ACTION_DROP;
+ allowed_chain = 1;
break;
case FLOW_ACTION_TRAP:
ace->action = OCELOT_ACL_ACTION_TRAP;
+ allowed_chain = 1;
break;
case FLOW_ACTION_POLICE:
ace->action = OCELOT_ACL_ACTION_POLICE;
@@ -38,10 +53,23 @@ static int ocelot_flower_parse_action(struct flow_cls_offload *f,
ace->pol.rate = div_u64(rate, 1000) * 8;
burst = rate * PSCHED_NS2TICKS(a->police.burst);
ace->pol.burst = div_u64(burst, PSCHED_TICKS_PER_SEC);
+ allowed_chain = 1;
+ break;
+ case FLOW_ACTION_GOTO:
+ if (a->chain_index != f->common.chain_index + 1) {
+ NL_SET_ERR_MSG_MOD(extack, "HW only support goto next chain\n");
+ return -EOPNOTSUPP;
+ }
+ ace->action = OCELOT_ACL_ACTION_NULL;
+ allowed_chain = f->common.chain_index;
break;
default:
return -EOPNOTSUPP;
}
+ if (f->common.chain_index != allowed_chain) {
+ NL_SET_ERR_MSG_MOD(extack, "Action is not supported on this chain\n");
+ return -EOPNOTSUPP;
+ }
}

return 0;
@@ -205,7 +233,7 @@ int ocelot_cls_flower_replace(struct ocelot *ocelot, int port,
struct flow_cls_offload *f, bool ingress)
{
struct ocelot_ace_rule *ace;
- int ret;
+ int ret, block_id;

ace = ocelot_ace_rule_create(ocelot, port, f);
if (!ace)
@@ -216,8 +244,10 @@ int ocelot_cls_flower_replace(struct ocelot *ocelot, int port,
kfree(ace);
return ret;
}
+ block_id = ocelot_block_id_get(f->common.chain_index, ingress);

- return ocelot_ace_rule_offload_add(ocelot, ace, f->common.extack);
+ return ocelot_ace_rule_offload_add(ocelot, block_id, ace,
+ f->common.extack);
}
EXPORT_SYMBOL_GPL(ocelot_cls_flower_replace);

@@ -225,11 +255,13 @@ int ocelot_cls_flower_destroy(struct ocelot *ocelot, int port,
struct flow_cls_offload *f, bool ingress)
{
struct ocelot_ace_rule ace;
+ int block_id;

ace.prio = f->common.prio;
ace.id = f->cookie;
+ block_id = ocelot_block_id_get(f->common.chain_index, ingress);

- return ocelot_ace_rule_offload_del(ocelot, &ace);
+ return ocelot_ace_rule_offload_del(ocelot, block_id, &ace);
}
EXPORT_SYMBOL_GPL(ocelot_cls_flower_destroy);

@@ -237,11 +269,13 @@ int ocelot_cls_flower_stats(struct ocelot *ocelot, int port,
struct flow_cls_offload *f, bool ingress)
{
struct ocelot_ace_rule ace;
- int ret;
+ int ret, block_id;

ace.prio = f->common.prio;
ace.id = f->cookie;
- ret = ocelot_ace_rule_stats_update(ocelot, &ace);
+ block_id = ocelot_block_id_get(f->common.chain_index, ingress);
+
+ ret = ocelot_ace_rule_stats_update(ocelot, block_id, &ace);
if (ret)
return ret;

diff --git a/include/soc/mscc/ocelot.h b/include/soc/mscc/ocelot.h
index 91357b1c8f31..4b2320bdc036 100644
--- a/include/soc/mscc/ocelot.h
+++ b/include/soc/mscc/ocelot.h
@@ -540,7 +540,7 @@ struct ocelot {

struct list_head multicast;

- struct ocelot_acl_block acl_block;
+ struct ocelot_acl_block acl_block[3];

const struct vcap_props *vcap;

diff --git a/include/soc/mscc/ocelot_vcap.h b/include/soc/mscc/ocelot_vcap.h
index 26d9384b3657..495847a40490 100644
--- a/include/soc/mscc/ocelot_vcap.h
+++ b/include/soc/mscc/ocelot_vcap.h
@@ -14,9 +14,9 @@
*/

enum {
- /* VCAP_IS1, */
+ VCAP_IS1,
VCAP_IS2,
- /* VCAP_ES0, */
+ VCAP_ES0,
};

struct vcap_props {
--
2.17.1

2020-06-02 05:28:39

by Xiaoliang Yang

[permalink] [raw]
Subject: [PATCH v2 net-next 08/10] net: ocelot: return error if rule is not found

Return error if rule is not found in rule list to avoid Kernel panic.

Signed-off-by: Xiaoliang Yang <[email protected]>
---
drivers/net/ethernet/mscc/ocelot_ace.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mscc/ocelot_ace.c b/drivers/net/ethernet/mscc/ocelot_ace.c
index bf2b7a03c832..2ba2859fa2cd 100644
--- a/drivers/net/ethernet/mscc/ocelot_ace.c
+++ b/drivers/net/ethernet/mscc/ocelot_ace.c
@@ -982,9 +982,9 @@ static int ocelot_ace_rule_get_index_id(struct ocelot_acl_block *block,
list_for_each_entry(tmp, &block->rules, list) {
++index;
if (rule->id == tmp->id)
- break;
+ return index;
}
- return index;
+ return -ENOENT;
}

static struct ocelot_ace_rule*
@@ -1197,6 +1197,8 @@ int ocelot_ace_rule_offload_del(struct ocelot *ocelot, int block_id,

/* Gets index of the rule */
index = ocelot_ace_rule_get_index_id(block, rule);
+ if (index < 0)
+ return -ENOENT;

/* Delete rule */
ocelot_ace_rule_del(ocelot, block, rule);
@@ -1221,6 +1223,9 @@ int ocelot_ace_rule_stats_update(struct ocelot *ocelot, int block_id,
int index;

index = ocelot_ace_rule_get_index_id(block, rule);
+ if (index < 0)
+ return -ENOENT;
+
vcap_entry_get(ocelot, rule, index, block_id);

/* After we get the result we need to clear the counters */
--
2.17.1

2020-06-02 05:28:44

by Xiaoliang Yang

[permalink] [raw]
Subject: [PATCH v2 net-next 09/10] net: dsa: felix: correct VCAP IS2 keys offset

Some of IS2 IP4_TCP_UDP keys are not correct, like L4_DPORT, L4_SPORT
and other L4 keys. It causes the issue that VCAP IS2 could not filter
a right dst/src port for TCP/UDP packages.

Signed-off-by: Xiaoliang Yang <[email protected]>
---
drivers/net/dsa/ocelot/felix_vsc9959.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/net/dsa/ocelot/felix_vsc9959.c b/drivers/net/dsa/ocelot/felix_vsc9959.c
index fceba87509ba..539f3c062b50 100644
--- a/drivers/net/dsa/ocelot/felix_vsc9959.c
+++ b/drivers/net/dsa/ocelot/felix_vsc9959.c
@@ -730,17 +730,17 @@ struct vcap_field vsc9959_vcap_is2_keys[] = {
[VCAP_IS2_HK_DIP_EQ_SIP] = {118, 1},
/* IP4_TCP_UDP (TYPE=100) */
[VCAP_IS2_HK_TCP] = {119, 1},
- [VCAP_IS2_HK_L4_SPORT] = {120, 16},
- [VCAP_IS2_HK_L4_DPORT] = {136, 16},
+ [VCAP_IS2_HK_L4_DPORT] = {120, 16},
+ [VCAP_IS2_HK_L4_SPORT] = {136, 16},
[VCAP_IS2_HK_L4_RNG] = {152, 8},
[VCAP_IS2_HK_L4_SPORT_EQ_DPORT] = {160, 1},
[VCAP_IS2_HK_L4_SEQUENCE_EQ0] = {161, 1},
- [VCAP_IS2_HK_L4_URG] = {162, 1},
- [VCAP_IS2_HK_L4_ACK] = {163, 1},
- [VCAP_IS2_HK_L4_PSH] = {164, 1},
- [VCAP_IS2_HK_L4_RST] = {165, 1},
- [VCAP_IS2_HK_L4_SYN] = {166, 1},
- [VCAP_IS2_HK_L4_FIN] = {167, 1},
+ [VCAP_IS2_HK_L4_FIN] = {162, 1},
+ [VCAP_IS2_HK_L4_SYN] = {163, 1},
+ [VCAP_IS2_HK_L4_RST] = {164, 1},
+ [VCAP_IS2_HK_L4_PSH] = {165, 1},
+ [VCAP_IS2_HK_L4_ACK] = {166, 1},
+ [VCAP_IS2_HK_L4_URG] = {167, 1},
[VCAP_IS2_HK_L4_1588_DOM] = {168, 8},
[VCAP_IS2_HK_L4_1588_VER] = {176, 4},
/* IP4_OTHER (TYPE=101) */
--
2.17.1

2020-06-02 08:05:46

by Vladimir Oltean

[permalink] [raw]
Subject: Re: [PATCH v2 net-next 00/10] net: ocelot: VCAP IS1 and ES0 support

Hi Xiaoliang,

On Tue, 2 Jun 2020 at 08:25, Xiaoliang Yang <[email protected]> wrote:
>
> This series patches adds support for VCAP IS1 and ES0 module, each VCAP
> correspond to a flow chain to offload.
>
> VCAP IS1 supports FLOW_ACTION_VLAN_MANGLE action to filter MAC, IP,
> VLAN, protocol, and TCP/UDP ports keys and retag vlian tag,
> FLOW_ACTION_PRIORITY action to classify packages to different Qos in hw.
>
> VCAP ES0 supports FLOW_ACTION_VLAN_PUSH action to filter vlan keys
> and push a specific vlan tag to frames.
>
> Changes since v1->v2:
> - Use different chain to assign rules to different hardware VCAP, and
> use action goto chain to express flow order.
> - Add FLOW_ACTION_PRIORITY to add Qos classification on VCAP IS1.
> - Multiple actions support.
> - Fix some code issues.
>
> Vladimir Oltean (3):
> net: mscc: ocelot: introduce a new ocelot_target_{read,write} API
> net: mscc: ocelot: generalize existing code for VCAP
> net: dsa: tag_ocelot: use VLAN information from tagging header when
> available
>
> Xiaoliang Yang (7):
> net: mscc: ocelot: allocated rules to different hardware VCAP TCAMs by
> chain index
> net: mscc: ocelot: change vcap to be compatible with full and quad
> entry
> net: mscc: ocelot: VCAP IS1 support
> net: mscc: ocelot: VCAP ES0 support
> net: mscc: ocelot: multiple actions support
> net: ocelot: return error if rule is not found
> net: dsa: felix: correct VCAP IS2 keys offset
>
> drivers/net/dsa/ocelot/felix.c | 2 -
> drivers/net/dsa/ocelot/felix.h | 2 -
> drivers/net/dsa/ocelot/felix_vsc9959.c | 202 +++++-
> drivers/net/ethernet/mscc/ocelot.c | 11 +
> drivers/net/ethernet/mscc/ocelot_ace.c | 729 ++++++++++++++++------
> drivers/net/ethernet/mscc/ocelot_ace.h | 26 +-
> drivers/net/ethernet/mscc/ocelot_board.c | 5 +-
> drivers/net/ethernet/mscc/ocelot_flower.c | 95 ++-
> drivers/net/ethernet/mscc/ocelot_io.c | 17 +
> drivers/net/ethernet/mscc/ocelot_regs.c | 21 +-
> drivers/net/ethernet/mscc/ocelot_s2.h | 64 --
> include/soc/mscc/ocelot.h | 39 +-
> include/soc/mscc/ocelot_vcap.h | 199 +++++-
> net/dsa/tag_ocelot.c | 29 +
> 14 files changed, 1105 insertions(+), 336 deletions(-)
> delete mode 100644 drivers/net/ethernet/mscc/ocelot_s2.h
>
> --
> 2.17.1
>

First of all, net-next has just closed yesterday and will be closed
for the following 2 weeks:
http://vger.kernel.org/~davem/net-next.html

Secondly, could you give an example of how different chains could
express the fact that rules are executed in parallel between the IS1,
IS2 and ES0 TCAMs?

Thanks,
-Vladimir

2020-06-02 08:39:00

by [email protected]

[permalink] [raw]
Subject: Re: [PATCH v2 net-next 03/10] net: mscc: ocelot: allocated rules to different hardware VCAP TCAMs by chain index

Hi Xiaoliang,

Happy to see that you are moving in the directions of multi chain - this
seems ilke a much better fit to me.


On 02.06.2020 13:18, Xiaoliang Yang wrote:
>There are three hardware TCAMs for ocelot chips: IS1, IS2 and ES0. Each
>one supports different actions. The hardware flow order is: IS1->IS2->ES0.
>
>This patch add three blocks to store rules according to chain index.
>chain 0 is offloaded to IS1, chain 1 is offloaded to IS2, and egress chain
>0 is offloaded to ES0.

Using "static" allocation to to say chain-X goes to TCAM Y, also seems
like the right approach to me. Given the capabilities of the HW, this
will most likely be the easiest scheme to implement and to explain to
the end-user.

But I think we should make some adjustments to this mapping schema.

Here are some important "things" I would like to consider when defining
this schema:

- As you explain, we have 3 TCAMs (IS1, IS2 and ES0), but we have 3
parallel lookups in IS1 and 2 parallel lookups in IS2 - and also these
TCAMs has a wide verity of keys.

- We can utilize these multiple parallel lookups such that it seems like
they are done in serial (that is if they do not touch the same
actions), but as they are done in parallel they can not influence each
other.

- We can let IS1 influence the IS2 lookup (like the GOTO actions was
intended to be used).

- The chip also has other QoS classification facilities which sits
before the TCAM (take a look at 3.7.3 QoS, DP, and DSCP Classification
in vsc7514 datasheet). It we at some point in time want to enable
this, then I think we need to do that in the same tc-flower framework.

Here is my initial suggestion for an alternative chain-schema:

Chain 0: The default chain - today this is in IS2. If we proceed
with this as is - then this will change.
Chain 1-9999: These are offloaded by "basic" classification.
Chain 10000-19999: These are offloaded in IS1
Chain 10000: Lookup-0 in IS1, and here we could limit the
action to do QoS related stuff (priority
update)
Chain 11000: Lookup-1 in IS1, here we could do VLAN
stuff
Chain 12000: Lookup-2 in IS1, here we could apply the
"PAG" which is essentially a GOTO.

Chain 20000-29999: These are offloaded in IS2
Chain 20000-20255: Lookup-0 in IS2, where CHAIN-ID -
20000 is the PAG value.
Chain 21000-21000: Lookup-1 in IS2.

All these chains should be optional - users should only need to
configure the chains they need. To make this work, we need to configure
both the desired actions (could be priority update) and the goto action.
Remember in HW, all packets goes through this process, while in SW they
only follow the "goto" path.

An example could be (I have not tested this yet - sorry):

tc qdisc add dev eth0 ingress

# Activate lookup 11000. We can not do any other rules in chain 0, also
# this implicitly means that we do not want any chains <11000.
tc filter add dev eth0 parent ffff: chain 0
action
matchall goto 11000

tc filter add dev eth0 parent ffff: chain 11000 \
flower src_mac 00:01:00:00:00:00/00:ff:00:00:00:00 \
action \
vlan modify id 1234 \
pipe \
goto 20001

tc filter add dev eth0 parent ffff: chain 20001 ...

Maybe it would be an idea to create some use-cases, implement them in a
test which can pass with today's SW, and then once we have a common
understanding of what we want, we can implement it?

/Allan

>Using action goto chain to express flow order as follows:
> tc filter add dev swp0 chain 0 parent ffff: flower skip_sw \
> action goto chain 1
>
>Signed-off-by: Xiaoliang Yang <[email protected]>
>---
> drivers/net/ethernet/mscc/ocelot_ace.c | 51 +++++++++++++++--------
> drivers/net/ethernet/mscc/ocelot_ace.h | 7 ++--
> drivers/net/ethernet/mscc/ocelot_flower.c | 46 +++++++++++++++++---
> include/soc/mscc/ocelot.h | 2 +-
> include/soc/mscc/ocelot_vcap.h | 4 +-
> 5 files changed, 81 insertions(+), 29 deletions(-)
>
>diff --git a/drivers/net/ethernet/mscc/ocelot_ace.c b/drivers/net/ethernet/mscc/ocelot_ace.c
>index 748c618db7d8..b76593b40097 100644
>--- a/drivers/net/ethernet/mscc/ocelot_ace.c
>+++ b/drivers/net/ethernet/mscc/ocelot_ace.c
>@@ -341,6 +341,8 @@ static void is2_action_set(struct ocelot *ocelot, struct vcap_data *data,
> vcap_action_set(vcap, data, VCAP_IS2_ACT_CPU_QU_NUM, 0);
> vcap_action_set(vcap, data, VCAP_IS2_ACT_CPU_COPY_ENA, 0);
> break;
>+ default:
>+ break;
> }
> }
>
>@@ -644,9 +646,9 @@ static void is2_entry_set(struct ocelot *ocelot, int ix,
> }
>
> static void vcap_entry_get(struct ocelot *ocelot, struct ocelot_ace_rule *rule,
>- int ix)
>+ int ix, int block_id)
> {
>- const struct vcap_props *vcap = &ocelot->vcap[VCAP_IS2];
>+ const struct vcap_props *vcap = &ocelot->vcap[block_id];
> struct vcap_data data;
> int row, count;
> u32 cnt;
>@@ -663,6 +665,19 @@ static void vcap_entry_get(struct ocelot *ocelot, struct ocelot_ace_rule *rule,
> rule->stats.pkts = cnt;
> }
>
>+static void vcap_entry_set(struct ocelot *ocelot, int ix,
>+ struct ocelot_ace_rule *ace,
>+ int block_id)
>+{
>+ switch (block_id) {
>+ case VCAP_IS2:
>+ is2_entry_set(ocelot, ix, ace);
>+ break;
>+ default:
>+ break;
>+ }
>+}
>+
> static void ocelot_ace_rule_add(struct ocelot *ocelot,
> struct ocelot_acl_block *block,
> struct ocelot_ace_rule *rule)
>@@ -790,7 +805,7 @@ static bool ocelot_ace_is_problematic_non_mac_etype(struct ocelot_ace_rule *ace)
> static bool ocelot_exclusive_mac_etype_ace_rules(struct ocelot *ocelot,
> struct ocelot_ace_rule *ace)
> {
>- struct ocelot_acl_block *block = &ocelot->acl_block;
>+ struct ocelot_acl_block *block = &ocelot->acl_block[VCAP_IS2];
> struct ocelot_ace_rule *tmp;
> unsigned long port;
> int i;
>@@ -824,15 +839,16 @@ static bool ocelot_exclusive_mac_etype_ace_rules(struct ocelot *ocelot,
> return true;
> }
>
>-int ocelot_ace_rule_offload_add(struct ocelot *ocelot,
>+int ocelot_ace_rule_offload_add(struct ocelot *ocelot, int block_id,
> struct ocelot_ace_rule *rule,
> struct netlink_ext_ack *extack)
> {
>- struct ocelot_acl_block *block = &ocelot->acl_block;
>+ struct ocelot_acl_block *block = &ocelot->acl_block[block_id];
> struct ocelot_ace_rule *ace;
> int i, index;
>
>- if (!ocelot_exclusive_mac_etype_ace_rules(ocelot, rule)) {
>+ if (block_id == VCAP_IS2 &&
>+ !ocelot_exclusive_mac_etype_ace_rules(ocelot, rule)) {
> NL_SET_ERR_MSG_MOD(extack,
> "Cannot mix MAC_ETYPE with non-MAC_ETYPE rules");
> return -EBUSY;
>@@ -847,11 +863,11 @@ int ocelot_ace_rule_offload_add(struct ocelot *ocelot,
> /* Move down the rules to make place for the new rule */
> for (i = block->count - 1; i > index; i--) {
> ace = ocelot_ace_rule_get_rule_index(block, i);
>- is2_entry_set(ocelot, i, ace);
>+ vcap_entry_set(ocelot, i, ace, block_id);
> }
>
> /* Now insert the new rule */
>- is2_entry_set(ocelot, index, rule);
>+ vcap_entry_set(ocelot, index, rule, block_id);
> return 0;
> }
>
>@@ -902,10 +918,10 @@ static void ocelot_ace_rule_del(struct ocelot *ocelot,
> block->count--;
> }
>
>-int ocelot_ace_rule_offload_del(struct ocelot *ocelot,
>+int ocelot_ace_rule_offload_del(struct ocelot *ocelot, int block_id,
> struct ocelot_ace_rule *rule)
> {
>- struct ocelot_acl_block *block = &ocelot->acl_block;
>+ struct ocelot_acl_block *block = &ocelot->acl_block[block_id];
> struct ocelot_ace_rule del_ace;
> struct ocelot_ace_rule *ace;
> int i, index;
>@@ -921,29 +937,29 @@ int ocelot_ace_rule_offload_del(struct ocelot *ocelot,
> /* Move up all the blocks over the deleted rule */
> for (i = index; i < block->count; i++) {
> ace = ocelot_ace_rule_get_rule_index(block, i);
>- is2_entry_set(ocelot, i, ace);
>+ vcap_entry_set(ocelot, i, ace, block_id);
> }
>
> /* Now delete the last rule, because it is duplicated */
>- is2_entry_set(ocelot, block->count, &del_ace);
>+ vcap_entry_set(ocelot, block->count, &del_ace, block_id);
>
> return 0;
> }
>
>-int ocelot_ace_rule_stats_update(struct ocelot *ocelot,
>+int ocelot_ace_rule_stats_update(struct ocelot *ocelot, int block_id,
> struct ocelot_ace_rule *rule)
> {
>- struct ocelot_acl_block *block = &ocelot->acl_block;
>+ struct ocelot_acl_block *block = &ocelot->acl_block[block_id];
> struct ocelot_ace_rule *tmp;
> int index;
>
> index = ocelot_ace_rule_get_index_id(block, rule);
>- vcap_entry_get(ocelot, rule, index);
>+ vcap_entry_get(ocelot, rule, index, block_id);
>
> /* After we get the result we need to clear the counters */
> tmp = ocelot_ace_rule_get_rule_index(block, index);
> tmp->stats.pkts = 0;
>- is2_entry_set(ocelot, index, tmp);
>+ vcap_entry_set(ocelot, index, tmp, block_id);
>
> return 0;
> }
>@@ -968,7 +984,7 @@ static void vcap_init(struct ocelot *ocelot, const struct vcap_props *vcap)
>
> int ocelot_ace_init(struct ocelot *ocelot)
> {
>- struct ocelot_acl_block *block = &ocelot->acl_block;
>+ struct ocelot_acl_block *block;
>
> vcap_init(ocelot, &ocelot->vcap[VCAP_IS2]);
>
>@@ -987,6 +1003,7 @@ int ocelot_ace_init(struct ocelot *ocelot)
> ocelot_write_gix(ocelot, 0x3fffff, ANA_POL_CIR_STATE,
> OCELOT_POLICER_DISCARD);
>
>+ block = &ocelot->acl_block[VCAP_IS2];
> block->pol_lpr = OCELOT_POLICER_DISCARD - 1;
> INIT_LIST_HEAD(&block->rules);
>
>diff --git a/drivers/net/ethernet/mscc/ocelot_ace.h b/drivers/net/ethernet/mscc/ocelot_ace.h
>index 099e177f2617..a9fd99401a65 100644
>--- a/drivers/net/ethernet/mscc/ocelot_ace.h
>+++ b/drivers/net/ethernet/mscc/ocelot_ace.h
>@@ -175,6 +175,7 @@ struct ocelot_ace_frame_ipv6 {
> };
>
> enum ocelot_ace_action {
>+ OCELOT_ACL_ACTION_NULL,
> OCELOT_ACL_ACTION_DROP,
> OCELOT_ACL_ACTION_TRAP,
> OCELOT_ACL_ACTION_POLICE,
>@@ -214,12 +215,12 @@ struct ocelot_ace_rule {
> u32 pol_ix;
> };
>
>-int ocelot_ace_rule_offload_add(struct ocelot *ocelot,
>+int ocelot_ace_rule_offload_add(struct ocelot *ocelot, int block_id,
> struct ocelot_ace_rule *rule,
> struct netlink_ext_ack *extack);
>-int ocelot_ace_rule_offload_del(struct ocelot *ocelot,
>+int ocelot_ace_rule_offload_del(struct ocelot *ocelot, int block_id,
> struct ocelot_ace_rule *rule);
>-int ocelot_ace_rule_stats_update(struct ocelot *ocelot,
>+int ocelot_ace_rule_stats_update(struct ocelot *ocelot, int block_id,
> struct ocelot_ace_rule *rule);
>
> int ocelot_ace_init(struct ocelot *ocelot);
>diff --git a/drivers/net/ethernet/mscc/ocelot_flower.c b/drivers/net/ethernet/mscc/ocelot_flower.c
>index 891925f73cbc..a1f7b6b28170 100644
>--- a/drivers/net/ethernet/mscc/ocelot_flower.c
>+++ b/drivers/net/ethernet/mscc/ocelot_flower.c
>@@ -9,13 +9,26 @@
>
> #include "ocelot_ace.h"
>
>+static int ocelot_block_id_get(int chain, bool ingress)
>+{
>+ /* Select TCAM blocks by using chain index. Rules in chain 0 are
>+ * implemented on IS1, chain 1 are implemented on IS2, and egress
>+ * chain corresponds to ES0 block.
>+ */
>+ if (ingress)
>+ return chain ? VCAP_IS2 : VCAP_IS1;
>+ else
>+ return VCAP_ES0;
>+}
>+
> static int ocelot_flower_parse_action(struct flow_cls_offload *f,
> struct ocelot_ace_rule *ace)
> {
>+ struct netlink_ext_ack *extack = f->common.extack;
> const struct flow_action_entry *a;
>+ int i, allowed_chain = 0;
> s64 burst;
> u64 rate;
>- int i;
>
> if (!flow_offload_has_one_action(&f->rule->action))
> return -EOPNOTSUPP;
>@@ -28,9 +41,11 @@ static int ocelot_flower_parse_action(struct flow_cls_offload *f,
> switch (a->id) {
> case FLOW_ACTION_DROP:
> ace->action = OCELOT_ACL_ACTION_DROP;
>+ allowed_chain = 1;
> break;
> case FLOW_ACTION_TRAP:
> ace->action = OCELOT_ACL_ACTION_TRAP;
>+ allowed_chain = 1;
> break;
> case FLOW_ACTION_POLICE:
> ace->action = OCELOT_ACL_ACTION_POLICE;
>@@ -38,10 +53,23 @@ static int ocelot_flower_parse_action(struct flow_cls_offload *f,
> ace->pol.rate = div_u64(rate, 1000) * 8;
> burst = rate * PSCHED_NS2TICKS(a->police.burst);
> ace->pol.burst = div_u64(burst, PSCHED_TICKS_PER_SEC);
>+ allowed_chain = 1;
>+ break;
>+ case FLOW_ACTION_GOTO:
>+ if (a->chain_index != f->common.chain_index + 1) {
>+ NL_SET_ERR_MSG_MOD(extack, "HW only support goto next chain\n");
>+ return -EOPNOTSUPP;
>+ }
>+ ace->action = OCELOT_ACL_ACTION_NULL;
>+ allowed_chain = f->common.chain_index;
> break;
> default:
> return -EOPNOTSUPP;
> }
>+ if (f->common.chain_index != allowed_chain) {
>+ NL_SET_ERR_MSG_MOD(extack, "Action is not supported on this chain\n");
>+ return -EOPNOTSUPP;
>+ }
> }
>
> return 0;
>@@ -205,7 +233,7 @@ int ocelot_cls_flower_replace(struct ocelot *ocelot, int port,
> struct flow_cls_offload *f, bool ingress)
> {
> struct ocelot_ace_rule *ace;
>- int ret;
>+ int ret, block_id;
>
> ace = ocelot_ace_rule_create(ocelot, port, f);
> if (!ace)
>@@ -216,8 +244,10 @@ int ocelot_cls_flower_replace(struct ocelot *ocelot, int port,
> kfree(ace);
> return ret;
> }
>+ block_id = ocelot_block_id_get(f->common.chain_index, ingress);
>
>- return ocelot_ace_rule_offload_add(ocelot, ace, f->common.extack);
>+ return ocelot_ace_rule_offload_add(ocelot, block_id, ace,
>+ f->common.extack);
> }
> EXPORT_SYMBOL_GPL(ocelot_cls_flower_replace);
>
>@@ -225,11 +255,13 @@ int ocelot_cls_flower_destroy(struct ocelot *ocelot, int port,
> struct flow_cls_offload *f, bool ingress)
> {
> struct ocelot_ace_rule ace;
>+ int block_id;
>
> ace.prio = f->common.prio;
> ace.id = f->cookie;
>+ block_id = ocelot_block_id_get(f->common.chain_index, ingress);
>
>- return ocelot_ace_rule_offload_del(ocelot, &ace);
>+ return ocelot_ace_rule_offload_del(ocelot, block_id, &ace);
> }
> EXPORT_SYMBOL_GPL(ocelot_cls_flower_destroy);
>
>@@ -237,11 +269,13 @@ int ocelot_cls_flower_stats(struct ocelot *ocelot, int port,
> struct flow_cls_offload *f, bool ingress)
> {
> struct ocelot_ace_rule ace;
>- int ret;
>+ int ret, block_id;
>
> ace.prio = f->common.prio;
> ace.id = f->cookie;
>- ret = ocelot_ace_rule_stats_update(ocelot, &ace);
>+ block_id = ocelot_block_id_get(f->common.chain_index, ingress);
>+
>+ ret = ocelot_ace_rule_stats_update(ocelot, block_id, &ace);
> if (ret)
> return ret;
>
>diff --git a/include/soc/mscc/ocelot.h b/include/soc/mscc/ocelot.h
>index 91357b1c8f31..4b2320bdc036 100644
>--- a/include/soc/mscc/ocelot.h
>+++ b/include/soc/mscc/ocelot.h
>@@ -540,7 +540,7 @@ struct ocelot {
>
> struct list_head multicast;
>
>- struct ocelot_acl_block acl_block;
>+ struct ocelot_acl_block acl_block[3];
>
> const struct vcap_props *vcap;
>
>diff --git a/include/soc/mscc/ocelot_vcap.h b/include/soc/mscc/ocelot_vcap.h
>index 26d9384b3657..495847a40490 100644
>--- a/include/soc/mscc/ocelot_vcap.h
>+++ b/include/soc/mscc/ocelot_vcap.h
>@@ -14,9 +14,9 @@
> */
>
> enum {
>- /* VCAP_IS1, */
>+ VCAP_IS1,
> VCAP_IS2,
>- /* VCAP_ES0, */
>+ VCAP_ES0,
> };
>
> struct vcap_props {
>--
>2.17.1
>
/Allan

2020-06-02 08:52:11

by Xiaoliang Yang

[permalink] [raw]
Subject: RE: [EXT] Re: [PATCH v2 net-next 00/10] net: ocelot: VCAP IS1 and ES0 support

Hi Vladimir,

On Tus, 2 Jun 2020 at 16:04,
> First of all, net-next has just closed yesterday and will be closed for the following 2 weeks:
> https://eur01.safelinks.protection.outlook.com/?url=http:%2F%2Fvger.kernel.org%2F~davem%2Fnet-next.html&amp;data=02%7C01% 7Cxiaoliang.yang_1%40nxp.com%7C2fad4495dabc4f4ca5fd08d806cb70af%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C637266818117666386&amp;sdata=ziVybWb4HzYXanehF5KwNv5RJL%2BZz6NeFvrZWg657B8%3D&amp;reserved=0
>
> Secondly, could you give an example of how different chains could express the fact that rules are executed in parallel between the IS1,
> IS2 and ES0 TCAMs?
>

Different TCAMs are not running in parallel, they have flow order: IS1->IS2->ES0. Using goto chain to express the flow order.
For example:
tc qdisc add dev swp0 ingress
tc filter add dev swp0 chain 0 protocol 802.1Q parent ffff: flower skip_sw vlan_id 1 vlan_prio 1 action vlan modify id 2 priority 2 action goto chain 1
tc filter add dev swp0 chain 1 protocol 802.1Q parent ffff: flower skip_sw vlan_id 2 vlan_prio 2 action drop
In this example, package with (vid=1,pcp=1) vlan tag will be modified to (vid=2,pcp=2) vlan tag on IS1, then will be dropped on IS2.

If there is no rule match on IS1, it will still lookup on IS2. We can set a rule on chain 0 to express this:
tc filter add dev swp0 chain 0 parent ffff: flower skip_sw action goto chain 1

In addition, VSC9959 chip has PSFP and "Sequence Generation recovery" modules are running after IS2, the flow order like this: IS1->IS2->PSFP-> "Sequence Generation recovery" ->ES0, we can also add chains like this to express these two modules in future.

BTW, where should I sent patches to due to net-next closed?

Thanks,
Xiaoliang Yang

2020-06-03 10:07:09

by Vladimir Oltean

[permalink] [raw]
Subject: Re: [PATCH v2 net-next 03/10] net: mscc: ocelot: allocated rules to different hardware VCAP TCAMs by chain index

Hi Allan,

On Tue, 2 Jun 2020 at 11:38, Allan W. Nielsen
<[email protected]> wrote:
>
> Hi Xiaoliang,
>
> Happy to see that you are moving in the directions of multi chain - this
> seems ilke a much better fit to me.
>
>
> On 02.06.2020 13:18, Xiaoliang Yang wrote:
> >There are three hardware TCAMs for ocelot chips: IS1, IS2 and ES0. Each
> >one supports different actions. The hardware flow order is: IS1->IS2->ES0.
> >
> >This patch add three blocks to store rules according to chain index.
> >chain 0 is offloaded to IS1, chain 1 is offloaded to IS2, and egress chain
> >0 is offloaded to ES0.
>
> Using "static" allocation to to say chain-X goes to TCAM Y, also seems
> like the right approach to me. Given the capabilities of the HW, this
> will most likely be the easiest scheme to implement and to explain to
> the end-user.
>
> But I think we should make some adjustments to this mapping schema.
>
> Here are some important "things" I would like to consider when defining
> this schema:
>
> - As you explain, we have 3 TCAMs (IS1, IS2 and ES0), but we have 3
> parallel lookups in IS1 and 2 parallel lookups in IS2 - and also these
> TCAMs has a wide verity of keys.
>
> - We can utilize these multiple parallel lookups such that it seems like
> they are done in serial (that is if they do not touch the same
> actions), but as they are done in parallel they can not influence each
> other.
>
> - We can let IS1 influence the IS2 lookup (like the GOTO actions was
> intended to be used).
>
> - The chip also has other QoS classification facilities which sits
> before the TCAM (take a look at 3.7.3 QoS, DP, and DSCP Classification
> in vsc7514 datasheet). It we at some point in time want to enable
> this, then I think we need to do that in the same tc-flower framework.
>
> Here is my initial suggestion for an alternative chain-schema:
>
> Chain 0: The default chain - today this is in IS2. If we proceed
> with this as is - then this will change.
> Chain 1-9999: These are offloaded by "basic" classification.
> Chain 10000-19999: These are offloaded in IS1
> Chain 10000: Lookup-0 in IS1, and here we could limit the
> action to do QoS related stuff (priority
> update)
> Chain 11000: Lookup-1 in IS1, here we could do VLAN
> stuff
> Chain 12000: Lookup-2 in IS1, here we could apply the
> "PAG" which is essentially a GOTO.
>
> Chain 20000-29999: These are offloaded in IS2
> Chain 20000-20255: Lookup-0 in IS2, where CHAIN-ID -
> 20000 is the PAG value.
> Chain 21000-21000: Lookup-1 in IS2.
>
> All these chains should be optional - users should only need to
> configure the chains they need. To make this work, we need to configure
> both the desired actions (could be priority update) and the goto action.
> Remember in HW, all packets goes through this process, while in SW they
> only follow the "goto" path.
>
> An example could be (I have not tested this yet - sorry):
>
> tc qdisc add dev eth0 ingress
>
> # Activate lookup 11000. We can not do any other rules in chain 0, also
> # this implicitly means that we do not want any chains <11000.
> tc filter add dev eth0 parent ffff: chain 0
> action
> matchall goto 11000
>
> tc filter add dev eth0 parent ffff: chain 11000 \
> flower src_mac 00:01:00:00:00:00/00:ff:00:00:00:00 \
> action \
> vlan modify id 1234 \
> pipe \
> goto 20001
>
> tc filter add dev eth0 parent ffff: chain 20001 ...
>
> Maybe it would be an idea to create some use-cases, implement them in a
> test which can pass with today's SW, and then once we have a common
> understanding of what we want, we can implement it?
>
> /Allan
>
> >Using action goto chain to express flow order as follows:
> > tc filter add dev swp0 chain 0 parent ffff: flower skip_sw \
> > action goto chain 1
> >
> >Signed-off-by: Xiaoliang Yang <[email protected]>
> >---
> > drivers/net/ethernet/mscc/ocelot_ace.c | 51 +++++++++++++++--------
> > drivers/net/ethernet/mscc/ocelot_ace.h | 7 ++--
> > drivers/net/ethernet/mscc/ocelot_flower.c | 46 +++++++++++++++++---
> > include/soc/mscc/ocelot.h | 2 +-
> > include/soc/mscc/ocelot_vcap.h | 4 +-
> > 5 files changed, 81 insertions(+), 29 deletions(-)

> /Allan

What would be the advantage, from a user perspective, in exposing the
3 IS1 lookups as separate chains with orthogonal actions?
If the user wants to add an IS1 action that performs QoS
classification, VLAN classification and selects a custom PAG, they
would have to install 3 separate filters with the same key, each into
its own chain. Then the driver would be smart enough to figure out
that the 3 keys are actually the same, so it could merge them.
In comparison, we could just add a single filter to the IS1 chain,
with 3 actions (skbedit priority, vlan modify, goto is2).

Thanks,
-Vladimir

2020-06-08 13:58:51

by [email protected]

[permalink] [raw]
Subject: Re: [PATCH v2 net-next 03/10] net: mscc: ocelot: allocated rules to different hardware VCAP TCAMs by chain index

On 03.06.2020 13:04, Vladimir Oltean wrote:
>On Tue, 2 Jun 2020 at 11:38, Allan W. Nielsen
><[email protected]> wrote:
>>
>> Hi Xiaoliang,
>>
>> Happy to see that you are moving in the directions of multi chain - this
>> seems ilke a much better fit to me.
>>
>>
>> On 02.06.2020 13:18, Xiaoliang Yang wrote:
>> >There are three hardware TCAMs for ocelot chips: IS1, IS2 and ES0. Each
>> >one supports different actions. The hardware flow order is: IS1->IS2->ES0.
>> >
>> >This patch add three blocks to store rules according to chain index.
>> >chain 0 is offloaded to IS1, chain 1 is offloaded to IS2, and egress chain
>> >0 is offloaded to ES0.
>>
>> Using "static" allocation to to say chain-X goes to TCAM Y, also seems
>> like the right approach to me. Given the capabilities of the HW, this
>> will most likely be the easiest scheme to implement and to explain to
>> the end-user.
>>
>> But I think we should make some adjustments to this mapping schema.
>>
>> Here are some important "things" I would like to consider when defining
>> this schema:
>>
>> - As you explain, we have 3 TCAMs (IS1, IS2 and ES0), but we have 3
>> parallel lookups in IS1 and 2 parallel lookups in IS2 - and also these
>> TCAMs has a wide verity of keys.
>>
>> - We can utilize these multiple parallel lookups such that it seems like
>> they are done in serial (that is if they do not touch the same
>> actions), but as they are done in parallel they can not influence each
>> other.
>>
>> - We can let IS1 influence the IS2 lookup (like the GOTO actions was
>> intended to be used).
>>
>> - The chip also has other QoS classification facilities which sits
>> before the TCAM (take a look at 3.7.3 QoS, DP, and DSCP Classification
>> in vsc7514 datasheet). It we at some point in time want to enable
>> this, then I think we need to do that in the same tc-flower framework.
>>
>> Here is my initial suggestion for an alternative chain-schema:
>>
>> Chain 0: The default chain - today this is in IS2. If we proceed
>> with this as is - then this will change.
>> Chain 1-9999: These are offloaded by "basic" classification.
>> Chain 10000-19999: These are offloaded in IS1
>> Chain 10000: Lookup-0 in IS1, and here we could limit the
>> action to do QoS related stuff (priority
>> update)
>> Chain 11000: Lookup-1 in IS1, here we could do VLAN
>> stuff
>> Chain 12000: Lookup-2 in IS1, here we could apply the
>> "PAG" which is essentially a GOTO.
>>
>> Chain 20000-29999: These are offloaded in IS2
>> Chain 20000-20255: Lookup-0 in IS2, where CHAIN-ID -
>> 20000 is the PAG value.
>> Chain 21000-21000: Lookup-1 in IS2.
>>
>> All these chains should be optional - users should only need to
>> configure the chains they need. To make this work, we need to configure
>> both the desired actions (could be priority update) and the goto action.
>> Remember in HW, all packets goes through this process, while in SW they
>> only follow the "goto" path.
>>
>> An example could be (I have not tested this yet - sorry):
>>
>> tc qdisc add dev eth0 ingress
>>
>> # Activate lookup 11000. We can not do any other rules in chain 0, also
>> # this implicitly means that we do not want any chains <11000.
>> tc filter add dev eth0 parent ffff: chain 0
>> action
>> matchall goto 11000
>>
>> tc filter add dev eth0 parent ffff: chain 11000 \
>> flower src_mac 00:01:00:00:00:00/00:ff:00:00:00:00 \
>> action \
>> vlan modify id 1234 \
>> pipe \
>> goto 20001
>>
>> tc filter add dev eth0 parent ffff: chain 20001 ...
>>
>> Maybe it would be an idea to create some use-cases, implement them in a
>> test which can pass with today's SW, and then once we have a common
>> understanding of what we want, we can implement it?
>>
>> /Allan
>>
>> >Using action goto chain to express flow order as follows:
>> > tc filter add dev swp0 chain 0 parent ffff: flower skip_sw \
>> > action goto chain 1
>> >
>> >Signed-off-by: Xiaoliang Yang <[email protected]>
>> >---
>> > drivers/net/ethernet/mscc/ocelot_ace.c | 51 +++++++++++++++--------
>> > drivers/net/ethernet/mscc/ocelot_ace.h | 7 ++--
>> > drivers/net/ethernet/mscc/ocelot_flower.c | 46 +++++++++++++++++---
>> > include/soc/mscc/ocelot.h | 2 +-
>> > include/soc/mscc/ocelot_vcap.h | 4 +-
>> > 5 files changed, 81 insertions(+), 29 deletions(-)
>
>> /Allan
>
>What would be the advantage, from a user perspective, in exposing the
>3 IS1 lookups as separate chains with orthogonal actions?
>If the user wants to add an IS1 action that performs QoS
>classification, VLAN classification and selects a custom PAG, they
>would have to install 3 separate filters with the same key, each into
>its own chain. Then the driver would be smart enough to figure out
>that the 3 keys are actually the same, so it could merge them.
>In comparison, we could just add a single filter to the IS1 chain,
>with 3 actions (skbedit priority, vlan modify, goto is2).

Hi, I realize I forgot to answer this one - sorry.

The reason for this design is that we have use-cases where the rules to
do QoS classification must not impact VLAN classification. The easiest
way to do that, it to have it in separated lookups.

But we could make this more flexible to support your use-case better. A
alternative approach would be to assign exclusive-right-to-action on
first use. If the user choose to use the VLAN update in a given loopup,
then it cannot be used in others.

If the user attempt to use a given action across different lookups we
need to return an error.

Would that work for you? Any downside in such approach?

/Allan

2020-06-09 11:41:11

by Vladimir Oltean

[permalink] [raw]
Subject: Re: [EXT] Re: [PATCH v2 net-next 00/10] net: ocelot: VCAP IS1 and ES0 support

Hi Xiaoliang,

On Tue, 2 Jun 2020 at 11:50, Xiaoliang Yang <[email protected]> wrote:
>
> Hi Vladimir,
>
> On Tus, 2 Jun 2020 at 16:04,
> > First of all, net-next has just closed yesterday and will be closed for the following 2 weeks:
> > https://eur01.safelinks.protection.outlook.com/?url=http:%2F%2Fvger.kernel.org%2F~davem%2Fnet-next.html&amp;data=02%7C01% 7Cxiaoliang.yang_1%40nxp.com%7C2fad4495dabc4f4ca5fd08d806cb70af%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C637266818117666386&amp;sdata=ziVybWb4HzYXanehF5KwNv5RJL%2BZz6NeFvrZWg657B8%3D&amp;reserved=0
> >
> > Secondly, could you give an example of how different chains could express the fact that rules are executed in parallel between the IS1,
> > IS2 and ES0 TCAMs?
> >
>
> Different TCAMs are not running in parallel, they have flow order: IS1->IS2->ES0. Using goto chain to express the flow order.
> For example:
> tc qdisc add dev swp0 ingress
> tc filter add dev swp0 chain 0 protocol 802.1Q parent ffff: flower skip_sw vlan_id 1 vlan_prio 1 action vlan modify id 2 priority 2 action goto chain 1
> tc filter add dev swp0 chain 1 protocol 802.1Q parent ffff: flower skip_sw vlan_id 2 vlan_prio 2 action drop
> In this example, package with (vid=1,pcp=1) vlan tag will be modified to (vid=2,pcp=2) vlan tag on IS1, then will be dropped on IS2.
>
> If there is no rule match on IS1, it will still lookup on IS2. We can set a rule on chain 0 to express this:
> tc filter add dev swp0 chain 0 parent ffff: flower skip_sw action goto chain 1
>
> In addition, VSC9959 chip has PSFP and "Sequence Generation recovery" modules are running after IS2, the flow order like this: IS1->IS2->PSFP-> "Sequence Generation recovery" ->ES0, we can also add chains like this to express these two modules in future.
>

I've been pondering over what is a good abstraction for 802.1CB and I
don't think that it would be a tc action. After reading Annex C "Frame
Replication and Elimination for Reliability in systems" in
8021CB-2017, I think maybe it should be modeled as a stacked net
device a la hsr, but with the ability to add its own stream filtering
rules and actions (a la bridge fdb).
But for the PSFP policers, in principle I think you are correct, we
could designate a static chain id for those.

> BTW, where should I sent patches to due to net-next closed?
>

You can keep sending patches just as you did. There's nothing wrong
with doing that as long as you're only doing it for the feedback (RFC
== Request For Comments).
Since David receives a lot of patches and the backlog builds up very
quickly, he just rejects patches sent to net-next during the merge
window instead of queuing them up.
Patches that are bugfixes (not the case here, just in general) can be
sent to the net tree at all times (even during the merge window).
In all cases, the mailing list is the same, just the --subject-prefix
is different (net, net-next, rfc).

> Thanks,
> Xiaoliang Yang

Thanks,
-Vladimir

2020-06-09 13:04:13

by Vladimir Oltean

[permalink] [raw]
Subject: Re: [PATCH v2 net-next 03/10] net: mscc: ocelot: allocated rules to different hardware VCAP TCAMs by chain index

Hi Allan,

On Mon, 8 Jun 2020 at 16:56, Allan W. Nielsen
<[email protected]> wrote:
>
> On 03.06.2020 13:04, Vladimir Oltean wrote:
> >On Tue, 2 Jun 2020 at 11:38, Allan W. Nielsen
> ><[email protected]> wrote:
> >>
> >> Hi Xiaoliang,
> >>
> >> Happy to see that you are moving in the directions of multi chain - this
> >> seems ilke a much better fit to me.
> >>
> >>
> >> On 02.06.2020 13:18, Xiaoliang Yang wrote:
> >> >There are three hardware TCAMs for ocelot chips: IS1, IS2 and ES0. Each
> >> >one supports different actions. The hardware flow order is: IS1->IS2->ES0.
> >> >
> >> >This patch add three blocks to store rules according to chain index.
> >> >chain 0 is offloaded to IS1, chain 1 is offloaded to IS2, and egress chain
> >> >0 is offloaded to ES0.
> >>
> >> Using "static" allocation to to say chain-X goes to TCAM Y, also seems
> >> like the right approach to me. Given the capabilities of the HW, this
> >> will most likely be the easiest scheme to implement and to explain to
> >> the end-user.
> >>
> >> But I think we should make some adjustments to this mapping schema.
> >>
> >> Here are some important "things" I would like to consider when defining
> >> this schema:
> >>
> >> - As you explain, we have 3 TCAMs (IS1, IS2 and ES0), but we have 3
> >> parallel lookups in IS1 and 2 parallel lookups in IS2 - and also these
> >> TCAMs has a wide verity of keys.
> >>
> >> - We can utilize these multiple parallel lookups such that it seems like
> >> they are done in serial (that is if they do not touch the same
> >> actions), but as they are done in parallel they can not influence each
> >> other.
> >>
> >> - We can let IS1 influence the IS2 lookup (like the GOTO actions was
> >> intended to be used).
> >>
> >> - The chip also has other QoS classification facilities which sits
> >> before the TCAM (take a look at 3.7.3 QoS, DP, and DSCP Classification
> >> in vsc7514 datasheet). It we at some point in time want to enable
> >> this, then I think we need to do that in the same tc-flower framework.
> >>
> >> Here is my initial suggestion for an alternative chain-schema:
> >>
> >> Chain 0: The default chain - today this is in IS2. If we proceed
> >> with this as is - then this will change.
> >> Chain 1-9999: These are offloaded by "basic" classification.
> >> Chain 10000-19999: These are offloaded in IS1
> >> Chain 10000: Lookup-0 in IS1, and here we could limit the
> >> action to do QoS related stuff (priority
> >> update)
> >> Chain 11000: Lookup-1 in IS1, here we could do VLAN
> >> stuff
> >> Chain 12000: Lookup-2 in IS1, here we could apply the
> >> "PAG" which is essentially a GOTO.
> >>
> >> Chain 20000-29999: These are offloaded in IS2
> >> Chain 20000-20255: Lookup-0 in IS2, where CHAIN-ID -
> >> 20000 is the PAG value.
> >> Chain 21000-21000: Lookup-1 in IS2.
> >>
> >> All these chains should be optional - users should only need to
> >> configure the chains they need. To make this work, we need to configure
> >> both the desired actions (could be priority update) and the goto action.
> >> Remember in HW, all packets goes through this process, while in SW they
> >> only follow the "goto" path.
> >>
> >> An example could be (I have not tested this yet - sorry):
> >>
> >> tc qdisc add dev eth0 ingress
> >>
> >> # Activate lookup 11000. We can not do any other rules in chain 0, also
> >> # this implicitly means that we do not want any chains <11000.
> >> tc filter add dev eth0 parent ffff: chain 0
> >> action
> >> matchall goto 11000
> >>
> >> tc filter add dev eth0 parent ffff: chain 11000 \
> >> flower src_mac 00:01:00:00:00:00/00:ff:00:00:00:00 \
> >> action \
> >> vlan modify id 1234 \
> >> pipe \
> >> goto 20001
> >>
> >> tc filter add dev eth0 parent ffff: chain 20001 ...
> >>
> >> Maybe it would be an idea to create some use-cases, implement them in a
> >> test which can pass with today's SW, and then once we have a common
> >> understanding of what we want, we can implement it?
> >>
> >> /Allan
> >>
> >> >Using action goto chain to express flow order as follows:
> >> > tc filter add dev swp0 chain 0 parent ffff: flower skip_sw \
> >> > action goto chain 1
> >> >
> >> >Signed-off-by: Xiaoliang Yang <[email protected]>
> >> >---
> >> > drivers/net/ethernet/mscc/ocelot_ace.c | 51 +++++++++++++++--------
> >> > drivers/net/ethernet/mscc/ocelot_ace.h | 7 ++--
> >> > drivers/net/ethernet/mscc/ocelot_flower.c | 46 +++++++++++++++++---
> >> > include/soc/mscc/ocelot.h | 2 +-
> >> > include/soc/mscc/ocelot_vcap.h | 4 +-
> >> > 5 files changed, 81 insertions(+), 29 deletions(-)
> >
> >> /Allan
> >
> >What would be the advantage, from a user perspective, in exposing the
> >3 IS1 lookups as separate chains with orthogonal actions?
> >If the user wants to add an IS1 action that performs QoS
> >classification, VLAN classification and selects a custom PAG, they
> >would have to install 3 separate filters with the same key, each into
> >its own chain. Then the driver would be smart enough to figure out
> >that the 3 keys are actually the same, so it could merge them.
> >In comparison, we could just add a single filter to the IS1 chain,
> >with 3 actions (skbedit priority, vlan modify, goto is2).
>
> Hi, I realize I forgot to answer this one - sorry.
>
> The reason for this design is that we have use-cases where the rules to
> do QoS classification must not impact VLAN classification. The easiest
> way to do that, it to have it in separated lookups.
>

Impact in the sense that an IS1 rule for VLAN classification might
'steal' packets from an IS1 rule for QoS classification (since the
TCAM stops after the first matching entry in each lookup)?
Such as:

tc filter add dev swp0 ingress protocol 802.1Q flower vlan_ethtype
ipv4 action skbedit priority 3
tc filter add dev swp0 ingress protocol 802.1Q flower vlan_id 0
vlan_prio 2 action vlan modify id 12 priority 1

The trouble with this TCAM seems to be that it doesn't support the
'pipe' control, so rules on overlapping keys will have unpredictable
results. So if we had these 2 rules in the same lookup in hardware,
then an IPv4 packet tagged with VID 0 and PCP 2 would get classified
to QoS class 3 but would not get retagged to VID 12 PCP 1, right?

So we should error out if the rule's control is 'pipe' (I haven't
checked if we can do that, I hope we can), and basically only 'goto'
something.
Which brings the discussion to "goto where?".
From the IS1 chain for QoS classification, you could goto the IS1
chain for VLAN classification, ok, that's your example. Is this goto
optional? (i.e. can you goto a PAG in IS2 directly, and in that case
would VLAN classification be skipped in hardware for the packets that
skip it in software?)
Your suggestion of having an entire chain for PAG selection suggests
to me that the rules in chain 10000 (IS1 for QoS classification) would
need to have a non-optional goto to chain 11000 (IS1 for VLAN
classification), which in turn would have a non-optional goto to chain
12000, where there would be a bunch of rules with just a goto action
that selects the IS2 policy. You can't have a goto from chains 10000
or 11000 directly into an IS2 chain (either the default policy of 0 or
a custom one).

> But we could make this more flexible to support your use-case better. A
> alternative approach would be to assign exclusive-right-to-action on
> first use. If the user choose to use the VLAN update in a given loopup,
> then it cannot be used in others.
>
> If the user attempt to use a given action across different lookups we
> need to return an error.
>
> Would that work for you? Any downside in such approach?
>
> /Allan
>

No, I definitely wasn't trying to suggest that.
If I understand correctly what you're proposing, then I don't
necessarily have anything against breaking compatibility with the
current single-chain layout. Our users will definitely need an extra
book for filling in TCAM filters (the other one being on how to use tc
in general), but on the other hand, at the end of the day, having
things like the actions rigidly assigned to static id's per chain is
the more sane thing to do given the TCAM design.

My only concern was literally that duplicating filter keys in a bunch
of different chains (all of which eventually land in the same TCAM,
just in different lookups) might result in waste of TCAM space. In
Felix, VCAP IS1 has space for 512 half key entries.
Consider just the trivial example when the user wants to do something like:

tc filter add dev swp0 ingress protocol 802.1Q flower \
vlan_ethtype ipv4 src_ip 192.168.1.1 \
action skbedit priority 1 \
action vlan modify id 101 priority 1
tc filter add dev swp0 ingress protocol 802.1Q flower \
vlan_ethtype ipv4 src_ip 192.168.1.2 \
action skbedit priority 2 \
action vlan modify id 101 priority 2

and 100 more rules like that.
With a forced "1-action-per-chain" split, the rules would have to look
like this:

Chain 10000:
tc filter add dev swp0 ingress protocol 802.1Q flower \
vlan_ethtype ipv4 src_ip 192.168.1.1 \
action skbedit priority 1
tc filter add dev swp0 ingress protocol 802.1Q flower \
vlan_ethtype ipv4 src_ip 192.168.1.2 \
action skbedit priority 2

Chain 11000:
tc filter add dev swp0 ingress protocol 802.1Q flower \
vlan_ethtype ipv4 src_ip 192.168.1.1 \
action vlan modify id 101 priority 1
tc filter add dev swp0 ingress protocol 802.1Q flower \
vlan_ethtype ipv4 src_ip 192.168.1.2 \
action vlan modify id 101 priority 2

So I would just like to avoid this latter approach consuming 200
entries. It sounds to me like it would need pretty complicated
bookkeeping in the driver.

Also, another topic: would the lookups in VCAP IS2 be ever exposed to
the user? Looking at the VSC7514 datasheet Table 60 "MASK_MODE and
PORT_MASK Combinations" (for how the action vectors resulted from
different lookups combine into a single result), I think it's
relatively difficult to make sense out of it.

And last question: why the large spacing between chain IDs in your example?

Thanks,
-Vladimir