2021-12-09 09:46:17

by Horatiu Vultur

[permalink] [raw]
Subject: [PATCH net-next v3 0/6] net: lan966x: Add switchdev and vlan support

This patch series extends lan966x with switchdev and vlan support.
The first patches just adds new registers and extend the MAC table to
handle the interrupts when a new address is learn/forget.
The last 2 patches adds the vlan and the switchdev support.

v2->v3:
- separate the PVID used when the port is in host mode or vlan unaware
- fix issue when the port was leaving the bridge

v1->v2:
- when allocating entries for the mac table use kzalloc instead of
devm_kzalloc
- also use GFP_KERNEL instead of GFP_ATOMIC, because is never called
in atomic context
- when deleting an mac table entry, the order of operations was wrong
- if ana irq is enabled make sure it gets disabled when the driver is
removed

Horatiu Vultur (6):
net: lan966x: Add registers that are used for switch and vlan
functionality
dt-bindings: net: lan966x: Extend with the analyzer interrupt
net: lan966x: add support for interrupts from analyzer
net: lan966x: More MAC table functionality
net: lan966x: Add vlan support
net: lan966x: Add switchdev support

.../net/microchip,lan966x-switch.yaml | 2 +
.../net/ethernet/microchip/lan966x/Makefile | 3 +-
.../ethernet/microchip/lan966x/lan966x_mac.c | 352 +++++++++++
.../ethernet/microchip/lan966x/lan966x_main.c | 99 +++-
.../ethernet/microchip/lan966x/lan966x_main.h | 73 ++-
.../ethernet/microchip/lan966x/lan966x_regs.h | 129 +++++
.../microchip/lan966x/lan966x_switchdev.c | 548 ++++++++++++++++++
.../ethernet/microchip/lan966x/lan966x_vlan.c | 446 ++++++++++++++
8 files changed, 1637 insertions(+), 15 deletions(-)
create mode 100644 drivers/net/ethernet/microchip/lan966x/lan966x_switchdev.c
create mode 100644 drivers/net/ethernet/microchip/lan966x/lan966x_vlan.c

--
2.33.0



2021-12-09 09:46:20

by Horatiu Vultur

[permalink] [raw]
Subject: [PATCH net-next v3 1/6] net: lan966x: Add registers that are used for switch and vlan functionality

This patch adds the registers that will be used to enable switchdev and
vlan functionality in the HW.

Signed-off-by: Horatiu Vultur <[email protected]>
---
.../ethernet/microchip/lan966x/lan966x_regs.h | 129 ++++++++++++++++++
1 file changed, 129 insertions(+)

diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_regs.h b/drivers/net/ethernet/microchip/lan966x/lan966x_regs.h
index 879dcd807dec..2f2b26b9f8c6 100644
--- a/drivers/net/ethernet/microchip/lan966x/lan966x_regs.h
+++ b/drivers/net/ethernet/microchip/lan966x/lan966x_regs.h
@@ -61,6 +61,9 @@ enum lan966x_target {
#define ANA_ADVLEARN_VLAN_CHK_GET(x)\
FIELD_GET(ANA_ADVLEARN_VLAN_CHK, x)

+/* ANA:ANA:VLANMASK */
+#define ANA_VLANMASK __REG(TARGET_ANA, 0, 1, 29824, 0, 1, 244, 8, 0, 1, 4)
+
/* ANA:ANA:ANAINTR */
#define ANA_ANAINTR __REG(TARGET_ANA, 0, 1, 29824, 0, 1, 244, 16, 0, 1, 4)

@@ -184,6 +187,102 @@ enum lan966x_target {
#define ANA_MACACCESS_MAC_TABLE_CMD_GET(x)\
FIELD_GET(ANA_MACACCESS_MAC_TABLE_CMD, x)

+/* ANA:ANA_TABLES:MACTINDX */
+#define ANA_MACTINDX __REG(TARGET_ANA, 0, 1, 27520, 0, 1, 128, 52, 0, 1, 4)
+
+#define ANA_MACTINDX_BUCKET GENMASK(12, 11)
+#define ANA_MACTINDX_BUCKET_SET(x)\
+ FIELD_PREP(ANA_MACTINDX_BUCKET, x)
+#define ANA_MACTINDX_BUCKET_GET(x)\
+ FIELD_GET(ANA_MACTINDX_BUCKET, x)
+
+#define ANA_MACTINDX_M_INDEX GENMASK(10, 0)
+#define ANA_MACTINDX_M_INDEX_SET(x)\
+ FIELD_PREP(ANA_MACTINDX_M_INDEX, x)
+#define ANA_MACTINDX_M_INDEX_GET(x)\
+ FIELD_GET(ANA_MACTINDX_M_INDEX, x)
+
+/* ANA:ANA_TABLES:VLAN_PORT_MASK */
+#define ANA_VLAN_PORT_MASK __REG(TARGET_ANA, 0, 1, 27520, 0, 1, 128, 56, 0, 1, 4)
+
+#define ANA_VLAN_PORT_MASK_VLAN_PORT_MASK GENMASK(8, 0)
+#define ANA_VLAN_PORT_MASK_VLAN_PORT_MASK_SET(x)\
+ FIELD_PREP(ANA_VLAN_PORT_MASK_VLAN_PORT_MASK, x)
+#define ANA_VLAN_PORT_MASK_VLAN_PORT_MASK_GET(x)\
+ FIELD_GET(ANA_VLAN_PORT_MASK_VLAN_PORT_MASK, x)
+
+/* ANA:ANA_TABLES:VLANACCESS */
+#define ANA_VLANACCESS __REG(TARGET_ANA, 0, 1, 27520, 0, 1, 128, 60, 0, 1, 4)
+
+#define ANA_VLANACCESS_VLAN_TBL_CMD GENMASK(1, 0)
+#define ANA_VLANACCESS_VLAN_TBL_CMD_SET(x)\
+ FIELD_PREP(ANA_VLANACCESS_VLAN_TBL_CMD, x)
+#define ANA_VLANACCESS_VLAN_TBL_CMD_GET(x)\
+ FIELD_GET(ANA_VLANACCESS_VLAN_TBL_CMD, x)
+
+/* ANA:ANA_TABLES:VLANTIDX */
+#define ANA_VLANTIDX __REG(TARGET_ANA, 0, 1, 27520, 0, 1, 128, 64, 0, 1, 4)
+
+#define ANA_VLANTIDX_VLAN_PGID_CPU_DIS BIT(18)
+#define ANA_VLANTIDX_VLAN_PGID_CPU_DIS_SET(x)\
+ FIELD_PREP(ANA_VLANTIDX_VLAN_PGID_CPU_DIS, x)
+#define ANA_VLANTIDX_VLAN_PGID_CPU_DIS_GET(x)\
+ FIELD_GET(ANA_VLANTIDX_VLAN_PGID_CPU_DIS, x)
+
+#define ANA_VLANTIDX_V_INDEX GENMASK(11, 0)
+#define ANA_VLANTIDX_V_INDEX_SET(x)\
+ FIELD_PREP(ANA_VLANTIDX_V_INDEX, x)
+#define ANA_VLANTIDX_V_INDEX_GET(x)\
+ FIELD_GET(ANA_VLANTIDX_V_INDEX, x)
+
+/* ANA:PORT:VLAN_CFG */
+#define ANA_VLAN_CFG(g) __REG(TARGET_ANA, 0, 1, 28672, g, 9, 128, 0, 0, 1, 4)
+
+#define ANA_VLAN_CFG_VLAN_AWARE_ENA BIT(20)
+#define ANA_VLAN_CFG_VLAN_AWARE_ENA_SET(x)\
+ FIELD_PREP(ANA_VLAN_CFG_VLAN_AWARE_ENA, x)
+#define ANA_VLAN_CFG_VLAN_AWARE_ENA_GET(x)\
+ FIELD_GET(ANA_VLAN_CFG_VLAN_AWARE_ENA, x)
+
+#define ANA_VLAN_CFG_VLAN_POP_CNT GENMASK(19, 18)
+#define ANA_VLAN_CFG_VLAN_POP_CNT_SET(x)\
+ FIELD_PREP(ANA_VLAN_CFG_VLAN_POP_CNT, x)
+#define ANA_VLAN_CFG_VLAN_POP_CNT_GET(x)\
+ FIELD_GET(ANA_VLAN_CFG_VLAN_POP_CNT, x)
+
+#define ANA_VLAN_CFG_VLAN_VID GENMASK(11, 0)
+#define ANA_VLAN_CFG_VLAN_VID_SET(x)\
+ FIELD_PREP(ANA_VLAN_CFG_VLAN_VID, x)
+#define ANA_VLAN_CFG_VLAN_VID_GET(x)\
+ FIELD_GET(ANA_VLAN_CFG_VLAN_VID, x)
+
+/* ANA:PORT:DROP_CFG */
+#define ANA_DROP_CFG(g) __REG(TARGET_ANA, 0, 1, 28672, g, 9, 128, 4, 0, 1, 4)
+
+#define ANA_DROP_CFG_DROP_UNTAGGED_ENA BIT(6)
+#define ANA_DROP_CFG_DROP_UNTAGGED_ENA_SET(x)\
+ FIELD_PREP(ANA_DROP_CFG_DROP_UNTAGGED_ENA, x)
+#define ANA_DROP_CFG_DROP_UNTAGGED_ENA_GET(x)\
+ FIELD_GET(ANA_DROP_CFG_DROP_UNTAGGED_ENA, x)
+
+#define ANA_DROP_CFG_DROP_PRIO_S_TAGGED_ENA BIT(3)
+#define ANA_DROP_CFG_DROP_PRIO_S_TAGGED_ENA_SET(x)\
+ FIELD_PREP(ANA_DROP_CFG_DROP_PRIO_S_TAGGED_ENA, x)
+#define ANA_DROP_CFG_DROP_PRIO_S_TAGGED_ENA_GET(x)\
+ FIELD_GET(ANA_DROP_CFG_DROP_PRIO_S_TAGGED_ENA, x)
+
+#define ANA_DROP_CFG_DROP_PRIO_C_TAGGED_ENA BIT(2)
+#define ANA_DROP_CFG_DROP_PRIO_C_TAGGED_ENA_SET(x)\
+ FIELD_PREP(ANA_DROP_CFG_DROP_PRIO_C_TAGGED_ENA, x)
+#define ANA_DROP_CFG_DROP_PRIO_C_TAGGED_ENA_GET(x)\
+ FIELD_GET(ANA_DROP_CFG_DROP_PRIO_C_TAGGED_ENA, x)
+
+#define ANA_DROP_CFG_DROP_MC_SMAC_ENA BIT(0)
+#define ANA_DROP_CFG_DROP_MC_SMAC_ENA_SET(x)\
+ FIELD_PREP(ANA_DROP_CFG_DROP_MC_SMAC_ENA, x)
+#define ANA_DROP_CFG_DROP_MC_SMAC_ENA_GET(x)\
+ FIELD_GET(ANA_DROP_CFG_DROP_MC_SMAC_ENA, x)
+
/* ANA:PORT:CPU_FWD_CFG */
#define ANA_CPU_FWD_CFG(g) __REG(TARGET_ANA, 0, 1, 28672, g, 9, 128, 96, 0, 1, 4)

@@ -589,6 +688,36 @@ enum lan966x_target {
/* QSYS:RES_CTRL:RES_CFG */
#define QSYS_RES_CFG(g) __REG(TARGET_QSYS, 0, 1, 32768, g, 1024, 8, 0, 0, 1, 4)

+/* REW:PORT:PORT_VLAN_CFG */
+#define REW_PORT_VLAN_CFG(g) __REG(TARGET_REW, 0, 1, 0, g, 10, 128, 0, 0, 1, 4)
+
+#define REW_PORT_VLAN_CFG_PORT_TPID GENMASK(31, 16)
+#define REW_PORT_VLAN_CFG_PORT_TPID_SET(x)\
+ FIELD_PREP(REW_PORT_VLAN_CFG_PORT_TPID, x)
+#define REW_PORT_VLAN_CFG_PORT_TPID_GET(x)\
+ FIELD_GET(REW_PORT_VLAN_CFG_PORT_TPID, x)
+
+#define REW_PORT_VLAN_CFG_PORT_VID GENMASK(11, 0)
+#define REW_PORT_VLAN_CFG_PORT_VID_SET(x)\
+ FIELD_PREP(REW_PORT_VLAN_CFG_PORT_VID, x)
+#define REW_PORT_VLAN_CFG_PORT_VID_GET(x)\
+ FIELD_GET(REW_PORT_VLAN_CFG_PORT_VID, x)
+
+/* REW:PORT:TAG_CFG */
+#define REW_TAG_CFG(g) __REG(TARGET_REW, 0, 1, 0, g, 10, 128, 4, 0, 1, 4)
+
+#define REW_TAG_CFG_TAG_CFG GENMASK(8, 7)
+#define REW_TAG_CFG_TAG_CFG_SET(x)\
+ FIELD_PREP(REW_TAG_CFG_TAG_CFG, x)
+#define REW_TAG_CFG_TAG_CFG_GET(x)\
+ FIELD_GET(REW_TAG_CFG_TAG_CFG, x)
+
+#define REW_TAG_CFG_TAG_TPID_CFG GENMASK(6, 5)
+#define REW_TAG_CFG_TAG_TPID_CFG_SET(x)\
+ FIELD_PREP(REW_TAG_CFG_TAG_TPID_CFG, x)
+#define REW_TAG_CFG_TAG_TPID_CFG_GET(x)\
+ FIELD_GET(REW_TAG_CFG_TAG_TPID_CFG, x)
+
/* REW:PORT:PORT_CFG */
#define REW_PORT_CFG(g) __REG(TARGET_REW, 0, 1, 0, g, 10, 128, 8, 0, 1, 4)

--
2.33.0


2021-12-09 09:46:27

by Horatiu Vultur

[permalink] [raw]
Subject: [PATCH net-next v3 2/6] dt-bindings: net: lan966x: Extend with the analyzer interrupt

Extend dt-bindings for lan966x with analyzer interrupt.
This interrupt can be generated for example when the HW learn/forgets
an entry in the MAC table.

Signed-off-by: Horatiu Vultur <[email protected]>
---
.../devicetree/bindings/net/microchip,lan966x-switch.yaml | 2 ++
1 file changed, 2 insertions(+)

diff --git a/Documentation/devicetree/bindings/net/microchip,lan966x-switch.yaml b/Documentation/devicetree/bindings/net/microchip,lan966x-switch.yaml
index 5bee665d5fcf..e79e4e166ad8 100644
--- a/Documentation/devicetree/bindings/net/microchip,lan966x-switch.yaml
+++ b/Documentation/devicetree/bindings/net/microchip,lan966x-switch.yaml
@@ -37,12 +37,14 @@ properties:
items:
- description: register based extraction
- description: frame dma based extraction
+ - description: analyzer interrupt

interrupt-names:
minItems: 1
items:
- const: xtr
- const: fdma
+ - const: ana

resets:
items:
--
2.33.0


2021-12-09 09:46:29

by Horatiu Vultur

[permalink] [raw]
Subject: [PATCH net-next v3 3/6] net: lan966x: add support for interrupts from analyzer

This patch adds support for handling the interrupts generated by the
analyzer. Currently, only the MAC table generates these interrupts.
The MAC table will generate an interrupt whenever it learns or forgets
an entry in the table. It is the SW responsibility figure out which
entries were added/removed.

Signed-off-by: Horatiu Vultur <[email protected]>
---
.../ethernet/microchip/lan966x/lan966x_mac.c | 244 ++++++++++++++++++
.../ethernet/microchip/lan966x/lan966x_main.c | 23 ++
.../ethernet/microchip/lan966x/lan966x_main.h | 6 +
3 files changed, 273 insertions(+)

diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_mac.c b/drivers/net/ethernet/microchip/lan966x/lan966x_mac.c
index f6878b9f57ef..c01ab01bffbf 100644
--- a/drivers/net/ethernet/microchip/lan966x/lan966x_mac.c
+++ b/drivers/net/ethernet/microchip/lan966x/lan966x_mac.c
@@ -1,5 +1,6 @@
// SPDX-License-Identifier: GPL-2.0+

+#include <net/switchdev.h>
#include "lan966x_main.h"

#define LAN966X_MAC_COLUMNS 4
@@ -13,6 +14,23 @@
#define MACACCESS_CMD_WRITE 7
#define MACACCESS_CMD_SYNC_GET_NEXT 8

+#define LAN966X_MAC_INVALID_ROW -1
+
+struct lan966x_mac_entry {
+ struct list_head list;
+ unsigned char mac[ETH_ALEN] __aligned(2);
+ u16 vid;
+ u16 port_index;
+ int row;
+};
+
+struct lan966x_mac_raw_entry {
+ u32 mach;
+ u32 macl;
+ u32 maca;
+ bool process;
+};
+
static int lan966x_mac_get_status(struct lan966x *lan966x)
{
return lan_rd(lan966x, ANA_MACACCESS);
@@ -98,4 +116,230 @@ void lan966x_mac_init(struct lan966x *lan966x)
/* Clear the MAC table */
lan_wr(MACACCESS_CMD_INIT, lan966x, ANA_MACACCESS);
lan966x_mac_wait_for_completion(lan966x);
+
+ spin_lock_init(&lan966x->mac_lock);
+ INIT_LIST_HEAD(&lan966x->mac_entries);
+}
+
+static struct lan966x_mac_entry *lan966x_mac_alloc_entry(const unsigned char *mac,
+ u16 vid, u16 port_index)
+{
+ struct lan966x_mac_entry *mac_entry;
+
+ mac_entry = kzalloc(sizeof(*mac_entry), GFP_KERNEL);
+ if (!mac_entry)
+ return NULL;
+
+ memcpy(mac_entry->mac, mac, ETH_ALEN);
+ mac_entry->vid = vid;
+ mac_entry->port_index = port_index;
+ mac_entry->row = LAN966X_MAC_INVALID_ROW;
+ return mac_entry;
+}
+
+static void lan966x_fdb_call_notifiers(enum switchdev_notifier_type type,
+ const char *mac, u16 vid,
+ struct net_device *dev)
+{
+ struct switchdev_notifier_fdb_info info = { 0 };
+
+ info.addr = mac;
+ info.vid = vid;
+ info.offloaded = true;
+ call_switchdev_notifiers(type, dev, &info.info, NULL);
+}
+
+void lan966x_mac_purge_entries(struct lan966x *lan966x)
+{
+ struct lan966x_mac_entry *mac_entry, *tmp;
+ unsigned long flags;
+
+ spin_lock_irqsave(&lan966x->mac_lock, flags);
+ list_for_each_entry_safe(mac_entry, tmp, &lan966x->mac_entries,
+ list) {
+ lan966x_mac_forget(lan966x, mac_entry->mac, mac_entry->vid,
+ ENTRYTYPE_LOCKED);
+
+ list_del(&mac_entry->list);
+ kfree(mac_entry);
+ }
+ spin_unlock_irqrestore(&lan966x->mac_lock, flags);
+}
+
+static void lan966x_mac_notifiers(struct lan966x *lan966x,
+ enum switchdev_notifier_type type,
+ unsigned char *mac, u32 vid,
+ struct net_device *dev)
+{
+ rtnl_lock();
+ lan966x_fdb_call_notifiers(type, mac, vid, dev);
+ rtnl_unlock();
+}
+
+static void lan966x_mac_process_raw_entry(struct lan966x_mac_raw_entry *raw_entry,
+ u8 *mac, u16 *vid, u32 *dest_idx)
+{
+ mac[0] = (raw_entry->mach >> 8) & 0xff;
+ mac[1] = (raw_entry->mach >> 0) & 0xff;
+ mac[2] = (raw_entry->macl >> 24) & 0xff;
+ mac[3] = (raw_entry->macl >> 16) & 0xff;
+ mac[4] = (raw_entry->macl >> 8) & 0xff;
+ mac[5] = (raw_entry->macl >> 0) & 0xff;
+
+ *vid = (raw_entry->mach >> 16) & 0xfff;
+ *dest_idx = ANA_MACACCESS_DEST_IDX_GET(raw_entry->maca);
+}
+
+static void lan966x_mac_irq_process(struct lan966x *lan966x, u32 row,
+ struct lan966x_mac_raw_entry *raw_entries)
+{
+ struct lan966x_mac_entry *mac_entry, *tmp;
+ char mac[ETH_ALEN] __aligned(2);
+ unsigned long flags;
+ u32 dest_idx;
+ u32 column;
+ u16 vid;
+
+ spin_lock_irqsave(&lan966x->mac_lock, flags);
+ list_for_each_entry_safe(mac_entry, tmp, &lan966x->mac_entries, list) {
+ bool found = false;
+
+ if (mac_entry->row != row)
+ continue;
+
+ for (column = 0; column < LAN966X_MAC_COLUMNS; ++column) {
+ /* All the valid entries are at the start of the row,
+ * so when get one invalid entry it can just skip the
+ * rest of the columns
+ */
+ if (!ANA_MACACCESS_VALID_GET(raw_entries[column].maca))
+ break;
+
+ lan966x_mac_process_raw_entry(&raw_entries[column],
+ mac, &vid, &dest_idx);
+ WARN_ON(dest_idx > lan966x->num_phys_ports);
+
+ /* If the entry in SW is found, then there is nothing
+ * to do
+ */
+ if (mac_entry->vid == vid &&
+ ether_addr_equal(mac_entry->mac, mac) &&
+ mac_entry->port_index == dest_idx) {
+ raw_entries[column].process = true;
+ found = true;
+ break;
+ }
+ }
+
+ if (!found) {
+ /* Notify the bridge that the entry doesn't exist
+ * anymore in the HW and remmove the entry from the SW
+ * list
+ */
+ lan966x_mac_notifiers(lan966x, SWITCHDEV_FDB_DEL_TO_BRIDGE,
+ mac_entry->mac, mac_entry->vid,
+ lan966x->ports[mac_entry->port_index]->dev);
+
+ list_del(&mac_entry->list);
+ kfree(mac_entry);
+ }
+ }
+ spin_unlock_irqrestore(&lan966x->mac_lock, flags);
+
+ /* Now go to the list of columns and see if any entry was not in the SW
+ * list, then that means that the entry is new so it needs to notify the
+ * bridge.
+ */
+ for (column = 0; column < LAN966X_MAC_COLUMNS; ++column) {
+ /* All the valid entries are at the start of the row, so when
+ * get one invalid entry it can just skip the rest of the columns
+ */
+ if (!ANA_MACACCESS_VALID_GET(raw_entries[column].maca))
+ break;
+
+ /* If the entry already exists then don't do anything */
+ if (raw_entries[column].process)
+ continue;
+
+ lan966x_mac_process_raw_entry(&raw_entries[column],
+ mac, &vid, &dest_idx);
+ WARN_ON(dest_idx > lan966x->num_phys_ports);
+
+ mac_entry = lan966x_mac_alloc_entry(mac, vid, dest_idx);
+ if (!mac_entry)
+ return;
+
+ mac_entry->row = row;
+
+ spin_lock_irqsave(&lan966x->mac_lock, flags);
+ list_add_tail(&mac_entry->list, &lan966x->mac_entries);
+ spin_unlock_irqrestore(&lan966x->mac_lock, flags);
+
+ lan966x_mac_notifiers(lan966x, SWITCHDEV_FDB_ADD_TO_BRIDGE,
+ mac, vid, lan966x->ports[dest_idx]->dev);
+ }
+}
+
+irqreturn_t lan966x_mac_irq_handler(struct lan966x *lan966x)
+{
+ struct lan966x_mac_raw_entry entry[LAN966X_MAC_COLUMNS] = { 0 };
+ u32 index, column;
+ bool stop = true;
+ u32 val;
+
+ /* Check if the mac table triggered this, if not just bail out */
+ if (!(ANA_ANAINTR_INTR_GET(lan_rd(lan966x, ANA_ANAINTR))))
+ return IRQ_NONE;
+
+ /* Start the scan from 0, 0 */
+ lan_wr(ANA_MACTINDX_M_INDEX_SET(0) |
+ ANA_MACTINDX_BUCKET_SET(0),
+ lan966x, ANA_MACTINDX);
+
+ while (1) {
+ lan_rmw(ANA_MACACCESS_MAC_TABLE_CMD_SET(MACACCESS_CMD_SYNC_GET_NEXT),
+ ANA_MACACCESS_MAC_TABLE_CMD,
+ lan966x, ANA_MACACCESS);
+ lan966x_mac_wait_for_completion(lan966x);
+
+ val = lan_rd(lan966x, ANA_MACTINDX);
+ index = ANA_MACTINDX_M_INDEX_GET(val);
+ column = ANA_MACTINDX_BUCKET_GET(val);
+
+ /* The SYNC-GET-NEXT returns all the entries(4) in a row in
+ * which is suffered a change. By change it means that new entry
+ * was added or an entry was removed because of ageing.
+ * It would return all the columns for that row. And after that
+ * it would return the next row The stop conditions of the
+ * SYNC-GET-NEXT is when it reaches 'directly' to row 0
+ * column 3. So if SYNC-GET-NEXT returns row 0 and column 0
+ * then it is required to continue to read more even if it
+ * reaches row 0 and column 3.
+ */
+ if (index == 0 && column == 0)
+ stop = false;
+
+ if (column == LAN966X_MAC_COLUMNS - 1 &&
+ index == 0 && stop)
+ break;
+
+ entry[column].mach = lan_rd(lan966x, ANA_MACHDATA);
+ entry[column].macl = lan_rd(lan966x, ANA_MACLDATA);
+ entry[column].maca = lan_rd(lan966x, ANA_MACACCESS);
+
+ /* Once all the columns are read process them */
+ if (column == LAN966X_MAC_COLUMNS - 1) {
+ lan966x_mac_irq_process(lan966x, index, entry);
+ /* A row was processed so it is safe to assume that the
+ * next row/column can be the stop condition
+ */
+ stop = true;
+ }
+ }
+
+ lan_rmw(ANA_ANAINTR_INTR_SET(0),
+ ANA_ANAINTR_INTR,
+ lan966x, ANA_ANAINTR);
+
+ return IRQ_HANDLED;
}
diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
index 101c1f005baf..7c6d6293611a 100644
--- a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
+++ b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
@@ -527,6 +527,13 @@ static irqreturn_t lan966x_xtr_irq_handler(int irq, void *args)
return IRQ_HANDLED;
}

+static irqreturn_t lan966x_ana_irq_handler(int irq, void *args)
+{
+ struct lan966x *lan966x = args;
+
+ return lan966x_mac_irq_handler(lan966x);
+}
+
static void lan966x_cleanup_ports(struct lan966x *lan966x)
{
struct lan966x_port *port;
@@ -554,6 +561,11 @@ static void lan966x_cleanup_ports(struct lan966x *lan966x)

disable_irq(lan966x->xtr_irq);
lan966x->xtr_irq = -ENXIO;
+
+ if (lan966x->ana_irq) {
+ disable_irq(lan966x->ana_irq);
+ lan966x->ana_irq = -ENXIO;
+ }
}

static int lan966x_probe_port(struct lan966x *lan966x, u32 p,
@@ -870,6 +882,15 @@ static int lan966x_probe(struct platform_device *pdev)
return -ENODEV;
}

+ lan966x->ana_irq = platform_get_irq_byname(pdev, "ana");
+ if (lan966x->ana_irq) {
+ err = devm_request_threaded_irq(&pdev->dev, lan966x->ana_irq, NULL,
+ lan966x_ana_irq_handler, IRQF_ONESHOT,
+ "ana irq", lan966x);
+ if (err)
+ return dev_err_probe(&pdev->dev, err, "Unable to use ana irq");
+ }
+
/* init switch */
lan966x_init(lan966x);
lan966x_stats_init(lan966x);
@@ -923,6 +944,8 @@ static int lan966x_remove(struct platform_device *pdev)
destroy_workqueue(lan966x->stats_queue);
mutex_destroy(&lan966x->stats_lock);

+ lan966x_mac_purge_entries(lan966x);
+
return 0;
}

diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_main.h b/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
index 7e5a3b6f168d..ba548d65b58a 100644
--- a/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
+++ b/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
@@ -75,6 +75,9 @@ struct lan966x {

u8 base_mac[ETH_ALEN];

+ struct list_head mac_entries;
+ spinlock_t mac_lock; /* lock for mac_entries list */
+
/* stats */
const struct lan966x_stat_layout *stats_layout;
u32 num_stats;
@@ -87,6 +90,7 @@ struct lan966x {

/* interrupts */
int xtr_irq;
+ int ana_irq;
};

struct lan966x_port_config {
@@ -141,6 +145,8 @@ int lan966x_mac_forget(struct lan966x *lan966x,
int lan966x_mac_cpu_learn(struct lan966x *lan966x, const char *addr, u16 vid);
int lan966x_mac_cpu_forget(struct lan966x *lan966x, const char *addr, u16 vid);
void lan966x_mac_init(struct lan966x *lan966x);
+void lan966x_mac_purge_entries(struct lan966x *lan966x);
+irqreturn_t lan966x_mac_irq_handler(struct lan966x *lan966x);

static inline void __iomem *lan_addr(void __iomem *base[],
int id, int tinst, int tcnt,
--
2.33.0


2021-12-09 09:46:31

by Horatiu Vultur

[permalink] [raw]
Subject: [PATCH net-next v3 4/6] net: lan966x: More MAC table functionality

This patch adds support for adding/removing mac entries in the SW list
of entries and in the HW table. This is used by the bridge
functionality.

Signed-off-by: Horatiu Vultur <[email protected]>
---
.../ethernet/microchip/lan966x/lan966x_mac.c | 108 ++++++++++++++++++
.../ethernet/microchip/lan966x/lan966x_main.h | 9 ++
2 files changed, 117 insertions(+)

diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_mac.c b/drivers/net/ethernet/microchip/lan966x/lan966x_mac.c
index c01ab01bffbf..60c0d97c3a98 100644
--- a/drivers/net/ethernet/microchip/lan966x/lan966x_mac.c
+++ b/drivers/net/ethernet/microchip/lan966x/lan966x_mac.c
@@ -111,6 +111,14 @@ int lan966x_mac_cpu_forget(struct lan966x *lan966x, const char *addr, u16 vid)
return lan966x_mac_forget(lan966x, addr, vid, ENTRYTYPE_LOCKED);
}

+void lan966x_mac_set_ageing(struct lan966x *lan966x,
+ u32 ageing)
+{
+ lan_rmw(ANA_AUTOAGE_AGE_PERIOD_SET(ageing / 2),
+ ANA_AUTOAGE_AGE_PERIOD,
+ lan966x, ANA_AUTOAGE);
+}
+
void lan966x_mac_init(struct lan966x *lan966x)
{
/* Clear the MAC table */
@@ -137,6 +145,49 @@ static struct lan966x_mac_entry *lan966x_mac_alloc_entry(const unsigned char *ma
return mac_entry;
}

+static struct lan966x_mac_entry *lan966x_mac_find_entry(struct lan966x *lan966x,
+ const unsigned char *mac,
+ u16 vid, u16 port_index)
+{
+ struct lan966x_mac_entry *res = NULL;
+ struct lan966x_mac_entry *mac_entry;
+ unsigned long flags;
+
+ spin_lock_irqsave(&lan966x->mac_lock, flags);
+ list_for_each_entry(mac_entry, &lan966x->mac_entries, list) {
+ if (mac_entry->vid == vid &&
+ ether_addr_equal(mac, mac_entry->mac) &&
+ mac_entry->port_index == port_index) {
+ res = mac_entry;
+ break;
+ }
+ }
+ spin_unlock_irqrestore(&lan966x->mac_lock, flags);
+
+ return res;
+}
+
+static int lan966x_mac_lookup(struct lan966x *lan966x,
+ const unsigned char mac[ETH_ALEN],
+ unsigned int vid, enum macaccess_entry_type type)
+{
+ int ret;
+
+ lan966x_mac_select(lan966x, mac, vid);
+
+ /* Issue a read command */
+ lan_wr(ANA_MACACCESS_ENTRYTYPE_SET(type) |
+ ANA_MACACCESS_VALID_SET(1) |
+ ANA_MACACCESS_MAC_TABLE_CMD_SET(MACACCESS_CMD_READ),
+ lan966x, ANA_MACACCESS);
+
+ ret = lan966x_mac_wait_for_completion(lan966x);
+ if (ret)
+ return ret;
+
+ return ANA_MACACCESS_VALID_GET(lan_rd(lan966x, ANA_MACACCESS));
+}
+
static void lan966x_fdb_call_notifiers(enum switchdev_notifier_type type,
const char *mac, u16 vid,
struct net_device *dev)
@@ -149,6 +200,63 @@ static void lan966x_fdb_call_notifiers(enum switchdev_notifier_type type,
call_switchdev_notifiers(type, dev, &info.info, NULL);
}

+int lan966x_mac_add_entry(struct lan966x *lan966x, struct lan966x_port *port,
+ const unsigned char *addr, u16 vid)
+{
+ struct lan966x_mac_entry *mac_entry;
+ unsigned long flags;
+
+ if (lan966x_mac_lookup(lan966x, addr, vid, ENTRYTYPE_NORMAL))
+ return 0;
+
+ /* In case the entry already exists, don't add it again to SW,
+ * just update HW, but we need to look in the actual HW because
+ * it is possible for an entry to be learn by HW and before we
+ * get the interrupt the frame will reach CPU and the CPU will
+ * add the entry but without the extern_learn flag.
+ */
+ mac_entry = lan966x_mac_find_entry(lan966x, addr, vid, port->chip_port);
+ if (mac_entry)
+ return lan966x_mac_learn(lan966x, port->chip_port,
+ addr, vid, ENTRYTYPE_LOCKED);
+
+ mac_entry = lan966x_mac_alloc_entry(addr, vid, port->chip_port);
+ if (!mac_entry)
+ return -ENOMEM;
+
+ spin_lock_irqsave(&lan966x->mac_lock, flags);
+ list_add_tail(&mac_entry->list, &lan966x->mac_entries);
+ spin_unlock_irqrestore(&lan966x->mac_lock, flags);
+
+ lan966x_mac_learn(lan966x, port->chip_port, addr, vid, ENTRYTYPE_LOCKED);
+ lan966x_fdb_call_notifiers(SWITCHDEV_FDB_OFFLOADED, addr, vid, port->dev);
+
+ return 0;
+}
+
+int lan966x_mac_del_entry(struct lan966x *lan966x, const unsigned char *addr,
+ u16 vid)
+{
+ struct lan966x_mac_entry *mac_entry, *tmp;
+ unsigned long flags;
+
+ spin_lock_irqsave(&lan966x->mac_lock, flags);
+ list_for_each_entry_safe(mac_entry, tmp, &lan966x->mac_entries,
+ list) {
+ if ((vid == 0 || mac_entry->vid == vid) &&
+ ether_addr_equal(addr, mac_entry->mac)) {
+ lan966x_mac_forget(lan966x, mac_entry->mac, mac_entry->vid,
+ ENTRYTYPE_LOCKED);
+
+ list_del(&mac_entry->list);
+ kfree(mac_entry);
+ }
+ }
+ spin_unlock_irqrestore(&lan966x->mac_lock, flags);
+
+ return 0;
+}
+
void lan966x_mac_purge_entries(struct lan966x *lan966x)
{
struct lan966x_mac_entry *mac_entry, *tmp;
diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_main.h b/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
index ba548d65b58a..fcd5d09a070c 100644
--- a/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
+++ b/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
@@ -145,6 +145,15 @@ int lan966x_mac_forget(struct lan966x *lan966x,
int lan966x_mac_cpu_learn(struct lan966x *lan966x, const char *addr, u16 vid);
int lan966x_mac_cpu_forget(struct lan966x *lan966x, const char *addr, u16 vid);
void lan966x_mac_init(struct lan966x *lan966x);
+void lan966x_mac_set_ageing(struct lan966x *lan966x,
+ u32 ageing);
+int lan966x_mac_del_entry(struct lan966x *lan966x,
+ const unsigned char *addr,
+ u16 vid);
+int lan966x_mac_add_entry(struct lan966x *lan966x,
+ struct lan966x_port *port,
+ const unsigned char *addr,
+ u16 vid);
void lan966x_mac_purge_entries(struct lan966x *lan966x);
irqreturn_t lan966x_mac_irq_handler(struct lan966x *lan966x);

--
2.33.0


2021-12-09 09:46:37

by Horatiu Vultur

[permalink] [raw]
Subject: [PATCH net-next v3 5/6] net: lan966x: Add vlan support

This adds support for vlan in lan966x.
This allows add/remove front ports from vlans and also allows the CPU
port to be added/remove from vlans. In this way it is possible to
filter frames towards the CPU based on the vlan.

Signed-off-by: Horatiu Vultur <[email protected]>
---
.../net/ethernet/microchip/lan966x/Makefile | 2 +-
.../ethernet/microchip/lan966x/lan966x_main.c | 35 +-
.../ethernet/microchip/lan966x/lan966x_main.h | 40 +-
.../ethernet/microchip/lan966x/lan966x_vlan.c | 436 ++++++++++++++++++
4 files changed, 508 insertions(+), 5 deletions(-)
create mode 100644 drivers/net/ethernet/microchip/lan966x/lan966x_vlan.c

diff --git a/drivers/net/ethernet/microchip/lan966x/Makefile b/drivers/net/ethernet/microchip/lan966x/Makefile
index 2989ba528236..f7e6068a91cb 100644
--- a/drivers/net/ethernet/microchip/lan966x/Makefile
+++ b/drivers/net/ethernet/microchip/lan966x/Makefile
@@ -6,4 +6,4 @@
obj-$(CONFIG_LAN966X_SWITCH) += lan966x-switch.o

lan966x-switch-objs := lan966x_main.o lan966x_phylink.o lan966x_port.o \
- lan966x_mac.o lan966x_ethtool.o
+ lan966x_mac.o lan966x_ethtool.o lan966x_vlan.o
diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
index 7c6d6293611a..1b4c7e6b4f85 100644
--- a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
+++ b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
@@ -103,17 +103,18 @@ static int lan966x_create_targets(struct platform_device *pdev,
static int lan966x_port_set_mac_address(struct net_device *dev, void *p)
{
struct lan966x_port *port = netdev_priv(dev);
+ u16 pvid = lan966x_vlan_port_get_pvid(port);
struct lan966x *lan966x = port->lan966x;
const struct sockaddr *addr = p;
int ret;

/* Learn the new net device MAC address in the mac table. */
- ret = lan966x_mac_cpu_learn(lan966x, addr->sa_data, port->pvid);
+ ret = lan966x_mac_cpu_learn(lan966x, addr->sa_data, pvid);
if (ret)
return ret;

/* Then forget the previous one. */
- ret = lan966x_mac_cpu_forget(lan966x, dev->dev_addr, port->pvid);
+ ret = lan966x_mac_cpu_forget(lan966x, dev->dev_addr, pvid);
if (ret)
return ret;

@@ -283,6 +284,12 @@ static void lan966x_ifh_set_ipv(void *ifh, u64 bypass)
IFH_POS_IPV, IFH_LEN * 4, PACK, 0);
}

+static void lan966x_ifh_set_vid(void *ifh, u64 vid)
+{
+ packing(ifh, &vid, IFH_POS_TCI + IFH_WID_TCI - 1,
+ IFH_POS_TCI, IFH_LEN * 4, PACK, 0);
+}
+
static int lan966x_port_xmit(struct sk_buff *skb, struct net_device *dev)
{
struct lan966x_port *port = netdev_priv(dev);
@@ -294,6 +301,7 @@ static int lan966x_port_xmit(struct sk_buff *skb, struct net_device *dev)
lan966x_ifh_set_port(ifh, BIT_ULL(port->chip_port));
lan966x_ifh_set_qos_class(ifh, skb->priority >= 7 ? 0x7 : skb->priority);
lan966x_ifh_set_ipv(ifh, skb->priority >= 7 ? 0x7 : skb->priority);
+ lan966x_ifh_set_vid(ifh, skb_vlan_tag_get(skb));

return lan966x_port_ifh_xmit(skb, ifh, dev);
}
@@ -365,6 +373,18 @@ static int lan966x_port_get_parent_id(struct net_device *dev,
return 0;
}

+static int lan966x_port_set_features(struct net_device *dev,
+ netdev_features_t features)
+{
+ struct lan966x_port *port = netdev_priv(dev);
+ netdev_features_t changed = dev->features ^ features;
+
+ if (changed & NETIF_F_HW_VLAN_CTAG_FILTER)
+ lan966x_vlan_mode(port, features);
+
+ return 0;
+}
+
static const struct net_device_ops lan966x_port_netdev_ops = {
.ndo_open = lan966x_port_open,
.ndo_stop = lan966x_port_stop,
@@ -376,6 +396,9 @@ static const struct net_device_ops lan966x_port_netdev_ops = {
.ndo_get_stats64 = lan966x_stats_get,
.ndo_set_mac_address = lan966x_port_set_mac_address,
.ndo_get_port_parent_id = lan966x_port_get_parent_id,
+ .ndo_set_features = lan966x_port_set_features,
+ .ndo_vlan_rx_add_vid = lan966x_vlan_rx_add_vid,
+ .ndo_vlan_rx_kill_vid = lan966x_vlan_rx_kill_vid,
};

static int lan966x_port_xtr_status(struct lan966x *lan966x, u8 grp)
@@ -590,7 +613,6 @@ static int lan966x_probe_port(struct lan966x *lan966x, u32 p,
port->dev = dev;
port->lan966x = lan966x;
port->chip_port = p;
- port->pvid = PORT_PVID;
lan966x->ports[p] = port;

dev->max_mtu = ETH_MAX_MTU;
@@ -643,6 +665,10 @@ static int lan966x_probe_port(struct lan966x *lan966x, u32 p,
return err;
}

+ lan966x_vlan_port_set_vlan_aware(port, 0);
+ lan966x_vlan_port_set_vid(port, HOST_PVID, false, false);
+ lan966x_vlan_port_apply(port);
+
return 0;
}

@@ -653,6 +679,9 @@ static void lan966x_init(struct lan966x *lan966x)
/* MAC table initialization */
lan966x_mac_init(lan966x);

+ /* Vlan initialization */
+ lan966x_vlan_init(lan966x);
+
/* Flush queues */
lan_wr(lan_rd(lan966x, QS_XTR_FLUSH) |
GENMASK(1, 0),
diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_main.h b/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
index fcd5d09a070c..ec3eccf634b3 100644
--- a/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
+++ b/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
@@ -4,6 +4,7 @@
#define __LAN966X_MAIN_H__

#include <linux/etherdevice.h>
+#include <linux/if_vlan.h>
#include <linux/jiffies.h>
#include <linux/phy.h>
#include <linux/phylink.h>
@@ -22,7 +23,8 @@
#define PGID_SRC 80
#define PGID_ENTRIES 89

-#define PORT_PVID 0
+#define UNAWARE_PVID 0
+#define HOST_PVID 4095

/* Reserved amount for (SRC, PRIO) at index 8*SRC + PRIO */
#define QSYS_Q_RSRV 95
@@ -78,6 +80,9 @@ struct lan966x {
struct list_head mac_entries;
spinlock_t mac_lock; /* lock for mac_entries list */

+ u16 vlan_mask[VLAN_N_VID];
+ DECLARE_BITMAP(cpu_vlan_mask, VLAN_N_VID);
+
/* stats */
const struct lan966x_stat_layout *stats_layout;
u32 num_stats;
@@ -109,6 +114,8 @@ struct lan966x_port {

u8 chip_port;
u16 pvid;
+ u16 vid;
+ u8 vlan_aware;

struct phylink_config phylink_config;
struct phylink_pcs phylink_pcs;
@@ -157,6 +164,37 @@ int lan966x_mac_add_entry(struct lan966x *lan966x,
void lan966x_mac_purge_entries(struct lan966x *lan966x);
irqreturn_t lan966x_mac_irq_handler(struct lan966x *lan966x);

+void lan966x_vlan_init(struct lan966x *lan966x);
+void lan966x_vlan_port_apply(struct lan966x_port *port);
+
+int lan966x_vlan_rx_add_vid(struct net_device *dev, __be16 proto, u16 vid);
+int lan966x_vlan_rx_kill_vid(struct net_device *dev, __be16 proto, u16 vid);
+
+void lan966x_vlan_mode(struct lan966x_port *port, netdev_features_t features);
+u16 lan966x_vlan_port_get_pvid(struct lan966x_port *port);
+
+bool lan966x_vlan_port_member_vlan_mask(struct lan966x_port *port, u16 vid);
+bool lan966x_vlan_cpu_member_vlan_mask(struct lan966x *lan966x, u16 vid);
+bool lan966x_vlan_port_any_vlan_mask(struct lan966x *lan966x, u16 vid);
+
+void lan966x_vlan_cpu_set_vlan_aware(struct lan966x_port *port);
+void lan966x_vlan_port_set_vlan_aware(struct lan966x_port *port,
+ bool vlan_aware);
+int lan966x_vlan_port_set_vid(struct lan966x_port *port, u16 vid,
+ bool pvid, bool untagged);
+int lan966x_vlan_port_add_vlan(struct lan966x_port *port,
+ u16 vid,
+ bool pvid,
+ bool untagged);
+int lan966x_vlan_port_del_vlan(struct lan966x_port *port,
+ u16 vid);
+int lan966x_vlan_cpu_add_vlan(struct lan966x *lan966x,
+ struct net_device *dev,
+ u16 vid);
+int lan966x_vlan_cpu_del_vlan(struct lan966x *lan966x,
+ struct net_device *dev,
+ u16 vid);
+
static inline void __iomem *lan_addr(void __iomem *base[],
int id, int tinst, int tcnt,
int gbase, int ginst,
diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_vlan.c b/drivers/net/ethernet/microchip/lan966x/lan966x_vlan.c
new file mode 100644
index 000000000000..e47552775d06
--- /dev/null
+++ b/drivers/net/ethernet/microchip/lan966x/lan966x_vlan.c
@@ -0,0 +1,436 @@
+// SPDX-License-Identifier: GPL-2.0+
+
+#include "lan966x_main.h"
+
+#define VLANACCESS_CMD_IDLE 0
+#define VLANACCESS_CMD_READ 1
+#define VLANACCESS_CMD_WRITE 2
+#define VLANACCESS_CMD_INIT 3
+
+static int lan966x_vlan_get_status(struct lan966x *lan966x)
+{
+ return lan_rd(lan966x, ANA_VLANACCESS);
+}
+
+static int lan966x_vlan_wait_for_completion(struct lan966x *lan966x)
+{
+ u32 val;
+
+ return readx_poll_timeout(lan966x_vlan_get_status,
+ lan966x, val,
+ (val & ANA_VLANACCESS_VLAN_TBL_CMD) ==
+ VLANACCESS_CMD_IDLE,
+ TABLE_UPDATE_SLEEP_US, TABLE_UPDATE_TIMEOUT_US);
+}
+
+static int lan966x_vlan_set_mask(struct lan966x *lan966x, u16 vid)
+{
+ u16 mask = lan966x->vlan_mask[vid];
+ bool cpu_dis;
+
+ cpu_dis = !(mask & BIT(CPU_PORT));
+
+ /* Set flags and the VID to configure */
+ lan_rmw(ANA_VLANTIDX_VLAN_PGID_CPU_DIS_SET(cpu_dis) |
+ ANA_VLANTIDX_V_INDEX_SET(vid),
+ ANA_VLANTIDX_VLAN_PGID_CPU_DIS |
+ ANA_VLANTIDX_V_INDEX,
+ lan966x, ANA_VLANTIDX);
+
+ /* Set the vlan port members mask */
+ lan_rmw(ANA_VLAN_PORT_MASK_VLAN_PORT_MASK_SET(mask),
+ ANA_VLAN_PORT_MASK_VLAN_PORT_MASK,
+ lan966x, ANA_VLAN_PORT_MASK);
+
+ /* Issue a write command */
+ lan_rmw(ANA_VLANACCESS_VLAN_TBL_CMD_SET(VLANACCESS_CMD_WRITE),
+ ANA_VLANACCESS_VLAN_TBL_CMD,
+ lan966x, ANA_VLANACCESS);
+
+ return lan966x_vlan_wait_for_completion(lan966x);
+}
+
+void lan966x_vlan_init(struct lan966x *lan966x)
+{
+ u16 port, vid;
+
+ /* Clear VLAN table, by default all ports are members of all VLANS */
+ lan_rmw(ANA_VLANACCESS_VLAN_TBL_CMD_SET(VLANACCESS_CMD_INIT),
+ ANA_VLANACCESS_VLAN_TBL_CMD,
+ lan966x, ANA_VLANACCESS);
+ lan966x_vlan_wait_for_completion(lan966x);
+
+ for (vid = 1; vid < VLAN_N_VID; vid++) {
+ lan966x->vlan_mask[vid] = 0;
+ lan966x_vlan_set_mask(lan966x, vid);
+ }
+
+ /* Set all the ports + cpu to be part of HOST_PVID and UNAWARE_PVID */
+ lan966x->vlan_mask[HOST_PVID] =
+ GENMASK(lan966x->num_phys_ports - 1, 0) | BIT(CPU_PORT);
+ lan966x_vlan_set_mask(lan966x, HOST_PVID);
+
+ lan966x->vlan_mask[UNAWARE_PVID] =
+ GENMASK(lan966x->num_phys_ports - 1, 0) | BIT(CPU_PORT);
+ lan966x_vlan_set_mask(lan966x, UNAWARE_PVID);
+
+ /* Configure the CPU port to be vlan aware */
+ lan_wr(ANA_VLAN_CFG_VLAN_VID_SET(0) |
+ ANA_VLAN_CFG_VLAN_AWARE_ENA_SET(1) |
+ ANA_VLAN_CFG_VLAN_POP_CNT_SET(1),
+ lan966x, ANA_VLAN_CFG(CPU_PORT));
+
+ /* Set vlan ingress filter mask to all ports */
+ lan_wr(GENMASK(lan966x->num_phys_ports, 0),
+ lan966x, ANA_VLANMASK);
+
+ for (port = 0; port < lan966x->num_phys_ports; port++) {
+ lan_wr(0, lan966x, REW_PORT_VLAN_CFG(port));
+ lan_wr(0, lan966x, REW_TAG_CFG(port));
+ }
+}
+
+static int lan966x_vlan_port_add_vlan_mask(struct lan966x_port *port, u16 vid)
+{
+ struct lan966x *lan966x = port->lan966x;
+ u8 p = port->chip_port;
+
+ lan966x->vlan_mask[vid] |= BIT(p);
+ return lan966x_vlan_set_mask(lan966x, vid);
+}
+
+static int lan966x_vlan_port_del_vlan_mask(struct lan966x_port *port, u16 vid)
+{
+ struct lan966x *lan966x = port->lan966x;
+ u8 p = port->chip_port;
+
+ lan966x->vlan_mask[vid] &= ~BIT(p);
+ return lan966x_vlan_set_mask(lan966x, vid);
+}
+
+bool lan966x_vlan_port_member_vlan_mask(struct lan966x_port *port, u16 vid)
+{
+ struct lan966x *lan966x = port->lan966x;
+ u8 p = port->chip_port;
+
+ return lan966x->vlan_mask[vid] & BIT(p);
+}
+
+bool lan966x_vlan_port_any_vlan_mask(struct lan966x *lan966x, u16 vid)
+{
+ return !!(lan966x->vlan_mask[vid] & ~BIT(CPU_PORT));
+}
+
+static int lan966x_vlan_cpu_add_vlan_mask(struct lan966x *lan966x, u16 vid)
+{
+ lan966x->vlan_mask[vid] |= BIT(CPU_PORT);
+ return lan966x_vlan_set_mask(lan966x, vid);
+}
+
+static int lan966x_vlan_cpu_del_vlan_mask(struct lan966x *lan966x, u16 vid)
+{
+ lan966x->vlan_mask[vid] &= ~BIT(CPU_PORT);
+ return lan966x_vlan_set_mask(lan966x, vid);
+}
+
+bool lan966x_vlan_cpu_member_vlan_mask(struct lan966x *lan966x, u16 vid)
+{
+ return lan966x->vlan_mask[vid] & BIT(CPU_PORT);
+}
+
+static void lan966x_vlan_cpu_add_cpu_vlan_mask(struct lan966x *lan966x, u16 vid)
+{
+ set_bit(vid, lan966x->cpu_vlan_mask);
+}
+
+static void lan966x_vlan_cpu_del_cpu_vlan_mask(struct lan966x *lan966x, u16 vid)
+{
+ clear_bit(vid, lan966x->cpu_vlan_mask);
+}
+
+static bool lan966x_vlan_cpu_member_cpu_vlan_mask(struct lan966x *lan966x, u16 vid)
+{
+ return test_bit(vid, lan966x->cpu_vlan_mask);
+}
+
+u16 lan966x_vlan_port_get_pvid(struct lan966x_port *port)
+{
+ return port->vlan_aware ? port->pvid : UNAWARE_PVID;
+}
+
+int lan966x_vlan_port_set_vid(struct lan966x_port *port, u16 vid,
+ bool pvid, bool untagged)
+{
+ struct lan966x *lan966x = port->lan966x;
+
+ /* Egress vlan classification */
+ if (untagged && port->vid != vid) {
+ if (port->vid) {
+ dev_err(lan966x->dev,
+ "Port already has a native VLAN: %d\n",
+ port->vid);
+ return -EBUSY;
+ }
+ port->vid = vid;
+ }
+
+ /* Default ingress vlan classification */
+ if (pvid)
+ port->pvid = vid;
+
+ return 0;
+}
+
+static int lan966x_vlan_port_remove_vid(struct lan966x_port *port, u16 vid)
+{
+ if (port->pvid == vid)
+ port->pvid = 0;
+
+ if (port->vid == vid)
+ port->vid = 0;
+
+ return 0;
+}
+
+void lan966x_vlan_port_set_vlan_aware(struct lan966x_port *port,
+ bool vlan_aware)
+{
+ port->vlan_aware = vlan_aware;
+}
+
+void lan966x_vlan_cpu_set_vlan_aware(struct lan966x_port *port)
+{
+ struct lan966x *lan966x = port->lan966x;
+
+ if (!port->vlan_aware) {
+ /* In case of vlan unaware, all the ports will be set in
+ * UNAWARE_PVID and have their PVID set to this PVID
+ * The CPU doesn't need to be added because it is always part of
+ * that vlan, it is required just to add entries in the MAC
+ * table for the front port and the CPU
+ */
+ lan966x_mac_cpu_learn(lan966x, port->dev->dev_addr, UNAWARE_PVID);
+
+ lan966x_vlan_port_add_vlan_mask(port, UNAWARE_PVID);
+ lan966x_vlan_port_apply(port);
+ } else {
+ /* In case of vlan aware, just clear what happened when changed
+ * to vlan unaware
+ */
+ lan966x_mac_cpu_forget(lan966x, port->dev->dev_addr, UNAWARE_PVID);
+
+ lan966x_vlan_port_del_vlan_mask(port, UNAWARE_PVID);
+ lan966x_vlan_port_apply(port);
+ }
+}
+
+void lan966x_vlan_port_apply(struct lan966x_port *port)
+{
+ struct lan966x *lan966x = port->lan966x;
+ u16 pvid;
+ u32 val;
+
+ pvid = lan966x_vlan_port_get_pvid(port);
+
+ /* Ingress clasification (ANA_PORT_VLAN_CFG) */
+ /* Default vlan to casify for untagged frames (may be zero) */
+ val = ANA_VLAN_CFG_VLAN_VID_SET(pvid);
+ if (port->vlan_aware)
+ val |= ANA_VLAN_CFG_VLAN_AWARE_ENA_SET(1) |
+ ANA_VLAN_CFG_VLAN_POP_CNT_SET(1);
+
+ lan_rmw(val,
+ ANA_VLAN_CFG_VLAN_VID | ANA_VLAN_CFG_VLAN_AWARE_ENA |
+ ANA_VLAN_CFG_VLAN_POP_CNT,
+ lan966x, ANA_VLAN_CFG(port->chip_port));
+
+ /* Drop frames with multicast source address */
+ val = ANA_DROP_CFG_DROP_MC_SMAC_ENA_SET(1);
+ if (port->vlan_aware && !pvid)
+ /* If port is vlan-aware and tagged, drop untagged and priority
+ * tagged frames.
+ */
+ val |= ANA_DROP_CFG_DROP_UNTAGGED_ENA_SET(1) |
+ ANA_DROP_CFG_DROP_PRIO_S_TAGGED_ENA_SET(1) |
+ ANA_DROP_CFG_DROP_PRIO_C_TAGGED_ENA_SET(1);
+
+ lan_wr(val, lan966x, ANA_DROP_CFG(port->chip_port));
+
+ /* Egress configuration (REW_TAG_CFG): VLAN tag type to 8021Q */
+ val = REW_TAG_CFG_TAG_TPID_CFG_SET(0);
+ if (port->vlan_aware) {
+ if (port->vid)
+ /* Tag all frames except when VID == DEFAULT_VLAN */
+ val |= REW_TAG_CFG_TAG_CFG_SET(1);
+ else
+ val |= REW_TAG_CFG_TAG_CFG_SET(3);
+ }
+
+ /* Update only some bits in the register */
+ lan_rmw(val,
+ REW_TAG_CFG_TAG_TPID_CFG | REW_TAG_CFG_TAG_CFG,
+ lan966x, REW_TAG_CFG(port->chip_port));
+
+ /* Set default VLAN and tag type to 8021Q */
+ lan_rmw(REW_PORT_VLAN_CFG_PORT_TPID_SET(ETH_P_8021Q) |
+ REW_PORT_VLAN_CFG_PORT_VID_SET(port->vid),
+ REW_PORT_VLAN_CFG_PORT_TPID |
+ REW_PORT_VLAN_CFG_PORT_VID,
+ lan966x, REW_PORT_VLAN_CFG(port->chip_port));
+}
+
+int lan966x_vlan_port_add_vlan(struct lan966x_port *port,
+ u16 vid,
+ bool pvid,
+ bool untagged)
+{
+ struct lan966x *lan966x = port->lan966x;
+
+ /* If the CPU(br) is already part of the vlan then add the MAC
+ * address of the device in MAC table to copy the frames to the
+ * CPU(br). If the CPU(br) is not part of the vlan then it would
+ * just drop the frames.
+ */
+ if (lan966x_vlan_cpu_member_cpu_vlan_mask(lan966x, vid)) {
+ lan966x_mac_cpu_learn(lan966x, port->dev->dev_addr, vid);
+ lan966x_vlan_cpu_add_vlan_mask(lan966x, vid);
+ }
+
+ lan966x_vlan_port_set_vid(port, vid, pvid, untagged);
+ lan966x_vlan_port_add_vlan_mask(port, vid);
+ lan966x_vlan_port_apply(port);
+
+ return 0;
+}
+
+int lan966x_vlan_port_del_vlan(struct lan966x_port *port,
+ u16 vid)
+{
+ struct lan966x *lan966x = port->lan966x;
+
+ /* In case the CPU(br) is part of the vlan then remove the MAC entry
+ * because frame doesn't need to reach to CPU
+ */
+ if (lan966x_vlan_cpu_member_cpu_vlan_mask(lan966x, vid))
+ lan966x_mac_cpu_forget(lan966x, port->dev->dev_addr, vid);
+
+ lan966x_vlan_port_remove_vid(port, vid);
+ lan966x_vlan_port_del_vlan_mask(port, vid);
+ lan966x_vlan_port_apply(port);
+
+ /* In case there are no other ports in vlan then remove the CPU from
+ * that vlan but still keep it in the mask because it may be needed
+ * again then another port gets added in tha vlan
+ */
+ if (!lan966x_vlan_port_any_vlan_mask(lan966x, vid))
+ lan966x_vlan_cpu_del_vlan_mask(lan966x, vid);
+
+ return 0;
+}
+
+int lan966x_vlan_cpu_add_vlan(struct lan966x *lan966x,
+ struct net_device *dev,
+ u16 vid)
+{
+ int p;
+
+ /* Iterate over the ports and see which ones are part of the
+ * vlan and for those ports add entry in the MAC table to
+ * copy the frames to the CPU
+ */
+ for (p = 0; p < lan966x->num_phys_ports; p++) {
+ struct lan966x_port *port = lan966x->ports[p];
+
+ if (!port ||
+ !lan966x_vlan_port_member_vlan_mask(port, vid))
+ continue;
+
+ lan966x_mac_cpu_learn(lan966x, port->dev->dev_addr, vid);
+ }
+
+ /* Add an entry in the MAC table for the CPU */
+ if (lan966x_vlan_port_any_vlan_mask(lan966x, vid))
+ lan966x_mac_cpu_learn(lan966x, dev->dev_addr, vid);
+
+ /* Add the CPU part of the vlan only if there is another port in that
+ * vlan otherwise all the broadcast frames in that vlan will go to CPU
+ * even if none of the ports are in the vlan and then the CPU will just
+ * need to discard these frames. It is required to store this
+ * information so when a front port is added then it would add also the
+ * CPU port.
+ */
+ if (lan966x_vlan_port_any_vlan_mask(lan966x, vid))
+ lan966x_vlan_cpu_add_vlan_mask(lan966x, vid);
+
+ lan966x_vlan_cpu_add_cpu_vlan_mask(lan966x, vid);
+
+ return 0;
+}
+
+int lan966x_vlan_cpu_del_vlan(struct lan966x *lan966x,
+ struct net_device *dev,
+ u16 vid)
+{
+ int p;
+
+ /* Iterate over the ports and see which ones are part of the
+ * vlan and for those ports remove entry in the MAC table to
+ * copy the frames to the CPU
+ */
+ for (p = 0; p < lan966x->num_phys_ports; p++) {
+ struct lan966x_port *port = lan966x->ports[p];
+
+ if (!port ||
+ !lan966x_vlan_port_member_vlan_mask(port, vid))
+ continue;
+
+ lan966x_mac_cpu_forget(lan966x, port->dev->dev_addr, vid);
+ }
+
+ /* Remove an entry in the MAC table for the CPU */
+ lan966x_mac_cpu_forget(lan966x, dev->dev_addr, vid);
+
+ /* Remove the CPU part of the vlan */
+ lan966x_vlan_cpu_del_cpu_vlan_mask(lan966x, vid);
+ lan966x_vlan_cpu_del_vlan_mask(lan966x, vid);
+
+ return 0;
+}
+
+int lan966x_vlan_rx_add_vid(struct net_device *dev, __be16 proto, u16 vid)
+{
+ struct lan966x_port *port = netdev_priv(dev);
+
+ lan966x_vlan_port_set_vid(port, vid, false, false);
+ lan966x_vlan_port_add_vlan_mask(port, vid);
+ lan966x_vlan_port_apply(port);
+
+ return 0;
+}
+
+int lan966x_vlan_rx_kill_vid(struct net_device *dev, __be16 proto,
+ u16 vid)
+{
+ struct lan966x_port *port = netdev_priv(dev);
+
+ lan966x_vlan_port_remove_vid(port, vid);
+ lan966x_vlan_port_del_vlan_mask(port, vid);
+ lan966x_vlan_port_apply(port);
+
+ return 0;
+}
+
+void lan966x_vlan_mode(struct lan966x_port *port,
+ netdev_features_t features)
+{
+ struct lan966x *lan966x = port->lan966x;
+ u32 val;
+
+ /* Filtering */
+ val = lan_rd(lan966x, ANA_VLANMASK);
+ if (features & NETIF_F_HW_VLAN_CTAG_FILTER)
+ val |= BIT(port->chip_port);
+ else
+ val &= ~BIT(port->chip_port);
+ lan_wr(val, lan966x, ANA_VLANMASK);
+}
--
2.33.0


2021-12-09 09:46:46

by Horatiu Vultur

[permalink] [raw]
Subject: [PATCH net-next v3 6/6] net: lan966x: Add switchdev support

This adds support for switchdev in lan966x.
It offloads to the HW basic forwarding and vlan filtering. To be able to
offload this to the HW, it is required to disable promisc mode for ports
that are part of the bridge.

Signed-off-by: Horatiu Vultur <[email protected]>
---
.../net/ethernet/microchip/lan966x/Makefile | 3 +-
.../ethernet/microchip/lan966x/lan966x_main.c | 41 +-
.../ethernet/microchip/lan966x/lan966x_main.h | 18 +
.../microchip/lan966x/lan966x_switchdev.c | 548 ++++++++++++++++++
.../ethernet/microchip/lan966x/lan966x_vlan.c | 12 +-
5 files changed, 610 insertions(+), 12 deletions(-)
create mode 100644 drivers/net/ethernet/microchip/lan966x/lan966x_switchdev.c

diff --git a/drivers/net/ethernet/microchip/lan966x/Makefile b/drivers/net/ethernet/microchip/lan966x/Makefile
index f7e6068a91cb..d82e896c2e53 100644
--- a/drivers/net/ethernet/microchip/lan966x/Makefile
+++ b/drivers/net/ethernet/microchip/lan966x/Makefile
@@ -6,4 +6,5 @@
obj-$(CONFIG_LAN966X_SWITCH) += lan966x-switch.o

lan966x-switch-objs := lan966x_main.o lan966x_phylink.o lan966x_port.o \
- lan966x_mac.o lan966x_ethtool.o lan966x_vlan.o
+ lan966x_mac.o lan966x_ethtool.o lan966x_switchdev.o \
+ lan966x_vlan.o
diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
index 1b4c7e6b4f85..aee36c1cfa17 100644
--- a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
+++ b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
@@ -306,7 +306,7 @@ static int lan966x_port_xmit(struct sk_buff *skb, struct net_device *dev)
return lan966x_port_ifh_xmit(skb, ifh, dev);
}

-static void lan966x_set_promisc(struct lan966x_port *port, bool enable)
+void lan966x_set_promisc(struct lan966x_port *port, bool enable)
{
struct lan966x *lan966x = port->lan966x;

@@ -318,14 +318,18 @@ static void lan966x_set_promisc(struct lan966x_port *port, bool enable)
static void lan966x_port_change_rx_flags(struct net_device *dev, int flags)
{
struct lan966x_port *port = netdev_priv(dev);
+ bool enable;

if (!(flags & IFF_PROMISC))
return;

- if (dev->flags & IFF_PROMISC)
- lan966x_set_promisc(port, true);
- else
- lan966x_set_promisc(port, false);
+ enable = dev->flags & IFF_PROMISC ? true : false;
+ port->promisc = enable;
+
+ if (port->bridge)
+ return;
+
+ lan966x_set_promisc(port, enable);
}

static int lan966x_port_change_mtu(struct net_device *dev, int new_mtu)
@@ -340,7 +344,7 @@ static int lan966x_port_change_mtu(struct net_device *dev, int new_mtu)
return 0;
}

-static int lan966x_mc_unsync(struct net_device *dev, const unsigned char *addr)
+int lan966x_mc_unsync(struct net_device *dev, const unsigned char *addr)
{
struct lan966x_port *port = netdev_priv(dev);
struct lan966x *lan966x = port->lan966x;
@@ -348,7 +352,7 @@ static int lan966x_mc_unsync(struct net_device *dev, const unsigned char *addr)
return lan966x_mac_forget(lan966x, addr, port->pvid, ENTRYTYPE_LOCKED);
}

-static int lan966x_mc_sync(struct net_device *dev, const unsigned char *addr)
+int lan966x_mc_sync(struct net_device *dev, const unsigned char *addr)
{
struct lan966x_port *port = netdev_priv(dev);
struct lan966x *lan966x = port->lan966x;
@@ -401,6 +405,11 @@ static const struct net_device_ops lan966x_port_netdev_ops = {
.ndo_vlan_rx_kill_vid = lan966x_vlan_rx_kill_vid,
};

+bool lan966x_netdevice_check(const struct net_device *dev)
+{
+ return dev && (dev->netdev_ops == &lan966x_port_netdev_ops);
+}
+
static int lan966x_port_xtr_status(struct lan966x *lan966x, u8 grp)
{
return lan_rd(lan966x, QS_XTR_RD(grp));
@@ -537,6 +546,11 @@ static irqreturn_t lan966x_xtr_irq_handler(int irq, void *args)

skb->protocol = eth_type_trans(skb, dev);

+#ifdef CONFIG_NET_SWITCHDEV
+ if (lan966x->ports[src_port]->bridge)
+ skb->offload_fwd_mark = 1;
+#endif
+
netif_rx_ni(skb);
dev->stats.rx_bytes += len;
dev->stats.rx_packets++;
@@ -619,13 +633,16 @@ static int lan966x_probe_port(struct lan966x *lan966x, u32 p,

dev->netdev_ops = &lan966x_port_netdev_ops;
dev->ethtool_ops = &lan966x_ethtool_ops;
+ dev->hw_features |= NETIF_F_HW_VLAN_CTAG_FILTER |
+ NETIF_F_RXFCS;
+ dev->features |= NETIF_F_HW_VLAN_CTAG_FILTER |
+ NETIF_F_HW_VLAN_CTAG_TX |
+ NETIF_F_HW_VLAN_STAG_TX;
+ dev->priv_flags |= IFF_UNICAST_FLT;
dev->needed_headroom = IFH_LEN * sizeof(u32);

eth_hw_addr_gen(dev, lan966x->base_mac, p + 1);

- lan966x_mac_learn(lan966x, PGID_CPU, dev->dev_addr, port->pvid,
- ENTRYTYPE_LOCKED);
-
port->phylink_config.dev = &port->dev->dev;
port->phylink_config.type = PHYLINK_NETDEV;
port->phylink_pcs.poll = true;
@@ -949,6 +966,8 @@ static int lan966x_probe(struct platform_device *pdev)
lan966x_port_init(lan966x->ports[p]);
}

+ lan966x_register_notifier_blocks(lan966x);
+
return 0;

cleanup_ports:
@@ -967,6 +986,8 @@ static int lan966x_remove(struct platform_device *pdev)
{
struct lan966x *lan966x = platform_get_drvdata(pdev);

+ lan966x_unregister_notifier_blocks(lan966x);
+
lan966x_cleanup_ports(lan966x);

cancel_delayed_work_sync(&lan966x->stats_work);
diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_main.h b/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
index ec3eccf634b3..4a0988087167 100644
--- a/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
+++ b/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
@@ -80,6 +80,11 @@ struct lan966x {
struct list_head mac_entries;
spinlock_t mac_lock; /* lock for mac_entries list */

+ /* Notifiers */
+ struct notifier_block netdevice_nb;
+ struct notifier_block switchdev_nb;
+ struct notifier_block switchdev_blocking_nb;
+
u16 vlan_mask[VLAN_N_VID];
DECLARE_BITMAP(cpu_vlan_mask, VLAN_N_VID);

@@ -112,6 +117,10 @@ struct lan966x_port {
struct net_device *dev;
struct lan966x *lan966x;

+ struct net_device *bridge;
+ u8 stp_state;
+ u8 promisc;
+
u8 chip_port;
u16 pvid;
u16 vid;
@@ -129,6 +138,14 @@ extern const struct phylink_mac_ops lan966x_phylink_mac_ops;
extern const struct phylink_pcs_ops lan966x_phylink_pcs_ops;
extern const struct ethtool_ops lan966x_ethtool_ops;

+int lan966x_mc_unsync(struct net_device *dev, const unsigned char *addr);
+int lan966x_mc_sync(struct net_device *dev, const unsigned char *addr);
+
+bool lan966x_netdevice_check(const struct net_device *dev);
+
+int lan966x_register_notifier_blocks(struct lan966x *lan966x);
+void lan966x_unregister_notifier_blocks(struct lan966x *lan966x);
+
void lan966x_stats_get(struct net_device *dev,
struct rtnl_link_stats64 *stats);
int lan966x_stats_init(struct lan966x *lan966x);
@@ -139,6 +156,7 @@ void lan966x_port_status_get(struct lan966x_port *port,
struct phylink_link_state *state);
int lan966x_port_pcs_set(struct lan966x_port *port,
struct lan966x_port_config *config);
+void lan966x_set_promisc(struct lan966x_port *port, bool enable);
void lan966x_port_init(struct lan966x_port *port);

int lan966x_mac_learn(struct lan966x *lan966x, int port,
diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_switchdev.c b/drivers/net/ethernet/microchip/lan966x/lan966x_switchdev.c
new file mode 100644
index 000000000000..ed6ec78d2d9a
--- /dev/null
+++ b/drivers/net/ethernet/microchip/lan966x/lan966x_switchdev.c
@@ -0,0 +1,548 @@
+// SPDX-License-Identifier: GPL-2.0+
+
+#include <linux/if_bridge.h>
+#include <net/switchdev.h>
+
+#include "lan966x_main.h"
+
+static struct workqueue_struct *lan966x_owq;
+
+struct lan966x_fdb_event_work {
+ struct work_struct work;
+ struct switchdev_notifier_fdb_info fdb_info;
+ struct net_device *dev;
+ struct lan966x *lan966x;
+ unsigned long event;
+};
+
+static void lan966x_port_attr_bridge_flags(struct lan966x_port *port,
+ struct switchdev_brport_flags flags)
+{
+ u32 val = lan_rd(port->lan966x, ANA_PGID(PGID_MC));
+
+ val = ANA_PGID_PGID_GET(val);
+
+ if (flags.mask & BR_MCAST_FLOOD) {
+ if (flags.val & BR_MCAST_FLOOD)
+ val |= BIT(port->chip_port);
+ else
+ val &= ~BIT(port->chip_port);
+ }
+
+ lan_rmw(ANA_PGID_PGID_SET(val),
+ ANA_PGID_PGID,
+ port->lan966x, ANA_PGID(PGID_MC));
+}
+
+static u32 lan966x_get_fwd_mask(struct lan966x_port *port)
+{
+ struct net_device *bridge = port->bridge;
+ struct lan966x *lan966x = port->lan966x;
+ u8 ingress_src = port->chip_port;
+ u32 mask = 0;
+ int p;
+
+ if (port->stp_state != BR_STATE_FORWARDING)
+ goto skip_forwarding;
+
+ for (p = 0; p < lan966x->num_phys_ports; p++) {
+ port = lan966x->ports[p];
+
+ if (!port)
+ continue;
+
+ if (port->stp_state == BR_STATE_FORWARDING &&
+ port->bridge == bridge)
+ mask |= BIT(p);
+ }
+
+skip_forwarding:
+ mask &= ~BIT(ingress_src);
+
+ return mask;
+}
+
+static void lan966x_update_fwd_mask(struct lan966x *lan966x)
+{
+ int p;
+
+ for (p = 0; p < lan966x->num_phys_ports; p++) {
+ struct lan966x_port *port = lan966x->ports[p];
+ unsigned long mask = 0;
+
+ if (port->bridge)
+ mask = lan966x_get_fwd_mask(port);
+
+ mask |= BIT(CPU_PORT);
+
+ lan_wr(ANA_PGID_PGID_SET(mask),
+ lan966x, ANA_PGID(PGID_SRC + p));
+ }
+}
+
+static void lan966x_attr_stp_state_set(struct lan966x_port *port,
+ u8 state)
+{
+ struct lan966x *lan966x = port->lan966x;
+ bool learn_ena = 0;
+
+ port->stp_state = state;
+
+ if (state == BR_STATE_FORWARDING || state == BR_STATE_LEARNING)
+ learn_ena = 1;
+
+ lan_rmw(ANA_PORT_CFG_LEARN_ENA_SET(learn_ena),
+ ANA_PORT_CFG_LEARN_ENA,
+ lan966x, ANA_PORT_CFG(port->chip_port));
+
+ lan966x_update_fwd_mask(lan966x);
+}
+
+static void lan966x_port_attr_ageing_set(struct lan966x_port *port,
+ unsigned long ageing_clock_t)
+{
+ unsigned long ageing_jiffies = clock_t_to_jiffies(ageing_clock_t);
+ u32 ageing_time = jiffies_to_msecs(ageing_jiffies) / 1000;
+
+ lan966x_mac_set_ageing(port->lan966x, ageing_time);
+}
+
+static int lan966x_port_attr_set(struct net_device *dev, const void *ctx,
+ const struct switchdev_attr *attr,
+ struct netlink_ext_ack *extack)
+{
+ struct lan966x_port *port = netdev_priv(dev);
+
+ switch (attr->id) {
+ case SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS:
+ lan966x_port_attr_bridge_flags(port, attr->u.brport_flags);
+ break;
+ case SWITCHDEV_ATTR_ID_PORT_STP_STATE:
+ lan966x_attr_stp_state_set(port, attr->u.stp_state);
+ break;
+ case SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME:
+ lan966x_port_attr_ageing_set(port, attr->u.ageing_time);
+ break;
+ case SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING:
+ lan966x_vlan_port_set_vlan_aware(port, attr->u.vlan_filtering);
+ lan966x_vlan_port_apply(port);
+ lan966x_vlan_cpu_set_vlan_aware(port);
+ break;
+ default:
+ return -EOPNOTSUPP;
+ }
+
+ return 0;
+}
+
+static int lan966x_port_bridge_join(struct lan966x_port *port,
+ struct net_device *bridge,
+ struct netlink_ext_ack *extack)
+{
+ struct net_device *dev = port->dev;
+ int err;
+
+ err = switchdev_bridge_port_offload(dev, dev, NULL, NULL, NULL,
+ false, extack);
+ if (err)
+ return err;
+
+ port->bridge = bridge;
+
+ /* Port enters in bridge mode therefor don't need to copy to CPU
+ * frames for multicast in case the bridge is not requesting them
+ */
+ __dev_mc_unsync(dev, lan966x_mc_unsync);
+
+ /* make sure that the promisc is disabled when entering under the bridge
+ * because we don't want all the frames to come to CPU
+ */
+ lan966x_set_promisc(port, false);
+
+ return 0;
+}
+
+static void lan966x_port_bridge_leave(struct lan966x_port *port,
+ struct net_device *bridge)
+{
+ struct lan966x *lan966x = port->lan966x;
+
+ switchdev_bridge_port_unoffload(port->dev, NULL, NULL, NULL);
+ port->bridge = NULL;
+
+ /* Set the port back to host mode */
+ lan966x_vlan_port_set_vlan_aware(port, 0);
+ lan966x_vlan_port_set_vid(port, HOST_PVID, false, false);
+ lan966x_vlan_port_apply(port);
+
+ lan966x_mac_cpu_learn(lan966x, port->dev->dev_addr, HOST_PVID);
+
+ /* Port enters in host more therefore restore mc list */
+ __dev_mc_sync(port->dev, lan966x_mc_sync, lan966x_mc_unsync);
+
+ /* Restore back the promisc as it was before the interfaces was added to
+ * the bridge
+ */
+ lan966x_set_promisc(port, port->promisc);
+}
+
+static int lan966x_port_changeupper(struct net_device *dev,
+ struct netdev_notifier_changeupper_info *info)
+{
+ struct lan966x_port *port = netdev_priv(dev);
+ struct netlink_ext_ack *extack;
+ int err = 0;
+
+ extack = netdev_notifier_info_to_extack(&info->info);
+
+ if (netif_is_bridge_master(info->upper_dev)) {
+ if (info->linking)
+ err = lan966x_port_bridge_join(port, info->upper_dev,
+ extack);
+ else
+ lan966x_port_bridge_leave(port, info->upper_dev);
+ }
+
+ return err;
+}
+
+static int lan966x_port_add_addr(struct net_device *dev, bool up)
+{
+ struct lan966x_port *port = netdev_priv(dev);
+ struct lan966x *lan966x = port->lan966x;
+ u16 vid;
+
+ vid = lan966x_vlan_port_get_pvid(port);
+
+ if (up)
+ lan966x_mac_cpu_learn(lan966x, dev->dev_addr, vid);
+ else
+ lan966x_mac_cpu_forget(lan966x, dev->dev_addr, vid);
+
+ return 0;
+}
+
+static int lan966x_netdevice_port_event(struct net_device *dev,
+ struct notifier_block *nb,
+ unsigned long event, void *ptr)
+{
+ int err = 0;
+
+ if (!lan966x_netdevice_check(dev))
+ return 0;
+
+ switch (event) {
+ case NETDEV_CHANGEUPPER:
+ err = lan966x_port_changeupper(dev, ptr);
+ break;
+ case NETDEV_PRE_UP:
+ err = lan966x_port_add_addr(dev, true);
+ break;
+ case NETDEV_DOWN:
+ err = lan966x_port_add_addr(dev, false);
+ break;
+ }
+
+ return err;
+}
+
+static int lan966x_netdevice_event(struct notifier_block *nb,
+ unsigned long event, void *ptr)
+{
+ struct net_device *dev = netdev_notifier_info_to_dev(ptr);
+ int ret;
+
+ ret = lan966x_netdevice_port_event(dev, nb, event, ptr);
+
+ return notifier_from_errno(ret);
+}
+
+static void lan966x_fdb_event_work(struct work_struct *work)
+{
+ struct lan966x_fdb_event_work *fdb_work =
+ container_of(work, struct lan966x_fdb_event_work, work);
+ struct switchdev_notifier_fdb_info *fdb_info;
+ struct net_device *dev = fdb_work->dev;
+ struct lan966x_port *port;
+ struct lan966x *lan966x;
+
+ rtnl_lock();
+
+ fdb_info = &fdb_work->fdb_info;
+ lan966x = fdb_work->lan966x;
+
+ if (lan966x_netdevice_check(dev)) {
+ port = netdev_priv(dev);
+
+ switch (fdb_work->event) {
+ case SWITCHDEV_FDB_ADD_TO_DEVICE:
+ if (!fdb_info->added_by_user)
+ break;
+ lan966x_mac_add_entry(lan966x, port, fdb_info->addr,
+ fdb_info->vid);
+ break;
+ case SWITCHDEV_FDB_DEL_TO_DEVICE:
+ if (!fdb_info->added_by_user)
+ break;
+ lan966x_mac_del_entry(lan966x, fdb_info->addr, fdb_info->vid);
+ break;
+ }
+ } else {
+ if (!netif_is_bridge_master(dev))
+ goto out;
+
+ /* If the CPU is not part of the vlan then there is no point
+ * to copy the frames to the CPU because they will be dropped
+ */
+ if (!lan966x_vlan_cpu_member_vlan_mask(lan966x, fdb_info->vid))
+ goto out;
+
+ /* In case the bridge is called */
+ switch (fdb_work->event) {
+ case SWITCHDEV_FDB_ADD_TO_DEVICE:
+ /* If there is no front port in this vlan, there is no
+ * point to copy the frame to CPU because it would be
+ * just dropped at later point. So add it only if
+ * there is a port
+ */
+ if (!lan966x_vlan_port_any_vlan_mask(lan966x, fdb_info->vid))
+ break;
+
+ lan966x_mac_cpu_learn(lan966x, fdb_info->addr, fdb_info->vid);
+ break;
+ case SWITCHDEV_FDB_DEL_TO_DEVICE:
+ /* It is OK to always forget the entry it */
+ lan966x_mac_cpu_forget(lan966x, fdb_info->addr, fdb_info->vid);
+ break;
+ }
+ }
+
+out:
+ rtnl_unlock();
+ kfree(fdb_work->fdb_info.addr);
+ kfree(fdb_work);
+ dev_put(dev);
+}
+
+static int lan966x_switchdev_event(struct notifier_block *nb,
+ unsigned long event, void *ptr)
+{
+ struct lan966x *lan966x = container_of(nb, struct lan966x, switchdev_nb);
+ struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
+ struct switchdev_notifier_fdb_info *fdb_info;
+ struct switchdev_notifier_info *info = ptr;
+ struct lan966x_fdb_event_work *fdb_work;
+ int err;
+
+ switch (event) {
+ case SWITCHDEV_PORT_ATTR_SET:
+ err = switchdev_handle_port_attr_set(dev, ptr,
+ lan966x_netdevice_check,
+ lan966x_port_attr_set);
+ return notifier_from_errno(err);
+ case SWITCHDEV_FDB_ADD_TO_DEVICE:
+ fallthrough;
+ case SWITCHDEV_FDB_DEL_TO_DEVICE:
+ fdb_work = kzalloc(sizeof(*fdb_work), GFP_ATOMIC);
+ if (!fdb_work)
+ return NOTIFY_BAD;
+
+ fdb_info = container_of(info,
+ struct switchdev_notifier_fdb_info,
+ info);
+
+ fdb_work->dev = dev;
+ fdb_work->lan966x = lan966x;
+ fdb_work->event = event;
+ INIT_WORK(&fdb_work->work, lan966x_fdb_event_work);
+ memcpy(&fdb_work->fdb_info, ptr, sizeof(fdb_work->fdb_info));
+ fdb_work->fdb_info.addr = kzalloc(ETH_ALEN, GFP_ATOMIC);
+ if (!fdb_work->fdb_info.addr)
+ goto err_addr_alloc;
+
+ ether_addr_copy((u8 *)fdb_work->fdb_info.addr, fdb_info->addr);
+ dev_hold(dev);
+
+ queue_work(lan966x_owq, &fdb_work->work);
+ break;
+ }
+
+ return NOTIFY_DONE;
+err_addr_alloc:
+ kfree(fdb_work);
+ return NOTIFY_BAD;
+}
+
+static int lan966x_handle_port_vlan_add(struct net_device *dev,
+ struct notifier_block *nb,
+ const struct switchdev_obj_port_vlan *v)
+{
+ struct lan966x_port *port;
+ struct lan966x *lan966x;
+
+ /* When adding a port to a vlan, we get a callback for the port but
+ * also for the bridge. When get the callback for the bridge just bail
+ * out. Then when the bridge is added to the vlan, then we get a
+ * callback here but in this case the flags has set:
+ * BRIDGE_VLAN_INFO_BRENTRY. In this case it means that the CPU
+ * port is added to the vlan, so the broadcast frames and unicast frames
+ * with dmac of the bridge should be foward to CPU.
+ */
+ if (netif_is_bridge_master(dev) &&
+ !(v->flags & BRIDGE_VLAN_INFO_BRENTRY))
+ return 0;
+
+ lan966x = container_of(nb, struct lan966x, switchdev_blocking_nb);
+
+ /* In case the port gets called */
+ if (!(netif_is_bridge_master(dev))) {
+ if (!lan966x_netdevice_check(dev))
+ return -EOPNOTSUPP;
+
+ port = netdev_priv(dev);
+ return lan966x_vlan_port_add_vlan(port, v->vid,
+ v->flags & BRIDGE_VLAN_INFO_PVID,
+ v->flags & BRIDGE_VLAN_INFO_UNTAGGED);
+ }
+
+ /* In case the bridge gets called */
+ if (netif_is_bridge_master(dev))
+ return lan966x_vlan_cpu_add_vlan(lan966x, dev, v->vid);
+
+ return 0;
+}
+
+static int lan966x_handle_port_obj_add(struct net_device *dev,
+ struct notifier_block *nb,
+ struct switchdev_notifier_port_obj_info *info)
+{
+ const struct switchdev_obj *obj = info->obj;
+ int err;
+
+ switch (obj->id) {
+ case SWITCHDEV_OBJ_ID_PORT_VLAN:
+ err = lan966x_handle_port_vlan_add(dev, nb,
+ SWITCHDEV_OBJ_PORT_VLAN(obj));
+ break;
+ default:
+ err = -EOPNOTSUPP;
+ break;
+ }
+
+ info->handled = true;
+ return err;
+}
+
+static int lan966x_handle_port_vlan_del(struct net_device *dev,
+ struct notifier_block *nb,
+ const struct switchdev_obj_port_vlan *v)
+{
+ struct lan966x_port *port;
+ struct lan966x *lan966x;
+
+ lan966x = container_of(nb, struct lan966x, switchdev_blocking_nb);
+
+ /* In case the physical port gets called */
+ if (!netif_is_bridge_master(dev)) {
+ if (!lan966x_netdevice_check(dev))
+ return -EOPNOTSUPP;
+
+ port = netdev_priv(dev);
+ return lan966x_vlan_port_del_vlan(port, v->vid);
+ }
+
+ /* In case the bridge gets called */
+ if (netif_is_bridge_master(dev))
+ return lan966x_vlan_cpu_del_vlan(lan966x, dev, v->vid);
+
+ return 0;
+}
+
+static int lan966x_handle_port_obj_del(struct net_device *dev,
+ struct notifier_block *nb,
+ struct switchdev_notifier_port_obj_info *info)
+{
+ const struct switchdev_obj *obj = info->obj;
+ int err;
+
+ switch (obj->id) {
+ case SWITCHDEV_OBJ_ID_PORT_VLAN:
+ err = lan966x_handle_port_vlan_del(dev, nb,
+ SWITCHDEV_OBJ_PORT_VLAN(obj));
+ break;
+ default:
+ err = -EOPNOTSUPP;
+ break;
+ }
+
+ info->handled = true;
+ return err;
+}
+
+static int lan966x_switchdev_blocking_event(struct notifier_block *nb,
+ unsigned long event,
+ void *ptr)
+{
+ struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
+ int err;
+
+ switch (event) {
+ case SWITCHDEV_PORT_OBJ_ADD:
+ err = lan966x_handle_port_obj_add(dev, nb, ptr);
+ return notifier_from_errno(err);
+ case SWITCHDEV_PORT_OBJ_DEL:
+ err = lan966x_handle_port_obj_del(dev, nb, ptr);
+ return notifier_from_errno(err);
+ case SWITCHDEV_PORT_ATTR_SET:
+ err = switchdev_handle_port_attr_set(dev, ptr,
+ lan966x_netdevice_check,
+ lan966x_port_attr_set);
+ return notifier_from_errno(err);
+ }
+
+ return NOTIFY_DONE;
+}
+
+int lan966x_register_notifier_blocks(struct lan966x *lan966x)
+{
+ int err;
+
+ lan966x->netdevice_nb.notifier_call = lan966x_netdevice_event;
+ err = register_netdevice_notifier(&lan966x->netdevice_nb);
+ if (err)
+ return err;
+
+ lan966x->switchdev_nb.notifier_call = lan966x_switchdev_event;
+ err = register_switchdev_notifier(&lan966x->switchdev_nb);
+ if (err)
+ goto err_switchdev_nb;
+
+ lan966x->switchdev_blocking_nb.notifier_call = lan966x_switchdev_blocking_event;
+ err = register_switchdev_blocking_notifier(&lan966x->switchdev_blocking_nb);
+ if (err)
+ goto err_switchdev_blocking_nb;
+
+ lan966x_owq = alloc_ordered_workqueue("lan966x_order", 0);
+ if (!lan966x_owq) {
+ err = -ENOMEM;
+ goto err_switchdev_blocking_nb;
+ }
+
+ return 0;
+
+err_switchdev_blocking_nb:
+ unregister_switchdev_notifier(&lan966x->switchdev_nb);
+err_switchdev_nb:
+ unregister_netdevice_notifier(&lan966x->netdevice_nb);
+
+ return err;
+}
+
+void lan966x_unregister_notifier_blocks(struct lan966x *lan966x)
+{
+ destroy_workqueue(lan966x_owq);
+
+ unregister_switchdev_blocking_notifier(&lan966x->switchdev_blocking_nb);
+ unregister_switchdev_notifier(&lan966x->switchdev_nb);
+ unregister_netdevice_notifier(&lan966x->netdevice_nb);
+}
diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_vlan.c b/drivers/net/ethernet/microchip/lan966x/lan966x_vlan.c
index e47552775d06..26644503b4e6 100644
--- a/drivers/net/ethernet/microchip/lan966x/lan966x_vlan.c
+++ b/drivers/net/ethernet/microchip/lan966x/lan966x_vlan.c
@@ -155,6 +155,9 @@ static bool lan966x_vlan_cpu_member_cpu_vlan_mask(struct lan966x *lan966x, u16 v

u16 lan966x_vlan_port_get_pvid(struct lan966x_port *port)
{
+ if (!port->bridge)
+ return HOST_PVID;
+
return port->vlan_aware ? port->pvid : UNAWARE_PVID;
}

@@ -210,6 +213,8 @@ void lan966x_vlan_cpu_set_vlan_aware(struct lan966x_port *port)
* table for the front port and the CPU
*/
lan966x_mac_cpu_learn(lan966x, port->dev->dev_addr, UNAWARE_PVID);
+ lan966x_mac_cpu_learn(lan966x, port->bridge->dev_addr,
+ UNAWARE_PVID);

lan966x_vlan_port_add_vlan_mask(port, UNAWARE_PVID);
lan966x_vlan_port_apply(port);
@@ -218,6 +223,8 @@ void lan966x_vlan_cpu_set_vlan_aware(struct lan966x_port *port)
* to vlan unaware
*/
lan966x_mac_cpu_forget(lan966x, port->dev->dev_addr, UNAWARE_PVID);
+ lan966x_mac_cpu_forget(lan966x, port->bridge->dev_addr,
+ UNAWARE_PVID);

lan966x_vlan_port_del_vlan_mask(port, UNAWARE_PVID);
lan966x_vlan_port_apply(port);
@@ -293,6 +300,7 @@ int lan966x_vlan_port_add_vlan(struct lan966x_port *port,
*/
if (lan966x_vlan_cpu_member_cpu_vlan_mask(lan966x, vid)) {
lan966x_mac_cpu_learn(lan966x, port->dev->dev_addr, vid);
+ lan966x_mac_cpu_learn(lan966x, port->bridge->dev_addr, vid);
lan966x_vlan_cpu_add_vlan_mask(lan966x, vid);
}

@@ -322,8 +330,10 @@ int lan966x_vlan_port_del_vlan(struct lan966x_port *port,
* that vlan but still keep it in the mask because it may be needed
* again then another port gets added in tha vlan
*/
- if (!lan966x_vlan_port_any_vlan_mask(lan966x, vid))
+ if (!lan966x_vlan_port_any_vlan_mask(lan966x, vid)) {
+ lan966x_mac_cpu_forget(lan966x, port->bridge->dev_addr, vid);
lan966x_vlan_cpu_del_vlan_mask(lan966x, vid);
+ }

return 0;
}
--
2.33.0


2021-12-09 10:59:14

by Vladimir Oltean

[permalink] [raw]
Subject: Re: [PATCH net-next v3 2/6] dt-bindings: net: lan966x: Extend with the analyzer interrupt

On Thu, Dec 09, 2021 at 10:46:11AM +0100, Horatiu Vultur wrote:
> Extend dt-bindings for lan966x with analyzer interrupt.
> This interrupt can be generated for example when the HW learn/forgets
> an entry in the MAC table.
>
> Signed-off-by: Horatiu Vultur <[email protected]>
> ---

Reviewed-by: Vladimir Oltean <[email protected]>

Why don't you describe your hardware in the device tree all at once?
Doing it piece by piece means that every time when you add a new
functionality you need to be compatible with the absence of a certain
reg, interrupt etc.

> .../devicetree/bindings/net/microchip,lan966x-switch.yaml | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/net/microchip,lan966x-switch.yaml b/Documentation/devicetree/bindings/net/microchip,lan966x-switch.yaml
> index 5bee665d5fcf..e79e4e166ad8 100644
> --- a/Documentation/devicetree/bindings/net/microchip,lan966x-switch.yaml
> +++ b/Documentation/devicetree/bindings/net/microchip,lan966x-switch.yaml
> @@ -37,12 +37,14 @@ properties:
> items:
> - description: register based extraction
> - description: frame dma based extraction
> + - description: analyzer interrupt
>
> interrupt-names:
> minItems: 1
> items:
> - const: xtr
> - const: fdma
> + - const: ana
>
> resets:
> items:
> --
> 2.33.0
>

2021-12-09 11:47:53

by Vladimir Oltean

[permalink] [raw]
Subject: Re: [PATCH net-next v3 3/6] net: lan966x: add support for interrupts from analyzer

On Thu, Dec 09, 2021 at 10:46:12AM +0100, Horatiu Vultur wrote:
> This patch adds support for handling the interrupts generated by the
> analyzer. Currently, only the MAC table generates these interrupts.
> The MAC table will generate an interrupt whenever it learns or forgets
> an entry in the table. It is the SW responsibility figure out which
> entries were added/removed.
>
> Signed-off-by: Horatiu Vultur <[email protected]>
> ---
> .../ethernet/microchip/lan966x/lan966x_mac.c | 244 ++++++++++++++++++
> .../ethernet/microchip/lan966x/lan966x_main.c | 23 ++
> .../ethernet/microchip/lan966x/lan966x_main.h | 6 +
> 3 files changed, 273 insertions(+)
>
> diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_mac.c b/drivers/net/ethernet/microchip/lan966x/lan966x_mac.c
> index f6878b9f57ef..c01ab01bffbf 100644
> --- a/drivers/net/ethernet/microchip/lan966x/lan966x_mac.c
> +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_mac.c
> @@ -1,5 +1,6 @@
> // SPDX-License-Identifier: GPL-2.0+
>
> +#include <net/switchdev.h>
> #include "lan966x_main.h"
>
> #define LAN966X_MAC_COLUMNS 4
> @@ -13,6 +14,23 @@
> #define MACACCESS_CMD_WRITE 7
> #define MACACCESS_CMD_SYNC_GET_NEXT 8
>
> +#define LAN966X_MAC_INVALID_ROW -1
> +
> +struct lan966x_mac_entry {
> + struct list_head list;
> + unsigned char mac[ETH_ALEN] __aligned(2);
> + u16 vid;
> + u16 port_index;
> + int row;
> +};
> +
> +struct lan966x_mac_raw_entry {
> + u32 mach;
> + u32 macl;
> + u32 maca;
> + bool process;
> +};
> +
> static int lan966x_mac_get_status(struct lan966x *lan966x)
> {
> return lan_rd(lan966x, ANA_MACACCESS);
> @@ -98,4 +116,230 @@ void lan966x_mac_init(struct lan966x *lan966x)
> /* Clear the MAC table */
> lan_wr(MACACCESS_CMD_INIT, lan966x, ANA_MACACCESS);
> lan966x_mac_wait_for_completion(lan966x);
> +
> + spin_lock_init(&lan966x->mac_lock);
> + INIT_LIST_HEAD(&lan966x->mac_entries);
> +}
> +
> +static struct lan966x_mac_entry *lan966x_mac_alloc_entry(const unsigned char *mac,
> + u16 vid, u16 port_index)
> +{
> + struct lan966x_mac_entry *mac_entry;
> +
> + mac_entry = kzalloc(sizeof(*mac_entry), GFP_KERNEL);
> + if (!mac_entry)
> + return NULL;
> +
> + memcpy(mac_entry->mac, mac, ETH_ALEN);
> + mac_entry->vid = vid;
> + mac_entry->port_index = port_index;
> + mac_entry->row = LAN966X_MAC_INVALID_ROW;
> + return mac_entry;
> +}
> +
> +static void lan966x_fdb_call_notifiers(enum switchdev_notifier_type type,
> + const char *mac, u16 vid,
> + struct net_device *dev)
> +{
> + struct switchdev_notifier_fdb_info info = { 0 };
> +
> + info.addr = mac;
> + info.vid = vid;
> + info.offloaded = true;
> + call_switchdev_notifiers(type, dev, &info.info, NULL);
> +}
> +
> +void lan966x_mac_purge_entries(struct lan966x *lan966x)
> +{
> + struct lan966x_mac_entry *mac_entry, *tmp;
> + unsigned long flags;
> +
> + spin_lock_irqsave(&lan966x->mac_lock, flags);

I hope I'm not wrong, but you are only using this spinlock to serialize
access to the list, which isn't accessed from hardirq context anywhere
(the irq is threaded). So spin_lock_irqsave could simply be spin_lock.
Unless...

> + list_for_each_entry_safe(mac_entry, tmp, &lan966x->mac_entries,
> + list) {
> + lan966x_mac_forget(lan966x, mac_entry->mac, mac_entry->vid,
> + ENTRYTYPE_LOCKED);

Does this generate a MAC table interrupt?

> +
> + list_del(&mac_entry->list);
> + kfree(mac_entry);
> + }
> + spin_unlock_irqrestore(&lan966x->mac_lock, flags);
> +}
> +
> +static void lan966x_mac_notifiers(struct lan966x *lan966x,
> + enum switchdev_notifier_type type,
> + unsigned char *mac, u32 vid,
> + struct net_device *dev)
> +{
> + rtnl_lock();
> + lan966x_fdb_call_notifiers(type, mac, vid, dev);
> + rtnl_unlock();
> +}
> +
> +static void lan966x_mac_process_raw_entry(struct lan966x_mac_raw_entry *raw_entry,
> + u8 *mac, u16 *vid, u32 *dest_idx)
> +{
> + mac[0] = (raw_entry->mach >> 8) & 0xff;
> + mac[1] = (raw_entry->mach >> 0) & 0xff;
> + mac[2] = (raw_entry->macl >> 24) & 0xff;
> + mac[3] = (raw_entry->macl >> 16) & 0xff;
> + mac[4] = (raw_entry->macl >> 8) & 0xff;
> + mac[5] = (raw_entry->macl >> 0) & 0xff;
> +
> + *vid = (raw_entry->mach >> 16) & 0xfff;
> + *dest_idx = ANA_MACACCESS_DEST_IDX_GET(raw_entry->maca);
> +}
> +
> +static void lan966x_mac_irq_process(struct lan966x *lan966x, u32 row,
> + struct lan966x_mac_raw_entry *raw_entries)
> +{
> + struct lan966x_mac_entry *mac_entry, *tmp;
> + char mac[ETH_ALEN] __aligned(2);

unsigned char

> + unsigned long flags;
> + u32 dest_idx;
> + u32 column;
> + u16 vid;
> +
> + spin_lock_irqsave(&lan966x->mac_lock, flags);
> + list_for_each_entry_safe(mac_entry, tmp, &lan966x->mac_entries, list) {
> + bool found = false;
> +
> + if (mac_entry->row != row)
> + continue;

When the MAC table gets large, you could consider keeping separate lists
per row. This way you can avoid traversing a list of elements you're
sure you don't care about.

> +
> + for (column = 0; column < LAN966X_MAC_COLUMNS; ++column) {
> + /* All the valid entries are at the start of the row,
> + * so when get one invalid entry it can just skip the
> + * rest of the columns
> + */
> + if (!ANA_MACACCESS_VALID_GET(raw_entries[column].maca))
> + break;
> +
> + lan966x_mac_process_raw_entry(&raw_entries[column],
> + mac, &vid, &dest_idx);
> + WARN_ON(dest_idx > lan966x->num_phys_ports);
> +
> + /* If the entry in SW is found, then there is nothing
> + * to do
> + */
> + if (mac_entry->vid == vid &&
> + ether_addr_equal(mac_entry->mac, mac) &&
> + mac_entry->port_index == dest_idx) {
> + raw_entries[column].process = true;
> + found = true;
> + break;
> + }
> + }
> +
> + if (!found) {
> + /* Notify the bridge that the entry doesn't exist
> + * anymore in the HW and remmove the entry from the SW

s/remmove/remove/

> + * list
> + */
> + lan966x_mac_notifiers(lan966x, SWITCHDEV_FDB_DEL_TO_BRIDGE,
> + mac_entry->mac, mac_entry->vid,
> + lan966x->ports[mac_entry->port_index]->dev);
> +
> + list_del(&mac_entry->list);
> + kfree(mac_entry);
> + }
> + }
> + spin_unlock_irqrestore(&lan966x->mac_lock, flags);
> +
> + /* Now go to the list of columns and see if any entry was not in the SW
> + * list, then that means that the entry is new so it needs to notify the
> + * bridge.
> + */
> + for (column = 0; column < LAN966X_MAC_COLUMNS; ++column) {
> + /* All the valid entries are at the start of the row, so when
> + * get one invalid entry it can just skip the rest of the columns
> + */
> + if (!ANA_MACACCESS_VALID_GET(raw_entries[column].maca))
> + break;
> +
> + /* If the entry already exists then don't do anything */
> + if (raw_entries[column].process)

s/process/processed/

> + continue;
> +
> + lan966x_mac_process_raw_entry(&raw_entries[column],
> + mac, &vid, &dest_idx);
> + WARN_ON(dest_idx > lan966x->num_phys_ports);
> +
> + mac_entry = lan966x_mac_alloc_entry(mac, vid, dest_idx);
> + if (!mac_entry)
> + return;
> +
> + mac_entry->row = row;
> +
> + spin_lock_irqsave(&lan966x->mac_lock, flags);
> + list_add_tail(&mac_entry->list, &lan966x->mac_entries);
> + spin_unlock_irqrestore(&lan966x->mac_lock, flags);

spin_lock_irqsave shouldn't be necessary from an irq handler.

> +
> + lan966x_mac_notifiers(lan966x, SWITCHDEV_FDB_ADD_TO_BRIDGE,
> + mac, vid, lan966x->ports[dest_idx]->dev);
> + }
> +}
> +
> +irqreturn_t lan966x_mac_irq_handler(struct lan966x *lan966x)
> +{
> + struct lan966x_mac_raw_entry entry[LAN966X_MAC_COLUMNS] = { 0 };
> + u32 index, column;
> + bool stop = true;
> + u32 val;
> +
> + /* Check if the mac table triggered this, if not just bail out */
> + if (!(ANA_ANAINTR_INTR_GET(lan_rd(lan966x, ANA_ANAINTR))))
> + return IRQ_NONE;

The interrupt isn't shared, so if we enter this condition, it means the
analyzer block generated it, just not the MAC table portion of it.
If we return IRQ_NONE there will be an IRQ storm because that condition
will never go away. Could we ack the interrupt and return IRQ_HANDLED?

> +
> + /* Start the scan from 0, 0 */
> + lan_wr(ANA_MACTINDX_M_INDEX_SET(0) |
> + ANA_MACTINDX_BUCKET_SET(0),
> + lan966x, ANA_MACTINDX);
> +
> + while (1) {
> + lan_rmw(ANA_MACACCESS_MAC_TABLE_CMD_SET(MACACCESS_CMD_SYNC_GET_NEXT),
> + ANA_MACACCESS_MAC_TABLE_CMD,
> + lan966x, ANA_MACACCESS);
> + lan966x_mac_wait_for_completion(lan966x);
> +
> + val = lan_rd(lan966x, ANA_MACTINDX);
> + index = ANA_MACTINDX_M_INDEX_GET(val);
> + column = ANA_MACTINDX_BUCKET_GET(val);
> +
> + /* The SYNC-GET-NEXT returns all the entries(4) in a row in
> + * which is suffered a change. By change it means that new entry
> + * was added or an entry was removed because of ageing.
> + * It would return all the columns for that row. And after that
> + * it would return the next row The stop conditions of the
> + * SYNC-GET-NEXT is when it reaches 'directly' to row 0
> + * column 3. So if SYNC-GET-NEXT returns row 0 and column 0
> + * then it is required to continue to read more even if it
> + * reaches row 0 and column 3.
> + */
> + if (index == 0 && column == 0)
> + stop = false;
> +
> + if (column == LAN966X_MAC_COLUMNS - 1 &&
> + index == 0 && stop)
> + break;
> +
> + entry[column].mach = lan_rd(lan966x, ANA_MACHDATA);
> + entry[column].macl = lan_rd(lan966x, ANA_MACLDATA);
> + entry[column].maca = lan_rd(lan966x, ANA_MACACCESS);
> +
> + /* Once all the columns are read process them */
> + if (column == LAN966X_MAC_COLUMNS - 1) {
> + lan966x_mac_irq_process(lan966x, index, entry);
> + /* A row was processed so it is safe to assume that the
> + * next row/column can be the stop condition
> + */
> + stop = true;
> + }
> + }
> +
> + lan_rmw(ANA_ANAINTR_INTR_SET(0),
> + ANA_ANAINTR_INTR,
> + lan966x, ANA_ANAINTR);
> +
> + return IRQ_HANDLED;
> }
> diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
> index 101c1f005baf..7c6d6293611a 100644
> --- a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
> +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
> @@ -527,6 +527,13 @@ static irqreturn_t lan966x_xtr_irq_handler(int irq, void *args)
> return IRQ_HANDLED;
> }
>
> +static irqreturn_t lan966x_ana_irq_handler(int irq, void *args)
> +{
> + struct lan966x *lan966x = args;
> +
> + return lan966x_mac_irq_handler(lan966x);
> +}
> +
> static void lan966x_cleanup_ports(struct lan966x *lan966x)
> {
> struct lan966x_port *port;
> @@ -554,6 +561,11 @@ static void lan966x_cleanup_ports(struct lan966x *lan966x)
>
> disable_irq(lan966x->xtr_irq);
> lan966x->xtr_irq = -ENXIO;
> +
> + if (lan966x->ana_irq) {
> + disable_irq(lan966x->ana_irq);
> + lan966x->ana_irq = -ENXIO;
> + }
> }
>
> static int lan966x_probe_port(struct lan966x *lan966x, u32 p,
> @@ -870,6 +882,15 @@ static int lan966x_probe(struct platform_device *pdev)
> return -ENODEV;
> }
>
> + lan966x->ana_irq = platform_get_irq_byname(pdev, "ana");
> + if (lan966x->ana_irq) {
> + err = devm_request_threaded_irq(&pdev->dev, lan966x->ana_irq, NULL,
> + lan966x_ana_irq_handler, IRQF_ONESHOT,
> + "ana irq", lan966x);
> + if (err)
> + return dev_err_probe(&pdev->dev, err, "Unable to use ana irq");
> + }
> +
> /* init switch */
> lan966x_init(lan966x);
> lan966x_stats_init(lan966x);
> @@ -923,6 +944,8 @@ static int lan966x_remove(struct platform_device *pdev)
> destroy_workqueue(lan966x->stats_queue);
> mutex_destroy(&lan966x->stats_lock);
>
> + lan966x_mac_purge_entries(lan966x);
> +
> return 0;
> }
>
> diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_main.h b/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
> index 7e5a3b6f168d..ba548d65b58a 100644
> --- a/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
> +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
> @@ -75,6 +75,9 @@ struct lan966x {
>
> u8 base_mac[ETH_ALEN];
>
> + struct list_head mac_entries;
> + spinlock_t mac_lock; /* lock for mac_entries list */
> +
> /* stats */
> const struct lan966x_stat_layout *stats_layout;
> u32 num_stats;
> @@ -87,6 +90,7 @@ struct lan966x {
>
> /* interrupts */
> int xtr_irq;
> + int ana_irq;
> };
>
> struct lan966x_port_config {
> @@ -141,6 +145,8 @@ int lan966x_mac_forget(struct lan966x *lan966x,
> int lan966x_mac_cpu_learn(struct lan966x *lan966x, const char *addr, u16 vid);
> int lan966x_mac_cpu_forget(struct lan966x *lan966x, const char *addr, u16 vid);
> void lan966x_mac_init(struct lan966x *lan966x);
> +void lan966x_mac_purge_entries(struct lan966x *lan966x);
> +irqreturn_t lan966x_mac_irq_handler(struct lan966x *lan966x);
>
> static inline void __iomem *lan_addr(void __iomem *base[],
> int id, int tinst, int tcnt,
> --
> 2.33.0
>

2021-12-09 13:36:24

by Vladimir Oltean

[permalink] [raw]
Subject: Re: [PATCH net-next v3 6/6] net: lan966x: Add switchdev support

On Thu, Dec 09, 2021 at 10:46:15AM +0100, Horatiu Vultur wrote:
> This adds support for switchdev in lan966x.
> It offloads to the HW basic forwarding and vlan filtering. To be able to
> offload this to the HW, it is required to disable promisc mode for ports
> that are part of the bridge.
>
> Signed-off-by: Horatiu Vultur <[email protected]>
> ---
> .../net/ethernet/microchip/lan966x/Makefile | 3 +-
> .../ethernet/microchip/lan966x/lan966x_main.c | 41 +-
> .../ethernet/microchip/lan966x/lan966x_main.h | 18 +
> .../microchip/lan966x/lan966x_switchdev.c | 548 ++++++++++++++++++
> .../ethernet/microchip/lan966x/lan966x_vlan.c | 12 +-
> 5 files changed, 610 insertions(+), 12 deletions(-)
> create mode 100644 drivers/net/ethernet/microchip/lan966x/lan966x_switchdev.c
>
> diff --git a/drivers/net/ethernet/microchip/lan966x/Makefile b/drivers/net/ethernet/microchip/lan966x/Makefile
> index f7e6068a91cb..d82e896c2e53 100644
> --- a/drivers/net/ethernet/microchip/lan966x/Makefile
> +++ b/drivers/net/ethernet/microchip/lan966x/Makefile
> @@ -6,4 +6,5 @@
> obj-$(CONFIG_LAN966X_SWITCH) += lan966x-switch.o
>
> lan966x-switch-objs := lan966x_main.o lan966x_phylink.o lan966x_port.o \
> - lan966x_mac.o lan966x_ethtool.o lan966x_vlan.o
> + lan966x_mac.o lan966x_ethtool.o lan966x_switchdev.o \
> + lan966x_vlan.o
> diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
> index 1b4c7e6b4f85..aee36c1cfa17 100644
> --- a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
> +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
> @@ -306,7 +306,7 @@ static int lan966x_port_xmit(struct sk_buff *skb, struct net_device *dev)
> return lan966x_port_ifh_xmit(skb, ifh, dev);
> }
>
> -static void lan966x_set_promisc(struct lan966x_port *port, bool enable)
> +void lan966x_set_promisc(struct lan966x_port *port, bool enable)
> {
> struct lan966x *lan966x = port->lan966x;
>

My documentation of CPU_SRC_COPY_ENA says:

If set, all frames received on this port are
copied to the CPU extraction queue given by
CPUQ_CFG.CPUQ_SRC_COPY.

I think it was established a while ago that this isn't what promiscuous
mode is about? Instead it is about accepting packets on a port
regardless of whether the MAC DA is in their RX filter or not.

Hence the oddity of your change. I understand what it intends to do:
if this is a standalone port you support IFF_UNICAST_FLT, so you drop
frames with unknown MAC DA. But if IFF_PROMISC is set, then why do you
copy all frames to the CPU? Why don't you just put the CPU in the
unknown flooding mask? There's a difference between "force a packet to
get copied to the CPU" and "copy it to the CPU only if it may have
business there". Then this change would be compatible with bridge mode.
You want the bridge to receive unknown traffic too, it may need to
forward it in software.

> @@ -318,14 +318,18 @@ static void lan966x_set_promisc(struct lan966x_port *port, bool enable)
> static void lan966x_port_change_rx_flags(struct net_device *dev, int flags)
> {
> struct lan966x_port *port = netdev_priv(dev);
> + bool enable;
>
> if (!(flags & IFF_PROMISC))
> return;
>
> - if (dev->flags & IFF_PROMISC)
> - lan966x_set_promisc(port, true);
> - else
> - lan966x_set_promisc(port, false);
> + enable = dev->flags & IFF_PROMISC ? true : false;
> + port->promisc = enable;
> +
> + if (port->bridge)
> + return;
> +
> + lan966x_set_promisc(port, enable);
> }
>
> static int lan966x_port_change_mtu(struct net_device *dev, int new_mtu)
> @@ -340,7 +344,7 @@ static int lan966x_port_change_mtu(struct net_device *dev, int new_mtu)
> return 0;
> }
>
> -static int lan966x_mc_unsync(struct net_device *dev, const unsigned char *addr)
> +int lan966x_mc_unsync(struct net_device *dev, const unsigned char *addr)
> {
> struct lan966x_port *port = netdev_priv(dev);
> struct lan966x *lan966x = port->lan966x;
> @@ -348,7 +352,7 @@ static int lan966x_mc_unsync(struct net_device *dev, const unsigned char *addr)
> return lan966x_mac_forget(lan966x, addr, port->pvid, ENTRYTYPE_LOCKED);
> }
>
> -static int lan966x_mc_sync(struct net_device *dev, const unsigned char *addr)
> +int lan966x_mc_sync(struct net_device *dev, const unsigned char *addr)
> {
> struct lan966x_port *port = netdev_priv(dev);
> struct lan966x *lan966x = port->lan966x;
> @@ -401,6 +405,11 @@ static const struct net_device_ops lan966x_port_netdev_ops = {
> .ndo_vlan_rx_kill_vid = lan966x_vlan_rx_kill_vid,
> };
>
> +bool lan966x_netdevice_check(const struct net_device *dev)
> +{
> + return dev && (dev->netdev_ops == &lan966x_port_netdev_ops);

Can "dev" ever be NULL?

> +}
> +
> static int lan966x_port_xtr_status(struct lan966x *lan966x, u8 grp)
> {
> return lan_rd(lan966x, QS_XTR_RD(grp));
> @@ -537,6 +546,11 @@ static irqreturn_t lan966x_xtr_irq_handler(int irq, void *args)
>
> skb->protocol = eth_type_trans(skb, dev);
>
> +#ifdef CONFIG_NET_SWITCHDEV
> + if (lan966x->ports[src_port]->bridge)
> + skb->offload_fwd_mark = 1;
> +#endif
> +
> netif_rx_ni(skb);
> dev->stats.rx_bytes += len;
> dev->stats.rx_packets++;
> @@ -619,13 +633,16 @@ static int lan966x_probe_port(struct lan966x *lan966x, u32 p,
>
> dev->netdev_ops = &lan966x_port_netdev_ops;
> dev->ethtool_ops = &lan966x_ethtool_ops;
> + dev->hw_features |= NETIF_F_HW_VLAN_CTAG_FILTER |
> + NETIF_F_RXFCS;
> + dev->features |= NETIF_F_HW_VLAN_CTAG_FILTER |
> + NETIF_F_HW_VLAN_CTAG_TX |
> + NETIF_F_HW_VLAN_STAG_TX;
> + dev->priv_flags |= IFF_UNICAST_FLT;

Too many changes in one patch. IFF_UNICAST_FLT and the handling of
promiscuous mode have nothing to do with switchdev. Neither VLAN
filtering via dev->features (should have been part of the previous
patch), or RXFCS. It seems that each one of these changes was previously
missed and is now snuck in without the explanation it deserves.

> dev->needed_headroom = IFH_LEN * sizeof(u32);
>
> eth_hw_addr_gen(dev, lan966x->base_mac, p + 1);
>
> - lan966x_mac_learn(lan966x, PGID_CPU, dev->dev_addr, port->pvid,
> - ENTRYTYPE_LOCKED);
> -

Why is this deleted? If the port uses IFF_UNICAST_FLT and isn't
promiscuous, how does it get the unicast traffic it needs?
I may be not realizing something because the changes aren't properly
split.

> port->phylink_config.dev = &port->dev->dev;
> port->phylink_config.type = PHYLINK_NETDEV;
> port->phylink_pcs.poll = true;
> @@ -949,6 +966,8 @@ static int lan966x_probe(struct platform_device *pdev)
> lan966x_port_init(lan966x->ports[p]);
> }
>
> + lan966x_register_notifier_blocks(lan966x);
> +
> return 0;
>
> cleanup_ports:
> @@ -967,6 +986,8 @@ static int lan966x_remove(struct platform_device *pdev)
> {
> struct lan966x *lan966x = platform_get_drvdata(pdev);
>
> + lan966x_unregister_notifier_blocks(lan966x);
> +
> lan966x_cleanup_ports(lan966x);
>
> cancel_delayed_work_sync(&lan966x->stats_work);
> diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_main.h b/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
> index ec3eccf634b3..4a0988087167 100644
> --- a/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
> +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
> @@ -80,6 +80,11 @@ struct lan966x {
> struct list_head mac_entries;
> spinlock_t mac_lock; /* lock for mac_entries list */
>
> + /* Notifiers */
> + struct notifier_block netdevice_nb;
> + struct notifier_block switchdev_nb;
> + struct notifier_block switchdev_blocking_nb;
> +
> u16 vlan_mask[VLAN_N_VID];
> DECLARE_BITMAP(cpu_vlan_mask, VLAN_N_VID);
>
> @@ -112,6 +117,10 @@ struct lan966x_port {
> struct net_device *dev;
> struct lan966x *lan966x;
>
> + struct net_device *bridge;
> + u8 stp_state;
> + u8 promisc;
> +
> u8 chip_port;
> u16 pvid;
> u16 vid;
> @@ -129,6 +138,14 @@ extern const struct phylink_mac_ops lan966x_phylink_mac_ops;
> extern const struct phylink_pcs_ops lan966x_phylink_pcs_ops;
> extern const struct ethtool_ops lan966x_ethtool_ops;
>
> +int lan966x_mc_unsync(struct net_device *dev, const unsigned char *addr);
> +int lan966x_mc_sync(struct net_device *dev, const unsigned char *addr);
> +
> +bool lan966x_netdevice_check(const struct net_device *dev);
> +
> +int lan966x_register_notifier_blocks(struct lan966x *lan966x);
> +void lan966x_unregister_notifier_blocks(struct lan966x *lan966x);
> +
> void lan966x_stats_get(struct net_device *dev,
> struct rtnl_link_stats64 *stats);
> int lan966x_stats_init(struct lan966x *lan966x);
> @@ -139,6 +156,7 @@ void lan966x_port_status_get(struct lan966x_port *port,
> struct phylink_link_state *state);
> int lan966x_port_pcs_set(struct lan966x_port *port,
> struct lan966x_port_config *config);
> +void lan966x_set_promisc(struct lan966x_port *port, bool enable);
> void lan966x_port_init(struct lan966x_port *port);
>
> int lan966x_mac_learn(struct lan966x *lan966x, int port,
> diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_switchdev.c b/drivers/net/ethernet/microchip/lan966x/lan966x_switchdev.c
> new file mode 100644
> index 000000000000..ed6ec78d2d9a
> --- /dev/null
> +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_switchdev.c
> @@ -0,0 +1,548 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +
> +#include <linux/if_bridge.h>
> +#include <net/switchdev.h>
> +
> +#include "lan966x_main.h"
> +
> +static struct workqueue_struct *lan966x_owq;
> +
> +struct lan966x_fdb_event_work {
> + struct work_struct work;
> + struct switchdev_notifier_fdb_info fdb_info;
> + struct net_device *dev;
> + struct lan966x *lan966x;
> + unsigned long event;
> +};
> +
> +static void lan966x_port_attr_bridge_flags(struct lan966x_port *port,
> + struct switchdev_brport_flags flags)
> +{
> + u32 val = lan_rd(port->lan966x, ANA_PGID(PGID_MC));
> +
> + val = ANA_PGID_PGID_GET(val);
> +
> + if (flags.mask & BR_MCAST_FLOOD) {
> + if (flags.val & BR_MCAST_FLOOD)
> + val |= BIT(port->chip_port);
> + else
> + val &= ~BIT(port->chip_port);
> + }
> +
> + lan_rmw(ANA_PGID_PGID_SET(val),
> + ANA_PGID_PGID,
> + port->lan966x, ANA_PGID(PGID_MC));
> +}
> +
> +static u32 lan966x_get_fwd_mask(struct lan966x_port *port)
> +{
> + struct net_device *bridge = port->bridge;
> + struct lan966x *lan966x = port->lan966x;
> + u8 ingress_src = port->chip_port;
> + u32 mask = 0;
> + int p;
> +
> + if (port->stp_state != BR_STATE_FORWARDING)
> + goto skip_forwarding;
> +
> + for (p = 0; p < lan966x->num_phys_ports; p++) {
> + port = lan966x->ports[p];
> +
> + if (!port)
> + continue;
> +
> + if (port->stp_state == BR_STATE_FORWARDING &&
> + port->bridge == bridge)
> + mask |= BIT(p);
> + }
> +
> +skip_forwarding:
> + mask &= ~BIT(ingress_src);
> +
> + return mask;
> +}
> +
> +static void lan966x_update_fwd_mask(struct lan966x *lan966x)
> +{
> + int p;
> +
> + for (p = 0; p < lan966x->num_phys_ports; p++) {
> + struct lan966x_port *port = lan966x->ports[p];
> + unsigned long mask = 0;
> +
> + if (port->bridge)
> + mask = lan966x_get_fwd_mask(port);
> +
> + mask |= BIT(CPU_PORT);
> +
> + lan_wr(ANA_PGID_PGID_SET(mask),
> + lan966x, ANA_PGID(PGID_SRC + p));
> + }
> +}
> +
> +static void lan966x_attr_stp_state_set(struct lan966x_port *port,
> + u8 state)
> +{
> + struct lan966x *lan966x = port->lan966x;
> + bool learn_ena = 0;
> +
> + port->stp_state = state;
> +
> + if (state == BR_STATE_FORWARDING || state == BR_STATE_LEARNING)
> + learn_ena = 1;

Please use true/false for bool types.

> +
> + lan_rmw(ANA_PORT_CFG_LEARN_ENA_SET(learn_ena),
> + ANA_PORT_CFG_LEARN_ENA,
> + lan966x, ANA_PORT_CFG(port->chip_port));
> +
> + lan966x_update_fwd_mask(lan966x);
> +}
> +
> +static void lan966x_port_attr_ageing_set(struct lan966x_port *port,
> + unsigned long ageing_clock_t)
> +{
> + unsigned long ageing_jiffies = clock_t_to_jiffies(ageing_clock_t);
> + u32 ageing_time = jiffies_to_msecs(ageing_jiffies) / 1000;
> +
> + lan966x_mac_set_ageing(port->lan966x, ageing_time);
> +}
> +
> +static int lan966x_port_attr_set(struct net_device *dev, const void *ctx,
> + const struct switchdev_attr *attr,
> + struct netlink_ext_ack *extack)
> +{
> + struct lan966x_port *port = netdev_priv(dev);
> +
> + switch (attr->id) {
> + case SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS:
> + lan966x_port_attr_bridge_flags(port, attr->u.brport_flags);
> + break;

no SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS? I think it doesn't even work
if you don't handle that?

br_switchdev_set_port_flag():

attr.id = SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS;
attr.u.brport_flags.val = flags;
attr.u.brport_flags.mask = mask;

/* We run from atomic context here */
err = call_switchdev_notifiers(SWITCHDEV_PORT_ATTR_SET, p->dev,
&info.info, extack);
err = notifier_to_errno(err);
if (err == -EOPNOTSUPP)
return 0;

Anyway, a big blob of "switchdev support" is hard to follow and review.
If you add bridge port flags you could as well add more comprehensive
support for them, but in a separate change please. Forwarding domain is
one thing, FDB/MDB is another, VLAN is another.

> + case SWITCHDEV_ATTR_ID_PORT_STP_STATE:
> + lan966x_attr_stp_state_set(port, attr->u.stp_state);
> + break;
> + case SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME:
> + lan966x_port_attr_ageing_set(port, attr->u.ageing_time);
> + break;
> + case SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING:
> + lan966x_vlan_port_set_vlan_aware(port, attr->u.vlan_filtering);
> + lan966x_vlan_port_apply(port);
> + lan966x_vlan_cpu_set_vlan_aware(port);
> + break;
> + default:
> + return -EOPNOTSUPP;
> + }
> +
> + return 0;
> +}
> +
> +static int lan966x_port_bridge_join(struct lan966x_port *port,
> + struct net_device *bridge,
> + struct netlink_ext_ack *extack)
> +{
> + struct net_device *dev = port->dev;
> + int err;
> +
> + err = switchdev_bridge_port_offload(dev, dev, NULL, NULL, NULL,
> + false, extack);
> + if (err)
> + return err;
> +
> + port->bridge = bridge;
> +
> + /* Port enters in bridge mode therefor don't need to copy to CPU
> + * frames for multicast in case the bridge is not requesting them
> + */
> + __dev_mc_unsync(dev, lan966x_mc_unsync);

Why is there a need to unsync the addresses, though? A driver that
supports proper MAC table isolation between standalone ports and
VLAN-unaware bridge ports should use separate pvids for the two modes of
operation (if it doesn't support other isolation mechanisms apart from VLAN,
which it looks like lan966x does not). I see that your driver even does
this in lan966x_vlan_port_get_pvid(). So the non-bridge multicast
addresses could sit there even when the port is in a bridge.

> +
> + /* make sure that the promisc is disabled when entering under the bridge
> + * because we don't want all the frames to come to CPU
> + */
> + lan966x_set_promisc(port, false);

What's the story here? Why don't other switchdev drivers handle promisc
in this way (copy all frames to the CPU)?

> +
> + return 0;
> +}
> +
> +static void lan966x_port_bridge_leave(struct lan966x_port *port,
> + struct net_device *bridge)
> +{
> + struct lan966x *lan966x = port->lan966x;
> +
> + switchdev_bridge_port_unoffload(port->dev, NULL, NULL, NULL);

The bridge offers a facility to sync and unsync the host addresses on
joins and leaves. For this to work, the switchdev_bridge_port_unoffload
call should be during the NETDEV_PRECHANGEUPPER notifier event.

> + port->bridge = NULL;
> +
> + /* Set the port back to host mode */
> + lan966x_vlan_port_set_vlan_aware(port, 0);

s/0/false/

> + lan966x_vlan_port_set_vid(port, HOST_PVID, false, false);
> + lan966x_vlan_port_apply(port);
> +
> + lan966x_mac_cpu_learn(lan966x, port->dev->dev_addr, HOST_PVID);
> +
> + /* Port enters in host more therefore restore mc list */
> + __dev_mc_sync(port->dev, lan966x_mc_sync, lan966x_mc_unsync);
> +
> + /* Restore back the promisc as it was before the interfaces was added to
> + * the bridge
> + */
> + lan966x_set_promisc(port, port->promisc);
> +}
> +
> +static int lan966x_port_changeupper(struct net_device *dev,
> + struct netdev_notifier_changeupper_info *info)
> +{
> + struct lan966x_port *port = netdev_priv(dev);
> + struct netlink_ext_ack *extack;
> + int err = 0;
> +
> + extack = netdev_notifier_info_to_extack(&info->info);
> +
> + if (netif_is_bridge_master(info->upper_dev)) {
> + if (info->linking)
> + err = lan966x_port_bridge_join(port, info->upper_dev,
> + extack);
> + else
> + lan966x_port_bridge_leave(port, info->upper_dev);
> + }
> +
> + return err;
> +}
> +
> +static int lan966x_port_add_addr(struct net_device *dev, bool up)
> +{
> + struct lan966x_port *port = netdev_priv(dev);
> + struct lan966x *lan966x = port->lan966x;
> + u16 vid;
> +
> + vid = lan966x_vlan_port_get_pvid(port);
> +
> + if (up)
> + lan966x_mac_cpu_learn(lan966x, dev->dev_addr, vid);
> + else
> + lan966x_mac_cpu_forget(lan966x, dev->dev_addr, vid);

For which uppers is this intended? The bridge? Because the bridge
notifies you of all the entries it needs, if you also consider the
replayed events and provide non-NULL pointers to
switchdev_bridge_port_offload.

> +
> + return 0;
> +}
> +
> +static int lan966x_netdevice_port_event(struct net_device *dev,
> + struct notifier_block *nb,
> + unsigned long event, void *ptr)
> +{
> + int err = 0;
> +
> + if (!lan966x_netdevice_check(dev))
> + return 0;
> +
> + switch (event) {
> + case NETDEV_CHANGEUPPER:
> + err = lan966x_port_changeupper(dev, ptr);
> + break;
> + case NETDEV_PRE_UP:
> + err = lan966x_port_add_addr(dev, true);
> + break;
> + case NETDEV_DOWN:
> + err = lan966x_port_add_addr(dev, false);
> + break;
> + }
> +
> + return err;
> +}
> +
> +static int lan966x_netdevice_event(struct notifier_block *nb,
> + unsigned long event, void *ptr)
> +{
> + struct net_device *dev = netdev_notifier_info_to_dev(ptr);
> + int ret;
> +
> + ret = lan966x_netdevice_port_event(dev, nb, event, ptr);
> +
> + return notifier_from_errno(ret);
> +}
> +
> +static void lan966x_fdb_event_work(struct work_struct *work)
> +{
> + struct lan966x_fdb_event_work *fdb_work =
> + container_of(work, struct lan966x_fdb_event_work, work);
> + struct switchdev_notifier_fdb_info *fdb_info;
> + struct net_device *dev = fdb_work->dev;
> + struct lan966x_port *port;
> + struct lan966x *lan966x;
> +
> + rtnl_lock();

rtnl_lock() shouldn't be needed.

> +
> + fdb_info = &fdb_work->fdb_info;
> + lan966x = fdb_work->lan966x;
> +
> + if (lan966x_netdevice_check(dev)) {
> + port = netdev_priv(dev);
> +
> + switch (fdb_work->event) {
> + case SWITCHDEV_FDB_ADD_TO_DEVICE:
> + if (!fdb_info->added_by_user)
> + break;

If you get notified of a MAC address dynamically learned by the software
bridge on a lan966x port, you will have allocated memory for the work
item, and scheduled it, for nothing. Please try to exit unnecessary work
early.

> + lan966x_mac_add_entry(lan966x, port, fdb_info->addr,
> + fdb_info->vid);
> + break;
> + case SWITCHDEV_FDB_DEL_TO_DEVICE:
> + if (!fdb_info->added_by_user)
> + break;
> + lan966x_mac_del_entry(lan966x, fdb_info->addr, fdb_info->vid);
> + break;
> + }
> + } else {
> + if (!netif_is_bridge_master(dev))
> + goto out;
> +
> + /* If the CPU is not part of the vlan then there is no point
> + * to copy the frames to the CPU because they will be dropped
> + */
> + if (!lan966x_vlan_cpu_member_vlan_mask(lan966x, fdb_info->vid))
> + goto out;

It isn't part of the VLAN now, but what about later? I don't see that
you keep these FDB entries anywhere and restore them when a port joins
that VLAN.

> +
> + /* In case the bridge is called */
> + switch (fdb_work->event) {
> + case SWITCHDEV_FDB_ADD_TO_DEVICE:
> + /* If there is no front port in this vlan, there is no
> + * point to copy the frame to CPU because it would be
> + * just dropped at later point. So add it only if
> + * there is a port
> + */
> + if (!lan966x_vlan_port_any_vlan_mask(lan966x, fdb_info->vid))
> + break;
> +
> + lan966x_mac_cpu_learn(lan966x, fdb_info->addr, fdb_info->vid);

Does the lan966x_mac_cpu_learn() operation trigger interrupts, or only
the dynamic learning process? How do you handle migration of an FDB
entry pointing towards the CPU, towards one pointing towards a port?

> + break;
> + case SWITCHDEV_FDB_DEL_TO_DEVICE:
> + /* It is OK to always forget the entry it */

Forget the entry it?

> + lan966x_mac_cpu_forget(lan966x, fdb_info->addr, fdb_info->vid);
> + break;
> + }
> + }
> +
> +out:
> + rtnl_unlock();
> + kfree(fdb_work->fdb_info.addr);
> + kfree(fdb_work);
> + dev_put(dev);
> +}
> +
> +static int lan966x_switchdev_event(struct notifier_block *nb,
> + unsigned long event, void *ptr)
> +{
> + struct lan966x *lan966x = container_of(nb, struct lan966x, switchdev_nb);
> + struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
> + struct switchdev_notifier_fdb_info *fdb_info;
> + struct switchdev_notifier_info *info = ptr;
> + struct lan966x_fdb_event_work *fdb_work;
> + int err;
> +
> + switch (event) {
> + case SWITCHDEV_PORT_ATTR_SET:
> + err = switchdev_handle_port_attr_set(dev, ptr,
> + lan966x_netdevice_check,
> + lan966x_port_attr_set);
> + return notifier_from_errno(err);
> + case SWITCHDEV_FDB_ADD_TO_DEVICE:
> + fallthrough;

"fallthrough;" not needed here.

> + case SWITCHDEV_FDB_DEL_TO_DEVICE:
> + fdb_work = kzalloc(sizeof(*fdb_work), GFP_ATOMIC);
> + if (!fdb_work)
> + return NOTIFY_BAD;
> +
> + fdb_info = container_of(info,
> + struct switchdev_notifier_fdb_info,
> + info);
> +
> + fdb_work->dev = dev;
> + fdb_work->lan966x = lan966x;
> + fdb_work->event = event;
> + INIT_WORK(&fdb_work->work, lan966x_fdb_event_work);
> + memcpy(&fdb_work->fdb_info, ptr, sizeof(fdb_work->fdb_info));
> + fdb_work->fdb_info.addr = kzalloc(ETH_ALEN, GFP_ATOMIC);
> + if (!fdb_work->fdb_info.addr)
> + goto err_addr_alloc;
> +
> + ether_addr_copy((u8 *)fdb_work->fdb_info.addr, fdb_info->addr);
> + dev_hold(dev);
> +
> + queue_work(lan966x_owq, &fdb_work->work);
> + break;
> + }
> +
> + return NOTIFY_DONE;
> +err_addr_alloc:
> + kfree(fdb_work);
> + return NOTIFY_BAD;
> +}
> +
> +static int lan966x_handle_port_vlan_add(struct net_device *dev,
> + struct notifier_block *nb,
> + const struct switchdev_obj_port_vlan *v)
> +{
> + struct lan966x_port *port;
> + struct lan966x *lan966x;
> +
> + /* When adding a port to a vlan, we get a callback for the port but
> + * also for the bridge. When get the callback for the bridge just bail
> + * out. Then when the bridge is added to the vlan, then we get a
> + * callback here but in this case the flags has set:
> + * BRIDGE_VLAN_INFO_BRENTRY. In this case it means that the CPU
> + * port is added to the vlan, so the broadcast frames and unicast frames
> + * with dmac of the bridge should be foward to CPU.
> + */
> + if (netif_is_bridge_master(dev) &&
> + !(v->flags & BRIDGE_VLAN_INFO_BRENTRY))
> + return 0;
> +
> + lan966x = container_of(nb, struct lan966x, switchdev_blocking_nb);
> +
> + /* In case the port gets called */
> + if (!(netif_is_bridge_master(dev))) {
> + if (!lan966x_netdevice_check(dev))
> + return -EOPNOTSUPP;
> +
> + port = netdev_priv(dev);
> + return lan966x_vlan_port_add_vlan(port, v->vid,
> + v->flags & BRIDGE_VLAN_INFO_PVID,
> + v->flags & BRIDGE_VLAN_INFO_UNTAGGED);
> + }
> +
> + /* In case the bridge gets called */
> + if (netif_is_bridge_master(dev))
> + return lan966x_vlan_cpu_add_vlan(lan966x, dev, v->vid);
> +
> + return 0;
> +}
> +
> +static int lan966x_handle_port_obj_add(struct net_device *dev,
> + struct notifier_block *nb,
> + struct switchdev_notifier_port_obj_info *info)
> +{
> + const struct switchdev_obj *obj = info->obj;
> + int err;
> +
> + switch (obj->id) {
> + case SWITCHDEV_OBJ_ID_PORT_VLAN:
> + err = lan966x_handle_port_vlan_add(dev, nb,
> + SWITCHDEV_OBJ_PORT_VLAN(obj));
> + break;
> + default:
> + err = -EOPNOTSUPP;
> + break;
> + }
> +
> + info->handled = true;
> + return err;
> +}
> +
> +static int lan966x_handle_port_vlan_del(struct net_device *dev,
> + struct notifier_block *nb,
> + const struct switchdev_obj_port_vlan *v)
> +{
> + struct lan966x_port *port;
> + struct lan966x *lan966x;
> +
> + lan966x = container_of(nb, struct lan966x, switchdev_blocking_nb);
> +
> + /* In case the physical port gets called */
> + if (!netif_is_bridge_master(dev)) {
> + if (!lan966x_netdevice_check(dev))
> + return -EOPNOTSUPP;
> +
> + port = netdev_priv(dev);
> + return lan966x_vlan_port_del_vlan(port, v->vid);
> + }
> +
> + /* In case the bridge gets called */
> + if (netif_is_bridge_master(dev))
> + return lan966x_vlan_cpu_del_vlan(lan966x, dev, v->vid);
> +
> + return 0;
> +}
> +
> +static int lan966x_handle_port_obj_del(struct net_device *dev,
> + struct notifier_block *nb,
> + struct switchdev_notifier_port_obj_info *info)
> +{
> + const struct switchdev_obj *obj = info->obj;
> + int err;
> +
> + switch (obj->id) {
> + case SWITCHDEV_OBJ_ID_PORT_VLAN:
> + err = lan966x_handle_port_vlan_del(dev, nb,
> + SWITCHDEV_OBJ_PORT_VLAN(obj));
> + break;
> + default:
> + err = -EOPNOTSUPP;
> + break;
> + }
> +
> + info->handled = true;
> + return err;
> +}
> +
> +static int lan966x_switchdev_blocking_event(struct notifier_block *nb,
> + unsigned long event,
> + void *ptr)
> +{
> + struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
> + int err;
> +
> + switch (event) {
> + case SWITCHDEV_PORT_OBJ_ADD:
> + err = lan966x_handle_port_obj_add(dev, nb, ptr);
> + return notifier_from_errno(err);
> + case SWITCHDEV_PORT_OBJ_DEL:
> + err = lan966x_handle_port_obj_del(dev, nb, ptr);
> + return notifier_from_errno(err);
> + case SWITCHDEV_PORT_ATTR_SET:
> + err = switchdev_handle_port_attr_set(dev, ptr,
> + lan966x_netdevice_check,
> + lan966x_port_attr_set);
> + return notifier_from_errno(err);
> + }
> +
> + return NOTIFY_DONE;
> +}
> +
> +int lan966x_register_notifier_blocks(struct lan966x *lan966x)
> +{
> + int err;
> +
> + lan966x->netdevice_nb.notifier_call = lan966x_netdevice_event;
> + err = register_netdevice_notifier(&lan966x->netdevice_nb);
> + if (err)
> + return err;
> +
> + lan966x->switchdev_nb.notifier_call = lan966x_switchdev_event;
> + err = register_switchdev_notifier(&lan966x->switchdev_nb);
> + if (err)
> + goto err_switchdev_nb;
> +
> + lan966x->switchdev_blocking_nb.notifier_call = lan966x_switchdev_blocking_event;
> + err = register_switchdev_blocking_notifier(&lan966x->switchdev_blocking_nb);
> + if (err)
> + goto err_switchdev_blocking_nb;
> +
> + lan966x_owq = alloc_ordered_workqueue("lan966x_order", 0);
> + if (!lan966x_owq) {
> + err = -ENOMEM;
> + goto err_switchdev_blocking_nb;
> + }

These should be singleton objects, otherwise things get problematic if
you have more than one switch device instantiated in the system.

> +
> + return 0;
> +
> +err_switchdev_blocking_nb:
> + unregister_switchdev_notifier(&lan966x->switchdev_nb);
> +err_switchdev_nb:
> + unregister_netdevice_notifier(&lan966x->netdevice_nb);
> +
> + return err;
> +}
> +
> +void lan966x_unregister_notifier_blocks(struct lan966x *lan966x)
> +{
> + destroy_workqueue(lan966x_owq);
> +
> + unregister_switchdev_blocking_notifier(&lan966x->switchdev_blocking_nb);
> + unregister_switchdev_notifier(&lan966x->switchdev_nb);
> + unregister_netdevice_notifier(&lan966x->netdevice_nb);
> +}
> diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_vlan.c b/drivers/net/ethernet/microchip/lan966x/lan966x_vlan.c
> index e47552775d06..26644503b4e6 100644
> --- a/drivers/net/ethernet/microchip/lan966x/lan966x_vlan.c
> +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_vlan.c
> @@ -155,6 +155,9 @@ static bool lan966x_vlan_cpu_member_cpu_vlan_mask(struct lan966x *lan966x, u16 v
>
> u16 lan966x_vlan_port_get_pvid(struct lan966x_port *port)
> {
> + if (!port->bridge)
> + return HOST_PVID;
> +
> return port->vlan_aware ? port->pvid : UNAWARE_PVID;
> }
>
> @@ -210,6 +213,8 @@ void lan966x_vlan_cpu_set_vlan_aware(struct lan966x_port *port)
> * table for the front port and the CPU
> */
> lan966x_mac_cpu_learn(lan966x, port->dev->dev_addr, UNAWARE_PVID);
> + lan966x_mac_cpu_learn(lan966x, port->bridge->dev_addr,
> + UNAWARE_PVID);
>
> lan966x_vlan_port_add_vlan_mask(port, UNAWARE_PVID);
> lan966x_vlan_port_apply(port);
> @@ -218,6 +223,8 @@ void lan966x_vlan_cpu_set_vlan_aware(struct lan966x_port *port)
> * to vlan unaware
> */
> lan966x_mac_cpu_forget(lan966x, port->dev->dev_addr, UNAWARE_PVID);
> + lan966x_mac_cpu_forget(lan966x, port->bridge->dev_addr,
> + UNAWARE_PVID);
>
> lan966x_vlan_port_del_vlan_mask(port, UNAWARE_PVID);
> lan966x_vlan_port_apply(port);
> @@ -293,6 +300,7 @@ int lan966x_vlan_port_add_vlan(struct lan966x_port *port,
> */
> if (lan966x_vlan_cpu_member_cpu_vlan_mask(lan966x, vid)) {
> lan966x_mac_cpu_learn(lan966x, port->dev->dev_addr, vid);
> + lan966x_mac_cpu_learn(lan966x, port->bridge->dev_addr, vid);
> lan966x_vlan_cpu_add_vlan_mask(lan966x, vid);
> }
>
> @@ -322,8 +330,10 @@ int lan966x_vlan_port_del_vlan(struct lan966x_port *port,
> * that vlan but still keep it in the mask because it may be needed
> * again then another port gets added in tha vlan
> */
> - if (!lan966x_vlan_port_any_vlan_mask(lan966x, vid))
> + if (!lan966x_vlan_port_any_vlan_mask(lan966x, vid)) {
> + lan966x_mac_cpu_forget(lan966x, port->bridge->dev_addr, vid);
> lan966x_vlan_cpu_del_vlan_mask(lan966x, vid);
> + }
>
> return 0;
> }
> --
> 2.33.0
>

2021-12-09 13:59:34

by Vladimir Oltean

[permalink] [raw]
Subject: Re: [PATCH net-next v3 5/6] net: lan966x: Add vlan support

On Thu, Dec 09, 2021 at 10:46:14AM +0100, Horatiu Vultur wrote:
> +int lan966x_vlan_port_set_vid(struct lan966x_port *port, u16 vid,
> + bool pvid, bool untagged)
> +{
> + struct lan966x *lan966x = port->lan966x;
> +
> + /* Egress vlan classification */
> + if (untagged && port->vid != vid) {
> + if (port->vid) {
> + dev_err(lan966x->dev,
> + "Port already has a native VLAN: %d\n",
> + port->vid);
> + return -EBUSY;

Are you interested in supporting the use case from 0da1a1c48911 ("net:
mscc: ocelot: allow a config where all bridge VLANs are egress-untagged")?
Because it would be good if the driver was structured that way from the
get-go instead of patching it later.

> + }
> + port->vid = vid;
> + }
> +
> + /* Default ingress vlan classification */
> + if (pvid)
> + port->pvid = vid;
> +
> + return 0;
> +}

2021-12-09 15:36:43

by Horatiu Vultur

[permalink] [raw]
Subject: Re: [PATCH net-next v3 3/6] net: lan966x: add support for interrupts from analyzer

The 12/09/2021 11:47, Vladimir Oltean wrote:
>
> On Thu, Dec 09, 2021 at 10:46:12AM +0100, Horatiu Vultur wrote:
> > This patch adds support for handling the interrupts generated by the
> > analyzer. Currently, only the MAC table generates these interrupts.
> > The MAC table will generate an interrupt whenever it learns or forgets
> > an entry in the table. It is the SW responsibility figure out which
> > entries were added/removed.
> >
> > Signed-off-by: Horatiu Vultur <[email protected]>
> > ---
> > .../ethernet/microchip/lan966x/lan966x_mac.c | 244 ++++++++++++++++++
> > .../ethernet/microchip/lan966x/lan966x_main.c | 23 ++
> > .../ethernet/microchip/lan966x/lan966x_main.h | 6 +
> > 3 files changed, 273 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_mac.c b/drivers/net/ethernet/microchip/lan966x/lan966x_mac.c
> > index f6878b9f57ef..c01ab01bffbf 100644
> > --- a/drivers/net/ethernet/microchip/lan966x/lan966x_mac.c
> > +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_mac.c
> > @@ -1,5 +1,6 @@
> > // SPDX-License-Identifier: GPL-2.0+
> >
> > +#include <net/switchdev.h>
> > #include "lan966x_main.h"
> >
> > #define LAN966X_MAC_COLUMNS 4
> > @@ -13,6 +14,23 @@
> > #define MACACCESS_CMD_WRITE 7
> > #define MACACCESS_CMD_SYNC_GET_NEXT 8
> >
> > +#define LAN966X_MAC_INVALID_ROW -1
> > +
> > +struct lan966x_mac_entry {
> > + struct list_head list;
> > + unsigned char mac[ETH_ALEN] __aligned(2);
> > + u16 vid;
> > + u16 port_index;
> > + int row;
> > +};
> > +
> > +struct lan966x_mac_raw_entry {
> > + u32 mach;
> > + u32 macl;
> > + u32 maca;
> > + bool process;
> > +};
> > +
> > static int lan966x_mac_get_status(struct lan966x *lan966x)
> > {
> > return lan_rd(lan966x, ANA_MACACCESS);
> > @@ -98,4 +116,230 @@ void lan966x_mac_init(struct lan966x *lan966x)
> > /* Clear the MAC table */
> > lan_wr(MACACCESS_CMD_INIT, lan966x, ANA_MACACCESS);
> > lan966x_mac_wait_for_completion(lan966x);
> > +
> > + spin_lock_init(&lan966x->mac_lock);
> > + INIT_LIST_HEAD(&lan966x->mac_entries);
> > +}
> > +
> > +static struct lan966x_mac_entry *lan966x_mac_alloc_entry(const unsigned char *mac,
> > + u16 vid, u16 port_index)
> > +{
> > + struct lan966x_mac_entry *mac_entry;
> > +
> > + mac_entry = kzalloc(sizeof(*mac_entry), GFP_KERNEL);
> > + if (!mac_entry)
> > + return NULL;
> > +
> > + memcpy(mac_entry->mac, mac, ETH_ALEN);
> > + mac_entry->vid = vid;
> > + mac_entry->port_index = port_index;
> > + mac_entry->row = LAN966X_MAC_INVALID_ROW;
> > + return mac_entry;
> > +}
> > +
> > +static void lan966x_fdb_call_notifiers(enum switchdev_notifier_type type,
> > + const char *mac, u16 vid,
> > + struct net_device *dev)
> > +{
> > + struct switchdev_notifier_fdb_info info = { 0 };
> > +
> > + info.addr = mac;
> > + info.vid = vid;
> > + info.offloaded = true;
> > + call_switchdev_notifiers(type, dev, &info.info, NULL);
> > +}
> > +
> > +void lan966x_mac_purge_entries(struct lan966x *lan966x)
> > +{
> > + struct lan966x_mac_entry *mac_entry, *tmp;
> > + unsigned long flags;
> > +
> > + spin_lock_irqsave(&lan966x->mac_lock, flags);
>
> I hope I'm not wrong, but you are only using this spinlock to serialize
> access to the list, which isn't accessed from hardirq context anywhere
> (the irq is threaded). So spin_lock_irqsave could simply be spin_lock.
> Unless...

It should be OK to use spin_lock. I have chosen spin_lock_irq because
that is the safest way in case it would be extended.

>
> > + list_for_each_entry_safe(mac_entry, tmp, &lan966x->mac_entries,
> > + list) {
> > + lan966x_mac_forget(lan966x, mac_entry->mac, mac_entry->vid,
> > + ENTRYTYPE_LOCKED);
>
> Does this generate a MAC table interrupt?

It doesn't.

>
> > +
> > + list_del(&mac_entry->list);
> > + kfree(mac_entry);
> > + }
> > + spin_unlock_irqrestore(&lan966x->mac_lock, flags);
> > +}
> > +
> > +static void lan966x_mac_notifiers(struct lan966x *lan966x,
> > + enum switchdev_notifier_type type,
> > + unsigned char *mac, u32 vid,
> > + struct net_device *dev)
> > +{
> > + rtnl_lock();
> > + lan966x_fdb_call_notifiers(type, mac, vid, dev);
> > + rtnl_unlock();
> > +}
> > +
> > +static void lan966x_mac_process_raw_entry(struct lan966x_mac_raw_entry *raw_entry,
> > + u8 *mac, u16 *vid, u32 *dest_idx)
> > +{
> > + mac[0] = (raw_entry->mach >> 8) & 0xff;
> > + mac[1] = (raw_entry->mach >> 0) & 0xff;
> > + mac[2] = (raw_entry->macl >> 24) & 0xff;
> > + mac[3] = (raw_entry->macl >> 16) & 0xff;
> > + mac[4] = (raw_entry->macl >> 8) & 0xff;
> > + mac[5] = (raw_entry->macl >> 0) & 0xff;
> > +
> > + *vid = (raw_entry->mach >> 16) & 0xfff;
> > + *dest_idx = ANA_MACACCESS_DEST_IDX_GET(raw_entry->maca);
> > +}
> > +
> > +static void lan966x_mac_irq_process(struct lan966x *lan966x, u32 row,
> > + struct lan966x_mac_raw_entry *raw_entries)
> > +{
> > + struct lan966x_mac_entry *mac_entry, *tmp;
> > + char mac[ETH_ALEN] __aligned(2);
>
> unsigned char
>
> > + unsigned long flags;
> > + u32 dest_idx;
> > + u32 column;
> > + u16 vid;
> > +
> > + spin_lock_irqsave(&lan966x->mac_lock, flags);
> > + list_for_each_entry_safe(mac_entry, tmp, &lan966x->mac_entries, list) {
> > + bool found = false;
> > +
> > + if (mac_entry->row != row)
> > + continue;
>
> When the MAC table gets large, you could consider keeping separate lists
> per row. This way you can avoid traversing a list of elements you're
> sure you don't care about.

Before I will change anything, I should run some measurements and see
some results.

>
> > +
> > + for (column = 0; column < LAN966X_MAC_COLUMNS; ++column) {
> > + /* All the valid entries are at the start of the row,
> > + * so when get one invalid entry it can just skip the
> > + * rest of the columns
> > + */
> > + if (!ANA_MACACCESS_VALID_GET(raw_entries[column].maca))
> > + break;
> > +
> > + lan966x_mac_process_raw_entry(&raw_entries[column],
> > + mac, &vid, &dest_idx);
> > + WARN_ON(dest_idx > lan966x->num_phys_ports);
> > +
> > + /* If the entry in SW is found, then there is nothing
> > + * to do
> > + */
> > + if (mac_entry->vid == vid &&
> > + ether_addr_equal(mac_entry->mac, mac) &&
> > + mac_entry->port_index == dest_idx) {
> > + raw_entries[column].process = true;
> > + found = true;
> > + break;
> > + }
> > + }
> > +
> > + if (!found) {
> > + /* Notify the bridge that the entry doesn't exist
> > + * anymore in the HW and remmove the entry from the SW
>
> s/remmove/remove/
>
> > + * list
> > + */
> > + lan966x_mac_notifiers(lan966x, SWITCHDEV_FDB_DEL_TO_BRIDGE,
> > + mac_entry->mac, mac_entry->vid,
> > + lan966x->ports[mac_entry->port_index]->dev);
> > +
> > + list_del(&mac_entry->list);
> > + kfree(mac_entry);
> > + }
> > + }
> > + spin_unlock_irqrestore(&lan966x->mac_lock, flags);
> > +
> > + /* Now go to the list of columns and see if any entry was not in the SW
> > + * list, then that means that the entry is new so it needs to notify the
> > + * bridge.
> > + */
> > + for (column = 0; column < LAN966X_MAC_COLUMNS; ++column) {
> > + /* All the valid entries are at the start of the row, so when
> > + * get one invalid entry it can just skip the rest of the columns
> > + */
> > + if (!ANA_MACACCESS_VALID_GET(raw_entries[column].maca))
> > + break;
> > +
> > + /* If the entry already exists then don't do anything */
> > + if (raw_entries[column].process)
>
> s/process/processed/
>
> > + continue;
> > +
> > + lan966x_mac_process_raw_entry(&raw_entries[column],
> > + mac, &vid, &dest_idx);
> > + WARN_ON(dest_idx > lan966x->num_phys_ports);
> > +
> > + mac_entry = lan966x_mac_alloc_entry(mac, vid, dest_idx);
> > + if (!mac_entry)
> > + return;
> > +
> > + mac_entry->row = row;
> > +
> > + spin_lock_irqsave(&lan966x->mac_lock, flags);
> > + list_add_tail(&mac_entry->list, &lan966x->mac_entries);
> > + spin_unlock_irqrestore(&lan966x->mac_lock, flags);
>
> spin_lock_irqsave shouldn't be necessary from an irq handler.
>
> > +
> > + lan966x_mac_notifiers(lan966x, SWITCHDEV_FDB_ADD_TO_BRIDGE,
> > + mac, vid, lan966x->ports[dest_idx]->dev);
> > + }
> > +}
> > +
> > +irqreturn_t lan966x_mac_irq_handler(struct lan966x *lan966x)
> > +{
> > + struct lan966x_mac_raw_entry entry[LAN966X_MAC_COLUMNS] = { 0 };
> > + u32 index, column;
> > + bool stop = true;
> > + u32 val;
> > +
> > + /* Check if the mac table triggered this, if not just bail out */
> > + if (!(ANA_ANAINTR_INTR_GET(lan_rd(lan966x, ANA_ANAINTR))))
> > + return IRQ_NONE;
>
> The interrupt isn't shared, so if we enter this condition, it means the
> analyzer block generated it, just not the MAC table portion of it.
> If we return IRQ_NONE there will be an IRQ storm because that condition
> will never go away. Could we ack the interrupt and return IRQ_HANDLED?

Actually, there is a wrong comment, it doesn't check if the MAC table
triggered it, but it checks if the analyzer generated it. So it can be
removed and then always return IRQ_HANDLER.

>
> > +
> > + /* Start the scan from 0, 0 */
> > + lan_wr(ANA_MACTINDX_M_INDEX_SET(0) |
> > + ANA_MACTINDX_BUCKET_SET(0),
> > + lan966x, ANA_MACTINDX);
> > +
> > + while (1) {
> > + lan_rmw(ANA_MACACCESS_MAC_TABLE_CMD_SET(MACACCESS_CMD_SYNC_GET_NEXT),
> > + ANA_MACACCESS_MAC_TABLE_CMD,
> > + lan966x, ANA_MACACCESS);
> > + lan966x_mac_wait_for_completion(lan966x);
> > +
> > + val = lan_rd(lan966x, ANA_MACTINDX);
> > + index = ANA_MACTINDX_M_INDEX_GET(val);
> > + column = ANA_MACTINDX_BUCKET_GET(val);
> > +
> > + /* The SYNC-GET-NEXT returns all the entries(4) in a row in
> > + * which is suffered a change. By change it means that new entry
> > + * was added or an entry was removed because of ageing.
> > + * It would return all the columns for that row. And after that
> > + * it would return the next row The stop conditions of the
> > + * SYNC-GET-NEXT is when it reaches 'directly' to row 0
> > + * column 3. So if SYNC-GET-NEXT returns row 0 and column 0
> > + * then it is required to continue to read more even if it
> > + * reaches row 0 and column 3.
> > + */
> > + if (index == 0 && column == 0)
> > + stop = false;
> > +
> > + if (column == LAN966X_MAC_COLUMNS - 1 &&
> > + index == 0 && stop)
> > + break;
> > +
> > + entry[column].mach = lan_rd(lan966x, ANA_MACHDATA);
> > + entry[column].macl = lan_rd(lan966x, ANA_MACLDATA);
> > + entry[column].maca = lan_rd(lan966x, ANA_MACACCESS);
> > +
> > + /* Once all the columns are read process them */
> > + if (column == LAN966X_MAC_COLUMNS - 1) {
> > + lan966x_mac_irq_process(lan966x, index, entry);
> > + /* A row was processed so it is safe to assume that the
> > + * next row/column can be the stop condition
> > + */
> > + stop = true;
> > + }
> > + }
> > +
> > + lan_rmw(ANA_ANAINTR_INTR_SET(0),
> > + ANA_ANAINTR_INTR,
> > + lan966x, ANA_ANAINTR);
> > +
> > + return IRQ_HANDLED;
> > }
> > diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
> > index 101c1f005baf..7c6d6293611a 100644
> > --- a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
> > +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
> > @@ -527,6 +527,13 @@ static irqreturn_t lan966x_xtr_irq_handler(int irq, void *args)
> > return IRQ_HANDLED;
> > }
> >
> > +static irqreturn_t lan966x_ana_irq_handler(int irq, void *args)
> > +{
> > + struct lan966x *lan966x = args;
> > +
> > + return lan966x_mac_irq_handler(lan966x);
> > +}
> > +
> > static void lan966x_cleanup_ports(struct lan966x *lan966x)
> > {
> > struct lan966x_port *port;
> > @@ -554,6 +561,11 @@ static void lan966x_cleanup_ports(struct lan966x *lan966x)
> >
> > disable_irq(lan966x->xtr_irq);
> > lan966x->xtr_irq = -ENXIO;
> > +
> > + if (lan966x->ana_irq) {
> > + disable_irq(lan966x->ana_irq);
> > + lan966x->ana_irq = -ENXIO;
> > + }
> > }
> >
> > static int lan966x_probe_port(struct lan966x *lan966x, u32 p,
> > @@ -870,6 +882,15 @@ static int lan966x_probe(struct platform_device *pdev)
> > return -ENODEV;
> > }
> >
> > + lan966x->ana_irq = platform_get_irq_byname(pdev, "ana");
> > + if (lan966x->ana_irq) {
> > + err = devm_request_threaded_irq(&pdev->dev, lan966x->ana_irq, NULL,
> > + lan966x_ana_irq_handler, IRQF_ONESHOT,
> > + "ana irq", lan966x);
> > + if (err)
> > + return dev_err_probe(&pdev->dev, err, "Unable to use ana irq");
> > + }
> > +
> > /* init switch */
> > lan966x_init(lan966x);
> > lan966x_stats_init(lan966x);
> > @@ -923,6 +944,8 @@ static int lan966x_remove(struct platform_device *pdev)
> > destroy_workqueue(lan966x->stats_queue);
> > mutex_destroy(&lan966x->stats_lock);
> >
> > + lan966x_mac_purge_entries(lan966x);
> > +
> > return 0;
> > }
> >
> > diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_main.h b/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
> > index 7e5a3b6f168d..ba548d65b58a 100644
> > --- a/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
> > +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
> > @@ -75,6 +75,9 @@ struct lan966x {
> >
> > u8 base_mac[ETH_ALEN];
> >
> > + struct list_head mac_entries;
> > + spinlock_t mac_lock; /* lock for mac_entries list */
> > +
> > /* stats */
> > const struct lan966x_stat_layout *stats_layout;
> > u32 num_stats;
> > @@ -87,6 +90,7 @@ struct lan966x {
> >
> > /* interrupts */
> > int xtr_irq;
> > + int ana_irq;
> > };
> >
> > struct lan966x_port_config {
> > @@ -141,6 +145,8 @@ int lan966x_mac_forget(struct lan966x *lan966x,
> > int lan966x_mac_cpu_learn(struct lan966x *lan966x, const char *addr, u16 vid);
> > int lan966x_mac_cpu_forget(struct lan966x *lan966x, const char *addr, u16 vid);
> > void lan966x_mac_init(struct lan966x *lan966x);
> > +void lan966x_mac_purge_entries(struct lan966x *lan966x);
> > +irqreturn_t lan966x_mac_irq_handler(struct lan966x *lan966x);
> >
> > static inline void __iomem *lan_addr(void __iomem *base[],
> > int id, int tinst, int tcnt,
> > --
> > 2.33.0
> >

--
/Horatiu

2021-12-09 15:40:52

by Horatiu Vultur

[permalink] [raw]
Subject: Re: [PATCH net-next v3 2/6] dt-bindings: net: lan966x: Extend with the analyzer interrupt

The 12/09/2021 10:58, Vladimir Oltean wrote:
>
> On Thu, Dec 09, 2021 at 10:46:11AM +0100, Horatiu Vultur wrote:
> > Extend dt-bindings for lan966x with analyzer interrupt.
> > This interrupt can be generated for example when the HW learn/forgets
> > an entry in the MAC table.
> >
> > Signed-off-by: Horatiu Vultur <[email protected]>
> > ---
>
> Reviewed-by: Vladimir Oltean <[email protected]>
>
> Why don't you describe your hardware in the device tree all at once?
> Doing it piece by piece means that every time when you add a new
> functionality you need to be compatible with the absence of a certain
> reg, interrupt etc.

I though it is more clear what is added in the patch series.
But then, if for example add more interrupts in DT than what the
driver support, that would not be an issue?

>
> > .../devicetree/bindings/net/microchip,lan966x-switch.yaml | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/net/microchip,lan966x-switch.yaml b/Documentation/devicetree/bindings/net/microchip,lan966x-switch.yaml
> > index 5bee665d5fcf..e79e4e166ad8 100644
> > --- a/Documentation/devicetree/bindings/net/microchip,lan966x-switch.yaml
> > +++ b/Documentation/devicetree/bindings/net/microchip,lan966x-switch.yaml
> > @@ -37,12 +37,14 @@ properties:
> > items:
> > - description: register based extraction
> > - description: frame dma based extraction
> > + - description: analyzer interrupt
> >
> > interrupt-names:
> > minItems: 1
> > items:
> > - const: xtr
> > - const: fdma
> > + - const: ana
> >
> > resets:
> > items:
> > --
> > 2.33.0
> >

--
/Horatiu

2021-12-09 15:45:16

by Horatiu Vultur

[permalink] [raw]
Subject: Re: [PATCH net-next v3 5/6] net: lan966x: Add vlan support

The 12/09/2021 13:59, Vladimir Oltean wrote:
>
> On Thu, Dec 09, 2021 at 10:46:14AM +0100, Horatiu Vultur wrote:
> > +int lan966x_vlan_port_set_vid(struct lan966x_port *port, u16 vid,
> > + bool pvid, bool untagged)
> > +{
> > + struct lan966x *lan966x = port->lan966x;
> > +
> > + /* Egress vlan classification */
> > + if (untagged && port->vid != vid) {
> > + if (port->vid) {
> > + dev_err(lan966x->dev,
> > + "Port already has a native VLAN: %d\n",
> > + port->vid);
> > + return -EBUSY;
>
> Are you interested in supporting the use case from 0da1a1c48911 ("net:
> mscc: ocelot: allow a config where all bridge VLANs are egress-untagged")?
> Because it would be good if the driver was structured that way from the
> get-go instead of patching it later.

Currently not, but I don't know what will happen in 1 month or 6
months.

>
> > + }
> > + port->vid = vid;
> > + }
> > +
> > + /* Default ingress vlan classification */
> > + if (pvid)
> > + port->pvid = vid;
> > +
> > + return 0;
> > +}

--
/Horatiu

2021-12-09 16:41:17

by Horatiu Vultur

[permalink] [raw]
Subject: Re: [PATCH net-next v3 6/6] net: lan966x: Add switchdev support

The 12/09/2021 13:36, Vladimir Oltean wrote:
>
> On Thu, Dec 09, 2021 at 10:46:15AM +0100, Horatiu Vultur wrote:
> > This adds support for switchdev in lan966x.
> > It offloads to the HW basic forwarding and vlan filtering. To be able to
> > offload this to the HW, it is required to disable promisc mode for ports
> > that are part of the bridge.
> >
> > Signed-off-by: Horatiu Vultur <[email protected]>
> > ---
> > .../net/ethernet/microchip/lan966x/Makefile | 3 +-
> > .../ethernet/microchip/lan966x/lan966x_main.c | 41 +-
> > .../ethernet/microchip/lan966x/lan966x_main.h | 18 +
> > .../microchip/lan966x/lan966x_switchdev.c | 548 ++++++++++++++++++
> > .../ethernet/microchip/lan966x/lan966x_vlan.c | 12 +-
> > 5 files changed, 610 insertions(+), 12 deletions(-)
> > create mode 100644 drivers/net/ethernet/microchip/lan966x/lan966x_switchdev.c
> >
> > diff --git a/drivers/net/ethernet/microchip/lan966x/Makefile b/drivers/net/ethernet/microchip/lan966x/Makefile
> > index f7e6068a91cb..d82e896c2e53 100644
> > --- a/drivers/net/ethernet/microchip/lan966x/Makefile
> > +++ b/drivers/net/ethernet/microchip/lan966x/Makefile
> > @@ -6,4 +6,5 @@
> > obj-$(CONFIG_LAN966X_SWITCH) += lan966x-switch.o
> >
> > lan966x-switch-objs := lan966x_main.o lan966x_phylink.o lan966x_port.o \
> > - lan966x_mac.o lan966x_ethtool.o lan966x_vlan.o
> > + lan966x_mac.o lan966x_ethtool.o lan966x_switchdev.o \
> > + lan966x_vlan.o
> > diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
> > index 1b4c7e6b4f85..aee36c1cfa17 100644
> > --- a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
> > +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
> > @@ -306,7 +306,7 @@ static int lan966x_port_xmit(struct sk_buff *skb, struct net_device *dev)
> > return lan966x_port_ifh_xmit(skb, ifh, dev);
> > }
> >
> > -static void lan966x_set_promisc(struct lan966x_port *port, bool enable)
> > +void lan966x_set_promisc(struct lan966x_port *port, bool enable)
> > {
> > struct lan966x *lan966x = port->lan966x;
> >
>
> My documentation of CPU_SRC_COPY_ENA says:
>
> If set, all frames received on this port are
> copied to the CPU extraction queue given by
> CPUQ_CFG.CPUQ_SRC_COPY.
>
> I think it was established a while ago that this isn't what promiscuous
> mode is about? Instead it is about accepting packets on a port
> regardless of whether the MAC DA is in their RX filter or not.

Yes, I am aware that this change interprets the things differently and I
am totally OK to drop this promisc if it is needed.

>
> Hence the oddity of your change. I understand what it intends to do:
> if this is a standalone port you support IFF_UNICAST_FLT, so you drop
> frames with unknown MAC DA. But if IFF_PROMISC is set, then why do you
> copy all frames to the CPU? Why don't you just put the CPU in the
> unknown flooding mask?

Because I don't want the CPU to be in the unknown flooding mask. I want
to send frames to the CPU only if it is required.

> There's a difference between "force a packet to
> get copied to the CPU" and "copy it to the CPU only if it may have
> business there". Then this change would be compatible with bridge mode.
> You want the bridge to receive unknown traffic too, it may need to
> forward it in software.

I don't want to bridge to receive unknown traffic if I know it would
just drop it.

>
> > @@ -318,14 +318,18 @@ static void lan966x_set_promisc(struct lan966x_port *port, bool enable)
> > static void lan966x_port_change_rx_flags(struct net_device *dev, int flags)
> > {
> > struct lan966x_port *port = netdev_priv(dev);
> > + bool enable;
> >
> > if (!(flags & IFF_PROMISC))
> > return;
> >
> > - if (dev->flags & IFF_PROMISC)
> > - lan966x_set_promisc(port, true);
> > - else
> > - lan966x_set_promisc(port, false);
> > + enable = dev->flags & IFF_PROMISC ? true : false;
> > + port->promisc = enable;
> > +
> > + if (port->bridge)
> > + return;
> > +
> > + lan966x_set_promisc(port, enable);
> > }
> >
> > static int lan966x_port_change_mtu(struct net_device *dev, int new_mtu)
> > @@ -340,7 +344,7 @@ static int lan966x_port_change_mtu(struct net_device *dev, int new_mtu)
> > return 0;
> > }
> >
> > -static int lan966x_mc_unsync(struct net_device *dev, const unsigned char *addr)
> > +int lan966x_mc_unsync(struct net_device *dev, const unsigned char *addr)
> > {
> > struct lan966x_port *port = netdev_priv(dev);
> > struct lan966x *lan966x = port->lan966x;
> > @@ -348,7 +352,7 @@ static int lan966x_mc_unsync(struct net_device *dev, const unsigned char *addr)
> > return lan966x_mac_forget(lan966x, addr, port->pvid, ENTRYTYPE_LOCKED);
> > }
> >
> > -static int lan966x_mc_sync(struct net_device *dev, const unsigned char *addr)
> > +int lan966x_mc_sync(struct net_device *dev, const unsigned char *addr)
> > {
> > struct lan966x_port *port = netdev_priv(dev);
> > struct lan966x *lan966x = port->lan966x;
> > @@ -401,6 +405,11 @@ static const struct net_device_ops lan966x_port_netdev_ops = {
> > .ndo_vlan_rx_kill_vid = lan966x_vlan_rx_kill_vid,
> > };
> >
> > +bool lan966x_netdevice_check(const struct net_device *dev)
> > +{
> > + return dev && (dev->netdev_ops == &lan966x_port_netdev_ops);
>
> Can "dev" ever be NULL?

It doesn't look like that. I will remove it.

>
> > +}
> > +
> > static int lan966x_port_xtr_status(struct lan966x *lan966x, u8 grp)
> > {
> > return lan_rd(lan966x, QS_XTR_RD(grp));
> > @@ -537,6 +546,11 @@ static irqreturn_t lan966x_xtr_irq_handler(int irq, void *args)
> >
> > skb->protocol = eth_type_trans(skb, dev);
> >
> > +#ifdef CONFIG_NET_SWITCHDEV
> > + if (lan966x->ports[src_port]->bridge)
> > + skb->offload_fwd_mark = 1;
> > +#endif
> > +
> > netif_rx_ni(skb);
> > dev->stats.rx_bytes += len;
> > dev->stats.rx_packets++;
> > @@ -619,13 +633,16 @@ static int lan966x_probe_port(struct lan966x *lan966x, u32 p,
> >
> > dev->netdev_ops = &lan966x_port_netdev_ops;
> > dev->ethtool_ops = &lan966x_ethtool_ops;
> > + dev->hw_features |= NETIF_F_HW_VLAN_CTAG_FILTER |
> > + NETIF_F_RXFCS;
> > + dev->features |= NETIF_F_HW_VLAN_CTAG_FILTER |
> > + NETIF_F_HW_VLAN_CTAG_TX |
> > + NETIF_F_HW_VLAN_STAG_TX;
> > + dev->priv_flags |= IFF_UNICAST_FLT;
>
> Too many changes in one patch. IFF_UNICAST_FLT and the handling of
> promiscuous mode have nothing to do with switchdev.

Well, in this case (because it is a bug in the promisc) it has. Because
the bridge will increase the promiscuity of the interfaces, so then all
the ports will copy the frames to the CPU. Which breaks the entire
purpose of the bridge.

> Neither VLAN filtering via dev->features (should have been part of the previous
> patch), or RXFCS.

I agree, the VLAN filtering should be part of the previous patch and
RXFCS should be totally separate change.

> It seems that each one of these changes was previously
> missed and is now snuck in without the explanation it deserves.
>
> > dev->needed_headroom = IFH_LEN * sizeof(u32);
> >
> > eth_hw_addr_gen(dev, lan966x->base_mac, p + 1);
> >
> > - lan966x_mac_learn(lan966x, PGID_CPU, dev->dev_addr, port->pvid,
> > - ENTRYTYPE_LOCKED);
> > -
>
> Why is this deleted? If the port uses IFF_UNICAST_FLT and isn't
> promiscuous, how does it get the unicast traffic it needs?
> I may be not realizing something because the changes aren't properly
> split.

The functionality is moved inside function 'lan966x_port_add_addr'.

>
> > port->phylink_config.dev = &port->dev->dev;
> > port->phylink_config.type = PHYLINK_NETDEV;
> > port->phylink_pcs.poll = true;
> > @@ -949,6 +966,8 @@ static int lan966x_probe(struct platform_device *pdev)
> > lan966x_port_init(lan966x->ports[p]);
> > }
> >
> > + lan966x_register_notifier_blocks(lan966x);
> > +
> > return 0;
> >
> > cleanup_ports:
> > @@ -967,6 +986,8 @@ static int lan966x_remove(struct platform_device *pdev)
> > {
> > struct lan966x *lan966x = platform_get_drvdata(pdev);
> >
> > + lan966x_unregister_notifier_blocks(lan966x);
> > +
> > lan966x_cleanup_ports(lan966x);
> >
> > cancel_delayed_work_sync(&lan966x->stats_work);
> > diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_main.h b/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
> > index ec3eccf634b3..4a0988087167 100644
> > --- a/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
> > +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
> > @@ -80,6 +80,11 @@ struct lan966x {
> > struct list_head mac_entries;
> > spinlock_t mac_lock; /* lock for mac_entries list */
> >
> > + /* Notifiers */
> > + struct notifier_block netdevice_nb;
> > + struct notifier_block switchdev_nb;
> > + struct notifier_block switchdev_blocking_nb;
> > +
> > u16 vlan_mask[VLAN_N_VID];
> > DECLARE_BITMAP(cpu_vlan_mask, VLAN_N_VID);
> >
> > @@ -112,6 +117,10 @@ struct lan966x_port {
> > struct net_device *dev;
> > struct lan966x *lan966x;
> >
> > + struct net_device *bridge;
> > + u8 stp_state;
> > + u8 promisc;
> > +
> > u8 chip_port;
> > u16 pvid;
> > u16 vid;
> > @@ -129,6 +138,14 @@ extern const struct phylink_mac_ops lan966x_phylink_mac_ops;
> > extern const struct phylink_pcs_ops lan966x_phylink_pcs_ops;
> > extern const struct ethtool_ops lan966x_ethtool_ops;
> >
> > +int lan966x_mc_unsync(struct net_device *dev, const unsigned char *addr);
> > +int lan966x_mc_sync(struct net_device *dev, const unsigned char *addr);
> > +
> > +bool lan966x_netdevice_check(const struct net_device *dev);
> > +
> > +int lan966x_register_notifier_blocks(struct lan966x *lan966x);
> > +void lan966x_unregister_notifier_blocks(struct lan966x *lan966x);
> > +
> > void lan966x_stats_get(struct net_device *dev,
> > struct rtnl_link_stats64 *stats);
> > int lan966x_stats_init(struct lan966x *lan966x);
> > @@ -139,6 +156,7 @@ void lan966x_port_status_get(struct lan966x_port *port,
> > struct phylink_link_state *state);
> > int lan966x_port_pcs_set(struct lan966x_port *port,
> > struct lan966x_port_config *config);
> > +void lan966x_set_promisc(struct lan966x_port *port, bool enable);
> > void lan966x_port_init(struct lan966x_port *port);
> >
> > int lan966x_mac_learn(struct lan966x *lan966x, int port,
> > diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_switchdev.c b/drivers/net/ethernet/microchip/lan966x/lan966x_switchdev.c
> > new file mode 100644
> > index 000000000000..ed6ec78d2d9a
> > --- /dev/null
> > +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_switchdev.c
> > @@ -0,0 +1,548 @@
> > +// SPDX-License-Identifier: GPL-2.0+
> > +
> > +#include <linux/if_bridge.h>
> > +#include <net/switchdev.h>
> > +
> > +#include "lan966x_main.h"
> > +
> > +static struct workqueue_struct *lan966x_owq;
> > +
> > +struct lan966x_fdb_event_work {
> > + struct work_struct work;
> > + struct switchdev_notifier_fdb_info fdb_info;
> > + struct net_device *dev;
> > + struct lan966x *lan966x;
> > + unsigned long event;
> > +};
> > +
> > +static void lan966x_port_attr_bridge_flags(struct lan966x_port *port,
> > + struct switchdev_brport_flags flags)
> > +{
> > + u32 val = lan_rd(port->lan966x, ANA_PGID(PGID_MC));
> > +
> > + val = ANA_PGID_PGID_GET(val);
> > +
> > + if (flags.mask & BR_MCAST_FLOOD) {
> > + if (flags.val & BR_MCAST_FLOOD)
> > + val |= BIT(port->chip_port);
> > + else
> > + val &= ~BIT(port->chip_port);
> > + }
> > +
> > + lan_rmw(ANA_PGID_PGID_SET(val),
> > + ANA_PGID_PGID,
> > + port->lan966x, ANA_PGID(PGID_MC));
> > +}
> > +
> > +static u32 lan966x_get_fwd_mask(struct lan966x_port *port)
> > +{
> > + struct net_device *bridge = port->bridge;
> > + struct lan966x *lan966x = port->lan966x;
> > + u8 ingress_src = port->chip_port;
> > + u32 mask = 0;
> > + int p;
> > +
> > + if (port->stp_state != BR_STATE_FORWARDING)
> > + goto skip_forwarding;
> > +
> > + for (p = 0; p < lan966x->num_phys_ports; p++) {
> > + port = lan966x->ports[p];
> > +
> > + if (!port)
> > + continue;
> > +
> > + if (port->stp_state == BR_STATE_FORWARDING &&
> > + port->bridge == bridge)
> > + mask |= BIT(p);
> > + }
> > +
> > +skip_forwarding:
> > + mask &= ~BIT(ingress_src);
> > +
> > + return mask;
> > +}
> > +
> > +static void lan966x_update_fwd_mask(struct lan966x *lan966x)
> > +{
> > + int p;
> > +
> > + for (p = 0; p < lan966x->num_phys_ports; p++) {
> > + struct lan966x_port *port = lan966x->ports[p];
> > + unsigned long mask = 0;
> > +
> > + if (port->bridge)
> > + mask = lan966x_get_fwd_mask(port);
> > +
> > + mask |= BIT(CPU_PORT);
> > +
> > + lan_wr(ANA_PGID_PGID_SET(mask),
> > + lan966x, ANA_PGID(PGID_SRC + p));
> > + }
> > +}
> > +
> > +static void lan966x_attr_stp_state_set(struct lan966x_port *port,
> > + u8 state)
> > +{
> > + struct lan966x *lan966x = port->lan966x;
> > + bool learn_ena = 0;
> > +
> > + port->stp_state = state;
> > +
> > + if (state == BR_STATE_FORWARDING || state == BR_STATE_LEARNING)
> > + learn_ena = 1;
>
> Please use true/false for bool types.

Will do that.

>
> > +
> > + lan_rmw(ANA_PORT_CFG_LEARN_ENA_SET(learn_ena),
> > + ANA_PORT_CFG_LEARN_ENA,
> > + lan966x, ANA_PORT_CFG(port->chip_port));
> > +
> > + lan966x_update_fwd_mask(lan966x);
> > +}
> > +
> > +static void lan966x_port_attr_ageing_set(struct lan966x_port *port,
> > + unsigned long ageing_clock_t)
> > +{
> > + unsigned long ageing_jiffies = clock_t_to_jiffies(ageing_clock_t);
> > + u32 ageing_time = jiffies_to_msecs(ageing_jiffies) / 1000;
> > +
> > + lan966x_mac_set_ageing(port->lan966x, ageing_time);
> > +}
> > +
> > +static int lan966x_port_attr_set(struct net_device *dev, const void *ctx,
> > + const struct switchdev_attr *attr,
> > + struct netlink_ext_ack *extack)
> > +{
> > + struct lan966x_port *port = netdev_priv(dev);
> > +
> > + switch (attr->id) {
> > + case SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS:
> > + lan966x_port_attr_bridge_flags(port, attr->u.brport_flags);
> > + break;
>
> no SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS? I think it doesn't even work
> if you don't handle that?
>
> br_switchdev_set_port_flag():
>
> attr.id = SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS;
> attr.u.brport_flags.val = flags;
> attr.u.brport_flags.mask = mask;
>
> /* We run from atomic context here */
> err = call_switchdev_notifiers(SWITCHDEV_PORT_ATTR_SET, p->dev,
> &info.info, extack);
> err = notifier_to_errno(err);
> if (err == -EOPNOTSUPP)
> return 0;
>
> Anyway, a big blob of "switchdev support" is hard to follow and review.
> If you add bridge port flags you could as well add more comprehensive
> support for them, but in a separate change please. Forwarding domain is
> one thing, FDB/MDB is another, VLAN is another.

Good catch. I will try to split it as you suggested.

>
> > + case SWITCHDEV_ATTR_ID_PORT_STP_STATE:
> > + lan966x_attr_stp_state_set(port, attr->u.stp_state);
> > + break;
> > + case SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME:
> > + lan966x_port_attr_ageing_set(port, attr->u.ageing_time);
> > + break;
> > + case SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING:
> > + lan966x_vlan_port_set_vlan_aware(port, attr->u.vlan_filtering);
> > + lan966x_vlan_port_apply(port);
> > + lan966x_vlan_cpu_set_vlan_aware(port);
> > + break;
> > + default:
> > + return -EOPNOTSUPP;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static int lan966x_port_bridge_join(struct lan966x_port *port,
> > + struct net_device *bridge,
> > + struct netlink_ext_ack *extack)
> > +{
> > + struct net_device *dev = port->dev;
> > + int err;
> > +
> > + err = switchdev_bridge_port_offload(dev, dev, NULL, NULL, NULL,
> > + false, extack);
> > + if (err)
> > + return err;
> > +
> > + port->bridge = bridge;
> > +
> > + /* Port enters in bridge mode therefor don't need to copy to CPU
> > + * frames for multicast in case the bridge is not requesting them
> > + */
> > + __dev_mc_unsync(dev, lan966x_mc_unsync);
>
> Why is there a need to unsync the addresses, though? A driver that
> supports proper MAC table isolation between standalone ports and
> VLAN-unaware bridge ports should use separate pvids for the two modes of
> operation (if it doesn't support other isolation mechanisms apart from VLAN,
> which it looks like lan966x does not). I see that your driver even does
> this in lan966x_vlan_port_get_pvid(). So the non-bridge multicast
> addresses could sit there even when the port is in a bridge.

Really good observation.
Initially I had the same PVID for standalone and vlan-unaware bridge
ports. So then I have added all these but now that they use different
PVID I can remove this.

>
> > +
> > + /* make sure that the promisc is disabled when entering under the bridge
> > + * because we don't want all the frames to come to CPU
> > + */
> > + lan966x_set_promisc(port, false);
>
> What's the story here? Why don't other switchdev drivers handle promisc
> in this way (copy all frames to the CPU)?

I can't say about other switchdev drivers, but in our case we really
don't want all the frames to the CPU. The function lan966x_set_promisc
was setting the port to copy all the frames to the CPU, which we don't
want when the port is part of the bridge.

>
> > +
> > + return 0;
> > +}
> > +
> > +static void lan966x_port_bridge_leave(struct lan966x_port *port,
> > + struct net_device *bridge)
> > +{
> > + struct lan966x *lan966x = port->lan966x;
> > +
> > + switchdev_bridge_port_unoffload(port->dev, NULL, NULL, NULL);
>
> The bridge offers a facility to sync and unsync the host addresses on
> joins and leaves. For this to work, the switchdev_bridge_port_unoffload
> call should be during the NETDEV_PRECHANGEUPPER notifier event.
>
> > + port->bridge = NULL;
> > +
> > + /* Set the port back to host mode */
> > + lan966x_vlan_port_set_vlan_aware(port, 0);
>
> s/0/false/
>
> > + lan966x_vlan_port_set_vid(port, HOST_PVID, false, false);
> > + lan966x_vlan_port_apply(port);
> > +
> > + lan966x_mac_cpu_learn(lan966x, port->dev->dev_addr, HOST_PVID);
> > +
> > + /* Port enters in host more therefore restore mc list */
> > + __dev_mc_sync(port->dev, lan966x_mc_sync, lan966x_mc_unsync);
> > +
> > + /* Restore back the promisc as it was before the interfaces was added to
> > + * the bridge
> > + */
> > + lan966x_set_promisc(port, port->promisc);
> > +}
> > +
> > +static int lan966x_port_changeupper(struct net_device *dev,
> > + struct netdev_notifier_changeupper_info *info)
> > +{
> > + struct lan966x_port *port = netdev_priv(dev);
> > + struct netlink_ext_ack *extack;
> > + int err = 0;
> > +
> > + extack = netdev_notifier_info_to_extack(&info->info);
> > +
> > + if (netif_is_bridge_master(info->upper_dev)) {
> > + if (info->linking)
> > + err = lan966x_port_bridge_join(port, info->upper_dev,
> > + extack);
> > + else
> > + lan966x_port_bridge_leave(port, info->upper_dev);
> > + }
> > +
> > + return err;
> > +}
> > +
> > +static int lan966x_port_add_addr(struct net_device *dev, bool up)
> > +{
> > + struct lan966x_port *port = netdev_priv(dev);
> > + struct lan966x *lan966x = port->lan966x;
> > + u16 vid;
> > +
> > + vid = lan966x_vlan_port_get_pvid(port);
> > +
> > + if (up)
> > + lan966x_mac_cpu_learn(lan966x, dev->dev_addr, vid);
> > + else
> > + lan966x_mac_cpu_forget(lan966x, dev->dev_addr, vid);
>
> For which uppers is this intended? The bridge? Because the bridge
> notifies you of all the entries it needs, if you also consider the
> replayed events and provide non-NULL pointers to
> switchdev_bridge_port_offload.

No, this up just means if the port is up or down. Doesn't have anything
to do with the bridge. This code replace the code from
lan966x_probe_port.

>
> > +
> > + return 0;
> > +}
> > +
> > +static int lan966x_netdevice_port_event(struct net_device *dev,
> > + struct notifier_block *nb,
> > + unsigned long event, void *ptr)
> > +{
> > + int err = 0;
> > +
> > + if (!lan966x_netdevice_check(dev))
> > + return 0;
> > +
> > + switch (event) {
> > + case NETDEV_CHANGEUPPER:
> > + err = lan966x_port_changeupper(dev, ptr);
> > + break;
> > + case NETDEV_PRE_UP:
> > + err = lan966x_port_add_addr(dev, true);
> > + break;
> > + case NETDEV_DOWN:
> > + err = lan966x_port_add_addr(dev, false);
> > + break;
> > + }
> > +
> > + return err;
> > +}
> > +
> > +static int lan966x_netdevice_event(struct notifier_block *nb,
> > + unsigned long event, void *ptr)
> > +{
> > + struct net_device *dev = netdev_notifier_info_to_dev(ptr);
> > + int ret;
> > +
> > + ret = lan966x_netdevice_port_event(dev, nb, event, ptr);
> > +
> > + return notifier_from_errno(ret);
> > +}
> > +
> > +static void lan966x_fdb_event_work(struct work_struct *work)
> > +{
> > + struct lan966x_fdb_event_work *fdb_work =
> > + container_of(work, struct lan966x_fdb_event_work, work);
> > + struct switchdev_notifier_fdb_info *fdb_info;
> > + struct net_device *dev = fdb_work->dev;
> > + struct lan966x_port *port;
> > + struct lan966x *lan966x;
> > +
> > + rtnl_lock();
>
> rtnl_lock() shouldn't be needed.
>
> > +
> > + fdb_info = &fdb_work->fdb_info;
> > + lan966x = fdb_work->lan966x;
> > +
> > + if (lan966x_netdevice_check(dev)) {
> > + port = netdev_priv(dev);
> > +
> > + switch (fdb_work->event) {
> > + case SWITCHDEV_FDB_ADD_TO_DEVICE:
> > + if (!fdb_info->added_by_user)
> > + break;
>
> If you get notified of a MAC address dynamically learned by the software
> bridge on a lan966x port, you will have allocated memory for the work
> item, and scheduled it, for nothing. Please try to exit unnecessary work
> early.

Yes, I will add an early check.

>
> > + lan966x_mac_add_entry(lan966x, port, fdb_info->addr,
> > + fdb_info->vid);
> > + break;
> > + case SWITCHDEV_FDB_DEL_TO_DEVICE:
> > + if (!fdb_info->added_by_user)
> > + break;
> > + lan966x_mac_del_entry(lan966x, fdb_info->addr, fdb_info->vid);
> > + break;
> > + }
> > + } else {
> > + if (!netif_is_bridge_master(dev))
> > + goto out;
> > +
> > + /* If the CPU is not part of the vlan then there is no point
> > + * to copy the frames to the CPU because they will be dropped
> > + */
> > + if (!lan966x_vlan_cpu_member_vlan_mask(lan966x, fdb_info->vid))
> > + goto out;
>
> It isn't part of the VLAN now, but what about later? I don't see that
> you keep these FDB entries anywhere and restore them when a port joins
> that VLAN.

Actually I do that. It is inside lan966x_vlan_port_add_vlan. There I
check if the CPU is part of the VLAN then add that entry.

>
> > +
> > + /* In case the bridge is called */
> > + switch (fdb_work->event) {
> > + case SWITCHDEV_FDB_ADD_TO_DEVICE:
> > + /* If there is no front port in this vlan, there is no
> > + * point to copy the frame to CPU because it would be
> > + * just dropped at later point. So add it only if
> > + * there is a port
> > + */
> > + if (!lan966x_vlan_port_any_vlan_mask(lan966x, fdb_info->vid))
> > + break;
> > +
> > + lan966x_mac_cpu_learn(lan966x, fdb_info->addr, fdb_info->vid);
>
> Does the lan966x_mac_cpu_learn() operation trigger interrupts, or only
> the dynamic learning process?

Only dynamic learning process.

> How do you handle migration of an FDB entry pointing towards the CPU,
> towards one pointing towards a port?

Shouldn't I get 2 calls that the entry is removed from CPU and then
added to a port?

>
> > + break;
> > + case SWITCHDEV_FDB_DEL_TO_DEVICE:
> > + /* It is OK to always forget the entry it */
>
> Forget the entry it?

It was a typo. I will remove the 'it'.

>
> > + lan966x_mac_cpu_forget(lan966x, fdb_info->addr, fdb_info->vid);
> > + break;
> > + }
> > + }
> > +
> > +out:
> > + rtnl_unlock();
> > + kfree(fdb_work->fdb_info.addr);
> > + kfree(fdb_work);
> > + dev_put(dev);
> > +}
> > +
> > +static int lan966x_switchdev_event(struct notifier_block *nb,
> > + unsigned long event, void *ptr)
> > +{
> > + struct lan966x *lan966x = container_of(nb, struct lan966x, switchdev_nb);
> > + struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
> > + struct switchdev_notifier_fdb_info *fdb_info;
> > + struct switchdev_notifier_info *info = ptr;
> > + struct lan966x_fdb_event_work *fdb_work;
> > + int err;
> > +
> > + switch (event) {
> > + case SWITCHDEV_PORT_ATTR_SET:
> > + err = switchdev_handle_port_attr_set(dev, ptr,
> > + lan966x_netdevice_check,
> > + lan966x_port_attr_set);
> > + return notifier_from_errno(err);
> > + case SWITCHDEV_FDB_ADD_TO_DEVICE:
> > + fallthrough;
>
> "fallthrough;" not needed here.
>
> > + case SWITCHDEV_FDB_DEL_TO_DEVICE:
> > + fdb_work = kzalloc(sizeof(*fdb_work), GFP_ATOMIC);
> > + if (!fdb_work)
> > + return NOTIFY_BAD;
> > +
> > + fdb_info = container_of(info,
> > + struct switchdev_notifier_fdb_info,
> > + info);
> > +
> > + fdb_work->dev = dev;
> > + fdb_work->lan966x = lan966x;
> > + fdb_work->event = event;
> > + INIT_WORK(&fdb_work->work, lan966x_fdb_event_work);
> > + memcpy(&fdb_work->fdb_info, ptr, sizeof(fdb_work->fdb_info));
> > + fdb_work->fdb_info.addr = kzalloc(ETH_ALEN, GFP_ATOMIC);
> > + if (!fdb_work->fdb_info.addr)
> > + goto err_addr_alloc;
> > +
> > + ether_addr_copy((u8 *)fdb_work->fdb_info.addr, fdb_info->addr);
> > + dev_hold(dev);
> > +
> > + queue_work(lan966x_owq, &fdb_work->work);
> > + break;
> > + }
> > +
> > + return NOTIFY_DONE;
> > +err_addr_alloc:
> > + kfree(fdb_work);
> > + return NOTIFY_BAD;
> > +}
> > +
> > +static int lan966x_handle_port_vlan_add(struct net_device *dev,
> > + struct notifier_block *nb,
> > + const struct switchdev_obj_port_vlan *v)
> > +{
> > + struct lan966x_port *port;
> > + struct lan966x *lan966x;
> > +
> > + /* When adding a port to a vlan, we get a callback for the port but
> > + * also for the bridge. When get the callback for the bridge just bail
> > + * out. Then when the bridge is added to the vlan, then we get a
> > + * callback here but in this case the flags has set:
> > + * BRIDGE_VLAN_INFO_BRENTRY. In this case it means that the CPU
> > + * port is added to the vlan, so the broadcast frames and unicast frames
> > + * with dmac of the bridge should be foward to CPU.
> > + */
> > + if (netif_is_bridge_master(dev) &&
> > + !(v->flags & BRIDGE_VLAN_INFO_BRENTRY))
> > + return 0;
> > +
> > + lan966x = container_of(nb, struct lan966x, switchdev_blocking_nb);
> > +
> > + /* In case the port gets called */
> > + if (!(netif_is_bridge_master(dev))) {
> > + if (!lan966x_netdevice_check(dev))
> > + return -EOPNOTSUPP;
> > +
> > + port = netdev_priv(dev);
> > + return lan966x_vlan_port_add_vlan(port, v->vid,
> > + v->flags & BRIDGE_VLAN_INFO_PVID,
> > + v->flags & BRIDGE_VLAN_INFO_UNTAGGED);
> > + }
> > +
> > + /* In case the bridge gets called */
> > + if (netif_is_bridge_master(dev))
> > + return lan966x_vlan_cpu_add_vlan(lan966x, dev, v->vid);
> > +
> > + return 0;
> > +}
> > +
> > +static int lan966x_handle_port_obj_add(struct net_device *dev,
> > + struct notifier_block *nb,
> > + struct switchdev_notifier_port_obj_info *info)
> > +{
> > + const struct switchdev_obj *obj = info->obj;
> > + int err;
> > +
> > + switch (obj->id) {
> > + case SWITCHDEV_OBJ_ID_PORT_VLAN:
> > + err = lan966x_handle_port_vlan_add(dev, nb,
> > + SWITCHDEV_OBJ_PORT_VLAN(obj));
> > + break;
> > + default:
> > + err = -EOPNOTSUPP;
> > + break;
> > + }
> > +
> > + info->handled = true;
> > + return err;
> > +}
> > +
> > +static int lan966x_handle_port_vlan_del(struct net_device *dev,
> > + struct notifier_block *nb,
> > + const struct switchdev_obj_port_vlan *v)
> > +{
> > + struct lan966x_port *port;
> > + struct lan966x *lan966x;
> > +
> > + lan966x = container_of(nb, struct lan966x, switchdev_blocking_nb);
> > +
> > + /* In case the physical port gets called */
> > + if (!netif_is_bridge_master(dev)) {
> > + if (!lan966x_netdevice_check(dev))
> > + return -EOPNOTSUPP;
> > +
> > + port = netdev_priv(dev);
> > + return lan966x_vlan_port_del_vlan(port, v->vid);
> > + }
> > +
> > + /* In case the bridge gets called */
> > + if (netif_is_bridge_master(dev))
> > + return lan966x_vlan_cpu_del_vlan(lan966x, dev, v->vid);
> > +
> > + return 0;
> > +}
> > +
> > +static int lan966x_handle_port_obj_del(struct net_device *dev,
> > + struct notifier_block *nb,
> > + struct switchdev_notifier_port_obj_info *info)
> > +{
> > + const struct switchdev_obj *obj = info->obj;
> > + int err;
> > +
> > + switch (obj->id) {
> > + case SWITCHDEV_OBJ_ID_PORT_VLAN:
> > + err = lan966x_handle_port_vlan_del(dev, nb,
> > + SWITCHDEV_OBJ_PORT_VLAN(obj));
> > + break;
> > + default:
> > + err = -EOPNOTSUPP;
> > + break;
> > + }
> > +
> > + info->handled = true;
> > + return err;
> > +}
> > +
> > +static int lan966x_switchdev_blocking_event(struct notifier_block *nb,
> > + unsigned long event,
> > + void *ptr)
> > +{
> > + struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
> > + int err;
> > +
> > + switch (event) {
> > + case SWITCHDEV_PORT_OBJ_ADD:
> > + err = lan966x_handle_port_obj_add(dev, nb, ptr);
> > + return notifier_from_errno(err);
> > + case SWITCHDEV_PORT_OBJ_DEL:
> > + err = lan966x_handle_port_obj_del(dev, nb, ptr);
> > + return notifier_from_errno(err);
> > + case SWITCHDEV_PORT_ATTR_SET:
> > + err = switchdev_handle_port_attr_set(dev, ptr,
> > + lan966x_netdevice_check,
> > + lan966x_port_attr_set);
> > + return notifier_from_errno(err);
> > + }
> > +
> > + return NOTIFY_DONE;
> > +}
> > +
> > +int lan966x_register_notifier_blocks(struct lan966x *lan966x)
> > +{
> > + int err;
> > +
> > + lan966x->netdevice_nb.notifier_call = lan966x_netdevice_event;
> > + err = register_netdevice_notifier(&lan966x->netdevice_nb);
> > + if (err)
> > + return err;
> > +
> > + lan966x->switchdev_nb.notifier_call = lan966x_switchdev_event;
> > + err = register_switchdev_notifier(&lan966x->switchdev_nb);
> > + if (err)
> > + goto err_switchdev_nb;
> > +
> > + lan966x->switchdev_blocking_nb.notifier_call = lan966x_switchdev_blocking_event;
> > + err = register_switchdev_blocking_notifier(&lan966x->switchdev_blocking_nb);
> > + if (err)
> > + goto err_switchdev_blocking_nb;
> > +
> > + lan966x_owq = alloc_ordered_workqueue("lan966x_order", 0);
> > + if (!lan966x_owq) {
> > + err = -ENOMEM;
> > + goto err_switchdev_blocking_nb;
> > + }
>
> These should be singleton objects, otherwise things get problematic if
> you have more than one switch device instantiated in the system.

Yes, I will update this.

>
> > +
> > + return 0;
> > +
> > +err_switchdev_blocking_nb:
> > + unregister_switchdev_notifier(&lan966x->switchdev_nb);
> > +err_switchdev_nb:
> > + unregister_netdevice_notifier(&lan966x->netdevice_nb);
> > +
> > + return err;
> > +}
> > +
> > +void lan966x_unregister_notifier_blocks(struct lan966x *lan966x)
> > +{
> > + destroy_workqueue(lan966x_owq);
> > +
> > + unregister_switchdev_blocking_notifier(&lan966x->switchdev_blocking_nb);
> > + unregister_switchdev_notifier(&lan966x->switchdev_nb);
> > + unregister_netdevice_notifier(&lan966x->netdevice_nb);
> > +}
> > diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_vlan.c b/drivers/net/ethernet/microchip/lan966x/lan966x_vlan.c
> > index e47552775d06..26644503b4e6 100644
> > --- a/drivers/net/ethernet/microchip/lan966x/lan966x_vlan.c
> > +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_vlan.c
> > @@ -155,6 +155,9 @@ static bool lan966x_vlan_cpu_member_cpu_vlan_mask(struct lan966x *lan966x, u16 v
> >
> > u16 lan966x_vlan_port_get_pvid(struct lan966x_port *port)
> > {
> > + if (!port->bridge)
> > + return HOST_PVID;
> > +
> > return port->vlan_aware ? port->pvid : UNAWARE_PVID;
> > }
> >
> > @@ -210,6 +213,8 @@ void lan966x_vlan_cpu_set_vlan_aware(struct lan966x_port *port)
> > * table for the front port and the CPU
> > */
> > lan966x_mac_cpu_learn(lan966x, port->dev->dev_addr, UNAWARE_PVID);
> > + lan966x_mac_cpu_learn(lan966x, port->bridge->dev_addr,
> > + UNAWARE_PVID);
> >
> > lan966x_vlan_port_add_vlan_mask(port, UNAWARE_PVID);
> > lan966x_vlan_port_apply(port);
> > @@ -218,6 +223,8 @@ void lan966x_vlan_cpu_set_vlan_aware(struct lan966x_port *port)
> > * to vlan unaware
> > */
> > lan966x_mac_cpu_forget(lan966x, port->dev->dev_addr, UNAWARE_PVID);
> > + lan966x_mac_cpu_forget(lan966x, port->bridge->dev_addr,
> > + UNAWARE_PVID);
> >
> > lan966x_vlan_port_del_vlan_mask(port, UNAWARE_PVID);
> > lan966x_vlan_port_apply(port);
> > @@ -293,6 +300,7 @@ int lan966x_vlan_port_add_vlan(struct lan966x_port *port,
> > */
> > if (lan966x_vlan_cpu_member_cpu_vlan_mask(lan966x, vid)) {
> > lan966x_mac_cpu_learn(lan966x, port->dev->dev_addr, vid);
> > + lan966x_mac_cpu_learn(lan966x, port->bridge->dev_addr, vid);
> > lan966x_vlan_cpu_add_vlan_mask(lan966x, vid);
> > }
> >
> > @@ -322,8 +330,10 @@ int lan966x_vlan_port_del_vlan(struct lan966x_port *port,
> > * that vlan but still keep it in the mask because it may be needed
> > * again then another port gets added in tha vlan
> > */
> > - if (!lan966x_vlan_port_any_vlan_mask(lan966x, vid))
> > + if (!lan966x_vlan_port_any_vlan_mask(lan966x, vid)) {
> > + lan966x_mac_cpu_forget(lan966x, port->bridge->dev_addr, vid);
> > lan966x_vlan_cpu_del_vlan_mask(lan966x, vid);
> > + }
> >
> > return 0;
> > }
> > --
> > 2.33.0
> >

--
/Horatiu

2021-12-09 20:23:36

by Vladimir Oltean

[permalink] [raw]
Subject: Re: [PATCH net-next v3 2/6] dt-bindings: net: lan966x: Extend with the analyzer interrupt

On Thu, Dec 09, 2021 at 04:42:47PM +0100, Horatiu Vultur wrote:
> The 12/09/2021 10:58, Vladimir Oltean wrote:
> >
> > On Thu, Dec 09, 2021 at 10:46:11AM +0100, Horatiu Vultur wrote:
> > > Extend dt-bindings for lan966x with analyzer interrupt.
> > > This interrupt can be generated for example when the HW learn/forgets
> > > an entry in the MAC table.
> > >
> > > Signed-off-by: Horatiu Vultur <[email protected]>
> > > ---
> >
> > Reviewed-by: Vladimir Oltean <[email protected]>
> >
> > Why don't you describe your hardware in the device tree all at once?
> > Doing it piece by piece means that every time when you add a new
> > functionality you need to be compatible with the absence of a certain
> > reg, interrupt etc.
>
> I though it is more clear what is added in the patch series.
> But then, if for example add more interrupts in DT than what the
> driver support, that would not be an issue?

I haven't kept track of the lan966x driver development. It looks like it
is pretty new, so I think it's ok in this case. But I've also seen
features introduced years after the driver was initially published (see
ocelot fdma) where device tree updates were still necessary, due to
minor things like these: an interrupt isn't there, the registers for
FDMA aren't there, etc. After that kind of time you'd expect the DT
to no longer require updates unless there is some unforeseen event
(something is broken, a driver is radically rethought). Sure there's a
fine line between how much you can add to the device tree and the
how many consumers there are in the kernel, but on the other hand the
kernel doesn't have to use everything that's in the device tree.
For example, at Mark Brown's suggestion, the DSPI nodes in ls1028a.dtsi
declare their DMA channels even though the driver does not use them
(it could, though, but it would be slower). Similarly, the DSPI driver
for LS1021A has had a while in which it ignored the interrupt line from
the device tree, because poll mode was simply faster. To me, this kind
of approach where the device tree is provisioned even for configurations
that aren't supported today makes sense, precisely because the DT blob
and the kernel have different lifetimes. It's better to have the
interrupt and not use it than to need it and not have it.

2021-12-13 10:25:57

by Horatiu Vultur

[permalink] [raw]
Subject: Re: [PATCH net-next v3 6/6] net: lan966x: Add switchdev support

The 12/09/2021 17:43, Horatiu Vultur wrote:
> > > +int lan966x_register_notifier_blocks(struct lan966x *lan966x)
> > > +{
> > > + int err;
> > > +
> > > + lan966x->netdevice_nb.notifier_call = lan966x_netdevice_event;
> > > + err = register_netdevice_notifier(&lan966x->netdevice_nb);
> > > + if (err)
> > > + return err;
> > > +
> > > + lan966x->switchdev_nb.notifier_call = lan966x_switchdev_event;
> > > + err = register_switchdev_notifier(&lan966x->switchdev_nb);
> > > + if (err)
> > > + goto err_switchdev_nb;
> > > +
> > > + lan966x->switchdev_blocking_nb.notifier_call = lan966x_switchdev_blocking_event;
> > > + err = register_switchdev_blocking_notifier(&lan966x->switchdev_blocking_nb);
> > > + if (err)
> > > + goto err_switchdev_blocking_nb;
> > > +
> > > + lan966x_owq = alloc_ordered_workqueue("lan966x_order", 0);
> > > + if (!lan966x_owq) {
> > > + err = -ENOMEM;
> > > + goto err_switchdev_blocking_nb;
> > > + }
> >
> > These should be singleton objects, otherwise things get problematic if
> > you have more than one switch device instantiated in the system.
>
> Yes, I will update this.

Actually I think they need to be part of lan966x.
Because we want each lan966x instance to be independent of each other.
This is not seen in this version but is more clear in the next version
(v4).

>
> >
>
> --
> /Horatiu

--
/Horatiu

2021-12-13 11:46:11

by Vladimir Oltean

[permalink] [raw]
Subject: Re: [PATCH net-next v3 6/6] net: lan966x: Add switchdev support

On Thu, Dec 09, 2021 at 05:43:11PM +0100, Horatiu Vultur wrote:
> > My documentation of CPU_SRC_COPY_ENA says:
> >
> > If set, all frames received on this port are
> > copied to the CPU extraction queue given by
> > CPUQ_CFG.CPUQ_SRC_COPY.
> >
> > I think it was established a while ago that this isn't what promiscuous
> > mode is about? Instead it is about accepting packets on a port
> > regardless of whether the MAC DA is in their RX filter or not.
>
> Yes, I am aware that this change interprets the things differently and I
> am totally OK to drop this promisc if it is needed.

I think we just need to agree on the observable behavior. Promiscuous
means for an interface to receive packets with unknown destination, and
while in standalone mode you do support that, in bridge mode you're a
bit on the edge: the port accepts them but will deliver them anywhere
except to the CPU. I suppose you could try to make an argument that you
know better than the bridge, and as long as the use cases for that are
restricted enough, maybe it could work for most scenarios. I don't know.

> > Hence the oddity of your change. I understand what it intends to do:
> > if this is a standalone port you support IFF_UNICAST_FLT, so you drop
> > frames with unknown MAC DA. But if IFF_PROMISC is set, then why do you
> > copy all frames to the CPU? Why don't you just put the CPU in the
> > unknown flooding mask?
>
> Because I don't want the CPU to be in the unknown flooding mask. I want
> to send frames to the CPU only if it is required.

What is the strategy through which this driver accepts things like
pinging the bridge device over IPv6, with the Neighbor Discovery
protocol having the ICMP6 neighbor solicitation messages delivered to
(according to my knowledge) an unregistered IPv6 multicast address?
Whose responsibility is it to notify the driver of that address, and
does the driver copy those packets to the CPU in the right VLAN?

> > How do you handle migration of an FDB entry pointing towards the CPU,
> > towards one pointing towards a port?
>
> Shouldn't I get 2 calls that the entry is removed from CPU and then
> added to a port?

Ok.

2021-12-13 13:35:41

by Horatiu Vultur

[permalink] [raw]
Subject: Re: [PATCH net-next v3 6/6] net: lan966x: Add switchdev support

The 12/13/2021 11:46, Vladimir Oltean wrote:
>
> On Thu, Dec 09, 2021 at 05:43:11PM +0100, Horatiu Vultur wrote:
> > > My documentation of CPU_SRC_COPY_ENA says:
> > >
> > > If set, all frames received on this port are
> > > copied to the CPU extraction queue given by
> > > CPUQ_CFG.CPUQ_SRC_COPY.
> > >
> > > I think it was established a while ago that this isn't what promiscuous
> > > mode is about? Instead it is about accepting packets on a port
> > > regardless of whether the MAC DA is in their RX filter or not.
> >
> > Yes, I am aware that this change interprets the things differently and I
> > am totally OK to drop this promisc if it is needed.
>
> I think we just need to agree on the observable behavior. Promiscuous
> means for an interface to receive packets with unknown destination, and
> while in standalone mode you do support that, in bridge mode you're a
> bit on the edge: the port accepts them but will deliver them anywhere
> except to the CPU. I suppose you could try to make an argument that you
> know better than the bridge, and as long as the use cases for that are
> restricted enough, maybe it could work for most scenarios. I don't know.

I think this requires some proper explanations of the intended
behaviour for both the standalone and bridge mode. I will drop this
promisc for now, as other drivers are doing it and at a later point
send some patch series with all the explanations.

>
> > > Hence the oddity of your change. I understand what it intends to do:
> > > if this is a standalone port you support IFF_UNICAST_FLT, so you drop
> > > frames with unknown MAC DA. But if IFF_PROMISC is set, then why do you
> > > copy all frames to the CPU? Why don't you just put the CPU in the
> > > unknown flooding mask?
> >
> > Because I don't want the CPU to be in the unknown flooding mask. I want
> > to send frames to the CPU only if it is required.
>
> What is the strategy through which this driver accepts things like
> pinging the bridge device over IPv6, with the Neighbor Discovery
> protocol having the ICMP6 neighbor solicitation messages delivered to
> (according to my knowledge) an unregistered IPv6 multicast address?
> Whose responsibility is it to notify the driver of that address, and
> does the driver copy those packets to the CPU in the right VLAN?

I think in that case the CPU should be part of the multicast flooding
mask. I will need to look more on this because I don't know much about
the IPv6.

>
> > > How do you handle migration of an FDB entry pointing towards the CPU,
> > > towards one pointing towards a port?
> >
> > Shouldn't I get 2 calls that the entry is removed from CPU and then
> > added to a port?
>
> Ok.

--
/Horatiu

2021-12-13 13:43:30

by Vladimir Oltean

[permalink] [raw]
Subject: Re: [PATCH net-next v3 6/6] net: lan966x: Add switchdev support

On Mon, Dec 13, 2021 at 11:25:29AM +0100, Horatiu Vultur wrote:
> The 12/09/2021 17:43, Horatiu Vultur wrote:
> > > > +int lan966x_register_notifier_blocks(struct lan966x *lan966x)
> > > > +{
> > > > + int err;
> > > > +
> > > > + lan966x->netdevice_nb.notifier_call = lan966x_netdevice_event;
> > > > + err = register_netdevice_notifier(&lan966x->netdevice_nb);
> > > > + if (err)
> > > > + return err;
> > > > +
> > > > + lan966x->switchdev_nb.notifier_call = lan966x_switchdev_event;
> > > > + err = register_switchdev_notifier(&lan966x->switchdev_nb);
> > > > + if (err)
> > > > + goto err_switchdev_nb;
> > > > +
> > > > + lan966x->switchdev_blocking_nb.notifier_call = lan966x_switchdev_blocking_event;
> > > > + err = register_switchdev_blocking_notifier(&lan966x->switchdev_blocking_nb);
> > > > + if (err)
> > > > + goto err_switchdev_blocking_nb;
> > > > +
> > > > + lan966x_owq = alloc_ordered_workqueue("lan966x_order", 0);
> > > > + if (!lan966x_owq) {
> > > > + err = -ENOMEM;
> > > > + goto err_switchdev_blocking_nb;
> > > > + }
> > >
> > > These should be singleton objects, otherwise things get problematic if
> > > you have more than one switch device instantiated in the system.
> >
> > Yes, I will update this.
>
> Actually I think they need to be part of lan966x.
> Because we want each lan966x instance to be independent of each other.
> This is not seen in this version but is more clear in the next version
> (v4).

They are independent of each other. You deduce the interface on which
the notifier was emitted using switchdev_notifier_info_to_dev() and act
upon it, if lan966x_netdevice_check() is true. The notifier handling
code itself is stateless, all the state is per port / per switch.
If you register one notifier handler per switch, lan966x_netdevice_check()
would return true for each notifier handler instance, and you would
handle each event twice, would you not? This is why I'm saying that the
notifier handlers should be registered as singletons, like other drivers
do.

2021-12-13 14:24:58

by Horatiu Vultur

[permalink] [raw]
Subject: Re: [PATCH net-next v3 6/6] net: lan966x: Add switchdev support

The 12/13/2021 13:43, Vladimir Oltean wrote:
>
> On Mon, Dec 13, 2021 at 11:25:29AM +0100, Horatiu Vultur wrote:
> > The 12/09/2021 17:43, Horatiu Vultur wrote:
> > > > > +int lan966x_register_notifier_blocks(struct lan966x *lan966x)
> > > > > +{
> > > > > + int err;
> > > > > +
> > > > > + lan966x->netdevice_nb.notifier_call = lan966x_netdevice_event;
> > > > > + err = register_netdevice_notifier(&lan966x->netdevice_nb);
> > > > > + if (err)
> > > > > + return err;
> > > > > +
> > > > > + lan966x->switchdev_nb.notifier_call = lan966x_switchdev_event;
> > > > > + err = register_switchdev_notifier(&lan966x->switchdev_nb);
> > > > > + if (err)
> > > > > + goto err_switchdev_nb;
> > > > > +
> > > > > + lan966x->switchdev_blocking_nb.notifier_call = lan966x_switchdev_blocking_event;
> > > > > + err = register_switchdev_blocking_notifier(&lan966x->switchdev_blocking_nb);
> > > > > + if (err)
> > > > > + goto err_switchdev_blocking_nb;
> > > > > +
> > > > > + lan966x_owq = alloc_ordered_workqueue("lan966x_order", 0);
> > > > > + if (!lan966x_owq) {
> > > > > + err = -ENOMEM;
> > > > > + goto err_switchdev_blocking_nb;
> > > > > + }
> > > >
> > > > These should be singleton objects, otherwise things get problematic if
> > > > you have more than one switch device instantiated in the system.
> > >
> > > Yes, I will update this.
> >
> > Actually I think they need to be part of lan966x.
> > Because we want each lan966x instance to be independent of each other.
> > This is not seen in this version but is more clear in the next version
> > (v4).
>
> They are independent of each other. You deduce the interface on which
> the notifier was emitted using switchdev_notifier_info_to_dev() and act
> upon it, if lan966x_netdevice_check() is true. The notifier handling
> code itself is stateless, all the state is per port / per switch.
> If you register one notifier handler per switch, lan966x_netdevice_check()
> would return true for each notifier handler instance, and you would
> handle each event twice, would you not?

That is correct, I will get the event twice which is a problem in the
lan966x. The function lan966x_netdevice_check should be per instance, in
this way each instance can filter the events.
The reason why I am putting the notifier_block inside lan966x is to be
able to get to the instance of lan966x even if I get a event that is not
for lan966x port.

> notifier handlers should be registered as singletons, like other drivers
> do.

It looks like not all the other driver register them as singletone. For
example: prestera, mlx5, sparx5. (I just have done a git grep for
register_switchdev_notifier, I have not looked in details at the
implementation).


--
/Horatiu

2021-12-13 14:29:14

by Vladimir Oltean

[permalink] [raw]
Subject: Re: [PATCH net-next v3 6/6] net: lan966x: Add switchdev support

On Mon, Dec 13, 2021 at 03:26:56PM +0100, Horatiu Vultur wrote:
> > They are independent of each other. You deduce the interface on which
> > the notifier was emitted using switchdev_notifier_info_to_dev() and act
> > upon it, if lan966x_netdevice_check() is true. The notifier handling
> > code itself is stateless, all the state is per port / per switch.
> > If you register one notifier handler per switch, lan966x_netdevice_check()
> > would return true for each notifier handler instance, and you would
> > handle each event twice, would you not?
>
> That is correct, I will get the event twice which is a problem in the
> lan966x. The function lan966x_netdevice_check should be per instance, in
> this way each instance can filter the events.
> The reason why I am putting the notifier_block inside lan966x is to be
> able to get to the instance of lan966x even if I get a event that is not
> for lan966x port.

That isn't a problem, every netdevice notifier still sees all events.
DSA intercepts a lot of events which aren't directly emitted for its own
interfaces. You don't gain much by having one more, if anything.

> > notifier handlers should be registered as singletons, like other drivers
> > do.
>
> It looks like not all the other driver register them as singletone. For
> example: prestera, mlx5, sparx5. (I just have done a git grep for
> register_switchdev_notifier, I have not looked in details at the
> implementation).

Not all driver writers may have realized that it is an issue that needs
to be thought of.

2021-12-13 15:26:26

by Horatiu Vultur

[permalink] [raw]
Subject: Re: [PATCH net-next v3 6/6] net: lan966x: Add switchdev support

The 12/13/2021 14:29, Vladimir Oltean wrote:
>
> On Mon, Dec 13, 2021 at 03:26:56PM +0100, Horatiu Vultur wrote:
> > > They are independent of each other. You deduce the interface on which
> > > the notifier was emitted using switchdev_notifier_info_to_dev() and act
> > > upon it, if lan966x_netdevice_check() is true. The notifier handling
> > > code itself is stateless, all the state is per port / per switch.
> > > If you register one notifier handler per switch, lan966x_netdevice_check()
> > > would return true for each notifier handler instance, and you would
> > > handle each event twice, would you not?
> >
> > That is correct, I will get the event twice which is a problem in the
> > lan966x. The function lan966x_netdevice_check should be per instance, in
> > this way each instance can filter the events.
> > The reason why I am putting the notifier_block inside lan966x is to be
> > able to get to the instance of lan966x even if I get a event that is not
> > for lan966x port.
>
> That isn't a problem, every netdevice notifier still sees all events.

Yes, that is correct.
Sorry maybe I am still confused, but some things are still not right.

So lets say there are two lan966x instances(A and B) and each one has 2
ports(ethA0, ethA1, ethB0, ethB1).
So with the current behaviour, if for example ethA0 is added in vlan
100, then we get two callbacks for each lan966x instance(A and B) because
each of them registered. And because of lan966x_netdevice_check() is true
for ethA0 will do twice the work.
And you propose to have a singleton notifier so we get only 1 callback
and will be fine for this case. But if you add for example the bridge in
vlan 200 then I will never be able to get to the lan966x instance which
is needed in this case.

That is why if the lan966x_netdevice_check would be per instance, then
we can filter like before, we still get call twice but then we filter for
each instance. We get the lan966x instance from notifier_block and then
we can check if the port netdev_ops is the same as the lan966x
netdev_ops.

And in the other case we will still be able to get to the lan966x instance
in case the bridge is added in a vlan.

> DSA intercepts a lot of events which aren't directly emitted for its own
> interfaces. You don't gain much by having one more, if anything.
>
> > > notifier handlers should be registered as singletons, like other drivers
> > > do.
> >
> > It looks like not all the other driver register them as singletone. For
> > example: prestera, mlx5, sparx5. (I just have done a git grep for
> > register_switchdev_notifier, I have not looked in details at the
> > implementation).
>
> Not all driver writers may have realized that it is an issue that needs
> to be thought of.

--
/Horatiu

2021-12-13 16:25:12

by Vladimir Oltean

[permalink] [raw]
Subject: Re: [PATCH net-next v3 6/6] net: lan966x: Add switchdev support

On Mon, Dec 13, 2021 at 04:28:24PM +0100, Horatiu Vultur wrote:
> The 12/13/2021 14:29, Vladimir Oltean wrote:
> >
> > On Mon, Dec 13, 2021 at 03:26:56PM +0100, Horatiu Vultur wrote:
> > > > They are independent of each other. You deduce the interface on which
> > > > the notifier was emitted using switchdev_notifier_info_to_dev() and act
> > > > upon it, if lan966x_netdevice_check() is true. The notifier handling
> > > > code itself is stateless, all the state is per port / per switch.
> > > > If you register one notifier handler per switch, lan966x_netdevice_check()
> > > > would return true for each notifier handler instance, and you would
> > > > handle each event twice, would you not?
> > >
> > > That is correct, I will get the event twice which is a problem in the
> > > lan966x. The function lan966x_netdevice_check should be per instance, in
> > > this way each instance can filter the events.
> > > The reason why I am putting the notifier_block inside lan966x is to be
> > > able to get to the instance of lan966x even if I get a event that is not
> > > for lan966x port.
> >
> > That isn't a problem, every netdevice notifier still sees all events.
>
> Yes, that is correct.
> Sorry maybe I am still confused, but some things are still not right.
>
> So lets say there are two lan966x instances(A and B) and each one has 2
> ports(ethA0, ethA1, ethB0, ethB1).
> So with the current behaviour, if for example ethA0 is added in vlan
> 100, then we get two callbacks for each lan966x instance(A and B) because
> each of them registered. And because of lan966x_netdevice_check() is true
> for ethA0 will do twice the work.
> And you propose to have a singleton notifier so we get only 1 callback
> and will be fine for this case. But if you add for example the bridge in
> vlan 200 then I will never be able to get to the lan966x instance which
> is needed in this case.

I'm not sure what you mean by "you add the bridge in vlan 200" with
respect to netdevice notifiers. Create an 8021q upper with VID 200 on
top of a bridge (as that would generate a NETDEV_CHANGEUPPER)?
If there's a netdevice event on a bridge, the singleton netdevice event
handler can see if it is a bridge (netif_is_bridge_master), and if it
is, it can crawl through the bridge's lower interfaces using
netdev_for_each_lower_dev to see if there is any lan966x interface
beneath it. If there isn't, nothing to do. Otherwise, you get the
opportunity to do something for each port under that bridge. Maybe I'm
not understanding what you're trying to accomplish that's different?

> That is why if the lan966x_netdevice_check would be per instance, then
> we can filter like before, we still get call twice but then we filter for
> each instance. We get the lan966x instance from notifier_block and then
> we can check if the port netdev_ops is the same as the lan966x
> netdev_ops.
>
> And in the other case we will still be able to get to the lan966x instance
> in case the bridge is added in a vlan.
>
> > DSA intercepts a lot of events which aren't directly emitted for its own
> > interfaces. You don't gain much by having one more, if anything.
> >
> > > > notifier handlers should be registered as singletons, like other drivers
> > > > do.
> > >
> > > It looks like not all the other driver register them as singletone. For
> > > example: prestera, mlx5, sparx5. (I just have done a git grep for
> > > register_switchdev_notifier, I have not looked in details at the
> > > implementation).
> >
> > Not all driver writers may have realized that it is an issue that needs
> > to be thought of.
>
> --
> /Horatiu

2021-12-13 21:22:52

by Horatiu Vultur

[permalink] [raw]
Subject: Re: [PATCH net-next v3 6/6] net: lan966x: Add switchdev support

The 12/13/2021 16:25, Vladimir Oltean wrote:
>
> On Mon, Dec 13, 2021 at 04:28:24PM +0100, Horatiu Vultur wrote:
> > The 12/13/2021 14:29, Vladimir Oltean wrote:
> > >
> > > On Mon, Dec 13, 2021 at 03:26:56PM +0100, Horatiu Vultur wrote:
> > > > > They are independent of each other. You deduce the interface on which
> > > > > the notifier was emitted using switchdev_notifier_info_to_dev() and act
> > > > > upon it, if lan966x_netdevice_check() is true. The notifier handling
> > > > > code itself is stateless, all the state is per port / per switch.
> > > > > If you register one notifier handler per switch, lan966x_netdevice_check()
> > > > > would return true for each notifier handler instance, and you would
> > > > > handle each event twice, would you not?
> > > >
> > > > That is correct, I will get the event twice which is a problem in the
> > > > lan966x. The function lan966x_netdevice_check should be per instance, in
> > > > this way each instance can filter the events.
> > > > The reason why I am putting the notifier_block inside lan966x is to be
> > > > able to get to the instance of lan966x even if I get a event that is not
> > > > for lan966x port.
> > >
> > > That isn't a problem, every netdevice notifier still sees all events.
> >
> > Yes, that is correct.
> > Sorry maybe I am still confused, but some things are still not right.
> >
> > So lets say there are two lan966x instances(A and B) and each one has 2
> > ports(ethA0, ethA1, ethB0, ethB1).
> > So with the current behaviour, if for example ethA0 is added in vlan
> > 100, then we get two callbacks for each lan966x instance(A and B) because
> > each of them registered. And because of lan966x_netdevice_check() is true
> > for ethA0 will do twice the work.
> > And you propose to have a singleton notifier so we get only 1 callback
> > and will be fine for this case. But if you add for example the bridge in
> > vlan 200 then I will never be able to get to the lan966x instance which
> > is needed in this case.
>
> I'm not sure what you mean by "you add the bridge in vlan 200" with
> respect to netdevice notifiers. Create an 8021q upper with VID 200 on
> top of a bridge (as that would generate a NETDEV_CHANGEUPPER)?

I meant the following:

ip link add name brA type bridge
ip link add name brB type bridge
ip link set dev ethA0 master brA
ip link set dev ethA1 master brA
ip link set dev ethB0 master brB
ip link set dev ethB1 master brB
bridge vlan add dev brA vid 200 self

After the last command both lan966x instances will get a switchdev blocking
event where event is SWITCHDEV_PORT_OBJ_ADD and obj->id is
SWITCHDEV_OBJ_ID_PORT_VLAN. And in this case the
switchdev_notifier_info_to_dev will return brA.

> If there's a netdevice event on a bridge, the singleton netdevice event
> handler can see if it is a bridge (netif_is_bridge_master), and if it
> is, it can crawl through the bridge's lower interfaces using
> netdev_for_each_lower_dev to see if there is any lan966x interface
> beneath it. If there isn't, nothing to do. Otherwise, you get the
> opportunity to do something for each port under that bridge.

If I start to use switchdev_handle_port_obj_add, then as you mention
will get a callback for each interface under the port and then I need to
look in obj->orig_dev to see if it was a bridge or was a port that was
part of the bridge.

If I don't use switchdev_handle_port_obj_add and implement own function
then there is no way to get to the lan966x instance. I will get two
callbacks but then they can be filtered based on the bridge. If I use
switchdev_handle_port_obj_add then if I have 2 ports under the bridge,
both ports will be called which will do the same work anyway.

So I am not sure how much I will benefit of using
switchdev_handle_port_obj_add in this case.

One important observation in the driver is that I am separating these 2
cases:

bridge vlan add dev brA vid 300 self
bridge vlan add dev ethA0 vid 400

> Maybe I'm not understanding what you're trying to accomplish that's different?

The reason is that I want to use brA to control the CPU port. To decide
which frames to be copy to the CPU. Also to copy as few as possible
frames to CPU.

If we still want to go with the approach of using a singleton notifier
block, then we will still have a problem for netdevice notifier block.
We will get the same issue, can't get to lan966x instance in case the
lan966x callback is called for a different device. And we need this for
the following case:

If for example eth0, eth1 are part of a different IP and eth2, eth3 are
part of lan966x. We would like not to be able to put under the same
bridge interfaces that are part of different IPs (more precisely,
lan966x interfaces can be only under a bridge where lan966x interfaces
are part).

For example the following command should fail:
ip link add name br0 type bridge
ip link set dev eth0 master br0
ip link set dev eth2 master br0

Also the this command should fail:
ip link add name br0 type bridge
ip link set dev eth2 master br0
ip link set dev eth0 master br0

But the following should be accepted:
ip link add name br0 type bridge
ip link set dev eth0 master br0
ip link set dev eth1 master br0
ip link add name br1 type bridge
ip link set dev eth2 master br1
ip link set dev eth3 master br1

Maybe I should also make it explictly that is not allow to have more
than one instance of lan966x for now. And once is needed then add
support for it.

>
> > That is why if the lan966x_netdevice_check would be per instance, then
> > we can filter like before, we still get call twice but then we filter for
> > each instance. We get the lan966x instance from notifier_block and then
> > we can check if the port netdev_ops is the same as the lan966x
> > netdev_ops.
> >
> > And in the other case we will still be able to get to the lan966x instance
> > in case the bridge is added in a vlan.
> >
> > > DSA intercepts a lot of events which aren't directly emitted for its own
> > > interfaces. You don't gain much by having one more, if anything.
> > >
> > > > > notifier handlers should be registered as singletons, like other drivers
> > > > > do.
> > > >
> > > > It looks like not all the other driver register them as singletone. For
> > > > example: prestera, mlx5, sparx5. (I just have done a git grep for
> > > > register_switchdev_notifier, I have not looked in details at the
> > > > implementation).
> > >
> > > Not all driver writers may have realized that it is an issue that needs
> > > to be thought of.
> >
> > --
> > /Horatiu

--
/Horatiu

2021-12-14 00:02:53

by Vladimir Oltean

[permalink] [raw]
Subject: Re: [PATCH net-next v3 6/6] net: lan966x: Add switchdev support

On Mon, Dec 13, 2021 at 10:24:50PM +0100, Horatiu Vultur wrote:
> The 12/13/2021 16:25, Vladimir Oltean wrote:
> >
> > On Mon, Dec 13, 2021 at 04:28:24PM +0100, Horatiu Vultur wrote:
> > > The 12/13/2021 14:29, Vladimir Oltean wrote:
> > > >
> > > > On Mon, Dec 13, 2021 at 03:26:56PM +0100, Horatiu Vultur wrote:
> > > > > > They are independent of each other. You deduce the interface on which
> > > > > > the notifier was emitted using switchdev_notifier_info_to_dev() and act
> > > > > > upon it, if lan966x_netdevice_check() is true. The notifier handling
> > > > > > code itself is stateless, all the state is per port / per switch.
> > > > > > If you register one notifier handler per switch, lan966x_netdevice_check()
> > > > > > would return true for each notifier handler instance, and you would
> > > > > > handle each event twice, would you not?
> > > > >
> > > > > That is correct, I will get the event twice which is a problem in the
> > > > > lan966x. The function lan966x_netdevice_check should be per instance, in
> > > > > this way each instance can filter the events.
> > > > > The reason why I am putting the notifier_block inside lan966x is to be
> > > > > able to get to the instance of lan966x even if I get a event that is not
> > > > > for lan966x port.
> > > >
> > > > That isn't a problem, every netdevice notifier still sees all events.
> > >
> > > Yes, that is correct.
> > > Sorry maybe I am still confused, but some things are still not right.
> > >
> > > So lets say there are two lan966x instances(A and B) and each one has 2
> > > ports(ethA0, ethA1, ethB0, ethB1).
> > > So with the current behaviour, if for example ethA0 is added in vlan
> > > 100, then we get two callbacks for each lan966x instance(A and B) because
> > > each of them registered. And because of lan966x_netdevice_check() is true
> > > for ethA0 will do twice the work.
> > > And you propose to have a singleton notifier so we get only 1 callback
> > > and will be fine for this case. But if you add for example the bridge in
> > > vlan 200 then I will never be able to get to the lan966x instance which
> > > is needed in this case.
> >
> > I'm not sure what you mean by "you add the bridge in vlan 200" with
> > respect to netdevice notifiers. Create an 8021q upper with VID 200 on
> > top of a bridge (as that would generate a NETDEV_CHANGEUPPER)?
>
> I meant the following:
>
> ip link add name brA type bridge
> ip link add name brB type bridge
> ip link set dev ethA0 master brA
> ip link set dev ethA1 master brA
> ip link set dev ethB0 master brB
> ip link set dev ethB1 master brB
> bridge vlan add dev brA vid 200 self

Ok, so the same as this use case and patch posted by Florian for DSA:
https://lkml.org/lkml/2018/6/24/300
we should be getting back to it some day.

> After the last command both lan966x instances will get a switchdev blocking
> event where event is SWITCHDEV_PORT_OBJ_ADD and obj->id is
> SWITCHDEV_OBJ_ID_PORT_VLAN. And in this case the
> switchdev_notifier_info_to_dev will return brA.

It returns brA anyway. But the point being, your current code submission
is something like this (of course, I had to fish these two functions
from two different patches, because they still aren't properly split):

static int lan966x_vlan_cpu_add_vlan_mask(struct lan966x *lan966x, u16 vid)
{
lan966x->vlan_mask[vid] |= BIT(CPU_PORT);
return lan966x_vlan_set_mask(lan966x, vid);
}

static int lan966x_handle_port_vlan_add(struct net_device *dev,
struct notifier_block *nb,
const struct switchdev_obj_port_vlan *v)
{
struct lan966x_port *port;
struct lan966x *lan966x;

/* When adding a port to a vlan, we get a callback for the port but
* also for the bridge. When get the callback for the bridge just bail
* out. Then when the bridge is added to the vlan, then we get a
* callback here but in this case the flags has set:
* BRIDGE_VLAN_INFO_BRENTRY. In this case it means that the CPU
* port is added to the vlan, so the broadcast frames and unicast frames
* with dmac of the bridge should be foward to CPU.
*/
if (netif_is_bridge_master(dev) &&
!(v->flags & BRIDGE_VLAN_INFO_BRENTRY))
return 0;

lan966x = container_of(nb, struct lan966x, switchdev_blocking_nb);

/* In case the port gets called */
if (!(netif_is_bridge_master(dev))) {
if (!lan966x_netdevice_check(dev))
return -EOPNOTSUPP;

port = netdev_priv(dev);
return lan966x_vlan_port_add_vlan(port, v->vid,
v->flags & BRIDGE_VLAN_INFO_PVID,
v->flags & BRIDGE_VLAN_INFO_UNTAGGED);
}

/* In case the bridge gets called */
if (netif_is_bridge_master(dev))
return lan966x_vlan_cpu_add_vlan(lan966x, dev, v->vid);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In which way does this function call, exactly, check
your lan966x's relationship to that bridge?

return 0;
}

My point being, if you have two veth interfaces in your system just
minding their own business and being put in an unrelated bridge, and
that bridge would be put in VLAN 100 too, my understanding is that your
lan966x driver would sniff that event and add its CPU port to VLAN 100
too. The reverse is true as well: any removal of a bridge from a VLAN
would also cause your CPU port to stop being in that VLAN, no matter
what interfaces may be in that VLAN. How could I say this... "spooky
action at a distance".

> > If there's a netdevice event on a bridge, the singleton netdevice event
> > handler can see if it is a bridge (netif_is_bridge_master), and if it
> > is, it can crawl through the bridge's lower interfaces using
> > netdev_for_each_lower_dev to see if there is any lan966x interface
> > beneath it. If there isn't, nothing to do. Otherwise, you get the
> > opportunity to do something for each port under that bridge.
>
> If I start to use switchdev_handle_port_obj_add, then as you mention
> will get a callback for each interface under the port and then I need to
> look in obj->orig_dev to see if it was a bridge or was a port that was
> part of the bridge.

Oh yes of course. And right now you don't need that because? You think
you get notifications only of switchdev events emitted by bridges that
you have a port in?

> If I don't use switchdev_handle_port_obj_add and implement own function
> then there is no way to get to the lan966x instance.

The switchdev_handle_port_obj_add() function isn't magic, and it has an
actual public implementation, too. Sure you can get to the lan966x
instance even if you don't use switchdev_handle_port_obj_add() -
although, it is there for people to use it.

> I will get two callbacks but then they can be filtered based on the
> bridge. If I use switchdev_handle_port_obj_add then if I have 2 ports
> under the bridge, both ports will be called which will do the same
> work anyway.

And that's a good thing, if you actually think about how you design
things to actually work. Please consider that you have two distinct
events: you can join a bridge that is in a VLAN, or the bridge can join
that VLAN while you have some ports under it. The invariant is that your
CPU port needs to be in that VLAN only for as long as there is any port
under that bridge. So it is actually beneficial to use the
switchdev_handle_* helper. It tells you how many users of the CPU VLAN
rule there still are. It would be broken to delete it right away, when a
port leaves the bridge. It would also be broken to not delete it after
all ports leave: the bridge may have a longer lifetime than the lan966x
ports beneath them, so there may not be any deletion event that you
should expect.

> So I am not sure how much I will benefit of using
> switchdev_handle_port_obj_add in this case.
>
> One important observation in the driver is that I am separating these 2
> cases:
>
> bridge vlan add dev brA vid 300 self
> bridge vlan add dev ethA0 vid 400

Understood, and that's ok. But I'm not convinced it works, though.

> > Maybe I'm not understanding what you're trying to accomplish that's different?
>
> The reason is that I want to use brA to control the CPU port. To decide
> which frames to be copy to the CPU. Also to copy as few as possible
> frames to CPU.
>
> If we still want to go with the approach of using a singleton notifier
> block, then we will still have a problem for netdevice notifier block.
> We will get the same issue, can't get to lan966x instance in case the
> lan966x callback is called for a different device. And we need this for
> the following case:
>
> If for example eth0, eth1 are part of a different IP and eth2, eth3 are
> part of lan966x. We would like not to be able to put under the same
> bridge interfaces that are part of different IPs (more precisely,
> lan966x interfaces can be only under a bridge where lan966x interfaces
> are part).
>
> For example the following command should fail:
> ip link add name br0 type bridge
> ip link set dev eth0 master br0
> ip link set dev eth2 master br0
>
> Also the this command should fail:
> ip link add name br0 type bridge
> ip link set dev eth2 master br0
> ip link set dev eth0 master br0
>
> But the following should be accepted:
> ip link add name br0 type bridge
> ip link set dev eth0 master br0
> ip link set dev eth1 master br0
> ip link add name br1 type bridge
> ip link set dev eth2 master br1
> ip link set dev eth3 master br1

You can track NETDEV_PRECHANGEUPPER and deny that, and also provide an
extack with a reason. That should work, it's been tested.

> Maybe I should also make it explictly that is not allow to have more
> than one instance of lan966x for now. And once is needed then add
> support for it.

I don't necessarily see the reason for this, but ok. I don't think you
should view things as "support for parallel instances of the driver is
what's complicating the implementation", but rather "catching the events
in all the permutations that they can happen in is what this driver
needs to provide a good user experience".

2021-12-14 14:29:32

by Horatiu Vultur

[permalink] [raw]
Subject: Re: [PATCH net-next v3 6/6] net: lan966x: Add switchdev support

The 12/14/2021 00:01, Vladimir Oltean wrote:

Sorry for late reply, but I spend some time trying out your suggestions.

>
> On Mon, Dec 13, 2021 at 10:24:50PM +0100, Horatiu Vultur wrote:
> > The 12/13/2021 16:25, Vladimir Oltean wrote:
> > >
> > > On Mon, Dec 13, 2021 at 04:28:24PM +0100, Horatiu Vultur wrote:
> > > > The 12/13/2021 14:29, Vladimir Oltean wrote:
> > > > >
> > > > > On Mon, Dec 13, 2021 at 03:26:56PM +0100, Horatiu Vultur wrote:
> > > > > > > They are independent of each other. You deduce the interface on which
> > > > > > > the notifier was emitted using switchdev_notifier_info_to_dev() and act
> > > > > > > upon it, if lan966x_netdevice_check() is true. The notifier handling
> > > > > > > code itself is stateless, all the state is per port / per switch.
> > > > > > > If you register one notifier handler per switch, lan966x_netdevice_check()
> > > > > > > would return true for each notifier handler instance, and you would
> > > > > > > handle each event twice, would you not?
> > > > > >
> > > > > > That is correct, I will get the event twice which is a problem in the
> > > > > > lan966x. The function lan966x_netdevice_check should be per instance, in
> > > > > > this way each instance can filter the events.
> > > > > > The reason why I am putting the notifier_block inside lan966x is to be
> > > > > > able to get to the instance of lan966x even if I get a event that is not
> > > > > > for lan966x port.
> > > > >
> > > > > That isn't a problem, every netdevice notifier still sees all events.
> > > >
> > > > Yes, that is correct.
> > > > Sorry maybe I am still confused, but some things are still not right.
> > > >
> > > > So lets say there are two lan966x instances(A and B) and each one has 2
> > > > ports(ethA0, ethA1, ethB0, ethB1).
> > > > So with the current behaviour, if for example ethA0 is added in vlan
> > > > 100, then we get two callbacks for each lan966x instance(A and B) because
> > > > each of them registered. And because of lan966x_netdevice_check() is true
> > > > for ethA0 will do twice the work.
> > > > And you propose to have a singleton notifier so we get only 1 callback
> > > > and will be fine for this case. But if you add for example the bridge in
> > > > vlan 200 then I will never be able to get to the lan966x instance which
> > > > is needed in this case.
> > >
> > > I'm not sure what you mean by "you add the bridge in vlan 200" with
> > > respect to netdevice notifiers. Create an 8021q upper with VID 200 on
> > > top of a bridge (as that would generate a NETDEV_CHANGEUPPER)?
> >
> > I meant the following:
> >
> > ip link add name brA type bridge
> > ip link add name brB type bridge
> > ip link set dev ethA0 master brA
> > ip link set dev ethA1 master brA
> > ip link set dev ethB0 master brB
> > ip link set dev ethB1 master brB
> > bridge vlan add dev brA vid 200 self
>
> Ok, so the same as this use case and patch posted by Florian for DSA:
> https://lkml.org/lkml/2018/6/24/300
> we should be getting back to it some day.
>
> > After the last command both lan966x instances will get a switchdev blocking
> > event where event is SWITCHDEV_PORT_OBJ_ADD and obj->id is
> > SWITCHDEV_OBJ_ID_PORT_VLAN. And in this case the
> > switchdev_notifier_info_to_dev will return brA.
>
> It returns brA anyway. But the point being, your current code submission
> is something like this (of course, I had to fish these two functions
> from two different patches, because they still aren't properly split):
>
> static int lan966x_vlan_cpu_add_vlan_mask(struct lan966x *lan966x, u16 vid)
> {
> lan966x->vlan_mask[vid] |= BIT(CPU_PORT);
> return lan966x_vlan_set_mask(lan966x, vid);
> }
>
> static int lan966x_handle_port_vlan_add(struct net_device *dev,
> struct notifier_block *nb,
> const struct switchdev_obj_port_vlan *v)
> {
> struct lan966x_port *port;
> struct lan966x *lan966x;
>
> /* When adding a port to a vlan, we get a callback for the port but
> * also for the bridge. When get the callback for the bridge just bail
> * out. Then when the bridge is added to the vlan, then we get a
> * callback here but in this case the flags has set:
> * BRIDGE_VLAN_INFO_BRENTRY. In this case it means that the CPU
> * port is added to the vlan, so the broadcast frames and unicast frames
> * with dmac of the bridge should be foward to CPU.
> */
> if (netif_is_bridge_master(dev) &&
> !(v->flags & BRIDGE_VLAN_INFO_BRENTRY))
> return 0;
>
> lan966x = container_of(nb, struct lan966x, switchdev_blocking_nb);
>
> /* In case the port gets called */
> if (!(netif_is_bridge_master(dev))) {
> if (!lan966x_netdevice_check(dev))
> return -EOPNOTSUPP;
>
> port = netdev_priv(dev);
> return lan966x_vlan_port_add_vlan(port, v->vid,
> v->flags & BRIDGE_VLAN_INFO_PVID,
> v->flags & BRIDGE_VLAN_INFO_UNTAGGED);
> }
>
> /* In case the bridge gets called */
> if (netif_is_bridge_master(dev))
> return lan966x_vlan_cpu_add_vlan(lan966x, dev, v->vid);
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> In which way does this function call, exactly, check
> your lan966x's relationship to that bridge?
>
> return 0;
> }
>
> My point being, if you have two veth interfaces in your system just
> minding their own business and being put in an unrelated bridge, and
> that bridge would be put in VLAN 100 too, my understanding is that your
> lan966x driver would sniff that event and add its CPU port to VLAN 100
> too.

That is correct and my initial thought was to add something like this:

if (netif_is_bridge_master(dev) && lan966x->bridge != dev)
return 0;

> The reverse is true as well: any removal of a bridge from a VLAN
> would also cause your CPU port to stop being in that VLAN, no matter
> what interfaces may be in that VLAN. How could I say this... "spooky
> action at a distance".

But I don't need to do this if I use 'switchdev_handle_port_obj_add'

>
> > > If there's a netdevice event on a bridge, the singleton netdevice event
> > > handler can see if it is a bridge (netif_is_bridge_master), and if it
> > > is, it can crawl through the bridge's lower interfaces using
> > > netdev_for_each_lower_dev to see if there is any lan966x interface
> > > beneath it. If there isn't, nothing to do. Otherwise, you get the
> > > opportunity to do something for each port under that bridge.
> >
> > If I start to use switchdev_handle_port_obj_add, then as you mention
> > will get a callback for each interface under the port and then I need to
> > look in obj->orig_dev to see if it was a bridge or was a port that was
> > part of the bridge.
>
> Oh yes of course. And right now you don't need that because? You think
> you get notifications only of switchdev events emitted by bridges that
> you have a port in?

Actually I need it, it was just a comment.

>
> > If I don't use switchdev_handle_port_obj_add and implement own function
> > then there is no way to get to the lan966x instance.
>
> The switchdev_handle_port_obj_add() function isn't magic, and it has an
> actual public implementation, too. Sure you can get to the lan966x
> instance even if you don't use switchdev_handle_port_obj_add() -
> although, it is there for people to use it.

Yes, I can use switchdev_handle_port_obj_add and still get access to
bridge device and to lan966x instance.

>
> > I will get two callbacks but then they can be filtered based on the
> > bridge. If I use switchdev_handle_port_obj_add then if I have 2 ports
> > under the bridge, both ports will be called which will do the same
> > work anyway.
>
> And that's a good thing, if you actually think about how you design
> things to actually work. Please consider that you have two distinct
> events: you can join a bridge that is in a VLAN, or the bridge can join
> that VLAN while you have some ports under it. The invariant is that your
> CPU port needs to be in that VLAN only for as long as there is any port
> under that bridge. So it is actually beneficial to use the
> switchdev_handle_* helper. It tells you how many users of the CPU VLAN
> rule there still are. It would be broken to delete it right away, when a
> port leaves the bridge. It would also be broken to not delete it after
> all ports leave: the bridge may have a longer lifetime than the lan966x
> ports beneath them, so there may not be any deletion event that you
> should expect.

I have already some logic in the driver regarding this. To see how many
ports are part of a vlan and also in which vlan is the CPU.

>
> > So I am not sure how much I will benefit of using
> > switchdev_handle_port_obj_add in this case.
> >
> > One important observation in the driver is that I am separating these 2
> > cases:
> >
> > bridge vlan add dev brA vid 300 self
> > bridge vlan add dev ethA0 vid 400
>
> Understood, and that's ok. But I'm not convinced it works, though.

I think it would work :)

>
> > > Maybe I'm not understanding what you're trying to accomplish that's different?
> >
> > The reason is that I want to use brA to control the CPU port. To decide
> > which frames to be copy to the CPU. Also to copy as few as possible
> > frames to CPU.
> >
> > If we still want to go with the approach of using a singleton notifier
> > block, then we will still have a problem for netdevice notifier block.
> > We will get the same issue, can't get to lan966x instance in case the
> > lan966x callback is called for a different device. And we need this for
> > the following case:
> >
> > If for example eth0, eth1 are part of a different IP and eth2, eth3 are
> > part of lan966x. We would like not to be able to put under the same
> > bridge interfaces that are part of different IPs (more precisely,
> > lan966x interfaces can be only under a bridge where lan966x interfaces
> > are part).
> >
> > For example the following command should fail:
> > ip link add name br0 type bridge
> > ip link set dev eth0 master br0
> > ip link set dev eth2 master br0
> >
> > Also the this command should fail:
> > ip link add name br0 type bridge
> > ip link set dev eth2 master br0
> > ip link set dev eth0 master br0
> >
> > But the following should be accepted:
> > ip link add name br0 type bridge
> > ip link set dev eth0 master br0
> > ip link set dev eth1 master br0
> > ip link add name br1 type bridge
> > ip link set dev eth2 master br1
> > ip link set dev eth3 master br1
>
> You can track NETDEV_PRECHANGEUPPER and deny that, and also provide an
> extack with a reason. That should work, it's been tested.

Yes and that is what I was doing. But I had to keep a list of bridges to
see any of the bridges has any foreign interfaces. The problem was that
I kept the list on the lan966x instance, but that can be fixed easily
by making it static.

>
> > Maybe I should also make it explictly that is not allow to have more
> > than one instance of lan966x for now. And once is needed then add
> > support for it.
>
> I don't necessarily see the reason for this, but ok. I don't think you
> should view things as "support for parallel instances of the driver is
> what's complicating the implementation", but rather "catching the events
> in all the permutations that they can happen in is what this driver
> needs to provide a good user experience".

So long story short, I agree with your comments, I can make the
notifier_block as singleton objects and start to use
switchdev_handle_port_obj_add and the other variants.
In this way it should also work "parallel instances of the driver".

--
/Horatiu

2021-12-15 09:27:37

by Vladimir Oltean

[permalink] [raw]
Subject: Re: [PATCH net-next v3 6/6] net: lan966x: Add switchdev support

On Tue, Dec 14, 2021 at 03:31:29PM +0100, Horatiu Vultur wrote:
> So long story short, I agree with your comments, I can make the
> notifier_block as singleton objects and start to use
> switchdev_handle_port_obj_add and the other variants.
> In this way it should also work "parallel instances of the driver".

To be clear, what I'd like to see for v5 is a set of patterns that is
simple, clean and functional enough that it could be copied by other
drivers - I'd be interested in limiting the VLANs on the CPU ports in
DSA too.