2020-11-26 00:04:19

by Lukasz Majewski

[permalink] [raw]
Subject: [RFC 0/4] net: l2switch: Provide support for L2 switch on i.MX28 SoC

This is the first attempt to add support for L2 switch available on some NXP
devices - i.e. iMX287 or VF610. This patch set uses common FEC and DSA code.

This code provides _very_ basic switch functionality (packets are passed
between lan1 and lan2 ports and it is possible to send packets via eth0),
at its main purpose is to establish the way of reusing the FEC driver. When
this is done, one can add more advanced features to the switch (like vlan or
port separation).

I also do have a request for testing on e.g. VF610 if this driver works on
it too.
The L2 switch documentation is very scant on NXP's User Manual [0] and most
understanding of how it really works comes from old (2.6.35) NXP driver [1].
The aforementioned old driver [1] was monolitic and now this patch set tries
to mix FEC and DSA.

Open issues:
- I do have a hard time on understanding how to "disable" ENET-MAC{01} ports
in DSA (via port_disable callback in dsa_switch_ops).
When I disable L2 switch port1,2 or the ENET-MAC{01} in control register, I
cannot simply re-enable it with enabling this bit again. The old driver reset
(and setup again) the whole switch.

- The L2 switch is part of the SoC silicon, so we cannot follow the "normal" DSA
pattern with "attaching" it via mdio device. The switch reuses already well
defined ENET-MAC{01}. For that reason the MoreThanIP switch driver is
registered as platform device

- The question regarding power management - at least for my use case there
is no need for runtime power management. The L2 switch shall work always at
it connects other devices.

- The FEC clock is also used for L2 switch management and configuration (as
the L2 switch is just in the same, large IP block). For now I just keep it
enabled so DSA code can use it. It looks a bit problematic to export
fec_enet_clk_enable() to be reused on DSA code.

Links:
[0] - "i.MX28 Applications Processor Reference Manual, Rev. 2, 08/2013"
[1] - https://github.com/lmajewski/linux-imx28-l2switch/commit/e3c7a6eab73401e021aef0070e1935a0dba84fb5

Dependencies:
This patch set depends on one, which adds DTS for XEA board. However, it shall
be also possible to work on any board by adding L2 switch specific description.

https://marc.info/?l=devicetree&m=160632122703785&w=2
https://marc.info/?l=devicetree&m=160632122303783&w=2
https://marc.info/?l=devicetree&m=160632123203787&w=2

Those patches has been tested (applied) on 4.9.130-cip and v5.9 (vanila
mainline kernel)


Lukasz Majewski (4):
net: fec: Move some defines to ./drivers/net/ethernet/freescale/fec.h
header
net: dsa: Provide DSA driver for NXP's More Than IP L2 switch
net: imx: l2switch: Adjust fec_main.c to provide support for L2 switch
ARM: dts: imx28: Add description for L2 switch on XEA board

arch/arm/boot/dts/imx28-xea.dts | 55 +++
drivers/net/dsa/Kconfig | 11 +
drivers/net/dsa/Makefile | 1 +
drivers/net/dsa/mtip-l2switch.c | 399 ++++++++++++++++++++++
drivers/net/dsa/mtip-l2switch.h | 239 +++++++++++++
drivers/net/ethernet/freescale/fec.h | 42 +++
drivers/net/ethernet/freescale/fec_main.c | 148 ++++++--
7 files changed, 874 insertions(+), 21 deletions(-)
create mode 100644 drivers/net/dsa/mtip-l2switch.c
create mode 100644 drivers/net/dsa/mtip-l2switch.h

--
2.20.1


2020-11-26 00:05:08

by Lukasz Majewski

[permalink] [raw]
Subject: [RFC 3/4] net: imx: l2switch: Adjust fec_main.c to provide support for L2 switch

This patch provides the code for re-using generic FEC i.MX code for L2
switch.

The trick here is to set eth0 controller as a "fixed-link" and then use
DSA subsystem to create lan{12} network devices for ports. The internal
connection diagram can be found here [0].

This code has been developed on i.MX287 device, but L2 switch (from More
ThanIP) can be also found on e.g. NXP's Vybrid VF610 SoC.

To reuse as much code as possible it was also necessary to introduce
fec_hwp() wrapper, which either returns pointer to MAC or L2 switch
interrupt and buffer descriptors.
Following registers shall be used with it:
FEC_{IEVENT|IMASK}, FEC_R_DES_ACTIVE_0, FEC_X_DES_ACTIVE_0,
FEC_R_DES_START_0, FEC_X_DES_START_0

The biggest issue was to divide the legacy driver's code (from 2.6.35 [1])
between contemporary kernel FEC driver and DSA subsystem [2].

Only the code for setting up interrupts is protected by
#ifdef CONFIG_NET_DSA_MTIP_L2SW

In other cases the fep->l2switch flag is checked (and the code is compiled
into the driver) as there are only a few alterations (i.e. readl/writel)
to the generic FEC code.

The most intrusive changes when L2 switch is used:
- FEC clock enabled
It is necessary to allow DSA normal operation (it shall be enabled as
long as L2 switch is used)

- Disable RACC (Receive Data Accelerator)
When this feature is enabled the data is shifted by 16 bits and
hence received packets are malformed when we try to pass data from one
switch port to another.

- Enable PROMISC mode for ENET-MAC{01} ports

- L2 switch shall pass the whole pkt_len packet. Current FEC code
subtract 4 to remove FCS

Links:
[0] - Chapter 29.3.1 in
"i.MX28 Applications Processor Reference Manual, Rev. 2, 08/2013"
[1]
From 2.6 i.MX kernel:
Repo: git://git.freescale.com/imx/linux-2.6-imx.git
Branch: imx_2.6.35_maintain
SHA1: b3912bb8a4caf3ec50909135e88af959982c43ca

[2] - https://github.com/lmajewski/linux-imx28-l2switch/commits/master

Signed-off-by: Lukasz Majewski <[email protected]>
---
drivers/net/ethernet/freescale/fec.h | 39 ++++++
drivers/net/ethernet/freescale/fec_main.c | 145 +++++++++++++++++++---
2 files changed, 166 insertions(+), 18 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h
index c555a421f647..acca7cbf9e6b 100644
--- a/drivers/net/ethernet/freescale/fec.h
+++ b/drivers/net/ethernet/freescale/fec.h
@@ -29,8 +29,13 @@
*/
#define FEC_IEVENT 0x004 /* Interrupt event reg */
#define FEC_IMASK 0x008 /* Interrupt mask reg */
+#ifdef CONFIG_NET_DSA_MTIP_L2SW
+#define FEC_R_DES_ACTIVE_0 0x018 /* L2 switch Receive descriptor reg */
+#define FEC_X_DES_ACTIVE_0 0x01C /* L2 switch Transmit descriptor reg */
+#else
#define FEC_R_DES_ACTIVE_0 0x010 /* Receive descriptor reg */
#define FEC_X_DES_ACTIVE_0 0x014 /* Transmit descriptor reg */
+#endif
#define FEC_ECNTRL 0x024 /* Ethernet control reg */
#define FEC_MII_DATA 0x040 /* MII manage frame reg */
#define FEC_MII_SPEED 0x044 /* MII speed control reg */
@@ -59,8 +64,13 @@
#define FEC_R_DES_START_2 0x16c /* Receive descriptor ring 2 */
#define FEC_X_DES_START_2 0x170 /* Transmit descriptor ring 2 */
#define FEC_R_BUFF_SIZE_2 0x174 /* Maximum receive buff ring2 size */
+#ifdef CONFIG_NET_DSA_MTIP_L2SW
+#define FEC_R_DES_START_0 0x0C /* L2 switch Receive descriptor ring */
+#define FEC_X_DES_START_0 0x10 /* L2 switch Transmit descriptor ring */
+#else
#define FEC_R_DES_START_0 0x180 /* Receive descriptor ring */
#define FEC_X_DES_START_0 0x184 /* Transmit descriptor ring */
+#endif
#define FEC_R_BUFF_SIZE_0 0x188 /* Maximum receive buff size */
#define FEC_R_FIFO_RSFL 0x190 /* Receive FIFO section full threshold */
#define FEC_R_FIFO_RSEM 0x194 /* Receive FIFO section empty threshold */
@@ -363,13 +373,25 @@ struct bufdesc_ex {
#define FEC_ENET_BABR ((uint)0x40000000) /* Babbling receiver */
#define FEC_ENET_BABT ((uint)0x20000000) /* Babbling transmitter */
#define FEC_ENET_GRA ((uint)0x10000000) /* Graceful stop complete */
+#ifdef CONFIG_NET_DSA_MTIP_L2SW
+#define FEC_ENET_TXF_0 BIT(4)
+#define FEC_ENET_TXF_1 0
+#define FEC_ENET_TXF_2 0
+#else
#define FEC_ENET_TXF_0 ((uint)0x08000000) /* Full frame transmitted */
#define FEC_ENET_TXF_1 ((uint)0x00000008) /* Full frame transmitted */
#define FEC_ENET_TXF_2 ((uint)0x00000080) /* Full frame transmitted */
+#endif
#define FEC_ENET_TXB ((uint)0x04000000) /* A buffer was transmitted */
+#ifdef CONFIG_NET_DSA_MTIP_L2SW
+#define FEC_ENET_RXF_0 BIT(2)
+#define FEC_ENET_RXF_1 0
+#define FEC_ENET_RXF_2 0
+#else
#define FEC_ENET_RXF_0 ((uint)0x02000000) /* Full frame received */
#define FEC_ENET_RXF_1 ((uint)0x00000002) /* Full frame received */
#define FEC_ENET_RXF_2 ((uint)0x00000020) /* Full frame received */
+#endif
#define FEC_ENET_RXB ((uint)0x01000000) /* A buffer was received */
#define FEC_ENET_MII ((uint)0x00800000) /* MII interrupt */
#define FEC_ENET_EBERR ((uint)0x00400000) /* SDMA bus error */
@@ -380,6 +402,7 @@ struct bufdesc_ex {
#define FEC_ENET_TS_TIMER ((uint)0x00008000)

#define FEC_DEFAULT_IMASK (FEC_ENET_TXF | FEC_ENET_RXF)
+#define FEC_NAPI_IMASK FEC_ENET_MII
#define FEC_RX_DISABLED_IMASK (FEC_DEFAULT_IMASK & (~FEC_ENET_RXF))

/* ENET interrupt coalescing macro define */
@@ -587,9 +610,25 @@ struct fec_enet_private {
int pps_enable;
unsigned int next_counter;

+ /* More Than IP L2 switch */
+ void __iomem *hwpsw;
+ bool l2switch;
+
u64 ethtool_stats[];
};

+/* L2 switch */
+#define L2SW_CTRL_REG_OFFSET (0x4)
+/* Get proper base address for either L2 switch or MAC ENET */
+static inline void __iomem *fec_hwp(struct fec_enet_private *fep)
+{
+#ifdef CONFIG_NET_DSA_MTIP_L2SW
+ if (fep->l2switch)
+ return fep->hwpsw;
+#endif
+ return fep->hwp;
+}
+
void fec_ptp_init(struct platform_device *pdev, int irq_idx);
void fec_ptp_stop(struct platform_device *pdev);
void fec_ptp_start_cyclecounter(struct net_device *ndev);
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index bd29c84fd89a..32dbb1d18785 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -65,10 +65,12 @@
#include <linux/mfd/syscon.h>
#include <linux/regmap.h>
#include <soc/imx/cpuidle.h>
+#include <net/dsa.h>

#include <asm/cacheflush.h>

#include "fec.h"
+#include "../../dsa/mtip-l2switch.h"

static void set_multicast_list(struct net_device *ndev);
static void fec_enet_itr_coal_init(struct net_device *ndev);
@@ -103,6 +105,11 @@ static const struct fec_devinfo fec_imx28_info = {
FEC_QUIRK_HAS_FRREG,
};

+static const struct fec_devinfo fec_imx28l2sw_info = {
+ .quirks = FEC_QUIRK_ENET_MAC | FEC_QUIRK_SWAP_FRAME |
+ FEC_QUIRK_SINGLE_MDIO | FEC_QUIRK_HAS_FRREG,
+};
+
static const struct fec_devinfo fec_imx6q_info = {
.quirks = FEC_QUIRK_ENET_MAC | FEC_QUIRK_HAS_GBIT |
FEC_QUIRK_HAS_BUFDESC_EX | FEC_QUIRK_HAS_CSUM |
@@ -144,6 +151,9 @@ static struct platform_device_id fec_devtype[] = {
}, {
.name = "imx28-fec",
.driver_data = (kernel_ulong_t)&fec_imx28_info,
+ }, {
+ .name = "imx28-l2switch",
+ .driver_data = (kernel_ulong_t)&fec_imx28l2sw_info,
}, {
.name = "imx6q-fec",
.driver_data = (kernel_ulong_t)&fec_imx6q_info,
@@ -166,6 +176,7 @@ enum imx_fec_type {
IMX25_FEC = 1, /* runs on i.mx25/50/53 */
IMX27_FEC, /* runs on i.mx27/35/51 */
IMX28_FEC,
+ IMX28_L2SWITCH,
IMX6Q_FEC,
MVF600_FEC,
IMX6SX_FEC,
@@ -176,6 +187,8 @@ static const struct of_device_id fec_dt_ids[] = {
{ .compatible = "fsl,imx25-fec", .data = &fec_devtype[IMX25_FEC], },
{ .compatible = "fsl,imx27-fec", .data = &fec_devtype[IMX27_FEC], },
{ .compatible = "fsl,imx28-fec", .data = &fec_devtype[IMX28_FEC], },
+ { .compatible = "fsl,imx28-l2switch",
+ .data = &fec_devtype[IMX28_L2SWITCH], },
{ .compatible = "fsl,imx6q-fec", .data = &fec_devtype[IMX6Q_FEC], },
{ .compatible = "fsl,mvf600-fec", .data = &fec_devtype[MVF600_FEC], },
{ .compatible = "fsl,imx6sx-fec", .data = &fec_devtype[IMX6SX_FEC], },
@@ -893,7 +906,7 @@ static void fec_enet_enable_ring(struct net_device *ndev)

for (i = 0; i < fep->num_rx_queues; i++) {
rxq = fep->rx_queue[i];
- writel(rxq->bd.dma, fep->hwp + FEC_R_DES_START(i));
+ writel(rxq->bd.dma, fec_hwp(fep) + FEC_R_DES_START(i));
writel(PKT_MAXBUF_SIZE, fep->hwp + FEC_R_BUFF_SIZE(i));

/* enable DMA1/2 */
@@ -904,7 +917,7 @@ static void fec_enet_enable_ring(struct net_device *ndev)

for (i = 0; i < fep->num_tx_queues; i++) {
txq = fep->tx_queue[i];
- writel(txq->bd.dma, fep->hwp + FEC_X_DES_START(i));
+ writel(txq->bd.dma, fec_hwp(fep) + FEC_X_DES_START(i));

/* enable DMA1/2 */
if (i)
@@ -1078,6 +1091,29 @@ fec_restart(struct net_device *ndev)
}
#endif /* !defined(CONFIG_M5272) */

+ if (fep->l2switch) {
+ /*
+ * Set PROMISC mode for MAC0/1 - it is necessary for L2 switch
+ * correct operation.
+ */
+ rcntl |= MCF_FEC_RCR_PROM;
+
+ /*
+ * In fec_restart, the FEC_R_CNTRL register is only configured
+ * for MAC0 controller (when 'mac1' DTS node is NOT enabled).
+ *
+ * For L2 switch we configure only one DTS node - 'eth_switch'
+ * so the same value shall be written to MAC1 corresponding
+ * register.
+ *
+ * The port configuration is finished in the DSA's 'link_adjust'
+ * callback - which also modifies FEC_R_CNTRL register to set
+ * proper link speed.
+ */
+ writel(rcntl,
+ fep->hwp + ENET_L2_SW_PORT1_OFFSET + FEC_R_CNTRL);
+ }
+
writel(rcntl, fep->hwp + FEC_R_CNTRL);

/* Setup multicast filter. */
@@ -1109,8 +1145,12 @@ fec_restart(struct net_device *ndev)
if (fep->bufdesc_ex)
fec_ptp_start_cyclecounter(ndev);

- /* Enable interrupts we wish to service */
- if (fep->link)
+ /*
+ * Enable interrupts we wish to service - for L2 switch only
+ * MII interrupts are required for MAC{01} to allow proper
+ * PHY configuration.
+ */
+ if (!fep->l2switch && fep->link)
writel(FEC_DEFAULT_IMASK, fep->hwp + FEC_IMASK);
else
writel(0, fep->hwp + FEC_IMASK);
@@ -1407,7 +1447,7 @@ fec_enet_rx_queue(struct net_device *ndev, int budget, u16 queue_id)
unsigned short status;
struct sk_buff *skb_new = NULL;
struct sk_buff *skb;
- ushort pkt_len;
+ ushort pkt_len, l2pl;
__u8 *data;
int pkt_received = 0;
struct bufdesc_ex *ebdp = NULL;
@@ -1433,7 +1473,7 @@ fec_enet_rx_queue(struct net_device *ndev, int budget, u16 queue_id)
break;
pkt_received++;

- writel(FEC_ENET_RXF, fep->hwp + FEC_IEVENT);
+ writel(FEC_ENET_RXF, fec_hwp(fep) + FEC_IEVENT);

/* Check for errors. */
status ^= BD_ENET_RX_LAST;
@@ -1473,7 +1513,16 @@ fec_enet_rx_queue(struct net_device *ndev, int budget, u16 queue_id)
* include that when passing upstream as it messes up
* bridging applications.
*/
- is_copybreak = fec_enet_copybreak(ndev, &skb, bdp, pkt_len - 4,
+ /*
+ * When the L2 switch support is enabled the FCS is removed
+ * by it and hence the pkt_len shall be passed without
+ * substracting 4 bytes.
+ */
+ l2pl = pkt_len - 4;
+ if (fep->l2switch)
+ l2pl = pkt_len;
+
+ is_copybreak = fec_enet_copybreak(ndev, &skb, bdp, l2pl,
need_swap);
if (!is_copybreak) {
skb_new = netdev_alloc_skb(ndev, FEC_ENET_RX_FRSIZE);
@@ -1488,7 +1537,7 @@ fec_enet_rx_queue(struct net_device *ndev, int budget, u16 queue_id)
}

prefetch(skb->data - NET_IP_ALIGN);
- skb_put(skb, pkt_len - 4);
+ skb_put(skb, l2pl);
data = skb->data;

if (!is_copybreak && need_swap)
@@ -1605,12 +1654,12 @@ static bool fec_enet_collect_events(struct fec_enet_private *fep)
{
uint int_events;

- int_events = readl(fep->hwp + FEC_IEVENT);
+ int_events = readl(fec_hwp(fep) + FEC_IEVENT);

/* Don't clear MDIO events, we poll for those */
int_events &= ~FEC_ENET_MII;

- writel(int_events, fep->hwp + FEC_IEVENT);
+ writel(int_events, fec_hwp(fep) + FEC_IEVENT);

return int_events != 0;
}
@@ -1635,6 +1684,27 @@ fec_enet_interrupt(int irq, void *dev_id)
return ret;
}

+#ifdef CONFIG_NET_DSA_MTIP_L2SW
+static irqreturn_t
+l2switch_interrupt_handler(int irq, void *dev_id)
+{
+ struct net_device *ndev = dev_id;
+ struct fec_enet_private *fep = netdev_priv(ndev);
+ irqreturn_t ret = IRQ_NONE;
+
+ if (fec_enet_collect_events(fep) && fep->link) {
+ ret = IRQ_HANDLED;
+
+ if (napi_schedule_prep(&fep->napi)) {
+ /* Disable NAPI interrupts */
+ writel(0, fep->hwpsw + FEC_IMASK);
+ __napi_schedule(&fep->napi);
+ }
+ }
+ return ret;
+}
+#endif
+
static int fec_enet_rx_napi(struct napi_struct *napi, int budget)
{
struct net_device *ndev = napi->dev;
@@ -1648,7 +1718,7 @@ static int fec_enet_rx_napi(struct napi_struct *napi, int budget)

if (done < budget) {
napi_complete_done(napi, done);
- writel(FEC_DEFAULT_IMASK, fep->hwp + FEC_IMASK);
+ writel(FEC_DEFAULT_IMASK, fec_hwp(fep) + FEC_IMASK);
}

return done;
@@ -3065,6 +3135,10 @@ static void set_multicast_list(struct net_device *ndev)
unsigned char hash;
unsigned int hash_high = 0, hash_low = 0;

+ /* DSA subsystem will handle multicast/broadcast setup for L2 switch */
+ if (fep->l2switch)
+ return;
+
if (ndev->flags & IFF_PROMISC) {
tmp = readl(fep->hwp + FEC_R_CNTRL);
tmp |= 0x8;
@@ -3279,7 +3353,8 @@ static int fec_enet_init(struct net_device *ndev)
rxq->bd.dma = bd_dma;
rxq->bd.dsize = dsize;
rxq->bd.dsize_log2 = dsize_log2;
- rxq->bd.reg_desc_active = fep->hwp + offset_des_active_rxq[i];
+ rxq->bd.reg_desc_active =
+ fec_hwp(fep) + offset_des_active_rxq[i];
bd_dma += size;
cbd_base = (struct bufdesc *)(((void *)cbd_base) + size);
rxq->bd.last = (struct bufdesc *)(((void *)cbd_base) - dsize);
@@ -3295,7 +3370,8 @@ static int fec_enet_init(struct net_device *ndev)
txq->bd.dma = bd_dma;
txq->bd.dsize = dsize;
txq->bd.dsize_log2 = dsize_log2;
- txq->bd.reg_desc_active = fep->hwp + offset_des_active_txq[i];
+ txq->bd.reg_desc_active =
+ fec_hwp(fep) + offset_des_active_txq[i];
bd_dma += size;
cbd_base = (struct bufdesc *)(((void *)cbd_base) + size);
txq->bd.last = (struct bufdesc *)(((void *)cbd_base) - dsize);
@@ -3306,8 +3382,7 @@ static int fec_enet_init(struct net_device *ndev)
ndev->watchdog_timeo = TX_TIMEOUT;
ndev->netdev_ops = &fec_netdev_ops;
ndev->ethtool_ops = &fec_enet_ethtool_ops;
-
- writel(FEC_RX_DISABLED_IMASK, fep->hwp + FEC_IMASK);
+ writel(FEC_RX_DISABLED_IMASK, fec_hwp(fep) + FEC_IMASK);
netif_napi_add(ndev, &fep->napi, fec_enet_rx_napi, NAPI_POLL_WEIGHT);

if (fep->quirks & FEC_QUIRK_HAS_VLAN)
@@ -3544,6 +3619,23 @@ fec_probe(struct platform_device *pdev)
fep->pdev = pdev;
fep->dev_id = dev_id++;

+ if (of_device_is_compatible(np, "fsl,imx28-l2switch")) {
+ fep->l2switch = true;
+ fep->hwpsw = devm_platform_ioremap_resource(pdev, 1);
+ /*
+ * MAC{01} interrupt and descriptors registers have 4 bytes
+ * offset when compared to L2 switch IP block.
+ *
+ * When L2 switch is added "between" ENET (eth0) and MAC{01}
+ * the code for interrupts and setting up descriptors is
+ * reused.
+ *
+ * As for example FEC_IMASK are used also on MAC{01} to
+ * perform MII transmission it is better to subtract the
+ * offset from the outset and reuse defines.
+ */
+ fep->hwpsw -= L2SW_CTRL_REG_OFFSET;
+ }
platform_set_drvdata(pdev, ndev);

if ((of_machine_is_compatible("fsl,imx6q") ||
@@ -3669,8 +3761,16 @@ fec_probe(struct platform_device *pdev)
ret = irq;
goto failed_irq;
}
- ret = devm_request_irq(&pdev->dev, irq, fec_enet_interrupt,
- 0, pdev->name, ndev);
+#ifdef CONFIG_NET_DSA_MTIP_L2SW
+ if (fep->l2switch && i == 0)
+ ret = devm_request_irq(&pdev->dev, irq,
+ l2switch_interrupt_handler,
+ 0, "l2switch", ndev);
+ else
+#endif
+ ret = devm_request_irq(&pdev->dev, irq,
+ fec_enet_interrupt,
+ 0, pdev->name, ndev);
if (ret)
goto failed_irq;

@@ -3683,7 +3783,16 @@ fec_probe(struct platform_device *pdev)

/* Carrier starts down, phylib will bring it up */
netif_carrier_off(ndev);
- fec_enet_clk_enable(ndev, false);
+ /*
+ * For L2 switch proper initialization this clk cannot be disabled,
+ * as it uses is to access shared MAC registers for proper MDIO
+ * operation.
+ *
+ * DSA switch will also use it afterwards to setup its ports.
+ */
+ if (!fep->l2switch)
+ fec_enet_clk_enable(ndev, false);
+
pinctrl_pm_select_sleep_state(&pdev->dev);

ndev->max_mtu = PKT_MAXBUF_SIZE - ETH_HLEN - ETH_FCS_LEN;
--
2.20.1

2020-11-26 00:06:46

by Lukasz Majewski

[permalink] [raw]
Subject: [RFC 2/4] net: dsa: Provide DSA driver for NXP's More Than IP L2 switch

This change provides driver for DSA (Distributed Switch Architecture)
subsystem for i.MX28 SoC (imx287 to be precise).

This code just is responsible for configuring this device as L2 bridge
(no vlan, port separation supported).

This driver shall be regarded as a complementary one for NXP's FEC
(fec_main.c).

Signed-off-by: Lukasz Majewski <[email protected]>
---
drivers/net/dsa/Kconfig | 11 +
drivers/net/dsa/Makefile | 1 +
drivers/net/dsa/mtip-l2switch.c | 399 ++++++++++++++++++++++++++++++++
drivers/net/dsa/mtip-l2switch.h | 239 +++++++++++++++++++
4 files changed, 650 insertions(+)
create mode 100644 drivers/net/dsa/mtip-l2switch.c
create mode 100644 drivers/net/dsa/mtip-l2switch.h

diff --git a/drivers/net/dsa/Kconfig b/drivers/net/dsa/Kconfig
index 468b3c4273c5..52013eb5afb3 100644
--- a/drivers/net/dsa/Kconfig
+++ b/drivers/net/dsa/Kconfig
@@ -132,4 +132,15 @@ config NET_DSA_VITESSE_VSC73XX_PLATFORM
This enables support for the Vitesse VSC7385, VSC7388, VSC7395
and VSC7398 SparX integrated ethernet switches, connected over
a CPU-attached address bus and work in memory-mapped I/O mode.
+
+config NET_DSA_MTIP_L2SW
+ tristate "MoreThanIP L2 switch support (IMX)"
+ depends on OF
+ depends on NET_DSA
+ depends on ARCH_MXS
+ select FIXED_PHY
+ help
+ This enables support for the MoreThan IP on i.MX SoCs (e.g. iMX28)
+ L2 switch.
+
endmenu
diff --git a/drivers/net/dsa/Makefile b/drivers/net/dsa/Makefile
index 4a943ccc2ca4..b947dc54db09 100644
--- a/drivers/net/dsa/Makefile
+++ b/drivers/net/dsa/Makefile
@@ -17,6 +17,7 @@ obj-$(CONFIG_NET_DSA_SMSC_LAN9303_MDIO) += lan9303_mdio.o
obj-$(CONFIG_NET_DSA_VITESSE_VSC73XX) += vitesse-vsc73xx-core.o
obj-$(CONFIG_NET_DSA_VITESSE_VSC73XX_PLATFORM) += vitesse-vsc73xx-platform.o
obj-$(CONFIG_NET_DSA_VITESSE_VSC73XX_SPI) += vitesse-vsc73xx-spi.o
+obj-$(CONFIG_NET_DSA_MTIP_L2SW) += mtip-l2switch.o
obj-y += b53/
obj-y += microchip/
obj-y += mv88e6xxx/
diff --git a/drivers/net/dsa/mtip-l2switch.c b/drivers/net/dsa/mtip-l2switch.c
new file mode 100644
index 000000000000..2ecdf82ccd6c
--- /dev/null
+++ b/drivers/net/dsa/mtip-l2switch.c
@@ -0,0 +1,399 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020 Lukasz Majewski <[email protected]>
+ */
+
+#include <linux/module.h>
+#include <linux/phy.h>
+#include <linux/netdevice.h>
+#include <net/dsa.h>
+#include <linux/of_net.h>
+#include <linux/of_platform.h>
+#include <linux/if_bridge.h>
+#include <linux/mdio.h>
+#include <linux/of_mdio.h>
+#include <linux/etherdevice.h>
+
+#include "mtip-l2switch.h"
+#include "../ethernet/freescale/fec.h"
+
+static int mtipl2_en_rx(struct net_device *ndev)
+{
+ struct fec_enet_private *fep = netdev_priv(ndev);
+
+ if (!fep->hwpsw)
+ return -ENODEV;
+
+ writel(MCF_ESW_RDAR_R_DES_ACTIVE, fep->hwpsw + MCF_ESW_R_RDAR);
+
+ return 0;
+}
+
+static int mtipl2_set_rx_buf_size(struct net_device *ndev)
+{
+ struct fec_enet_private *fep = netdev_priv(ndev);
+
+ if (!fep->hwpsw)
+ return -ENODEV;
+
+ /* Set the max rx buffer size */
+ writel(L2SW_PKT_MAXBLR_SIZE, fep->hwpsw + MCF_ESW_R_BUFF_SIZE);
+
+ return 0;
+}
+
+static int mtipl2_set_ints(struct net_device *ndev)
+{
+ struct fec_enet_private *fep = netdev_priv(ndev);
+
+ if (!fep->hwpsw)
+ return -ENODEV;
+
+ /* Clear all pending interrupts */
+ writel(0xffffffff, fep->hwpsw + FEC_IEVENT);
+
+ /* Enable interrupts we wish to service */
+ writel(MCF_ESW_IMR_TXF | MCF_ESW_IMR_RXB | MCF_ESW_IMR_TXB,
+ fep->hwpsw + FEC_IMASK);
+
+ return 0;
+}
+
+static void read_atable(struct mtipl2sw_priv *priv, int index,
+ unsigned long *read_lo, unsigned long *read_hi)
+{
+ unsigned long atable_base = (unsigned long)priv->hwentry;
+
+ *read_lo = readl((const volatile void*)atable_base + (index << 3));
+ *read_hi = readl((const volatile void*)atable_base + (index << 3) + 4);
+}
+
+static void write_atable(struct mtipl2sw_priv *priv, int index,
+ unsigned long write_lo, unsigned long write_hi)
+{
+ unsigned long atable_base = (unsigned long)priv->hwentry;
+
+ writel(write_lo, (volatile void *)atable_base + (index << 3));
+ writel(write_hi, (volatile void *)atable_base + (index << 3) + 4);
+}
+
+/*
+ * Clear complete MAC Look Up Table
+ */
+static void esw_clear_atable(struct mtipl2sw_priv *priv)
+{
+ int index;
+ for (index = 0; index < 2048; index++)
+ write_atable(priv, index, 0, 0);
+}
+
+static void l2_enet_set_mac(struct net_device *ndev, void __iomem *hwp)
+{
+ writel(ndev->dev_addr[3] |
+ ndev->dev_addr[2] << 8 |
+ ndev->dev_addr[1] << 16 |
+ ndev->dev_addr[0] << 24,
+ hwp + FEC_ADDR_LOW);
+
+ writel(ndev->dev_addr[5] << 16 |
+ (ndev->dev_addr[4] + (unsigned char)(0)) << 24,
+ hwp + FEC_ADDR_HIGH);
+}
+
+static void l2_enet_reset(struct net_device *ndev, int port,
+ struct phy_device *phy)
+{
+ struct fec_enet_private *fep = netdev_priv(ndev);
+ void __iomem *hwp = fep->hwp;
+
+ if (port == 1)
+ hwp += ENET_L2_SW_PORT1_OFFSET;
+
+ /* ECR */
+ writel(FEC_ECR_MAGICEN, hwp + FEC_ECNTRL);
+ /* EMRBR */
+ writel(L2SW_PKT_MAXBLR_SIZE, hwp + MCF_FEC_EMRBR);
+
+ /*
+ * PHY speed for both MACs (via mac0 adjustement is setup
+ * in fec_restart()
+ */
+
+ /* EIR */
+ writel(0, hwp + FEC_IEVENT);
+ /* IAUR */
+ writel(0, hwp + FEC_HASH_TABLE_HIGH);
+ /* IALR */
+ writel(0, hwp + FEC_HASH_TABLE_LOW);
+ /* GAUR */
+ writel(0, hwp + FEC_GRP_HASH_TABLE_HIGH);
+ /* GALR */
+ writel(0, hwp + FEC_GRP_HASH_TABLE_LOW);
+ /* EMRBR */
+ writel(L2SW_PKT_MAXBLR_SIZE, hwp + MCF_FEC_EMRBR);
+
+ /* Set the MAC address for the LAN{12} net device */
+ l2_enet_set_mac(ndev, hwp);
+
+ /* RCR */
+ if (phy && phy->speed == SPEED_100) {
+ if (phy->duplex == DUPLEX_FULL) {
+ /* full duplex mode */
+ writel((readl(hwp + FEC_R_CNTRL)
+ | MCF_FEC_RCR_FCE | MCF_FEC_RCR_PROM)
+ & ~(MCF_FEC_RCR_RMII_10BASET |
+ MCF_FEC_RCR_DRT),
+ hwp + FEC_R_CNTRL);
+ } else {
+ /* half duplex mode */
+ writel((readl(hwp + FEC_R_CNTRL)
+ | MCF_FEC_RCR_FCE | MCF_FEC_RCR_PROM
+ | MCF_FEC_RCR_DRT)
+ & ~(MCF_FEC_RCR_RMII_10BASET),
+ hwp + FEC_R_CNTRL);
+ }
+ } else {
+ if (phy->duplex == DUPLEX_FULL) {
+ /* full duplex mode */
+ writel((readl(hwp + FEC_R_CNTRL)
+ | MCF_FEC_RCR_FCE | MCF_FEC_RCR_PROM
+ | MCF_FEC_RCR_RMII_10BASET) &
+ ~(MCF_FEC_RCR_DRT),
+ hwp + FEC_R_CNTRL);
+ } else {
+ /* half duplex mode */
+ writel(readl(hwp + FEC_R_CNTRL)
+ | MCF_FEC_RCR_FCE | MCF_FEC_RCR_PROM
+ | MCF_FEC_RCR_RMII_10BASET | MCF_FEC_RCR_DRT,
+ hwp + FEC_R_CNTRL);
+ }
+ }
+
+ /*
+ * MII Gasket configuration is performed for MAC0 (which shall be
+ * correct for both MACs) in fec_restart().
+ *
+ * Configuration: RMII, 50 MHz, no loopback, no echo
+ */
+
+ /* TCR */
+ if (phy->duplex == DUPLEX_FULL)
+ writel(0x1c, hwp + FEC_X_CNTRL);
+ else
+ writel(0x18, hwp + FEC_X_CNTRL);
+
+ /* ECR */
+ writel(readl(hwp + FEC_ECNTRL) | FEC_ECNTRL_ETHER_EN,
+ hwp + FEC_ECNTRL);
+}
+
+static enum dsa_tag_protocol
+mtipl2_get_tag_protocol(struct dsa_switch *ds, int port,
+ enum dsa_tag_protocol mp)
+{
+ return DSA_TAG_PROTO_NONE;
+}
+
+static int mtip_sw_config(struct dsa_switch *ds)
+{
+ struct mtipl2sw_priv *priv = (struct mtipl2sw_priv *)ds->priv;
+ const struct dsa_port *pm = dsa_to_port(ds, priv->cpu_port);
+
+ /*
+ * L2 switch - reset
+ */
+ writel(MCF_ESW_MODE_SW_RST, &priv->fecp->ESW_MODE);
+ udelay(10);
+
+ /* Configure switch*/
+ writel(MCF_ESW_MODE_STATRST, &priv->fecp->ESW_MODE);
+ writel(MCF_ESW_MODE_SW_EN, &priv->fecp->ESW_MODE);
+
+ /*
+ * Transmit/Receive on ports 1,2 (including port 0)
+ * will be enabled when DSA port is enabled
+ */
+
+ /* Management port configuration, make port 0 as management port */
+ writel(0, &priv->fecp->ESW_BMPC);
+
+ /*
+ * Set backpressure threshold to minimize discarded frames
+ * during due to congestion.
+ */
+ writel(P0BC_THRESHOLD, &priv->fecp->ESW_P0BCT);
+
+ mtipl2_set_rx_buf_size(pm->master);
+
+ /* Enable broadcast on all ports */
+ writel(0x7, &priv->fecp->ESW_DBCR);
+
+ /* Enable multicast on all ports */
+ writel(0x7, &priv->fecp->ESW_DMCR);
+
+ esw_clear_atable(priv);
+ mtipl2_set_ints(pm->master);
+
+ return 0;
+}
+
+static int mtipl2_setup(struct dsa_switch *ds)
+{
+ struct mtipl2sw_priv *priv = (struct mtipl2sw_priv *)ds->priv;
+
+ /* Make sure that port 2 is the cpu port */
+ if (dsa_is_cpu_port(ds, 2))
+ priv->cpu_port = 2;
+ else
+ return -EINVAL;
+
+ mtip_sw_config(ds);
+ return 0;
+}
+
+void mtipl2_adjust_link (struct dsa_switch *ds, int port,
+ struct phy_device *phy)
+{
+ struct mtipl2sw_priv *priv = (struct mtipl2sw_priv *)ds->priv;
+ const struct dsa_port *pm = dsa_to_port(ds, priv->cpu_port);
+
+ /* Set port speed */
+ switch (phy->speed) {
+ case 10:
+ pr_err("fec: %s: PHY speed 10 [%d]\n", __func__, port);
+ break;
+ case 100:
+ pr_err("fec: %s: PHY speed 100 [%d]\n", __func__, port);
+ break;
+ default:
+ pr_err("port%d link speed %dMbps not supported.\n", port,
+ phy->speed);
+ return;
+ }
+
+ /* Set duplex mode */
+ if (phy->duplex == DUPLEX_FULL)
+ pr_err("fec: %s: LINK UPDATE DUPLEX_FULL [%d]\n", __func__,
+ port);
+
+ /* Reset and configure L2 switch port */
+ l2_enet_reset(pm->master, port, phy);
+
+ /*
+ * The L2 switch R_RDAR register needs to be updated
+ * after the port is enabled.
+ */
+ mtipl2_en_rx(pm->master);
+}
+
+int mtipl2_port_enable (struct dsa_switch *ds, int port,
+ struct phy_device *phydev)
+{
+ struct mtipl2sw_priv *priv = (struct mtipl2sw_priv *)ds->priv;
+ struct dsa_port *pm = dsa_to_port(ds, priv->cpu_port);
+ u32 l2_ports_en;
+
+ pr_err("fec: %s: PORT ENABLE %d\n", __func__, port);
+
+ /* The CPU port shall be always enabled - as it is a "fixed-link" one */
+ if (port == priv->cpu_port || priv->cpu_port == -1)
+ return 0;
+
+ /* Enable tx/rx on L2 switch ports */
+ l2_ports_en = readl(&priv->fecp->ESW_PER);
+ if (!(l2_ports_en & MCF_ESW_ENA_PORT_0))
+ l2_ports_en = MCF_ESW_ENA_PORT_0;
+
+ if (port == 0 && !(l2_ports_en & MCF_ESW_ENA_PORT_1))
+ l2_ports_en |= MCF_ESW_ENA_PORT_1;
+
+ if (port == 1 && !(l2_ports_en & MCF_ESW_ENA_PORT_2))
+ l2_ports_en |= MCF_ESW_ENA_PORT_2;
+
+ writel(l2_ports_en, &priv->fecp->ESW_PER);
+
+ /*
+ * Check if MAC IP block is already enabled (after switch initializtion)
+ * or if we need to enable it after mtipl2_port_disable was called.
+ */
+
+ return 0;
+}
+
+static void mtipl2_port_disable (struct dsa_switch *ds, int port)
+{
+ struct mtipl2sw_priv *priv = (struct mtipl2sw_priv *)ds->priv;
+ struct dsa_port *pm = dsa_to_port(ds, priv->cpu_port);
+ u32 l2_ports_en;
+
+ pr_err("fec: %s: PORT DISABLE %d\n", __func__, port);
+
+ l2_ports_en = readl(&priv->fecp->ESW_PER);
+ if (port == 0)
+ l2_ports_en &= ~MCF_ESW_ENA_PORT_1;
+
+ if (port == 1)
+ l2_ports_en &= ~MCF_ESW_ENA_PORT_2;
+
+ /* Finally disable tx/rx on port 0 */
+ if (!(l2_ports_en & MCF_ESW_ENA_PORT_1) &&
+ !(l2_ports_en & MCF_ESW_ENA_PORT_2))
+ l2_ports_en &= ~MCF_ESW_ENA_PORT_0;
+
+ writel(l2_ports_en, &priv->fecp->ESW_PER);
+}
+
+static const struct dsa_switch_ops mtipl2_switch_ops = {
+ .get_tag_protocol = mtipl2_get_tag_protocol,
+ .setup = mtipl2_setup,
+ .adjust_link = mtipl2_adjust_link,
+ .port_enable = mtipl2_port_enable,
+ .port_disable = mtipl2_port_disable,
+};
+
+static int mtipl2_sw_probe(struct platform_device *pdev)
+{
+ struct mtipl2sw_priv *priv;
+ struct resource *r;
+
+ priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
+ if (!priv)
+ return -ENOMEM;
+
+ r = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ priv->fecp = devm_ioremap_resource(&pdev->dev, r);
+
+ r = platform_get_resource(pdev, IORESOURCE_MEM, 1);
+ priv->hwentry = devm_ioremap_resource(&pdev->dev, r);
+
+ priv->ds = devm_kzalloc(&pdev->dev, sizeof(*priv->ds), GFP_KERNEL);
+ if (!priv->ds)
+ return -ENOMEM;
+
+ priv->ds->dev = &pdev->dev;
+ priv->ds->num_ports = MTIPL2SW_NUM_PORTS;
+ priv->ds->priv = priv;
+ priv->ds->ops = &mtipl2_switch_ops;
+
+ return dsa_register_switch(priv->ds);
+}
+
+static const struct of_device_id mtipl2_of_match[] = {
+ { .compatible = "imx,mtip-l2switch" },
+ { /* sentinel */ },
+};
+
+static struct platform_driver mtipl2plat_driver = {
+ .probe = mtipl2_sw_probe,
+ .driver = {
+ .name = "mtipl2sw",
+ .of_match_table = mtipl2_of_match,
+ },
+};
+
+module_platform_driver(mtipl2plat_driver);
+
+MODULE_AUTHOR("Lukasz Majewski <[email protected]>");
+MODULE_DESCRIPTION("Driver for MTIP L2 on SOC switch");
+MODULE_LICENSE("GPL v2");
+MODULE_ALIAS("platform:mtipl2sw");
diff --git a/drivers/net/dsa/mtip-l2switch.h b/drivers/net/dsa/mtip-l2switch.h
new file mode 100644
index 000000000000..ea354ae99ecd
--- /dev/null
+++ b/drivers/net/dsa/mtip-l2switch.h
@@ -0,0 +1,239 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020 DENX Software Engineering GmbH
+ * Lukasz Majewski <[email protected]>
+ */
+
+#ifndef __MTIP_L2SWITCH_H_
+#define __MTIP_L2SWITCH_H_
+
+#define MTIPL2SW_NUM_PORTS 3
+
+#define ENET_SWI_PHYS_ADDR_OFFSET 0x8000
+#define ENET_L2_SW_PORT1_OFFSET 0x4000
+/* Bit definitions and macros for MCF_ESW_MODE */
+#define MCF_ESW_MODE_SW_RST BIT(0)
+#define MCF_ESW_MODE_SW_EN BIT(1)
+#define MCF_ESW_MODE_STATRST BIT(31)
+
+/* Port 0 backpressure congestion threshold */
+#define P0BC_THRESHOLD 0x40
+
+struct esw_output_queue_status {
+ unsigned long ESW_MMSR;
+ unsigned long ESW_LMT;
+ unsigned long ESW_LFC;
+ unsigned long ESW_PCSR;
+ unsigned long ESW_IOSR;
+ unsigned long ESW_QWT;
+ unsigned long esw_reserved;
+ unsigned long ESW_P0BCT;
+};
+struct esw_statistics_status {
+ /*
+ * Total number of incoming frames processed
+ * but discarded in switch
+ */
+ unsigned long ESW_DISCN;
+ /*Sum of bytes of frames counted in ESW_DISCN*/
+ unsigned long ESW_DISCB;
+ /*
+ * Total number of incoming frames processed
+ * but not discarded in switch
+ */
+ unsigned long ESW_NDISCN;
+ /*Sum of bytes of frames counted in ESW_NDISCN*/
+ unsigned long ESW_NDISCB;
+};
+
+struct esw_port_statistics_status {
+ /*outgoing frames discarded due to transmit queue congestion*/
+ unsigned long MCF_ESW_POQC;
+ /*incoming frames discarded due to VLAN domain mismatch*/
+ unsigned long MCF_ESW_PMVID;
+ /*incoming frames discarded due to untagged discard*/
+ unsigned long MCF_ESW_PMVTAG;
+ /*incoming frames discarded due port is in blocking state*/
+ unsigned long MCF_ESW_PBL;
+};
+
+struct l2switch_t {
+ unsigned long ESW_REVISION;
+ unsigned long ESW_SCRATCH;
+ unsigned long ESW_PER;
+ unsigned long reserved0[1];
+ unsigned long ESW_VLANV;
+ unsigned long ESW_DBCR;
+ unsigned long ESW_DMCR;
+ unsigned long ESW_BKLR;
+ unsigned long ESW_BMPC;
+ unsigned long ESW_MODE;
+ unsigned long ESW_VIMSEL;
+ unsigned long ESW_VOMSEL;
+ unsigned long ESW_VIMEN;
+ unsigned long ESW_VID;/*0x34*/
+ /*from 0x38 0x3C*/
+ unsigned long esw_reserved0[2];
+ unsigned long ESW_MCR;/*0x40*/
+ unsigned long ESW_EGMAP;
+ unsigned long ESW_INGMAP;
+ unsigned long ESW_INGSAL;
+ unsigned long ESW_INGSAH;
+ unsigned long ESW_INGDAL;
+ unsigned long ESW_INGDAH;
+ unsigned long ESW_ENGSAL;
+ unsigned long ESW_ENGSAH;
+ unsigned long ESW_ENGDAL;
+ unsigned long ESW_ENGDAH;
+ unsigned long ESW_MCVAL;/*0x6C*/
+ /*from 0x70--0x7C*/
+ unsigned long esw_reserved1[4];
+ unsigned long ESW_MMSR;/*0x80*/
+ unsigned long ESW_LMT;
+ unsigned long ESW_LFC;
+ unsigned long ESW_PCSR;
+ unsigned long ESW_IOSR;
+ unsigned long ESW_QWT;/*0x94*/
+ unsigned long esw_reserved2[1];/*0x98*/
+ unsigned long ESW_P0BCT;/*0x9C*/
+ /*from 0xA0-0xB8*/
+ unsigned long esw_reserved3[7];
+ unsigned long ESW_P0FFEN;/*0xBC*/
+ unsigned long ESW_PSNP[8];
+ unsigned long ESW_IPSNP[8];
+ unsigned long ESW_PVRES[3];
+ /*from 0x10C-0x13C*/
+ unsigned long esw_reserved4[13];
+ unsigned long ESW_IPRES;/*0x140*/
+ /*from 0x144-0x17C*/
+ unsigned long esw_reserved5[15];
+ unsigned long ESW_PRES[3];
+ /*from 0x18C-0x1FC*/
+ unsigned long esw_reserved6[29];
+ unsigned long ESW_PID[3];
+ /*from 0x20C-0x27C*/
+ unsigned long esw_reserved7[29];
+ unsigned long ESW_VRES[32];
+ unsigned long ESW_DISCN;/*0x300*/
+ unsigned long ESW_DISCB;
+ unsigned long ESW_NDISCN;
+ unsigned long ESW_NDISCB;/*0xFC0DC30C*/
+ struct esw_port_statistics_status port_statistics_status[3];
+ /*from 0x340-0x400*/
+ unsigned long esw_reserved8[48];
+
+ /*0xFC0DC400---0xFC0DC418*/
+ /*unsigned long MCF_ESW_ISR;*/
+ unsigned long switch_ievent; /* Interrupt event reg */
+ /*unsigned long MCF_ESW_IMR;*/
+ unsigned long switch_imask; /* Interrupt mask reg */
+ /*unsigned long MCF_ESW_RDSR;*/
+ unsigned long fec_r_des_start; /* Receive descriptor ring */
+ /*unsigned long MCF_ESW_TDSR;*/
+ unsigned long fec_x_des_start; /* Transmit descriptor ring */
+ /*unsigned long MCF_ESW_MRBR;*/
+ unsigned long fec_r_buff_size; /* Maximum receive buff size */
+ /*unsigned long MCF_ESW_RDAR;*/
+ unsigned long fec_r_des_active; /* Receive descriptor reg */
+ /*unsigned long MCF_ESW_TDAR;*/
+ unsigned long fec_x_des_active; /* Transmit descriptor reg */
+ /*from 0x420-0x4FC*/
+ unsigned long esw_reserved9[57];
+
+ /*0xFC0DC500---0xFC0DC508*/
+ unsigned long ESW_LREC0;
+ unsigned long ESW_LREC1;
+ unsigned long ESW_LSR;
+};
+
+struct AddrTable64bEntry {
+ unsigned int lo; /* lower 32 bits */
+ unsigned int hi; /* upper 32 bits */
+};
+
+struct eswAddrTable_t {
+ struct AddrTable64bEntry eswTable64bEntry[2048];
+};
+
+struct mtipl2sw_priv {
+ struct dsa_switch *ds;
+ struct device *dev;
+
+ /* CPU port number */
+ int cpu_port;
+ /* Registers to configure/setup L2 switch IP block */
+ struct l2switch_t *fecp;
+
+ /* Look-up MAC address table start from 0x800FC000 */
+ struct eswAddrTable_t *hwentry;
+};
+
+
+#define MCF_ESW_RDAR_R_DES_ACTIVE BIT(24)
+#define FEC_MIIGSK_EN BIT(2)
+
+/* Bit definitions and macros for MCF_ESW_IMR */
+#define MCF_ESW_IMR_EBERR (0x00000001)
+#define MCF_ESW_IMR_RXB (0x00000002)
+#define MCF_ESW_IMR_RXF (0x00000004)
+#define MCF_ESW_IMR_TXB (0x00000008)
+#define MCF_ESW_IMR_TXF (0x00000010)
+#define MCF_ESW_IMR_QM (0x00000020)
+#define MCF_ESW_IMR_OD0 (0x00000040)
+#define MCF_ESW_IMR_OD1 (0x00000080)
+#define MCF_ESW_IMR_OD2 (0x00000100)
+#define MCF_ESW_IMR_LRN (0x00000200)
+
+/* HW_ENET_SWI_PORT_ENA */
+#define MCF_ESW_ENA_TRANSMIT_0 BIT(0)
+#define MCF_ESW_ENA_TRANSMIT_1 BIT(1)
+#define MCF_ESW_ENA_TRANSMIT_2 BIT(2)
+
+#define MCF_ESW_ENA_RECEIVE_0 BIT(16)
+#define MCF_ESW_ENA_RECEIVE_1 BIT(17)
+#define MCF_ESW_ENA_RECEIVE_2 BIT(18)
+
+#define MCF_ESW_ENA_PORT_0 (MCF_ESW_ENA_TRANSMIT_0 | MCF_ESW_ENA_RECEIVE_0)
+#define MCF_ESW_ENA_PORT_1 (MCF_ESW_ENA_TRANSMIT_1 | MCF_ESW_ENA_RECEIVE_1)
+#define MCF_ESW_ENA_PORT_2 (MCF_ESW_ENA_TRANSMIT_2 | MCF_ESW_ENA_RECEIVE_2)
+
+#define MCF_FEC_RCR_DRT (0x00000002)
+#define MCF_FEC_RCR_PROM (0x00000008)
+#define MCF_FEC_RCR_FCE (0x00000020)
+#define MCF_FEC_RCR_RMII_MODE (0x00000100)
+#define MCF_FEC_RCR_RMII_10BASET (0x00000200)
+#define MCF_FEC_RCR_MAX_FL(x) (((x)&0x00003FFF)<<16)
+#define MCF_FEC_RCR_CRC_FWD (0x00004000)
+#define MCF_FEC_RCR_NO_LGTH_CHECK (0x40000000)
+#define MCF_FEC_TCR_FDEN (0x00000004)
+
+#define MCF_FEC_ECR_RESET (0x00000001)
+#define MCF_FEC_ECR_ETHER_EN (0x00000002)
+#define MCF_FEC_ECR_MAGIC_ENA (0x00000004)
+#define MCF_FEC_ECR_ENA_1588 (0x00000010)
+
+/* L2 switch Maximum receive buff size */
+#define MCF_ESW_R_BUFF_SIZE 0x14
+/* L2 switch Receive Descriptor Active Register */
+#define MCF_ESW_R_RDAR 0x18
+
+/*
+ * MCF_FEC_EMRBR corresponds to FEC_R_BUFF_SIZE_0 in fec main driver.
+ * It is duplicated as for L2 switch FEC_R_BUFF_SIZE_0 is aliased
+ * to different offset in switch IC. Hence the need to introduce new
+ * #define.
+ */
+#define MCF_FEC_EMRBR (0x188)
+
+#define MCF_FEC_ERDSR(x) ((x) << 2)
+
+#define MCF_ECR_ETHER_EN 1
+#define FEC_ECNTRL_ETHER_EN BIT(1)
+/*
+ * The Switch stores dest/src/type, data, and checksum for receive packets.
+ */
+#define L2SW_PKT_MAXBUF_SIZE 1518
+#define L2SW_PKT_MINBUF_SIZE 64
+#define L2SW_PKT_MAXBLR_SIZE 1520
+
+#endif /* __MTIP_L2SWITCH_H_ */
--
2.20.1

2020-11-26 00:07:13

by Lukasz Majewski

[permalink] [raw]
Subject: [RFC 4/4] ARM: dts: imx28: Add description for L2 switch on XEA board

The 'eth_switch' node is now used to enable support for L2 switch.
Moreover, a separate 'switch' node was introduced to keep the code more
clean.

Signed-off-by: Lukasz Majewski <[email protected]>
---
arch/arm/boot/dts/imx28-xea.dts | 55 +++++++++++++++++++++++++++++++++
1 file changed, 55 insertions(+)

diff --git a/arch/arm/boot/dts/imx28-xea.dts b/arch/arm/boot/dts/imx28-xea.dts
index 672080485b78..b8a896df02c0 100644
--- a/arch/arm/boot/dts/imx28-xea.dts
+++ b/arch/arm/boot/dts/imx28-xea.dts
@@ -9,6 +9,30 @@

/ {
model = "XEA";
+
+ switch {
+ compatible = "imx,mtip-l2switch";
+ reg = <0x800f8000 0x400>, <0x800fC000 0x4000>;
+ ports {
+ port0@0 {
+ reg = <0>;
+ label = "lan1";
+ phy-handle = <&ethsw0>;
+ };
+
+ port1@1 {
+ reg = <1>;
+ label = "lan2";
+ phy-handle = <&ethsw1>;
+ };
+
+ port2@2 {
+ reg = <2>;
+ label = "cpu";
+ ethernet = <&eth_switch>;
+ };
+ };
+ };
};

&can0 {
@@ -23,6 +47,37 @@
status = "okay";
};

+&eth_switch {
+ compatible = "fsl,imx28-l2switch";
+ reg = <0x800f0000 0x8000>, <0x800f8400 0x400>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&mac0_pins_a>, <&mac1_pins_a>;
+ phy-mode = "rmii";
+ phy-supply = <&reg_fec_3v3>;
+ phy-reset-gpios = <&gpio2 13 0>;
+ phy-reset-duration = <100>;
+ interrupts = <100>, <101>, <102>;
+ clocks = <&clks 57>, <&clks 57>, <&clks 64>, <&clks 35>;
+ clock-names = "ipg", "ahb", "enet_out", "ptp";
+ local-mac-address = [ 00 11 22 AA BB CC ];
+ status = "okay";
+
+ fixed-link {
+ speed = <100>;
+ full-duplex;
+ };
+
+ mdio {
+ ethsw0: ethernet-phy@0 {
+ reg = <0>;
+ };
+
+ ethsw1: ethernet-phy@1 {
+ reg = <1>;
+ };
+ };
+};
+
&pinctrl {
pinctrl-names = "default";
pinctrl-0 = <&hog_pins_a &hog_pins_tiva>;
--
2.20.1

2020-11-26 00:12:16

by Andrew Lunn

[permalink] [raw]
Subject: Re: [RFC 0/4] net: l2switch: Provide support for L2 switch on i.MX28 SoC

On Thu, Nov 26, 2020 at 12:24:55AM +0100, Lukasz Majewski wrote:
> This is the first attempt to add support for L2 switch available on some NXP
> devices - i.e. iMX287 or VF610. This patch set uses common FEC and DSA code.

Interesting. I need to take another look at the Vybrid manual. Last
time i looked, i was more thinking of a pure switchdev driver, not a
DSA driver. So i'm not sure this is the correct architecture. But has
been a while since i looked at the datasheet.

Andrew

2020-11-26 01:35:51

by Florian Fainelli

[permalink] [raw]
Subject: Re: [RFC 0/4] net: l2switch: Provide support for L2 switch on i.MX28 SoC



On 11/25/2020 4:00 PM, Andrew Lunn wrote:
> On Thu, Nov 26, 2020 at 12:24:55AM +0100, Lukasz Majewski wrote:
>> This is the first attempt to add support for L2 switch available on some NXP
>> devices - i.e. iMX287 or VF610. This patch set uses common FEC and DSA code.
>
> Interesting. I need to take another look at the Vybrid manual. Last
> time i looked, i was more thinking of a pure switchdev driver, not a
> DSA driver. So i'm not sure this is the correct architecture. But has
> been a while since i looked at the datasheet.

Agreed the block diagram shows one DMA for each "switch port" which
definitively fits more within the switchdev model than the DSA model
that re-purposes an existing Ethernet MAC controller as-is and bolts on
an integrated or external switch IC.
--
Florian

2020-11-26 08:26:26

by Lukasz Majewski

[permalink] [raw]
Subject: [RFC 1/4] net: fec: Move some defines to ./drivers/net/ethernet/freescale/fec.h header

After this change ECR (control register) defines are moved to fec.h, so
they can be reused by L2 switch code.

Signed-off-by: Lukasz Majewski <[email protected]>
---
drivers/net/ethernet/freescale/fec.h | 3 +++
drivers/net/ethernet/freescale/fec_main.c | 3 ---
2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h
index 832a2175636d..c555a421f647 100644
--- a/drivers/net/ethernet/freescale/fec.h
+++ b/drivers/net/ethernet/freescale/fec.h
@@ -189,6 +189,9 @@
#define FEC_RXIC2 0xfff
#endif /* CONFIG_M5272 */

+/* FEC ECR bits definition */
+#define FEC_ECR_MAGICEN (1 << 2)
+#define FEC_ECR_SLEEP (1 << 3)

/*
* Define the buffer descriptor structure.
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index fb37816a74db..bd29c84fd89a 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -249,9 +249,6 @@ MODULE_PARM_DESC(macaddr, "FEC Ethernet MAC address");
#define FEC_MMFR_RA(v) ((v & 0x1f) << 18)
#define FEC_MMFR_TA (2 << 16)
#define FEC_MMFR_DATA(v) (v & 0xffff)
-/* FEC ECR bits definition */
-#define FEC_ECR_MAGICEN (1 << 2)
-#define FEC_ECR_SLEEP (1 << 3)

#define FEC_MII_TIMEOUT 30000 /* us */

--
2.20.1

2020-11-26 09:39:19

by Andrew Lunn

[permalink] [raw]
Subject: Re: [RFC 0/4] net: l2switch: Provide support for L2 switch on i.MX28 SoC

On Wed, Nov 25, 2020 at 05:30:04PM -0800, Florian Fainelli wrote:
>
>
> On 11/25/2020 4:00 PM, Andrew Lunn wrote:
> > On Thu, Nov 26, 2020 at 12:24:55AM +0100, Lukasz Majewski wrote:
> >> This is the first attempt to add support for L2 switch available on some NXP
> >> devices - i.e. iMX287 or VF610. This patch set uses common FEC and DSA code.
> >
> > Interesting. I need to take another look at the Vybrid manual. Last
> > time i looked, i was more thinking of a pure switchdev driver, not a
> > DSA driver. So i'm not sure this is the correct architecture. But has
> > been a while since i looked at the datasheet.
>
> Agreed the block diagram shows one DMA for each "switch port" which
> definitively fits more within the switchdev model than the DSA model
> that re-purposes an existing Ethernet MAC controller as-is and bolts on
> an integrated or external switch IC.

Hi Florian

I'm not sure it is that simple. I'm looking at the Vybrid
datasheet. There are two major configurations.

1) The switch is pass through, and does nothing. Then two DMA channels
are used, one per external port. You basically just have two Ethernet
interfaces

2) The switch is active. You then have a 3 port switch, 2 ports for
the external interfaces, and one port connected to a single DMA
channel.

So when in an active mode, it does look more like a DSA switch.

What is not yet clear to me is how you direct frames out specific
interfaces. This is where i think we hit problems. I don't see a
generic mechanism, which is probably why Lukasz put tagger as None. It
does appear you can control the output of BPDUs, but it is not very
friendly. You write a register with the port you would like the next
BPDU to go out, queue the frame up on the DMA, and then poll a bit in
the register which flips when the frame is actually processed in the
switch. I don't see how you determine what port a BPDU came in on!
Maybe you have to use the learning interface?

Ah, the ESW_FFEN register can be used to send a frame out a specific
port. Write this register with the destination port, DMA a frame, and
it goes out the specific port. You then need to write the register
again for the next frame.

I get the feeling this is going to need a very close relationship
between the 'tagger' and the DMA engine. I don't see how this can be
done using the DSA architecture, the coupling is too loose.

It seems like the HW design assumes frames from the CPU will be
switched using the switch internal FDB, making Linux integration
"interesting"

It does not look like this is a classic DSA switch with a tagging
protocol. It might be possible to do VLAN per port, in order to direct
frames out a specific port?

Andrew

2020-11-26 11:43:55

by Lukasz Majewski

[permalink] [raw]
Subject: Re: [RFC 0/4] net: l2switch: Provide support for L2 switch on i.MX28 SoC

Hi Andrew, Florian,

> On Wed, Nov 25, 2020 at 05:30:04PM -0800, Florian Fainelli wrote:
> >
> >
> > On 11/25/2020 4:00 PM, Andrew Lunn wrote:
> > > On Thu, Nov 26, 2020 at 12:24:55AM +0100, Lukasz Majewski wrote:
> > >> This is the first attempt to add support for L2 switch available
> > >> on some NXP devices - i.e. iMX287 or VF610. This patch set uses
> > >> common FEC and DSA code.
> > >
> > > Interesting. I need to take another look at the Vybrid manual.
> > > Last time i looked, i was more thinking of a pure switchdev
> > > driver, not a DSA driver. So i'm not sure this is the correct
> > > architecture. But has been a while since i looked at the
> > > datasheet.
> >
> > Agreed the block diagram shows one DMA for each "switch port" which
> > definitively fits more within the switchdev model than the DSA model
> > that re-purposes an existing Ethernet MAC controller as-is and
> > bolts on an integrated or external switch IC.
>
> Hi Florian
>
> I'm not sure it is that simple. I'm looking at the Vybrid
> datasheet. There are two major configurations.
>
> 1) The switch is pass through, and does nothing. Then two DMA channels
> are used, one per external port.

This is the "default" state (at least for imx28) - Chapter 29.3.2 on
User Manual [0].

Then you use DMA0 and DMA1 to read/write data to ENET-MAC{01}.

> You basically just have two Ethernet
> interfaces

If I may add important note - on the imx28 the ENET-MAC1 is configured
via ENET-MAC0 (it has FEC_QUIRK_SINGLE_MDIO set in fec_main.c). On this
device it is clearly stated that ENET-MAC1 block functionality is
reduced and its main purpose is to be used with L2 switch.

On the contrary, this flag is not set for vf610 in fec_main.c

>
> 2) The switch is active. You then have a 3 port switch, 2 ports for
> the external interfaces, and one port connected to a single DMA
> channel.

+1 (Only DMA0 is used)

>
> So when in an active mode, it does look more like a DSA switch.

It also looked like this for me.

>
> What is not yet clear to me is how you direct frames out specific
> interfaces. This is where i think we hit problems. I don't see a
> generic mechanism, which is probably why Lukasz put tagger as None.

I've put the "None" tag just to share the "testable" RFC code.

It is possible to "tag" frames - at least from the manual [0]:
Chapter: "29.4.9.2 Forced Forwarding".

With using register HW_ENET_SWI_FORCE_FWD_P0
29.9.34 ENET SWI Enable forced forwarding for a frame processed
from port 0 (HW_ENET_SWI_FORCE_FWD_P0)

One can "tag" the packet going from port0 (internal one from SoC) to be
forwarded to port1 (ENET-MAC0) or port2 (ENET-MAC1).

According to the legacy driver [1]:
"* It only replace the MAC lookup function,
* all other filtering(eg.VLAN verification) act as normal"

> It
> does appear you can control the output of BPDUs, but it is not very
> friendly. You write a register with the port you would like the next
> BPDU to go out, queue the frame up on the DMA, and then poll a bit in
> the register which flips when the frame is actually processed in the
> switch. I don't see how you determine what port a BPDU came in on!
> Maybe you have to use the learning interface?

The learning interface works with the legacy NXP driver (2.6.35) which
copy can be found here[1].

>
> Ah, the ESW_FFEN register can be used to send a frame out a specific
> port. Write this register with the destination port, DMA a frame, and
> it goes out the specific port. You then need to write the register
> again for the next frame.

It seems like the description of HW_ENET_SWI_FORCE_FWD_P0 described
above for imx28.

>
> I get the feeling this is going to need a very close relationship
> between the 'tagger' and the DMA engine. I don't see how this can be
> done using the DSA architecture, the coupling is too loose.

My first impression was that it could be possible to set this
register in the DSA callback (which normally append the tag to the
frame).

This would work if we can assure that after calling this callback this
frame is transmitted (wait and poll?). Have I correctly understood
your above concern?

I do know that L2 switch has some kind of buffer from DMA0 (port0 - its
input port). However, I don't have access so such detailed manual.

>
> It seems like the HW design assumes frames from the CPU will be
> switched using the switch internal FDB, making Linux integration
> "interesting"

The MoreThanIP L2 switch (on imx28) has 2K entries for setting FDB. It
also uses some hash function to speed up access/presence assessment.

>
> It does not look like this is a classic DSA switch with a tagging
> protocol.

This whole L2 switch implementation available on NXP's SoCs is a bit
odd. It is very highly coupled with FEC, ENET and DMA.

The original driver (2.6.35) was just a copy of FEC driver with some
switch adjustments.

Do you have any idea how to proceed?

The vf610 and imx28 are still produced and widely used, so this is
still "actual" topic.

> It might be possible to do VLAN per port, in order to direct
> frames out a specific port?

The old driver had code to support VLANs.

From the manual[0] (29.1):
" Programmable Ingress and Egress VLAN tag addition, removal and
manipulation supporting single and double-tagged VLAN frames"

Also the "29.4.10 Frame Forwarding Tasks" from [0] sheds some more
light on it.

From the manual (29.4.10.2) it looks like the VLAN tag can be appended.
However, I don't know if it will work with VLAN built with L2 switch output ports.

>
> Andrew

Links:

[0] - "i.MX28 Applications Processor Reference Manual, Rev. 2, 08/2013"
[1] -
https://github.com/lmajewski/linux-imx28-l2switch/commit/e3c7a6eab73401e021aef0070e1935a0dba84fb5

Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH, Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: [email protected]


Attachments:
(No filename) (499.00 B)
OpenPGP digital signature

2020-11-26 14:49:14

by Andrew Lunn

[permalink] [raw]
Subject: Re: [RFC 0/4] net: l2switch: Provide support for L2 switch on i.MX28 SoC

> > What is not yet clear to me is how you direct frames out specific
> > interfaces. This is where i think we hit problems. I don't see a
> > generic mechanism, which is probably why Lukasz put tagger as None.
>
> I've put the "None" tag just to share the "testable" RFC code.

Tagging is a core feature of DSA. Without being able to direct a
packet out a specific port, it is not really a DSA driver. It is also
core requirement of integrating a switch into Linux. A DSA driver, or
a pure switchdev driver expects to be able to forward frames out
specific ports.

> It is possible to "tag" frames - at least from the manual [0]:
> Chapter: "29.4.9.2 Forced Forwarding".
>
> With using register HW_ENET_SWI_FORCE_FWD_P0
> 29.9.34 ENET SWI Enable forced forwarding for a frame processed
> from port 0 (HW_ENET_SWI_FORCE_FWD_P0)
>
> One can "tag" the packet going from port0 (internal one from SoC) to be
> forwarded to port1 (ENET-MAC0) or port2 (ENET-MAC1).
>
> According to the legacy driver [1]:
> "* It only replace the MAC lookup function,
> * all other filtering(eg.VLAN verification) act as normal"

This might solve your outgoing frame problems. But you need to dive
deep into how the FEC driver works, especially in a DSA like
setup. The normal path would be, the slave interface passes a frame to
the tagger driver, living in net/dsa/tag_*.c. Normally, it adds a
header/trailer which the switch looks at. It then hands to packet over
to the master Ethernet driver, which at some point will send the
frame. Because the frame is self contained, we don't care what that
ethernet driver actually does. It can add it to a queue and send it
later. It can look at the QoS tags and send it with low priority after
other frames, or could put it to the head of the queue and send it
before other frames etc.

Since you don't have self contained frames, this is a problem. After
writing to this register, you need to ensure what is transmitted next
is the specific frame you intend. It cannot be added to an existing
queue etc. You need to know when the frame has been sent, so you can
re-write this register for the next frame.

This is why i said i don't know if the DSA architecture will work. You
need a close coupling between the tagger setting the force bits, and
the DMA engine sending the frame.

The other option is you totally ignore most of this and statically
assign VLANs. Frames sent with VLAN 1 are forwarded out port 1. Frames
sent with VLAN 2 are sent out port 2. You need the port to
append/strip these VLAN tags for ingress/egress. tag_8021q.c gives you
some code to help with this. But can you still use the hardware to
switch frames between ports 1 and 2 without them going via the CPU?

Andrew.

2020-11-27 08:24:01

by Vladimir Oltean

[permalink] [raw]
Subject: Re: [RFC 0/4] net: l2switch: Provide support for L2 switch on i.MX28 SoC

Hi Lukasz,

On Thu, Nov 26, 2020 at 12:24:55AM +0100, Lukasz Majewski wrote:
> This is the first attempt to add support for L2 switch available on some NXP
> devices - i.e. iMX287 or VF610. This patch set uses common FEC and DSA code.
>
> This code provides _very_ basic switch functionality (packets are passed
> between lan1 and lan2 ports and it is possible to send packets via eth0),
> at its main purpose is to establish the way of reusing the FEC driver. When
> this is done, one can add more advanced features to the switch (like vlan or
> port separation).
>
> I also do have a request for testing on e.g. VF610 if this driver works on
> it too.
> The L2 switch documentation is very scant on NXP's User Manual [0] and most
> understanding of how it really works comes from old (2.6.35) NXP driver [1].
> The aforementioned old driver [1] was monolitic and now this patch set tries
> to mix FEC and DSA.
>
> Open issues:
> - I do have a hard time on understanding how to "disable" ENET-MAC{01} ports
> in DSA (via port_disable callback in dsa_switch_ops).
> When I disable L2 switch port1,2 or the ENET-MAC{01} in control register, I
> cannot simply re-enable it with enabling this bit again. The old driver reset
> (and setup again) the whole switch.

You don't have to disable the ports if that does more harm than good, of course.

> - The L2 switch is part of the SoC silicon, so we cannot follow the "normal" DSA
> pattern with "attaching" it via mdio device. The switch reuses already well
> defined ENET-MAC{01}. For that reason the MoreThanIP switch driver is
> registered as platform device

That is not a problem. Also, I would not say that the "normal" DSA
pattern is to have an MDIO-attached switch. Maybe that was true 10 years
ago. But now, we have DSA switches registered as platform devices, I2C
devices, SPI devices, PCI devices. That is not an issue under any
circumstance.

> - The question regarding power management - at least for my use case there
> is no need for runtime power management. The L2 switch shall work always at
> it connects other devices.
>
> - The FEC clock is also used for L2 switch management and configuration (as
> the L2 switch is just in the same, large IP block). For now I just keep it
> enabled so DSA code can use it. It looks a bit problematic to export
> fec_enet_clk_enable() to be reused on DSA code.
>
> Links:
> [0] - "i.MX28 Applications Processor Reference Manual, Rev. 2, 08/2013"
> [1] - https://github.com/lmajewski/linux-imx28-l2switch/commit/e3c7a6eab73401e021aef0070e1935a0dba84fb5

Disclaimer: I don't know the details of imx28, it's just now that I
downloaded the reference manual to see what it's about.

I would push back and say that the switch offers bridge acceleration for
the FEC. The fact that the bridge acceleration is provided by a different
vendor and requires access to an extra set of register blocks is immaterial.
To qualify as a DSA switch, you need to have indirect networking I/O
through a different network interface. You do not have that. What I
would do is I would expand the fec driver into something that, on
capable SoCs, detects bridging of the ENET_MAC0 and ENETC_MAC1 ports and
configures the switch accordingly to offload that in a seamless manner
for the user. This would also solve your power management issues, since
the entire Ethernet block would be handled by a single driver.
DSA is a complication you do not need. Convince me otherwise.

Also, side note.
Please, please, please, stop calling it "l2 switch" and use something
more specific. If everybody writing a driver for the Linux kernel called
their L2 switch "L2 switch", we would go crazy. The kernel is not a deep
silo like the hardware team that integrated this MTIP switching IP into
imx28, and for whom this L2 switch is the only switch that exists, and
therefore for whom no further qualification was necessary. Andy, Peng or
Fabio might be able to give you a reference to an internal code name
that you can use, or something. Otherwise, I don't care if you need to
invent a name yourself - be as creative as you feel like. mtip-fec-switch,
charlie, matilda, brunhild, all fine by me.

2020-11-27 08:49:06

by Lukasz Majewski

[permalink] [raw]
Subject: Re: [RFC 0/4] net: l2switch: Provide support for L2 switch on i.MX28 SoC

Hi Vladimir,

> Hi Lukasz,
>
> On Thu, Nov 26, 2020 at 12:24:55AM +0100, Lukasz Majewski wrote:
> > This is the first attempt to add support for L2 switch available on
> > some NXP devices - i.e. iMX287 or VF610. This patch set uses common
> > FEC and DSA code.
> >
> > This code provides _very_ basic switch functionality (packets are
> > passed between lan1 and lan2 ports and it is possible to send
> > packets via eth0), at its main purpose is to establish the way of
> > reusing the FEC driver. When this is done, one can add more
> > advanced features to the switch (like vlan or port separation).
> >
> > I also do have a request for testing on e.g. VF610 if this driver
> > works on it too.
> > The L2 switch documentation is very scant on NXP's User Manual [0]
> > and most understanding of how it really works comes from old
> > (2.6.35) NXP driver [1]. The aforementioned old driver [1] was
> > monolitic and now this patch set tries to mix FEC and DSA.
> >
> > Open issues:
> > - I do have a hard time on understanding how to "disable"
> > ENET-MAC{01} ports in DSA (via port_disable callback in
> > dsa_switch_ops). When I disable L2 switch port1,2 or the
> > ENET-MAC{01} in control register, I cannot simply re-enable it with
> > enabling this bit again. The old driver reset (and setup again) the
> > whole switch.
>
> You don't have to disable the ports if that does more harm than good,
> of course.

Ok.

>
> > - The L2 switch is part of the SoC silicon, so we cannot follow the
> > "normal" DSA pattern with "attaching" it via mdio device. The
> > switch reuses already well defined ENET-MAC{01}. For that reason
> > the MoreThanIP switch driver is registered as platform device
>
> That is not a problem. Also, I would not say that the "normal" DSA
> pattern is to have an MDIO-attached switch. Maybe that was true 10
> years ago. But now, we have DSA switches registered as platform
> devices, I2C devices, SPI devices, PCI devices. That is not an issue
> under any circumstance.

Ok. The MTIP switch now registers as platform device.

>
> > - The question regarding power management - at least for my use
> > case there is no need for runtime power management. The L2 switch
> > shall work always at it connects other devices.
> >
> > - The FEC clock is also used for L2 switch management and
> > configuration (as the L2 switch is just in the same, large IP
> > block). For now I just keep it enabled so DSA code can use it. It
> > looks a bit problematic to export fec_enet_clk_enable() to be
> > reused on DSA code.
> >
> > Links:
> > [0] - "i.MX28 Applications Processor Reference Manual, Rev. 2,
> > 08/2013" [1] -
> > https://github.com/lmajewski/linux-imx28-l2switch/commit/e3c7a6eab73401e021aef0070e1935a0dba84fb5
> >
>
> Disclaimer: I don't know the details of imx28, it's just now that I
> downloaded the reference manual to see what it's about.
>
> I would push back and say that the switch offers bridge acceleration
> for the FEC.

Am I correct, that the "bridge acceleration" means in-hardware support
for L2 packet bridging?

And without the switch IP block enabled one shall be able to have
software bridging in Linux in those two interfaces?

> The fact that the bridge acceleration is provided by a
> different vendor and requires access to an extra set of register
> blocks is immaterial.

Am I correct that you mean not important above (i.e. immaterial == not
important)?

> To qualify as a DSA switch, you need to have
> indirect networking I/O through a different network interface. You do
> not have that.

I do have eth0 (DMA0) -> MoreThanIP switch port 0 input
|
|----> switch port1 -> ENET-MAC0
|
|----> switch port2 -> ENET-MAC1

> What I would do is I would expand the fec driver into
> something that, on capable SoCs, detects bridging of the ENET_MAC0
> and ENETC_MAC1 ports and configures the switch accordingly to offload
> that in a seamless manner for the user.

Do you propose to catch some kind of notification when user calls:

ip link add name br0 type bridge; ip link set br0 up;
ip link set lan1 up; ip link set lan2 up;
ip link set lan1 master br0; ip link set lan2 master br0;
bridge link

And then configure the FEC driver to use this L2 switch driver?

> This would also solve your
> power management issues, since the entire Ethernet block would be
> handled by a single driver. DSA is a complication you do not need.
> Convince me otherwise.

From what I see the MoreThanIP IP block looks like a "typical" L2 switch
(like lan9xxx), with VLAN tagging support, static and dynamic tables,
forcing the packet to be passed to port [*], congestion management,
switch input buffer, priority of packets/queues, broadcast, multicast,
port snooping, and even IEEE1588 timestamps.

Seems like a lot of useful features.

The differences from "normal" DSA switches:

1. It uses mapped memory (for its register space) for
configuration/statistics gathering (instead of e.g. SPI, I2C)

2. The TAG is not appended to the frame outgoing from the "master" FEC
port - it can be setup when DMA transfers packet to MTIP switch internal
buffer.

Note:

[*] - The same situation is in the VF610 and IMX28:
The ESW_FFEN register - Bit 0 -> FEN

"When set, the next frame received from port 0 (the local DMA port) is
forwarded to the ports defined in FD. The bit resets to zero
automatically when one frame from port 0 has been processed by the
switch (i.e. has been read from the port 0 input buffer; see Figure
32-1). Therefore, the bit must be set again as necessary. See also
Section 32.5.8.2, "Forced Forwarding" for a description."

(Of course the "Section 32.5.8.2" is not available)


According to above the "tag" (engress port) is set when DMA transfers
the packet to input MTIP buffer. This shall allow force forwarding as
we can setup this bit when we normally append tag in the network stack.

I will investigate this issue - and check the port separation. If it
works then DSA (or switchdev) shall be used?

(A side question - DSA uses switchdev, so when one shall use switchdev
standalone?)


>
> Also, side note.
> Please, please, please, stop calling it "l2 switch" and use something
> more specific. If everybody writing a driver for the Linux kernel
> called their L2 switch "L2 switch", we would go crazy. The kernel is
> not a deep silo like the hardware team that integrated this MTIP
> switching IP into imx28, and for whom this L2 switch is the only
> switch that exists, and therefore for whom no further qualification
> was necessary.

The "l2 switch" name indeed was taken from the NXP's legacy driver :-)

> Andy, Peng or Fabio might be able to give you a
> reference to an internal code name that you can use, or something.

I would prefer to obtain some kind of manual/doc/spec for MTIP IP
block.

> Otherwise, I don't care if you need to invent a name yourself - be as
> creative as you feel like. mtip-fec-switch, charlie, matilda,
> brunhild, all fine by me.

I think that "mtip-fec-switch" is the most reasonable name.


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH, Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: [email protected]


Attachments:
(No filename) (499.00 B)
OpenPGP digital signature

2020-11-27 08:49:32

by Andrew Lunn

[permalink] [raw]
Subject: Re: [RFC 0/4] net: l2switch: Provide support for L2 switch on i.MX28 SoC

> > I would push back and say that the switch offers bridge acceleration
> > for the FEC.
>
> Am I correct, that the "bridge acceleration" means in-hardware support
> for L2 packet bridging?

You should think of the hardware as an accelerator, not a switch. The
hardware is there to accelerate what linux can already do. You setup a
software bridge in linux, and then offload L2 switching to the
accelerator. You setup vlans in linux, and then offload the filtering
of them to the accelerator. If there is something linux can do, but
the hardware cannot accelerate, you leave linux to do it in software.

> Do you propose to catch some kind of notification when user calls:
>
> ip link add name br0 type bridge; ip link set br0 up;
> ip link set lan1 up; ip link set lan2 up;
> ip link set lan1 master br0; ip link set lan2 master br0;
> bridge link
>
> And then configure the FEC driver to use this L2 switch driver?

That is what switchdev does. There are various hooks in the network
stack which call into switchdev to ask it to offload operations to the
accelerator.

> The differences from "normal" DSA switches:
>
> 1. It uses mapped memory (for its register space) for
> configuration/statistics gathering (instead of e.g. SPI, I2C)

That does not matter. And there are memory mapped DSA switches. The
DSA framework puts no restrictions on how the control plane works.

> (Of course the "Section 32.5.8.2" is not available)

It is in the Vybrid datasheet :-)

Andrew

2020-11-27 08:49:36

by Lukasz Majewski

[permalink] [raw]
Subject: Re: [RFC 0/4] net: l2switch: Provide support for L2 switch on i.MX28 SoC

Hi Andrew,

> > > What is not yet clear to me is how you direct frames out specific
> > > interfaces. This is where i think we hit problems. I don't see a
> > > generic mechanism, which is probably why Lukasz put tagger as
> > > None.
> >
> > I've put the "None" tag just to share the "testable" RFC code.
>
> Tagging is a core feature of DSA. Without being able to direct a
> packet out a specific port, it is not really a DSA driver. It is also
> core requirement of integrating a switch into Linux. A DSA driver, or
> a pure switchdev driver expects to be able to forward frames out
> specific ports.

Please find my answer, which I gave to Vladimir in the other mail (you
were CC'ed).

As a backup plan - the vlan tagging may be worth to investigate.

>
> > It is possible to "tag" frames - at least from the manual [0]:
> > Chapter: "29.4.9.2 Forced Forwarding".
> >
> > With using register HW_ENET_SWI_FORCE_FWD_P0
> > 29.9.34 ENET SWI Enable forced forwarding for a frame processed
> > from port 0 (HW_ENET_SWI_FORCE_FWD_P0)
> >
> > One can "tag" the packet going from port0 (internal one from SoC)
> > to be forwarded to port1 (ENET-MAC0) or port2 (ENET-MAC1).
> >
> > According to the legacy driver [1]:
> > "* It only replace the MAC lookup function,
> > * all other filtering(eg.VLAN verification) act as normal"
>
> This might solve your outgoing frame problems. But you need to dive
> deep into how the FEC driver works, especially in a DSA like
> setup.

Agree.

> The normal path would be, the slave interface passes a frame to
> the tagger driver, living in net/dsa/tag_*.c. Normally, it adds a
> header/trailer which the switch looks at. It then hands to packet over
> to the master Ethernet driver, which at some point will send the
> frame. Because the frame is self contained, we don't care what that
> ethernet driver actually does. It can add it to a queue and send it
> later. It can look at the QoS tags and send it with low priority after
> other frames, or could put it to the head of the queue and send it
> before other frames etc.
>

Thanks for the explanation.

> Since you don't have self contained frames, this is a problem. After
> writing to this register, you need to ensure what is transmitted next
> is the specific frame you intend. It cannot be added to an existing
> queue etc. You need to know when the frame has been sent, so you can
> re-write this register for the next frame.

This needs to be assessed as the documentation is very vague. I'm
wondering how MTIP/NXP recommends usage of ESW_FFEN register.

>
> This is why i said i don't know if the DSA architecture will work. You
> need a close coupling between the tagger setting the force bits, and
> the DMA engine sending the frame.

Maybe it would be just enough to program the ESW_FFEN register when ENET
descriptor is programmed for DMA? Earlier we would append the
superfluous tag in the tag_*.c ?

>
> The other option is you totally ignore most of this and statically
> assign VLANs. Frames sent with VLAN 1 are forwarded out port 1. Frames
> sent with VLAN 2 are sent out port 2. You need the port to
> append/strip these VLAN tags for ingress/egress. tag_8021q.c gives you
> some code to help with this. But can you still use the hardware to
> switch frames between ports 1 and 2 without them going via the CPU?

Yes, it is possible to switch frames between ENET-MAC{01} ports without
any interaction from CPU.

>
> Andrew.




Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH, Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: [email protected]


Attachments:
(No filename) (499.00 B)
OpenPGP digital signature

2020-11-27 08:49:48

by Andrew Lunn

[permalink] [raw]
Subject: Re: [RFC 0/4] net: l2switch: Provide support for L2 switch on i.MX28 SoC

> (A side question - DSA uses switchdev, so when one shall use switchdev
> standalone?)

DSA gives you a framework for an Ethernet switch connected to a host
via Ethernet for the data plane. Generally, that Ethernet link to the
switch is a MAC to MAC connection. It can be PHY to PHY. But those are
just details. The important thing is you use an Ethernet driver on the
host.

If you look at pure switchdev devices, they generally DMA frames
directly into the switch. There is either one DMA queue per switch
port, or there is a way to multiplex frames over one DMA queue,
generally by additional fields in the buffer descriptor.

For this device, at the moment, it is hard to say which is the best
fit. A lot will depend on how the FEC driver works, if you can reuse
it, while still having the degree of control you need over the DMA
channel. If you can reuse the FEC driver, then a DSA driver might
work. If the coupling it too loose, and you have to take control of
the DMA, then a pure switchdev driver seems more appropriate.

Andrew

2020-11-27 09:20:32

by Lukasz Majewski

[permalink] [raw]
Subject: Re: [RFC 0/4] net: l2switch: Provide support for L2 switch on i.MX28 SoC

Hi Andrew,

> > (A side question - DSA uses switchdev, so when one shall use
> > switchdev standalone?)
>
> DSA gives you a framework for an Ethernet switch connected to a host
> via Ethernet for the data plane. Generally, that Ethernet link to the
> switch is a MAC to MAC connection. It can be PHY to PHY. But those are
> just details. The important thing is you use an Ethernet driver on the
> host.
>
> If you look at pure switchdev devices, they generally DMA frames
> directly into the switch. There is either one DMA queue per switch
> port, or there is a way to multiplex frames over one DMA queue,
> generally by additional fields in the buffer descriptor.
>
> For this device, at the moment, it is hard to say which is the best
> fit. A lot will depend on how the FEC driver works, if you can reuse
> it, while still having the degree of control you need over the DMA
> channel. If you can reuse the FEC driver, then a DSA driver might
> work. If the coupling it too loose, and you have to take control of
> the DMA, then a pure switchdev driver seems more appropriate.
>
> Andrew
>

Thanks for the detailed explanation.


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH, Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: [email protected]


Attachments:
(No filename) (499.00 B)
OpenPGP digital signature

2020-11-27 09:28:01

by Lukasz Majewski

[permalink] [raw]
Subject: Re: [RFC 0/4] net: l2switch: Provide support for L2 switch on i.MX28 SoC

Hi Andrew,

> > > I would push back and say that the switch offers bridge
> > > acceleration for the FEC.
> >
> > Am I correct, that the "bridge acceleration" means in-hardware
> > support for L2 packet bridging?
>
> You should think of the hardware as an accelerator, not a switch. The
> hardware is there to accelerate what linux can already do. You setup a
> software bridge in linux, and then offload L2 switching to the
> accelerator. You setup vlans in linux, and then offload the filtering
> of them to the accelerator. If there is something linux can do, but
> the hardware cannot accelerate, you leave linux to do it in software.

Ok.

>
> > Do you propose to catch some kind of notification when user calls:
> >
> > ip link add name br0 type bridge; ip link set br0 up;
> > ip link set lan1 up; ip link set lan2 up;
> > ip link set lan1 master br0; ip link set lan2 master br0;
> > bridge link
> >
> > And then configure the FEC driver to use this L2 switch driver?
>
> That is what switchdev does. There are various hooks in the network
> stack which call into switchdev to ask it to offload operations to the
> accelerator.

Ok.

>
> > The differences from "normal" DSA switches:
> >
> > 1. It uses mapped memory (for its register space) for
> > configuration/statistics gathering (instead of e.g. SPI, I2C)
>
> That does not matter. And there are memory mapped DSA switches. The
> DSA framework puts no restrictions on how the control plane works.
>
> > (Of course the "Section 32.5.8.2" is not available)
>
> It is in the Vybrid datasheet :-)

Hmm...

I cannot find such chapter in the official documentation from NXP:
"VFxxx Controller Reference Manual, Rev. 0, 10/2016"

Maybe you have more verbose version? Could you share how the document
is named?

>
> Andrew


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH, Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: [email protected]


Attachments:
(No filename) (499.00 B)
OpenPGP digital signature

2020-11-27 15:12:59

by Andrew Lunn

[permalink] [raw]
Subject: Re: [RFC 0/4] net: l2switch: Provide support for L2 switch on i.MX28 SoC

On Fri, Nov 27, 2020 at 10:25:28AM +0100, Lukasz Majewski wrote:
> Hi Andrew,
>
> > > > I would push back and say that the switch offers bridge
> > > > acceleration for the FEC.
> > >
> > > Am I correct, that the "bridge acceleration" means in-hardware
> > > support for L2 packet bridging?
> >
> > You should think of the hardware as an accelerator, not a switch. The
> > hardware is there to accelerate what linux can already do. You setup a
> > software bridge in linux, and then offload L2 switching to the
> > accelerator. You setup vlans in linux, and then offload the filtering
> > of them to the accelerator. If there is something linux can do, but
> > the hardware cannot accelerate, you leave linux to do it in software.
>
> Ok.
>
> >
> > > Do you propose to catch some kind of notification when user calls:
> > >
> > > ip link add name br0 type bridge; ip link set br0 up;
> > > ip link set lan1 up; ip link set lan2 up;
> > > ip link set lan1 master br0; ip link set lan2 master br0;
> > > bridge link
> > >
> > > And then configure the FEC driver to use this L2 switch driver?
> >
> > That is what switchdev does. There are various hooks in the network
> > stack which call into switchdev to ask it to offload operations to the
> > accelerator.
>
> Ok.
>
> >
> > > The differences from "normal" DSA switches:
> > >
> > > 1. It uses mapped memory (for its register space) for
> > > configuration/statistics gathering (instead of e.g. SPI, I2C)
> Hmm...
>
> I cannot find such chapter in the official documentation from NXP:
> "VFxxx Controller Reference Manual, Rev. 0, 10/2016"

I have

Vybrid Reference Manual
F-Series

Document Number: VYBRIDRM
Rev 7, 06/2014

Andrew

2020-11-27 20:08:19

by Vladimir Oltean

[permalink] [raw]
Subject: Re: [RFC 0/4] net: l2switch: Provide support for L2 switch on i.MX28 SoC

On Fri, Nov 27, 2020 at 12:35:49AM +0100, Lukasz Majewski wrote:
> > > - The question regarding power management - at least for my use
> > > case there is no need for runtime power management. The L2 switch
> > > shall work always at it connects other devices.
> > >
> > > - The FEC clock is also used for L2 switch management and
> > > configuration (as the L2 switch is just in the same, large IP
> > > block). For now I just keep it enabled so DSA code can use it. It
> > > looks a bit problematic to export fec_enet_clk_enable() to be
> > > reused on DSA code.
> > >
> > > Links:
> > > [0] - "i.MX28 Applications Processor Reference Manual, Rev. 2,
> > > 08/2013" [1] -
> > > https://github.com/lmajewski/linux-imx28-l2switch/commit/e3c7a6eab73401e021aef0070e1935a0dba84fb5
> > >
> >
> > Disclaimer: I don't know the details of imx28, it's just now that I
> > downloaded the reference manual to see what it's about.
> >
> > I would push back and say that the switch offers bridge acceleration
> > for the FEC.
>
> Am I correct, that the "bridge acceleration" means in-hardware support
> for L2 packet bridging?
>
> And without the switch IP block enabled one shall be able to have
> software bridging in Linux in those two interfaces?

So if the switch is bypassed through pin strapping of sx_ena, then the
DMA0 would be connected to ENETC-MAC 0, DMA1 to ENET-MAC 1, and both
these ports could be driven by the regular fec driver, am I right? This
is how people use the imx28 with mainline linux right now, no?

When the sx_ena signal enables the switch, a hardware accelerator
appears between the DMA engine and the same MACs. But that DMA engine is
still compatible with the expectations of the fec driver. And the MACs
still belong to the FEC. So, at the end of the day, there are still 2
FEC interfaces.

Where I was going is that from a user's perspective, it would be natural
to have the exact same view of the system in both cases, aka still two
network interfaces for MAC 0 and MAC 1, they still function in the same
way (i.e. you can still ping through them) but with the additional
ability to do hardware-accelerating bridging, if the MAC 0 and MAC 1
network interfaces are put under the same bridge.

Currently, you do not offer that, at all.
You split the MTIP switch configuration between DSA and the FEC driver.
But you are still not exposing networking-capable net devices for MAC 0
and MAC 1. You still do I/O through the DMA engine.

So why use DSA at all? What benefit does it bring you? Why not do the
entire switch configuration from within FEC, or a separate driver very
closely related to it?

> > The fact that the bridge acceleration is provided by a
> > different vendor and requires access to an extra set of register
> > blocks is immaterial.
>
> Am I correct that you mean not important above (i.e. immaterial == not
> important)?

Yes, sorry if that was not clear.

> > To qualify as a DSA switch, you need to have
> > indirect networking I/O through a different network interface. You do
> > not have that.
>
> I do have eth0 (DMA0) -> MoreThanIP switch port 0 input
> |
> |----> switch port1 -> ENET-MAC0
> |
> |----> switch port2 -> ENET-MAC1

The whole point of DSA is to intercept traffic from a different and
completely unaware network interface, parse a tag, and masquerade the
packet as though it came on a different, virtual, network interface
corresponding to the switch. DSA offers this service so that:
- any switch works with any host Ethernet controller
- all vendors use the same mechanism and do not reinvent the same wheel
The last part is especially relevant. Just grep net/core/ for "dsa" and
be amazed at the number of hacks there. DSA can justify that only by scale.
Aka "we have hardware that works this way, and doesn't work any other way.
And lots of it".

That being said, in your case, the network interface is not unaware of
the switch at all. Come on, you need to put the (allegedly) switch's MAC
in promiscuous mode, from the "DSA master" driver! Hardware resource
ownership is very DSA-unlike in the imx28/vybrid model, it seems. This
is a very big deal.

And there is no tag to parse. You have a switch with a DMA engine which
is a priori known to be register-compatible with the DMA engine used for
plain FEC interfaces. It's not like you have a switch that can be
connected via MII to anything and everything. Force forwarding is done
via writing a register, again very much unlike DSA, and retrieving the
source port on RX is up in the air.

> > What I would do is I would expand the fec driver into
> > something that, on capable SoCs, detects bridging of the ENET_MAC0
> > and ENETC_MAC1 ports and configures the switch accordingly to offload
> > that in a seamless manner for the user.
>
> Do you propose to catch some kind of notification when user calls:
>
> ip link add name br0 type bridge; ip link set br0 up;
> ip link set lan1 up; ip link set lan2 up;
> ip link set lan1 master br0; ip link set lan2 master br0;
> bridge link
>
> And then configure the FEC driver to use this L2 switch driver?

Yes, that can be summarized as:
One option to get this upstream would be to transform fec into a
switchdev driver, or to create a mtip switchdev driver that shares a lot
of code with the fec driver. The correct tool for code sharing is
EXPORT_SYMBOL_GPL, not DSA.

> > This would also solve your
> > power management issues, since the entire Ethernet block would be
> > handled by a single driver. DSA is a complication you do not need.
> > Convince me otherwise.
>
> From what I see the MoreThanIP IP block looks like a "typical" L2 switch
> (like lan9xxx), with VLAN tagging support, static and dynamic tables,
> forcing the packet to be passed to port [*], congestion management,
> switch input buffer, priority of packets/queues, broadcast, multicast,
> port snooping, and even IEEE1588 timestamps.
>
> Seems like a lot of useful features.

I did not say that it is not a typical switch, or that it doesn't have
useful features. I said that DSA does not help you. Adding a
DSA_TAG_PROTO_NONE driver in 2020 is a no-go all around.

> The differences from "normal" DSA switches:
>
> 1. It uses mapped memory (for its register space) for
> configuration/statistics gathering (instead of e.g. SPI, I2C)

Nope, that's not a difference.

> 2. The TAG is not appended to the frame outgoing from the "master" FEC
> port - it can be setup when DMA transfers packet to MTIP switch internal
> buffer.

That is a difference indeed. Since DSA switches are isolated from their
host interface, and there has been a traditional separation at the
MAC-to-MAC layer (or MAC-to-PHY-to-PHY-to-MAC in weird cases), the
routing information is passed in-band via a header in the packet. The
host interface has no idea that this header exists, it is just traffic
as usual. The whole DSA infrastructure is built around intercepting and
decoding that.

> Note:
>
> [*] - The same situation is in the VF610 and IMX28:
> The ESW_FFEN register - Bit 0 -> FEN
>
> "When set, the next frame received from port 0 (the local DMA port) is
> forwarded to the ports defined in FD. The bit resets to zero
> automatically when one frame from port 0 has been processed by the
> switch (i.e. has been read from the port 0 input buffer; see Figure
> 32-1). Therefore, the bit must be set again as necessary. See also
> Section 32.5.8.2, "Forced Forwarding" for a description."
>
> (Of course the "Section 32.5.8.2" is not available)
>
>
> According to above the "tag" (engress port) is set when DMA transfers
> the packet to input MTIP buffer. This shall allow force forwarding as
> we can setup this bit when we normally append tag in the network stack.
>
> I will investigate this issue - and check the port separation. If it
> works then DSA (or switchdev) shall be used?

Source port identification and individual egress port addressing are
things that should behave absolutely the same regardless of whether you
have a DSA or a pure switchdev driver. I don't think that this is clear
enough to you. DSA with DSA_TAG_PROTO_NONE is not the shortcut you're
looking for. Please take a look at Documentation/networking/switchdev.rst,
it explains pretty clearly that a switchdev port must still expose the
same interface as any other net device.

> (A side question - DSA uses switchdev, so when one shall use switchdev
> standalone?)

Short answer: if the system-side packet I/O interface is Ethernet and
the hardware ownership is clearly delineated at the boundary of that
Ethernet port, then it should be DSA, otherwise it shouldn't.

But nonetheless, this raises a fundamental question, and I'll indulge in
attempting to answer it more comprehensively.

You seem to be tapping into an idea which has been circulating for a
while, and which can be also found in Documentation/networking/dsa/dsa.rst:

-----------------------------[cut here]-----------------------------

TODO
====

Making SWITCHDEV and DSA converge towards an unified codebase
-------------------------------------------------------------

SWITCHDEV properly takes care of abstracting the networking stack with offload
capable hardware, but does not enforce a strict switch device driver model. On
the other DSA enforces a fairly strict device driver model, and deals with most
of the switch specific. At some point we should envision a merger between these
two subsystems and get the best of both worlds.

-----------------------------[cut here]-----------------------------

IMO this is never going to happen, nor should it.
To unify DSA and switchdev would mean to force plain switchdev drivers
to have a stricter separation between the control path and the data
path, a la DSA. So, just like DSA has separate drivers for the switch,
the tagging protocol and for the DSA master, plain switchdev would need
a separate driver too for the DMA/rings/queues, or whatever, of the
switch (the piece of code that implements the NAPI instance). That
driver would need to:
(a) expose a net device for the DMA interface
(b) fake a DSA tag that gets added by the DMA net device, just to be
later removed by the switchdev net device.
This is just for compatibility with what DSA _needs_ to do to drive
hardware that was _designed_ to work that way, and where the _hardware_
adds that tag. But non-DSA drivers don't need any of that, nor does it
really help them in any way! So why should we model things this way?
I don't think the lines between plain switchdev and DSA are blurry at all.
And especially not in this case. You should not have a networking
interface registered to the system for DMA-0 at all, just for MAC-0 and
MAC-1.

2020-11-28 01:15:55

by Lukasz Majewski

[permalink] [raw]
Subject: Re: [RFC 0/4] net: l2switch: Provide support for L2 switch on i.MX28 SoC

Hi Vladimir,

> On Fri, Nov 27, 2020 at 12:35:49AM +0100, Lukasz Majewski wrote:
> > > > - The question regarding power management - at least for my use
> > > > case there is no need for runtime power management. The L2
> > > > switch shall work always at it connects other devices.
> > > >
> > > > - The FEC clock is also used for L2 switch management and
> > > > configuration (as the L2 switch is just in the same, large IP
> > > > block). For now I just keep it enabled so DSA code can use it.
> > > > It looks a bit problematic to export fec_enet_clk_enable() to be
> > > > reused on DSA code.
> > > >
> > > > Links:
> > > > [0] - "i.MX28 Applications Processor Reference Manual, Rev. 2,
> > > > 08/2013" [1] -
> > > > https://github.com/lmajewski/linux-imx28-l2switch/commit/e3c7a6eab73401e021aef0070e1935a0dba84fb5
> > > >
> > >
> > > Disclaimer: I don't know the details of imx28, it's just now that
> > > I downloaded the reference manual to see what it's about.
> > >
> > > I would push back and say that the switch offers bridge
> > > acceleration for the FEC.
> >
> > Am I correct, that the "bridge acceleration" means in-hardware
> > support for L2 packet bridging?
> >
> > And without the switch IP block enabled one shall be able to have
> > software bridging in Linux in those two interfaces?
>
> So if the switch is bypassed through pin strapping of sx_ena, then the
> DMA0 would be connected to ENETC-MAC 0, DMA1 to ENET-MAC 1, and both
> these ports could be driven by the regular fec driver, am I right?
> This is how people use the imx28 with mainline linux right now, no?

Yes. This is the current situation.

>
> When the sx_ena signal enables the switch, a hardware accelerator
> appears between the DMA engine and the same MACs.

Yes.

> But that DMA engine
> is still compatible with the expectations of the fec driver.

Yes. This is why I can reuse the fec_main.c driver.

> And the
> MACs still belong to the FEC.

Yes. I do use mdio communication between MAC and PHYs (by re-using FEC
driver code).

> So, at the end of the day, there are
> still 2 FEC interfaces.

I'm a bit confused with the above sentence. What do you mean "2 FEC
interfaces"?

There is the FEC driver (fec_main.c), which handles DMA0
management/setup and communication with PHYs via MDIO (the FEC driver
setups MDIO, link type, speed, provides info if link is up or down).

>
> Where I was going is that from a user's perspective, it would be
> natural to have the exact same view of the system in both cases, aka
> still two network interfaces for MAC 0 and MAC 1, they still function
> in the same way (i.e. you can still ping through them) but with the
> additional ability to do hardware-accelerating bridging, if the MAC 0
> and MAC 1 network interfaces are put under the same bridge.

At least on imx287 ENET-MAC1 needs MAC0 to be configured:
https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/freescale/fec_main.c#L2065

Those two MACs are not fully independent - functionality of MAC1 is
reduced.

>
> Currently, you do not offer that, at all.
> You split the MTIP switch configuration between DSA and the FEC
> driver. But you are still not exposing networking-capable net devices
> for MAC 0 and MAC 1.

As fair as I can tell (and tested it) - it is possible to bring up
"lan{12}" interfaces with `ip l s lan{12}`

I would also like to set the static table with altering the FDB switch
memory/configuration. Access to it seems natural via lan{12} interfaces
(but of course could be possible from eth0).

Yes, I can send data via eth0 as I don't add tag to it. As pointed out
by Andrew - I could use VLAN tags (or play with FEC_FFEN register).

> You still do I/O through the DMA engine.

Yes, DMA0 is connected to MTIP port0 (input port from the imx28 SoC).
The MTIP port1 -> MAC0 and port2 -> MAC1

>
> So why use DSA at all? What benefit does it bring you? Why not do the
> entire switch configuration from within FEC, or a separate driver very
> closely related to it?

Mine rationale to use DSA and FEC:
- Make as little changes to FEC as possible

- Provide separate driver to allow programming FDB, MDB, VLAN setup.
This seems straightforward as MTIP has separate memory region (from
FEC) for switch configuration, statistics, learning, static table
programming. What is even more bizarre FEC and MTIP have the same 8
registers (with different base address and +4 offset :-) ) as
interface to handle DMA0 transfers.

- According to MTIP description from NXP documentation, there is a
separate register for frame forwarding, so it _shall_ also fit into
DSA.


For me it would be enough to have:

- lan{12} - so I could enable/disable it on demand (control when switch
ports are passing or not packets).

- Use standard net tools (like bridge) to setup FDB/MDB, vlan

- Read statistics from MTIP ports (all of them)

- I can use lan1 (bridged or not) to send data outside. It would be
also correct to use eth0.

I'm for the most pragmatic (and simple) solution, which fulfill above
requirements.

>
> > > The fact that the bridge acceleration is provided by a
> > > different vendor and requires access to an extra set of register
> > > blocks is immaterial.
> >
> > Am I correct that you mean not important above (i.e. immaterial ==
> > not important)?
>
> Yes, sorry if that was not clear.

Ok. I was just not sure.

>
> > > To qualify as a DSA switch, you need to have
> > > indirect networking I/O through a different network interface.
> > > You do not have that.
> >
> > I do have eth0 (DMA0) -> MoreThanIP switch port 0 input
> > |
> > |----> switch port1 -> ENET-MAC0
> > |
> > |----> switch port2 -> ENET-MAC1
>
> The whole point of DSA is to intercept traffic from a different and
> completely unaware network interface, parse a tag, and masquerade the
> packet as though it came on a different, virtual, network interface
> corresponding to the switch. DSA offers this service so that:
> - any switch works with any host Ethernet controller
> - all vendors use the same mechanism and do not reinvent the same
> wheel The last part is especially relevant. Just grep net/core/ for
> "dsa" and be amazed at the number of hacks there. DSA can justify
> that only by scale. Aka "we have hardware that works this way, and
> doesn't work any other way. And lots of it".

That is why I wanted to use FEC as "fixed-link" interface, so large
portion of it would be disabled (e.g. the ENET-MAC management for eth0).

>
> That being said, in your case, the network interface is not unaware of
> the switch at all.

Unfortunately, you are right here. FEC, MTIP, ENET-MAC are very closely
coupled.

> Come on, you need to put the (allegedly) switch's
> MAC in promiscuous mode, from the "DSA master" driver! Hardware
> resource ownership is very DSA-unlike in the imx28/vybrid model, it
> seems. This is a very big deal.

Ok.

>
> And there is no tag to parse. You have a switch with a DMA engine
> which is a priori known to be register-compatible with the DMA engine
> used for plain FEC interfaces. It's not like you have a switch that
> can be connected via MII to anything and everything. Force forwarding
> is done via writing a register, again very much unlike DSA, and
> retrieving the source port on RX is up in the air.

Yes. You are correct here.

>
> > > What I would do is I would expand the fec driver into
> > > something that, on capable SoCs, detects bridging of the ENET_MAC0
> > > and ENETC_MAC1 ports and configures the switch accordingly to
> > > offload that in a seamless manner for the user.
> >
> > Do you propose to catch some kind of notification when user calls:
> >
> > ip link add name br0 type bridge; ip link set br0 up;
> > ip link set lan1 up; ip link set lan2 up;
> > ip link set lan1 master br0; ip link set lan2 master br0;
> > bridge link
> >
> > And then configure the FEC driver to use this L2 switch driver?
>
> Yes, that can be summarized as:
> One option to get this upstream would be to transform fec into a
> switchdev driver, or to create a mtip switchdev driver that shares a
> lot of code with the fec driver. The correct tool for code sharing is
> EXPORT_SYMBOL_GPL, not DSA.

One more note - at least for imx28 you can have L2 switch or FEC, not
both.

Considering the above - you would recommend using the switchdev
framework for MTIP?

>
> > > This would also solve your
> > > power management issues, since the entire Ethernet block would be
> > > handled by a single driver. DSA is a complication you do not need.
> > > Convince me otherwise.
> >
> > From what I see the MoreThanIP IP block looks like a "typical" L2
> > switch (like lan9xxx), with VLAN tagging support, static and
> > dynamic tables, forcing the packet to be passed to port [*],
> > congestion management, switch input buffer, priority of
> > packets/queues, broadcast, multicast, port snooping, and even
> > IEEE1588 timestamps.
> >
> > Seems like a lot of useful features.
>
> I did not say that it is not a typical switch, or that it doesn't have
> useful features. I said that DSA does not help you. Adding a
> DSA_TAG_PROTO_NONE driver in 2020 is a no-go all around.

Ok. I see.

>
> > The differences from "normal" DSA switches:
> >
> > 1. It uses mapped memory (for its register space) for
> > configuration/statistics gathering (instead of e.g. SPI, I2C)
>
> Nope, that's not a difference.

Ok.

>
> > 2. The TAG is not appended to the frame outgoing from the "master"
> > FEC port - it can be setup when DMA transfers packet to MTIP switch
> > internal buffer.
>
> That is a difference indeed. Since DSA switches are isolated from
> their host interface, and there has been a traditional separation at
> the MAC-to-MAC layer (or MAC-to-PHY-to-PHY-to-MAC in weird cases), the
> routing information is passed in-band via a header in the packet. The
> host interface has no idea that this header exists, it is just traffic
> as usual. The whole DSA infrastructure is built around intercepting
> and decoding that.
>

Ok, so MTIP doesn't provide the core functionality for DSA out of the
box.

> > Note:
> >
> > [*] - The same situation is in the VF610 and IMX28:
> > The ESW_FFEN register - Bit 0 -> FEN
> >
> > "When set, the next frame received from port 0 (the local DMA port)
> > is forwarded to the ports defined in FD. The bit resets to zero
> > automatically when one frame from port 0 has been processed by the
> > switch (i.e. has been read from the port 0 input buffer; see Figure
> > 32-1). Therefore, the bit must be set again as necessary. See also
> > Section 32.5.8.2, "Forced Forwarding" for a description."
> >
> > (Of course the "Section 32.5.8.2" is not available)
> >
> >
> > According to above the "tag" (engress port) is set when DMA
> > transfers the packet to input MTIP buffer. This shall allow force
> > forwarding as we can setup this bit when we normally append tag in
> > the network stack.
> >
> > I will investigate this issue - and check the port separation. If it
> > works then DSA (or switchdev) shall be used?
>
> Source port identification and individual egress port addressing are
> things that should behave absolutely the same regardless of whether
> you have a DSA or a pure switchdev driver. I don't think that this is
> clear enough to you. DSA with DSA_TAG_PROTO_NONE is not the shortcut
> you're looking for. Please take a look at
> Documentation/networking/switchdev.rst, it explains pretty clearly
> that a switchdev port must still expose the same interface as any
> other net device.

Ok.

>
> > (A side question - DSA uses switchdev, so when one shall use
> > switchdev standalone?)
>
> Short answer: if the system-side packet I/O interface is Ethernet and
> the hardware ownership is clearly delineated at the boundary of that
> Ethernet port, then it should be DSA, otherwise it shouldn't.
>

Ok.

> But nonetheless, this raises a fundamental question, and I'll indulge
> in attempting to answer it more comprehensively.
>
> You seem to be tapping into an idea which has been circulating for a
> while,

There was only NXP's legacy driver (2.6.35), which added support for
MTIP switch. Up till then nobody tried to upstream it....

I do have this driver forward ported to v4.19.y:
https://github.com/lmajewski/linux-imx28-l2switch/commits/master

but its long term maintenance is at best awkward.

> and which can be also found in
> Documentation/networking/dsa/dsa.rst:
>
> -----------------------------[cut here]-----------------------------
>
> TODO
> ====
>
> Making SWITCHDEV and DSA converge towards an unified codebase
> -------------------------------------------------------------
>
> SWITCHDEV properly takes care of abstracting the networking stack
> with offload capable hardware, but does not enforce a strict switch
> device driver model. On the other DSA enforces a fairly strict device
> driver model, and deals with most of the switch specific. At some
> point we should envision a merger between these two subsystems and
> get the best of both worlds.
>
> -----------------------------[cut here]-----------------------------
>
> IMO this is never going to happen, nor should it.
> To unify DSA and switchdev would mean to force plain switchdev drivers
> to have a stricter separation between the control path and the data
> path, a la DSA. So, just like DSA has separate drivers for the switch,
> the tagging protocol and for the DSA master, plain switchdev would
> need a separate driver too for the DMA/rings/queues, or whatever, of
> the switch (the piece of code that implements the NAPI instance). That
> driver would need to:
> (a) expose a net device for the DMA interface
> (b) fake a DSA tag that gets added by the DMA net device, just to be
> later removed by the switchdev net device.

I also thought for a while to append "tag" byte ind tag_*.c file and
then parse it (+ write FEC_FFEN) and remove when the DMA transfer is
setup. But this seems to be not the "simple solution".

> This is just for compatibility with what DSA _needs_ to do to drive
> hardware that was _designed_ to work that way, and where the
> _hardware_ adds that tag. But non-DSA drivers don't need any of that,
> nor does it really help them in any way! So why should we model
> things this way? I don't think the lines between plain switchdev and
> DSA are blurry at all. And especially not in this case. You should
> not have a networking interface registered to the system for DMA-0 at
> all, just for MAC-0 and MAC-1.

Indeed, there should be lan1, lan2 (and no eth0). Unfortunately, now
FEC driver without switch enabled (on e.g. imx28) only exports eth0.


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH, Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: [email protected]


Attachments:
(No filename) (499.00 B)
OpenPGP digital signature

2020-11-28 04:52:25

by Florian Fainelli

[permalink] [raw]
Subject: Re: [RFC 0/4] net: l2switch: Provide support for L2 switch on i.MX28 SoC



On 11/27/2020 4:33 PM, Lukasz Majewski wrote:
>> So why use DSA at all? What benefit does it bring you? Why not do the
>> entire switch configuration from within FEC, or a separate driver very
>> closely related to it?
>
> Mine rationale to use DSA and FEC:
> - Make as little changes to FEC as possible

Which is entirely possible if you stick to Vladimir suggestions of
exporting services for the MTIP switch driver.

>
> - Provide separate driver to allow programming FDB, MDB, VLAN setup.
> This seems straightforward as MTIP has separate memory region (from
> FEC) for switch configuration, statistics, learning, static table
> programming. What is even more bizarre FEC and MTIP have the same 8
> registers (with different base address and +4 offset :-) ) as
> interface to handle DMA0 transfers.

OK, not sure how that is relevant here? The register organization should
never ever dictate how to pick a particular subsystem.

>
> - According to MTIP description from NXP documentation, there is a
> separate register for frame forwarding, so it _shall_ also fit into
> DSA.

And yet it does not, Vladimir went into great length into explaining
what makes the MTIP + dual FEC different here and why it does not
qualify for DSA. Basically any time you have DMA + integrated switch
tightly coupled you have what we have coined a "pure switchdev" wrapper.

>
>
> For me it would be enough to have:
>
> - lan{12} - so I could enable/disable it on demand (control when switch
> ports are passing or not packets).
>
> - Use standard net tools (like bridge) to setup FDB/MDB, vlan
>
> - Read statistics from MTIP ports (all of them)
>
> - I can use lan1 (bridged or not) to send data outside. It would be
> also correct to use eth0.

You know you can do that without having DSA, right? Look at mlxsw, look
at rocker. You can call multiple times register_netdevice() with custom
network devices that behave differently whether HW bridging offload is
offered or not, whether the switch is declared in Device Tree or not.

>
> I'm for the most pragmatic (and simple) solution, which fulfill above
> requirements.

The most pragmatic solution is to implement switchdev operations to
offer HW bridging offload, VLAN programming, FDB/MDB programming.

It seems to me that you are trying to look for a framework to avoid
doing a bit of middle layer work between switchdev and the FEC driver
and that is not setting you for success.
--
Florian

2020-11-29 22:03:44

by Lukasz Majewski

[permalink] [raw]
Subject: Re: [RFC 0/4] net: l2switch: Provide support for L2 switch on i.MX28 SoC

Hi Florian,

> On 11/27/2020 4:33 PM, Lukasz Majewski wrote:
> >> So why use DSA at all? What benefit does it bring you? Why not do
> >> the entire switch configuration from within FEC, or a separate
> >> driver very closely related to it?
> >
> > Mine rationale to use DSA and FEC:
> > - Make as little changes to FEC as possible
>
> Which is entirely possible if you stick to Vladimir suggestions of
> exporting services for the MTIP switch driver.

Ok.

>
> >
> > - Provide separate driver to allow programming FDB, MDB, VLAN setup.
> > This seems straightforward as MTIP has separate memory region
> > (from FEC) for switch configuration, statistics, learning, static
> > table programming. What is even more bizarre FEC and MTIP have the
> > same 8 registers (with different base address and +4 offset :-) ) as
> > interface to handle DMA0 transfers.
>
> OK, not sure how that is relevant here? The register organization
> should never ever dictate how to pick a particular subsystem.
>
> >
> > - According to MTIP description from NXP documentation, there is a
> > separate register for frame forwarding, so it _shall_ also fit
> > into DSA.
>
> And yet it does not, Vladimir went into great length into explaining
> what makes the MTIP + dual FEC different here and why it does not
> qualify for DSA.

I'm very grateful for this insight and explanation from Vladimir.

> Basically any time you have DMA + integrated switch
> tightly coupled you have what we have coined a "pure switchdev"
> wrapper.

Ok.

>
> >
> >
> > For me it would be enough to have:
> >
> > - lan{12} - so I could enable/disable it on demand (control when
> > switch ports are passing or not packets).
> >
> > - Use standard net tools (like bridge) to setup FDB/MDB, vlan
> >
> > - Read statistics from MTIP ports (all of them)
> >
> > - I can use lan1 (bridged or not) to send data outside. It would be
> > also correct to use eth0.
>
> You know you can do that without having DSA, right? Look at mlxsw,
> look at rocker. You can call multiple times register_netdevice() with
> custom network devices that behave differently whether HW bridging
> offload is offered or not, whether the switch is declared in Device
> Tree or not.

I will look into those examples and try to follow them for MTIP.

>
> >
> > I'm for the most pragmatic (and simple) solution, which fulfill
> > above requirements.
>
> The most pragmatic solution is to implement switchdev operations to
> offer HW bridging offload, VLAN programming, FDB/MDB programming.

Ok.

>
> It seems to me that you are trying to look for a framework to avoid
> doing a bit of middle layer work between switchdev and the FEC driver
> and that is not setting you for success.

I'm not afraid to rework FEC. I just thought that DSA would be the best
fit for the MTIP. However, after posting the RFC, the community gave me
arguments that I was wrong.

I'm happy for having so detailed feedback :-).


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH, Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: [email protected]


Attachments:
(No filename) (499.00 B)
OpenPGP digital signature

2021-06-17 11:30:39

by Lukasz Majewski

[permalink] [raw]
Subject: Re: [RFC 0/4] net: l2switch: Provide support for L2 switch on i.MX28 SoC

Hi Andrew,

> > > I would push back and say that the switch offers bridge
> > > acceleration for the FEC.
> >
> > Am I correct, that the "bridge acceleration" means in-hardware
> > support for L2 packet bridging?
>
> You should think of the hardware as an accelerator, not a switch. The
> hardware is there to accelerate what linux can already do. You setup a
> software bridge in linux, and then offload L2 switching to the
> accelerator. You setup vlans in linux, and then offload the filtering
> of them to the accelerator. If there is something linux can do, but
> the hardware cannot accelerate, you leave linux to do it in software.
>
> > Do you propose to catch some kind of notification when user calls:
> >
> > ip link add name br0 type bridge; ip link set br0 up;
> > ip link set lan1 up; ip link set lan2 up;
> > ip link set lan1 master br0; ip link set lan2 master br0;
> > bridge link

^^^^^^^^^^^^^ [*]

> >
> > And then configure the FEC driver to use this L2 switch driver?
>
> That is what switchdev does. There are various hooks in the network
> stack which call into switchdev to ask it to offload operations to the
> accelerator.

I'm a bit confused about the interfaces that pop up when I do enable
bridging acceleration.

Without bridge I do have:
- eth0 (this is a 'primary' interface -> it also controls MII/PHY for
eth1)
- eth1 (it uses the MII/PHY control from eth0)

Both interfaces works correctly.

And after starting the bridge (and switchdev) with commands from [*] I
do have:

- br0 (created bridge - need to assign IP to it to communicate outside,
even when routing is set via eth0, and eth0 has the same IP address)
- eth0 (just is used to control PHY - ifconfig up/down)
- eth1 (just is used to control PHY - ifconfig up/down)

And now the question, how internally shall I tackle the transmission
(i.e. DMA setups)?

Now, I do use some hacks to re-use code for eth0 to perform the
transmission from/to imx28 L2 switch. The eth1 is stopped (it only
controls the PHY - responds to MII interrupts).

The above setup works, and the code adjustment for fec_main.c driver is
really small.

However, I do wonder how conceptually it "mix" with br0 interface? I
could leave br0 as is, but then why do I need to asign the IP address
to it to communicate?
As I need to do it - then conceptually (re-)using eth0 internal
structures (and the driver in fact) looks like some kind of abusement.
However, adding the transmission handling to br0 net device would bloat
and potentially duplicate the code.

I would prefer to re-use code from eth0 interface - it would be also
easier to cleanu up after disabling the L2 switch.

Any feedback and help is more than welcome.

>
> > The differences from "normal" DSA switches:
> >
> > 1. It uses mapped memory (for its register space) for
> > configuration/statistics gathering (instead of e.g. SPI, I2C)
>
> That does not matter. And there are memory mapped DSA switches. The
> DSA framework puts no restrictions on how the control plane works.
>
> > (Of course the "Section 32.5.8.2" is not available)
>
> It is in the Vybrid datasheet :-)
>
> Andrew



Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH, Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: [email protected]


Attachments:
(No filename) (499.00 B)
OpenPGP digital signature

2021-06-17 16:08:34

by Andrew Lunn

[permalink] [raw]
Subject: Re: [RFC 0/4] net: l2switch: Provide support for L2 switch on i.MX28 SoC

On Thu, Jun 17, 2021 at 01:08:21PM +0200, Lukasz Majewski wrote:
> Hi Andrew,
>
> > > > I would push back and say that the switch offers bridge
> > > > acceleration for the FEC.
> > >
> > > Am I correct, that the "bridge acceleration" means in-hardware
> > > support for L2 packet bridging?
> >
> > You should think of the hardware as an accelerator, not a switch. The
> > hardware is there to accelerate what linux can already do. You setup a
> > software bridge in linux, and then offload L2 switching to the
> > accelerator. You setup vlans in linux, and then offload the filtering
> > of them to the accelerator. If there is something linux can do, but
> > the hardware cannot accelerate, you leave linux to do it in software.
> >
> > > Do you propose to catch some kind of notification when user calls:
> > >
> > > ip link add name br0 type bridge; ip link set br0 up;
> > > ip link set lan1 up; ip link set lan2 up;
> > > ip link set lan1 master br0; ip link set lan2 master br0;
> > > bridge link
>
> ^^^^^^^^^^^^^ [*]
>
> > >
> > > And then configure the FEC driver to use this L2 switch driver?
> >
> > That is what switchdev does. There are various hooks in the network
> > stack which call into switchdev to ask it to offload operations to the
> > accelerator.
>
> I'm a bit confused about the interfaces that pop up when I do enable
> bridging acceleration.
>
> Without bridge I do have:
> - eth0 (this is a 'primary' interface -> it also controls MII/PHY for
> eth1)
> - eth1 (it uses the MII/PHY control from eth0)

You need to be careful with wording here. eth0 might provide an MDIO
bus which eth1 PHY is connected to, but eth1 is controlling the PHY,
not eth0.

>
> Both interfaces works correctly.
>
> And after starting the bridge (and switchdev) with commands from [*] I
> do have:

You don't start switchdev. switchdev is just an API between the
network stack and the hardware accelerator.

> - br0 (created bridge - need to assign IP to it to communicate outside,
> even when routing is set via eth0, and eth0 has the same IP address)

The IP address should only be on the bridge.

> - eth0 (just is used to control PHY - ifconfig up/down)
> - eth1 (just is used to control PHY - ifconfig up/down)

It is more than that. You can still send on these interfaces. e.g. if
you are using ldap, or L2 PTP, the daemons will use these interfaces,
not the bridge, since they want to send frames out a specific
interface.

> And now the question, how internally shall I tackle the transmission
> (i.e. DMA setups)?
>
> Now, I do use some hacks to re-use code for eth0 to perform the
> transmission from/to imx28 L2 switch.

Avoid hacks. Always. Step back. Look at the hardware. How is the
hardware meant to be used when using the switch?

You have two choices.

1) A DSA style driver. The SoC has an Ethernet interface connect to
one port of the switch. In this case, there should be two additional
ports of the switch connected to the outside world, i.e. the switch
has 3 ports.

2) A pure switchdev driver. You DMA frames directly to the switch
ingree/egress queues for each port connected to the outside world. In
this case, you have a 2 port switch.

Once you get the architecture correct, then you can think about the
driver. Maybe you can reuse parts of the FEC? Maybe not.

Andrew