One of the reasons of this proposition is safety and performance -
host should not receive traffic which is not designated for it.
Some network devices can hold separate address tables for vlans and
real device, but for some reason there is no possibility to apply it
with generic net addressing scheme easily. At this moment the fastest
solution is to add mcast/ucast entries for every created vlan
including real device. But it not only consumes forwarding table but
also adds holes in the filtering and thus wastes cpus cycles.
This patchseries tries to correct core to assign mcast and ucast
addresses only for vlans that really require it and as result an end
driver can exclusively and simply set its rx filters. As an example
it's implemented on cpsw TI driver, but generic changes provided by
this series can be reused by other ethernet drivers having similar
rx filter address possibilities.
An address+vid is considered as separate address. The reserved device
address length is 32 Bytes, for ethernet devices it's additional
opportunity to pass auxiliary address info, like virtual ID
identifying a device the address belongs to. This series makes it
possible at least for ETH_P_8021Q.
Thus end real device can setup separate tables for virtual devices
just retrieving VID from the address. A device address space can
maintain addresses and references on them separately for each virtual
device if it needs so, or only addresses for real device (and all its
vlans) it holds usually.
A vlan device can be in any place of device chain upper real device,
say smth like rdevice/bonding/vlan or even rdevice/macvlan/vlan.
This series is verified on TI am572x EVM that can hold separate tables
for vlans. Potentially it can be easily extended to netcp driver for
keystone 2 boards (including k2g) and also new am6 chipsets. As a
simple test case, different combinations of vlan+macvlan, macvlan+vlan
were used and tested as with unicast as multicast addresses.
Based on net-next/master
It's continuation of RFC:
[RFC PATCH net-next 0/5] net: allow hw addresses for virtual device
https://lkml.org/lkml/2018/12/3/817
Ivan Khoronzhuk (6):
net: core: dev_addr_lists: add VID to device address
net: 8021q: vlan_dev: add vid tag to addresses of uc and mc lists
net: 8021q: vlan_dev: add vid tag for vlan device own address
ethernet: eth: add default vid len for all ehternet kind devices
net: ethernet: ti: cpsw: update mc filtering to use IVDF
net: ethernet: ti: cpsw: add macvlan and ucast/vlan filtering support
drivers/net/ethernet/ti/Kconfig | 1 +
drivers/net/ethernet/ti/cpsw.c | 139 ++++++++++++--------------------
include/linux/if_vlan.h | 2 +
include/linux/netdevice.h | 4 +
net/8021q/Kconfig | 12 +++
net/8021q/vlan.c | 3 +
net/8021q/vlan.h | 2 +
net/8021q/vlan_core.c | 25 ++++++
net/8021q/vlan_dev.c | 103 ++++++++++++++++++-----
net/core/dev_addr_lists.c | 124 ++++++++++++++++++++++------
net/ethernet/eth.c | 10 ++-
11 files changed, 292 insertions(+), 133 deletions(-)
--
2.17.1
The cpsw can filter multicast addresses only per vlan. Thus if mcast
address is set for one of them or only for real device it must be
added for every created vlan consuming ALE table w/o reason. In order to
simplify dispatching vlan filters, the IVDF recently added is resused.
In case IVDF is disabled - mc is updated only for real device as before.
The previous method is harder to reuse and vlan filtering is limited
only for vlans directly connected to real netdev, so drop it in flavor
of IVDF decision.
Signed-off-by: Ivan Khoronzhuk <[email protected]>
---
drivers/net/ethernet/ti/Kconfig | 1 +
drivers/net/ethernet/ti/cpsw.c | 113 ++++----------------------------
2 files changed, 13 insertions(+), 101 deletions(-)
diff --git a/drivers/net/ethernet/ti/Kconfig b/drivers/net/ethernet/ti/Kconfig
index bb126be1eb72..c99c08ece9a1 100644
--- a/drivers/net/ethernet/ti/Kconfig
+++ b/drivers/net/ethernet/ti/Kconfig
@@ -65,6 +65,7 @@ config TI_CPSW
select TI_DAVINCI_CPDMA
select TI_DAVINCI_MDIO
select TI_CPSW_PHY_SEL
+ select VLAN_8021Q_IVDF
select TI_CPSW_ALE
select MFD_SYSCON
select REGMAP
diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index a591583d120e..fd76d1f12911 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -693,108 +693,21 @@ static int cpsw_set_mc(struct net_device *ndev, const u8 *addr,
return ret;
}
-static int cpsw_update_vlan_mc(struct net_device *vdev, int vid, void *ctx)
+static int cpsw_add_mc_addr(struct net_device *ndev, const u8 *addr)
{
- struct addr_sync_ctx *sync_ctx = ctx;
- struct netdev_hw_addr *ha;
- int found = 0, ret = 0;
-
- if (!vdev || !(vdev->flags & IFF_UP))
- return 0;
-
- /* vlan address is relevant if its sync_cnt != 0 */
- netdev_for_each_mc_addr(ha, vdev) {
- if (ether_addr_equal(ha->addr, sync_ctx->addr)) {
- found = ha->sync_cnt;
- break;
- }
- }
-
- if (found)
- sync_ctx->consumed++;
-
- if (sync_ctx->flush) {
- if (!found)
- cpsw_set_mc(sync_ctx->ndev, sync_ctx->addr, vid, 0);
- return 0;
- }
-
- if (found)
- ret = cpsw_set_mc(sync_ctx->ndev, sync_ctx->addr, vid, 1);
-
- return ret;
-}
-
-static int cpsw_add_mc_addr(struct net_device *ndev, const u8 *addr, int num)
-{
- struct addr_sync_ctx sync_ctx;
- int ret;
-
- sync_ctx.consumed = 0;
- sync_ctx.addr = addr;
- sync_ctx.ndev = ndev;
- sync_ctx.flush = 0;
-
- ret = vlan_for_each(ndev, cpsw_update_vlan_mc, &sync_ctx);
- if (sync_ctx.consumed < num && !ret)
- ret = cpsw_set_mc(ndev, addr, -1, 1);
-
- return ret;
-}
-
-static int cpsw_del_mc_addr(struct net_device *ndev, const u8 *addr, int num)
-{
- struct addr_sync_ctx sync_ctx;
-
- sync_ctx.consumed = 0;
- sync_ctx.addr = addr;
- sync_ctx.ndev = ndev;
- sync_ctx.flush = 1;
-
- vlan_for_each(ndev, cpsw_update_vlan_mc, &sync_ctx);
- if (sync_ctx.consumed == num)
- cpsw_set_mc(ndev, addr, -1, 0);
+ u16 vid;
+ vid = vlan_dev_get_addr_vid(ndev, addr);
+ cpsw_set_mc(ndev, addr, vid ? vid : -1, 1);
return 0;
}
-static int cpsw_purge_vlan_mc(struct net_device *vdev, int vid, void *ctx)
+static int cpsw_del_mc_addr(struct net_device *ndev, const u8 *addr)
{
- struct addr_sync_ctx *sync_ctx = ctx;
- struct netdev_hw_addr *ha;
- int found = 0;
-
- if (!vdev || !(vdev->flags & IFF_UP))
- return 0;
-
- /* vlan address is relevant if its sync_cnt != 0 */
- netdev_for_each_mc_addr(ha, vdev) {
- if (ether_addr_equal(ha->addr, sync_ctx->addr)) {
- found = ha->sync_cnt;
- break;
- }
- }
-
- if (!found)
- return 0;
-
- sync_ctx->consumed++;
- cpsw_set_mc(sync_ctx->ndev, sync_ctx->addr, vid, 0);
- return 0;
-}
-
-static int cpsw_purge_all_mc(struct net_device *ndev, const u8 *addr, int num)
-{
- struct addr_sync_ctx sync_ctx;
-
- sync_ctx.addr = addr;
- sync_ctx.ndev = ndev;
- sync_ctx.consumed = 0;
-
- vlan_for_each(ndev, cpsw_purge_vlan_mc, &sync_ctx);
- if (sync_ctx.consumed < num)
- cpsw_set_mc(ndev, addr, -1, 0);
+ u16 vid;
+ vid = vlan_dev_get_addr_vid(ndev, addr);
+ cpsw_set_mc(ndev, addr, vid ? vid : -1, 0);
return 0;
}
@@ -816,8 +729,7 @@ static void cpsw_ndo_set_rx_mode(struct net_device *ndev)
cpsw_ale_set_allmulti(cpsw->ale, ndev->flags & IFF_ALLMULTI);
/* add/remove mcast address either for real netdev or for vlan */
- __hw_addr_ref_sync_dev(&ndev->mc, ndev, cpsw_add_mc_addr,
- cpsw_del_mc_addr);
+ __dev_mc_sync(ndev, cpsw_add_mc_addr, cpsw_del_mc_addr);
}
static void cpsw_intr_enable(struct cpsw_common *cpsw)
@@ -1970,9 +1882,6 @@ static int cpsw_restore_vlans(struct net_device *vdev, int vid, void *arg)
{
struct cpsw_priv *priv = arg;
- if (!vdev)
- return 0;
-
cpsw_ndo_vlan_rx_add_vid(priv->ndev, 0, vid);
return 0;
}
@@ -2099,7 +2008,7 @@ static int cpsw_ndo_stop(struct net_device *ndev)
struct cpsw_common *cpsw = priv->cpsw;
cpsw_info(priv, ifdown, "shutting down cpsw device\n");
- __hw_addr_ref_unsync_dev(&ndev->mc, ndev, cpsw_purge_all_mc);
+ __dev_mc_unsync(ndev, cpsw_del_mc_addr);
netif_tx_stop_all_queues(priv->ndev);
netif_carrier_off(priv->ndev);
@@ -3435,6 +3344,7 @@ static int cpsw_probe_dual_emac(struct cpsw_priv *priv)
priv_sl2->emac_port = 1;
cpsw->slaves[1].ndev = ndev;
ndev->features |= NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_HW_VLAN_CTAG_RX;
+ vlan_dev_ivdf_set(ndev, 1);
ndev->netdev_ops = &cpsw_netdev_ops;
ndev->ethtool_ops = &cpsw_ethtool_ops;
@@ -3696,6 +3606,7 @@ static int cpsw_probe(struct platform_device *pdev)
}
ndev->features |= NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_HW_VLAN_CTAG_RX;
+ vlan_dev_ivdf_set(ndev, 1);
ndev->netdev_ops = &cpsw_netdev_ops;
ndev->ethtool_ops = &cpsw_ethtool_ops;
--
2.17.1
The cpsw supports unicast filtering as for real as for vlan devices
now, but has no flag set for that. As result, once macvlan or vlan
adds new ucast address the cpsw is silently toggled to promiscuous
mode. That's smth not expected, so patch fixes it.
A unicast address for vlan has to be presented by vlan/unicast entry
in ALE table. At this moment, while vlan address change, entry is not
created in any form, even just like real device unicast used for
macvlan, leaving only address inherited from real device created
while vlan addition.
Therefore, program unicast entries for vlans by using IVDF, it allows
to add only vlan/unicast entries for vlans, omitting real device
ucast entries unless they are added for macvans or so, as they are
redundant for vlans and just consume forwarding table and in case of
matching packet income - CPU time.
So, after this patch, cpsw has ability to handle macvlan and vlan
ucasts, synchronizing ucast tables for these devices with cpsw ALE
table exclusively.
Signed-off-by: Ivan Khoronzhuk <[email protected]>
---
drivers/net/ethernet/ti/cpsw.c | 62 ++++++++++++++++++++++++++++++----
1 file changed, 56 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index fd76d1f12911..c6d5ddc05299 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -693,6 +693,31 @@ static int cpsw_set_mc(struct net_device *ndev, const u8 *addr,
return ret;
}
+static int cpsw_set_uc(struct net_device *ndev, const u8 *addr,
+ int vid, int add)
+{
+ struct cpsw_priv *priv = netdev_priv(ndev);
+ struct cpsw_common *cpsw = priv->cpsw;
+ int flags, port, ret;
+
+ if (vid < 0) {
+ if (cpsw->data.dual_emac)
+ vid = cpsw->slaves[priv->emac_port].port_vlan;
+ else
+ vid = 0;
+ }
+
+ port = HOST_PORT_NUM;
+ flags = vid ? ALE_VLAN : 0;
+
+ if (add)
+ ret = cpsw_ale_add_ucast(cpsw->ale, addr, port, flags, vid);
+ else
+ ret = cpsw_ale_del_ucast(cpsw->ale, addr, port, flags, vid);
+
+ return ret;
+}
+
static int cpsw_add_mc_addr(struct net_device *ndev, const u8 *addr)
{
u16 vid;
@@ -711,6 +736,24 @@ static int cpsw_del_mc_addr(struct net_device *ndev, const u8 *addr)
return 0;
}
+static int cpsw_add_uc_addr(struct net_device *ndev, const u8 *addr)
+{
+ u16 vid;
+
+ vid = vlan_dev_get_addr_vid(ndev, addr);
+ cpsw_set_uc(ndev, addr, vid ? vid : -1, 1);
+ return 0;
+}
+
+static int cpsw_del_uc_addr(struct net_device *ndev, const u8 *addr)
+{
+ u16 vid;
+
+ vid = vlan_dev_get_addr_vid(ndev, addr);
+ cpsw_set_uc(ndev, addr, vid ? vid : -1, 0);
+ return 0;
+}
+
static void cpsw_ndo_set_rx_mode(struct net_device *ndev)
{
struct cpsw_common *cpsw = ndev_to_cpsw(ndev);
@@ -730,6 +773,7 @@ static void cpsw_ndo_set_rx_mode(struct net_device *ndev)
/* add/remove mcast address either for real netdev or for vlan */
__dev_mc_sync(ndev, cpsw_add_mc_addr, cpsw_del_mc_addr);
+ __dev_uc_sync(ndev, cpsw_add_uc_addr, cpsw_del_uc_addr);
}
static void cpsw_intr_enable(struct cpsw_common *cpsw)
@@ -2009,6 +2053,7 @@ static int cpsw_ndo_stop(struct net_device *ndev)
cpsw_info(priv, ifdown, "shutting down cpsw device\n");
__dev_mc_unsync(ndev, cpsw_del_mc_addr);
+ __dev_uc_unsync(ndev, cpsw_del_uc_addr);
netif_tx_stop_all_queues(priv->ndev);
netif_carrier_off(priv->ndev);
@@ -2369,10 +2414,12 @@ static inline int cpsw_add_vlan_ale_entry(struct cpsw_priv *priv,
if (ret != 0)
return ret;
- ret = cpsw_ale_add_ucast(cpsw->ale, priv->mac_addr,
- HOST_PORT_NUM, ALE_VLAN, vid);
- if (ret != 0)
- goto clean_vid;
+ if (!priv->ndev->vid_len) {
+ ret = cpsw_ale_add_ucast(cpsw->ale, priv->mac_addr,
+ HOST_PORT_NUM, ALE_VLAN, vid);
+ if (ret != 0)
+ goto clean_vid;
+ }
ret = cpsw_ale_add_mcast(cpsw->ale, priv->ndev->broadcast,
mcast_mask, ALE_VLAN, vid, 0);
@@ -2381,8 +2428,9 @@ static inline int cpsw_add_vlan_ale_entry(struct cpsw_priv *priv,
return 0;
clean_vlan_ucast:
- cpsw_ale_del_ucast(cpsw->ale, priv->mac_addr,
- HOST_PORT_NUM, ALE_VLAN, vid);
+ if (!priv->ndev->vid_len)
+ cpsw_ale_del_ucast(cpsw->ale, priv->mac_addr,
+ HOST_PORT_NUM, ALE_VLAN, vid);
clean_vid:
cpsw_ale_del_vlan(cpsw->ale, vid, 0);
return ret;
@@ -3344,6 +3392,7 @@ static int cpsw_probe_dual_emac(struct cpsw_priv *priv)
priv_sl2->emac_port = 1;
cpsw->slaves[1].ndev = ndev;
ndev->features |= NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_HW_VLAN_CTAG_RX;
+ ndev->priv_flags |= IFF_UNICAST_FLT;
vlan_dev_ivdf_set(ndev, 1);
ndev->netdev_ops = &cpsw_netdev_ops;
@@ -3606,6 +3655,7 @@ static int cpsw_probe(struct platform_device *pdev)
}
ndev->features |= NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_HW_VLAN_CTAG_RX;
+ ndev->priv_flags |= IFF_UNICAST_FLT;
vlan_dev_ivdf_set(ndev, 1);
ndev->netdev_ops = &cpsw_netdev_ops;
--
2.17.1
IVDF - individual virtual device filtering. Allows to set per vlan
l2 address filters on end real network device (for unicast and for
multicast) and drop redundant not expected packet income.
If CONFIG_VLAN_8021Q_IVDF is enabled the following changes are
applied, and only for ethernet network devices.
By default every ethernet netdev needs vid len = 2 bytes to be able to
hold up to 4096 vids. So set it for every eth device to be correct,
except vlan devs.
In order to shrink all addresses of devices above vlan, the vid_len
for vlan dev = 0, as result all suckers sync their addresses to common
base not taking in to account vid part (vid_len of "to" devices is
important only). And only vlan device is the source of addresses with
actual its vid set, propagating it to parent devices while rx_mode().
Also, don't bother those ethernet devices that at this moment are not
moved to vlan addressing scheme, so while end ethernet device is
created - set vid_len to 0, thus, while syncing, its address space is
concatenated to one dimensional like usual, and who needs IVDF - set
it to NET_8021Q_VID_TSIZE.
There is another decision - is to inherit vid_len or some feature flag
from end root device in order to all upper devices have vlan extended
address space only if exact end real device have such capability. But
I didn't, because it requires more changes and probably I'm not
familiar with all places where it should be inherited, I would
appreciate if someone can guid where it's applicable, then it could
become a little bit more limited.
Signed-off-by: Ivan Khoronzhuk <[email protected]>
---
include/linux/if_vlan.h | 1 +
net/8021q/Kconfig | 12 ++++++++++++
net/8021q/vlan_core.c | 12 ++++++++++++
net/8021q/vlan_dev.c | 1 +
net/ethernet/eth.c | 10 ++++++++--
5 files changed, 34 insertions(+), 2 deletions(-)
diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index 94657f3c483a..9c914b31d208 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -137,6 +137,7 @@ extern int vlan_for_each(struct net_device *dev,
int (*action)(struct net_device *dev, int vid,
void *arg), void *arg);
extern u16 vlan_dev_get_addr_vid(struct net_device *dev, const u8 *addr);
+extern void vlan_dev_ivdf_set(struct net_device *dev, int enable);
extern struct net_device *vlan_dev_real_dev(const struct net_device *dev);
extern u16 vlan_dev_vlan_id(const struct net_device *dev);
extern __be16 vlan_dev_vlan_proto(const struct net_device *dev);
diff --git a/net/8021q/Kconfig b/net/8021q/Kconfig
index 42320180967f..3e843045739c 100644
--- a/net/8021q/Kconfig
+++ b/net/8021q/Kconfig
@@ -38,3 +38,15 @@ config VLAN_8021Q_MVRP
supersedes GVRP and is not backwards-compatible.
If unsure, say N.
+
+config VLAN_8021Q_IVDF
+ bool "IVDF (Individual Virtual Device Filtering) support"
+ depends on VLAN_8021Q
+ help
+ Select this to enable IVDF addressing scheme support. IVDF is used
+ for automatic propagation of registered VLANs addresses to real end
+ devices. If no device supporting IVDF then disable this as it can
+ consume some memory in configuration with complex network device
+ structures to hold vlan addresses.
+
+ If unsure, say N.
diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
index fe2ac64c13f8..310b6cd39f22 100644
--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -454,6 +454,18 @@ bool vlan_uses_dev(const struct net_device *dev)
}
EXPORT_SYMBOL(vlan_uses_dev);
+void vlan_dev_ivdf_set(struct net_device *dev, int enable)
+{
+#ifdef CONFIG_VLAN_8021Q_IVDF
+ if (enable) {
+ dev->vid_len = NET_8021Q_VID_TSIZE;
+ return;
+ }
+#endif
+ dev->vid_len = 0;
+}
+EXPORT_SYMBOL(vlan_dev_ivdf_set);
+
u16 vlan_dev_get_addr_vid(struct net_device *dev, const u8 *addr)
{
u16 vid = 0;
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index 634436e780f1..e4120aca4b9b 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -896,5 +896,6 @@ void vlan_setup(struct net_device *dev)
dev->min_mtu = 0;
dev->max_mtu = ETH_MAX_MTU;
+ vlan_dev_ivdf_set(dev, 0);
eth_zero_addr(dev->broadcast);
}
diff --git a/net/ethernet/eth.c b/net/ethernet/eth.c
index f7a3d7a171c7..95497cac24eb 100644
--- a/net/ethernet/eth.c
+++ b/net/ethernet/eth.c
@@ -381,6 +381,7 @@ void ether_setup(struct net_device *dev)
dev->flags = IFF_BROADCAST|IFF_MULTICAST;
dev->priv_flags |= IFF_TX_SKB_SHARING;
+ vlan_dev_ivdf_set(dev, 1);
eth_broadcast_addr(dev->broadcast);
}
@@ -404,8 +405,13 @@ EXPORT_SYMBOL(ether_setup);
struct net_device *alloc_etherdev_mqs(int sizeof_priv, unsigned int txqs,
unsigned int rxqs)
{
- return alloc_netdev_mqs(sizeof_priv, "eth%d", NET_NAME_UNKNOWN,
- ether_setup, txqs, rxqs);
+ struct net_device *dev;
+
+ dev = alloc_netdev_mqs(sizeof_priv, "eth%d", NET_NAME_UNKNOWN,
+ ether_setup, txqs, rxqs);
+
+ vlan_dev_ivdf_set(dev, 0);
+ return dev;
}
EXPORT_SYMBOL(alloc_etherdev_mqs);
--
2.17.1
Update vlan mc and uc addresses with VID tag while propagating
addresses to lower devices, do this only if address is not synced.
It allows at end driver level to distinguish addresses belonging
to vlan devices.
Signed-off-by: Ivan Khoronzhuk <[email protected]>
---
include/linux/if_vlan.h | 1 +
net/8021q/vlan.h | 2 ++
net/8021q/vlan_core.c | 13 +++++++++++++
net/8021q/vlan_dev.c | 26 ++++++++++++++++++++++++++
4 files changed, 42 insertions(+)
diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index 4cca4da7a6de..94657f3c483a 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -136,6 +136,7 @@ extern struct net_device *__vlan_find_dev_deep_rcu(struct net_device *real_dev,
extern int vlan_for_each(struct net_device *dev,
int (*action)(struct net_device *dev, int vid,
void *arg), void *arg);
+extern u16 vlan_dev_get_addr_vid(struct net_device *dev, const u8 *addr);
extern struct net_device *vlan_dev_real_dev(const struct net_device *dev);
extern u16 vlan_dev_vlan_id(const struct net_device *dev);
extern __be16 vlan_dev_vlan_proto(const struct net_device *dev);
diff --git a/net/8021q/vlan.h b/net/8021q/vlan.h
index c46daf09a501..f083c43c508f 100644
--- a/net/8021q/vlan.h
+++ b/net/8021q/vlan.h
@@ -6,6 +6,8 @@
#include <linux/u64_stats_sync.h>
#include <linux/list.h>
+#define NET_8021Q_VID_TSIZE 2
+
/* if this changes, algorithm will have to be reworked because this
* depends on completely exhausting the VLAN identifier space. Thus
* it gives constant time look-up, but in many cases it wastes memory.
diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
index a313165e7a67..fe2ac64c13f8 100644
--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -454,6 +454,19 @@ bool vlan_uses_dev(const struct net_device *dev)
}
EXPORT_SYMBOL(vlan_uses_dev);
+u16 vlan_dev_get_addr_vid(struct net_device *dev, const u8 *addr)
+{
+ u16 vid = 0;
+
+ if (dev->vid_len != NET_8021Q_VID_TSIZE)
+ return vid;
+
+ vid = addr[dev->addr_len];
+ vid |= (addr[dev->addr_len + 1] & 0xf) << 8;
+ return vid;
+}
+EXPORT_SYMBOL(vlan_dev_get_addr_vid);
+
static struct sk_buff *vlan_gro_receive(struct list_head *head,
struct sk_buff *skb)
{
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index 15293c2a5dd8..93d20b1f4916 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -249,6 +249,14 @@ void vlan_dev_get_realdev_name(const struct net_device *dev, char *result)
strncpy(result, vlan_dev_priv(dev)->real_dev->name, 23);
}
+static void vlan_dev_set_addr_vid(struct net_device *vlan_dev, u8 *addr)
+{
+ u16 vid = vlan_dev_vlan_id(vlan_dev);
+
+ addr[vlan_dev->addr_len] = vid & 0xff;
+ addr[vlan_dev->addr_len + 1] = (vid >> 8) & 0xf;
+}
+
bool vlan_dev_inherit_address(struct net_device *dev,
struct net_device *real_dev)
{
@@ -480,8 +488,26 @@ static void vlan_dev_change_rx_flags(struct net_device *dev, int change)
}
}
+static void vlan_dev_align_addr_vid(struct net_device *vlan_dev)
+{
+ struct net_device *real_dev = vlan_dev_real_dev(vlan_dev);
+ struct netdev_hw_addr *ha;
+
+ if (!real_dev->vid_len)
+ return;
+
+ netdev_for_each_mc_addr(ha, vlan_dev)
+ if (!ha->sync_cnt)
+ vlan_dev_set_addr_vid(vlan_dev, ha->addr);
+
+ netdev_for_each_uc_addr(ha, vlan_dev)
+ if (!ha->sync_cnt)
+ vlan_dev_set_addr_vid(vlan_dev, ha->addr);
+}
+
static void vlan_dev_set_rx_mode(struct net_device *vlan_dev)
{
+ vlan_dev_align_addr_vid(vlan_dev);
dev_mc_sync(vlan_dev_priv(vlan_dev)->real_dev, vlan_dev);
dev_uc_sync(vlan_dev_priv(vlan_dev)->real_dev, vlan_dev);
}
--
2.17.1
The vlan device address is held separately from uc/mc lists and
handled differently. The vlan dev address is bound with real device
address only if it's inherited from init, in all other cases it's
separate address entry in uc list. With vid set, the address becomes
not inherited from real device after it's set manually as before, but
is part of uc list any way, with appropriate vid tag set. If vid_len
for real device is 0, the behaviour is the same as before this change,
so shouldn't be any impact on systems w/o individual virtual device
filtering (IVDF) enabled. This allows to control and sync vlan device
address and disable concrete vlan packet income when vlan interface is
down.
Signed-off-by: Ivan Khoronzhuk <[email protected]>
---
net/8021q/vlan.c | 3 ++
net/8021q/vlan_dev.c | 76 +++++++++++++++++++++++++++++++++-----------
2 files changed, 60 insertions(+), 19 deletions(-)
diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index dc4411165e43..9c72551a9a1e 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -295,6 +295,9 @@ static void vlan_sync_address(struct net_device *dev,
if (vlan_dev_inherit_address(vlandev, dev))
goto out;
+ if (dev->vid_len)
+ goto out;
+
/* vlan address was different from the old address and is equal to
* the new address */
if (!ether_addr_equal(vlandev->dev_addr, vlan->real_dev_addr) &&
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index 93d20b1f4916..634436e780f1 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -257,12 +257,61 @@ static void vlan_dev_set_addr_vid(struct net_device *vlan_dev, u8 *addr)
addr[vlan_dev->addr_len + 1] = (vid >> 8) & 0xf;
}
+static int vlan_dev_add_addr(struct net_device *dev, u8 *addr)
+{
+ struct net_device *real_dev = vlan_dev_real_dev(dev);
+ unsigned char naddr[ETH_ALEN + NET_8021Q_VID_TSIZE];
+
+ if (real_dev->vid_len) {
+ memcpy(naddr, addr, dev->addr_len);
+ vlan_dev_set_addr_vid(dev, naddr);
+ return dev_vid_uc_add(real_dev, naddr);
+ }
+
+ if (ether_addr_equal(addr, real_dev->dev_addr))
+ return 0;
+
+ return dev_uc_add(real_dev, addr);
+}
+
+static void vlan_dev_del_addr(struct net_device *dev, u8 *addr)
+{
+ struct net_device *real_dev = vlan_dev_real_dev(dev);
+ unsigned char naddr[ETH_ALEN + NET_8021Q_VID_TSIZE];
+
+ if (real_dev->vid_len) {
+ memcpy(naddr, addr, dev->addr_len);
+ vlan_dev_set_addr_vid(dev, naddr);
+ dev_vid_uc_del(real_dev, naddr);
+ return;
+ }
+
+ if (!ether_addr_equal(dev->dev_addr, real_dev->dev_addr))
+ dev_uc_del(real_dev, addr);
+}
+
+static int vlan_dev_subs_addr(struct net_device *dev, u8 *addr)
+{
+ int err;
+
+ err = vlan_dev_add_addr(dev, addr);
+ if (err < 0)
+ return err;
+
+ vlan_dev_del_addr(dev, dev->dev_addr);
+ return err;
+}
+
bool vlan_dev_inherit_address(struct net_device *dev,
struct net_device *real_dev)
{
if (dev->addr_assign_type != NET_ADDR_STOLEN)
return false;
+ if (real_dev->vid_len)
+ if (vlan_dev_subs_addr(dev, real_dev->dev_addr))
+ return false;
+
ether_addr_copy(dev->dev_addr, real_dev->dev_addr);
call_netdevice_notifiers(NETDEV_CHANGEADDR, dev);
return true;
@@ -278,9 +327,10 @@ static int vlan_dev_open(struct net_device *dev)
!(vlan->flags & VLAN_FLAG_LOOSE_BINDING))
return -ENETDOWN;
- if (!ether_addr_equal(dev->dev_addr, real_dev->dev_addr) &&
- !vlan_dev_inherit_address(dev, real_dev)) {
- err = dev_uc_add(real_dev, dev->dev_addr);
+ if (ether_addr_equal(dev->dev_addr, real_dev->dev_addr) ||
+ (!ether_addr_equal(dev->dev_addr, real_dev->dev_addr) &&
+ !vlan_dev_inherit_address(dev, real_dev))) {
+ err = vlan_dev_add_addr(dev, dev->dev_addr);
if (err < 0)
goto out;
}
@@ -312,8 +362,7 @@ static int vlan_dev_open(struct net_device *dev)
if (dev->flags & IFF_ALLMULTI)
dev_set_allmulti(real_dev, -1);
del_unicast:
- if (!ether_addr_equal(dev->dev_addr, real_dev->dev_addr))
- dev_uc_del(real_dev, dev->dev_addr);
+ vlan_dev_del_addr(dev, dev->dev_addr);
out:
netif_carrier_off(dev);
return err;
@@ -331,18 +380,14 @@ static int vlan_dev_stop(struct net_device *dev)
if (dev->flags & IFF_PROMISC)
dev_set_promiscuity(real_dev, -1);
- if (!ether_addr_equal(dev->dev_addr, real_dev->dev_addr))
- dev_uc_del(real_dev, dev->dev_addr);
-
+ vlan_dev_del_addr(dev, dev->dev_addr);
netif_carrier_off(dev);
return 0;
}
static int vlan_dev_set_mac_address(struct net_device *dev, void *p)
{
- struct net_device *real_dev = vlan_dev_priv(dev)->real_dev;
struct sockaddr *addr = p;
- int err;
if (!is_valid_ether_addr(addr->sa_data))
return -EADDRNOTAVAIL;
@@ -350,15 +395,8 @@ static int vlan_dev_set_mac_address(struct net_device *dev, void *p)
if (!(dev->flags & IFF_UP))
goto out;
- if (!ether_addr_equal(addr->sa_data, real_dev->dev_addr)) {
- err = dev_uc_add(real_dev, addr->sa_data);
- if (err < 0)
- return err;
- }
-
- if (!ether_addr_equal(dev->dev_addr, real_dev->dev_addr))
- dev_uc_del(real_dev, dev->dev_addr);
-
+ if (vlan_dev_subs_addr(dev, addr->sa_data))
+ return true;
out:
ether_addr_copy(dev->dev_addr, addr->sa_data);
return 0;
--
2.17.1
Despite this is supposed to be used for Ethernet VLANs, not Ethernet
addresses with space for VID also can reuse this, so VID is considered
as virtual ID extension, not belonging strictly to Ethernet VLAN VIDs,
and overall change can be named individual virtual device filtering
(IVDF).
This patch adds VID tag at the end of each address. The actual
reserved address size is 32 bytes. For Ethernet addresses with 6 bytes
long that's possible to add tag w/o increasing address size. Thus,
each address for the case has 32 - 6 = 26 bytes to hold additional
info, say VID for virtual device addresses.
Therefore, when addresses are synced to the address list of parent
device the address list of latter can contain separate addresses for
virtual devices. It allows to track separate address tables for
virtual devices if they present and the device can be placed on
any place of device tree as the address is propagated to to the end
real device thru *_sync()/ndo_set_rx_mode() APIs. Also it simplifies
handling VID addresses at real device when it supports IVDF.
If parent device doesn't want to have virtual addresses in its address
space the vid_len has to be 0, thus its address space is "shrunk" to
the state as before this patch. For now it's 0 for every device. It
allows two devices with and w/o IVDF to be part of same bond device
for instance.
The end real device supporting IVDF can retrieve VID tag from an
address and set it for a given virtual device only. By default, vid 0
is used for real devices to distinguish it from virtual addresses.
See next patches to see how it's used.
Signed-off-by: Ivan Khoronzhuk <[email protected]>
---
include/linux/netdevice.h | 4 ++
net/core/dev_addr_lists.c | 124 +++++++++++++++++++++++++++++++-------
2 files changed, 105 insertions(+), 23 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 58e83bd7a861..74fef35b6bec 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1660,6 +1660,7 @@ enum netdev_priv_flags {
* @perm_addr: Permanent hw address
* @addr_assign_type: Hw address assignment type
* @addr_len: Hardware address length
+ * @vid_len: Virtual ID length, set in case of IVDF
* @neigh_priv_len: Used in neigh_alloc()
* @dev_id: Used to differentiate devices that share
* the same link layer address
@@ -1889,6 +1890,7 @@ struct net_device {
unsigned char perm_addr[MAX_ADDR_LEN];
unsigned char addr_assign_type;
unsigned char addr_len;
+ unsigned char vid_len;
unsigned short neigh_priv_len;
unsigned short dev_id;
unsigned short dev_port;
@@ -4141,8 +4143,10 @@ int dev_addr_init(struct net_device *dev);
/* Functions used for unicast addresses handling */
int dev_uc_add(struct net_device *dev, const unsigned char *addr);
+int dev_vid_uc_add(struct net_device *dev, const unsigned char *addr);
int dev_uc_add_excl(struct net_device *dev, const unsigned char *addr);
int dev_uc_del(struct net_device *dev, const unsigned char *addr);
+int dev_vid_uc_del(struct net_device *dev, const unsigned char *addr);
int dev_uc_sync(struct net_device *to, struct net_device *from);
int dev_uc_sync_multiple(struct net_device *to, struct net_device *from);
void dev_uc_unsync(struct net_device *to, struct net_device *from);
diff --git a/net/core/dev_addr_lists.c b/net/core/dev_addr_lists.c
index a6723b306717..e3c80e044b8c 100644
--- a/net/core/dev_addr_lists.c
+++ b/net/core/dev_addr_lists.c
@@ -545,6 +545,26 @@ int dev_addr_del(struct net_device *dev, const unsigned char *addr,
}
EXPORT_SYMBOL(dev_addr_del);
+static int get_addr_len(struct net_device *dev)
+{
+ return dev->addr_len + dev->vid_len;
+}
+
+static int set_vid_addr(struct net_device *dev, const unsigned char *addr,
+ unsigned char *naddr)
+{
+ int i;
+
+ if (!dev->vid_len)
+ return dev->addr_len;
+
+ memcpy(naddr, addr, dev->addr_len);
+ for (i = 0; i < dev->vid_len; i++)
+ naddr[dev->addr_len + i] = 0;
+
+ return get_addr_len(dev);
+}
+
/*
* Unicast list handling functions
*/
@@ -556,18 +576,22 @@ EXPORT_SYMBOL(dev_addr_del);
*/
int dev_uc_add_excl(struct net_device *dev, const unsigned char *addr)
{
+ unsigned char naddr[MAX_ADDR_LEN];
struct netdev_hw_addr *ha;
- int err;
+ int addr_len, err;
+
+ addr_len = set_vid_addr(dev, addr, naddr);
+ addr = dev->vid_len ? naddr : addr;
netif_addr_lock_bh(dev);
list_for_each_entry(ha, &dev->uc.list, list) {
- if (!memcmp(ha->addr, addr, dev->addr_len) &&
+ if (!memcmp(ha->addr, addr, addr_len) &&
ha->type == NETDEV_HW_ADDR_T_UNICAST) {
err = -EEXIST;
goto out;
}
}
- err = __hw_addr_create_ex(&dev->uc, addr, dev->addr_len,
+ err = __hw_addr_create_ex(&dev->uc, addr, addr_len,
NETDEV_HW_ADDR_T_UNICAST, true, false);
if (!err)
__dev_set_rx_mode(dev);
@@ -578,47 +602,89 @@ int dev_uc_add_excl(struct net_device *dev, const unsigned char *addr)
EXPORT_SYMBOL(dev_uc_add_excl);
/**
- * dev_uc_add - Add a secondary unicast address
+ * dev_vid_uc_add - Add a secondary unicast address with tag
* @dev: device
- * @addr: address to add
+ * @addr: address to add, includes vid tag already
*
* Add a secondary unicast address to the device or increase
* the reference count if it already exists.
*/
-int dev_uc_add(struct net_device *dev, const unsigned char *addr)
+int dev_vid_uc_add(struct net_device *dev, const unsigned char *addr)
{
int err;
netif_addr_lock_bh(dev);
- err = __hw_addr_add(&dev->uc, addr, dev->addr_len,
+ err = __hw_addr_add(&dev->uc, addr, get_addr_len(dev),
NETDEV_HW_ADDR_T_UNICAST);
if (!err)
__dev_set_rx_mode(dev);
netif_addr_unlock_bh(dev);
return err;
}
+EXPORT_SYMBOL(dev_vid_uc_add);
+
+/**
+ * dev_uc_add - Add a secondary unicast address
+ * @dev: device
+ * @addr: address to add
+ *
+ * Add a secondary unicast address to the device or increase
+ * the reference count if it already exists.
+ */
+int dev_uc_add(struct net_device *dev, const unsigned char *addr)
+{
+ unsigned char naddr[MAX_ADDR_LEN];
+ int err;
+
+ set_vid_addr(dev, addr, naddr);
+ addr = dev->vid_len ? naddr : addr;
+
+ err = dev_vid_uc_add(dev, addr);
+ return err;
+}
EXPORT_SYMBOL(dev_uc_add);
/**
* dev_uc_del - Release secondary unicast address.
* @dev: device
- * @addr: address to delete
+ * @addr: address to delete, includes vid tag already
*
* Release reference to a secondary unicast address and remove it
* from the device if the reference count drops to zero.
*/
-int dev_uc_del(struct net_device *dev, const unsigned char *addr)
+int dev_vid_uc_del(struct net_device *dev, const unsigned char *addr)
{
int err;
netif_addr_lock_bh(dev);
- err = __hw_addr_del(&dev->uc, addr, dev->addr_len,
+ err = __hw_addr_del(&dev->uc, addr, get_addr_len(dev),
NETDEV_HW_ADDR_T_UNICAST);
if (!err)
__dev_set_rx_mode(dev);
netif_addr_unlock_bh(dev);
return err;
}
+EXPORT_SYMBOL(dev_vid_uc_del);
+
+/**
+ * dev_uc_del - Release secondary unicast address.
+ * @dev: device
+ * @addr: address to delete
+ *
+ * Release reference to a secondary unicast address and remove it
+ * from the device if the reference count drops to zero.
+ */
+int dev_uc_del(struct net_device *dev, const unsigned char *addr)
+{
+ unsigned char naddr[MAX_ADDR_LEN];
+ int err;
+
+ set_vid_addr(dev, addr, naddr);
+ addr = dev->vid_len ? naddr : addr;
+
+ err = dev_vid_uc_del(dev, addr);
+ return err;
+}
EXPORT_SYMBOL(dev_uc_del);
/**
@@ -642,7 +708,7 @@ int dev_uc_sync(struct net_device *to, struct net_device *from)
return -EINVAL;
netif_addr_lock_nested(to);
- err = __hw_addr_sync(&to->uc, &from->uc, to->addr_len);
+ err = __hw_addr_sync(&to->uc, &from->uc, get_addr_len(to));
if (!err)
__dev_set_rx_mode(to);
netif_addr_unlock(to);
@@ -672,7 +738,7 @@ int dev_uc_sync_multiple(struct net_device *to, struct net_device *from)
return -EINVAL;
netif_addr_lock_nested(to);
- err = __hw_addr_sync_multiple(&to->uc, &from->uc, to->addr_len);
+ err = __hw_addr_sync_multiple(&to->uc, &from->uc, get_addr_len(to));
if (!err)
__dev_set_rx_mode(to);
netif_addr_unlock(to);
@@ -696,7 +762,7 @@ void dev_uc_unsync(struct net_device *to, struct net_device *from)
netif_addr_lock_bh(from);
netif_addr_lock_nested(to);
- __hw_addr_unsync(&to->uc, &from->uc, to->addr_len);
+ __hw_addr_unsync(&to->uc, &from->uc, get_addr_len(to));
__dev_set_rx_mode(to);
netif_addr_unlock(to);
netif_addr_unlock_bh(from);
@@ -740,18 +806,22 @@ EXPORT_SYMBOL(dev_uc_init);
*/
int dev_mc_add_excl(struct net_device *dev, const unsigned char *addr)
{
+ unsigned char naddr[MAX_ADDR_LEN];
struct netdev_hw_addr *ha;
- int err;
+ int addr_len, err;
+
+ addr_len = set_vid_addr(dev, addr, naddr);
+ addr = dev->vid_len ? naddr : addr;
netif_addr_lock_bh(dev);
list_for_each_entry(ha, &dev->mc.list, list) {
- if (!memcmp(ha->addr, addr, dev->addr_len) &&
+ if (!memcmp(ha->addr, addr, addr_len) &&
ha->type == NETDEV_HW_ADDR_T_MULTICAST) {
err = -EEXIST;
goto out;
}
}
- err = __hw_addr_create_ex(&dev->mc, addr, dev->addr_len,
+ err = __hw_addr_create_ex(&dev->mc, addr, addr_len,
NETDEV_HW_ADDR_T_MULTICAST, true, false);
if (!err)
__dev_set_rx_mode(dev);
@@ -764,10 +834,14 @@ EXPORT_SYMBOL(dev_mc_add_excl);
static int __dev_mc_add(struct net_device *dev, const unsigned char *addr,
bool global)
{
- int err;
+ unsigned char naddr[MAX_ADDR_LEN];
+ int addr_len, err;
+
+ addr_len = set_vid_addr(dev, addr, naddr);
+ addr = dev->vid_len ? naddr : addr;
netif_addr_lock_bh(dev);
- err = __hw_addr_add_ex(&dev->mc, addr, dev->addr_len,
+ err = __hw_addr_add_ex(&dev->mc, addr, addr_len,
NETDEV_HW_ADDR_T_MULTICAST, global, false, 0);
if (!err)
__dev_set_rx_mode(dev);
@@ -804,10 +878,14 @@ EXPORT_SYMBOL(dev_mc_add_global);
static int __dev_mc_del(struct net_device *dev, const unsigned char *addr,
bool global)
{
- int err;
+ unsigned char naddr[MAX_ADDR_LEN];
+ int addr_len, err;
+
+ addr_len = set_vid_addr(dev, addr, naddr);
+ addr = dev->vid_len ? naddr : addr;
netif_addr_lock_bh(dev);
- err = __hw_addr_del_ex(&dev->mc, addr, dev->addr_len,
+ err = __hw_addr_del_ex(&dev->mc, addr, addr_len,
NETDEV_HW_ADDR_T_MULTICAST, global, false);
if (!err)
__dev_set_rx_mode(dev);
@@ -863,7 +941,7 @@ int dev_mc_sync(struct net_device *to, struct net_device *from)
return -EINVAL;
netif_addr_lock_nested(to);
- err = __hw_addr_sync(&to->mc, &from->mc, to->addr_len);
+ err = __hw_addr_sync(&to->mc, &from->mc, get_addr_len(to));
if (!err)
__dev_set_rx_mode(to);
netif_addr_unlock(to);
@@ -893,7 +971,7 @@ int dev_mc_sync_multiple(struct net_device *to, struct net_device *from)
return -EINVAL;
netif_addr_lock_nested(to);
- err = __hw_addr_sync_multiple(&to->mc, &from->mc, to->addr_len);
+ err = __hw_addr_sync_multiple(&to->mc, &from->mc, get_addr_len(to));
if (!err)
__dev_set_rx_mode(to);
netif_addr_unlock(to);
@@ -917,7 +995,7 @@ void dev_mc_unsync(struct net_device *to, struct net_device *from)
netif_addr_lock_bh(from);
netif_addr_lock_nested(to);
- __hw_addr_unsync(&to->mc, &from->mc, to->addr_len);
+ __hw_addr_unsync(&to->mc, &from->mc, get_addr_len(to));
__dev_set_rx_mode(to);
netif_addr_unlock(to);
netif_addr_unlock_bh(from);
--
2.17.1
Hi Ivan,
On 2/26/19 10:45 AM, Ivan Khoronzhuk wrote:
> One of the reasons of this proposition is safety and performance -
> host should not receive traffic which is not designated for it.
>
> Some network devices can hold separate address tables for vlans and
> real device, but for some reason there is no possibility to apply it
> with generic net addressing scheme easily. At this moment the fastest
> solution is to add mcast/ucast entries for every created vlan
> including real device. But it not only consumes forwarding table but
> also adds holes in the filtering and thus wastes cpus cycles.
>
> This patchseries tries to correct core to assign mcast and ucast
> addresses only for vlans that really require it and as result an end
> driver can exclusively and simply set its rx filters. As an example
> it's implemented on cpsw TI driver, but generic changes provided by
> this series can be reused by other ethernet drivers having similar
> rx filter address possibilities.
>
> An address+vid is considered as separate address. The reserved device
> address length is 32 Bytes, for ethernet devices it's additional
> opportunity to pass auxiliary address info, like virtual ID
> identifying a device the address belongs to. This series makes it
> possible at least for ETH_P_8021Q.
>
> Thus end real device can setup separate tables for virtual devices
> just retrieving VID from the address. A device address space can
> maintain addresses and references on them separately for each virtual
> device if it needs so, or only addresses for real device (and all its
> vlans) it holds usually.
>
> A vlan device can be in any place of device chain upper real device,
> say smth like rdevice/bonding/vlan or even rdevice/macvlan/vlan.
>
> This series is verified on TI am572x EVM that can hold separate tables
> for vlans. Potentially it can be easily extended to netcp driver for
> keystone 2 boards (including k2g) and also new am6 chipsets. As a
> simple test case, different combinations of vlan+macvlan, macvlan+vlan
> were used and tested as with unicast as multicast addresses.
>
> Based on net-next/master
Thanks a lot for posting this patch series, I will take a look later
tonight.
>
> It's continuation of RFC:
>
> [RFC PATCH net-next 0/5] net: allow hw addresses for virtual device
> https://lkml.org/lkml/2018/12/3/817
>
> Ivan Khoronzhuk (6):
> net: core: dev_addr_lists: add VID to device address
> net: 8021q: vlan_dev: add vid tag to addresses of uc and mc lists
> net: 8021q: vlan_dev: add vid tag for vlan device own address
> ethernet: eth: add default vid len for all ehternet kind devices
> net: ethernet: ti: cpsw: update mc filtering to use IVDF
> net: ethernet: ti: cpsw: add macvlan and ucast/vlan filtering support
>
> drivers/net/ethernet/ti/Kconfig | 1 +
> drivers/net/ethernet/ti/cpsw.c | 139 ++++++++++++--------------------
> include/linux/if_vlan.h | 2 +
> include/linux/netdevice.h | 4 +
> net/8021q/Kconfig | 12 +++
> net/8021q/vlan.c | 3 +
> net/8021q/vlan.h | 2 +
> net/8021q/vlan_core.c | 25 ++++++
> net/8021q/vlan_dev.c | 103 ++++++++++++++++++-----
> net/core/dev_addr_lists.c | 124 ++++++++++++++++++++++------
> net/ethernet/eth.c | 10 ++-
> 11 files changed, 292 insertions(+), 133 deletions(-)
>
--
Florian
On 2/26/2019 10:45 AM, Ivan Khoronzhuk wrote:
> Update vlan mc and uc addresses with VID tag while propagating
> addresses to lower devices, do this only if address is not synced.
> It allows at end driver level to distinguish addresses belonging
> to vlan devices.
>
> Signed-off-by: Ivan Khoronzhuk <[email protected]>
> ---
[snip]
>
> +u16 vlan_dev_get_addr_vid(struct net_device *dev, const u8 *addr)
Having some kernel doc comment here would also be nice.
> +{
> + u16 vid = 0;
> +
> + if (dev->vid_len != NET_8021Q_VID_TSIZE)
> + return vid;
> +
> + vid = addr[dev->addr_len];
> + vid |= (addr[dev->addr_len + 1] & 0xf) << 8;
This uses knowledge of the maximum VLAN ID is 4095, which is fine, might
be a good idea to add a check on VID not exceeding the maximum VLAN ID
number instead of doing a silent truncation?
[snip]
> +static void vlan_dev_align_addr_vid(struct net_device *vlan_dev)
> +{
> + struct net_device *real_dev = vlan_dev_real_dev(vlan_dev);
> + struct netdev_hw_addr *ha;
> +
> + if (!real_dev->vid_len)
> + return;
Should not this check be moved to dev_{mc,uc}_sync? It does not seem to
me like this would scale really well across different stacked devices
(VLAN, bond, macvlan) as well as underlying drivers (cpsw, dsa, etc.).
Or maybe the check should be if vlan_dev->vid_len > real_dev->vid_len ->
error, right?
--
Florian
On 2/26/2019 10:45 AM, Ivan Khoronzhuk wrote:
> The vlan device address is held separately from uc/mc lists and
> handled differently. The vlan dev address is bound with real device
> address only if it's inherited from init, in all other cases it's
> separate address entry in uc list. With vid set, the address becomes
> not inherited from real device after it's set manually as before, but
> is part of uc list any way, with appropriate vid tag set. If vid_len
> for real device is 0, the behaviour is the same as before this change,
> so shouldn't be any impact on systems w/o individual virtual device
> filtering (IVDF) enabled. This allows to control and sync vlan device
> address and disable concrete vlan packet income when vlan interface is
> down.
>
> Signed-off-by: Ivan Khoronzhuk <[email protected]>
> ---
[snip]
>
> +static int vlan_dev_add_addr(struct net_device *dev, u8 *addr)
> +{
> + struct net_device *real_dev = vlan_dev_real_dev(dev);
> + unsigned char naddr[ETH_ALEN + NET_8021Q_VID_TSIZE];
> +
> + if (real_dev->vid_len) {
Don't you need to check that real_dev->vid_len is >= NET_8021Q_VID_TSIZE
here?
> + memcpy(naddr, addr, dev->addr_len);
> + vlan_dev_set_addr_vid(dev, naddr);
> + return dev_vid_uc_add(real_dev, naddr);
> + }
> +
> + if (ether_addr_equal(addr, real_dev->dev_addr))
> + return 0;
> +
> + return dev_uc_add(real_dev, addr);
> +}
> +
> +static void vlan_dev_del_addr(struct net_device *dev, u8 *addr)
> +{
> + struct net_device *real_dev = vlan_dev_real_dev(dev);
> + unsigned char naddr[ETH_ALEN + NET_8021Q_VID_TSIZE];
> +
> + if (real_dev->vid_len) {
Same here.
> + memcpy(naddr, addr, dev->addr_len);
> + vlan_dev_set_addr_vid(dev, naddr);
> + dev_vid_uc_del(real_dev, naddr);
> + return;
> + }
> +
> + if (!ether_addr_equal(dev->dev_addr, real_dev->dev_addr))
> + dev_uc_del(real_dev, addr);
> +}
> +
> +static int vlan_dev_subs_addr(struct net_device *dev, u8 *addr)
> +{
> + int err;
> +
> + err = vlan_dev_add_addr(dev, addr);
> + if (err < 0)
> + return err;
> +
> + vlan_dev_del_addr(dev, dev->dev_addr);
> + return err;
> +}
> +
> bool vlan_dev_inherit_address(struct net_device *dev,
> struct net_device *real_dev)
> {
> if (dev->addr_assign_type != NET_ADDR_STOLEN)
> return false;
>
> + if (real_dev->vid_len)
> + if (vlan_dev_subs_addr(dev, real_dev->dev_addr))
> + return false;
The check on real_dev->vid_len can be absorbed into vlan_dev_subs_addr()?
> +
> ether_addr_copy(dev->dev_addr, real_dev->dev_addr);
> call_netdevice_notifiers(NETDEV_CHANGEADDR, dev);
> return true;
> @@ -278,9 +327,10 @@ static int vlan_dev_open(struct net_device *dev)
> !(vlan->flags & VLAN_FLAG_LOOSE_BINDING))
> return -ENETDOWN;
>
> - if (!ether_addr_equal(dev->dev_addr, real_dev->dev_addr) &&
> - !vlan_dev_inherit_address(dev, real_dev)) {
> - err = dev_uc_add(real_dev, dev->dev_addr);
> + if (ether_addr_equal(dev->dev_addr, real_dev->dev_addr) ||
> + (!ether_addr_equal(dev->dev_addr, real_dev->dev_addr) &&
> + !vlan_dev_inherit_address(dev, real_dev))) {
Should this condition simply become if !vlan_dev_inherit_address() now?
--
Florian
On 2/26/2019 10:45 AM, Ivan Khoronzhuk wrote:
> The cpsw can filter multicast addresses only per vlan. Thus if mcast
> address is set for one of them or only for real device it must be
> added for every created vlan consuming ALE table w/o reason. In order to
> simplify dispatching vlan filters, the IVDF recently added is resused.
>
> In case IVDF is disabled - mc is updated only for real device as before.
> The previous method is harder to reuse and vlan filtering is limited
> only for vlans directly connected to real netdev, so drop it in flavor
> of IVDF decision.
>
> Signed-off-by: Ivan Khoronzhuk <[email protected]>
> ---
> drivers/net/ethernet/ti/Kconfig | 1 +
> drivers/net/ethernet/ti/cpsw.c | 113 ++++----------------------------
> 2 files changed, 13 insertions(+), 101 deletions(-)
Nice diffstat!
--
Florian
On 2/26/2019 10:45 AM, Ivan Khoronzhuk wrote:
> Despite this is supposed to be used for Ethernet VLANs, not Ethernet
> addresses with space for VID also can reuse this, so VID is considered
> as virtual ID extension, not belonging strictly to Ethernet VLAN VIDs,
> and overall change can be named individual virtual device filtering
> (IVDF).
>
> This patch adds VID tag at the end of each address. The actual
> reserved address size is 32 bytes. For Ethernet addresses with 6 bytes
> long that's possible to add tag w/o increasing address size. Thus,
> each address for the case has 32 - 6 = 26 bytes to hold additional
> info, say VID for virtual device addresses.
>
> Therefore, when addresses are synced to the address list of parent
> device the address list of latter can contain separate addresses for
> virtual devices. It allows to track separate address tables for
> virtual devices if they present and the device can be placed on
> any place of device tree as the address is propagated to to the end
> real device thru *_sync()/ndo_set_rx_mode() APIs. Also it simplifies
> handling VID addresses at real device when it supports IVDF.
>
> If parent device doesn't want to have virtual addresses in its address
> space the vid_len has to be 0, thus its address space is "shrunk" to
> the state as before this patch. For now it's 0 for every device. It
> allows two devices with and w/o IVDF to be part of same bond device
> for instance.
>
> The end real device supporting IVDF can retrieve VID tag from an
> address and set it for a given virtual device only. By default, vid 0
> is used for real devices to distinguish it from virtual addresses.
>
> See next patches to see how it's used.
>
> Signed-off-by: Ivan Khoronzhuk <[email protected]>
> ---
[snip]
> @@ -1889,6 +1890,7 @@ struct net_device {
> unsigned char perm_addr[MAX_ADDR_LEN];
> unsigned char addr_assign_type;
> unsigned char addr_len;
> + unsigned char vid_len;
Have not compiled or tested this patch series yet, but did you check
that adding this member does not change the structure layout (you can
use pahole for that purpose).
> unsigned short neigh_priv_len;
> unsigned short dev_id;
> unsigned short dev_port;
> @@ -4141,8 +4143,10 @@ int dev_addr_init(struct net_device *dev);
>
> /* Functions used for unicast addresses handling */
> int dev_uc_add(struct net_device *dev, const unsigned char *addr);
> +int dev_vid_uc_add(struct net_device *dev, const unsigned char *addr);
> int dev_uc_add_excl(struct net_device *dev, const unsigned char *addr);
> int dev_uc_del(struct net_device *dev, const unsigned char *addr);
> +int dev_vid_uc_del(struct net_device *dev, const unsigned char *addr);
> int dev_uc_sync(struct net_device *to, struct net_device *from);
> int dev_uc_sync_multiple(struct net_device *to, struct net_device *from);
> void dev_uc_unsync(struct net_device *to, struct net_device *from);
> diff --git a/net/core/dev_addr_lists.c b/net/core/dev_addr_lists.c
> index a6723b306717..e3c80e044b8c 100644
> --- a/net/core/dev_addr_lists.c
> +++ b/net/core/dev_addr_lists.c
> @@ -545,6 +545,26 @@ int dev_addr_del(struct net_device *dev, const unsigned char *addr,
> }
> EXPORT_SYMBOL(dev_addr_del);
>
> +static int get_addr_len(struct net_device *dev)
> +{
> + return dev->addr_len + dev->vid_len;
> +}
> +
> +static int set_vid_addr(struct net_device *dev, const unsigned char *addr,
> + unsigned char *naddr)
Having some kernel doc comments here would be nice to indicate that the
return value is dev->addr_len, it was not obvious until I saw in the
next function how you used it.
> +{
> + int i;
> +
> + if (!dev->vid_len)
> + return dev->addr_len;
> +
> + memcpy(naddr, addr, dev->addr_len);
> + for (i = 0; i < dev->vid_len; i++)
> + naddr[dev->addr_len + i] = 0;
memset(naddr + dev->addr_len, 0, dev->vid_len) would be more compact and
maybe a little less error prone too?
--
Florian
On 2/26/2019 10:45 AM, Ivan Khoronzhuk wrote:
> IVDF - individual virtual device filtering. Allows to set per vlan
> l2 address filters on end real network device (for unicast and for
> multicast) and drop redundant not expected packet income.
>
> If CONFIG_VLAN_8021Q_IVDF is enabled the following changes are
> applied, and only for ethernet network devices.
>
> By default every ethernet netdev needs vid len = 2 bytes to be able to
> hold up to 4096 vids. So set it for every eth device to be correct,
> except vlan devs.
>
> In order to shrink all addresses of devices above vlan, the vid_len
> for vlan dev = 0, as result all suckers sync their addresses to common
> base not taking in to account vid part (vid_len of "to" devices is
> important only). And only vlan device is the source of addresses with
> actual its vid set, propagating it to parent devices while rx_mode().
>
> Also, don't bother those ethernet devices that at this moment are not
> moved to vlan addressing scheme, so while end ethernet device is
> created - set vid_len to 0, thus, while syncing, its address space is
> concatenated to one dimensional like usual, and who needs IVDF - set
> it to NET_8021Q_VID_TSIZE.
>
> There is another decision - is to inherit vid_len or some feature flag
> from end root device in order to all upper devices have vlan extended
> address space only if exact end real device have such capability. But
> I didn't, because it requires more changes and probably I'm not
> familiar with all places where it should be inherited, I would
> appreciate if someone can guid where it's applicable, then it could
> become a little bit more limited.
I would think that a call to vlan_dev_ivdf_set() would be enough to
indicate that the underlying network device driver supports IVDF and
wants to make use of it. The infrastructure itself that you added costs
little memory, it is once the call to vlan_dev_ivdf_set() is made that
the memory consumption increases which is fine, since we want to make
use of that feature.
While I appreciate the thoughts given to making this a configurable
option, I feel that enabling it unconditionally and having the
underlying driver decide would be more manageable.
We have had that conversation before, but let me ask again when we call
dev_{uc,mc}_sync() and ultimately the network device's
ndo_set_rx_mode(), by the time the ndo_set_rx_mode() function is called,
we lost track of the call chain, like which virtual device was it
originating from. If we somehow added a notification information about
the network device stack (and we could use netdevice notifiers for
that), then maybe we don't really need to add all of this code and we
can just derive the necessary bits of information we want by checking:
is this a VLAN network device? It is, okay what's your VLAN ID, etc.?
Either approach would get us our cookie anyway :)
>
> Signed-off-by: Ivan Khoronzhuk <[email protected]>
> ---
[snip]
> @@ -404,8 +405,13 @@ EXPORT_SYMBOL(ether_setup);
> struct net_device *alloc_etherdev_mqs(int sizeof_priv, unsigned int txqs,
> unsigned int rxqs)
> {
> - return alloc_netdev_mqs(sizeof_priv, "eth%d", NET_NAME_UNKNOWN,
> - ether_setup, txqs, rxqs);
> + struct net_device *dev;
> +
> + dev = alloc_netdev_mqs(sizeof_priv, "eth%d", NET_NAME_UNKNOWN,
> + ether_setup, txqs, rxqs);
You need to check the return value of alloc_netdev_mqs() now, otherwise
you could be doing a NPD in the call right under. Since that is the
default though, do you really need to call vlan_dev_ivdf_set() below?
--
Florian
On Wed, Feb 27, 2019 at 08:24:00PM -0800, Florian Fainelli wrote:
>On 2/26/2019 10:45 AM, Ivan Khoronzhuk wrote:
>> Despite this is supposed to be used for Ethernet VLANs, not Ethernet
>> addresses with space for VID also can reuse this, so VID is considered
>> as virtual ID extension, not belonging strictly to Ethernet VLAN VIDs,
>> and overall change can be named individual virtual device filtering
>> (IVDF).
>>
>> This patch adds VID tag at the end of each address. The actual
>> reserved address size is 32 bytes. For Ethernet addresses with 6 bytes
>> long that's possible to add tag w/o increasing address size. Thus,
>> each address for the case has 32 - 6 = 26 bytes to hold additional
>> info, say VID for virtual device addresses.
>>
>> Therefore, when addresses are synced to the address list of parent
>> device the address list of latter can contain separate addresses for
>> virtual devices. It allows to track separate address tables for
>> virtual devices if they present and the device can be placed on
>> any place of device tree as the address is propagated to to the end
>> real device thru *_sync()/ndo_set_rx_mode() APIs. Also it simplifies
>> handling VID addresses at real device when it supports IVDF.
>>
>> If parent device doesn't want to have virtual addresses in its address
>> space the vid_len has to be 0, thus its address space is "shrunk" to
>> the state as before this patch. For now it's 0 for every device. It
>> allows two devices with and w/o IVDF to be part of same bond device
>> for instance.
>>
>> The end real device supporting IVDF can retrieve VID tag from an
>> address and set it for a given virtual device only. By default, vid 0
>> is used for real devices to distinguish it from virtual addresses.
>>
>> See next patches to see how it's used.
>>
>> Signed-off-by: Ivan Khoronzhuk <[email protected]>
>> ---
>
>[snip]
>
>
>> @@ -1889,6 +1890,7 @@ struct net_device {
>> unsigned char perm_addr[MAX_ADDR_LEN];
>> unsigned char addr_assign_type;
>> unsigned char addr_len;
>> + unsigned char vid_len;
>
>Have not compiled or tested this patch series yet, but did you check
>that adding this member does not change the structure layout (you can
>use pahole for that purpose).
For ARM 32, on 1 hole less:
---------------------------
before (https://pastebin.com/DG1SVpFR):
/* size: 1344, cachelines: 21, members: 123 */
/* sum members: 1304, holes: 5, sum holes: 28 */
/* padding: 12 */
/* bit_padding: 31 bits */
after (https://pastebin.com/ZUMhxGkA):
/* size: 1344, cachelines: 21, members: 124 */
/* sum members: 1305, holes: 5, sum holes: 27 */
/* padding: 12 */
/* bit_padding: 31 bits */
For ARM 64, on 1 hole less:
---------------------------
before (https://pastebin.com/5CdTQWkc):
/* size: 2048, cachelines: 32, members: 120 */
/* sum members: 1972, holes: 7, sum holes: 48 */
/* padding: 28 */
/* bit_padding: 31 bits */
after (https://pastebin.com/32ktb1iV):
/* size: 2048, cachelines: 32, members: 121 */
/* sum members: 1973, holes: 7, sum holes: 47 */
/* padding: 28 */
/* bit_padding: 31 bits */
Looks Ok, but it depends on configuration ...
>
>> unsigned short neigh_priv_len;
>> unsigned short dev_id;
>> unsigned short dev_port;
>> @@ -4141,8 +4143,10 @@ int dev_addr_init(struct net_device *dev);
>>
>> /* Functions used for unicast addresses handling */
>> int dev_uc_add(struct net_device *dev, const unsigned char *addr);
>> +int dev_vid_uc_add(struct net_device *dev, const unsigned char *addr);
>> int dev_uc_add_excl(struct net_device *dev, const unsigned char *addr);
>> int dev_uc_del(struct net_device *dev, const unsigned char *addr);
>> +int dev_vid_uc_del(struct net_device *dev, const unsigned char *addr);
>> int dev_uc_sync(struct net_device *to, struct net_device *from);
>> int dev_uc_sync_multiple(struct net_device *to, struct net_device *from);
>> void dev_uc_unsync(struct net_device *to, struct net_device *from);
>> diff --git a/net/core/dev_addr_lists.c b/net/core/dev_addr_lists.c
>> index a6723b306717..e3c80e044b8c 100644
>> --- a/net/core/dev_addr_lists.c
>> +++ b/net/core/dev_addr_lists.c
>> @@ -545,6 +545,26 @@ int dev_addr_del(struct net_device *dev, const unsigned char *addr,
>> }
>> EXPORT_SYMBOL(dev_addr_del);
>>
>> +static int get_addr_len(struct net_device *dev)
>> +{
>> + return dev->addr_len + dev->vid_len;
>> +}
>> +
>> +static int set_vid_addr(struct net_device *dev, const unsigned char *addr,
>> + unsigned char *naddr)
>
>Having some kernel doc comments here would be nice to indicate that the
>return value is dev->addr_len, it was not obvious until I saw in the
>next function how you used it.
Agree
>
>> +{
>> + int i;
>> +
>> + if (!dev->vid_len)
>> + return dev->addr_len;
>> +
>> + memcpy(naddr, addr, dev->addr_len);
>> + for (i = 0; i < dev->vid_len; i++)
>> + naddr[dev->addr_len + i] = 0;
>
>memset(naddr + dev->addr_len, 0, dev->vid_len) would be more compact and
>maybe a little less error prone too?
Yes, would be
--
Regards,
Ivan Khoronzhuk
On Wed, Feb 27, 2019 at 08:09:44PM -0800, Florian Fainelli wrote:
>
>
>On 2/26/2019 10:45 AM, Ivan Khoronzhuk wrote:
>> Update vlan mc and uc addresses with VID tag while propagating
>> addresses to lower devices, do this only if address is not synced.
>> It allows at end driver level to distinguish addresses belonging
>> to vlan devices.
>>
>> Signed-off-by: Ivan Khoronzhuk <[email protected]>
>> ---
>
>[snip]
>
>>
>> +u16 vlan_dev_get_addr_vid(struct net_device *dev, const u8 *addr)
>
>Having some kernel doc comment here would also be nice.
yes can be: vlan_dev_get_addr_vid - get vlan id the address belongs to
>
>> +{
>> + u16 vid = 0;
>> +
>> + if (dev->vid_len != NET_8021Q_VID_TSIZE)
>> + return vid;
>> +
>> + vid = addr[dev->addr_len];
>> + vid |= (addr[dev->addr_len + 1] & 0xf) << 8;
>
>This uses knowledge of the maximum VLAN ID is 4095, which is fine, might
>be a good idea to add a check on VID not exceeding the maximum VLAN ID
>number instead of doing a silent truncation?
and then return -1, not sure, just because it's 0 or directly set by vlan
layer and is verified anyway. But no harm to verify even it looks like
redundancy.
>
>[snip]
>
>> +static void vlan_dev_align_addr_vid(struct net_device *vlan_dev)
>> +{
>> + struct net_device *real_dev = vlan_dev_real_dev(vlan_dev);
>> + struct netdev_hw_addr *ha;
>> +
>> + if (!real_dev->vid_len)
>> + return;
>
>Should not this check be moved to dev_{mc,uc}_sync? It does not seem to
>me like this would scale really well across different stacked devices
>(VLAN, bond, macvlan) as well as underlying drivers (cpsw, dsa, etc.).
>Or maybe the check should be if vlan_dev->vid_len > real_dev->vid_len ->
>error, right?
It shouldn't be part of netdev addr module, no any vlan_dev_vlan_id(vlan_dev)
should be there.
It's scaled becouse bond/team ...etc, are ethernet devices and have IVDF
enabled while configuration. Address propagation always is from leafs to
real root device, every underneeth device knows nothing about above, so
check is only in one direction.
--
Regards,
Ivan Khoronzhuk
On Wed, Feb 27, 2019 at 08:13:34PM -0800, Florian Fainelli wrote:
>
>
>On 2/26/2019 10:45 AM, Ivan Khoronzhuk wrote:
>> The vlan device address is held separately from uc/mc lists and
>> handled differently. The vlan dev address is bound with real device
>> address only if it's inherited from init, in all other cases it's
>> separate address entry in uc list. With vid set, the address becomes
>> not inherited from real device after it's set manually as before, but
>> is part of uc list any way, with appropriate vid tag set. If vid_len
>> for real device is 0, the behaviour is the same as before this change,
>> so shouldn't be any impact on systems w/o individual virtual device
>> filtering (IVDF) enabled. This allows to control and sync vlan device
>> address and disable concrete vlan packet income when vlan interface is
>> down.
>>
>> Signed-off-by: Ivan Khoronzhuk <[email protected]>
>> ---
>
>[snip]
>
>>
>> +static int vlan_dev_add_addr(struct net_device *dev, u8 *addr)
>> +{
>> + struct net_device *real_dev = vlan_dev_real_dev(dev);
>> + unsigned char naddr[ETH_ALEN + NET_8021Q_VID_TSIZE];
>> +
>> + if (real_dev->vid_len) {
>
>Don't you need to check that real_dev->vid_len is >= NET_8021Q_VID_TSIZE
>here?
vid_len for all eth devices or 0 or NET_8021Q_VID_TSIZE and used here only as
a flag that different addressing scheme is used.
vlan_dev_set_addr_vid() do copy only < NET_8021Q_VID_TSIZE anyway.
Can add the following to be sure:
if (real_dev->vid_len) {
if (real_dev->vid_len != NET_8021Q_VID_TSIZE)
return -1;
....
}
But frankly, if this happens the system is ill and this check can't help it.
>
>> + memcpy(naddr, addr, dev->addr_len);
>> + vlan_dev_set_addr_vid(dev, naddr);
>> + return dev_vid_uc_add(real_dev, naddr);
>> + }
>> +
>> + if (ether_addr_equal(addr, real_dev->dev_addr))
>> + return 0;
>> +
>> + return dev_uc_add(real_dev, addr);
>> +}
>> +
>> +static void vlan_dev_del_addr(struct net_device *dev, u8 *addr)
>> +{
>> + struct net_device *real_dev = vlan_dev_real_dev(dev);
>> + unsigned char naddr[ETH_ALEN + NET_8021Q_VID_TSIZE];
>> +
>> + if (real_dev->vid_len) {
>
>Same here.
Not same, it's void routine.
And del can't happen w/o add, no reason.
>
>> + memcpy(naddr, addr, dev->addr_len);
>> + vlan_dev_set_addr_vid(dev, naddr);
>> + dev_vid_uc_del(real_dev, naddr);
>> + return;
>> + }
>> +
>> + if (!ether_addr_equal(dev->dev_addr, real_dev->dev_addr))
>> + dev_uc_del(real_dev, addr);
>> +}
>> +
>> +static int vlan_dev_subs_addr(struct net_device *dev, u8 *addr)
>> +{
>> + int err;
>> +
>> + err = vlan_dev_add_addr(dev, addr);
>> + if (err < 0)
>> + return err;
>> +
>> + vlan_dev_del_addr(dev, dev->dev_addr);
>> + return err;
>> +}
>> +
>> bool vlan_dev_inherit_address(struct net_device *dev,
>> struct net_device *real_dev)
>> {
>> if (dev->addr_assign_type != NET_ADDR_STOLEN)
>> return false;
>>
>> + if (real_dev->vid_len)
>> + if (vlan_dev_subs_addr(dev, real_dev->dev_addr))
>> + return false;
>
>The check on real_dev->vid_len can be absorbed into vlan_dev_subs_addr()?
No, I'd tried. vlan_dev_subs_addr() is used not only here and move it under
makes more not combined code.
>
>> +
>> ether_addr_copy(dev->dev_addr, real_dev->dev_addr);
>> call_netdevice_notifiers(NETDEV_CHANGEADDR, dev);
>> return true;
>> @@ -278,9 +327,10 @@ static int vlan_dev_open(struct net_device *dev)
>> !(vlan->flags & VLAN_FLAG_LOOSE_BINDING))
>> return -ENETDOWN;
>>
>> - if (!ether_addr_equal(dev->dev_addr, real_dev->dev_addr) &&
>> - !vlan_dev_inherit_address(dev, real_dev)) {
>> - err = dev_uc_add(real_dev, dev->dev_addr);
>> + if (ether_addr_equal(dev->dev_addr, real_dev->dev_addr) ||
>> + (!ether_addr_equal(dev->dev_addr, real_dev->dev_addr) &&
>> + !vlan_dev_inherit_address(dev, real_dev))) {
>
>Should this condition simply become if !vlan_dev_inherit_address() now?
It can't, I'd tried.
vlan_dev_inherit_address() is used in (vlan.c):
vlan_sync_address(struct net_device *dev, struct net_device *vlandev);
--
Regards,
Ivan Khoronzhuk
On Wed, Feb 27, 2019 at 08:29:20PM -0800, Florian Fainelli wrote:
>
>
>On 2/26/2019 10:45 AM, Ivan Khoronzhuk wrote:
>> IVDF - individual virtual device filtering. Allows to set per vlan
>> l2 address filters on end real network device (for unicast and for
>> multicast) and drop redundant not expected packet income.
>>
>> If CONFIG_VLAN_8021Q_IVDF is enabled the following changes are
>> applied, and only for ethernet network devices.
>>
>> By default every ethernet netdev needs vid len = 2 bytes to be able to
>> hold up to 4096 vids. So set it for every eth device to be correct,
>> except vlan devs.
>>
>> In order to shrink all addresses of devices above vlan, the vid_len
>> for vlan dev = 0, as result all suckers sync their addresses to common
>> base not taking in to account vid part (vid_len of "to" devices is
>> important only). And only vlan device is the source of addresses with
>> actual its vid set, propagating it to parent devices while rx_mode().
>>
>> Also, don't bother those ethernet devices that at this moment are not
>> moved to vlan addressing scheme, so while end ethernet device is
>> created - set vid_len to 0, thus, while syncing, its address space is
>> concatenated to one dimensional like usual, and who needs IVDF - set
>> it to NET_8021Q_VID_TSIZE.
>>
>> There is another decision - is to inherit vid_len or some feature flag
>> from end root device in order to all upper devices have vlan extended
>> address space only if exact end real device have such capability. But
>> I didn't, because it requires more changes and probably I'm not
>> familiar with all places where it should be inherited, I would
>> appreciate if someone can guid where it's applicable, then it could
>> become a little bit more limited.
>
>I would think that a call to vlan_dev_ivdf_set() would be enough to
>indicate that the underlying network device driver supports IVDF and
>wants to make use of it. The infrastructure itself that you added costs
>little memory, it is once the call to vlan_dev_ivdf_set() is made that
>the memory consumption increases which is fine, since we want to make
>use of that feature.
>
>While I appreciate the thoughts given to making this a configurable
>option, I feel that enabling it unconditionally and having the
>underlying driver decide would be more manageable.
Not exactly. Even system has no driver calling vlan_dev_ivdf_set()
I still has this "mem consumption" from the very beginning. That's why I made
this depended on the driver and CONF. For embedded world it looks fine.
The issues is that I can't change addressing scheme dynamically since some
drivers and infrastructure that exists before I called vlan_dev_ivdf_set()
can already have some synced addresses using old scheme. To do this
dynamically it needs unsync vlans from old scheme and make sync again.
Probably that is topic to "sync" :-| about....
I considered idea making "above infrastructure" IVDF to be dependent on
underneath end device IVDF and remove this config at all, but here several
issues with this, the infrastructure has to be "resynced" and some signal has
to inform each vlan to do this, and while this happens, all end devices already
configured and not supported IVDF shouldn't suffer. I can try but it looks not
doable in normal way, and appreciate any thoughts about this.
Meanwhile, this option looks fine for small embedded paltforms.
>
>We have had that conversation before, but let me ask again when we call
>dev_{uc,mc}_sync() and ultimately the network device's
>ndo_set_rx_mode(), by the time the ndo_set_rx_mode() function is called,
>we lost track of the call chain, like which virtual device was it
>originating from. If we somehow added a notification information about
>the network device stack (and we could use netdevice notifiers for
>that), then maybe we don't really need to add all of this code and we
>can just derive the necessary bits of information we want by checking:
>is this a VLAN network device? It is, okay what's your VLAN ID, etc.?
>
>Either approach would get us our cookie anyway :)
Postulate here is that address of vlan device is separate from netdevice entity
with it's own context.
Several cases talking about this:
- bound device having 2 slaves can have added vid to both slave devices but
synced addresses for only one of them. So, if vid is set in real device it
doesn't mean it needs addresses of vlan device.
- I know that's crazy, but net device tree can contain 2 same vlan devices ).
The scheme doesn't prevent this case. So one vid address can be counted by two
vlan network devices.
- Any of the devices in _sync/rx_mode() chain is legal to do with addresses
what ever it's allowed to do, drop some of them, combine with others and more,
even add it's own vid addresses w/o actual vlan network device.
These made me to look in side of building rx_sync netdevice tree holding links
on nodes per each address. And I've did this mostly...then after short look
on this asked myself "who is going to support this ..." and it anyway doesn't
cover all possible cases.
So, lost track of the device looks not so bad and opens more possibilities.
>
>>
>> Signed-off-by: Ivan Khoronzhuk <[email protected]>
>> ---
>
>[snip]
>
>> @@ -404,8 +405,13 @@ EXPORT_SYMBOL(ether_setup);
>> struct net_device *alloc_etherdev_mqs(int sizeof_priv, unsigned int txqs,
>> unsigned int rxqs)
>> {
>> - return alloc_netdev_mqs(sizeof_priv, "eth%d", NET_NAME_UNKNOWN,
>> - ether_setup, txqs, rxqs);
>> + struct net_device *dev;
>> +
>> + dev = alloc_netdev_mqs(sizeof_priv, "eth%d", NET_NAME_UNKNOWN,
>> + ether_setup, txqs, rxqs);
>
>You need to check the return value of alloc_netdev_mqs() now, otherwise
>you could be doing a NPD in the call right under. Since that is the
>default though, do you really need to call vlan_dev_ivdf_set() below?
Yep. Has to be done.
But IVDF is not default....and can't be toggled dynamically, at least for now.
>--
>Florian
--
Regards,
Ivan Khoronzhuk
On 3/1/2019 4:24 AM, Ivan Khoronzhuk wrote:
> On Wed, Feb 27, 2019 at 08:09:44PM -0800, Florian Fainelli wrote:
>>
>>
>> On 2/26/2019 10:45 AM, Ivan Khoronzhuk wrote:
>>> Update vlan mc and uc addresses with VID tag while propagating
>>> addresses to lower devices, do this only if address is not synced.
>>> It allows at end driver level to distinguish addresses belonging
>>> to vlan devices.
>>>
>>> Signed-off-by: Ivan Khoronzhuk <[email protected]>
>>> ---
>>
>> [snip]
>>
>>>
>>> +u16 vlan_dev_get_addr_vid(struct net_device *dev, const u8 *addr)
>>
>> Having some kernel doc comment here would also be nice.
>
> yes can be: vlan_dev_get_addr_vid - get vlan id the address belongs to
>
>
>>
>>> +{
>>> + u16 vid = 0;
>>> +
>>> + if (dev->vid_len != NET_8021Q_VID_TSIZE)
>>> + return vid;
>>> +
>>> + vid = addr[dev->addr_len];
>>> + vid |= (addr[dev->addr_len + 1] & 0xf) << 8;
>>
>> This uses knowledge of the maximum VLAN ID is 4095, which is fine, might
>> be a good idea to add a check on VID not exceeding the maximum VLAN ID
>> number instead of doing a silent truncation?
>
> and then return -1, not sure, just because it's 0 or directly set by vlan
> layer and is verified anyway. But no harm to verify even it looks like
> redundancy.
I would have thought that there would be an existing helper function to
put/get a VLAN identifier into/from two bytes but that is fine as-is.
>
>>
>> [snip]
>>
>>> +static void vlan_dev_align_addr_vid(struct net_device *vlan_dev)
>>> +{
>>> + struct net_device *real_dev = vlan_dev_real_dev(vlan_dev);
>>> + struct netdev_hw_addr *ha;
>>> +
>>> + if (!real_dev->vid_len)
>>> + return;
>>
>> Should not this check be moved to dev_{mc,uc}_sync? It does not seem to
>> me like this would scale really well across different stacked devices
>> (VLAN, bond, macvlan) as well as underlying drivers (cpsw, dsa, etc.).
>> Or maybe the check should be if vlan_dev->vid_len > real_dev->vid_len ->
>> error, right?
>
> It shouldn't be part of netdev addr module, no any
> vlan_dev_vlan_id(vlan_dev)
> should be there.
>
> It's scaled becouse bond/team ...etc, are ethernet devices and have IVDF
> enabled while configuration. Address propagation always is from leafs to
> real root device, every underneeth device knows nothing about above, so
> check is only in one direction.
>
OK, indeed stacked devices would lead to that, thanks for explaining!
--
Florian
On 3/1/2019 4:28 AM, Ivan Khoronzhuk wrote:
> On Wed, Feb 27, 2019 at 08:13:34PM -0800, Florian Fainelli wrote:
>>
>>
>> On 2/26/2019 10:45 AM, Ivan Khoronzhuk wrote:
>>> The vlan device address is held separately from uc/mc lists and
>>> handled differently. The vlan dev address is bound with real device
>>> address only if it's inherited from init, in all other cases it's
>>> separate address entry in uc list. With vid set, the address becomes
>>> not inherited from real device after it's set manually as before, but
>>> is part of uc list any way, with appropriate vid tag set. If vid_len
>>> for real device is 0, the behaviour is the same as before this change,
>>> so shouldn't be any impact on systems w/o individual virtual device
>>> filtering (IVDF) enabled. This allows to control and sync vlan device
>>> address and disable concrete vlan packet income when vlan interface is
>>> down.
>>>
>>> Signed-off-by: Ivan Khoronzhuk <[email protected]>
>>> ---
>>
>> [snip]
>>
>>>
>>> +static int vlan_dev_add_addr(struct net_device *dev, u8 *addr)
>>> +{
>>> + struct net_device *real_dev = vlan_dev_real_dev(dev);
>>> + unsigned char naddr[ETH_ALEN + NET_8021Q_VID_TSIZE];
>>> +
>>> + if (real_dev->vid_len) {
>>
>> Don't you need to check that real_dev->vid_len is >= NET_8021Q_VID_TSIZE
>> here?
>
> vid_len for all eth devices or 0 or NET_8021Q_VID_TSIZE and used here
> only as
> a flag that different addressing scheme is used.
> vlan_dev_set_addr_vid() do copy only < NET_8021Q_VID_TSIZE anyway.
>
> Can add the following to be sure:
> if (real_dev->vid_len) {
> if (real_dev->vid_len != NET_8021Q_VID_TSIZE)
> return -1;
> ....
> }
>
> But frankly, if this happens the system is ill and this check can't help
> it.
Fair enough. All of your responses below make sense to me, thanks!
--
Florian
On 3/1/2019 5:11 AM, Ivan Khoronzhuk wrote:
> On Wed, Feb 27, 2019 at 08:29:20PM -0800, Florian Fainelli wrote:
>>
>>
>> On 2/26/2019 10:45 AM, Ivan Khoronzhuk wrote:
>>> IVDF - individual virtual device filtering. Allows to set per vlan
>>> l2 address filters on end real network device (for unicast and for
>>> multicast) and drop redundant not expected packet income.
>>>
>>> If CONFIG_VLAN_8021Q_IVDF is enabled the following changes are
>>> applied, and only for ethernet network devices.
>>>
>>> By default every ethernet netdev needs vid len = 2 bytes to be able to
>>> hold up to 4096 vids. So set it for every eth device to be correct,
>>> except vlan devs.
>>>
>>> In order to shrink all addresses of devices above vlan, the vid_len
>>> for vlan dev = 0, as result all suckers sync their addresses to common
>>> base not taking in to account vid part (vid_len of "to" devices is
>>> important only). And only vlan device is the source of addresses with
>>> actual its vid set, propagating it to parent devices while rx_mode().
>>>
>>> Also, don't bother those ethernet devices that at this moment are not
>>> moved to vlan addressing scheme, so while end ethernet device is
>>> created - set vid_len to 0, thus, while syncing, its address space is
>>> concatenated to one dimensional like usual, and who needs IVDF - set
>>> it to NET_8021Q_VID_TSIZE.
>>>
>>> There is another decision - is to inherit vid_len or some feature flag
>>> from end root device in order to all upper devices have vlan extended
>>> address space only if exact end real device have such capability. But
>>> I didn't, because it requires more changes and probably I'm not
>>> familiar with all places where it should be inherited, I would
>>> appreciate if someone can guid where it's applicable, then it could
>>> become a little bit more limited.
>>
>> I would think that a call to vlan_dev_ivdf_set() would be enough to
>> indicate that the underlying network device driver supports IVDF and
>> wants to make use of it. The infrastructure itself that you added costs
>> little memory, it is once the call to vlan_dev_ivdf_set() is made that
>> the memory consumption increases which is fine, since we want to make
>> use of that feature.
>>
>> While I appreciate the thoughts given to making this a configurable
>> option, I feel that enabling it unconditionally and having the
>> underlying driver decide would be more manageable.
>
> Not exactly. Even system has no driver calling vlan_dev_ivdf_set()
> I still has this "mem consumption" from the very beginning. That's why I
> made
> this depended on the driver and CONF. For embedded world it looks fine.
>
> The issues is that I can't change addressing scheme dynamically since some
> drivers and infrastructure that exists before I called vlan_dev_ivdf_set()
> can already have some synced addresses using old scheme. To do this
> dynamically it needs unsync vlans from old scheme and make sync again.
> Probably that is topic to "sync" :-| about....
Good one, that would be very complicated indeed.
>
> I considered idea making "above infrastructure" IVDF to be dependent on
> underneath end device IVDF and remove this config at all, but here several
> issues with this, the infrastructure has to be "resynced" and some
> signal has
> to inform each vlan to do this, and while this happens, all end devices
> already
> configured and not supported IVDF shouldn't suffer. I can try but it
> looks not
> doable in normal way, and appreciate any thoughts about this.
>
> Meanwhile, this option looks fine for small embedded paltforms.
>
>>
>> We have had that conversation before, but let me ask again when we call
>> dev_{uc,mc}_sync() and ultimately the network device's
>> ndo_set_rx_mode(), by the time the ndo_set_rx_mode() function is called,
>> we lost track of the call chain, like which virtual device was it
>> originating from. If we somehow added a notification information about
>> the network device stack (and we could use netdevice notifiers for
>> that), then maybe we don't really need to add all of this code and we
>> can just derive the necessary bits of information we want by checking:
>> is this a VLAN network device? It is, okay what's your VLAN ID, etc.?
>>
>> Either approach would get us our cookie anyway :)
>
> Postulate here is that address of vlan device is separate from netdevice
> entity
> with it's own context.
>
> Several cases talking about this:
>
> - bound device having 2 slaves can have added vid to both slave devices but
> synced addresses for only one of them. So, if vid is set in real device
> it doesn't mean it needs addresses of vlan device.
>
> - I know that's crazy, but net device tree can contain 2 same vlan
> devices ).
> The scheme doesn't prevent this case. So one vid address can be counted
> by two
> vlan network devices.
>
> - Any of the devices in _sync/rx_mode() chain is legal to do with addresses
> what ever it's allowed to do, drop some of them, combine with others and
> more,
> even add it's own vid addresses w/o actual vlan network device.
>
> These made me to look in side of building rx_sync netdevice tree holding
> links
> on nodes per each address. And I've did this mostly...then after short look
> on this asked myself "who is going to support this ..." and it anyway
> doesn't
> cover all possible cases.
>
> So, lost track of the device looks not so bad and opens more possibilities.
Sounds like you did give a lot of thoughts about the scheme, I am fine
with the current approach and will hook it up to DSA next week to make
sure this resolves the current issues I had with VLAN filtering enabled.
Thanks Ivan!
--
Florian