One of the reasons of this proposition is safety and performance -
host should not receive traffic which is not designated for it.
Some network devices can hold separate address tables for vlans and
real device, but for some reason there is no possibility to apply it
with generic net addressing scheme easily. At this moment the fastest
solution is to add mcast/ucast entries for every created vlan
including real device. But it also adds holes in the filtering and
thus wastes cpus cycles.
This patchseries tries to correct core to assign mcast and ucast
addresses only for vlans that really require it and as result an end
driver can exclusively and simply set its rx filters. As an example
it's implemented on cpsw TI driver, but generic changes provided by
this series can be reused by other ethernet drivers having similar
rx filter address possibilities.
An address+vid is considered as separate address. The reserved device
address length is 32 Bytes, for ethernet devices it's additional
opportunity to pass auxiliary address info, like virtual ID
identifying a device the address belongs to. This series makes it
possible at least for ETH_P_8021Q, but can be easily extended for ab.
Thus end real device can setup separate tables for virtual devices
just retrieving VID from the address. A device address space can
maintain addresses and references on them separately for each virtual
device if it needs so, or only addresses for real device (and all its
vlans) it holds usually.
A vlan device can be in any place of device chain upper real device,
say smth like rdevice/bonding/vlan or even rdevice/macvlan/vlan.
Similar approach can be used for passing additional information for
virtual devices as allmulti flag or/and promisc flag and do this per
vlan, but this is separate story and could be added as a continuation.
I was biased by try to add exclusive mcast and ucast support for vlans
and now have same with small generic correction and mostly locally in
the cpsw driver:
https://git.linaro.org/people/ivan.khoronzhuk/tsn_kernel.git/log/?h=ucast_vlan_fix
https://git.linaro.org/people/ivan.khoronzhuk/tsn_kernel.git/log/?h=mcast_vlan
and can say it looks better with generic changes provided by this patchset,
that's why this RFC. Above links can be used as fallback.
This series is verified on TI am572x EVM that can hold separate tables
for vlans. Potentially it can be easily extended to netcp driver for
keystone 2 boards (including k2g) and also new am6 chipsets. As a
simple test case, different combinations of vlan+macvlan, macvlan+vlan
were used and tested as with unicast as multicast addresses.
Based on net-next/master
Ivan Khoronzhuk (5):
net: core: dev_addr_lists: add VID to device address space
net: 8021q: vlan_dev: add vid tag for uc and mc address lists
net: 8021q: vlan_dev: add vid tag for vlan device mac address
net: ethernet: add default vid len for all ehternet kind devices
net: ethernet: ti: cpsw: update mc vlan and add uc vlan support based
on addr vids
drivers/net/ethernet/ti/cpsw.c | 86 +++++++++++++++++++----
include/linux/if_vlan.h | 1 +
include/linux/netdevice.h | 7 ++
net/8021q/vlan.c | 3 +
net/8021q/vlan_core.c | 10 +++
net/8021q/vlan_dev.c | 103 ++++++++++++++++++++++-----
net/core/dev_addr_lists.c | 124 +++++++++++++++++++++++++++------
net/ethernet/eth.c | 15 +++-
8 files changed, 290 insertions(+), 59 deletions(-)
--
2.17.1
Update multicast support for vlans and add uc vlan support using newly
added net/core changes with vid tag.
Signed-off-by: Ivan Khoronzhuk <[email protected]>
---
drivers/net/ethernet/ti/cpsw.c | 86 ++++++++++++++++++++++++++++------
1 file changed, 71 insertions(+), 15 deletions(-)
diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index e4aa030f1726..49a223720335 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -796,6 +796,67 @@ static int cpsw_purge_all_mc(struct net_device *ndev, const u8 *addr, int num)
return 0;
}
+static int cpsw_set_uc(struct net_device *ndev, const u8 *addr,
+ int vid, int add)
+{
+ struct cpsw_priv *priv = netdev_priv(ndev);
+ struct cpsw_common *cpsw = priv->cpsw;
+ int flags, port, ret;
+
+ if (vid < 0) {
+ if (cpsw->data.dual_emac)
+ vid = cpsw->slaves[priv->emac_port].port_vlan;
+ else
+ vid = 0;
+ }
+
+ port = HOST_PORT_NUM;
+ flags = vid ? ALE_VLAN : 0;
+
+ if (add)
+ ret = cpsw_ale_add_ucast(cpsw->ale, addr, port, flags, vid);
+ else
+ ret = cpsw_ale_del_ucast(cpsw->ale, addr, port, flags, vid);
+
+ return ret;
+}
+
+static int _cpsw_add_mc_addr(struct net_device *ndev, const u8 *addr)
+{
+ u16 vid;
+
+ vid = vlan_dev_get_addr_vid(ndev, addr);
+ cpsw_set_mc(ndev, addr, vid ? vid : -1, 1);
+ return 0;
+}
+
+static int _cpsw_del_mc_addr(struct net_device *ndev, const u8 *addr)
+{
+ u16 vid;
+
+ vid = vlan_dev_get_addr_vid(ndev, addr);
+ cpsw_set_mc(ndev, addr, vid ? vid : -1, 0);
+ return 0;
+}
+
+static int _cpsw_add_uc_addr(struct net_device *ndev, const u8 *addr)
+{
+ u16 vid;
+
+ vid = vlan_dev_get_addr_vid(ndev, addr);
+ cpsw_set_uc(ndev, addr, vid ? vid : -1, 1);
+ return 0;
+}
+
+static int _cpsw_del_uc_addr(struct net_device *ndev, const u8 *addr)
+{
+ u16 vid;
+
+ vid = vlan_dev_get_addr_vid(ndev, addr);
+ cpsw_set_uc(ndev, addr, vid ? vid : -1, 0);
+ return 0;
+}
+
static void cpsw_ndo_set_rx_mode(struct net_device *ndev)
{
struct cpsw_common *cpsw = ndev_to_cpsw(ndev);
@@ -814,8 +875,8 @@ static void cpsw_ndo_set_rx_mode(struct net_device *ndev)
cpsw_ale_set_allmulti(cpsw->ale, ndev->flags & IFF_ALLMULTI);
/* add/remove mcast address either for real netdev or for vlan */
- __hw_addr_ref_sync_dev(&ndev->mc, ndev, cpsw_add_mc_addr,
- cpsw_del_mc_addr);
+ __dev_mc_sync(ndev, _cpsw_add_mc_addr, _cpsw_del_mc_addr);
+ __dev_uc_sync(ndev, _cpsw_add_uc_addr, _cpsw_del_uc_addr);
}
static void cpsw_intr_enable(struct cpsw_common *cpsw)
@@ -2092,7 +2153,8 @@ static int cpsw_ndo_stop(struct net_device *ndev)
struct cpsw_common *cpsw = priv->cpsw;
cpsw_info(priv, ifdown, "shutting down cpsw device\n");
- __hw_addr_ref_unsync_dev(&ndev->mc, ndev, cpsw_purge_all_mc);
+ __dev_mc_unsync(ndev, _cpsw_del_mc_addr);
+ __dev_uc_unsync(ndev, _cpsw_del_uc_addr);
netif_tx_stop_all_queues(priv->ndev);
netif_carrier_off(priv->ndev);
@@ -2453,21 +2515,11 @@ static inline int cpsw_add_vlan_ale_entry(struct cpsw_priv *priv,
if (ret != 0)
return ret;
- ret = cpsw_ale_add_ucast(cpsw->ale, priv->mac_addr,
- HOST_PORT_NUM, ALE_VLAN, vid);
- if (ret != 0)
- goto clean_vid;
-
ret = cpsw_ale_add_mcast(cpsw->ale, priv->ndev->broadcast,
mcast_mask, ALE_VLAN, vid, 0);
- if (ret != 0)
- goto clean_vlan_ucast;
- return 0;
+ if (!ret)
+ return 0;
-clean_vlan_ucast:
- cpsw_ale_del_ucast(cpsw->ale, priv->mac_addr,
- HOST_PORT_NUM, ALE_VLAN, vid);
-clean_vid:
cpsw_ale_del_vlan(cpsw->ale, vid, 0);
return ret;
}
@@ -3418,6 +3470,8 @@ static int cpsw_probe_dual_emac(struct cpsw_priv *priv)
priv_sl2->emac_port = 1;
cpsw->slaves[1].ndev = ndev;
ndev->features |= NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_HW_VLAN_CTAG_RX;
+ ndev->priv_flags |= IFF_UNICAST_FLT;
+ ndev->vid_len = NET_802Q_VID_TSIZE;
ndev->netdev_ops = &cpsw_netdev_ops;
ndev->ethtool_ops = &cpsw_ethtool_ops;
@@ -3679,6 +3733,8 @@ static int cpsw_probe(struct platform_device *pdev)
}
ndev->features |= NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_HW_VLAN_CTAG_RX;
+ ndev->priv_flags |= IFF_UNICAST_FLT;
+ ndev->vid_len = NET_802Q_VID_TSIZE;
ndev->netdev_ops = &cpsw_netdev_ops;
ndev->ethtool_ops = &cpsw_ethtool_ops;
--
2.17.1
The vlan device address is hold separately from uc/mc lists and is
handled differently. With proposed change, the vlan dev address is bind
with real device address only if's inherited, in all other cases it's
separate address entry.
Signed-off-by: Ivan Khoronzhuk <[email protected]>
---
include/linux/netdevice.h | 2 ++
net/8021q/vlan.c | 3 ++
net/8021q/vlan_dev.c | 76 +++++++++++++++++++++++++++++----------
3 files changed, 62 insertions(+), 19 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 1b0f9c322b01..28c6da3583cf 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -76,6 +76,8 @@ void netdev_set_default_ethtool_ops(struct net_device *dev,
#define NET_RX_SUCCESS 0 /* keep 'em coming, baby */
#define NET_RX_DROP 1 /* packet dropped */
+#define NET_802Q_VID_TSIZE 2
+
/*
* Transmit return codes: transmit return codes originate from three different
* namespaces:
diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index aef1a977279c..a1eae9d61284 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -295,6 +295,9 @@ static void vlan_sync_address(struct net_device *dev,
if (vlan_dev_inherit_address(vlandev, dev))
goto out;
+ if (dev->vid_len)
+ goto out;
+
/* vlan address was different from the old address and is equal to
* the new address */
if (!ether_addr_equal(vlandev->dev_addr, vlan->real_dev_addr) &&
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index c05b313314b7..f6bcd847509e 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -258,12 +258,61 @@ static void vlan_dev_set_addr_vid(struct net_device *vlan_dev, u8 *addr)
addr[vlan_dev->addr_len + 1] = (vid >> 8) & 0xf;
}
+static int vlan_dev_add_addr(struct net_device *dev, u8 *addr)
+{
+ struct net_device *real_dev = vlan_dev_real_dev(dev);
+ unsigned char naddr[ETH_ALEN + NET_802Q_VID_TSIZE];
+
+ if (real_dev->vid_len) {
+ memcpy(naddr, addr, dev->addr_len);
+ vlan_dev_set_addr_vid(dev, naddr);
+ return dev_vid_uc_add(real_dev, naddr);
+ }
+
+ if (ether_addr_equal(addr, real_dev->dev_addr))
+ return 0;
+
+ return dev_uc_add(real_dev, addr);
+}
+
+static void vlan_dev_del_addr(struct net_device *dev, u8 *addr)
+{
+ struct net_device *real_dev = vlan_dev_real_dev(dev);
+ unsigned char naddr[ETH_ALEN + NET_802Q_VID_TSIZE];
+
+ if (real_dev->vid_len) {
+ memcpy(naddr, addr, dev->addr_len);
+ vlan_dev_set_addr_vid(dev, naddr);
+ dev_vid_uc_del(real_dev, naddr);
+ return;
+ }
+
+ if (!ether_addr_equal(dev->dev_addr, real_dev->dev_addr))
+ dev_uc_del(real_dev, addr);
+}
+
+int vlan_dev_subs_addr(struct net_device *dev, u8 *addr)
+{
+ int err;
+
+ err = vlan_dev_add_addr(dev, addr);
+ if (err < 0)
+ return err;
+
+ vlan_dev_del_addr(dev, dev->dev_addr);
+ return err;
+}
+
bool vlan_dev_inherit_address(struct net_device *dev,
struct net_device *real_dev)
{
if (dev->addr_assign_type != NET_ADDR_STOLEN)
return false;
+ if (real_dev->vid_len)
+ if (vlan_dev_subs_addr(dev, real_dev->dev_addr))
+ return false;
+
ether_addr_copy(dev->dev_addr, real_dev->dev_addr);
call_netdevice_notifiers(NETDEV_CHANGEADDR, dev);
return true;
@@ -279,9 +328,10 @@ static int vlan_dev_open(struct net_device *dev)
!(vlan->flags & VLAN_FLAG_LOOSE_BINDING))
return -ENETDOWN;
- if (!ether_addr_equal(dev->dev_addr, real_dev->dev_addr) &&
- !vlan_dev_inherit_address(dev, real_dev)) {
- err = dev_uc_add(real_dev, dev->dev_addr);
+ if (ether_addr_equal(dev->dev_addr, real_dev->dev_addr) ||
+ (!ether_addr_equal(dev->dev_addr, real_dev->dev_addr) &&
+ !vlan_dev_inherit_address(dev, real_dev))) {
+ err = vlan_dev_add_addr(dev, dev->dev_addr);
if (err < 0)
goto out;
}
@@ -313,8 +363,7 @@ static int vlan_dev_open(struct net_device *dev)
if (dev->flags & IFF_ALLMULTI)
dev_set_allmulti(real_dev, -1);
del_unicast:
- if (!ether_addr_equal(dev->dev_addr, real_dev->dev_addr))
- dev_uc_del(real_dev, dev->dev_addr);
+ vlan_dev_del_addr(dev, dev->dev_addr);
out:
netif_carrier_off(dev);
return err;
@@ -332,18 +381,14 @@ static int vlan_dev_stop(struct net_device *dev)
if (dev->flags & IFF_PROMISC)
dev_set_promiscuity(real_dev, -1);
- if (!ether_addr_equal(dev->dev_addr, real_dev->dev_addr))
- dev_uc_del(real_dev, dev->dev_addr);
-
+ vlan_dev_del_addr(dev, dev->dev_addr);
netif_carrier_off(dev);
return 0;
}
static int vlan_dev_set_mac_address(struct net_device *dev, void *p)
{
- struct net_device *real_dev = vlan_dev_priv(dev)->real_dev;
struct sockaddr *addr = p;
- int err;
if (!is_valid_ether_addr(addr->sa_data))
return -EADDRNOTAVAIL;
@@ -351,15 +396,8 @@ static int vlan_dev_set_mac_address(struct net_device *dev, void *p)
if (!(dev->flags & IFF_UP))
goto out;
- if (!ether_addr_equal(addr->sa_data, real_dev->dev_addr)) {
- err = dev_uc_add(real_dev, addr->sa_data);
- if (err < 0)
- return err;
- }
-
- if (!ether_addr_equal(dev->dev_addr, real_dev->dev_addr))
- dev_uc_del(real_dev, dev->dev_addr);
-
+ if (vlan_dev_subs_addr(dev, addr->sa_data))
+ return true;
out:
ether_addr_copy(dev->dev_addr, addr->sa_data);
return 0;
--
2.17.1
By default every ethernet netdev needs vid len = 2 bytes to be able to
hold up to 4096 vids. So set it for every eth device to be correct,
except vlan devs.
In order to shrink all addresses of devices above vlan, the vid_len
for vlan dev = 0, as result all suckers sync their addresses to common
base not taking in to account vid part (vid_len of "to" devices is
important only). And only vlan device is the source of addresses with
actual its vid set, propagating it to parent devices while rx_mode().
Also, don't bother those devices that at this moment not moved to vlan
addressing scheme, so while end ethernet device is created - set
vid_len to 0, thus, while syncing, its address space is concatenated
to one dimensional like usual, and who needs vlan addresses to be
separate - set it to VLAN_VID_TYPE_SIZE. When number of devices
supporting new vlan addressing scheme becomes more than simple ones,
it can be reversed, disabling it for those who don't need.
This vid_len should be placed under smth like CONFIG_8021Q_ADDR_FLT.
There is another decision - is to inherit vid_len or some feature flag
from end root device in order to all upper devices have vlan extended
address space only if exact end real device have such capability. But
I didn't, because it requires more changes and probably I'm not
familiar with all places where it should be inherited, I would
appreciate if someone can guid where it's applicable, then it could
become a little bit more limited.
Signed-off-by: Ivan Khoronzhuk <[email protected]>
---
net/8021q/vlan_dev.c | 1 +
net/ethernet/eth.c | 15 +++++++++++++--
2 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index f6bcd847509e..c2d934251e77 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -896,6 +896,7 @@ void vlan_setup(struct net_device *dev)
dev->min_mtu = 0;
dev->max_mtu = ETH_MAX_MTU;
+ dev->vid_len = 0;
eth_zero_addr(dev->broadcast);
}
diff --git a/net/ethernet/eth.c b/net/ethernet/eth.c
index 58933fa50bb5..52f90cefb6de 100644
--- a/net/ethernet/eth.c
+++ b/net/ethernet/eth.c
@@ -363,6 +363,7 @@ void ether_setup(struct net_device *dev)
dev->min_mtu = ETH_MIN_MTU;
dev->max_mtu = ETH_DATA_LEN;
dev->addr_len = ETH_ALEN;
+ dev->vid_len = NET_802Q_VID_TSIZE;
dev->tx_queue_len = DEFAULT_TX_QUEUE_LEN;
dev->flags = IFF_BROADCAST|IFF_MULTICAST;
dev->priv_flags |= IFF_TX_SKB_SHARING;
@@ -390,8 +391,18 @@ EXPORT_SYMBOL(ether_setup);
struct net_device *alloc_etherdev_mqs(int sizeof_priv, unsigned int txqs,
unsigned int rxqs)
{
- return alloc_netdev_mqs(sizeof_priv, "eth%d", NET_NAME_UNKNOWN,
- ether_setup, txqs, rxqs);
+ struct net_device *dev;
+
+ dev = alloc_netdev_mqs(sizeof_priv, "eth%d", NET_NAME_UNKNOWN,
+ ether_setup, txqs, rxqs);
+
+ /* TODO: When number of real ehternet devices supporting vlan
+ * addressing scheme becomes more than simple ones, it should
+ * be removed, disabling it (by dev->vid_len = 0) for those
+ * who doesn't support it
+ */
+ dev->vid_len = 0;
+ return dev;
}
EXPORT_SYMBOL(alloc_etherdev_mqs);
--
2.17.1
This patch adds VID tag at the end of each address. For Ethernet
addresses with 6 bytes long, when actual reserved address size is
32 bytes, that's possible to add tag w/o increasing maximum address
length. Thus each address for the case has 32 - 6 = 26 bytes to hold
additional info, say VID for virtual device address.
Therefore, when addresses are synched to the address list of parent
device the address list of latter can contain separate addresses for
virtual devices. It allows contain separate address tables for virtual
devices if they present and vlan be placed on any place of device
chain as the address is propagated to to the end real device thru
*_sync APIs.
If parent device doesn't want to have virtual addresses in its address
space the vid_len has to be 0, thus its address space is "shrunk" to
the state as before this patch.
The end real device, if it's allowed, can retrieve VID tag from an
address and set it for given vlan only. By default vid 0 is used
for real devices to distinguish it from virtual addresses (0xffff
should be used, but for RFC is Ok). Similar approach can be used for
passing additional information for virtual devices as allmulti flag
or/and promisc flag, but this is separate story and could be added
as a continuation.
See next patches to see how it works.
Signed-off-by: Ivan Khoronzhuk <[email protected]>
---
include/linux/netdevice.h | 5 ++
net/core/dev_addr_lists.c | 124 +++++++++++++++++++++++++++++++-------
2 files changed, 106 insertions(+), 23 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 94fb2e12f117..1b0f9c322b01 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1635,6 +1635,8 @@ enum netdev_priv_flags {
* @perm_addr: Permanent hw address
* @addr_assign_type: Hw address assignment type
* @addr_len: Hardware address length
+ * @vid_len: Virtual ID length, set only if virtual address
+ * filtering is supported
* @neigh_priv_len: Used in neigh_alloc()
* @dev_id: Used to differentiate devices that share
* the same link layer address
@@ -1864,6 +1866,7 @@ struct net_device {
unsigned char perm_addr[MAX_ADDR_LEN];
unsigned char addr_assign_type;
unsigned char addr_len;
+ unsigned char vid_len;
unsigned short neigh_priv_len;
unsigned short dev_id;
unsigned short dev_port;
@@ -4092,8 +4095,10 @@ int dev_addr_init(struct net_device *dev);
/* Functions used for unicast addresses handling */
int dev_uc_add(struct net_device *dev, const unsigned char *addr);
+int dev_vid_uc_add(struct net_device *dev, const unsigned char *addr);
int dev_uc_add_excl(struct net_device *dev, const unsigned char *addr);
int dev_uc_del(struct net_device *dev, const unsigned char *addr);
+int dev_vid_uc_del(struct net_device *dev, const unsigned char *addr);
int dev_uc_sync(struct net_device *to, struct net_device *from);
int dev_uc_sync_multiple(struct net_device *to, struct net_device *from);
void dev_uc_unsync(struct net_device *to, struct net_device *from);
diff --git a/net/core/dev_addr_lists.c b/net/core/dev_addr_lists.c
index 81a8cd4ea3bd..82c9c347f338 100644
--- a/net/core/dev_addr_lists.c
+++ b/net/core/dev_addr_lists.c
@@ -542,6 +542,26 @@ int dev_addr_del(struct net_device *dev, const unsigned char *addr,
}
EXPORT_SYMBOL(dev_addr_del);
+static int get_addr_len(struct net_device *dev)
+{
+ return dev->addr_len + dev->vid_len;
+}
+
+static int set_vid_addr(struct net_device *dev, const unsigned char *addr,
+ unsigned char *naddr)
+{
+ int i;
+
+ if (!dev->vid_len)
+ return dev->addr_len;
+
+ memcpy(naddr, addr, dev->addr_len);
+ for (i = 0; i < dev->vid_len; i++)
+ naddr[dev->addr_len + i] = 0;
+
+ return get_addr_len(dev);
+}
+
/*
* Unicast list handling functions
*/
@@ -553,18 +573,22 @@ EXPORT_SYMBOL(dev_addr_del);
*/
int dev_uc_add_excl(struct net_device *dev, const unsigned char *addr)
{
+ unsigned char naddr[MAX_ADDR_LEN];
struct netdev_hw_addr *ha;
- int err;
+ int addr_len, err;
+
+ addr_len = set_vid_addr(dev, addr, naddr);
+ addr = dev->vid_len ? naddr : addr;
netif_addr_lock_bh(dev);
list_for_each_entry(ha, &dev->uc.list, list) {
- if (!memcmp(ha->addr, addr, dev->addr_len) &&
+ if (!memcmp(ha->addr, addr, addr_len) &&
ha->type == NETDEV_HW_ADDR_T_UNICAST) {
err = -EEXIST;
goto out;
}
}
- err = __hw_addr_create_ex(&dev->uc, addr, dev->addr_len,
+ err = __hw_addr_create_ex(&dev->uc, addr, addr_len,
NETDEV_HW_ADDR_T_UNICAST, true, false);
if (!err)
__dev_set_rx_mode(dev);
@@ -575,47 +599,89 @@ int dev_uc_add_excl(struct net_device *dev, const unsigned char *addr)
EXPORT_SYMBOL(dev_uc_add_excl);
/**
- * dev_uc_add - Add a secondary unicast address
+ * dev_vid_uc_add - Add a secondary unicast address with tag
* @dev: device
- * @addr: address to add
+ * @addr: address to add, includes vid tag already
*
* Add a secondary unicast address to the device or increase
* the reference count if it already exists.
*/
-int dev_uc_add(struct net_device *dev, const unsigned char *addr)
+int dev_vid_uc_add(struct net_device *dev, const unsigned char *addr)
{
int err;
netif_addr_lock_bh(dev);
- err = __hw_addr_add(&dev->uc, addr, dev->addr_len,
+ err = __hw_addr_add(&dev->uc, addr, get_addr_len(dev),
NETDEV_HW_ADDR_T_UNICAST);
if (!err)
__dev_set_rx_mode(dev);
netif_addr_unlock_bh(dev);
return err;
}
+EXPORT_SYMBOL(dev_vid_uc_add);
+
+/**
+ * dev_uc_add - Add a secondary unicast address
+ * @dev: device
+ * @addr: address to add
+ *
+ * Add a secondary unicast address to the device or increase
+ * the reference count if it already exists.
+ */
+int dev_uc_add(struct net_device *dev, const unsigned char *addr)
+{
+ unsigned char naddr[MAX_ADDR_LEN];
+ int err;
+
+ set_vid_addr(dev, addr, naddr);
+ addr = dev->vid_len ? naddr : addr;
+
+ err = dev_vid_uc_add(dev, addr);
+ return err;
+}
EXPORT_SYMBOL(dev_uc_add);
/**
* dev_uc_del - Release secondary unicast address.
* @dev: device
- * @addr: address to delete
+ * @addr: address to delete, includes vid tag already
*
* Release reference to a secondary unicast address and remove it
* from the device if the reference count drops to zero.
*/
-int dev_uc_del(struct net_device *dev, const unsigned char *addr)
+int dev_vid_uc_del(struct net_device *dev, const unsigned char *addr)
{
int err;
netif_addr_lock_bh(dev);
- err = __hw_addr_del(&dev->uc, addr, dev->addr_len,
+ err = __hw_addr_del(&dev->uc, addr, get_addr_len(dev),
NETDEV_HW_ADDR_T_UNICAST);
if (!err)
__dev_set_rx_mode(dev);
netif_addr_unlock_bh(dev);
return err;
}
+EXPORT_SYMBOL(dev_vid_uc_del);
+
+/**
+ * dev_uc_del - Release secondary unicast address.
+ * @dev: device
+ * @addr: address to delete
+ *
+ * Release reference to a secondary unicast address and remove it
+ * from the device if the reference count drops to zero.
+ */
+int dev_uc_del(struct net_device *dev, const unsigned char *addr)
+{
+ unsigned char naddr[MAX_ADDR_LEN];
+ int err;
+
+ set_vid_addr(dev, addr, naddr);
+ addr = dev->vid_len ? naddr : addr;
+
+ err = dev_vid_uc_del(dev, addr);
+ return err;
+}
EXPORT_SYMBOL(dev_uc_del);
/**
@@ -639,7 +705,7 @@ int dev_uc_sync(struct net_device *to, struct net_device *from)
return -EINVAL;
netif_addr_lock_nested(to);
- err = __hw_addr_sync(&to->uc, &from->uc, to->addr_len);
+ err = __hw_addr_sync(&to->uc, &from->uc, get_addr_len(to));
if (!err)
__dev_set_rx_mode(to);
netif_addr_unlock(to);
@@ -669,7 +735,7 @@ int dev_uc_sync_multiple(struct net_device *to, struct net_device *from)
return -EINVAL;
netif_addr_lock_nested(to);
- err = __hw_addr_sync_multiple(&to->uc, &from->uc, to->addr_len);
+ err = __hw_addr_sync_multiple(&to->uc, &from->uc, get_addr_len(to));
if (!err)
__dev_set_rx_mode(to);
netif_addr_unlock(to);
@@ -693,7 +759,7 @@ void dev_uc_unsync(struct net_device *to, struct net_device *from)
netif_addr_lock_bh(from);
netif_addr_lock_nested(to);
- __hw_addr_unsync(&to->uc, &from->uc, to->addr_len);
+ __hw_addr_unsync(&to->uc, &from->uc, get_addr_len(to));
__dev_set_rx_mode(to);
netif_addr_unlock(to);
netif_addr_unlock_bh(from);
@@ -737,18 +803,22 @@ EXPORT_SYMBOL(dev_uc_init);
*/
int dev_mc_add_excl(struct net_device *dev, const unsigned char *addr)
{
+ unsigned char naddr[MAX_ADDR_LEN];
struct netdev_hw_addr *ha;
- int err;
+ int addr_len, err;
+
+ addr_len = set_vid_addr(dev, addr, naddr);
+ addr = dev->vid_len ? naddr : addr;
netif_addr_lock_bh(dev);
list_for_each_entry(ha, &dev->mc.list, list) {
- if (!memcmp(ha->addr, addr, dev->addr_len) &&
+ if (!memcmp(ha->addr, addr, addr_len) &&
ha->type == NETDEV_HW_ADDR_T_MULTICAST) {
err = -EEXIST;
goto out;
}
}
- err = __hw_addr_create_ex(&dev->mc, addr, dev->addr_len,
+ err = __hw_addr_create_ex(&dev->mc, addr, addr_len,
NETDEV_HW_ADDR_T_MULTICAST, true, false);
if (!err)
__dev_set_rx_mode(dev);
@@ -761,10 +831,14 @@ EXPORT_SYMBOL(dev_mc_add_excl);
static int __dev_mc_add(struct net_device *dev, const unsigned char *addr,
bool global)
{
- int err;
+ unsigned char naddr[MAX_ADDR_LEN];
+ int addr_len, err;
+
+ addr_len = set_vid_addr(dev, addr, naddr);
+ addr = dev->vid_len ? naddr : addr;
netif_addr_lock_bh(dev);
- err = __hw_addr_add_ex(&dev->mc, addr, dev->addr_len,
+ err = __hw_addr_add_ex(&dev->mc, addr, addr_len,
NETDEV_HW_ADDR_T_MULTICAST, global, false, 0);
if (!err)
__dev_set_rx_mode(dev);
@@ -801,10 +875,14 @@ EXPORT_SYMBOL(dev_mc_add_global);
static int __dev_mc_del(struct net_device *dev, const unsigned char *addr,
bool global)
{
- int err;
+ unsigned char naddr[MAX_ADDR_LEN];
+ int addr_len, err;
+
+ addr_len = set_vid_addr(dev, addr, naddr);
+ addr = dev->vid_len ? naddr : addr;
netif_addr_lock_bh(dev);
- err = __hw_addr_del_ex(&dev->mc, addr, dev->addr_len,
+ err = __hw_addr_del_ex(&dev->mc, addr, addr_len,
NETDEV_HW_ADDR_T_MULTICAST, global, false);
if (!err)
__dev_set_rx_mode(dev);
@@ -860,7 +938,7 @@ int dev_mc_sync(struct net_device *to, struct net_device *from)
return -EINVAL;
netif_addr_lock_nested(to);
- err = __hw_addr_sync(&to->mc, &from->mc, to->addr_len);
+ err = __hw_addr_sync(&to->mc, &from->mc, get_addr_len(to));
if (!err)
__dev_set_rx_mode(to);
netif_addr_unlock(to);
@@ -890,7 +968,7 @@ int dev_mc_sync_multiple(struct net_device *to, struct net_device *from)
return -EINVAL;
netif_addr_lock_nested(to);
- err = __hw_addr_sync_multiple(&to->mc, &from->mc, to->addr_len);
+ err = __hw_addr_sync_multiple(&to->mc, &from->mc, get_addr_len(to));
if (!err)
__dev_set_rx_mode(to);
netif_addr_unlock(to);
@@ -914,7 +992,7 @@ void dev_mc_unsync(struct net_device *to, struct net_device *from)
netif_addr_lock_bh(from);
netif_addr_lock_nested(to);
- __hw_addr_unsync(&to->mc, &from->mc, to->addr_len);
+ __hw_addr_unsync(&to->mc, &from->mc, get_addr_len(to));
__dev_set_rx_mode(to);
netif_addr_unlock(to);
netif_addr_unlock_bh(from);
--
2.17.1
Update vlan mc and uc addresses with VID tag while propagating address
set to lower devices, do this only if address is not synched. It allows
on end driver level to distinguish address belonging to vlans.
Signed-off-by: Ivan Khoronzhuk <[email protected]>
---
include/linux/if_vlan.h | 1 +
net/8021q/vlan_core.c | 10 ++++++++++
net/8021q/vlan_dev.c | 26 ++++++++++++++++++++++++++
3 files changed, 37 insertions(+)
diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index 4cca4da7a6de..94657f3c483a 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -136,6 +136,7 @@ extern struct net_device *__vlan_find_dev_deep_rcu(struct net_device *real_dev,
extern int vlan_for_each(struct net_device *dev,
int (*action)(struct net_device *dev, int vid,
void *arg), void *arg);
+extern u16 vlan_dev_get_addr_vid(struct net_device *dev, const u8 *addr);
extern struct net_device *vlan_dev_real_dev(const struct net_device *dev);
extern u16 vlan_dev_vlan_id(const struct net_device *dev);
extern __be16 vlan_dev_vlan_proto(const struct net_device *dev);
diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
index a313165e7a67..5d17947d6988 100644
--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -454,6 +454,16 @@ bool vlan_uses_dev(const struct net_device *dev)
}
EXPORT_SYMBOL(vlan_uses_dev);
+u16 vlan_dev_get_addr_vid(struct net_device *dev, const u8 *addr)
+{
+ u16 vid = 0;
+
+ vid = addr[dev->addr_len];
+ vid |= (addr[dev->addr_len + 1] & 0xf) << 8;
+ return vid;
+}
+EXPORT_SYMBOL(vlan_dev_get_addr_vid);
+
static struct sk_buff *vlan_gro_receive(struct list_head *head,
struct sk_buff *skb)
{
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index b2d9c8f27cd7..c05b313314b7 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -250,6 +250,14 @@ void vlan_dev_get_realdev_name(const struct net_device *dev, char *result)
strncpy(result, vlan_dev_priv(dev)->real_dev->name, 23);
}
+static void vlan_dev_set_addr_vid(struct net_device *vlan_dev, u8 *addr)
+{
+ u16 vid = vlan_dev_vlan_id(vlan_dev);
+
+ addr[vlan_dev->addr_len] = vid & 0xff;
+ addr[vlan_dev->addr_len + 1] = (vid >> 8) & 0xf;
+}
+
bool vlan_dev_inherit_address(struct net_device *dev,
struct net_device *real_dev)
{
@@ -481,8 +489,26 @@ static void vlan_dev_change_rx_flags(struct net_device *dev, int change)
}
}
+static void vlan_dev_align_addr_vid(struct net_device *vlan_dev)
+{
+ struct net_device *real_dev = vlan_dev_real_dev(vlan_dev);
+ struct netdev_hw_addr *ha;
+
+ if (!real_dev->vid_len)
+ return;
+
+ netdev_for_each_mc_addr(ha, vlan_dev)
+ if (!ha->sync_cnt)
+ vlan_dev_set_addr_vid(vlan_dev, ha->addr);
+
+ netdev_for_each_uc_addr(ha, vlan_dev)
+ if (!ha->sync_cnt)
+ vlan_dev_set_addr_vid(vlan_dev, ha->addr);
+}
+
static void vlan_dev_set_rx_mode(struct net_device *vlan_dev)
{
+ vlan_dev_align_addr_vid(vlan_dev);
dev_mc_sync(vlan_dev_priv(vlan_dev)->real_dev, vlan_dev);
dev_uc_sync(vlan_dev_priv(vlan_dev)->real_dev, vlan_dev);
}
--
2.17.1
On 12/3/18 10:40 AM, Ivan Khoronzhuk wrote:
> Update vlan mc and uc addresses with VID tag while propagating address
> set to lower devices, do this only if address is not synched. It allows
> on end driver level to distinguish address belonging to vlans.
Underlying driver for the real device would be able to properly identify
that you are attempting to add an address to a virtual device, which
happens to be of VLAN kind so I am really not sure this is the right
approach here.
From there, it seems to me that we have two situations:
- each of your network devices expose VLAN devices directly on top of
the real device, in which case your driver should support
ndo_vlan_rx_add_vid and ndo_vlan_rx_kill_vid to know when VLAN devices
are create and maintain a VLAN device to VID correspondence if it needs
to when being called while setting the addresses
- you are setting up a bridge that is VLAN aware on one of your bridge
ports, and there you can use switchdev to learn about such events and
know about both addresses as well as VIDs that must be programmed into
your real device
It seems to me that what you need may be something like either:
- notifications on slave devices when addresses are added via
ndo_set_rxmode()
or
- dev_{uc,mc}_sync() should be augmented with a "source net_device"
argument which allows you to differentiate which network device is the
source of the address programming. That way, no need to "hash" the MAC
address with a VID, any network device specific information can be
provided and in the real device driver you can do: if
(netif_is_vlan()... etc.)
Hopefully someone else will chime in.
>
> Signed-off-by: Ivan Khoronzhuk <[email protected]>
> ---
> include/linux/if_vlan.h | 1 +
> net/8021q/vlan_core.c | 10 ++++++++++
> net/8021q/vlan_dev.c | 26 ++++++++++++++++++++++++++
> 3 files changed, 37 insertions(+)
>
> diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
> index 4cca4da7a6de..94657f3c483a 100644
> --- a/include/linux/if_vlan.h
> +++ b/include/linux/if_vlan.h
> @@ -136,6 +136,7 @@ extern struct net_device *__vlan_find_dev_deep_rcu(struct net_device *real_dev,
> extern int vlan_for_each(struct net_device *dev,
> int (*action)(struct net_device *dev, int vid,
> void *arg), void *arg);
> +extern u16 vlan_dev_get_addr_vid(struct net_device *dev, const u8 *addr);
> extern struct net_device *vlan_dev_real_dev(const struct net_device *dev);
> extern u16 vlan_dev_vlan_id(const struct net_device *dev);
> extern __be16 vlan_dev_vlan_proto(const struct net_device *dev);
> diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
> index a313165e7a67..5d17947d6988 100644
> --- a/net/8021q/vlan_core.c
> +++ b/net/8021q/vlan_core.c
> @@ -454,6 +454,16 @@ bool vlan_uses_dev(const struct net_device *dev)
> }
> EXPORT_SYMBOL(vlan_uses_dev);
>
> +u16 vlan_dev_get_addr_vid(struct net_device *dev, const u8 *addr)
> +{
> + u16 vid = 0;
> +
> + vid = addr[dev->addr_len];
> + vid |= (addr[dev->addr_len + 1] & 0xf) << 8;
> + return vid;
> +}
> +EXPORT_SYMBOL(vlan_dev_get_addr_vid);
> +
> static struct sk_buff *vlan_gro_receive(struct list_head *head,
> struct sk_buff *skb)
> {
> diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
> index b2d9c8f27cd7..c05b313314b7 100644
> --- a/net/8021q/vlan_dev.c
> +++ b/net/8021q/vlan_dev.c
> @@ -250,6 +250,14 @@ void vlan_dev_get_realdev_name(const struct net_device *dev, char *result)
> strncpy(result, vlan_dev_priv(dev)->real_dev->name, 23);
> }
>
> +static void vlan_dev_set_addr_vid(struct net_device *vlan_dev, u8 *addr)
> +{
> + u16 vid = vlan_dev_vlan_id(vlan_dev);
> +
> + addr[vlan_dev->addr_len] = vid & 0xff;
> + addr[vlan_dev->addr_len + 1] = (vid >> 8) & 0xf;
> +}
> +
> bool vlan_dev_inherit_address(struct net_device *dev,
> struct net_device *real_dev)
> {
> @@ -481,8 +489,26 @@ static void vlan_dev_change_rx_flags(struct net_device *dev, int change)
> }
> }
>
> +static void vlan_dev_align_addr_vid(struct net_device *vlan_dev)
> +{
> + struct net_device *real_dev = vlan_dev_real_dev(vlan_dev);
> + struct netdev_hw_addr *ha;
> +
> + if (!real_dev->vid_len)
> + return;
> +
> + netdev_for_each_mc_addr(ha, vlan_dev)
> + if (!ha->sync_cnt)
> + vlan_dev_set_addr_vid(vlan_dev, ha->addr);
> +
> + netdev_for_each_uc_addr(ha, vlan_dev)
> + if (!ha->sync_cnt)
> + vlan_dev_set_addr_vid(vlan_dev, ha->addr);
> +}
> +
> static void vlan_dev_set_rx_mode(struct net_device *vlan_dev)
> {
> + vlan_dev_align_addr_vid(vlan_dev);
> dev_mc_sync(vlan_dev_priv(vlan_dev)->real_dev, vlan_dev);
> dev_uc_sync(vlan_dev_priv(vlan_dev)->real_dev, vlan_dev);
> }
>
--
Florian
On 12/3/18 10:40 AM, Ivan Khoronzhuk wrote:
> One of the reasons of this proposition is safety and performance -
> host should not receive traffic which is not designated for it.
>
> Some network devices can hold separate address tables for vlans and
> real device, but for some reason there is no possibility to apply it
> with generic net addressing scheme easily. At this moment the fastest
> solution is to add mcast/ucast entries for every created vlan
> including real device. But it also adds holes in the filtering and
> thus wastes cpus cycles.
>
> This patchseries tries to correct core to assign mcast and ucast
> addresses only for vlans that really require it and as result an end
> driver can exclusively and simply set its rx filters. As an example
> it's implemented on cpsw TI driver, but generic changes provided by
> this series can be reused by other ethernet drivers having similar
> rx filter address possibilities.
>
> An address+vid is considered as separate address. The reserved device
> address length is 32 Bytes, for ethernet devices it's additional
> opportunity to pass auxiliary address info, like virtual ID
> identifying a device the address belongs to. This series makes it
> possible at least for ETH_P_8021Q, but can be easily extended for ab.
> Thus end real device can setup separate tables for virtual devices
> just retrieving VID from the address. A device address space can
> maintain addresses and references on them separately for each virtual
> device if it needs so, or only addresses for real device (and all its
> vlans) it holds usually.
>
> A vlan device can be in any place of device chain upper real device,
> say smth like rdevice/bonding/vlan or even rdevice/macvlan/vlan.
> Similar approach can be used for passing additional information for
> virtual devices as allmulti flag or/and promisc flag and do this per
> vlan, but this is separate story and could be added as a continuation.
>
> I was biased by try to add exclusive mcast and ucast support for vlans
> and now have same with small generic correction and mostly locally in
> the cpsw driver:
> https://git.linaro.org/people/ivan.khoronzhuk/tsn_kernel.git/log/?h=ucast_vlan_fix
> https://git.linaro.org/people/ivan.khoronzhuk/tsn_kernel.git/log/?h=mcast_vlan
> and can say it looks better with generic changes provided by this patchset,
> that's why this RFC. Above links can be used as fallback.
>
> This series is verified on TI am572x EVM that can hold separate tables
> for vlans. Potentially it can be easily extended to netcp driver for
> keystone 2 boards (including k2g) and also new am6 chipsets. As a
> simple test case, different combinations of vlan+macvlan, macvlan+vlan
> were used and tested as with unicast as multicast addresses.
I think I get what you want to do, which is make sure that on per VID
basis, only the necessary UC and MC addresses are programmed into the
respective filters, when each of your network ports operate in what cpsw
calls "dual emac"? Which is really something that should be called "NIC"
mode, and/or is analogous to how we bring up ports in DSA: each of them
acts as a separate network device. Speaking of the later, we have some
known issues in unmanaged mode with MC addresses which I am working on
addressing.
It seems to me, as replied in patch 2, that what you are missing is the
source of the address programming request, and therefore either you pass
an additional "source net_device" to functions like dev_{uc,mc}_sync(),
or you pass a well defined structured that can encode the address as
well as the interface type, but the approach you have right now, where
the address storage is extended to include the VID is both extremely
specific to VLAN devices, as well as not scaling particularly well. With
your patches we have VLAN devices covered, but what about MACVLAN, bond,
team or anything that needs to look at specific MAC addresses to be
filtered to make management decisions?
Also imagine being able to store up to 4K MAC addresses for a total of
4K VLANs, this would have occupied 4K * sizeof(some structure) and now
it's 2 bytes bigger...
>
> Based on net-next/master
>
> Ivan Khoronzhuk (5):
> net: core: dev_addr_lists: add VID to device address space
> net: 8021q: vlan_dev: add vid tag for uc and mc address lists
> net: 8021q: vlan_dev: add vid tag for vlan device mac address
> net: ethernet: add default vid len for all ehternet kind devices
> net: ethernet: ti: cpsw: update mc vlan and add uc vlan support based
> on addr vids
>
> drivers/net/ethernet/ti/cpsw.c | 86 +++++++++++++++++++----
> include/linux/if_vlan.h | 1 +
> include/linux/netdevice.h | 7 ++
> net/8021q/vlan.c | 3 +
> net/8021q/vlan_core.c | 10 +++
> net/8021q/vlan_dev.c | 103 ++++++++++++++++++++++-----
> net/core/dev_addr_lists.c | 124 +++++++++++++++++++++++++++------
> net/ethernet/eth.c | 15 +++-
> 8 files changed, 290 insertions(+), 59 deletions(-)
>
--
Florian
On Mon, Dec 03, 2018 at 02:17:00PM -0800, Florian Fainelli wrote:
>On 12/3/18 10:40 AM, Ivan Khoronzhuk wrote:
>> Update vlan mc and uc addresses with VID tag while propagating address
>> set to lower devices, do this only if address is not synched. It allows
>> on end driver level to distinguish address belonging to vlans.
>
>Underlying driver for the real device would be able to properly identify
>that you are attempting to add an address to a virtual device, which
>happens to be of VLAN kind so I am really not sure this is the right
>approach here.
>
>From there, it seems to me that we have two situations:
>
>- each of your network devices expose VLAN devices directly on top of
>the real device, in which case your driver should support
>ndo_vlan_rx_add_vid and ndo_vlan_rx_kill_vid to know when VLAN devices
>are create and maintain a VLAN device to VID correspondence if it needs
>to when being called while setting the addresses
>
>- you are setting up a bridge that is VLAN aware on one of your bridge
>ports, and there you can use switchdev to learn about such events and
>know about both addresses as well as VIDs that must be programmed into
>your real device
No limits to have any "middle" device between real end device and
virtual one, not only a bridge, but also other kind. And as it's generic
change, it should cover all such cases, the simplest example is:
real_dev/macvlan/vlan.
>
>It seems to me that what you need may be something like either:
>
>- notifications on slave devices when addresses are added via
>ndo_set_rxmode()
>
>or
>
>- dev_{uc,mc}_sync() should be augmented with a "source net_device"
>argument which allows you to differentiate which network device is the
>source of the address programming. That way, no need to "hash" the MAC
>address with a VID, any network device specific information can be
>provided and in the real device driver you can do: if
>(netif_is_vlan()... etc.)
No issue to retrieve vlan dev if it's directly on top of real dev.
Issue is to get it when it's not directly connected as it's not in
vlan_info group list. Who knows what else can be "structed" on top of
real dev till the vlan device. Please look on reply for cover letter,
as it seems requires similar response.
>
>Hopefully someone else will chime in.
>
>>
>> Signed-off-by: Ivan Khoronzhuk <[email protected]>
>> ---
>> include/linux/if_vlan.h | 1 +
>> net/8021q/vlan_core.c | 10 ++++++++++
>> net/8021q/vlan_dev.c | 26 ++++++++++++++++++++++++++
>> 3 files changed, 37 insertions(+)
>>
>> diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
>> index 4cca4da7a6de..94657f3c483a 100644
>> --- a/include/linux/if_vlan.h
>> +++ b/include/linux/if_vlan.h
>> @@ -136,6 +136,7 @@ extern struct net_device *__vlan_find_dev_deep_rcu(struct net_device *real_dev,
>> extern int vlan_for_each(struct net_device *dev,
>> int (*action)(struct net_device *dev, int vid,
>> void *arg), void *arg);
>> +extern u16 vlan_dev_get_addr_vid(struct net_device *dev, const u8 *addr);
>> extern struct net_device *vlan_dev_real_dev(const struct net_device *dev);
>> extern u16 vlan_dev_vlan_id(const struct net_device *dev);
>> extern __be16 vlan_dev_vlan_proto(const struct net_device *dev);
>> diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
>> index a313165e7a67..5d17947d6988 100644
>> --- a/net/8021q/vlan_core.c
>> +++ b/net/8021q/vlan_core.c
>> @@ -454,6 +454,16 @@ bool vlan_uses_dev(const struct net_device *dev)
>> }
>> EXPORT_SYMBOL(vlan_uses_dev);
>>
>> +u16 vlan_dev_get_addr_vid(struct net_device *dev, const u8 *addr)
>> +{
>> + u16 vid = 0;
>> +
>> + vid = addr[dev->addr_len];
>> + vid |= (addr[dev->addr_len + 1] & 0xf) << 8;
>> + return vid;
>> +}
>> +EXPORT_SYMBOL(vlan_dev_get_addr_vid);
>> +
>> static struct sk_buff *vlan_gro_receive(struct list_head *head,
>> struct sk_buff *skb)
>> {
>> diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
>> index b2d9c8f27cd7..c05b313314b7 100644
>> --- a/net/8021q/vlan_dev.c
>> +++ b/net/8021q/vlan_dev.c
>> @@ -250,6 +250,14 @@ void vlan_dev_get_realdev_name(const struct net_device *dev, char *result)
>> strncpy(result, vlan_dev_priv(dev)->real_dev->name, 23);
>> }
>>
>> +static void vlan_dev_set_addr_vid(struct net_device *vlan_dev, u8 *addr)
>> +{
>> + u16 vid = vlan_dev_vlan_id(vlan_dev);
>> +
>> + addr[vlan_dev->addr_len] = vid & 0xff;
>> + addr[vlan_dev->addr_len + 1] = (vid >> 8) & 0xf;
>> +}
>> +
>> bool vlan_dev_inherit_address(struct net_device *dev,
>> struct net_device *real_dev)
>> {
>> @@ -481,8 +489,26 @@ static void vlan_dev_change_rx_flags(struct net_device *dev, int change)
>> }
>> }
>>
>> +static void vlan_dev_align_addr_vid(struct net_device *vlan_dev)
>> +{
>> + struct net_device *real_dev = vlan_dev_real_dev(vlan_dev);
>> + struct netdev_hw_addr *ha;
>> +
>> + if (!real_dev->vid_len)
>> + return;
>> +
>> + netdev_for_each_mc_addr(ha, vlan_dev)
>> + if (!ha->sync_cnt)
>> + vlan_dev_set_addr_vid(vlan_dev, ha->addr);
>> +
>> + netdev_for_each_uc_addr(ha, vlan_dev)
>> + if (!ha->sync_cnt)
>> + vlan_dev_set_addr_vid(vlan_dev, ha->addr);
>> +}
>> +
>> static void vlan_dev_set_rx_mode(struct net_device *vlan_dev)
>> {
>> + vlan_dev_align_addr_vid(vlan_dev);
>> dev_mc_sync(vlan_dev_priv(vlan_dev)->real_dev, vlan_dev);
>> dev_uc_sync(vlan_dev_priv(vlan_dev)->real_dev, vlan_dev);
>> }
>>
>
>
>--
>Florian
--
Regards,
Ivan Khoronzhuk
On 12/3/18 3:51 PM, Ivan Khoronzhuk wrote:
> On Mon, Dec 03, 2018 at 02:17:00PM -0800, Florian Fainelli wrote:
>> On 12/3/18 10:40 AM, Ivan Khoronzhuk wrote:
>>> Update vlan mc and uc addresses with VID tag while propagating address
>>> set to lower devices, do this only if address is not synched. It allows
>>> on end driver level to distinguish address belonging to vlans.
>>
>> Underlying driver for the real device would be able to properly identify
>> that you are attempting to add an address to a virtual device, which
>> happens to be of VLAN kind so I am really not sure this is the right
>> approach here.
>>
>> From there, it seems to me that we have two situations:
>>
>> - each of your network devices expose VLAN devices directly on top of
>> the real device, in which case your driver should support
>> ndo_vlan_rx_add_vid and ndo_vlan_rx_kill_vid to know when VLAN devices
>> are create and maintain a VLAN device to VID correspondence if it needs
>> to when being called while setting the addresses
>>
>> - you are setting up a bridge that is VLAN aware on one of your bridge
>> ports, and there you can use switchdev to learn about such events and
>> know about both addresses as well as VIDs that must be programmed into
>> your real device
> No limits to have any "middle" device between real end device and
> virtual one, not only a bridge, but also other kind. And as it's generic
> change, it should cover all such cases, the simplest example is:
> real_dev/macvlan/vlan.
It is not generic if the additional information is a VLAN ID, that
construct does not apply to all types of virtual devices, that is part
of my issue with the extra VID that is being added. If this was a void *
priv and any virtual device could pass up/down information that might be
more acceptable.
>
>>
>> It seems to me that what you need may be something like either:
>>
>> - notifications on slave devices when addresses are added via
>> ndo_set_rxmode()
>>
>> or
>>
>> - dev_{uc,mc}_sync() should be augmented with a "source net_device"
>> argument which allows you to differentiate which network device is the
>> source of the address programming. That way, no need to "hash" the MAC
>> address with a VID, any network device specific information can be
>> provided and in the real device driver you can do: if
>> (netif_is_vlan()... etc.)
> No issue to retrieve vlan dev if it's directly on top of real dev.
> Issue is to get it when it's not directly connected as it's not in
> vlan_info group list. Who knows what else can be "structed" on top of
> real dev till the vlan device. Please look on reply for cover letter,
> as it seems requires similar response.
In that case, there are notifications generated that you must be
listening to determine whether you have something like a VLAN device on
top of a bond, which is a port member of a bridge, on which one of your
real device port is enslaved (yes, it can be that type of stacking).
>
>>
>> Hopefully someone else will chime in.
>>
>>>
>>> Signed-off-by: Ivan Khoronzhuk <[email protected]>
>>> ---
>>> include/linux/if_vlan.h | 1 +
>>> net/8021q/vlan_core.c | 10 ++++++++++
>>> net/8021q/vlan_dev.c | 26 ++++++++++++++++++++++++++
>>> 3 files changed, 37 insertions(+)
>>>
>>> diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
>>> index 4cca4da7a6de..94657f3c483a 100644
>>> --- a/include/linux/if_vlan.h
>>> +++ b/include/linux/if_vlan.h
>>> @@ -136,6 +136,7 @@ extern struct net_device
>>> *__vlan_find_dev_deep_rcu(struct net_device *real_dev,
>>> extern int vlan_for_each(struct net_device *dev,
>>> int (*action)(struct net_device *dev, int vid,
>>> void *arg), void *arg);
>>> +extern u16 vlan_dev_get_addr_vid(struct net_device *dev, const u8
>>> *addr);
>>> extern struct net_device *vlan_dev_real_dev(const struct net_device
>>> *dev);
>>> extern u16 vlan_dev_vlan_id(const struct net_device *dev);
>>> extern __be16 vlan_dev_vlan_proto(const struct net_device *dev);
>>> diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
>>> index a313165e7a67..5d17947d6988 100644
>>> --- a/net/8021q/vlan_core.c
>>> +++ b/net/8021q/vlan_core.c
>>> @@ -454,6 +454,16 @@ bool vlan_uses_dev(const struct net_device *dev)
>>> }
>>> EXPORT_SYMBOL(vlan_uses_dev);
>>>
>>> +u16 vlan_dev_get_addr_vid(struct net_device *dev, const u8 *addr)
>>> +{
>>> + u16 vid = 0;
>>> +
>>> + vid = addr[dev->addr_len];
>>> + vid |= (addr[dev->addr_len + 1] & 0xf) << 8;
>>> + return vid;
>>> +}
>>> +EXPORT_SYMBOL(vlan_dev_get_addr_vid);
>>> +
>>> static struct sk_buff *vlan_gro_receive(struct list_head *head,
>>> struct sk_buff *skb)
>>> {
>>> diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
>>> index b2d9c8f27cd7..c05b313314b7 100644
>>> --- a/net/8021q/vlan_dev.c
>>> +++ b/net/8021q/vlan_dev.c
>>> @@ -250,6 +250,14 @@ void vlan_dev_get_realdev_name(const struct
>>> net_device *dev, char *result)
>>> strncpy(result, vlan_dev_priv(dev)->real_dev->name, 23);
>>> }
>>>
>>> +static void vlan_dev_set_addr_vid(struct net_device *vlan_dev, u8
>>> *addr)
>>> +{
>>> + u16 vid = vlan_dev_vlan_id(vlan_dev);
>>> +
>>> + addr[vlan_dev->addr_len] = vid & 0xff;
>>> + addr[vlan_dev->addr_len + 1] = (vid >> 8) & 0xf;
>>> +}
>>> +
>>> bool vlan_dev_inherit_address(struct net_device *dev,
>>> struct net_device *real_dev)
>>> {
>>> @@ -481,8 +489,26 @@ static void vlan_dev_change_rx_flags(struct
>>> net_device *dev, int change)
>>> }
>>> }
>>>
>>> +static void vlan_dev_align_addr_vid(struct net_device *vlan_dev)
>>> +{
>>> + struct net_device *real_dev = vlan_dev_real_dev(vlan_dev);
>>> + struct netdev_hw_addr *ha;
>>> +
>>> + if (!real_dev->vid_len)
>>> + return;
>>> +
>>> + netdev_for_each_mc_addr(ha, vlan_dev)
>>> + if (!ha->sync_cnt)
>>> + vlan_dev_set_addr_vid(vlan_dev, ha->addr);
>>> +
>>> + netdev_for_each_uc_addr(ha, vlan_dev)
>>> + if (!ha->sync_cnt)
>>> + vlan_dev_set_addr_vid(vlan_dev, ha->addr);
>>> +}
>>> +
>>> static void vlan_dev_set_rx_mode(struct net_device *vlan_dev)
>>> {
>>> + vlan_dev_align_addr_vid(vlan_dev);
>>> dev_mc_sync(vlan_dev_priv(vlan_dev)->real_dev, vlan_dev);
>>> dev_uc_sync(vlan_dev_priv(vlan_dev)->real_dev, vlan_dev);
>>> }
>>>
>>
>>
>> --
>> Florian
>
--
Florian
On Mon, Dec 03, 2018 at 02:25:40PM -0800, Florian Fainelli wrote:
>On 12/3/18 10:40 AM, Ivan Khoronzhuk wrote:
>> One of the reasons of this proposition is safety and performance -
>> host should not receive traffic which is not designated for it.
>>
>> Some network devices can hold separate address tables for vlans and
>> real device, but for some reason there is no possibility to apply it
>> with generic net addressing scheme easily. At this moment the fastest
>> solution is to add mcast/ucast entries for every created vlan
>> including real device. But it also adds holes in the filtering and
>> thus wastes cpus cycles.
>>
>> This patchseries tries to correct core to assign mcast and ucast
>> addresses only for vlans that really require it and as result an end
>> driver can exclusively and simply set its rx filters. As an example
>> it's implemented on cpsw TI driver, but generic changes provided by
>> this series can be reused by other ethernet drivers having similar
>> rx filter address possibilities.
>>
>> An address+vid is considered as separate address. The reserved device
>> address length is 32 Bytes, for ethernet devices it's additional
>> opportunity to pass auxiliary address info, like virtual ID
>> identifying a device the address belongs to. This series makes it
>> possible at least for ETH_P_8021Q, but can be easily extended for ab.
>> Thus end real device can setup separate tables for virtual devices
>> just retrieving VID from the address. A device address space can
>> maintain addresses and references on them separately for each virtual
>> device if it needs so, or only addresses for real device (and all its
>> vlans) it holds usually.
>>
>> A vlan device can be in any place of device chain upper real device,
>> say smth like rdevice/bonding/vlan or even rdevice/macvlan/vlan.
>> Similar approach can be used for passing additional information for
>> virtual devices as allmulti flag or/and promisc flag and do this per
>> vlan, but this is separate story and could be added as a continuation.
>>
>> I was biased by try to add exclusive mcast and ucast support for vlans
>> and now have same with small generic correction and mostly locally in
>> the cpsw driver:
>> https://git.linaro.org/people/ivan.khoronzhuk/tsn_kernel.git/log/?h=ucast_vlan_fix
>> https://git.linaro.org/people/ivan.khoronzhuk/tsn_kernel.git/log/?h=mcast_vlan
>> and can say it looks better with generic changes provided by this patchset,
>> that's why this RFC. Above links can be used as fallback.
>>
>> This series is verified on TI am572x EVM that can hold separate tables
>> for vlans. Potentially it can be easily extended to netcp driver for
>> keystone 2 boards (including k2g) and also new am6 chipsets. As a
>> simple test case, different combinations of vlan+macvlan, macvlan+vlan
>> were used and tested as with unicast as multicast addresses.
>
>I think I get what you want to do, which is make sure that on per VID
>basis, only the necessary UC and MC addresses are programmed into the
>respective filters, when each of your network ports operate in what cpsw
>calls "dual emac"? Which is really something that should be called "NIC"
dual emac here is just a mode when two ports using shared h/w tried to
behave as a completely separate interfaces. But no sense here about it, as
this change is applicable for just one NIC that can be part of any struct
of above devices.
>mode, and/or is analogous to how we bring up ports in DSA: each of them
>acts as a separate network device. Speaking of the later, we have some
>known issues in unmanaged mode with MC addresses which I am working on
>addressing.
>
>It seems to me, as replied in patch 2, that what you are missing is the
>source of the address programming request, and therefore either you pass
>an additional "source net_device" to functions like dev_{uc,mc}_sync(),
I don't know correct way to propagate vlan device to the end real devices.
The closest I've found is to propagate it along with adding vid to parent
devices, but even in this case, it requires changes for every device
being able to propagate vid to real dev, where not always "structure" is
clear. But this approach also has drawbacks with code on end device,
see next comments.
>or you pass a well defined structured that can encode the address as
>well as the interface type, but the approach you have right now, where
>the address storage is extended to include the VID is both extremely
The address length is not changed, only used part that was reserved and
was present always. Changing addr len is much more harder then it looks
like, I've tried ).
>specific to VLAN devices, as well as not scaling particularly well.
I would rather decipher it as virtual ID. VID, virtual ID is smth not
strictly related for vlans, it can be considered as any device
identification with assumptions other types of net devs can have also
some kind of virtual devices with its ids. And it's naming only, I think
name is Ok, but can be changed.
>With
>your patches we have VLAN devices covered, but what about MACVLAN, bond,
>team or anything that needs to look at specific MAC addresses to be
>filtered to make management decisions?
That why actually this fix, if it's part of vlan addressing scheme,
bond/team/macvlan/whatever should be aware of such scheme. Particularly
if take bond, it still syncs its addresses with its parent devices
using same mc_sync and uc_sync APIs, that were changed to propagate all
vlan addresses farther (implemented in patch 1 and enabled in patch 2).
>
>Also imagine being able to store up to 4K MAC addresses for a total of
>4K VLANs, this would have occupied 4K * sizeof(some structure) and now
Yes it's additional addresses, but kernel holds separate address entries
for each device in the structure and no problems...updating not only
references. In this change, in best case, only end device contains
additional set of addresses for vlans, but pay attention, it holds also
references and sync information for each address with h/w (in normal
implementations using dev_mc/uc_sync), and allows to make updates only
if address added/removed.
In contrast to "memory wasting" approach.
Lets assume the ability to get vlan dev, not depending on the device
structure on top of it, is present. Then each end real device needs to
pass thru each vlan, compare each address, refcounters and so, to be
able to identify the vlans. Drawback: any change in vlan core / netdev
core can brake it. Assume I need to add it to several devices....,
frankly it looks not very portable, then I can try to create auxiliary
function and reuse it, but it looks not very.
For comparison on cpsw:
mc implementation using vlan device from the vlan_info list:
https://git.linaro.org/people/ivan.khoronzhuk/tsn_kernel.git/commit/?h=mcast_vlan&id=984782e8e2f8907593a4956e0922477307c73999
uc implementation using vlan device from the vlan_info list even bigger:
https://git.linaro.org/people/ivan.khoronzhuk/tsn_kernel.git/commit/?h=ucast_vlan_fix&id=ebc88a7d8758759322d9ff88f25f8bac51ce7219
Both, mc and uc implementations using proposed patchset is much simpler:
https://git.linaro.org/people/ivan.khoronzhuk/tsn_kernel.git/commit/?h=just_mc_uc_cpsw_vaddr_example&id=e9613ef0acafb6cb00db73b068fc0445ae737b5a
>it's 2 bytes bigger...
Not bigger, it still uses same reserved length - 32 bytes
(26 of each were not used).
>
>>
>> Based on net-next/master
>>
>> Ivan Khoronzhuk (5):
>> net: core: dev_addr_lists: add VID to device address space
>> net: 8021q: vlan_dev: add vid tag for uc and mc address lists
>> net: 8021q: vlan_dev: add vid tag for vlan device mac address
>> net: ethernet: add default vid len for all ehternet kind devices
>> net: ethernet: ti: cpsw: update mc vlan and add uc vlan support based
>> on addr vids
>>
>> drivers/net/ethernet/ti/cpsw.c | 86 +++++++++++++++++++----
>> include/linux/if_vlan.h | 1 +
>> include/linux/netdevice.h | 7 ++
>> net/8021q/vlan.c | 3 +
>> net/8021q/vlan_core.c | 10 +++
>> net/8021q/vlan_dev.c | 103 ++++++++++++++++++++++-----
>> net/core/dev_addr_lists.c | 124 +++++++++++++++++++++++++++------
>> net/ethernet/eth.c | 15 +++-
>> 8 files changed, 290 insertions(+), 59 deletions(-)
>>
>
>
>--
>Florian
--
Regards,
Ivan Khoronzhuk
On Mon, Dec 03, 2018 at 03:57:03PM -0800, Florian Fainelli wrote:
>On 12/3/18 3:51 PM, Ivan Khoronzhuk wrote:
>> On Mon, Dec 03, 2018 at 02:17:00PM -0800, Florian Fainelli wrote:
>>> On 12/3/18 10:40 AM, Ivan Khoronzhuk wrote:
>>>> Update vlan mc and uc addresses with VID tag while propagating address
>>>> set to lower devices, do this only if address is not synched. It allows
>>>> on end driver level to distinguish address belonging to vlans.
>>>
>>> Underlying driver for the real device would be able to properly identify
>>> that you are attempting to add an address to a virtual device, which
>>> happens to be of VLAN kind so I am really not sure this is the right
>>> approach here.
>>>
>>> From there, it seems to me that we have two situations:
>>>
>>> - each of your network devices expose VLAN devices directly on top of
>>> the real device, in which case your driver should support
>>> ndo_vlan_rx_add_vid and ndo_vlan_rx_kill_vid to know when VLAN devices
>>> are create and maintain a VLAN device to VID correspondence if it needs
>>> to when being called while setting the addresses
>>>
>>> - you are setting up a bridge that is VLAN aware on one of your bridge
>>> ports, and there you can use switchdev to learn about such events and
>>> know about both addresses as well as VIDs that must be programmed into
>>> your real device
>> No limits to have any "middle" device between real end device and
>> virtual one, not only a bridge, but also other kind. And as it's generic
>> change, it should cover all such cases, the simplest example is:
>> real_dev/macvlan/vlan.
>
>It is not generic if the additional information is a VLAN ID, that
>construct does not apply to all types of virtual devices, that is part
>of my issue with the extra VID that is being added. If this was a void *
>priv and any virtual device could pass up/down information that might be
>more acceptable.
I tractate it as virtual identification not relating it to vlans only.
VID just uses for eth devices part of already reserved address.
I was also thinking about passing smth like pointer on vlan device,
frankly even don't remember what stopped me, probably size of pointer or
so, maybe it looked harder... but it still seems possible and at least for
ethernet address is enough space: 32 - 6 = 26, 26 > 8.
Despite it only combines methods with extended address space and passing
link to vlan device, it adds ability to apply allmulti and promisc modes
per vlans.
>
>>
>>>
>>> It seems to me that what you need may be something like either:
>>>
>>> - notifications on slave devices when addresses are added via
>>> ndo_set_rxmode()
>>>
>>> or
>>>
>>> - dev_{uc,mc}_sync() should be augmented with a "source net_device"
>>> argument which allows you to differentiate which network device is the
>>> source of the address programming. That way, no need to "hash" the MAC
>>> address with a VID, any network device specific information can be
>>> provided and in the real device driver you can do: if
>>> (netif_is_vlan()... etc.)
>> No issue to retrieve vlan dev if it's directly on top of real dev.
>> Issue is to get it when it's not directly connected as it's not in
>> vlan_info group list. Who knows what else can be "structed" on top of
>> real dev till the vlan device. Please look on reply for cover letter,
>> as it seems requires similar response.
>
>In that case, there are notifications generated that you must be
>listening to determine whether you have something like a VLAN device on
>top of a bond, which is a port member of a bridge, on which one of your
>real device port is enslaved (yes, it can be that type of stacking).
It becomes even harder, traversing all structure of devices, seems not very
simple task.
>
>>
>>>
>>> Hopefully someone else will chime in.
>>>
>>>>
>>>> Signed-off-by: Ivan Khoronzhuk <[email protected]>
>>>> ---
>>>> ?include/linux/if_vlan.h |? 1 +
>>>> ?net/8021q/vlan_core.c?? | 10 ++++++++++
>>>> ?net/8021q/vlan_dev.c??? | 26 ++++++++++++++++++++++++++
>>>> ?3 files changed, 37 insertions(+)
>>>>
>>>> diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
>>>> index 4cca4da7a6de..94657f3c483a 100644
>>>> --- a/include/linux/if_vlan.h
>>>> +++ b/include/linux/if_vlan.h
>>>> @@ -136,6 +136,7 @@ extern struct net_device
>>>> *__vlan_find_dev_deep_rcu(struct net_device *real_dev,
>>>> ?extern int vlan_for_each(struct net_device *dev,
>>>> ????????????? int (*action)(struct net_device *dev, int vid,
>>>> ??????????????????????? void *arg), void *arg);
>>>> +extern u16 vlan_dev_get_addr_vid(struct net_device *dev, const u8
>>>> *addr);
>>>> ?extern struct net_device *vlan_dev_real_dev(const struct net_device
>>>> *dev);
>>>> ?extern u16 vlan_dev_vlan_id(const struct net_device *dev);
>>>> ?extern __be16 vlan_dev_vlan_proto(const struct net_device *dev);
>>>> diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
>>>> index a313165e7a67..5d17947d6988 100644
>>>> --- a/net/8021q/vlan_core.c
>>>> +++ b/net/8021q/vlan_core.c
>>>> @@ -454,6 +454,16 @@ bool vlan_uses_dev(const struct net_device *dev)
>>>> ?}
>>>> ?EXPORT_SYMBOL(vlan_uses_dev);
>>>>
>>>> +u16 vlan_dev_get_addr_vid(struct net_device *dev, const u8 *addr)
>>>> +{
>>>> +??? u16 vid = 0;
>>>> +
>>>> +??? vid = addr[dev->addr_len];
>>>> +??? vid |= (addr[dev->addr_len + 1] & 0xf) << 8;
>>>> +??? return vid;
>>>> +}
>>>> +EXPORT_SYMBOL(vlan_dev_get_addr_vid);
>>>> +
>>>> ?static struct sk_buff *vlan_gro_receive(struct list_head *head,
>>>> ???????????????????? struct sk_buff *skb)
>>>> ?{
>>>> diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
>>>> index b2d9c8f27cd7..c05b313314b7 100644
>>>> --- a/net/8021q/vlan_dev.c
>>>> +++ b/net/8021q/vlan_dev.c
>>>> @@ -250,6 +250,14 @@ void vlan_dev_get_realdev_name(const struct
>>>> net_device *dev, char *result)
>>>> ???? strncpy(result, vlan_dev_priv(dev)->real_dev->name, 23);
>>>> ?}
>>>>
>>>> +static void vlan_dev_set_addr_vid(struct net_device *vlan_dev, u8
>>>> *addr)
>>>> +{
>>>> +??? u16 vid = vlan_dev_vlan_id(vlan_dev);
>>>> +
>>>> +??? addr[vlan_dev->addr_len] = vid & 0xff;
>>>> +??? addr[vlan_dev->addr_len + 1] = (vid >> 8) & 0xf;
>>>> +}
>>>> +
>>>> ?bool vlan_dev_inherit_address(struct net_device *dev,
>>>> ?????????????????? struct net_device *real_dev)
>>>> ?{
>>>> @@ -481,8 +489,26 @@ static void vlan_dev_change_rx_flags(struct
>>>> net_device *dev, int change)
>>>> ???? }
>>>> ?}
>>>>
>>>> +static void vlan_dev_align_addr_vid(struct net_device *vlan_dev)
>>>> +{
>>>> +??? struct net_device *real_dev = vlan_dev_real_dev(vlan_dev);
>>>> +??? struct netdev_hw_addr *ha;
>>>> +
>>>> +??? if (!real_dev->vid_len)
>>>> +??????? return;
>>>> +
>>>> +??? netdev_for_each_mc_addr(ha, vlan_dev)
>>>> +??????? if (!ha->sync_cnt)
>>>> +??????????? vlan_dev_set_addr_vid(vlan_dev, ha->addr);
>>>> +
>>>> +??? netdev_for_each_uc_addr(ha, vlan_dev)
>>>> +??????? if (!ha->sync_cnt)
>>>> +??????????? vlan_dev_set_addr_vid(vlan_dev, ha->addr);
>>>> +}
>>>> +
>>>> ?static void vlan_dev_set_rx_mode(struct net_device *vlan_dev)
>>>> ?{
>>>> +??? vlan_dev_align_addr_vid(vlan_dev);
>>>> ???? dev_mc_sync(vlan_dev_priv(vlan_dev)->real_dev, vlan_dev);
>>>> ???? dev_uc_sync(vlan_dev_priv(vlan_dev)->real_dev, vlan_dev);
>>>> ?}
>>>>
>>>
>>>
>>> --?
>>> Florian
>>
>
>
>--
>Florian
--
Regards,
Ivan Khoronzhuk
On Mon, Dec 03, 2018 at 03:57:03PM -0800, Florian Fainelli wrote:
>On 12/3/18 3:51 PM, Ivan Khoronzhuk wrote:
>> On Mon, Dec 03, 2018 at 02:17:00PM -0800, Florian Fainelli wrote:
>>> On 12/3/18 10:40 AM, Ivan Khoronzhuk wrote:
>>>> Update vlan mc and uc addresses with VID tag while propagating address
>>>> set to lower devices, do this only if address is not synched. It allows
>>>> on end driver level to distinguish address belonging to vlans.
>>>
>>> Underlying driver for the real device would be able to properly identify
>>> that you are attempting to add an address to a virtual device, which
>>> happens to be of VLAN kind so I am really not sure this is the right
>>> approach here.
>>>
>>> From there, it seems to me that we have two situations:
>>>
>>> - each of your network devices expose VLAN devices directly on top of
>>> the real device, in which case your driver should support
>>> ndo_vlan_rx_add_vid and ndo_vlan_rx_kill_vid to know when VLAN devices
>>> are create and maintain a VLAN device to VID correspondence if it needs
>>> to when being called while setting the addresses
>>>
>>> - you are setting up a bridge that is VLAN aware on one of your bridge
>>> ports, and there you can use switchdev to learn about such events and
>>> know about both addresses as well as VIDs that must be programmed into
>>> your real device
>> No limits to have any "middle" device between real end device and
>> virtual one, not only a bridge, but also other kind. And as it's generic
>> change, it should cover all such cases, the simplest example is:
>> real_dev/macvlan/vlan.
>
>It is not generic if the additional information is a VLAN ID, that
>construct does not apply to all types of virtual devices, that is part
>of my issue with the extra VID that is being added. If this was a void *
>priv and any virtual device could pass up/down information that might be
>more acceptable.
You mean to create smth like common struct pinned to "an address" and
pass information not only like vid, but in parallel what ever user wanted.
Even if pass vlan device pointer it still considered like an address
continuation and same sync method is used w/o modification. And here vid
is considered as part of address, by a big account address+vid it's a
separate address, same happens with the pointer, address+pointer it's
still separate address.
I was thinking also about pinned list of vlans to the address, but in
this case this information also has to be synced by members of device chain,
because it can be modified on any device level and it looks not very friendly,
and at the end address space has addresses with pinned lists of vlans with
their pointers. But keeping this stuff in sync is not simplest decision.
--
Regards,
Ivan Khoronzhuk
On 12/4/18 10:57 AM, Ivan Khoronzhuk wrote:
> On Mon, Dec 03, 2018 at 03:57:03PM -0800, Florian Fainelli wrote:
>> On 12/3/18 3:51 PM, Ivan Khoronzhuk wrote:
>>> On Mon, Dec 03, 2018 at 02:17:00PM -0800, Florian Fainelli wrote:
>>>> On 12/3/18 10:40 AM, Ivan Khoronzhuk wrote:
>>>>> Update vlan mc and uc addresses with VID tag while propagating address
>>>>> set to lower devices, do this only if address is not synched. It
>>>>> allows
>>>>> on end driver level to distinguish address belonging to vlans.
>>>>
>>>> Underlying driver for the real device would be able to properly
>>>> identify
>>>> that you are attempting to add an address to a virtual device, which
>>>> happens to be of VLAN kind so I am really not sure this is the right
>>>> approach here.
>>>>
>>>> From there, it seems to me that we have two situations:
>>>>
>>>> - each of your network devices expose VLAN devices directly on top of
>>>> the real device, in which case your driver should support
>>>> ndo_vlan_rx_add_vid and ndo_vlan_rx_kill_vid to know when VLAN devices
>>>> are create and maintain a VLAN device to VID correspondence if it needs
>>>> to when being called while setting the addresses
>>>>
>>>> - you are setting up a bridge that is VLAN aware on one of your bridge
>>>> ports, and there you can use switchdev to learn about such events and
>>>> know about both addresses as well as VIDs that must be programmed into
>>>> your real device
>>> No limits to have any "middle" device between real end device and
>>> virtual one, not only a bridge, but also other kind. And as it's generic
>>> change, it should cover all such cases, the simplest example is:
>>> real_dev/macvlan/vlan.
>>
>> It is not generic if the additional information is a VLAN ID, that
>> construct does not apply to all types of virtual devices, that is part
>> of my issue with the extra VID that is being added. If this was a void *
>> priv and any virtual device could pass up/down information that might be
>> more acceptable.
>
> You mean to create smth like common struct pinned to "an address" and
> pass information not only like vid, but in parallel what ever user wanted.
> Even if pass vlan device pointer it still considered like an address
> continuation and same sync method is used w/o modification. And here vid
> is considered as part of address, by a big account address+vid it's a
> separate address, same happens with the pointer, address+pointer it's
> still separate address.
That depends on the HW implementation, some switches do individual VLAN
learning (IVL) and some do shared VLAN learning (SVL) so whether the VID
becomes part of the address resolution logic is HW dependent, obviously
the more capable, the better (IVL).
>
> I was thinking also about pinned list of vlans to the address, but in
> this case this information also has to be synced by members of device
> chain,
> because it can be modified on any device level and it looks not very
> friendly,
> and at the end address space has addresses with pinned lists of vlans with
> their pointers. But keeping this stuff in sync is not simplest decision.
>
>
I really think we are not communicating properly, it really seems to me
that if you had the information about the upper device trying to add an
address to the lower device filter's either through notification or call
to ndo_set_rxmode, you could be solving your problems. What are we
missing here?
--
Florian
On Tue, Dec 04, 2018 at 11:49:27AM -0800, Florian Fainelli wrote:
>On 12/4/18 10:57 AM, Ivan Khoronzhuk wrote:
>> On Mon, Dec 03, 2018 at 03:57:03PM -0800, Florian Fainelli wrote:
>>> On 12/3/18 3:51 PM, Ivan Khoronzhuk wrote:
>>>> On Mon, Dec 03, 2018 at 02:17:00PM -0800, Florian Fainelli wrote:
>>>>> On 12/3/18 10:40 AM, Ivan Khoronzhuk wrote:
>>>>>> Update vlan mc and uc addresses with VID tag while propagating address
>>>>>> set to lower devices, do this only if address is not synched. It
>>>>>> allows
>>>>>> on end driver level to distinguish address belonging to vlans.
>>>>>
>>>>> Underlying driver for the real device would be able to properly
>>>>> identify
>>>>> that you are attempting to add an address to a virtual device, which
>>>>> happens to be of VLAN kind so I am really not sure this is the right
>>>>> approach here.
>>>>>
>>>>> From there, it seems to me that we have two situations:
>>>>>
>>>>> - each of your network devices expose VLAN devices directly on top of
>>>>> the real device, in which case your driver should support
>>>>> ndo_vlan_rx_add_vid and ndo_vlan_rx_kill_vid to know when VLAN devices
>>>>> are create and maintain a VLAN device to VID correspondence if it needs
>>>>> to when being called while setting the addresses
>>>>>
>>>>> - you are setting up a bridge that is VLAN aware on one of your bridge
>>>>> ports, and there you can use switchdev to learn about such events and
>>>>> know about both addresses as well as VIDs that must be programmed into
>>>>> your real device
>>>> No limits to have any "middle" device between real end device and
>>>> virtual one, not only a bridge, but also other kind. And as it's generic
>>>> change, it should cover all such cases, the simplest example is:
>>>> real_dev/macvlan/vlan.
>>>
>>> It is not generic if the additional information is a VLAN ID, that
>>> construct does not apply to all types of virtual devices, that is part
>>> of my issue with the extra VID that is being added. If this was a void *
>>> priv and any virtual device could pass up/down information that might be
>>> more acceptable.
>>
>> You mean to create smth like common struct pinned to "an address" and
>> pass information not only like vid, but in parallel what ever user wanted.
>> Even if pass vlan device pointer it still considered like an address
>> continuation and same sync method is used w/o modification. And here vid
>> is considered as part of address, by a big account address+vid it's a
>> separate address, same happens with the pointer, address+pointer it's
>> still separate address.
>
>That depends on the HW implementation, some switches do individual VLAN
>learning (IVL) and some do shared VLAN learning (SVL) so whether the VID
>becomes part of the address resolution logic is HW dependent, obviously
>the more capable, the better (IVL).
In my case IVL is only choice, as SVL is rather imitation, as each vlan
has it's own address table anyway. And I mean not only vlan configuration
above the bridge but also any simple configuration above real device.
There is proposition to add smth like additional list of entries pinned
to the each address as you proposed, but in a little bit different way.
Pin the context pointers to each address if IVL config is enabled.
Smth like
+struct ctx_entry {
+ void *info;
+ unsigned type;
+ int synced;
+ int sync_cnt;
+ int refcount;
+}
Then in hw_addr struct add a ctx_list:
struct netdev_hw_addr {
struct list_head list;
+ struct list_head ctx_list;
unsigned char addr[MAX_ADDR_LEN];
unsigned char type;
.....
}
Each ctx_entry contains pointer to some structure, in my case it could be
pointer on vlan net_dev, and it can be marked with type VLAN_DEVICE_POINTER or
else. In some other invisible cases it could be another one. Main difference
between each of them is its pointer and type.
And once each net dev calls mc/uc_sync these entires, if not synced, are synced
along with addresses. But main difference that these ctx entires are pinned to
the address, when addresses are pinned to the device.
It can allow to bring information any new abstraction can apply.
For real device the list can be empty or contain special entry to differ
it from the vlan device entries, as could happen only some vlan is address
owner.
Not sure it can be much simpler but it definitely can introduce more
capabilities, and potentially cover some other cases including your.
Probably I need rename the series on smth like:
"make addressing scheme to be IVL capable"
and send RFCv2
Thanks for your comment, it's really valuable.
>
>>
>> I was thinking also about pinned list of vlans to the address, but in
>> this case this information also has to be synced by members of device
>> chain,
>> because it can be modified on any device level and it looks not very
>> friendly,
>> and at the end address space has addresses with pinned lists of vlans with
>> their pointers. But keeping this stuff in sync is not simplest decision.
>>
>>
>
>I really think we are not communicating properly, it really seems to me
>that if you had the information about the upper device trying to add an
>address to the lower device filter's either through notification or call
>to ndo_set_rxmode, you could be solving your problems. What are we
>missing here?
>--
>Florian
--
Regards,
Ivan Khoronzhuk
On Tue, Dec 04, 2018 at 11:49:27AM -0800, Florian Fainelli wrote:
...
>>
>> I was thinking also about pinned list of vlans to the address, but in
>> this case this information also has to be synced by members of device
>> chain,
>> because it can be modified on any device level and it looks not very
>> friendly,
>> and at the end address space has addresses with pinned lists of vlans with
>> their pointers. But keeping this stuff in sync is not simplest decision.
>>
>>
>
>I really think we are not communicating properly, it really seems to me
>that if you had the information about the upper device trying to add an
>address to the lower device filter's either through notification or call
>to ndo_set_rxmode, you could be solving your problems. What are we
>missing here?
Sry, missed this one. The problem in getting the owner of address.
Just simple case: vlan/macvlan/real_dev or vlan/.../.../real_dev
The real dev hasn't simple way to get vid the address belong to, or it has?
>--
>Florian
--
Regards,
Ivan Khoronzhuk
Le 12/4/18 à 4:04 PM, Ivan Khoronzhuk a écrit :
> On Tue, Dec 04, 2018 at 11:49:27AM -0800, Florian Fainelli wrote:
>
> ...
>
>>>
>>> I was thinking also about pinned list of vlans to the address, but in
>>> this case this information also has to be synced by members of device
>>> chain,
>>> because it can be modified on any device level and it looks not very
>>> friendly,
>>> and at the end address space has addresses with pinned lists of vlans
>>> with
>>> their pointers. But keeping this stuff in sync is not simplest decision.
>>>
>>>
>>
>> I really think we are not communicating properly, it really seems to me
>> that if you had the information about the upper device trying to add an
>> address to the lower device filter's either through notification or call
>> to ndo_set_rxmode, you could be solving your problems. What are we
>> missing here?
>
> Sry, missed this one. The problem in getting the owner of address.
> Just simple case: vlan/macvlan/real_dev or vlan/.../.../real_dev
>
> The real dev hasn't simple way to get vid the address belong to, or it has?
Humm looks like your right, by the time the address lists are
synchronized (e.g: from = vlan_dev, to = real_dev), we lost that
information. It looks like I just managed to find such an use case
myself with VLAN filtering enabled on a bridge (so switch is VLAN aware)
and a VLAN device created on the bridge (br0.42) but with IGMP snooping
turned off (so we don't get HOST_MDB notifications with correct VLAN ID).
Maybe keeping the "from" net_device within the address list that is
processed by ndo_set_rx_mode() will do the job though?
Then you can do things like:
if (is_vlan_dev(ha->dev) && ha->dev != dev)
vid = vlan_dev_vlan_id(ha->dev);
and it should scale to any type of stacked device, regardless of VID or
something else that we need?
Can you remind me of your use case again? Is it because your switch has
VLAN filtering enabled and you need to make sure that MC addresses on
VLAN device get programmed into the switch's multicast database with
correct VID?
--
Florian
On 1/7/19 9:01 PM, Florian Fainelli wrote:
> Le 12/4/18 à 4:04 PM, Ivan Khoronzhuk a écrit :
>> On Tue, Dec 04, 2018 at 11:49:27AM -0800, Florian Fainelli wrote:
>>
>> ...
>>
>>>>
>>>> I was thinking also about pinned list of vlans to the address, but in
>>>> this case this information also has to be synced by members of device
>>>> chain,
>>>> because it can be modified on any device level and it looks not very
>>>> friendly,
>>>> and at the end address space has addresses with pinned lists of vlans
>>>> with
>>>> their pointers. But keeping this stuff in sync is not simplest decision.
>>>>
>>>>
>>>
>>> I really think we are not communicating properly, it really seems to me
>>> that if you had the information about the upper device trying to add an
>>> address to the lower device filter's either through notification or call
>>> to ndo_set_rxmode, you could be solving your problems. What are we
>>> missing here?
>>
>> Sry, missed this one. The problem in getting the owner of address.
>> Just simple case: vlan/macvlan/real_dev or vlan/.../.../real_dev
>>
>> The real dev hasn't simple way to get vid the address belong to, or it has?
>
> Humm looks like your right, by the time the address lists are
> synchronized (e.g: from = vlan_dev, to = real_dev), we lost that
> information. It looks like I just managed to find such an use case
> myself with VLAN filtering enabled on a bridge (so switch is VLAN aware)
> and a VLAN device created on the bridge (br0.42) but with IGMP snooping
> turned off (so we don't get HOST_MDB notifications with correct VLAN ID).
>
> Maybe keeping the "from" net_device within the address list that is
> processed by ndo_set_rx_mode() will do the job though?
>
> Then you can do things like:
>
> if (is_vlan_dev(ha->dev) && ha->dev != dev)
> vid = vlan_dev_vlan_id(ha->dev);
>
> and it should scale to any type of stacked device, regardless of VID or
> something else that we need?
>
> Can you remind me of your use case again? Is it because your switch has
> VLAN filtering enabled and you need to make sure that MC addresses on
> VLAN device get programmed into the switch's multicast database with
> correct VID?
Ivan, can you see if the following would work for you:
https://github.com/ffainelli/linux/commit/19e173ebdcdd32f5f5b5ef29049e35d33ad058f2
this should be more scalable approach in that we can support HOST_MDB
notifications from the bridge, the same way we would get notified about
IGMP snooping from the bridge and this does not impact any other driver
than those that elect to receive switchdev object notifications, which
cpsw should really implement by now...
--
Florian
On 12/4/18 3:42 PM, Ivan Khoronzhuk wrote:
> On Tue, Dec 04, 2018 at 11:49:27AM -0800, Florian Fainelli wrote:
>> On 12/4/18 10:57 AM, Ivan Khoronzhuk wrote:
>>> On Mon, Dec 03, 2018 at 03:57:03PM -0800, Florian Fainelli wrote:
>>>> On 12/3/18 3:51 PM, Ivan Khoronzhuk wrote:
>>>>> On Mon, Dec 03, 2018 at 02:17:00PM -0800, Florian Fainelli wrote:
>>>>>> On 12/3/18 10:40 AM, Ivan Khoronzhuk wrote:
>>>>>>> Update vlan mc and uc addresses with VID tag while propagating
>>>>>>> address
>>>>>>> set to lower devices, do this only if address is not synched. It
>>>>>>> allows
>>>>>>> on end driver level to distinguish address belonging to vlans.
>>>>>>
>>>>>> Underlying driver for the real device would be able to properly
>>>>>> identify
>>>>>> that you are attempting to add an address to a virtual device, which
>>>>>> happens to be of VLAN kind so I am really not sure this is the right
>>>>>> approach here.
>>>>>>
>>>>>> From there, it seems to me that we have two situations:
>>>>>>
>>>>>> - each of your network devices expose VLAN devices directly on top of
>>>>>> the real device, in which case your driver should support
>>>>>> ndo_vlan_rx_add_vid and ndo_vlan_rx_kill_vid to know when VLAN
>>>>>> devices
>>>>>> are create and maintain a VLAN device to VID correspondence if it
>>>>>> needs
>>>>>> to when being called while setting the addresses
>>>>>>
>>>>>> - you are setting up a bridge that is VLAN aware on one of your
>>>>>> bridge
>>>>>> ports, and there you can use switchdev to learn about such events and
>>>>>> know about both addresses as well as VIDs that must be programmed
>>>>>> into
>>>>>> your real device
>>>>> No limits to have any "middle" device between real end device and
>>>>> virtual one, not only a bridge, but also other kind. And as it's
>>>>> generic
>>>>> change, it should cover all such cases, the simplest example is:
>>>>> real_dev/macvlan/vlan.
>>>>
>>>> It is not generic if the additional information is a VLAN ID, that
>>>> construct does not apply to all types of virtual devices, that is part
>>>> of my issue with the extra VID that is being added. If this was a
>>>> void *
>>>> priv and any virtual device could pass up/down information that
>>>> might be
>>>> more acceptable.
>>>
>>> You mean to create smth like common struct pinned to "an address" and
>>> pass information not only like vid, but in parallel what ever user
>>> wanted.
>>> Even if pass vlan device pointer it still considered like an address
>>> continuation and same sync method is used w/o modification. And here vid
>>> is considered as part of address, by a big account address+vid it's a
>>> separate address, same happens with the pointer, address+pointer it's
>>> still separate address.
>>
>> That depends on the HW implementation, some switches do individual VLAN
>> learning (IVL) and some do shared VLAN learning (SVL) so whether the VID
>> becomes part of the address resolution logic is HW dependent, obviously
>> the more capable, the better (IVL).
>
> In my case IVL is only choice, as SVL is rather imitation, as each vlan
> has it's own address table anyway. And I mean not only vlan configuration
> above the bridge but also any simple configuration above real device.
>
> There is proposition to add smth like additional list of entries pinned
> to the each address as you proposed, but in a little bit different way.
>
> Pin the context pointers to each address if IVL config is enabled.
> Smth like
>
> +struct ctx_entry {
> + void *info;
> + unsigned type;
> + int synced;
> + int sync_cnt;
> + int refcount;
> +}
>
> Then in hw_addr struct add a ctx_list:
>
> struct netdev_hw_addr {
> struct list_head list;
> + struct list_head ctx_list;
> unsigned char addr[MAX_ADDR_LEN];
> unsigned char type;
> .....
> }
>
> Each ctx_entry contains pointer to some structure, in my case it could be
> pointer on vlan net_dev, and it can be marked with type
> VLAN_DEVICE_POINTER or
> else. In some other invisible cases it could be another one. Main
> difference
> between each of them is its pointer and type.
>
> And once each net dev calls mc/uc_sync these entires, if not synced, are
> synced
> along with addresses. But main difference that these ctx entires are
> pinned to
> the address, when addresses are pinned to the device.
>
> It can allow to bring information any new abstraction can apply.
> For real device the list can be empty or contain special entry to differ
> it from the vlan device entries, as could happen only some vlan is address
> owner.
>
> Not sure it can be much simpler but it definitely can introduce more
> capabilities, and potentially cover some other cases including your.
>
> Probably I need rename the series on smth like:
> "make addressing scheme to be IVL capable"
>
> and send RFCv2
>
> Thanks for your comment, it's really valuable.
Ivan, based on the recent submission I copied you on [1], it sounds like
we want to move ahead with your proposal to extend netdev_hw_addr with a
vid member.
On second thought, your approach is good and if we enclose the vid
member within an #if IS_ENABLED(CONFIG_VLAN)8021Q) we should be good for
most foreseeable use cases, if not, we can always introduce a variable
size/defined context in the future.
Can you resubmit this patch series as non-RFC in the next few days so I
can also repost mine [1] and take advantage of these changes for
multicast over VLAN when VLAN filtering is globally enabled on the device.
[1]: https://www.spinics.net/lists/netdev/msg544722.html
Thanks!
--
Florian
On Mon, Jan 21, 2019 at 03:37:41PM -0800, Florian Fainelli wrote:
>On 12/4/18 3:42 PM, Ivan Khoronzhuk wrote:
>> On Tue, Dec 04, 2018 at 11:49:27AM -0800, Florian Fainelli wrote:
[...]
>
>Ivan, based on the recent submission I copied you on [1], it sounds like
>we want to move ahead with your proposal to extend netdev_hw_addr with a
>vid member.
>
>On second thought, your approach is good and if we enclose the vid
>member within an #if IS_ENABLED(CONFIG_VLAN)8021Q) we should be good for
>most foreseeable use cases, if not, we can always introduce a variable
>size/defined context in the future.
>
>Can you resubmit this patch series as non-RFC in the next few days so I
>can also repost mine [1] and take advantage of these changes for
>multicast over VLAN when VLAN filtering is globally enabled on the device.
>
>[1]: https://www.spinics.net/lists/netdev/msg544722.html
>
>Thanks!
Yes, sure. I can start to do that in several days.
Just a little busy right now.
Just before doing this, maybe some comments could be added as it has more
attention now. Meanwhile I can send alternative variant but based on
real dev splitting addresses between vlans. In this approach it leaves address
space w/o vid extension but requires more changes to vlan core. Drawback here
that to change one address alg traverses all related vlan addresses, it can be
cpu/time wasteful, if it's done regularly, but saves memory....
Basically it's implemented locally in cpsw and requires more changes to move
it as some vlan core auxiliary functions to be reused. But it can work only
with vlans directly on top of real dev, which is fixable.
Core function here:
__hw_addr_ref_sync_dev
it is called only for address the link of which was increased/decreased, thus
update made only on one address, comparing it for every vlan dev.
It was added with this patch:
[1] net: core: dev_addr_lists: add auxiliary func to handle reference address
update e7946760de5852f32
And used by this patch:
[2] net: ethernet: ti: cpsw: fix vlan mcast 15180eca569bfe1d4d
So, idea is to move [2] to be vlan core auxiliary function to be reused
by NIC drivers.
But potentially it can bring a little more changes I assume:
1) add priv_flag |= IFF_IV_FLT (independent vlan filtering). It allows to reuse
this flag for farther changes, probably for per vlan allmulti or so.
2) real dev has to have complete list for vlans, not only their vids, but also
all vlandevs in device chain above it. So changes in add_vid can be required.
Vlan core can assign vlan dev pointer to real device only after it's completely
initialized. And for propagation reasons it requires every device in
infrastructure to be aware. That seems doable, but depends not only on me.
3) Move code from [2] to be auxiliary vlan core API for setting mc and uc.
From this patch only one function is cpsw specific: cpsw_set_mc(). The rest can
be applicable on every NIC supporting IFF_IV_FLT.
4) Move code from link below to do the same but for uc addresses:
https://git.linaro.org/people/ivan.khoronzhuk/tsn_kernel.git/commit/?h=ucast_vlan_fix&id=ebc88a7d8758759322d9ff88f25f8bac51ce7219
here only one func cpsw specific: cpsw_set_uc()
the rest can be generic.
As third alternative, we can think about how to reduce memory for addresses by
reusing them or else, but this is as continuation of addr+vid approach, and API
probably would be the same.
Then all this can be compared for proper decision.
>--
>Florian
--
Regards,
Ivan Khoronzhuk
On Tue, Jan 08, 2019 at 10:21:25AM -0800, Florian Fainelli wrote:
>On 1/7/19 9:01 PM, Florian Fainelli wrote:
>> Le 12/4/18 ? 4:04 PM, Ivan Khoronzhuk a ?crit?:
>>> On Tue, Dec 04, 2018 at 11:49:27AM -0800, Florian Fainelli wrote:
>>>
>>> ...
>>>
>>>>>
>>>>> I was thinking also about pinned list of vlans to the address, but in
>>>>> this case this information also has to be synced by members of device
>>>>> chain,
>>>>> because it can be modified on any device level and it looks not very
>>>>> friendly,
>>>>> and at the end address space has addresses with pinned lists of vlans
>>>>> with
>>>>> their pointers. But keeping this stuff in sync is not simplest decision.
>>>>>
>>>>>
>>>>
>>>> I really think we are not communicating properly, it really seems to me
>>>> that if you had the information about the upper device trying to add an
>>>> address to the lower device filter's either through notification or call
>>>> to ndo_set_rxmode, you could be solving your problems. What are we
>>>> missing here?
>>>
>>> Sry, missed this one. The problem in getting? the owner of address.
>>> Just simple case: vlan/macvlan/real_dev or vlan/.../.../real_dev
>>>
>>> The real dev hasn't simple way to get vid the address belong to, or it has?
>>
>> Humm looks like your right, by the time the address lists are
>> synchronized (e.g: from = vlan_dev, to = real_dev), we lost that
>> information. It looks like I just managed to find such an use case
>> myself with VLAN filtering enabled on a bridge (so switch is VLAN aware)
>> and a VLAN device created on the bridge (br0.42) but with IGMP snooping
>> turned off (so we don't get HOST_MDB notifications with correct VLAN ID).
>>
>> Maybe keeping the "from" net_device within the address list that is
>> processed by ndo_set_rx_mode() will do the job though?
>>
>> Then you can do things like:
>>
>> if (is_vlan_dev(ha->dev) && ha->dev != dev)
>> vid = vlan_dev_vlan_id(ha->dev);
>>
>> and it should scale to any type of stacked device, regardless of VID or
>> something else that we need?
>>
>> Can you remind me of your use case again? Is it because your switch has
>> VLAN filtering enabled and you need to make sure that MC addresses on
>> VLAN device get programmed into the switch's multicast database with
>> correct VID?
>
>Ivan, can you see if the following would work for you:
>
>https://github.com/ffainelli/linux/commit/19e173ebdcdd32f5f5b5ef29049e35d33ad058f2
>
>this should be more scalable approach in that we can support HOST_MDB
>notifications from the bridge, the same way we would get notified about
>IGMP snooping from the bridge and this does not impact any other driver
>than those that elect to receive switchdev object notifications, which
>cpsw should really implement by now...
Sorry missed 2 last comments and seems like it's clear now.
I need to be in TO or CC list my filters can parse it, probably I need
to create more smart filters.
>--
>Florian
--
Regards,
Ivan Khoronzhuk
On Tue, Jan 22, 2019 at 03:12:41PM +0200, Ivan Khoronzhuk wrote:
>On Mon, Jan 21, 2019 at 03:37:41PM -0800, Florian Fainelli wrote:
>>On 12/4/18 3:42 PM, Ivan Khoronzhuk wrote:
>>>On Tue, Dec 04, 2018 at 11:49:27AM -0800, Florian Fainelli wrote:
>
>[...]
>
>>
>>Ivan, based on the recent submission I copied you on [1], it sounds like
>>we want to move ahead with your proposal to extend netdev_hw_addr with a
>>vid member.
>>
>>On second thought, your approach is good and if we enclose the vid
>>member within an #if IS_ENABLED(CONFIG_VLAN)8021Q) we should be good for
>>most foreseeable use cases, if not, we can always introduce a variable
>>size/defined context in the future.
>>
>>Can you resubmit this patch series as non-RFC in the next few days so I
>>can also repost mine [1] and take advantage of these changes for
>>multicast over VLAN when VLAN filtering is globally enabled on the device.
>>
>>[1]: https://www.spinics.net/lists/netdev/msg544722.html
>>
>>Thanks!
>
>Yes, sure. I can start to do that in several days.
>Just a little busy right now.
>
>Just before doing this, maybe some comments could be added as it has more
>attention now. Meanwhile I can send alternative variant but based on
>real dev splitting addresses between vlans. In this approach it leaves address
>space w/o vid extension but requires more changes to vlan core. Drawback here
>that to change one address alg traverses all related vlan addresses, it can be
>cpu/time wasteful, if it's done regularly, but saves memory....
>
>Basically it's implemented locally in cpsw and requires more changes to move
>it as some vlan core auxiliary functions to be reused. But it can work only
>with vlans directly on top of real dev, which is fixable.
>
>Core function here:
>__hw_addr_ref_sync_dev
>it is called only for address the link of which was increased/decreased, thus
>update made only on one address, comparing it for every vlan dev.
>
>It was added with this patch:
>[1] net: core: dev_addr_lists: add auxiliary func to handle reference
>address update e7946760de5852f32
>
>And used by this patch:
>[2] net: ethernet: ti: cpsw: fix vlan mcast 15180eca569bfe1d4d
>
>So, idea is to move [2] to be vlan core auxiliary function to be reused
>by NIC drivers.
>
>But potentially it can bring a little more changes I assume:
>
>1) add priv_flag |= IFF_IV_FLT (independent vlan filtering). It allows to reuse
>this flag for farther changes, probably for per vlan allmulti or so.
>
>2) real dev has to have complete list for vlans, not only their vids, but also
>all vlandevs in device chain above it. So changes in add_vid can be required.
>Vlan core can assign vlan dev pointer to real device only after it's completely
>initialized. And for propagation reasons it requires every device in
>infrastructure to be aware. That seems doable, but depends not only on me.
>
>3) Move code from [2] to be auxiliary vlan core API for setting mc and uc.
>From this patch only one function is cpsw specific: cpsw_set_mc(). The rest can
>be applicable on every NIC supporting IFF_IV_FLT.
>
>4) Move code from link below to do the same but for uc addresses:
>https://git.linaro.org/people/ivan.khoronzhuk/tsn_kernel.git/commit/?h=ucast_vlan_fix&id=ebc88a7d8758759322d9ff88f25f8bac51ce7219
>here only one func cpsw specific: cpsw_set_uc()
>the rest can be generic.
>
>As third alternative, we can think about how to reduce memory for addresses by
>reusing them or else, but this is as continuation of addr+vid approach, and API
>probably would be the same.
>
>Then all this can be compared for proper decision.
Hi Florian,
After several more investigations and tries probably better left this
idea as is.
Here actually several explanations for this:
1) If even assume that we can get access to vlan devices in the above ndev
tree (we can) that doesn't guarantee that receive vlan filters are set
replicating this structure. For example bond device can have one active slave
but both of them in the tree having vid set, in this case addresses are
syched only with active slave, no filters should be applied to not active slave.
this can be achieved only each address has vid context.
2) According to 1) rx filters device structure can be created while mc_sync()
in each rx_mode(), and then used as orthogonal info. I've tried and it looks
not cool and consumes anyway memory and even if it's less it's still not very
scalable. (+ no normal signal "in complex structure case" when address should
be undated to avoid redundant cpu cycles). Not sure it can have practical
results and be universal enouph.
3) Assuming that every device in the tree (bond, team or else) is legal to
modify its own address space, the real end device cannot be sure the vlan device
address spaces reflects vid addresses that device tree want's from him.
According to this each address in address space must hold its own context at
every device and this context is comparable with address size.
>-- Regards,
>Ivan Khoronzhuk
--
Regards,
Ivan Khoronzhuk
On February 13, 2019 8:17:16 AM PST, Ivan Khoronzhuk <[email protected]> wrote:
>On Tue, Jan 22, 2019 at 03:12:41PM +0200, Ivan Khoronzhuk wrote:
>>On Mon, Jan 21, 2019 at 03:37:41PM -0800, Florian Fainelli wrote:
>>>On 12/4/18 3:42 PM, Ivan Khoronzhuk wrote:
>>>>On Tue, Dec 04, 2018 at 11:49:27AM -0800, Florian Fainelli wrote:
>>
>>[...]
>>
>>>
>>>Ivan, based on the recent submission I copied you on [1], it sounds
>like
>>>we want to move ahead with your proposal to extend netdev_hw_addr
>with a
>>>vid member.
>>>
>>>On second thought, your approach is good and if we enclose the vid
>>>member within an #if IS_ENABLED(CONFIG_VLAN)8021Q) we should be good
>for
>>>most foreseeable use cases, if not, we can always introduce a
>variable
>>>size/defined context in the future.
>>>
>>>Can you resubmit this patch series as non-RFC in the next few days so
>I
>>>can also repost mine [1] and take advantage of these changes for
>>>multicast over VLAN when VLAN filtering is globally enabled on the
>device.
>>>
>>>[1]: https://www.spinics.net/lists/netdev/msg544722.html
>>>
>>>Thanks!
>>
>>Yes, sure. I can start to do that in several days.
>>Just a little busy right now.
>>
>>Just before doing this, maybe some comments could be added as it has
>more
>>attention now. Meanwhile I can send alternative variant but based on
>>real dev splitting addresses between vlans. In this approach it leaves
>address
>>space w/o vid extension but requires more changes to vlan core.
>Drawback here
>>that to change one address alg traverses all related vlan addresses,
>it can be
>>cpu/time wasteful, if it's done regularly, but saves memory....
>>
>>Basically it's implemented locally in cpsw and requires more changes
>to move
>>it as some vlan core auxiliary functions to be reused. But it can work
>only
>>with vlans directly on top of real dev, which is fixable.
>>
>>Core function here:
>>__hw_addr_ref_sync_dev
>>it is called only for address the link of which was
>increased/decreased, thus
>>update made only on one address, comparing it for every vlan dev.
>>
>>It was added with this patch:
>>[1] net: core: dev_addr_lists: add auxiliary func to handle reference
>>address update e7946760de5852f32
>>
>>And used by this patch:
>>[2] net: ethernet: ti: cpsw: fix vlan mcast 15180eca569bfe1d4d
>>
>>So, idea is to move [2] to be vlan core auxiliary function to be
>reused
>>by NIC drivers.
>>
>>But potentially it can bring a little more changes I assume:
>>
>>1) add priv_flag |= IFF_IV_FLT (independent vlan filtering). It allows
>to reuse
>>this flag for farther changes, probably for per vlan allmulti or so.
>>
>>2) real dev has to have complete list for vlans, not only their vids,
>but also
>>all vlandevs in device chain above it. So changes in add_vid can be
>required.
>>Vlan core can assign vlan dev pointer to real device only after it's
>completely
>>initialized. And for propagation reasons it requires every device in
>>infrastructure to be aware. That seems doable, but depends not only on
>me.
>>
>>3) Move code from [2] to be auxiliary vlan core API for setting mc and
>uc.
>>From this patch only one function is cpsw specific: cpsw_set_mc(). The
>rest can
>>be applicable on every NIC supporting IFF_IV_FLT.
>>
>>4) Move code from link below to do the same but for uc addresses:
>>https://git.linaro.org/people/ivan.khoronzhuk/tsn_kernel.git/commit/?h=ucast_vlan_fix&id=ebc88a7d8758759322d9ff88f25f8bac51ce7219
>>here only one func cpsw specific: cpsw_set_uc()
>>the rest can be generic.
>>
>>As third alternative, we can think about how to reduce memory for
>addresses by
>>reusing them or else, but this is as continuation of addr+vid
>approach, and API
>>probably would be the same.
>>
>>Then all this can be compared for proper decision.
>
>
>Hi Florian,
>
>After several more investigations and tries probably better left this
>idea as is.
Thank you for keeping the thread alive, does that mean you are going to resubmit this patch series as-is (rebased) or are you saying that you are abandoning the idea and leaving the situation the way it is in cpsw?
>
>Here actually several explanations for this:
>1) If even assume that we can get access to vlan devices in the above
>ndev
>tree (we can) that doesn't guarantee that receive vlan filters are set
>replicating this structure. For example bond device can have one active
>slave
>but both of them in the tree having vid set, in this case addresses are
>syched only with active slave, no filters should be applied to not
>active slave.
>this can be achieved only each address has vid context.
>
>2) According to 1) rx filters device structure can be created while
>mc_sync()
>in each rx_mode(), and then used as orthogonal info. I've tried and it
>looks
>not cool and consumes anyway memory and even if it's less it's still
>not very
>scalable. (+ no normal signal "in complex structure case" when address
>should
>be undated to avoid redundant cpu cycles). Not sure it can have
>practical
>results and be universal enouph.
>
>3) Assuming that every device in the tree (bond, team or else) is legal
>to
>modify its own address space, the real end device cannot be sure the
>vlan device
>address spaces reflects vid addresses that device tree want's from him.
>According to this each address in address space must hold its own
>context at
>every device and this context is comparable with address size.
>
>>-- Regards,
>>Ivan Khoronzhuk
--
Florian
On Wed, Feb 13, 2019 at 08:49:39PM -0800, Florian Fainelli wrote:
>
>
>On February 13, 2019 8:17:16 AM PST, Ivan Khoronzhuk <[email protected]> wrote:
>>On Tue, Jan 22, 2019 at 03:12:41PM +0200, Ivan Khoronzhuk wrote:
>>>On Mon, Jan 21, 2019 at 03:37:41PM -0800, Florian Fainelli wrote:
>>>>On 12/4/18 3:42 PM, Ivan Khoronzhuk wrote:
>>>>>On Tue, Dec 04, 2018 at 11:49:27AM -0800, Florian Fainelli wrote:
>>>
>>>[...]
>>>
>>>>
>>>>Ivan, based on the recent submission I copied you on [1], it sounds
>>like
>>>>we want to move ahead with your proposal to extend netdev_hw_addr
>>with a
>>>>vid member.
>>>>
>>>>On second thought, your approach is good and if we enclose the vid
>>>>member within an #if IS_ENABLED(CONFIG_VLAN)8021Q) we should be good
>>for
>>>>most foreseeable use cases, if not, we can always introduce a
>>variable
>>>>size/defined context in the future.
>>>>
>>>>Can you resubmit this patch series as non-RFC in the next few days so
>>I
>>>>can also repost mine [1] and take advantage of these changes for
>>>>multicast over VLAN when VLAN filtering is globally enabled on the
>>device.
>>>>
>>>>[1]: https://www.spinics.net/lists/netdev/msg544722.html
>>>>
>>>>Thanks!
>>>
>>>Yes, sure. I can start to do that in several days.
>>>Just a little busy right now.
>>>
>>>Just before doing this, maybe some comments could be added as it has
>>more
>>>attention now. Meanwhile I can send alternative variant but based on
>>>real dev splitting addresses between vlans. In this approach it leaves
>>address
>>>space w/o vid extension but requires more changes to vlan core.
>>Drawback here
>>>that to change one address alg traverses all related vlan addresses,
>>it can be
>>>cpu/time wasteful, if it's done regularly, but saves memory....
>>>
>>>Basically it's implemented locally in cpsw and requires more changes
>>to move
>>>it as some vlan core auxiliary functions to be reused. But it can work
>>only
>>>with vlans directly on top of real dev, which is fixable.
>>>
>>>Core function here:
>>>__hw_addr_ref_sync_dev
>>>it is called only for address the link of which was
>>increased/decreased, thus
>>>update made only on one address, comparing it for every vlan dev.
>>>
>>>It was added with this patch:
>>>[1] net: core: dev_addr_lists: add auxiliary func to handle reference
>>>address update e7946760de5852f32
>>>
>>>And used by this patch:
>>>[2] net: ethernet: ti: cpsw: fix vlan mcast 15180eca569bfe1d4d
>>>
>>>So, idea is to move [2] to be vlan core auxiliary function to be
>>reused
>>>by NIC drivers.
>>>
>>>But potentially it can bring a little more changes I assume:
>>>
>>>1) add priv_flag |= IFF_IV_FLT (independent vlan filtering). It allows
>>to reuse
>>>this flag for farther changes, probably for per vlan allmulti or so.
>>>
>>>2) real dev has to have complete list for vlans, not only their vids,
>>but also
>>>all vlandevs in device chain above it. So changes in add_vid can be
>>required.
>>>Vlan core can assign vlan dev pointer to real device only after it's
>>completely
>>>initialized. And for propagation reasons it requires every device in
>>>infrastructure to be aware. That seems doable, but depends not only on
>>me.
>>>
>>>3) Move code from [2] to be auxiliary vlan core API for setting mc and
>>uc.
>>>From this patch only one function is cpsw specific: cpsw_set_mc(). The
>>rest can
>>>be applicable on every NIC supporting IFF_IV_FLT.
>>>
>>>4) Move code from link below to do the same but for uc addresses:
>>>https://git.linaro.org/people/ivan.khoronzhuk/tsn_kernel.git/commit/?h=ucast_vlan_fix&id=ebc88a7d8758759322d9ff88f25f8bac51ce7219
>>>here only one func cpsw specific: cpsw_set_uc()
>>>the rest can be generic.
>>>
>>>As third alternative, we can think about how to reduce memory for
>>addresses by
>>>reusing them or else, but this is as continuation of addr+vid
>>approach, and API
>>>probably would be the same.
>>>
>>>Then all this can be compared for proper decision.
>>
>>
>>Hi Florian,
>>
>>After several more investigations and tries probably better left this
>>idea as is.
>
>Thank you for keeping the thread alive, does that mean you are going to resubmit this patch series as-is (rebased) or are you saying that you are abandoning the idea and leaving the situation the way it is in cpsw?
I will resubmit this one. But:
I have to try one more approach before this.
The idea is to create simple rx flt device tree while mc/us sync.
Then use it at real device to dispatch addresses.
It increases hw_addr struct a little and code base,
But:
- no need to keep linearly all vlan addresses in one address space.
- replicates RX filtering struct of net devices,
(but not logical tree of netdevs)
- keeps devs info per address.
- no need to change addr lenth and modify existent API
- access at any net dev to above rx flt device structure per address
- potentially can be use not only for vlan devs identification but for
other rx path offloads.
Idea is simple but not completely verified it yet,
need a little bit more time to prove/clean ...or drop it.
>
>>
>>Here actually several explanations for this:
>>1) If even assume that we can get access to vlan devices in the above
>>ndev
>>tree (we can) that doesn't guarantee that receive vlan filters are set
>>replicating this structure. For example bond device can have one active
>>slave
>>but both of them in the tree having vid set, in this case addresses are
>>syched only with active slave, no filters should be applied to not
>>active slave.
>>this can be achieved only each address has vid context.
>>
>>2) According to 1) rx filters device structure can be created while
>>mc_sync()
>>in each rx_mode(), and then used as orthogonal info. I've tried and it
>>looks
>>not cool and consumes anyway memory and even if it's less it's still
>>not very
>>scalable. (+ no normal signal "in complex structure case" when address
>>should
>>be undated to avoid redundant cpu cycles). Not sure it can have
>>practical
>>results and be universal enouph.
>>
>>3) Assuming that every device in the tree (bond, team or else) is legal
>>to
>>modify its own address space, the real end device cannot be sure the
>>vlan device
>>address spaces reflects vid addresses that device tree want's from him.
>>According to this each address in address space must hold its own
>>context at
>>every device and this context is comparable with address size.
>>
>>>-- Regards,
>>>Ivan Khoronzhuk
>
>--
>Florian
--
Regards,
Ivan Khoronzhuk