(Applies to alacrityvm.git/master:34534534)
This patchset implements a vbus venet device with a
macvlan backend.
These patches allow an alacrityvm guest to send and receive
directly over a macvlan, avoiding the bridge entirely.
This driver inherits all of the benefits of the work done
to date on the vbus/venet driver (SAR offloading, zero-copy
in the guest->host path, configurable tx-complete mitigation,
interrupt coalescing at the vbus level). Some of the work to
refactor and share the common code between venet-tap and
venet-macvlan was done beforehand because it should be generally
useful to anyone wanting to implement a venet type of device.
Once the driver is built and installed, you may use it
just like you would a venet-tap device. Instantiating a
venet-macvlan differs from instantiating a venet-tap in
just two ways. To create the venet-macvlan device, run:
echo venet-macvlan > /config/vbus/devices/<device-name>/type
and
echo "lower-devicename" > /sys/vbus/devices/<device-name>/ll_ifname
where lower-devicename is something like eth0, eth1, eth2 etc.
The second step associates the lower-devicename, usually
a physical device, with the venet-macvlan device being created.
This step must be performed prior to enabling the venet-macvlan
device.
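For tools that configure devices programmatically rather than from a
shell, the two writes above can be sketched in C. This is only an
illustration: write_attr() and venet_macvlan_configure() are
hypothetical helper names, and the two directories are passed in by the
caller rather than hard-coded.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write `value` plus a trailing newline to the attribute file at `path`,
 * which is what `echo value > path` does. Returns 0 on success, -1 on error.
 */
static int write_attr(const char *path, const char *value)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	if (fprintf(f, "%s\n", value) < 0) {
		fclose(f);
		return -1;
	}
	return fclose(f) == 0 ? 0 : -1;
}

/*
 * Hypothetical helper: set the device type first, then name the lower
 * device. `type_dir` is the directory holding the "type" attribute and
 * `attr_dir` the one holding "ll_ifname"; both are assumptions supplied
 * by the caller.
 */
static int venet_macvlan_configure(const char *type_dir, const char *attr_dir,
				   const char *lowerdev)
{
	char path[512];

	snprintf(path, sizeof(path), "%s/type", type_dir);
	if (write_attr(path, "venet-macvlan"))
		return -1;

	snprintf(path, sizeof(path), "%s/ll_ifname", attr_dir);
	return write_attr(path, lowerdev);
}
```

Both writes have to succeed before the device is enabled, so the helper
stops at the first failure.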
After that, a guest can make use of the venet-macvlan in
exactly the same manner as a venet-tap. In fact, the guest
sees venet-tap and venet-macvlan as identical device types
on the vbus.
Using the venet-macvlan driver will reduce some overhead by
eliminating the Linux bridge from the send and receive
paths. For a lightly loaded network segment and system,
we have measured the saving to be approximately 1-3 us per
side, depending on what hardware is involved.
Since this driver is layered over the macvlan driver, it
has the same limitations as the macvlan driver. For example,
forwarding between macvlan devices on the same host is not
supported. This driver is targeted toward VEPA environments as
described by the 'Edge Virtual Bridging' working group.
---
Patrick Mullaney (4):
venet-macvlan: add new driver to connect a venet to a macvlan netdevice
venetdev: support common venet netdev routines
macvlan: allow in-kernel modules to create and manage macvlan devices
macvlan: derived from Arnd Bergmann's patch for macvtap
drivers/net/macvlan.c | 105 +++--
drivers/net/vbus-enet.c | 8
include/linux/macvlan.h | 43 ++
include/linux/venet.h | 5
kernel/vbus/devices/venet/Kconfig | 11 +
kernel/vbus/devices/venet/Makefile | 10 -
kernel/vbus/devices/venet/device.c | 53 ++-
kernel/vbus/devices/venet/macvlan.c | 598 +++++++++++++++++++++++++++++++
kernel/vbus/devices/venet/venetdevice.h | 12 +
9 files changed, 785 insertions(+), 60 deletions(-)
create mode 100644 include/linux/macvlan.h
create mode 100644 kernel/vbus/devices/venet/macvlan.c
This patch is in the series because it has not gone upstream yet and
the subsequent patches depend on it. It includes only the basic
framework for overriding the receive path, and moves the macvlan
definitions into include/linux/macvlan.h so that modules outside of
drivers/net can use them.
Signed-off-by: Patrick Mullaney <[email protected]>
---
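The receive-path override can be pictured with a small user-space
model. This is only a sketch: struct pkt, struct vlan_dev, default_rx()
and backend_rx() are stand-in names, not kernel symbols. Only the idea
of a per-device receive pointer that defaults to the normal stack entry
point comes from the patch.

```c
#include <assert.h>
#include <stddef.h>

struct pkt { int len; };		/* stand-in for struct sk_buff */

struct vlan_dev {
	/* plays the role of the new macvlan_dev.receive hook */
	int (*receive)(struct pkt *p);
};

static int stack_rx_count;		/* frames handed to the default path */
static int backend_rx_count;		/* frames handed to an overriding backend */

/* Default handler, standing in for netif_rx(). */
static int default_rx(struct pkt *p)
{
	(void)p;
	stack_rx_count++;
	return 0;
}

/* Handler a backend installs to take frames instead of the stack. */
static int backend_rx(struct pkt *p)
{
	(void)p;
	backend_rx_count++;
	return 0;
}

/* Creation-time setup: every device starts on the default path. */
static void vlan_init(struct vlan_dev *v)
{
	v->receive = default_rx;
}

/* Frame delivery always goes through the hook, never a fixed function. */
static int vlan_handle_frame(struct vlan_dev *v, struct pkt *p)
{
	return v->receive(p);
}
```

A device that never touches the pointer behaves as before, which is why
the hook is assigned at link creation time.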
drivers/net/macvlan.c | 39 +++++++++++++++------------------------
include/linux/macvlan.h | 37 +++++++++++++++++++++++++++++++++++++
2 files changed, 52 insertions(+), 24 deletions(-)
create mode 100644 include/linux/macvlan.h
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 99eed9f..0a389b8 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -30,22 +30,7 @@
#include <linux/if_macvlan.h>
#include <net/rtnetlink.h>
-#define MACVLAN_HASH_SIZE (1 << BITS_PER_BYTE)
-
-struct macvlan_port {
- struct net_device *dev;
- struct hlist_head vlan_hash[MACVLAN_HASH_SIZE];
- struct list_head vlans;
-};
-
-struct macvlan_dev {
- struct net_device *dev;
- struct list_head list;
- struct hlist_node hlist;
- struct macvlan_port *port;
- struct net_device *lowerdev;
-};
-
+#include <linux/macvlan.h>
static struct macvlan_dev *macvlan_hash_lookup(const struct macvlan_port *port,
const unsigned char *addr)
@@ -135,7 +120,7 @@ static void macvlan_broadcast(struct sk_buff *skb,
else
nskb->pkt_type = PACKET_MULTICAST;
- netif_rx(nskb);
+ vlan->receive(nskb);
}
}
}
@@ -180,11 +165,11 @@ static struct sk_buff *macvlan_handle_frame(struct sk_buff *skb)
skb->dev = dev;
skb->pkt_type = PACKET_HOST;
- netif_rx(skb);
+ vlan->receive(skb);
return NULL;
}
-static int macvlan_start_xmit(struct sk_buff *skb, struct net_device *dev)
+int macvlan_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
const struct macvlan_dev *vlan = netdev_priv(dev);
unsigned int len = skb->len;
@@ -202,6 +187,7 @@ static int macvlan_start_xmit(struct sk_buff *skb, struct net_device *dev)
}
return NETDEV_TX_OK;
}
+EXPORT_SYMBOL_GPL(macvlan_start_xmit);
static int macvlan_hard_header(struct sk_buff *skb, struct net_device *dev,
unsigned short type, const void *daddr,
@@ -412,7 +398,7 @@ static const struct net_device_ops macvlan_netdev_ops = {
.ndo_validate_addr = eth_validate_addr,
};
-static void macvlan_setup(struct net_device *dev)
+void macvlan_setup(struct net_device *dev)
{
ether_setup(dev);
@@ -423,6 +409,7 @@ static void macvlan_setup(struct net_device *dev)
dev->ethtool_ops = &macvlan_ethtool_ops;
dev->tx_queue_len = 0;
}
+EXPORT_SYMBOL_GPL(macvlan_setup);
static int macvlan_port_create(struct net_device *dev)
{
@@ -472,7 +459,7 @@ static void macvlan_transfer_operstate(struct net_device *dev)
}
}
-static int macvlan_validate(struct nlattr *tb[], struct nlattr *data[])
+int macvlan_validate(struct nlattr *tb[], struct nlattr *data[])
{
if (tb[IFLA_ADDRESS]) {
if (nla_len(tb[IFLA_ADDRESS]) != ETH_ALEN)
@@ -482,9 +469,10 @@ static int macvlan_validate(struct nlattr *tb[], struct nlattr *data[])
}
return 0;
}
+EXPORT_SYMBOL_GPL(macvlan_validate);
-static int macvlan_newlink(struct net_device *dev,
- struct nlattr *tb[], struct nlattr *data[])
+int macvlan_newlink(struct net_device *dev,
+ struct nlattr *tb[], struct nlattr *data[])
{
struct macvlan_dev *vlan = netdev_priv(dev);
struct macvlan_port *port;
@@ -524,6 +512,7 @@ static int macvlan_newlink(struct net_device *dev,
vlan->lowerdev = lowerdev;
vlan->dev = dev;
vlan->port = port;
+ vlan->receive = netif_rx;
err = register_netdevice(dev);
if (err < 0)
@@ -533,8 +522,9 @@ static int macvlan_newlink(struct net_device *dev,
macvlan_transfer_operstate(dev);
return 0;
}
+EXPORT_SYMBOL_GPL(macvlan_newlink);
-static void macvlan_dellink(struct net_device *dev)
+void macvlan_dellink(struct net_device *dev)
{
struct macvlan_dev *vlan = netdev_priv(dev);
struct macvlan_port *port = vlan->port;
@@ -545,6 +535,7 @@ static void macvlan_dellink(struct net_device *dev)
if (list_empty(&port->vlans))
macvlan_port_destroy(port->dev);
}
+EXPORT_SYMBOL_GPL(macvlan_dellink);
static struct rtnl_link_ops macvlan_link_ops __read_mostly = {
.kind = "macvlan",
diff --git a/include/linux/macvlan.h b/include/linux/macvlan.h
new file mode 100644
index 0000000..3f3c6c3
--- /dev/null
+++ b/include/linux/macvlan.h
@@ -0,0 +1,37 @@
+#ifndef _MACVLAN_H
+#define _MACVLAN_H
+
+#include <linux/netdevice.h>
+#include <linux/netlink.h>
+#include <linux/list.h>
+
+#define MACVLAN_HASH_SIZE (1 << BITS_PER_BYTE)
+
+struct macvlan_port {
+ struct net_device *dev;
+ struct hlist_head vlan_hash[MACVLAN_HASH_SIZE];
+ struct list_head vlans;
+};
+
+struct macvlan_dev {
+ struct net_device *dev;
+ struct list_head list;
+ struct hlist_node hlist;
+ struct macvlan_port *port;
+ struct net_device *lowerdev;
+
+ int (*receive)(struct sk_buff *skb);
+};
+
+extern int macvlan_start_xmit(struct sk_buff *skb, struct net_device *dev);
+
+extern void macvlan_setup(struct net_device *dev);
+
+extern int macvlan_validate(struct nlattr *tb[], struct nlattr *data[]);
+
+extern int macvlan_newlink(struct net_device *dev,
+ struct nlattr *tb[], struct nlattr *data[]);
+
+extern void macvlan_dellink(struct net_device *dev);
+
+#endif /* _MACVLAN_H */
The macvlan driver did not allow other in-kernel modules to create
or delete devices. This patch provides common routines for both
in-kernel and netlink-based management. It also enables GRO on
macvlan devices whose lower-level devices support GRO.
Signed-off-by: Patrick Mullaney <[email protected]>
---
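The link/unlink split can be modelled in user space as follows. This is
a simplified sketch, not the kernel code: it tracks attached devices
with a counter where the driver keeps a list, and every name in it is
illustrative. It shows the port being created on the first link and
destroyed on the last unlink, independently of device registration.

```c
#include <assert.h>
#include <stdlib.h>

struct port { int nr_vlans; };		/* shared per-lower-device state */
struct lower { struct port *port; };	/* stand-in for the lower net_device */
struct vlan { struct lower *lowerdev; };

/* Attach `v` to `l`, creating the shared port on first use. */
static int link_lowerdev(struct vlan *v, struct lower *l)
{
	if (!l->port) {
		l->port = calloc(1, sizeof(*l->port));
		if (!l->port)
			return -1;
	}
	v->lowerdev = l;
	l->port->nr_vlans++;
	return 0;
}

/* Detach `v`; the port is freed once its last user is gone. */
static void unlink_lowerdev(struct vlan *v)
{
	struct lower *l = v->lowerdev;

	assert(l && l->port);
	if (--l->port->nr_vlans == 0) {
		free(l->port);
		l->port = NULL;
	}
	v->lowerdev = NULL;
}
```

With the attach logic separated out, a netlink-created device and a
module-created device can share the same two routines and differ only
in how the net_device itself is registered.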
drivers/net/macvlan.c | 72 +++++++++++++++++++++++++++++++----------------
include/linux/macvlan.h | 6 ++++
2 files changed, 53 insertions(+), 25 deletions(-)
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 0a389b8..6b98b26 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -208,7 +208,7 @@ static const struct header_ops macvlan_hard_header_ops = {
.cache_update = eth_header_cache_update,
};
-static int macvlan_open(struct net_device *dev)
+int macvlan_open(struct net_device *dev)
{
struct macvlan_dev *vlan = netdev_priv(dev);
struct net_device *lowerdev = vlan->lowerdev;
@@ -235,7 +235,7 @@ out:
return err;
}
-static int macvlan_stop(struct net_device *dev)
+int macvlan_stop(struct net_device *dev)
{
struct macvlan_dev *vlan = netdev_priv(dev);
struct net_device *lowerdev = vlan->lowerdev;
@@ -316,7 +316,7 @@ static struct lock_class_key macvlan_netdev_addr_lock_key;
#define MACVLAN_FEATURES \
(NETIF_F_SG | NETIF_F_ALL_CSUM | NETIF_F_HIGHDMA | NETIF_F_FRAGLIST | \
NETIF_F_GSO | NETIF_F_TSO | NETIF_F_UFO | NETIF_F_GSO_ROBUST | \
- NETIF_F_TSO_ECN | NETIF_F_TSO6)
+ NETIF_F_TSO_ECN | NETIF_F_TSO6 | NETIF_F_GRO)
#define MACVLAN_STATE_MASK \
((1<<__LINK_STATE_NOCARRIER) | (1<<__LINK_STATE_DORMANT))
@@ -440,7 +440,7 @@ static void macvlan_port_destroy(struct net_device *dev)
kfree(port);
}
-static void macvlan_transfer_operstate(struct net_device *dev)
+void macvlan_transfer_operstate(struct net_device *dev)
{
struct macvlan_dev *vlan = netdev_priv(dev);
const struct net_device *lowerdev = vlan->lowerdev;
@@ -458,6 +458,7 @@ static void macvlan_transfer_operstate(struct net_device *dev)
netif_carrier_off(dev);
}
}
+EXPORT_SYMBOL_GPL(macvlan_transfer_operstate);
int macvlan_validate(struct nlattr *tb[], struct nlattr *data[])
{
@@ -471,11 +472,47 @@ int macvlan_validate(struct nlattr *tb[], struct nlattr *data[])
}
EXPORT_SYMBOL_GPL(macvlan_validate);
-int macvlan_newlink(struct net_device *dev,
- struct nlattr *tb[], struct nlattr *data[])
+int macvlan_link_lowerdev(struct net_device *dev,
+ struct net_device *lowerdev)
{
struct macvlan_dev *vlan = netdev_priv(dev);
struct macvlan_port *port;
+ int err = 0;
+
+ if (lowerdev->macvlan_port == NULL) {
+ err = macvlan_port_create(lowerdev);
+ if (err < 0)
+ return err;
+ }
+ port = lowerdev->macvlan_port;
+
+ vlan->lowerdev = lowerdev;
+ vlan->dev = dev;
+ vlan->port = port;
+ vlan->receive = netif_rx;
+
+ macvlan_init(dev);
+
+ list_add_tail(&vlan->list, &port->vlans);
+ return 0;
+}
+EXPORT_SYMBOL_GPL(macvlan_link_lowerdev);
+
+void macvlan_unlink_lowerdev(struct net_device *dev)
+{
+ struct macvlan_dev *vlan = netdev_priv(dev);
+ struct macvlan_port *port = vlan->port;
+
+ list_del(&vlan->list);
+
+ if (list_empty(&port->vlans))
+ macvlan_port_destroy(port->dev);
+}
+EXPORT_SYMBOL_GPL(macvlan_unlink_lowerdev);
+
+int macvlan_newlink(struct net_device *dev,
+ struct nlattr *tb[], struct nlattr *data[])
+{
struct net_device *lowerdev;
int err;
@@ -502,23 +539,14 @@ int macvlan_newlink(struct net_device *dev,
if (!tb[IFLA_ADDRESS])
random_ether_addr(dev->dev_addr);
- if (lowerdev->macvlan_port == NULL) {
- err = macvlan_port_create(lowerdev);
- if (err < 0)
- return err;
- }
- port = lowerdev->macvlan_port;
-
- vlan->lowerdev = lowerdev;
- vlan->dev = dev;
- vlan->port = port;
- vlan->receive = netif_rx;
+ err = macvlan_link_lowerdev(dev, lowerdev);
+ if (err < 0)
+ return err;
err = register_netdevice(dev);
if (err < 0)
return err;
- list_add_tail(&vlan->list, &port->vlans);
macvlan_transfer_operstate(dev);
return 0;
}
@@ -526,14 +554,8 @@ EXPORT_SYMBOL_GPL(macvlan_newlink);
void macvlan_dellink(struct net_device *dev)
{
- struct macvlan_dev *vlan = netdev_priv(dev);
- struct macvlan_port *port = vlan->port;
-
- list_del(&vlan->list);
+ macvlan_unlink_lowerdev(dev);
unregister_netdevice(dev);
-
- if (list_empty(&port->vlans))
- macvlan_port_destroy(port->dev);
}
EXPORT_SYMBOL_GPL(macvlan_dellink);
diff --git a/include/linux/macvlan.h b/include/linux/macvlan.h
index 3f3c6c3..cf8738a 100644
--- a/include/linux/macvlan.h
+++ b/include/linux/macvlan.h
@@ -24,6 +24,12 @@ struct macvlan_dev {
};
extern int macvlan_start_xmit(struct sk_buff *skb, struct net_device *dev);
+extern int macvlan_link_lowerdev(struct net_device *dev,
+ struct net_device *lowerdev);
+
+extern void macvlan_unlink_lowerdev(struct net_device *dev);
+
+extern void macvlan_transfer_operstate(struct net_device *dev);
extern void macvlan_setup(struct net_device *dev);
This patch breaks out the common netdev routines so that a
device can pass a venetdev pointer directly, as opposed to
assuming it is the priv member of the net_device.
Signed-off-by: Patrick Mullaney <[email protected]>
---
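The reason for the split is easier to see in a user-space model in
which the shared state is embedded inside a larger private structure.
This is a sketch with stand-in types (venet_core, netdev, macv_backend
are illustrative names, not the kernel structures): the core routine
takes the shared state directly, and each front end decides how to find
that state from its own private area.

```c
#include <assert.h>
#include <stddef.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct venet_core { int opened; };	/* stand-in for struct venetdev */
struct netdev { void *priv; };		/* stand-in for struct net_device */

struct macv_backend {			/* stand-in for struct venetmacv */
	int lower_ifindex;
	struct venet_core dev;		/* embedded, not at offset 0 */
};

/* Core routine: works on the shared state, wherever it lives. */
static int core_open(struct venet_core *c)
{
	c->opened = 1;
	return 0;
}

/* Tap-style wrapper: the private area *is* the shared state. */
static int tap_netdev_open(struct netdev *nd)
{
	return core_open(nd->priv);
}

/* Macvlan-style wrapper: the shared state is a member of the private area. */
static int macv_netdev_open(struct netdev *nd)
{
	struct macv_backend *m = nd->priv;

	return core_open(&m->dev);
}

/* Going back from the shared state to its container. */
static struct macv_backend *core_to_macv(struct venet_core *c)
{
	return container_of(c, struct macv_backend, dev);
}
```

Calling tap_netdev_open() on a macvlan-style device would treat the
start of macv_backend as a venet_core, which is the mistake the
pointer-taking variants avoid.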
kernel/vbus/devices/venet/device.c | 43 ++++++++++++++++++++++++++-----
kernel/vbus/devices/venet/venetdevice.h | 5 ++++
2 files changed, 41 insertions(+), 7 deletions(-)
diff --git a/kernel/vbus/devices/venet/device.c b/kernel/vbus/devices/venet/device.c
index d49ba7f..9fd94ca 100644
--- a/kernel/vbus/devices/venet/device.c
+++ b/kernel/vbus/devices/venet/device.c
@@ -228,9 +228,8 @@ venetdev_txq_notify_dec(struct venetdev *priv)
*/
int
-venetdev_netdev_open(struct net_device *dev)
+venetdev_open(struct venetdev *priv)
{
- struct venetdev *priv = netdev_priv(dev);
unsigned long flags;
BUG_ON(priv->netif.link);
@@ -260,7 +259,7 @@ venetdev_netdev_open(struct net_device *dev)
priv->netif.link = true;
if (!priv->vbus.link)
- netif_carrier_off(dev);
+ netif_carrier_off(priv->netif.dev);
spin_unlock_irqrestore(&priv->lock, flags);
@@ -268,9 +267,16 @@ venetdev_netdev_open(struct net_device *dev)
}
int
-venetdev_netdev_stop(struct net_device *dev)
+venetdev_netdev_open(struct net_device *dev)
{
struct venetdev *priv = netdev_priv(dev);
+
+ return venetdev_open(priv);
+}
+
+int
+venetdev_stop(struct venetdev *priv)
+{
unsigned long flags;
int needs_stop = false;
@@ -296,6 +302,14 @@ venetdev_netdev_stop(struct net_device *dev)
return 0;
}
+int
+venetdev_netdev_stop(struct net_device *dev)
+{
+ struct venetdev *priv = netdev_priv(dev);
+
+ return venetdev_stop(priv);
+}
+
/*
* Configuration changes (passed on by ifconfig)
*/
@@ -1541,10 +1555,10 @@ venetdev_apply_backpressure(struct venetdev *priv)
* the netif flow control is still managed by the actual consumer,
* thereby avoiding the creation of an extra servo-loop to the equation.
*/
+
int
-venetdev_netdev_tx(struct sk_buff *skb, struct net_device *dev)
+venetdev_xmit(struct sk_buff *skb, struct venetdev *priv)
{
- struct venetdev *priv = netdev_priv(dev);
struct ioq *ioq = NULL;
unsigned long flags;
@@ -1585,6 +1599,15 @@ flowcontrol:
return NETDEV_TX_BUSY;
}
+int
+venetdev_netdev_tx(struct sk_buff *skb, struct net_device *dev)
+{
+ struct venetdev *priv = netdev_priv(dev);
+
+ return venetdev_xmit(skb, priv);
+}
+
+
/*
* Ioctl commands
*/
@@ -1599,10 +1622,16 @@ venetdev_netdev_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
* Return statistics to the caller
*/
struct net_device_stats *
+venetdev_get_stats(struct venetdev *priv)
+{
+ return &priv->netif.stats;
+}
+
+struct net_device_stats *
venetdev_netdev_stats(struct net_device *dev)
{
struct venetdev *priv = netdev_priv(dev);
- return &priv->netif.stats;
+ return venetdev_get_stats(priv);
}
/*
diff --git a/kernel/vbus/devices/venet/venetdevice.h b/kernel/vbus/devices/venet/venetdevice.h
index 9a60a2e..71c9f0f 100644
--- a/kernel/vbus/devices/venet/venetdevice.h
+++ b/kernel/vbus/devices/venet/venetdevice.h
@@ -142,6 +142,11 @@ int venetdev_netdev_ioctl(struct net_device *dev, struct ifreq *rq,
int cmd);
struct net_device_stats *venetdev_netdev_stats(struct net_device *dev);
+int venetdev_open(struct venetdev *dev);
+int venetdev_stop(struct venetdev *dev);
+int venetdev_xmit(struct sk_buff *skb, struct venetdev *dev);
+struct net_device_stats *venetdev_get_stats(struct venetdev *dev);
+
static inline void venetdev_netdev_unregister(struct venetdev *priv)
{
if (priv->netif.enabled) {
This driver implements a macvlan device as a venet device that can
be connected to vbus. Since it is a macvlan device, it provides
a more direct path to the underlying adapter by avoiding the
bridge.
Signed-off-by: Patrick Mullaney <[email protected]>
---
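Besides the new driver, this patch carries the packet header offsets
from the guest to the host, using an all-ones value to mean "mac header
not set". A user-space sketch of that round trip is shown here. The
struct buf type and the phdr_export()/phdr_import() helpers are
hypothetical, and every offset is measured from a single base pointer,
which is a simplification that does not model the base conventions of
the kernel skb helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HDR_UNSET (~0U)			/* sentinel: header position unknown */

struct phdr {				/* offsets carried alongside the frame */
	uint32_t mac;
	uint32_t network;
	uint32_t transport;
};

struct buf {				/* stand-in for an skb's header pointers */
	unsigned char *head;
	unsigned char *mac;		/* NULL when not set */
	unsigned char *network;
	unsigned char *transport;
};

/* Sender side: turn pointers into offsets that survive a copy. */
static void phdr_export(const struct buf *b, struct phdr *h)
{
	h->mac = b->mac ? (uint32_t)(b->mac - b->head) : HDR_UNSET;
	h->network = (uint32_t)(b->network - b->head);
	h->transport = (uint32_t)(b->transport - b->head);
}

/* Receiver side: rebuild pointers against the local copy of the data. */
static void phdr_import(struct buf *b, const struct phdr *h)
{
	b->mac = (h->mac != HDR_UNSET) ? b->head + h->mac : NULL;
	b->network = b->head + h->network;
	b->transport = b->head + h->transport;
}
```

Offsets are used because the two sides hold the frame at different
addresses, so raw pointers would be meaningless after the transfer.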
drivers/net/vbus-enet.c | 8
include/linux/venet.h | 5
kernel/vbus/devices/venet/Kconfig | 11 +
kernel/vbus/devices/venet/Makefile | 10 -
kernel/vbus/devices/venet/device.c | 10 -
kernel/vbus/devices/venet/macvlan.c | 598 +++++++++++++++++++++++++++++++
kernel/vbus/devices/venet/venetdevice.h | 7
7 files changed, 642 insertions(+), 7 deletions(-)
create mode 100644 kernel/vbus/devices/venet/macvlan.c
diff --git a/drivers/net/vbus-enet.c b/drivers/net/vbus-enet.c
index 29b388f..9985020 100644
--- a/drivers/net/vbus-enet.c
+++ b/drivers/net/vbus-enet.c
@@ -832,6 +832,14 @@ vbus_enet_tx_start(struct sk_buff *skb, struct net_device *dev)
vsg->cookie = (u64)(unsigned long)skb;
vsg->len = skb->len;
+ vsg->phdr.transport = skb_transport_header(skb) - skb->head;
+ vsg->phdr.network = skb_network_header(skb) - skb->head;
+
+ if (skb_mac_header_was_set(skb))
+ vsg->phdr.mac = skb_mac_header(skb) - skb->head;
+ else
+ vsg->phdr.mac = ~0U;
+
if (skb->ip_summed == CHECKSUM_PARTIAL) {
vsg->flags |= VENET_SG_FLAG_NEEDS_CSUM;
vsg->csum.start = skb->csum_start - skb_headroom(skb);
diff --git a/include/linux/venet.h b/include/linux/venet.h
index 0578d79..4e5fdf4 100644
--- a/include/linux/venet.h
+++ b/include/linux/venet.h
@@ -78,6 +78,11 @@ struct venet_sg {
__u16 hdrlen;
__u16 size;
} gso;
+ struct {
+ __u32 mac; /* mac offset */
+ __u32 network; /* network offset */
+ __u32 transport; /* transport offset */
+ } phdr;
__u32 count; /* nr of iovs */
struct venet_iov iov[1];
};
diff --git a/kernel/vbus/devices/venet/Kconfig b/kernel/vbus/devices/venet/Kconfig
index 4f89afb..c3b1ac6 100644
--- a/kernel/vbus/devices/venet/Kconfig
+++ b/kernel/vbus/devices/venet/Kconfig
@@ -20,3 +20,14 @@ config VBUS_VENETTAP
If unsure, say N
+config VBUS_VENETMACV
+ tristate "Virtual-Bus Ethernet MACVLAN Device"
+ depends on VBUS_DEVICES && MACVLAN
+ select VBUS_VENETDEV
+ default n
+ help
+ Provides a vbus based virtual ethernet adapter with a macvlan
+ device as its backend.
+
+ If unsure, say N
+
diff --git a/kernel/vbus/devices/venet/Makefile b/kernel/vbus/devices/venet/Makefile
index 185d825..5bf7cb4 100644
--- a/kernel/vbus/devices/venet/Makefile
+++ b/kernel/vbus/devices/venet/Makefile
@@ -1,7 +1,7 @@
-venet-device-objs += device.o
-ifneq ($(CONFIG_VBUS_VENETTAP),n)
-venet-device-objs += tap.o
-endif
+venet-tap-objs := device.o tap.o
+venet-macvlan-objs := device.o macvlan.o
+
+obj-$(CONFIG_VBUS_VENETTAP) += venet-tap.o
+obj-$(CONFIG_VBUS_VENETMACV) += venet-macvlan.o
-obj-$(CONFIG_VBUS_VENETDEV) += venet-device.o
diff --git a/kernel/vbus/devices/venet/device.c b/kernel/vbus/devices/venet/device.c
index 9fd94ca..a30df94 100644
--- a/kernel/vbus/devices/venet/device.c
+++ b/kernel/vbus/devices/venet/device.c
@@ -776,6 +776,12 @@ venetdev_sg_import(struct venetdev *priv, void *ptr, int len)
return NULL;
}
+ if (vsg->phdr.mac != ~0U)
+ skb_set_mac_header(skb, vsg->phdr.mac);
+
+ skb_set_network_header(skb, vsg->phdr.network);
+ skb_set_transport_header(skb, vsg->phdr.transport);
+
if (vsg->flags & VENET_SG_FLAG_GSO) {
struct skb_shared_info *sinfo = skb_shinfo(skb);
@@ -2250,7 +2256,7 @@ host_mac_show(struct vbus_device *dev, struct vbus_device_attribute *attr,
struct vbus_device_attribute attr_hmac =
__ATTR_RO(host_mac);
-static ssize_t
+ssize_t
cmac_store(struct vbus_device *dev, struct vbus_device_attribute *attr,
const char *buf, size_t count)
{
@@ -2282,7 +2288,7 @@ cmac_store(struct vbus_device *dev, struct vbus_device_attribute *attr,
return count;
}
-static ssize_t
+ssize_t
client_mac_show(struct vbus_device *dev, struct vbus_device_attribute *attr,
char *buf)
{
diff --git a/kernel/vbus/devices/venet/macvlan.c b/kernel/vbus/devices/venet/macvlan.c
new file mode 100644
index 0000000..8724e26
--- /dev/null
+++ b/kernel/vbus/devices/venet/macvlan.c
@@ -0,0 +1,598 @@
+/*
+ * venet-macvlan - A Vbus based 802.x virtual network device that utilizes
+ * a macvlan device as the backend
+ *
+ * Copyright (C) 2009 Novell, Patrick Mullaney <[email protected]>
+ *
+ * Based on the venet-tap driver from Gregory Haskins
+ *
+ * This file is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/moduleparam.h>
+
+#include <linux/sched.h>
+#include <linux/kernel.h>
+#include <linux/slab.h>
+#include <linux/errno.h>
+#include <linux/types.h>
+#include <linux/interrupt.h>
+#include <linux/wait.h>
+
+#include <linux/in.h>
+#include <linux/netdevice.h>
+#include <linux/etherdevice.h>
+#include <linux/ip.h>
+#include <linux/tcp.h>
+#include <linux/skbuff.h>
+#include <linux/ioq.h>
+#include <linux/vbus.h>
+#include <linux/freezer.h>
+#include <linux/kthread.h>
+#include <linux/ktime.h>
+#include <linux/macvlan.h>
+
+#include "venetdevice.h"
+
+#include <linux/in6.h>
+#include <asm/checksum.h>
+
+MODULE_AUTHOR("Patrick Mullaney");
+MODULE_LICENSE("GPL");
+
+#undef PDEBUG /* undef it, just in case */
+#ifdef VENETMACVLAN_DEBUG
+# define PDEBUG(fmt, args...) printk(KERN_DEBUG "venet-macvlan: " fmt, ## args)
+#else
+# define PDEBUG(fmt, args...) /* not debugging: nothing */
+#endif
+
+struct venetmacv {
+ struct macvlan_dev mdev;
+ unsigned char ll_ifname[IFNAMSIZ];
+ struct venetdev dev;
+ const struct net_device_ops *macvlan_netdev_ops;
+};
+
+static inline struct venetmacv *conn_to_macv(struct vbus_connection *conn)
+{
+ return container_of(conn, struct venetmacv, dev.vbus.conn);
+}
+
+static inline
+struct venetmacv *venetdev_to_macv(struct venetdev *vdev)
+{
+ return container_of(vdev, struct venetmacv, dev);
+}
+
+static inline
+struct venetmacv *vbusintf_to_macv(struct vbus_device_interface *intf)
+{
+ return container_of(intf, struct venetmacv, dev.vbus.intf);
+}
+
+static inline
+struct venetmacv *vbusdev_to_macv(struct vbus_device *vdev)
+{
+ return container_of(vdev, struct venetmacv, dev.vbus.dev);
+}
+
+int
+venetmacv_tx(struct sk_buff *skb, struct net_device *dev)
+{
+ struct venetmacv *priv = netdev_priv(dev);
+
+ return venetdev_xmit(skb, &priv->dev);
+}
+
+static int venetmacv_receive(struct sk_buff *skb)
+{
+ struct venetmacv *priv = netdev_priv(skb->dev);
+ int err;
+
+ if (netif_queue_stopped(skb->dev)) {
+ PDEBUG("venetmacv_receive: queue congested - dropping..\n");
+ priv->dev.netif.stats.tx_dropped++;
+ return NET_RX_DROP;
+ }
+ err = skb_linearize(skb);
+ if (unlikely(err)) {
+ printk(KERN_WARNING "venetmacv_receive: linearize failure\n");
+ kfree_skb(skb);
+ return -1;
+ }
+ skb_push(skb, ETH_HLEN);
+ return venetmacv_tx(skb, skb->dev);
+}
+
+static void
+venetmacv_vlink_release(struct vbus_connection *conn)
+{
+ struct venetmacv *macv = conn_to_macv(conn);
+ macvlan_unlink_lowerdev(macv->mdev.dev);
+ venetdev_vlink_release(conn);
+}
+
+static void
+venetmacv_vlink_up(struct venetdev *vdev)
+{
+ struct venetmacv *macv = venetdev_to_macv(vdev);
+ int ret;
+
+ if (vdev->netif.link) {
+ rtnl_lock();
+ ret = macv->macvlan_netdev_ops->ndo_open(vdev->netif.dev);
+ rtnl_unlock();
+ if (ret)
+ printk(KERN_ERR "macvlan open failed %d!\n", ret);
+ }
+}
+
+static void
+venetmacv_vlink_down(struct venetdev *vdev)
+{
+ struct venetmacv *macv = venetdev_to_macv(vdev);
+ int ret;
+
+ if (vdev->netif.link) {
+ rtnl_lock();
+ ret = macv->macvlan_netdev_ops->ndo_stop(vdev->netif.dev);
+ rtnl_unlock();
+ if (ret)
+ printk(KERN_ERR "macvlan close failed %d!\n", ret);
+ }
+}
+
+int
+venetmacv_vlink_call(struct vbus_connection *conn,
+ unsigned long func,
+ void *data,
+ unsigned long len,
+ unsigned long flags)
+{
+ struct venetdev *priv = conn_to_priv(conn);
+ int ret;
+
+ switch (func) {
+ case VENET_FUNC_LINKUP:
+ venetmacv_vlink_up(priv);
+ break;
+ case VENET_FUNC_LINKDOWN:
+ venetmacv_vlink_down(priv);
+ break;
+ }
+ ret = venetdev_vlink_call(conn, func, data, len, flags);
+ return ret;
+}
+
+static struct vbus_connection_ops venetmacv_vbus_link_ops = {
+ .call = venetmacv_vlink_call,
+ .shm = venetdev_vlink_shm,
+ .close = venetdev_vlink_close,
+ .release = venetmacv_vlink_release,
+};
+
+/*
+ * This is called whenever a driver wants to open our device_interface
+ * for communication. The connection is represented by a
+ * vbus_connection object. It is up to the implementation to decide
+ * if it allows more than one connection at a time. This simple example
+ * does not.
+ */
+
+static int
+venetmacv_intf_connect(struct vbus_device_interface *intf,
+ struct vbus_memctx *ctx,
+ int version,
+ struct vbus_connection **conn)
+{
+ struct venetmacv *macv = vbusintf_to_macv(intf);
+ unsigned long flags;
+ int ret;
+
+ PDEBUG("connect\n");
+
+ if (version != VENET_VERSION)
+ return -EINVAL;
+
+ spin_lock_irqsave(&macv->dev.lock, flags);
+
+ /*
+ * We only allow one connection to this device
+ */
+ if (macv->dev.vbus.opened) {
+ spin_unlock_irqrestore(&macv->dev.lock, flags);
+ return -EBUSY;
+ }
+
+ kobject_get(intf->dev->kobj);
+
+ vbus_connection_init(&macv->dev.vbus.conn, &venetmacv_vbus_link_ops);
+
+ macv->dev.vbus.opened = true;
+ macv->dev.vbus.ctx = ctx;
+
+ vbus_memctx_get(ctx);
+
+ if (!macv->mdev.lowerdev)
+ return -ENXIO;
+
+ ret = macvlan_link_lowerdev(macv->mdev.dev, macv->mdev.lowerdev);
+
+ if (ret) {
+ printk(KERN_ERR "macvlan_link_lowerdev: failed\n");
+ return -ENXIO;
+ }
+
+ macvlan_transfer_operstate(macv->mdev.dev);
+
+ macv->mdev.receive = venetmacv_receive;
+
+ spin_unlock_irqrestore(&macv->dev.lock, flags);
+
+ *conn = &macv->dev.vbus.conn;
+
+ return 0;
+}
+
+static void
+venetmacv_intf_release(struct vbus_device_interface *intf)
+{
+ kobject_put(intf->dev->kobj);
+}
+
+static struct vbus_device_interface_ops venetmacv_device_interface_ops = {
+ .connect = venetmacv_intf_connect,
+ .release = venetmacv_intf_release,
+};
+
+/*
+ * This is called whenever the admin creates a symbolic link between
+ * a bus in /config/vbus/buses and our device. It represents a bus
+ * connection. Your device can choose to allow more than one bus to
+ * connect, or it can restrict it to one bus. It can also choose to
+ * register one or more device_interfaces on each bus that it
+ * successfully connects to.
+ *
+ * This example device only registers a single interface
+ */
+static int
+venetmacv_device_bus_connect(struct vbus_device *dev, struct vbus *vbus)
+{
+ struct venetdev *priv = vdev_to_priv(dev);
+ struct vbus_device_interface *intf = &priv->vbus.intf;
+
+ /* We only allow one bus to connect */
+ if (priv->vbus.connected)
+ return -EBUSY;
+
+ kobject_get(dev->kobj);
+
+ intf->name = "default";
+ intf->type = VENET_TYPE;
+ intf->ops = &venetmacv_device_interface_ops;
+
+ priv->vbus.connected = true;
+
+ /*
+ * Our example only registers one interface. If you need
+ * more, simply call interface_register() multiple times
+ */
+ return vbus_device_interface_register(dev, vbus, intf);
+}
+
+/*
+ * This is called whenever the admin removes the symbolic link between
+ * a bus in /config/vbus/buses and our device.
+ */
+static int
+venetmacv_device_bus_disconnect(struct vbus_device *dev, struct vbus *vbus)
+{
+ struct venetdev *priv = vdev_to_priv(dev);
+ struct vbus_device_interface *intf = &priv->vbus.intf;
+
+ if (!priv->vbus.connected)
+ return -EINVAL;
+
+ vbus_device_interface_unregister(intf);
+
+ priv->vbus.connected = false;
+ kobject_put(dev->kobj);
+
+ return 0;
+}
+
+static void
+venetmacv_device_release(struct vbus_device *dev)
+{
+ struct venetmacv *macv = vbusdev_to_macv(dev);
+
+ if (macv->mdev.lowerdev)
+ dev_put(macv->mdev.lowerdev);
+
+ venetdev_netdev_unregister(&macv->dev);
+ free_netdev(macv->mdev.dev);
+}
+
+
+static struct vbus_device_ops venetmacv_device_ops = {
+ .bus_connect = venetmacv_device_bus_connect,
+ .bus_disconnect = venetmacv_device_bus_disconnect,
+ .release = venetmacv_device_release,
+};
+
+#define VENETMACV_TYPE "venet-macvlan"
+static ssize_t
+ll_ifname_store(struct vbus_device *dev, struct vbus_device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct venetmacv *priv = vbusdev_to_macv(dev);
+ size_t len;
+
+ len = strlen(buf);
+
+ if (len >= IFNAMSIZ)
+ return -EINVAL;
+
+ if (priv->dev.vbus.opened)
+ return -EINVAL;
+
+ strncpy(priv->ll_ifname, buf, count-1);
+
+ if (priv->mdev.lowerdev) {
+ dev_put(priv->mdev.lowerdev);
+ priv->mdev.lowerdev = NULL;
+ }
+
+ priv->mdev.lowerdev = dev_get_by_name(dev_net(priv->mdev.dev),
+ priv->ll_ifname);
+
+ if (!priv->mdev.lowerdev)
+ return -ENXIO;
+
+ return len;
+}
+
+static ssize_t
+ll_ifname_show(struct vbus_device *dev, struct vbus_device_attribute *attr,
+ char *buf)
+{
+ struct venetmacv *priv = vbusdev_to_macv(dev);
+
+ return snprintf(buf, PAGE_SIZE, "%s\n", priv->ll_ifname);
+}
+
+static struct vbus_device_attribute attr_ll_ifname =
+__ATTR(ll_ifname, S_IRUGO | S_IWUSR, ll_ifname_show, ll_ifname_store);
+
+ssize_t
+clientmac_store(struct vbus_device *dev, struct vbus_device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct venetmacv *macv = vbusdev_to_macv(dev);
+ int ret;
+
+ ret = attr_cmac.store(dev, attr, buf, count);
+
+ if (ret == count)
+ memcpy(macv->mdev.dev->dev_addr, macv->dev.cmac, ETH_ALEN);
+
+ return ret;
+}
+
+struct vbus_device_attribute attr_clientmac =
+ __ATTR(client_mac, S_IRUGO | S_IWUSR, client_mac_show, clientmac_store);
+
+static struct attribute *attrs[] = {
+ &attr_clientmac.attr,
+ &attr_enabled.attr,
+ &attr_burstthresh.attr,
+ &attr_txmitigation.attr,
+ &attr_ifname.attr,
+ &attr_ll_ifname.attr,
+ NULL,
+};
+
+static struct attribute_group venetmacv_attr_group = {
+ .attrs = attrs,
+};
+
+static int
+venetmacv_netdev_open(struct net_device *dev)
+{
+ struct venetmacv *priv = netdev_priv(dev);
+ int ret = 0;
+
+ venetdev_open(&priv->dev);
+
+ if (priv->dev.vbus.link) {
+ rtnl_lock();
+ ret = priv->macvlan_netdev_ops->ndo_open(priv->mdev.dev);
+ rtnl_unlock();
+ }
+
+ return ret;
+}
+
+static int
+venetmacv_netdev_stop(struct net_device *dev)
+{
+ struct venetmacv *priv = netdev_priv(dev);
+ int needs_stop = false;
+ int ret = 0;
+
+ if (priv->dev.netif.link)
+ needs_stop = true;
+
+ venetdev_stop(&priv->dev);
+
+ if (priv->dev.vbus.link && needs_stop) {
+ rtnl_lock();
+ ret = priv->macvlan_netdev_ops->ndo_stop(dev);
+ rtnl_unlock();
+ }
+
+ return ret;
+}
+
+/*
+ * out routine for macvlan
+ */
+
+static int
+venetmacv_out(struct venetdev *vdev, struct sk_buff *skb)
+{
+ struct venetmacv *macv = venetdev_to_macv(vdev);
+ skb->dev = macv->mdev.lowerdev;
+ skb->protocol = eth_type_trans(skb, macv->mdev.lowerdev);
+ skb_push(skb, ETH_HLEN);
+ return macv->macvlan_netdev_ops->ndo_start_xmit(skb, macv->mdev.dev);
+}
+
+static int
+venetmacv_netdev_tx(struct sk_buff *skb, struct net_device *dev)
+{
+ struct venetmacv *priv = netdev_priv(dev);
+
+ return venetmacv_out(&priv->dev, skb);
+}
+
+static struct net_device_stats *
+venetmacv_netdev_stats(struct net_device *dev)
+{
+ struct venetmacv *priv = netdev_priv(dev);
+ return venetdev_get_stats(&priv->dev);
+}
+
+static int venetmacv_set_mac_address(struct net_device *dev, void *p)
+{
+ struct venetmacv *priv = netdev_priv(dev);
+ int ret;
+
+ ret = priv->macvlan_netdev_ops->ndo_set_mac_address(dev, p);
+
+ if (!ret)
+ memcpy(priv->dev.cmac, p, ETH_ALEN);
+
+ return ret;
+}
+
+int venetmacv_change_mtu(struct net_device *dev, int new_mtu)
+{
+ struct venetmacv *priv = netdev_priv(dev);
+
+ return priv->macvlan_netdev_ops->ndo_change_mtu(dev, new_mtu);
+}
+
+void venetmacv_change_rx_flags(struct net_device *dev, int change)
+{
+ struct venetmacv *priv = netdev_priv(dev);
+
+ priv->macvlan_netdev_ops->ndo_change_rx_flags(dev, change);
+}
+
+void venetmacv_set_multicast_list(struct net_device *dev)
+{
+ struct venetmacv *priv = netdev_priv(dev);
+
+ priv->macvlan_netdev_ops->ndo_set_multicast_list(dev);
+}
+
+static struct net_device_ops venetmacv_netdev_ops = {
+ .ndo_open = venetmacv_netdev_open,
+ .ndo_stop = venetmacv_netdev_stop,
+ .ndo_set_config = venetdev_netdev_config,
+ .ndo_change_mtu = venetmacv_change_mtu,
+ .ndo_set_mac_address = venetmacv_set_mac_address,
+ .ndo_change_rx_flags = venetmacv_change_rx_flags,
+ .ndo_set_multicast_list = venetmacv_set_multicast_list,
+ .ndo_validate_addr = eth_validate_addr,
+ .ndo_start_xmit = venetmacv_netdev_tx,
+ .ndo_do_ioctl = venetdev_netdev_ioctl,
+ .ndo_get_stats = venetmacv_netdev_stats,
+};
+
+
+/*
+ * This is called whenever the admin instantiates our devclass via
+ * "mkdir /config/vbus/devices/$(inst)/venet-tap"
+ */
+static int
+venetmacv_device_create(struct vbus_devclass *dc,
+ struct vbus_device **vdev)
+{
+ struct net_device *dev;
+ struct venetmacv *priv;
+ struct vbus_device *_vdev;
+
+ dev = alloc_netdev(sizeof(struct venetmacv), "macvenet%d",
+ macvlan_setup);
+
+ if (!dev)
+ return -ENOMEM;
+
+ priv = netdev_priv(dev);
+ memset(priv, 0, sizeof(*priv));
+
+ spin_lock_init(&priv->dev.lock);
+ random_ether_addr(priv->dev.cmac);
+ memcpy(priv->dev.hmac, priv->dev.cmac, ETH_ALEN);
+
+ /*
+ * vbus init
+ */
+ _vdev = &priv->dev.vbus.dev;
+
+ _vdev->type = VENETMACV_TYPE;
+ _vdev->ops = &venetmacv_device_ops;
+ _vdev->attrs = &venetmacv_attr_group;
+
+ venetdev_init(&priv->dev, dev);
+
+ priv->mdev.dev = dev;
+ priv->dev.netif.out = venetmacv_out;
+
+ priv->macvlan_netdev_ops = dev->netdev_ops;
+ dev->netdev_ops = &venetmacv_netdev_ops;
+
+ *vdev = _vdev;
+
+ return 0;
+}
+
+static struct vbus_devclass_ops venetmacv_devclass_ops = {
+ .create = venetmacv_device_create,
+};
+
+static struct vbus_devclass venetmacv_devclass = {
+ .name = VENETMACV_TYPE,
+ .ops = &venetmacv_devclass_ops,
+ .owner = THIS_MODULE,
+};
+
+static int __init venetmacv_init(void)
+{
+ return vbus_devclass_register(&venetmacv_devclass);
+}
+
+static void __exit venetmacv_cleanup(void)
+{
+ vbus_devclass_unregister(&venetmacv_devclass);
+}
+
+module_init(venetmacv_init);
+module_exit(venetmacv_cleanup);
+
diff --git a/kernel/vbus/devices/venet/venetdevice.h b/kernel/vbus/devices/venet/venetdevice.h
index 71c9f0f..1a74723 100644
--- a/kernel/vbus/devices/venet/venetdevice.h
+++ b/kernel/vbus/devices/venet/venetdevice.h
@@ -173,4 +173,11 @@ extern struct vbus_device_attribute attr_ifname;
extern struct vbus_device_attribute attr_txmitigation;
extern struct vbus_device_attribute attr_zcthresh;
+ssize_t cmac_store(struct vbus_device *dev,
+ struct vbus_device_attribute *attr,
+ const char *buf, size_t count);
+ssize_t client_mac_show(struct vbus_device *dev,
+ struct vbus_device_attribute *attr, char *buf);
+
+
#endif
Patrick Mullaney wrote:
> The macvlan driver didn't allow for creation/deletion of devices
> by other in-kernel modules. This patch provides common routines
> for both in-kernel and netlink based management. This patch
> also enables macvlan device support for gro for lower level
> devices that support gro.
> -static void macvlan_transfer_operstate(struct net_device *dev)
> +void macvlan_transfer_operstate(struct net_device *dev)
> {
> struct macvlan_dev *vlan = netdev_priv(dev);
> const struct net_device *lowerdev = vlan->lowerdev;
> @@ -458,6 +458,7 @@ static void macvlan_transfer_operstate(struct net_device *dev)
> netif_carrier_off(dev);
> }
> }
> +EXPORT_SYMBOL_GPL(macvlan_transfer_operstate);
I think this function could be moved to net/core/dev.c or
net/core/link_watch.c. The VLAN code has an identical copy.
> -int macvlan_newlink(struct net_device *dev,
> - struct nlattr *tb[], struct nlattr *data[])
> +int macvlan_link_lowerdev(struct net_device *dev,
> + struct net_device *lowerdev)
Please indent this more cleanly.
> {
> struct macvlan_dev *vlan = netdev_priv(dev);
> struct macvlan_port *port;
> + int err = 0;
> +
> + if (lowerdev->macvlan_port == NULL) {
> + err = macvlan_port_create(lowerdev);
> + if (err < 0)
> + return err;
> + }
> + port = lowerdev->macvlan_port;
> +
> + vlan->lowerdev = lowerdev;
> + vlan->dev = dev;
> + vlan->port = port;
> + vlan->receive = netif_rx;
> +
> + macvlan_init(dev);
> +
> + list_add_tail(&vlan->list, &port->vlans);
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(macvlan_link_lowerdev);
> @@ -502,23 +539,14 @@ int macvlan_newlink(struct net_device *dev,
> if (!tb[IFLA_ADDRESS])
> random_ether_addr(dev->dev_addr);
>
> - if (lowerdev->macvlan_port == NULL) {
> - err = macvlan_port_create(lowerdev);
> - if (err < 0)
> - return err;
> - }
> - port = lowerdev->macvlan_port;
> -
> - vlan->lowerdev = lowerdev;
> - vlan->dev = dev;
> - vlan->port = port;
> - vlan->receive = netif_rx;
> + err = macvlan_link_lowerdev(dev, lowerdev);
> + if (err < 0)
> + return err;
>
> err = register_netdevice(dev);
> if (err < 0)
> return err;
You've already added the device to the port->vlans list, so you
need to remove it again when register_netdevice() fails.
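The unwind being asked for here can be modelled in userspace. This is only a sketch under stated assumptions: the list helpers, struct port, struct vlan, link_lowerdev(), unlink_lowerdev() and newlink() are hypothetical stand-ins for the kernel code, and the register_result parameter simulates the return value of register_netdevice().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel list and macvlan types. */
struct node { struct node *prev, *next; };
struct port { struct node vlans; };              /* list head */
struct vlan { struct node list; struct port *port; };

void list_init(struct node *h) { h->prev = h->next = h; }
int list_is_empty(const struct node *h) { return h->next == h; }

static void node_add_tail(struct node *n, struct node *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void node_del(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
}

/* Link step: publishes the vlan on the port's list. */
int link_lowerdev(struct vlan *v, struct port *p)
{
	assert(v && p);
	v->port = p;
	node_add_tail(&v->list, &p->vlans);
	return 0;
}

/* Returns 1 when the port has no vlans left (where a port would be freed). */
int unlink_lowerdev(struct vlan *v)
{
	node_del(&v->list);
	return list_is_empty(&v->port->vlans);
}

/* register_result models the return value of register_netdevice(). */
int newlink(struct vlan *v, struct port *p, int register_result)
{
	int err = link_lowerdev(v, p);

	if (err < 0)
		return err;

	err = register_result;
	if (err < 0) {
		unlink_lowerdev(v);      /* undo the list insertion */
		return err;
	}
	return 0;
}
```

With a failing registration the port list ends up empty again, which is the property the review comment asks for.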
> - list_add_tail(&vlan->list, &port->vlans);
> macvlan_transfer_operstate(dev);
> return 0;
> }
> @@ -526,14 +554,8 @@ EXPORT_SYMBOL_GPL(macvlan_newlink);
Patrick Mullaney wrote:
> This driver implements a macvlan device as a venet device that can
> be connected to vbus. Since it is a macvlan device, it provides
> a more direct path to the underlying adapter by avoiding the
> bridge.
> --- /dev/null
> +++ b/kernel/vbus/devices/venet/macvlan.c
> ...
> +struct venetmacv {
> + struct macvlan_dev mdev;
> + unsigned char ll_ifname[IFNAMSIZ];
> + struct venetdev dev;
> + const struct net_device_ops *macvlan_netdev_ops;
> +};
macvlan might destroy the device below you when the underlying
device is unregistered. You need to handle this by releasing
the venetmacv device. Check out the NETDEV_UNREGISTER case in
macvlan_device_event().
Patrick Mullaney wrote:
> (Applies to alacrityvm.git/master:34534534)
>
> This patchset implements a vbus venet device with a
> macvlan backend.
Thanks Pat, applied.
If possible, please submit a patch for the userspace side "-net
venet-macvlan[,macaddr][,lower-devname]" feature ASAP and I will merge
that as well.
-Greg
Gregory Haskins wrote:
> Patrick Mullaney wrote:
>> (Applies to alacrityvm.git/master:34534534)
>>
>> This patchset implements a vbus venet device with a
>> macvlan backend.
>
> Thanks Pat, applied.
As I mentioned in my response to these patches, the macvlan part
needs more work.
Patrick McHardy wrote:
> Gregory Haskins wrote:
>> Patrick Mullaney wrote:
>>> (Applies to alacrityvm.git/master:34534534)
>>>
>>> This patchset implements a vbus venet device with a
>>> macvlan backend.
>> Thanks Pat, applied.
>
> As I mentioned in my response to these patches, the macvlan part
> needs more work.
Yeah, I talked to Pat offline. He is going to patch it incrementally as
it's not a trivial problem to clean up. For the time being, they are
sitting in the alacrityvm tree, but they are not going upstream until
the macvlan stuff is resolved. All but the last patch are not really my
business to push up, anyway.
In the meantime, Pat or others can enhance the work that he's done to
date, so I will carry them out-of-tree.
Kind Regards,
-Greg
Gregory Haskins wrote:
> Patrick McHardy wrote:
>> Gregory Haskins wrote:
>>> Patrick Mullaney wrote:
>>>> (Applies to alacrityvm.git/master:34534534)
>>>>
>>>> This patchset implements a vbus venet device with a
>>>> macvlan backend.
>>> Thanks Pat, applied.
>> As I mentioned in my response to these patches, the macvlan part
>> needs more work.
>
> Yeah, I talked to Pat offline. He is going to patch it incrementally as
> it's not a trivial problem to clean up. For the time being, they are
> sitting in the alacrityvm tree, but they are not going upstream until
> the macvlan stuff is resolved.
I see, thanks.
Patrick,
This patch is intended to address your comment on moving the operstate
transition function. I decided to move it to netdevice.h, perhaps that
is a bad idea? It didn't seem to logically fall into dev.c or link_watch.c.
I am not against moving them to either one though. Your other comments
are addressed and I will send out a second series once this gets
reviewed and agreed on.
Thanks for your review/comments.
----------
Provide common routine for the transition of operational state for a leaf
device during a root device transition.
Signed-off-by: Patrick Mullaney <[email protected]>
---
include/linux/netdevice.h | 19 +++++++++++++++++++
net/8021q/vlan.c | 29 ++++-------------------------
2 files changed, 23 insertions(+), 25 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index d4a4d98..a15920a 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1999,6 +1999,25 @@ static inline u32 dev_ethtool_get_flags(struct net_device *dev)
return 0;
return dev->ethtool_ops->get_flags(dev);
}
+
+static inline
+void netif_stacked_transfer_operstate(const struct net_device *rootdev,
+ struct net_device *dev)
+{
+ if (rootdev->operstate == IF_OPER_DORMANT)
+ netif_dormant_on(dev);
+ else
+ netif_dormant_off(dev);
+
+ if (netif_carrier_ok(rootdev)) {
+ if (!netif_carrier_ok(dev))
+ netif_carrier_on(dev);
+ } else {
+ if (netif_carrier_ok(dev))
+ netif_carrier_off(dev);
+ }
+}
+
#endif /* __KERNEL__ */
#endif /* _LINUX_NETDEVICE_H */
diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index fe64908..5d11c12 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -183,27 +183,6 @@ void unregister_vlan_dev(struct net_device *dev)
dev_put(real_dev);
}
-static void vlan_transfer_operstate(const struct net_device *dev,
- struct net_device *vlandev)
-{
- /* Have to respect userspace enforced dormant state
- * of real device, also must allow supplicant running
- * on VLAN device
- */
- if (dev->operstate == IF_OPER_DORMANT)
- netif_dormant_on(vlandev);
- else
- netif_dormant_off(vlandev);
-
- if (netif_carrier_ok(dev)) {
- if (!netif_carrier_ok(vlandev))
- netif_carrier_on(vlandev);
- } else {
- if (netif_carrier_ok(vlandev))
- netif_carrier_off(vlandev);
- }
-}
-
int vlan_check_real_dev(struct net_device *real_dev, u16 vlan_id)
{
const char *name = real_dev->name;
@@ -267,7 +246,7 @@ int register_vlan_dev(struct net_device *dev)
/* Account for reference in struct vlan_dev_info */
dev_hold(real_dev);
- vlan_transfer_operstate(real_dev, dev);
+ netif_stacked_transfer_operstate(real_dev, dev);
linkwatch_fire_event(dev); /* _MUST_ call rfc2863_policy() */
/* So, got the sucker initialized, now lets place
@@ -449,7 +428,7 @@ static int vlan_device_event(struct notifier_block *unused, unsigned long event,
if (!vlandev)
continue;
- vlan_transfer_operstate(dev, vlandev);
+ netif_stacked_transfer_operstate(dev, vlandev);
}
break;
@@ -492,7 +471,7 @@ static int vlan_device_event(struct notifier_block *unused, unsigned long event,
continue;
dev_change_flags(vlandev, flgs & ~IFF_UP);
- vlan_transfer_operstate(dev, vlandev);
+ netif_stacked_transfer_operstate(dev, vlandev);
}
break;
@@ -508,7 +487,7 @@ static int vlan_device_event(struct notifier_block *unused, unsigned long event,
continue;
dev_change_flags(vlandev, flgs | IFF_UP);
- vlan_transfer_operstate(dev, vlandev);
+ netif_stacked_transfer_operstate(dev, vlandev);
}
break;
Patrick Mullaney wrote:
> This patch is intended to address your comment on moving the operstate
> transition function. I decided to move it to netdevice.h, perhaps that
> is a bad idea? It didn't seem to logically fall into dev.c or link_watch.c.
> I am not against moving them to either one though. Your other comments
> are addressed and I will send out a second series once this gets
> reviewed and agreed on.
Thanks. I don't mind much where exactly it is located, but I'd prefer
to not have it inlined. It doesn't seem terribly wrong to move it to
dev.c, there are even some helpers for stacked devices already, like
address list synchronization.
Besides that the patch looks fine to me.
(Applies to net-2.6.git/master:1dfc5827)
These patches allow other modules to override the receive
path of a macvlan. This is being done to support guest
VMs operating directly over a macvlan. Routines to allow
creation and deletion of macvlans from in-kernel modules
were also exposed/added.
---
Patrick Mullaney (3):
macvlan: allow in-kernel modules to create and manage macvlan devices
macvlan: derived from Arnd Bergmann's patch for macvtap
netdevice: provide common routine for macvlan and vlan operstate management
drivers/net/macvlan.c | 135 ++++++++++++++++++++++-----------------------
include/linux/macvlan.h | 41 ++++++++++++++
include/linux/netdevice.h | 3 +
net/8021q/vlan.c | 29 +---------
net/core/dev.c | 27 +++++++++
5 files changed, 142 insertions(+), 93 deletions(-)
create mode 100644 include/linux/macvlan.h
Provide common routine for the transition of operational state for a leaf
device during a root device transition.
Signed-off-by: Patrick Mullaney <[email protected]>
---
drivers/net/macvlan.c | 24 +++---------------------
include/linux/netdevice.h | 3 +++
net/8021q/vlan.c | 29 ++++-------------------------
net/core/dev.c | 27 +++++++++++++++++++++++++++
4 files changed, 37 insertions(+), 46 deletions(-)
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 3aabfd9..41dc71f 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -455,25 +455,6 @@ static void macvlan_port_destroy(struct net_device *dev)
kfree(port);
}
-static void macvlan_transfer_operstate(struct net_device *dev)
-{
- struct macvlan_dev *vlan = netdev_priv(dev);
- const struct net_device *lowerdev = vlan->lowerdev;
-
- if (lowerdev->operstate == IF_OPER_DORMANT)
- netif_dormant_on(dev);
- else
- netif_dormant_off(dev);
-
- if (netif_carrier_ok(lowerdev)) {
- if (!netif_carrier_ok(dev))
- netif_carrier_on(dev);
- } else {
- if (netif_carrier_ok(dev))
- netif_carrier_off(dev);
- }
-}
-
static int macvlan_validate(struct nlattr *tb[], struct nlattr *data[])
{
if (tb[IFLA_ADDRESS]) {
@@ -551,7 +532,7 @@ static int macvlan_newlink(struct net_device *dev,
return err;
list_add_tail(&vlan->list, &port->vlans);
- macvlan_transfer_operstate(dev);
+ netif_stacked_transfer_operstate(dev, lowerdev);
return 0;
}
@@ -591,7 +572,8 @@ static int macvlan_device_event(struct notifier_block *unused,
switch (event) {
case NETDEV_CHANGE:
list_for_each_entry(vlan, &port->vlans, list)
- macvlan_transfer_operstate(vlan->dev);
+ netif_stacked_transfer_operstate(vlan->dev,
+ vlan->lowerdev);
break;
case NETDEV_FEAT_CHANGE:
list_for_each_entry(vlan, &port->vlans, list) {
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 812a5f3..1587715 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1914,6 +1914,9 @@ unsigned long netdev_increment_features(unsigned long all, unsigned long one,
unsigned long mask);
unsigned long netdev_fix_features(unsigned long features, const char *name);
+void netif_stacked_transfer_operstate(const struct net_device *rootdev,
+ struct net_device *dev);
+
static inline int net_gso_ok(int features, int gso_type)
{
int feature = gso_type << NETIF_F_GSO_SHIFT;
diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index 8836575..8157b32 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -183,27 +183,6 @@ void unregister_vlan_dev(struct net_device *dev)
dev_put(real_dev);
}
-static void vlan_transfer_operstate(const struct net_device *dev,
- struct net_device *vlandev)
-{
- /* Have to respect userspace enforced dormant state
- * of real device, also must allow supplicant running
- * on VLAN device
- */
- if (dev->operstate == IF_OPER_DORMANT)
- netif_dormant_on(vlandev);
- else
- netif_dormant_off(vlandev);
-
- if (netif_carrier_ok(dev)) {
- if (!netif_carrier_ok(vlandev))
- netif_carrier_on(vlandev);
- } else {
- if (netif_carrier_ok(vlandev))
- netif_carrier_off(vlandev);
- }
-}
-
int vlan_check_real_dev(struct net_device *real_dev, u16 vlan_id)
{
const char *name = real_dev->name;
@@ -261,7 +240,7 @@ int register_vlan_dev(struct net_device *dev)
/* Account for reference in struct vlan_dev_info */
dev_hold(real_dev);
- vlan_transfer_operstate(real_dev, dev);
+ netif_stacked_transfer_operstate(real_dev, dev);
linkwatch_fire_event(dev); /* _MUST_ call rfc2863_policy() */
/* So, got the sucker initialized, now lets place
@@ -447,7 +426,7 @@ static int vlan_device_event(struct notifier_block *unused, unsigned long event,
if (!vlandev)
continue;
- vlan_transfer_operstate(dev, vlandev);
+ netif_stacked_transfer_operstate(dev, vlandev);
}
break;
@@ -503,7 +482,7 @@ static int vlan_device_event(struct notifier_block *unused, unsigned long event,
continue;
dev_change_flags(vlandev, flgs & ~IFF_UP);
- vlan_transfer_operstate(dev, vlandev);
+ netif_stacked_transfer_operstate(dev, vlandev);
}
break;
@@ -519,7 +498,7 @@ static int vlan_device_event(struct notifier_block *unused, unsigned long event,
continue;
dev_change_flags(vlandev, flgs | IFF_UP);
- vlan_transfer_operstate(dev, vlandev);
+ netif_stacked_transfer_operstate(dev, vlandev);
}
break;
diff --git a/net/core/dev.c b/net/core/dev.c
index b8f74cf..21e6f09 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4746,6 +4746,33 @@ unsigned long netdev_fix_features(unsigned long features, const char *name)
EXPORT_SYMBOL(netdev_fix_features);
/**
+ * netif_stacked_transfer_operstate - transfer operstate
+ * @rootdev: the root or lower level device to transfer state from
+ * @dev: the device to transfer operstate to
+ *
+ * Transfer operational state from root to device. This is normally
+ * called when a stacking relationship exists between the root
+ * device and the device (a leaf device).
+ */
+void netif_stacked_transfer_operstate(const struct net_device *rootdev,
+ struct net_device *dev)
+{
+ if (rootdev->operstate == IF_OPER_DORMANT)
+ netif_dormant_on(dev);
+ else
+ netif_dormant_off(dev);
+
+ if (netif_carrier_ok(rootdev)) {
+ if (!netif_carrier_ok(dev))
+ netif_carrier_on(dev);
+ } else {
+ if (netif_carrier_ok(dev))
+ netif_carrier_off(dev);
+ }
+}
+EXPORT_SYMBOL(netif_stacked_transfer_operstate);
+
+/**
* register_netdevice - register a network device
* @dev: device to register
*
This patch includes only the basic framework for overriding the
receive path. The macvlan header was also moved so that modules
outside of drivers/net can use it.
Signed-off-by: Patrick Mullaney <[email protected]>
---
drivers/net/macvlan.c | 41 ++++++++++++++++-------------------------
include/linux/macvlan.h | 37 +++++++++++++++++++++++++++++++++++++
2 files changed, 53 insertions(+), 25 deletions(-)
create mode 100644 include/linux/macvlan.h
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 41dc71f..3425e12 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -30,22 +30,7 @@
#include <linux/if_macvlan.h>
#include <net/rtnetlink.h>
-#define MACVLAN_HASH_SIZE (1 << BITS_PER_BYTE)
-
-struct macvlan_port {
- struct net_device *dev;
- struct hlist_head vlan_hash[MACVLAN_HASH_SIZE];
- struct list_head vlans;
-};
-
-struct macvlan_dev {
- struct net_device *dev;
- struct list_head list;
- struct hlist_node hlist;
- struct macvlan_port *port;
- struct net_device *lowerdev;
-};
-
+#include <linux/macvlan.h>
static struct macvlan_dev *macvlan_hash_lookup(const struct macvlan_port *port,
const unsigned char *addr)
@@ -135,7 +120,7 @@ static void macvlan_broadcast(struct sk_buff *skb,
else
nskb->pkt_type = PACKET_MULTICAST;
- netif_rx(nskb);
+ vlan->receive(nskb);
}
}
}
@@ -180,12 +165,12 @@ static struct sk_buff *macvlan_handle_frame(struct sk_buff *skb)
skb->dev = dev;
skb->pkt_type = PACKET_HOST;
- netif_rx(skb);
+ vlan->receive(skb);
return NULL;
}
-static netdev_tx_t macvlan_start_xmit(struct sk_buff *skb,
- struct net_device *dev)
+netdev_tx_t macvlan_start_xmit(struct sk_buff *skb,
+ struct net_device *dev)
{
int i = skb_get_queue_mapping(skb);
struct netdev_queue *txq = netdev_get_tx_queue(dev, i);
@@ -204,6 +189,7 @@ static netdev_tx_t macvlan_start_xmit(struct sk_buff *skb,
return NETDEV_TX_OK;
}
+EXPORT_SYMBOL_GPL(macvlan_start_xmit);
static int macvlan_hard_header(struct sk_buff *skb, struct net_device *dev,
unsigned short type, const void *daddr,
@@ -414,7 +400,7 @@ static const struct net_device_ops macvlan_netdev_ops = {
.ndo_validate_addr = eth_validate_addr,
};
-static void macvlan_setup(struct net_device *dev)
+void macvlan_setup(struct net_device *dev)
{
ether_setup(dev);
@@ -425,6 +411,7 @@ static void macvlan_setup(struct net_device *dev)
dev->ethtool_ops = &macvlan_ethtool_ops;
dev->tx_queue_len = 0;
}
+EXPORT_SYMBOL_GPL(macvlan_setup);
static int macvlan_port_create(struct net_device *dev)
{
@@ -455,7 +442,7 @@ static void macvlan_port_destroy(struct net_device *dev)
kfree(port);
}
-static int macvlan_validate(struct nlattr *tb[], struct nlattr *data[])
+int macvlan_validate(struct nlattr *tb[], struct nlattr *data[])
{
if (tb[IFLA_ADDRESS]) {
if (nla_len(tb[IFLA_ADDRESS]) != ETH_ALEN)
@@ -465,6 +452,7 @@ static int macvlan_validate(struct nlattr *tb[], struct nlattr *data[])
}
return 0;
}
+EXPORT_SYMBOL_GPL(macvlan_validate);
static int macvlan_get_tx_queues(struct net *net,
struct nlattr *tb[],
@@ -485,8 +473,8 @@ static int macvlan_get_tx_queues(struct net *net,
return 0;
}
-static int macvlan_newlink(struct net_device *dev,
- struct nlattr *tb[], struct nlattr *data[])
+int macvlan_newlink(struct net_device *dev,
+ struct nlattr *tb[], struct nlattr *data[])
{
struct macvlan_dev *vlan = netdev_priv(dev);
struct macvlan_port *port;
@@ -526,6 +514,7 @@ static int macvlan_newlink(struct net_device *dev,
vlan->lowerdev = lowerdev;
vlan->dev = dev;
vlan->port = port;
+ vlan->receive = netif_rx;
err = register_netdevice(dev);
if (err < 0)
@@ -535,8 +524,9 @@ static int macvlan_newlink(struct net_device *dev,
netif_stacked_transfer_operstate(dev, lowerdev);
return 0;
}
+EXPORT_SYMBOL_GPL(macvlan_newlink);
-static void macvlan_dellink(struct net_device *dev)
+void macvlan_dellink(struct net_device *dev)
{
struct macvlan_dev *vlan = netdev_priv(dev);
struct macvlan_port *port = vlan->port;
@@ -547,6 +537,7 @@ static void macvlan_dellink(struct net_device *dev)
if (list_empty(&port->vlans))
macvlan_port_destroy(port->dev);
}
+EXPORT_SYMBOL_GPL(macvlan_dellink);
static struct rtnl_link_ops macvlan_link_ops __read_mostly = {
.kind = "macvlan",
diff --git a/include/linux/macvlan.h b/include/linux/macvlan.h
new file mode 100644
index 0000000..3f3c6c3
--- /dev/null
+++ b/include/linux/macvlan.h
@@ -0,0 +1,37 @@
+#ifndef _MACVLAN_H
+#define _MACVLAN_H
+
+#include <linux/netdevice.h>
+#include <linux/netlink.h>
+#include <linux/list.h>
+
+#define MACVLAN_HASH_SIZE (1 << BITS_PER_BYTE)
+
+struct macvlan_port {
+ struct net_device *dev;
+ struct hlist_head vlan_hash[MACVLAN_HASH_SIZE];
+ struct list_head vlans;
+};
+
+struct macvlan_dev {
+ struct net_device *dev;
+ struct list_head list;
+ struct hlist_node hlist;
+ struct macvlan_port *port;
+ struct net_device *lowerdev;
+
+ int (*receive)(struct sk_buff *skb);
+};
+
+extern int macvlan_start_xmit(struct sk_buff *skb, struct net_device *dev);
+
+extern void macvlan_setup(struct net_device *dev);
+
+extern int macvlan_validate(struct nlattr *tb[], struct nlattr *data[]);
+
+extern int macvlan_newlink(struct net_device *dev,
+ struct nlattr *tb[], struct nlattr *data[]);
+
+extern void macvlan_dellink(struct net_device *dev);
+
+#endif /* _MACVLAN_H */
The macvlan driver didn't allow for creation/deletion of devices
by other in-kernel modules. This patch provides common routines
for both in-kernel and netlink based management. This patch
also enables macvlan device support for gro for lower level
devices that support gro.
Signed-off-by: Patrick Mullaney <[email protected]>
---
drivers/net/macvlan.c | 72 ++++++++++++++++++++++++++++++++---------------
include/linux/macvlan.h | 4 +++
2 files changed, 53 insertions(+), 23 deletions(-)
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 3425e12..bb180d0 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -210,7 +210,7 @@ static const struct header_ops macvlan_hard_header_ops = {
.cache_update = eth_header_cache_update,
};
-static int macvlan_open(struct net_device *dev)
+int macvlan_open(struct net_device *dev)
{
struct macvlan_dev *vlan = netdev_priv(dev);
struct net_device *lowerdev = vlan->lowerdev;
@@ -237,7 +237,7 @@ out:
return err;
}
-static int macvlan_stop(struct net_device *dev)
+int macvlan_stop(struct net_device *dev)
{
struct macvlan_dev *vlan = netdev_priv(dev);
struct net_device *lowerdev = vlan->lowerdev;
@@ -318,7 +318,7 @@ static struct lock_class_key macvlan_netdev_addr_lock_key;
#define MACVLAN_FEATURES \
(NETIF_F_SG | NETIF_F_ALL_CSUM | NETIF_F_HIGHDMA | NETIF_F_FRAGLIST | \
NETIF_F_GSO | NETIF_F_TSO | NETIF_F_UFO | NETIF_F_GSO_ROBUST | \
- NETIF_F_TSO_ECN | NETIF_F_TSO6)
+ NETIF_F_TSO_ECN | NETIF_F_TSO6 | NETIF_F_GRO)
#define MACVLAN_STATE_MASK \
((1<<__LINK_STATE_NOCARRIER) | (1<<__LINK_STATE_DORMANT))
@@ -454,6 +454,44 @@ int macvlan_validate(struct nlattr *tb[], struct nlattr *data[])
}
EXPORT_SYMBOL_GPL(macvlan_validate);
+int macvlan_link_lowerdev(struct net_device *dev,
+ struct net_device *lowerdev)
+{
+ struct macvlan_dev *vlan = netdev_priv(dev);
+ struct macvlan_port *port;
+ int err = 0;
+
+ if (lowerdev->macvlan_port == NULL) {
+ err = macvlan_port_create(lowerdev);
+ if (err < 0)
+ return err;
+ }
+ port = lowerdev->macvlan_port;
+
+ vlan->lowerdev = lowerdev;
+ vlan->dev = dev;
+ vlan->port = port;
+ vlan->receive = netif_rx;
+
+ macvlan_init(dev);
+
+ list_add_tail(&vlan->list, &port->vlans);
+ return 0;
+}
+EXPORT_SYMBOL_GPL(macvlan_link_lowerdev);
+
+void macvlan_unlink_lowerdev(struct net_device *dev)
+{
+ struct macvlan_dev *vlan = netdev_priv(dev);
+ struct macvlan_port *port = vlan->port;
+
+ list_del(&vlan->list);
+
+ if (list_empty(&port->vlans))
+ macvlan_port_destroy(port->dev);
+}
+EXPORT_SYMBOL_GPL(macvlan_unlink_lowerdev);
+
static int macvlan_get_tx_queues(struct net *net,
struct nlattr *tb[],
unsigned int *num_tx_queues,
@@ -504,38 +542,26 @@ int macvlan_newlink(struct net_device *dev,
if (!tb[IFLA_ADDRESS])
random_ether_addr(dev->dev_addr);
- if (lowerdev->macvlan_port == NULL) {
- err = macvlan_port_create(lowerdev);
- if (err < 0)
- return err;
- }
- port = lowerdev->macvlan_port;
-
- vlan->lowerdev = lowerdev;
- vlan->dev = dev;
- vlan->port = port;
- vlan->receive = netif_rx;
+ err = macvlan_link_lowerdev(dev, lowerdev);
+ if (err < 0)
+ return err;
err = register_netdevice(dev);
- if (err < 0)
+ if (err < 0) {
+ macvlan_unlink_lowerdev(dev);
return err;
+ }
- list_add_tail(&vlan->list, &port->vlans);
netif_stacked_transfer_operstate(dev, lowerdev);
+
return 0;
}
EXPORT_SYMBOL_GPL(macvlan_newlink);
void macvlan_dellink(struct net_device *dev)
{
- struct macvlan_dev *vlan = netdev_priv(dev);
- struct macvlan_port *port = vlan->port;
-
- list_del(&vlan->list);
+ macvlan_unlink_lowerdev(dev);
unregister_netdevice(dev);
-
- if (list_empty(&port->vlans))
- macvlan_port_destroy(port->dev);
}
EXPORT_SYMBOL_GPL(macvlan_dellink);
diff --git a/include/linux/macvlan.h b/include/linux/macvlan.h
index 3f3c6c3..27f56d9 100644
--- a/include/linux/macvlan.h
+++ b/include/linux/macvlan.h
@@ -24,6 +24,10 @@ struct macvlan_dev {
};
extern int macvlan_start_xmit(struct sk_buff *skb, struct net_device *dev);
+extern int macvlan_link_lowerdev(struct net_device *dev,
+ struct net_device *lowerdev);
+
+extern void macvlan_unlink_lowerdev(struct net_device *dev);
extern void macvlan_setup(struct net_device *dev);
On Fri, 13 Nov 2009 14:55:06 -0500
Patrick Mullaney <[email protected]> wrote:
> These patches allow other modules to override the receive
> path of a macvlan. This is being done to support guest
> VMs operating directly over a macvlan. Routines to allow
> creation and deletion of macvlans from in-kernel modules
> were also exposed/added.
Which guest VM, how will it use it? The kernel is not in the business
of providing infrastructure for out of tree patches.
Also, macvlan should really be calling netif_receive_skb()
not going through another queue/softirq cycle.
On Fri, 2009-11-13 at 13:27 -0800, Stephen Hemminger wrote:
> On Fri, 13 Nov 2009 14:55:06 -0500
> Patrick Mullaney <[email protected]> wrote:
>
> > These patches allow other modules to override the receive
> > path of a macvlan. This is being done to support guest
> > VMs operating directly over a macvlan. Routines to allow
> > creation and deletion of macvlans from in-kernel modules
> > were also exposed/added.
>
> Which guest VM, how will it use it? The kernel is not in the business
> of providing infrastructure for out of tree patches.
Actually, any guest VM or container. macvtap was the first to suggest
a patch like this to provide this functionality. This infrastructure
is generic enough to allow others to use it. My interest is in
supporting kvm guests via vbus drivers which are being integrated
in the alacrityvm tree.
>
> Also, macvlan should really be calling netif_receive_skb()
> not going through another queue/softirq cycle.
I understand but you are talking about the current behavior of
macvlans - my goal wasn't to change the current behavior
of macvlans, as I didn't want to disturb what the original author
intended, just provide the ability to override the rx path and
provide for management of macvlans from other kernel modules (not
just via rtnl).
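The receive override discussed here is a function-pointer hook with a default. A minimal userspace sketch, assuming hypothetical names throughout (struct pkt, struct vlan_dev, default_rx(), guest_rx(), vlan_init(), vlan_override_rx() and handle_frame() are illustrative and not the kernel API):

```c
#include <assert.h>
#include <stddef.h>

struct pkt { int len; };

/* Hypothetical model of a macvlan device carrying a receive hook. */
struct vlan_dev {
	int (*receive)(struct pkt *p);
};

int default_rx_calls;
int guest_rx_calls;

/* Stands in for netif_rx(): the default path into the host stack. */
int default_rx(struct pkt *p) { (void)p; default_rx_calls++; return 0; }

/* Stands in for a backend that takes frames directly for a guest. */
int guest_rx(struct pkt *p) { (void)p; guest_rx_calls++; return 0; }

/* Device setup installs the default, so existing users see no change. */
void vlan_init(struct vlan_dev *v)
{
	assert(v);
	v->receive = default_rx;
}

/* Another module replaces the hook after linking to the lower device. */
void vlan_override_rx(struct vlan_dev *v, int (*rx)(struct pkt *p))
{
	assert(v && rx);
	v->receive = rx;
}

/* Models the end of the frame handler: dispatch through the hook. */
int handle_frame(struct vlan_dev *v, struct pkt *p)
{
	return v->receive(p);
}
```

Because the default is installed at setup time, a device that is never overridden behaves as before, which matches the stated goal of not changing current macvlan behavior.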
On Friday 13 November 2009, Patrick Mullaney wrote:
> @@ -551,7 +532,7 @@ static int macvlan_newlink(struct net_device *dev,
> return err;
>
> list_add_tail(&vlan->list, &port->vlans);
> - macvlan_transfer_operstate(dev);
> + netif_stacked_transfer_operstate(dev, lowerdev);
> return 0;
> }
>
> @@ -591,7 +572,8 @@ static int macvlan_device_event(struct notifier_block *unused,
> switch (event) {
> case NETDEV_CHANGE:
> list_for_each_entry(vlan, &port->vlans, list)
> - macvlan_transfer_operstate(vlan->dev);
> + netif_stacked_transfer_operstate(vlan->dev,
> + vlan->lowerdev);
> break;
> case NETDEV_FEAT_CHANGE:
> list_for_each_entry(vlan, &port->vlans, list) {
These have the arguments reversed, lowerdev should come first.
Arnd <><
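To illustrate why the order matters: both parameters are pointers to the same type, so a swapped call compiles cleanly and silently copies state in the wrong direction. The sketch below is a userspace model only; struct dev and transfer_operstate() are hypothetical stand-ins that mirror the (rootdev, dev) parameter order of netif_stacked_transfer_operstate().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced model of a net device's link state. */
struct dev {
	bool carrier;
	bool dormant;
};

/* Source (root/lower device) first, destination (stacked device) second. */
void transfer_operstate(const struct dev *rootdev, struct dev *dev)
{
	assert(rootdev && dev);
	dev->dormant = rootdev->dormant;
	dev->carrier = rootdev->carrier;
}
```

Calling transfer_operstate(&lower, &upper) propagates the lower device's state upward as intended. Calling it as transfer_operstate(&upper, &lower) overwrites the lower device's state with the stacked device's instead, and nothing in the type system flags it.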
Thanks. I'll post an updated patch.
On Fri, 2009-11-27 at 14:09 +0100, Arnd Bergmann wrote:
> On Friday 13 November 2009, Patrick Mullaney wrote:
> > @@ -551,7 +532,7 @@ static int macvlan_newlink(struct net_device *dev,
> > return err;
> >
> > list_add_tail(&vlan->list, &port->vlans);
> > - macvlan_transfer_operstate(dev);
> > + netif_stacked_transfer_operstate(dev, lowerdev);
> > return 0;
> > }
> >
> > @@ -591,7 +572,8 @@ static int macvlan_device_event(struct notifier_block *unused,
> > switch (event) {
> > case NETDEV_CHANGE:
> > list_for_each_entry(vlan, &port->vlans, list)
> > - macvlan_transfer_operstate(vlan->dev);
> > + netif_stacked_transfer_operstate(vlan->dev,
> > + vlan->lowerdev);
> > break;
> > case NETDEV_FEAT_CHANGE:
> > list_for_each_entry(vlan, &port->vlans, list) {
>
> These have the arguments reversed, lowerdev should come first.
>
> Arnd <><
On Friday 13 November 2009, Patrick Mullaney wrote:
>
> The macvlan driver didn't allow for creation/deletion of devices
> by other in-kernel modules. This patch provides common routines
> for both in-kernel and netlink based management. This patch
> also enables macvlan device support for gro for lower level
> devices that support gro.
I wonder if doing it this way round is a good idea. Why don't
you just use netlink to set up the endpoint device like
the current macvlan and macvtap do? I think doing it consistently
for all backends would be a significant advantage.
Arnd <><
On Friday 13 November 2009, Patrick Mullaney wrote:
> @@ -318,7 +318,7 @@ static struct lock_class_key macvlan_netdev_addr_lock_key;
> #define MACVLAN_FEATURES \
> (NETIF_F_SG | NETIF_F_ALL_CSUM | NETIF_F_HIGHDMA | NETIF_F_FRAGLIST | \
> NETIF_F_GSO | NETIF_F_TSO | NETIF_F_UFO | NETIF_F_GSO_ROBUST | \
> - NETIF_F_TSO_ECN | NETIF_F_TSO6)
> + NETIF_F_TSO_ECN | NETIF_F_TSO6 | NETIF_F_GRO)
>
> #define MACVLAN_STATE_MASK \
> ((1<<__LINK_STATE_NOCARRIER) | (1<<__LINK_STATE_DORMANT))
This hunk looks like it should be a separate patch, because we will
want to have this independently of the rest. I have taken it into
a series I'm preparing for a new posting of macvtap based on the
current net-next tree with my bridge mode changes. I also have
your patch 1 (the fixed version) and 2 in there. It's currently
work in progress, but if you are interested, take a look at [1].
Arnd <><
[1] http://git.kernel.org/?p=linux/kernel/git/arnd/playground.git;a=shortlog;h=refs/heads/macvlan
On Friday 13 November 2009, Stephen Hemminger wrote:
> Also, macvlan should really be calling netif_receive_skb()
> not going through another queue/softirq cycle.
I've added a patch for this in my experimental queue now.
When I last tried this, I saw a kernel stack overflow
but it seems fine now.
I wonder if we can also return from the macvlan hook to
netif_receive_skb() with skb->dev changed, rather than calling
it recursively.
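A rough user-space sketch of that idea, under the assumption that the hook can report "deliver again" to its caller: the hook only retargets the packet at the upper device, and the receive routine loops instead of recursing. All names here (skb_sketch, dev_sketch, HOOK_AGAIN and so on) are hypothetical and only model the control flow, not the real receive path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device model: each device may have one stacked upper device. */
struct dev_sketch {
    struct dev_sketch *upper;   /* e.g. a macvlan on top of eth0, or NULL */
    unsigned long rx_packets;   /* packets finally delivered to this device */
};

/* Hypothetical packet model: only the receiving device matters here. */
struct skb_sketch {
    struct dev_sketch *dev;
};

enum hook_verdict { HOOK_PASS, HOOK_AGAIN };

/* The hook changes skb->dev and returns, rather than calling the receive path. */
static enum hook_verdict stacked_hook_sketch(struct skb_sketch *skb)
{
    if (skb->dev->upper == NULL)
        return HOOK_PASS;
    skb->dev = skb->dev->upper;
    return HOOK_AGAIN;
}

/*
 * The receive routine iterates while the hook asks for another round, so
 * the call depth stays constant no matter how many devices are stacked.
 * Returns the number of passes made over the packet.
 */
static int receive_skb_sketch(struct skb_sketch *skb)
{
    int rounds = 1;

    assert(skb != NULL && skb->dev != NULL);
    while (stacked_hook_sketch(skb) == HOOK_AGAIN)
        rounds++;
    skb->dev->rx_packets++;
    return rounds;
}
```

The point of the sketch is only that the stacking depth turns into loop iterations rather than nested stack frames; it says nothing about locking or packet ownership, which a real change would have to deal with.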
Arnd <><
From: Arnd Bergmann <[email protected]>
Date: Sat, 28 Nov 2009 00:43:58 +0100
> On Friday 13 November 2009, Stephen Hemminger wrote:
>> Also, macvlan should really be calling netif_receive_skb()
>> not going through another queue/softirq cycle.
>
> I've added a patch for this in my experimental queue now.
> When I last tried this, I saw a kernel stack overflow
> but it seems fine now.
I think it is unwise for any virtual device layer to use netif_receive_skb().
Just like tunnels, they should always use netif_rx().
Otherwise stack overflow is a very real concern.
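As a toy illustration of the trade-off (a user-space model only, with hypothetical names, not the kernel's backlog implementation): delivering directly nests one call per stacked layer, while queueing the packet and processing it from a loop keeps the call depth flat at the cost of another pass through the queue.

```c
#include <assert.h>

#define BACKLOG_MAX 16

/* Hypothetical backlog model that also records the deepest call nesting seen. */
struct backlog_sketch {
    int level[BACKLOG_MAX];   /* remaining stacked layers for each queued packet */
    int head, tail;
    int max_depth;
};

static void note_depth_sketch(struct backlog_sketch *b, int depth)
{
    if (depth > b->max_depth)
        b->max_depth = depth;
}

/* Direct delivery: each stacked layer calls into the next one recursively. */
static void deliver_direct_sketch(struct backlog_sketch *b, int level, int depth)
{
    note_depth_sketch(b, depth);
    if (level > 0)
        deliver_direct_sketch(b, level - 1, depth + 1);
}

/* Queued delivery: a layer only enqueues the packet for the next layer. */
static void enqueue_sketch(struct backlog_sketch *b, int level)
{
    assert(b->tail < BACKLOG_MAX);
    b->level[b->tail++] = level;
}

/* The queue is drained from a flat loop, so nesting never exceeds one. */
static void process_backlog_sketch(struct backlog_sketch *b)
{
    while (b->head < b->tail) {
        int level = b->level[b->head++];

        note_depth_sketch(b, 1);
        if (level > 0)
            enqueue_sketch(b, level - 1);
    }
}
```

In this model a packet crossing three stacked layers reaches a nesting depth of four when delivered directly, but only one when it is re-queued at each layer.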
On Fri, 27 Nov 2009 16:19:57 -0800 (PST)
David Miller <[email protected]> wrote:
> From: Arnd Bergmann <[email protected]>
> Date: Sat, 28 Nov 2009 00:43:58 +0100
>
> > On Friday 13 November 2009, Stephen Hemminger wrote:
> >> Also, macvlan should really be calling netif_receive_skb()
> >> not going through another queue/softirq cycle.
> >
> > I've added a patch for this in my experimental queue now.
> > When I last tried this, I saw a kernel stack overflow
> > but it seems fine now.
>
> I think it is unwise for any virtual device layer to use netif_receive_skb().
> Just like tunnels they should always use netif_rx().
>
> Otherwise stack overflow is a very real concern.
Maybe we should figure out a way for protocols to return a new skb to
netif_receive_skb(), avoiding the extra softirq without risking stack overflow?
From: Stephen Hemminger <[email protected]>
Date: Fri, 27 Nov 2009 21:38:24 -0800
> Maybe we should figure out a way for protocols to return new skb in netif_receive_skb
> to avoid extra softirq, but avoid stack overflow?
Eric Dumazet and I tried to find ways to handle this, please see the
archives. It's not an easy problem to solve and none of the patches
we came up with avoided crashes under high stress scenarios.
On Fri, 2009-11-27 at 23:14 +0100, Arnd Bergmann wrote:
> On Friday 13 November 2009, Patrick Mullaney wrote:
> >
> > The macvlan driver didn't allow for creation/deletion of devices
> > by other in-kernel modules. This patch provides common routines
> > for both in-kernel and netlink based management. This patch
> > also enables macvlan device support for gro for lower level
> > devices that support gro.
>
> I wonder if doing this way round is a good idea, why don't
> you just use netlink to set up the endpoint device like
> the current macvlan and macvtap do? I think doing it consistently
> for all backends would be a significant advantage.
Sorry for the late response. I'm thinking about re-implementing
this along the lines that you are talking about, especially in light
of your new configuration options. The reason (probably short-sighted)
for the previous approach was that the creation step was already being
handled in our venet driver (but it doesn't have to be).
Thanks for the suggestion.
Patrick
>
> Arnd <><
I hope I didn't confuse things by posting

  "netdevice: provide common routine for macvlan and vlan operstate management"

again. I offered to send that out patched against net-next-2.6 last
week and I just got back to following up. I'm fine with you rolling
it into your series too.
Thanks.
On Fri, 2009-11-27 at 23:19 +0100, Arnd Bergmann wrote:
> On Friday 13 November 2009, Patrick Mullaney wrote:
> > @@ -318,7 +318,7 @@ static struct lock_class_key macvlan_netdev_addr_lock_key;
> > #define MACVLAN_FEATURES \
> > (NETIF_F_SG | NETIF_F_ALL_CSUM | NETIF_F_HIGHDMA | NETIF_F_FRAGLIST | \
> > NETIF_F_GSO | NETIF_F_TSO | NETIF_F_UFO | NETIF_F_GSO_ROBUST | \
> > - NETIF_F_TSO_ECN | NETIF_F_TSO6)
> > + NETIF_F_TSO_ECN | NETIF_F_TSO6 | NETIF_F_GRO)
> >
> > #define MACVLAN_STATE_MASK \
> > ((1<<__LINK_STATE_NOCARRIER) | (1<<__LINK_STATE_DORMANT))
>
> This hunk looks like it should be a separate patch, because we will
> want to have this independently of the rest. I have taken it into
> a series I'm preparing for a new posting of macvtap based on the
> current net-next tree with my bridge mode changes. I also have
> your patch 1 (the fixed version) and 2 in there. It's currently
> work in progress, but if you are interested, take a look at [1].
>
> Arnd <><
>
> [1] http://git.kernel.org/?p=linux/kernel/git/arnd/playground.git;a=shortlog;h=refs/heads/macvlan
On Thursday 03 December 2009 20:45:43 Patrick Mullaney wrote:
> I hope I didn't confuse things by posting:
>
> netdevice: provide common routine for macvlan and vlan operstate
> management
>
> again. I offered to send that out patched against net-next-2.6 last
> week and I just got back to following up. I'm fine with you rolling
> them into your series too.
Not at all, thanks for sending that yourself! You did miss the Acked-by
lines from Patrick McHardy and myself though; they should have been
part of your resend. I'll give you another one then.
Arnd <><