Add dev_num for vmbus device based on channel offer sequence.
Use this number for nic naming, and enable async probing.
Haiyang Zhang (3):
Drivers: hv: vmbus: Add a dev_num variable based on channel offer
sequence
Drivers: hv: vmbus: Add dev_num to sysfs
hv_netvsc: Name NICs based on vmbus offer sequence and use async probe
Documentation/ABI/stable/sysfs-bus-vmbus | 8 +++++++
drivers/hv/channel_mgmt.c | 38 ++++++++++++++++++++++++++++++--
drivers/hv/vmbus_drv.c | 13 +++++++++++
drivers/net/hyperv/netvsc_drv.c | 18 ++++++++++++---
include/linux/hyperv.h | 6 +++++
5 files changed, 78 insertions(+), 5 deletions(-)
--
1.8.3.1
The dev_num field in vmbus channel structure is assigned to the first
available number when the channel is offered. So netvsc driver uses it
for NIC naming based on channel offer sequence. Now re-enable the async
probing mode for faster probing.
Signed-off-by: Haiyang Zhang <[email protected]>
---
drivers/net/hyperv/netvsc_drv.c | 18 +++++++++++++++---
1 file changed, 15 insertions(+), 3 deletions(-)
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index f3f9eb8..39c412f 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -2267,10 +2267,14 @@ static int netvsc_probe(struct hv_device *dev,
struct net_device_context *net_device_ctx;
struct netvsc_device_info *device_info = NULL;
struct netvsc_device *nvdev;
+ char name[IFNAMSIZ];
int ret = -ENOMEM;
- net = alloc_etherdev_mq(sizeof(struct net_device_context),
- VRSS_CHANNEL_MAX);
+ snprintf(name, IFNAMSIZ, "eth%d", dev->channel->dev_num);
+ net = alloc_netdev_mqs(sizeof(struct net_device_context), name,
+ NET_NAME_ENUM, ether_setup,
+ VRSS_CHANNEL_MAX, VRSS_CHANNEL_MAX);
+
if (!net)
goto no_net;
@@ -2355,6 +2359,14 @@ static int netvsc_probe(struct hv_device *dev,
net->max_mtu = ETH_DATA_LEN;
ret = register_netdevice(net);
+
+ if (ret == -EEXIST) {
+ pr_info("NIC name %s exists, request another name.\n",
+ net->name);
+ strlcpy(net->name, "eth%d", IFNAMSIZ);
+ ret = register_netdevice(net);
+ }
+
if (ret != 0) {
pr_err("Unable to register netdev.\n");
goto register_failed;
@@ -2496,7 +2508,7 @@ static int netvsc_resume(struct hv_device *dev)
.suspend = netvsc_suspend,
.resume = netvsc_resume,
.driver = {
- .probe_type = PROBE_FORCE_SYNCHRONOUS,
+ .probe_type = PROBE_PREFER_ASYNCHRONOUS,
},
};
--
1.8.3.1
From: Haiyang Zhang <[email protected]> Sent: Monday, December 30, 2019 12:14 PM
>
> The dev_num field in vmbus channel structure is assigned to the first
Let's use "VMBus" in text.
> available number when the channel is offered. So netvsc driver uses it
> for NIC naming based on channel offer sequence. Now re-enable the async
> probing mode for faster probing.
>
> Signed-off-by: Haiyang Zhang <[email protected]>
> ---
> drivers/net/hyperv/netvsc_drv.c | 18 +++++++++++++++---
> 1 file changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
> index f3f9eb8..39c412f 100644
> --- a/drivers/net/hyperv/netvsc_drv.c
> +++ b/drivers/net/hyperv/netvsc_drv.c
> @@ -2267,10 +2267,14 @@ static int netvsc_probe(struct hv_device *dev,
> struct net_device_context *net_device_ctx;
> struct netvsc_device_info *device_info = NULL;
> struct netvsc_device *nvdev;
> + char name[IFNAMSIZ];
> int ret = -ENOMEM;
>
> - net = alloc_etherdev_mq(sizeof(struct net_device_context),
> - VRSS_CHANNEL_MAX);
> + snprintf(name, IFNAMSIZ, "eth%d", dev->channel->dev_num);
> + net = alloc_netdev_mqs(sizeof(struct net_device_context), name,
> + NET_NAME_ENUM, ether_setup,
> + VRSS_CHANNEL_MAX, VRSS_CHANNEL_MAX);
> +
> if (!net)
> goto no_net;
>
> @@ -2355,6 +2359,14 @@ static int netvsc_probe(struct hv_device *dev,
> net->max_mtu = ETH_DATA_LEN;
>
> ret = register_netdevice(net);
> +
> + if (ret == -EEXIST) {
> + pr_info("NIC name %s exists, request another name.\n",
> + net->name);
> + strlcpy(net->name, "eth%d", IFNAMSIZ);
> + ret = register_netdevice(net);
The message above could be confusing. "request another name" sounds
like a directive to the user/sysadmin, which I don't think it is. Perhaps
better would be "requesting another name", which says the system is
handling the collision automatically.
Also will this second call to register_netdevice() actually assign a numeric
name? If so, it would be really nice for the message to be output *after*
the second call to register_netdevice() and to show both the originally
selected name that collided as well as the new name. We sometimes get
into NIC naming issues with customers in Azure, and when a new name
has to be selected it will help debugging if we can show both the original
selection and the new selection. With both pieces of data, there's less
guessing about who did what regarding NIC naming.
Finally, having to choose a different name because of a collision does
*not* update the channel->dev_num value. Subsequent calls to
hv_set_devnum() will detect "in use" based on the originally assigned
dev_num value. I don't think that fundamentally breaks anything, but
I wondered if you had thought about that case. Maybe a comment here
to that effect would help a future reader of this code.
Michael
> + }
> +
> if (ret != 0) {
> pr_err("Unable to register netdev.\n");
> goto register_failed;
> @@ -2496,7 +2508,7 @@ static int netvsc_resume(struct hv_device *dev)
> .suspend = netvsc_suspend,
> .resume = netvsc_resume,
> .driver = {
> - .probe_type = PROBE_FORCE_SYNCHRONOUS,
> + .probe_type = PROBE_PREFER_ASYNCHRONOUS,
> },
> };
>
> --
> 1.8.3.1
On Mon, Dec 30, 2019 at 12:13:34PM -0800, Haiyang Zhang wrote:
> The dev_num field in vmbus channel structure is assigned to the first
> available number when the channel is offered. So netvsc driver uses it
> for NIC naming based on channel offer sequence. Now re-enable the async
> probing mode for faster probing.
>
> Signed-off-by: Haiyang Zhang <[email protected]>
> ---
> drivers/net/hyperv/netvsc_drv.c | 18 +++++++++++++++---
> 1 file changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
> index f3f9eb8..39c412f 100644
> --- a/drivers/net/hyperv/netvsc_drv.c
> +++ b/drivers/net/hyperv/netvsc_drv.c
> @@ -2267,10 +2267,14 @@ static int netvsc_probe(struct hv_device *dev,
> struct net_device_context *net_device_ctx;
> struct netvsc_device_info *device_info = NULL;
> struct netvsc_device *nvdev;
> + char name[IFNAMSIZ];
> int ret = -ENOMEM;
>
> - net = alloc_etherdev_mq(sizeof(struct net_device_context),
> - VRSS_CHANNEL_MAX);
> + snprintf(name, IFNAMSIZ, "eth%d", dev->channel->dev_num);
How is this supposed to work when there are other ethernet device types
on the system, which may claim the same device names?
> + net = alloc_netdev_mqs(sizeof(struct net_device_context), name,
> + NET_NAME_ENUM, ether_setup,
> + VRSS_CHANNEL_MAX, VRSS_CHANNEL_MAX);
> +
> if (!net)
> goto no_net;
>
> @@ -2355,6 +2359,14 @@ static int netvsc_probe(struct hv_device *dev,
> net->max_mtu = ETH_DATA_LEN;
>
> ret = register_netdevice(net);
> +
> + if (ret == -EEXIST) {
> + pr_info("NIC name %s exists, request another name.\n",
> + net->name);
> + strlcpy(net->name, "eth%d", IFNAMSIZ);
> + ret = register_netdevice(net);
> + }
IOW you want the device naming to be predictable, but don't guarantee
this?
I think the problem this patchset is trying to solve is much better
solved with a udev rule, similar to how it's done for PCI net devices.
And IMO the primary channel number, being a device's "hardware"
property, is more suited to be used in the device name, than this
completely ephemeral device number.
Thanks,
Roman.
> +
> if (ret != 0) {
> pr_err("Unable to register netdev.\n");
> goto register_failed;
> @@ -2496,7 +2508,7 @@ static int netvsc_resume(struct hv_device *dev)
> .suspend = netvsc_suspend,
> .resume = netvsc_resume,
> .driver = {
> - .probe_type = PROBE_FORCE_SYNCHRONOUS,
> + .probe_type = PROBE_PREFER_ASYNCHRONOUS,
> },
> };
>
> --
> 1.8.3.1
>
> -----Original Message-----
> From: Michael Kelley <[email protected]>
> Sent: Monday, December 30, 2019 8:35 PM
> To: Haiyang Zhang <[email protected]>; [email protected]; linux-
> [email protected]; [email protected]
> Cc: Haiyang Zhang <[email protected]>; KY Srinivasan
> <[email protected]>; Stephen Hemminger <[email protected]>;
> [email protected]; vkuznets <[email protected]>; [email protected];
> [email protected]
> Subject: RE: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus
> offer sequence and use async probe
>
> From: Haiyang Zhang <[email protected]> Sent: Monday, December 30,
> 2019 12:14 PM
> >
> > The dev_num field in vmbus channel structure is assigned to the first
>
> Let's use "VMBus" in text.
I will.
>
> > available number when the channel is offered. So netvsc driver uses it
> > for NIC naming based on channel offer sequence. Now re-enable the async
> > probing mode for faster probing.
> >
> > Signed-off-by: Haiyang Zhang <[email protected]>
> > ---
> > drivers/net/hyperv/netvsc_drv.c | 18 +++++++++++++++---
> > 1 file changed, 15 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
> > index f3f9eb8..39c412f 100644
> > --- a/drivers/net/hyperv/netvsc_drv.c
> > +++ b/drivers/net/hyperv/netvsc_drv.c
> > @@ -2267,10 +2267,14 @@ static int netvsc_probe(struct hv_device *dev,
> > struct net_device_context *net_device_ctx;
> > struct netvsc_device_info *device_info = NULL;
> > struct netvsc_device *nvdev;
> > + char name[IFNAMSIZ];
> > int ret = -ENOMEM;
> >
> > - net = alloc_etherdev_mq(sizeof(struct net_device_context),
> > - VRSS_CHANNEL_MAX);
> > + snprintf(name, IFNAMSIZ, "eth%d", dev->channel->dev_num);
> > + net = alloc_netdev_mqs(sizeof(struct net_device_context), name,
> > + NET_NAME_ENUM, ether_setup,
> > + VRSS_CHANNEL_MAX, VRSS_CHANNEL_MAX);
> > +
> > if (!net)
> > goto no_net;
> >
> > @@ -2355,6 +2359,14 @@ static int netvsc_probe(struct hv_device *dev,
> > net->max_mtu = ETH_DATA_LEN;
> >
> > ret = register_netdevice(net);
> > +
> > + if (ret == -EEXIST) {
> > + pr_info("NIC name %s exists, request another name.\n",
> > + net->name);
> > + strlcpy(net->name, "eth%d", IFNAMSIZ);
> > + ret = register_netdevice(net);
>
> The message above could be confusing. "request another name" sounds
> like a directive to the user/sysadmin, which I don't think it is. Perhaps
> better would be "requesting another name", which says the system is
> handling the collision automatically.
>
> Also will this second call to register_netdevice() actually assign a numeric
> name? If so, it would be really nice for the message to be output *after*
> the second call to register_netdevice() and to show both the originally
> selected name that collided as well as the new name. We sometimes get
> into NIC naming issues with customers in Azure, and when a new name
> has to be selected it will help debugging if we can show both the original
> selection and the new selection. With both pieces of data, there's less
> guessing about who did what regarding NIC naming.
Good idea! I will include both new and old names in the log messages.
>
> Finally, having to choose a different name because of a collision does
> *not* update the channel->dev_num value. Subsequent calls to
> hv_set_devnum() will detect "in use" based on the originally assigned
> dev_num value. I don't think that fundamentally breaks anything, but
> I wondered if you had thought about that case. Maybe a comment here
> to that effect would help a future reader of this code.
That's correct. And I'm aware of this situation. But, the dev_num shouldn't
be changed, because we want it to keep track of the sequence of device
offering.
Avoid single or multiple collisions, we should use udev or other daemons to
set name of VF or other types of NICs to different formats, like "vf*", or "enP*",
etc. For distros have not done so already, we need to work with the distro vendors
to ensure udev or other settings are correctly included in Distros, and Azure images.
Thanks,
- Haiyang
> -----Original Message-----
> From: Roman Kagan <[email protected]>
> Sent: Tuesday, December 31, 2019 6:35 AM
> To: Haiyang Zhang <[email protected]>
> Cc: [email protected]; [email protected]; [email protected];
> KY Srinivasan <[email protected]>; Stephen Hemminger
> <[email protected]>; [email protected]; vkuznets
> <[email protected]>; [email protected]; [email protected]
> Subject: Re: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus
> offer sequence and use async probe
>
> On Mon, Dec 30, 2019 at 12:13:34PM -0800, Haiyang Zhang wrote:
> > The dev_num field in vmbus channel structure is assigned to the first
> > available number when the channel is offered. So netvsc driver uses it
> > for NIC naming based on channel offer sequence. Now re-enable the
> > async probing mode for faster probing.
> >
> > Signed-off-by: Haiyang Zhang <[email protected]>
> > ---
> > drivers/net/hyperv/netvsc_drv.c | 18 +++++++++++++++---
> > 1 file changed, 15 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/hyperv/netvsc_drv.c
> > b/drivers/net/hyperv/netvsc_drv.c index f3f9eb8..39c412f 100644
> > --- a/drivers/net/hyperv/netvsc_drv.c
> > +++ b/drivers/net/hyperv/netvsc_drv.c
> > @@ -2267,10 +2267,14 @@ static int netvsc_probe(struct hv_device *dev,
> > struct net_device_context *net_device_ctx;
> > struct netvsc_device_info *device_info = NULL;
> > struct netvsc_device *nvdev;
> > + char name[IFNAMSIZ];
> > int ret = -ENOMEM;
> >
> > - net = alloc_etherdev_mq(sizeof(struct net_device_context),
> > - VRSS_CHANNEL_MAX);
> > + snprintf(name, IFNAMSIZ, "eth%d", dev->channel->dev_num);
>
> How is this supposed to work when there are other ethernet device types on the
> system, which may claim the same device names?
>
> > + net = alloc_netdev_mqs(sizeof(struct net_device_context), name,
> > + NET_NAME_ENUM, ether_setup,
> > + VRSS_CHANNEL_MAX, VRSS_CHANNEL_MAX);
> > +
> > if (!net)
> > goto no_net;
> >
> > @@ -2355,6 +2359,14 @@ static int netvsc_probe(struct hv_device *dev,
> > net->max_mtu = ETH_DATA_LEN;
> >
> > ret = register_netdevice(net);
> > +
> > + if (ret == -EEXIST) {
> > + pr_info("NIC name %s exists, request another name.\n",
> > + net->name);
> > + strlcpy(net->name, "eth%d", IFNAMSIZ);
> > + ret = register_netdevice(net);
> > + }
>
> IOW you want the device naming to be predictable, but don't guarantee this?
>
> I think the problem this patchset is trying to solve is much better solved with a
> udev rule, similar to how it's done for PCI net devices.
> And IMO the primary channel number, being a device's "hardware"
> property, is more suited to be used in the device name, than this completely
> ephemeral device number.
The vmbus number can be affected by other types of devices and/or subchannel
offerings. They are not stable either. That's why this patch set keeps track of the
offering sequence within the same device type in a new variable "dev_num".
As in my earlier email, to avoid impact by other types of NICs, we should put them
into different naming formats, like "vf*", "enP*", etc. And yes, these can be done in
udev.
But for netvsc (synthetic) NICs, we still want the default naming format "eth*". And
the variable "dev_num" gives them the basis for stable naming with Async probing.
Thanks,
- Haiyang