2018-04-05 19:12:31

by Mohammed Gamal

Subject: [PATCH 0/4] hv_netvsc: Fix shutdown issues on older Windows hosts

Guests running on WS2012 hosts would not shut down after a change to
network interface settings (e.g. number of channels, MTU, etc.).

This patch series addresses the shutdown issues we encountered with WS2012
hosts. It is essentially a rework of the series sent in
https://lkml.org/lkml/2018/1/23/111 on top of the latest upstream.

Fixes: 0ef58b0a05c1 ("hv_netvsc: change GPAD teardown order on older versions")

Mohammed Gamal (4):
hv_netvsc: Use Windows version instead of NVSP version on GPAD
teardown
hv_netvsc: Split netvsc_revoke_buf() and netvsc_teardown_gpadl()
hv_netvsc: Ensure correct teardown message sequence order
hv_netvsc: Pass net_device parameter to revoke and teardown functions

drivers/net/hyperv/netvsc.c | 60 +++++++++++++++++++++++++++++++++------------
1 file changed, 44 insertions(+), 16 deletions(-)

--
1.8.3.1



2018-04-05 19:11:21

by Mohammed Gamal

Subject: [PATCH 4/4] hv_netvsc: Pass net_device parameter to revoke and teardown functions

The callers of netvsc_revoke_*_buf() and netvsc_teardown_*_gpadl()
already have their net_device instances. Pass them as a parameter to
these functions instead of looking them up with hv_get_drvdata()
every time.

Signed-off-by: Mohammed Gamal <[email protected]>
---
drivers/net/hyperv/netvsc.c | 37 ++++++++++++++++++-------------------
1 file changed, 18 insertions(+), 19 deletions(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index df92c2f..04f611e 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -110,9 +110,9 @@ static void free_netvsc_device_rcu(struct netvsc_device *nvdev)
}

static void netvsc_revoke_recv_buf(struct hv_device *device,
- struct netvsc_device *net_device)
+ struct netvsc_device *net_device,
+ struct net_device *ndev)
{
- struct net_device *ndev = hv_get_drvdata(device);
struct nvsp_message *revoke_packet;
int ret;

@@ -160,9 +160,9 @@ static void netvsc_revoke_recv_buf(struct hv_device *device,
}

static void netvsc_revoke_send_buf(struct hv_device *device,
- struct netvsc_device *net_device)
+ struct netvsc_device *net_device,
+ struct net_device *ndev)
{
- struct net_device *ndev = hv_get_drvdata(device);
struct nvsp_message *revoke_packet;
int ret;

@@ -211,9 +211,9 @@ static void netvsc_revoke_send_buf(struct hv_device *device,
}

static void netvsc_teardown_recv_gpadl(struct hv_device *device,
- struct netvsc_device *net_device)
+ struct netvsc_device *net_device,
+ struct net_device *ndev)
{
- struct net_device *ndev = hv_get_drvdata(device);
int ret;

if (net_device->recv_buf_gpadl_handle) {
@@ -233,9 +233,9 @@ static void netvsc_teardown_recv_gpadl(struct hv_device *device,
}

static void netvsc_teardown_send_gpadl(struct hv_device *device,
- struct netvsc_device *net_device)
+ struct netvsc_device *net_device,
+ struct net_device *ndev)
{
- struct net_device *ndev = hv_get_drvdata(device);
int ret;

if (net_device->send_buf_gpadl_handle) {
@@ -452,10 +452,10 @@ static int netvsc_init_buf(struct hv_device *device,
goto exit;

cleanup:
- netvsc_revoke_recv_buf(device, net_device);
- netvsc_revoke_send_buf(device, net_device);
- netvsc_teardown_recv_gpadl(device, net_device);
- netvsc_teardown_send_gpadl(device, net_device);
+ netvsc_revoke_recv_buf(device, net_device, ndev);
+ netvsc_revoke_send_buf(device, net_device, ndev);
+ netvsc_teardown_recv_gpadl(device, net_device, ndev);
+ netvsc_teardown_send_gpadl(device, net_device, ndev);

exit:
return ret;
@@ -474,7 +474,6 @@ static int negotiate_nvsp_ver(struct hv_device *device,
init_packet->hdr.msg_type = NVSP_MSG_TYPE_INIT;
init_packet->msg.init_msg.init.min_protocol_ver = nvsp_ver;
init_packet->msg.init_msg.init.max_protocol_ver = nvsp_ver;
-
trace_nvsp_send(ndev, init_packet);

/* Send the init request */
@@ -596,13 +595,13 @@ void netvsc_device_remove(struct hv_device *device)
* Revoke receive buffer. If host is pre-Win2016 then tear down
* receive buffer GPADL. Do the same for send buffer.
*/
- netvsc_revoke_recv_buf(device, net_device);
+ netvsc_revoke_recv_buf(device, net_device, ndev);
if (vmbus_proto_version < VERSION_WIN10)
- netvsc_teardown_recv_gpadl(device, net_device);
+ netvsc_teardown_recv_gpadl(device, net_device, ndev);

- netvsc_revoke_send_buf(device, net_device);
+ netvsc_revoke_send_buf(device, net_device, ndev);
if (vmbus_proto_version < VERSION_WIN10)
- netvsc_teardown_send_gpadl(device, net_device);
+ netvsc_teardown_send_gpadl(device, net_device, ndev);

RCU_INIT_POINTER(net_device_ctx->nvdev, NULL);

@@ -624,8 +623,8 @@ void netvsc_device_remove(struct hv_device *device)
* here after VMBus is closed.
*/
if (vmbus_proto_version >= VERSION_WIN10) {
- netvsc_teardown_recv_gpadl(device, net_device);
- netvsc_teardown_send_gpadl(device, net_device);
+ netvsc_teardown_recv_gpadl(device, net_device, ndev);
+ netvsc_teardown_send_gpadl(device, net_device, ndev);
}

/* Release all resources */
--
1.8.3.1


2018-04-05 19:11:40

by Mohammed Gamal

Subject: [PATCH 2/4] hv_netvsc: Split netvsc_revoke_buf() and netvsc_teardown_gpadl()

Split each of the functions into two, one for the send buffer and one
for the receive buffer. This will be needed in order to implement a
fine-grained messaging sequence to the host so that we can accommodate
the requirements of different Windows versions.

Fixes: 0ef58b0a05c12 ("hv_netvsc: change GPAD teardown order on older versions")

Signed-off-by: Mohammed Gamal <[email protected]>
---
drivers/net/hyperv/netvsc.c | 46 +++++++++++++++++++++++++++++++++------------
1 file changed, 34 insertions(+), 12 deletions(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index d65b7fc..f4df5de 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -109,11 +109,11 @@ static void free_netvsc_device_rcu(struct netvsc_device *nvdev)
call_rcu(&nvdev->rcu, free_netvsc_device);
}

-static void netvsc_revoke_buf(struct hv_device *device,
- struct netvsc_device *net_device)
+static void netvsc_revoke_recv_buf(struct hv_device *device,
+ struct netvsc_device *net_device)
{
- struct nvsp_message *revoke_packet;
struct net_device *ndev = hv_get_drvdata(device);
+ struct nvsp_message *revoke_packet;
int ret;

/*
@@ -157,6 +157,14 @@ static void netvsc_revoke_buf(struct hv_device *device,
}
net_device->recv_section_cnt = 0;
}
+}
+
+static void netvsc_revoke_send_buf(struct hv_device *device,
+ struct netvsc_device *net_device)
+{
+ struct net_device *ndev = hv_get_drvdata(device);
+ struct nvsp_message *revoke_packet;
+ int ret;

/* Deal with the send buffer we may have setup.
* If we got a send section size, it means we received a
@@ -202,8 +210,8 @@ static void netvsc_revoke_buf(struct hv_device *device,
}
}

-static void netvsc_teardown_gpadl(struct hv_device *device,
- struct netvsc_device *net_device)
+static void netvsc_teardown_recv_gpadl(struct hv_device *device,
+ struct netvsc_device *net_device)
{
struct net_device *ndev = hv_get_drvdata(device);
int ret;
@@ -222,6 +230,13 @@ static void netvsc_teardown_gpadl(struct hv_device *device,
}
net_device->recv_buf_gpadl_handle = 0;
}
+}
+
+static void netvsc_teardown_send_gpadl(struct hv_device *device,
+ struct netvsc_device *net_device)
+{
+ struct net_device *ndev = hv_get_drvdata(device);
+ int ret;

if (net_device->send_buf_gpadl_handle) {
ret = vmbus_teardown_gpadl(device->channel,
@@ -437,8 +452,10 @@ static int netvsc_init_buf(struct hv_device *device,
goto exit;

cleanup:
- netvsc_revoke_buf(device, net_device);
- netvsc_teardown_gpadl(device, net_device);
+ netvsc_revoke_recv_buf(device, net_device);
+ netvsc_revoke_send_buf(device, net_device);
+ netvsc_teardown_recv_gpadl(device, net_device);
+ netvsc_teardown_send_gpadl(device, net_device);

exit:
return ret;
@@ -575,7 +592,8 @@ void netvsc_device_remove(struct hv_device *device)
= rtnl_dereference(net_device_ctx->nvdev);
int i;

- netvsc_revoke_buf(device, net_device);
+ netvsc_revoke_recv_buf(device, net_device);
+ netvsc_revoke_send_buf(device, net_device);

RCU_INIT_POINTER(net_device_ctx->nvdev, NULL);

@@ -590,14 +608,18 @@ void netvsc_device_remove(struct hv_device *device)
netdev_dbg(ndev, "net device safe to remove\n");

/* older versions require that buffer be revoked before close */
- if (vmbus_proto_version < VERSION_WIN10)
- netvsc_teardown_gpadl(device, net_device);
+ if (vmbus_proto_version < VERSION_WIN10) {
+ netvsc_teardown_recv_gpadl(device, net_device);
+ netvsc_teardown_send_gpadl(device, net_device);
+ }

/* Now, we can close the channel safely */
vmbus_close(device->channel);

- if (vmbus_proto_version >= VERSION_WIN10)
- netvsc_teardown_gpadl(device, net_device);
+ if (vmbus_proto_version >= VERSION_WIN10) {
+ netvsc_teardown_recv_gpadl(device, net_device);
+ netvsc_teardown_send_gpadl(device, net_device);
+ }

/* Release all resources */
free_netvsc_device_rcu(net_device);
--
1.8.3.1


2018-04-05 19:11:55

by Mohammed Gamal

Subject: [PATCH 3/4] hv_netvsc: Ensure correct teardown message sequence order

Prior to commit 0cf737808ae7 ("hv_netvsc: netvsc_teardown_gpadl() split")
the call sequence in netvsc_device_remove() was as follows (as
implemented in netvsc_destroy_buf()):
1- Send NVSP_MSG1_TYPE_REVOKE_RECV_BUF message
2- Teardown receive buffer GPADL
3- Send NVSP_MSG1_TYPE_REVOKE_SEND_BUF message
4- Teardown send buffer GPADL
5- Close vmbus

This didn't work for WS2016 hosts. Commit 0cf737808ae7
("hv_netvsc: netvsc_teardown_gpadl() split") rearranged the
teardown sequence as follows:
1- Send NVSP_MSG1_TYPE_REVOKE_RECV_BUF message
2- Send NVSP_MSG1_TYPE_REVOKE_SEND_BUF message
3- Close vmbus
4- Teardown receive buffer GPADL
5- Teardown send buffer GPADL

That worked well for WS2016 hosts, but it prevented guests on older hosts from
shutting down after changing network settings. Commit 0ef58b0a05c1
("hv_netvsc: change GPAD teardown order on older versions") ensured the
following message sequence for older hosts:
1- Send NVSP_MSG1_TYPE_REVOKE_RECV_BUF message
2- Send NVSP_MSG1_TYPE_REVOKE_SEND_BUF message
3- Teardown receive buffer GPADL
4- Teardown send buffer GPADL
5- Close vmbus

However, with this sequence, calling `ip link set eth0 mtu 1000` hangs and the
process becomes uninterruptible. Further analysis shows that, while tearing
down the receive buffer GPADL, the kernel waits indefinitely
in vmbus_teardown_gpadl() for a completion to be signaled.

Here is a snippet of where this occurs:
int vmbus_teardown_gpadl(struct vmbus_channel *channel, u32 gpadl_handle)
{
struct vmbus_channel_gpadl_teardown *msg;
struct vmbus_channel_msginfo *info;
unsigned long flags;
int ret;

info = kmalloc(sizeof(*info) +
sizeof(struct vmbus_channel_gpadl_teardown), GFP_KERNEL);
if (!info)
return -ENOMEM;

init_completion(&info->waitevent);
info->waiting_channel = channel;
[....]
ret = vmbus_post_msg(msg, sizeof(struct vmbus_channel_gpadl_teardown),
true);

if (ret)
goto post_msg_err;

wait_for_completion(&info->waitevent);
[....]
}

The completion is signaled from vmbus_ongpadl_torndown(), which gets called when
the corresponding message is received from the host. Apparently that message
never arrives in this case.
This patch works around the issue by restoring the first message sequence
listed above for older hosts.
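
As a rough illustration of the restored ordering, the sketch below models
netvsc_device_remove() as a function that only records the steps it would
perform. It is a user-space toy, not driver code: record_step(),
model_device_remove(), struct teardown_log and the HOST_* constants are
hypothetical stand-ins, and only the ordering follows the description above.

```c
/* Hypothetical stand-ins for vmbus_proto_version values; only their
 * relative order matters in this sketch. */
enum host_version { HOST_WS2012 = 2, HOST_WS2012R2 = 3, HOST_WS2016 = 4 };

#define MAX_STEPS 8

struct teardown_log {
	const char *steps[MAX_STEPS];
	int count;
};

/* Append one step name to the log, ignoring overflow. */
static void record_step(struct teardown_log *log, const char *step)
{
	if (log->count < MAX_STEPS)
		log->steps[log->count++] = step;
}

/* Record the teardown steps in the order described for this patch. */
static void model_device_remove(enum host_version host,
				struct teardown_log *log)
{
	int old_host = host < HOST_WS2016;

	log->count = 0;

	/* Older hosts: tear down each GPADL right after its revoke. */
	record_step(log, "revoke_recv_buf");
	if (old_host)
		record_step(log, "teardown_recv_gpadl");

	record_step(log, "revoke_send_buf");
	if (old_host)
		record_step(log, "teardown_send_gpadl");

	record_step(log, "vmbus_close");

	/* WS2016 and newer: tear down both GPADLs after the close. */
	if (!old_host) {
		record_step(log, "teardown_recv_gpadl");
		record_step(log, "teardown_send_gpadl");
	}
}
```

Calling model_device_remove(HOST_WS2012, &log) records the five steps of the
first sequence, while HOST_WS2016 records both revokes, then the close, then
both GPADL teardowns.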

Fixes: 0ef58b0a05c1 ("hv_netvsc: change GPAD teardown order on older versions")

Signed-off-by: Mohammed Gamal <[email protected]>
---
drivers/net/hyperv/netvsc.c | 19 +++++++++++++------
1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index f4df5de..df92c2f 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -592,8 +592,17 @@ void netvsc_device_remove(struct hv_device *device)
= rtnl_dereference(net_device_ctx->nvdev);
int i;

+ /*
+ * Revoke receive buffer. If host is pre-Win2016 then tear down
+ * receive buffer GPADL. Do the same for send buffer.
+ */
netvsc_revoke_recv_buf(device, net_device);
+ if (vmbus_proto_version < VERSION_WIN10)
+ netvsc_teardown_recv_gpadl(device, net_device);
+
netvsc_revoke_send_buf(device, net_device);
+ if (vmbus_proto_version < VERSION_WIN10)
+ netvsc_teardown_send_gpadl(device, net_device);

RCU_INIT_POINTER(net_device_ctx->nvdev, NULL);

@@ -607,15 +616,13 @@ void netvsc_device_remove(struct hv_device *device)
*/
netdev_dbg(ndev, "net device safe to remove\n");

- /* older versions require that buffer be revoked before close */
- if (vmbus_proto_version < VERSION_WIN10) {
- netvsc_teardown_recv_gpadl(device, net_device);
- netvsc_teardown_send_gpadl(device, net_device);
- }
-
/* Now, we can close the channel safely */
vmbus_close(device->channel);

+ /*
+ * If host is Win2016 or higher then we do the GPADL tear down
+ * here after VMBus is closed.
+ */
if (vmbus_proto_version >= VERSION_WIN10) {
netvsc_teardown_recv_gpadl(device, net_device);
netvsc_teardown_send_gpadl(device, net_device);
--
1.8.3.1


2018-04-05 19:12:10

by Mohammed Gamal

Subject: [PATCH 1/4] hv_netvsc: Use Windows version instead of NVSP version on GPAD teardown

When changing network interface settings, guests on Windows hosts
older than WS2016 can no longer shut down. This was addressed
by commit 0ef58b0a05c12 ("hv_netvsc: change GPAD teardown order
on older versions"); however, the issue also occurs on WS2012
hosts, which share NVSP protocol versions with WS2016 hosts.
Hence we use the Windows version directly to differentiate them.
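
To make the difference between the two checks concrete, here is a minimal
user-space sketch. The SKETCH_* constants, struct sketch_host and both helper
functions are hypothetical stand-ins, not the kernel's definitions, and the
NVSP version given to the older host in the usage note is an assumption made
purely for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in values; only their ordering matters here. */
enum { SKETCH_NVSP_VERSION_4 = 4, SKETCH_NVSP_VERSION_5 = 5 };
enum { SKETCH_VMBUS_WIN8 = 2, SKETCH_VMBUS_WIN10 = 4 };

struct sketch_host {
	int nvsp_version;	/* negotiated NVSP protocol version */
	int vmbus_version;	/* negotiated VMBus (host Windows) version */
};

/* Old check: decide from the NVSP protocol version. */
static bool old_teardown_before_close(const struct sketch_host *h)
{
	return h->nvsp_version < SKETCH_NVSP_VERSION_4;
}

/* New check: decide from the host's Windows (VMBus) version. */
static bool new_teardown_before_close(const struct sketch_host *h)
{
	return h->vmbus_version < SKETCH_VMBUS_WIN10;
}
```

For an older host that is assumed to negotiate a recent NVSP version, such as
{ SKETCH_NVSP_VERSION_5, SKETCH_VMBUS_WIN8 }, the old check returns false and
the new check returns true, so only the new check selects the teardown order
meant for older hosts.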

Fixes: 0ef58b0a05c12 ("hv_netvsc: change GPAD teardown order on older versions")

Signed-off-by: Mohammed Gamal <[email protected]>
---
drivers/net/hyperv/netvsc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index c9910c3..d65b7fc 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -590,13 +590,13 @@ void netvsc_device_remove(struct hv_device *device)
netdev_dbg(ndev, "net device safe to remove\n");

/* older versions require that buffer be revoked before close */
- if (net_device->nvsp_version < NVSP_PROTOCOL_VERSION_4)
+ if (vmbus_proto_version < VERSION_WIN10)
netvsc_teardown_gpadl(device, net_device);

/* Now, we can close the channel safely */
vmbus_close(device->channel);

- if (net_device->nvsp_version >= NVSP_PROTOCOL_VERSION_4)
+ if (vmbus_proto_version >= VERSION_WIN10)
netvsc_teardown_gpadl(device, net_device);

/* Release all resources */
--
1.8.3.1


2018-04-06 02:22:43

by David Miller

Subject: Re: [PATCH 0/4] hv_netvsc: Fix shutdown issues on older Windows hosts

From: Mohammed Gamal <[email protected]>
Date: Thu, 5 Apr 2018 21:09:17 +0200

> Guests running on WS2012 hosts would not shut down after a change to
> network interface settings (e.g. number of channels, MTU, etc.).
>
> This patch series addresses the shutdown issues we encountered with WS2012
> hosts. It is essentially a rework of the series sent in
> https://lkml.org/lkml/2018/1/23/111 on top of the latest upstream.
>
> Fixes: 0ef58b0a05c1 ("hv_netvsc: change GPAD teardown order on older versions")

Series applied, thank you.

2018-04-10 13:54:39

by Sasha Levin

Subject: Re: [PATCH 1/4] hv_netvsc: Use Windows version instead of NVSP version on GPAD teardown

Hi,

[This is an automated email]

This commit has been processed because it contains a "Fixes:" tag,
fixing commit: 0ef58b0a05c1 hv_netvsc: change GPAD teardown order on older versions.

The bot has also determined it's probably a bug fixing patch. (score: 19.6070)

The bot has tested the following trees: v4.16.1.

v4.16.1: Build OK!

--
Thanks,
Sasha

2018-04-10 14:00:15

by Sasha Levin

Subject: Re: [PATCH 2/4] hv_netvsc: Split netvsc_revoke_buf() and netvsc_teardown_gpadl()

Hi,

[This is an automated email]

This commit has been processed because it contains a "Fixes:" tag,
fixing commit: 0ef58b0a05c1 hv_netvsc: change GPAD teardown order on older versions.

The bot has also determined it's probably a bug fixing patch. (score: 3.6623)

The bot has tested the following trees: v4.16.1.

v4.16.1: Failed to apply! Possible dependencies:
2afc5d61a719 ("hv_netvsc: Use Windows version instead of NVSP version on GPAD teardown")


--
Thanks,
Sasha

2018-04-10 14:01:00

by Sasha Levin

Subject: Re: [PATCH 3/4] hv_netvsc: Ensure correct teardown message sequence order

Hi,

[This is an automated email]

This commit has been processed because it contains a "Fixes:" tag,
fixing commit: 0ef58b0a05c1 hv_netvsc: change GPAD teardown order on older versions.

The bot has also determined it's probably a bug fixing patch. (score: 60.7987)

The bot has tested the following trees: v4.16.1.

v4.16.1: Failed to apply! Possible dependencies:
7992894c305e ("hv_netvsc: Split netvsc_revoke_buf() and netvsc_teardown_gpadl()")


--
Thanks,
Sasha