2019-01-09 22:37:07

by Dexuan Cui

Subject: [PATCH][re-post] vmbus: fix subchannel removal


The changes to split ring allocation from open/close broke
the cleanup of subchannels. This resulted in problems using
uio on network devices because the subchannel was left behind
when the network device was unbound.

The cause was in the disconnect logic, which used list splice
to move the subchannel list into a local variable. This won't
work because the subchannel list is needed later, while the
rescind messages are being processed (relid2channel).

The fix is to just leave the subchannel list in place,
which is what the original code did. The list is cleaned
up later when the host rescind is processed.

Without the fix, we have a lot of "hang" issues in netvsc when we
try to change the NIC's MTU, set the number of channels, etc.

Fixes: ae6935ed7d42 ("vmbus: split ring buffer allocation from open")
Cc: [email protected]
Signed-off-by: Stephen Hemminger <[email protected]>
Signed-off-by: Dexuan Cui <[email protected]>

---

The patch was first posted on Dec 7, 2018:
https://www.spinics.net/lists/linux-driver-devel/msg120802.html
but it looks like it was overlooked.

I have now rebased it onto v5.0-rc1: the line "channel->num_sc = 0;"
in the original patch had to be removed because of
4d3c5c69191f ("Drivers: hv: vmbus: Remove the useless API vmbus_get_outgoing_channel()")

When the patch is backported to v4.20.1, there will be a conflict because
v4.20.1 doesn't have 4d3c5c69191f. I suggest we cherry-pick 4d3c5c69191f
into v4.20.1 before we backport this patch.
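
For illustration only, here is a minimal userspace sketch of the failure
described in the commit message: once the subchannel list is spliced into a
local list head, a later lookup that walks the primary channel's sc_list no
longer finds the subchannel. The struct layout, the list helpers, and the
names find_subchannel() and lookup_survives_disconnect() are hypothetical
stand-ins written for this sketch, not the kernel's code.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's circular doubly linked list. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* Move all entries of src onto dst and leave src empty, in the spirit
 * of the kernel's list_splice_init(). */
static void list_splice_init(struct list_head *src, struct list_head *dst)
{
    struct list_head *first = src->next, *last = src->prev;

    if (first == src)
        return;                 /* nothing to move */
    first->prev = dst;
    last->next = dst->next;
    dst->next->prev = last;
    dst->next = first;
    list_init(src);
}

/* Hypothetical, heavily reduced channel: sc_list is the list head on a
 * primary channel and the link node on a subchannel. */
struct channel {
    unsigned int relid;
    struct list_head sc_list;
};

/* Assumed lookup: walk the primary's sc_list looking for a relid. */
static struct channel *find_subchannel(struct channel *primary,
                                       unsigned int relid)
{
    struct list_head *pos;

    for (pos = primary->sc_list.next; pos != &primary->sc_list;
         pos = pos->next) {
        struct channel *sc = (struct channel *)
            ((char *)pos - offsetof(struct channel, sc_list));
        if (sc->relid == relid)
            return sc;
    }
    return NULL;
}

/* Returns 1 if subchannel 7 is still reachable from the primary after
 * the "disconnect" step, 0 otherwise. */
static int lookup_survives_disconnect(int splice_to_local)
{
    struct channel primary = { .relid = 1 };
    struct channel sub = { .relid = 7 };
    struct list_head local;

    list_init(&primary.sc_list);
    list_init(&local);
    list_add_tail(&sub.sc_list, &primary.sc_list);
    assert(find_subchannel(&primary, 7) == &sub);

    if (splice_to_local)        /* behaviour removed by this patch */
        list_splice_init(&primary.sc_list, &local);

    return find_subchannel(&primary, 7) != NULL;
}
```

Calling lookup_survives_disconnect(1) models the snapshot approach and the
lookup fails; lookup_survives_disconnect(0) models iterating the list in
place, as this patch does, and the lookup still succeeds.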

drivers/hv/channel.c | 9 +--------
1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index ce0ba20..bea4c98 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -701,19 +701,12 @@ static int vmbus_close_internal(struct vmbus_channel *channel)
 int vmbus_disconnect_ring(struct vmbus_channel *channel)
 {
 	struct vmbus_channel *cur_channel, *tmp;
-	unsigned long flags;
-	LIST_HEAD(list);
 	int ret;

 	if (channel->primary_channel != NULL)
 		return -EINVAL;

-	/* Snapshot the list of subchannels */
-	spin_lock_irqsave(&channel->lock, flags);
-	list_splice_init(&channel->sc_list, &list);
-	spin_unlock_irqrestore(&channel->lock, flags);
-
-	list_for_each_entry_safe(cur_channel, tmp, &list, sc_list) {
+	list_for_each_entry_safe(cur_channel, tmp, &channel->sc_list, sc_list) {
 		if (cur_channel->rescind)
 			wait_for_completion(&cur_channel->rescind_event);

--
2.7.4



2019-01-10 00:31:05

by Sasha Levin

Subject: Re: [PATCH][re-post] vmbus: fix subchannel removal

On Wed, Jan 09, 2019 at 08:56:06PM +0000, Dexuan Cui wrote:
>
>The changes to split ring allocation from open/close, broke
>the cleanup of subchannels. This resulted in problems using
>uio on network devices because the subchannel was left behind
>when the network device was unbound.
>
>The cause was in the disconnect logic which used list splice
>to move the subchannel list into a local variable. This won't
>work because the subchannel list is needed later during the
>process of the rescind messages (relid2channel).
>
>The fix is to just leave the subchannel list in place
>which is what the original code did. The list is cleaned
>up later when the host rescind is processed.
>
>Without the fix, we have a lot of "hang" issues in netvsc when we
>try to change the NIC's MTU, set the number of channels, etc.
>
>Fixes: ae6935ed7d42 ("vmbus: split ring buffer allocation from open")
>Cc: [email protected]
>Signed-off-by: Stephen Hemminger <[email protected]>
>Signed-off-by: Dexuan Cui <[email protected]>

Queued, thank you.

--
Thanks,
Sasha