2017-04-30 23:21:33

by kys

[permalink] [raw]
Subject: [PATCH 0/6] Drivers: hv: Miscellaneous fixes

From: K. Y. Srinivasan <[email protected]>

Miscellaneous fixes to vmbus and util drivers.

Alex Ng (1):
Tools: hv: vss: Thaw the filesystem and continue if freeze call has
timed out

K. Y. Srinivasan (3):
Drivers: hv: vmbus: Fix error code returned by vmbus_post_msg()
Drivers: hv: util: Make hv_poll_channel() a little more efficient
Drivers: hv: vmbus: Fix rescind handling

Long Li (1):
HV: properly delay KVP packets when negotiation is in progress

Vitaly Kuznetsov (1):
tools: hv: properly handle long paths

drivers/hv/channel.c | 8 ++++-
drivers/hv/channel_mgmt.c | 69 ++++++++++++++++++++++++++++++++++----------
drivers/hv/connection.c | 9 ++++--
drivers/hv/hv_kvp.c | 14 +++++----
drivers/hv/hyperv_vmbus.h | 11 +++++++
drivers/hv/vmbus_drv.c | 29 ++++++++++++++++++-
tools/hv/hv_kvp_daemon.c | 44 ++++++++++++-----------------
tools/hv/hv_vss_daemon.c | 4 ++-
8 files changed, 133 insertions(+), 55 deletions(-)


2017-04-30 23:22:30

by kys

[permalink] [raw]
Subject: [PATCH 1/6] Tools: hv: vss: Thaw the filesystem and continue if freeze call has timed out

From: Alex Ng <[email protected]>

If a FREEZE operation takes too long, the driver may time out and move on
to another operation. The daemon is unaware of this and attempts to
notify the driver that the FREEZE succeeded. This results in an error from
the driver and the daemon leaves the filesystem in frozen state.

Fix this by thawing the filesystem and continuing.

Signed-off-by: Michael Gissing <[email protected]>
Signed-off-by: Alex Ng <[email protected]>
Signed-off-by: K. Y. Srinivasan <[email protected]>
---
tools/hv/hv_vss_daemon.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/tools/hv/hv_vss_daemon.c b/tools/hv/hv_vss_daemon.c
index e082980..7ba5419 100644
--- a/tools/hv/hv_vss_daemon.c
+++ b/tools/hv/hv_vss_daemon.c
@@ -261,7 +261,9 @@ int main(int argc, char *argv[])
if (len != sizeof(struct hv_vss_msg)) {
syslog(LOG_ERR, "write failed; error: %d %s", errno,
strerror(errno));
- exit(EXIT_FAILURE);
+
+ if (op == VSS_OP_FREEZE)
+ vss_operate(VSS_OP_THAW);
}
}

--
1.7.1

2017-04-30 23:22:47

by kys

[permalink] [raw]
Subject: [PATCH 6/6] HV: properly delay KVP packets when negotiation is in progress

From: Long Li <[email protected]>

The host may send multiple negotiation packets
(due to timeout) before the KVP user-mode daemon
is connected. KVP user-mode daemon is connected.
We need to defer processing those packets
until the daemon is negotiated and connected.
It's okay for guest to respond
to all negotiation packets.

In addition, the host may send multiple staged
KVP requests as soon as negotiation is done.
We need to properly process those packets using one
tasklet for exclusive access to ring buffer.

This patch is based on the work of
Nick Meier <[email protected]>.

Signed-off-by: Long Li <[email protected]>
Signed-off-by: K. Y. Srinivasan <[email protected]>
---
drivers/hv/hv_kvp.c | 14 ++++++++------
1 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c
index e99ff2d..9a90b91 100644
--- a/drivers/hv/hv_kvp.c
+++ b/drivers/hv/hv_kvp.c
@@ -112,7 +112,7 @@ static void kvp_poll_wrapper(void *channel)
{
/* Transaction is finished, reset the state here to avoid races. */
kvp_transaction.state = HVUTIL_READY;
- hv_kvp_onchannelcallback(channel);
+ tasklet_schedule(&((struct vmbus_channel *)channel)->callback_event);
}

static void kvp_register_done(void)
@@ -159,7 +159,7 @@ static void kvp_timeout_func(struct work_struct *dummy)

static void kvp_host_handshake_func(struct work_struct *dummy)
{
- hv_poll_channel(kvp_transaction.recv_channel, hv_kvp_onchannelcallback);
+ tasklet_schedule(&kvp_transaction.recv_channel->callback_event);
}

static int kvp_handle_handshake(struct hv_kvp_msg *msg)
@@ -625,16 +625,17 @@ void hv_kvp_onchannelcallback(void *context)
NEGO_IN_PROGRESS,
NEGO_FINISHED} host_negotiatied = NEGO_NOT_STARTED;

- if (host_negotiatied == NEGO_NOT_STARTED &&
- kvp_transaction.state < HVUTIL_READY) {
+ if (kvp_transaction.state < HVUTIL_READY) {
/*
* If userspace daemon is not connected and host is asking
* us to negotiate we need to delay to not lose messages.
* This is important for Failover IP setting.
*/
- host_negotiatied = NEGO_IN_PROGRESS;
- schedule_delayed_work(&kvp_host_handshake_work,
+ if (host_negotiatied == NEGO_NOT_STARTED) {
+ host_negotiatied = NEGO_IN_PROGRESS;
+ schedule_delayed_work(&kvp_host_handshake_work,
HV_UTIL_NEGO_TIMEOUT * HZ);
+ }
return;
}
if (kvp_transaction.state > HVUTIL_READY)
@@ -702,6 +703,7 @@ void hv_kvp_onchannelcallback(void *context)
VM_PKT_DATA_INBAND, 0);

host_negotiatied = NEGO_FINISHED;
+ hv_poll_channel(kvp_transaction.recv_channel, kvp_poll_wrapper);
}

}
--
1.7.1

2017-04-30 23:22:59

by kys

[permalink] [raw]
Subject: [PATCH 2/6] tools: hv: properly handle long paths

From: Vitaly Kuznetsov <[email protected]>

Paths can be up to PATH_MAX long and PATH_MAX is usually greater than 256.
While on it, simplify path reconstruction to a simple snprintf(), define
and reuse KVP_NET_DIR.

Suggested-by: Tomas Hozza <[email protected]>
Signed-off-by: Vitaly Kuznetsov <[email protected]>
Signed-off-by: K. Y. Srinivasan <[email protected]>
---
tools/hv/hv_kvp_daemon.c | 44 ++++++++++++++++++--------------------------
1 files changed, 18 insertions(+), 26 deletions(-)

diff --git a/tools/hv/hv_kvp_daemon.c b/tools/hv/hv_kvp_daemon.c
index f1758fc..88b20e0 100644
--- a/tools/hv/hv_kvp_daemon.c
+++ b/tools/hv/hv_kvp_daemon.c
@@ -39,6 +39,7 @@
#include <fcntl.h>
#include <dirent.h>
#include <net/if.h>
+#include <limits.h>
#include <getopt.h>

/*
@@ -97,6 +98,8 @@ enum {
#define KVP_SCRIPTS_PATH "/usr/libexec/hypervkvpd/"
#endif

+#define KVP_NET_DIR "/sys/class/net/"
+
#define MAX_FILE_NAME 100
#define ENTRIES_PER_BLOCK 50

@@ -596,26 +599,21 @@ void kvp_get_os_info(void)
DIR *dir;
struct dirent *entry;
FILE *file;
- char *p, *q, *x;
+ char *p, *x;
char *if_name = NULL;
char buf[256];
- char *kvp_net_dir = "/sys/class/net/";
- char dev_id[256];
+ char dev_id[PATH_MAX];

- dir = opendir(kvp_net_dir);
+ dir = opendir(KVP_NET_DIR);
if (dir == NULL)
return NULL;

- snprintf(dev_id, sizeof(dev_id), "%s", kvp_net_dir);
- q = dev_id + strlen(kvp_net_dir);
-
while ((entry = readdir(dir)) != NULL) {
/*
* Set the state for the next pass.
*/
- *q = '\0';
- strcat(dev_id, entry->d_name);
- strcat(dev_id, "/device/device_id");
+ snprintf(dev_id, sizeof(dev_id), "%s%s/device/device_id",
+ KVP_NET_DIR, entry->d_name);

file = fopen(dev_id, "r");
if (file == NULL)
@@ -653,12 +651,12 @@ void kvp_get_os_info(void)
FILE *file;
char *p, *x;
char buf[256];
- char addr_file[256];
+ char addr_file[PATH_MAX];
unsigned int i;
char *mac_addr = NULL;

- snprintf(addr_file, sizeof(addr_file), "%s%s%s", "/sys/class/net/",
- if_name, "/address");
+ snprintf(addr_file, sizeof(addr_file), "%s%s%s", KVP_NET_DIR,
+ if_name, "/address");

file = fopen(addr_file, "r");
if (file == NULL)
@@ -688,28 +686,22 @@ void kvp_get_os_info(void)
DIR *dir;
struct dirent *entry;
FILE *file;
- char *p, *q, *x;
+ char *p, *x;
char *if_name = NULL;
char buf[256];
- char *kvp_net_dir = "/sys/class/net/";
- char dev_id[256];
+ char dev_id[PATH_MAX];
unsigned int i;

- dir = opendir(kvp_net_dir);
+ dir = opendir(KVP_NET_DIR);
if (dir == NULL)
return NULL;

- snprintf(dev_id, sizeof(dev_id), "%s", kvp_net_dir);
- q = dev_id + strlen(kvp_net_dir);
-
while ((entry = readdir(dir)) != NULL) {
/*
* Set the state for the next pass.
*/
- *q = '\0';
-
- strcat(dev_id, entry->d_name);
- strcat(dev_id, "/address");
+ snprintf(dev_id, sizeof(dev_id), "%s%s/address", KVP_NET_DIR,
+ entry->d_name);

file = fopen(dev_id, "r");
if (file == NULL)
@@ -1218,9 +1210,9 @@ static int process_ip_string(FILE *f, char *ip_string, int type)
static int kvp_set_ip_info(char *if_name, struct hv_kvp_ipaddr_value *new_val)
{
int error = 0;
- char if_file[128];
+ char if_file[PATH_MAX];
FILE *file;
- char cmd[512];
+ char cmd[PATH_MAX];
char *mac_addr;

/*
--
1.7.1

2017-04-30 23:34:03

by kys

[permalink] [raw]
Subject: [PATCH 4/6] Drivers: hv: util: Make hv_poll_channel() a little more efficient

From: K. Y. Srinivasan <[email protected]>

The current code unconditionally sends an IPI. If we are running on the
correct CPU and are in interrupt level, we don't need an IPI.
Make this adjustment.

Signed-off-by: K. Y. Srinivasan <[email protected]>
---
drivers/hv/hyperv_vmbus.h | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index 6113e91..fa514be 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -411,6 +411,10 @@ static inline void hv_poll_channel(struct vmbus_channel *channel,
if (!channel)
return;

+ if (in_interrupt() && (channel->target_cpu == smp_processor_id())) {
+ cb(channel);
+ return;
+ }
smp_call_function_single(channel->target_cpu, cb, channel, true);
}

--
1.7.1

2017-04-30 23:34:11

by kys

[permalink] [raw]
Subject: [PATCH 5/6] Drivers: hv: vmbus: Fix rescind handling

From: K. Y. Srinivasan <[email protected]>

Fix the rescind handling. This patch addresses the following rescind
scenario that is currently not handled correctly:

If a rescind were to be received while the offer is still being
peocessed, we will be blocked indefinitely since the rescind message
is handled on the same work element as the offer message. Fix this
issue.

I would like to thank Dexuan Cui <[email protected]> and
Long Li <[email protected]> for working with me on this patch.

Signed-off-by: K. Y. Srinivasan <[email protected]>
---
drivers/hv/channel.c | 8 ++++-
drivers/hv/channel_mgmt.c | 69 ++++++++++++++++++++++++++++++++++----------
drivers/hv/connection.c | 7 +++-
drivers/hv/hyperv_vmbus.h | 7 ++++
drivers/hv/vmbus_drv.c | 29 ++++++++++++++++++-
5 files changed, 99 insertions(+), 21 deletions(-)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index 736ac76..e9bf0bb 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -630,9 +630,13 @@ void vmbus_close(struct vmbus_channel *channel)
*/
list_for_each_safe(cur, tmp, &channel->sc_list) {
cur_channel = list_entry(cur, struct vmbus_channel, sc_list);
- if (cur_channel->state != CHANNEL_OPENED_STATE)
- continue;
vmbus_close_internal(cur_channel);
+ if (cur_channel->rescind) {
+ mutex_lock(&vmbus_connection.channel_mutex);
+ hv_process_channel_removal(cur_channel,
+ cur_channel->offermsg.child_relid);
+ mutex_unlock(&vmbus_connection.channel_mutex);
+ }
}
/*
* Now close the primary.
diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index eec616a..06529b3 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -428,7 +428,6 @@ void vmbus_free_channels(void)
{
struct vmbus_channel *channel, *tmp;

- mutex_lock(&vmbus_connection.channel_mutex);
list_for_each_entry_safe(channel, tmp, &vmbus_connection.chn_list,
listentry) {
/* hv_process_channel_removal() needs this */
@@ -436,7 +435,6 @@ void vmbus_free_channels(void)

vmbus_device_unregister(channel->device_obj);
}
- mutex_unlock(&vmbus_connection.channel_mutex);
}

/*
@@ -483,8 +481,10 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel)
list_add_tail(&newchannel->sc_list, &channel->sc_list);
channel->num_sc++;
spin_unlock_irqrestore(&channel->lock, flags);
- } else
+ } else {
+ atomic_dec(&vmbus_connection.offer_in_progress);
goto err_free_chan;
+ }
}

dev_type = hv_get_dev_type(newchannel);
@@ -511,6 +511,7 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel)
if (!fnew) {
if (channel->sc_creation_callback != NULL)
channel->sc_creation_callback(newchannel);
+ atomic_dec(&vmbus_connection.offer_in_progress);
return;
}

@@ -532,9 +533,7 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel)
* binding which eventually invokes the device driver's AddDevice()
* method.
*/
- mutex_lock(&vmbus_connection.channel_mutex);
ret = vmbus_device_register(newchannel->device_obj);
- mutex_unlock(&vmbus_connection.channel_mutex);

if (ret != 0) {
pr_err("unable to add child device object (relid %d)\n",
@@ -542,6 +541,8 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel)
kfree(newchannel->device_obj);
goto err_deq_chan;
}
+
+ atomic_dec(&vmbus_connection.offer_in_progress);
return;

err_deq_chan:
@@ -799,6 +800,7 @@ static void vmbus_onoffer(struct vmbus_channel_message_header *hdr)
newchannel = alloc_channel();
if (!newchannel) {
vmbus_release_relid(offer->child_relid);
+ atomic_dec(&vmbus_connection.offer_in_progress);
pr_err("Unable to allocate channel object\n");
return;
}
@@ -845,16 +847,38 @@ static void vmbus_onoffer_rescind(struct vmbus_channel_message_header *hdr)

rescind = (struct vmbus_channel_rescind_offer *)hdr;

+ /*
+ * The offer msg and the corresponding rescind msg
+ * from the host are guranteed to be ordered -
+ * offer comes in first and then the rescind.
+ * Since we process these events in work elements,
+ * and with preemption, we may end up processing
+ * the events out of order. Given that we handle these
+ * work elements on the same CPU, this is possible only
+ * in the case of preemption. In any case wait here
+ * until the offer processing has moved beyond the
+ * point where the channel is discoverable.
+ */
+
+ while (atomic_read(&vmbus_connection.offer_in_progress) != 0) {
+ /*
+ * We wait here until any channel offer is currently
+ * being processed.
+ */
+ msleep(1);
+ }
+
mutex_lock(&vmbus_connection.channel_mutex);
channel = relid2channel(rescind->child_relid);
+ mutex_unlock(&vmbus_connection.channel_mutex);

if (channel == NULL) {
/*
- * This is very impossible, because in
- * vmbus_process_offer(), we have already invoked
- * vmbus_release_relid() on error.
+ * We failed in processing the offer message;
+ * we would have cleaned up the relid in that
+ * failure path.
*/
- goto out;
+ return;
}

spin_lock_irqsave(&channel->lock, flags);
@@ -866,7 +890,7 @@ static void vmbus_onoffer_rescind(struct vmbus_channel_message_header *hdr)
if (channel->device_obj) {
if (channel->chn_rescind_callback) {
channel->chn_rescind_callback(channel);
- goto out;
+ return;
}
/*
* We will have to unregister this device from the
@@ -877,13 +901,26 @@ static void vmbus_onoffer_rescind(struct vmbus_channel_message_header *hdr)
vmbus_device_unregister(channel->device_obj);
put_device(dev);
}
- } else {
- hv_process_channel_removal(channel,
- channel->offermsg.child_relid);
}
-
-out:
- mutex_unlock(&vmbus_connection.channel_mutex);
+ if (channel->primary_channel != NULL) {
+ /*
+ * Sub-channel is being rescinded. Following is the channel
+ * close sequence when initiated from the driveri (refer to
+ * vmbus_close() for details):
+ * 1. Close all sub-channels first
+ * 2. Then close the primary channel.
+ */
+ if (channel->state == CHANNEL_OPEN_STATE) {
+ /*
+ * The channel is currently not open;
+ * it is safe for us to cleanup the channel.
+ */
+ mutex_lock(&vmbus_connection.channel_mutex);
+ hv_process_channel_removal(channel,
+ channel->offermsg.child_relid);
+ mutex_unlock(&vmbus_connection.channel_mutex);
+ }
+ }
}

void vmbus_hvsock_device_unregister(struct vmbus_channel *channel)
diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
index a938fcf..c2d74ee 100644
--- a/drivers/hv/connection.c
+++ b/drivers/hv/connection.c
@@ -93,10 +93,13 @@ static int vmbus_negotiate_version(struct vmbus_channel_msginfo *msginfo,
* all the CPUs. This is needed for kexec to work correctly where
* the CPU attempting to connect may not be CPU 0.
*/
- if (version >= VERSION_WIN8_1)
+ if (version >= VERSION_WIN8_1) {
msg->target_vcpu = hv_context.vp_index[smp_processor_id()];
- else
+ vmbus_connection.connect_cpu = smp_processor_id();
+ } else {
msg->target_vcpu = 0;
+ vmbus_connection.connect_cpu = 0;
+ }

/*
* Add to list before we send the request since we may
diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index fa514be..1b6a5e0 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -303,6 +303,13 @@ enum vmbus_connect_state {
#define MAX_SIZE_CHANNEL_MESSAGE HV_MESSAGE_PAYLOAD_BYTE_COUNT

struct vmbus_connection {
+ /*
+ * CPU on which the initial host contact was made.
+ */
+ int connect_cpu;
+
+ atomic_t offer_in_progress;
+
enum vmbus_connect_state conn_state;

atomic_t next_gpadl_handle;
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 0087b49..59bb3ef 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -798,8 +798,10 @@ static void vmbus_device_release(struct device *device)
struct hv_device *hv_dev = device_to_hv_device(device);
struct vmbus_channel *channel = hv_dev->channel;

+ mutex_lock(&vmbus_connection.channel_mutex);
hv_process_channel_removal(channel,
channel->offermsg.child_relid);
+ mutex_unlock(&vmbus_connection.channel_mutex);
kfree(hv_dev);

}
@@ -877,7 +879,32 @@ void vmbus_on_msg_dpc(unsigned long data)
INIT_WORK(&ctx->work, vmbus_onmessage_work);
memcpy(&ctx->msg, msg, sizeof(*msg));

- queue_work(vmbus_connection.work_queue, &ctx->work);
+ /*
+ * The host can generate a rescind message while we
+ * may still be handling the original offer. We deal with
+ * this condition by ensuring the processing is done on the
+ * same CPU.
+ */
+ switch (hdr->msgtype) {
+ case CHANNELMSG_RESCIND_CHANNELOFFER:
+ /*
+ * If we are handling the rescind message;
+ * schedule the work on the global work queue.
+ */
+ schedule_work_on(vmbus_connection.connect_cpu,
+ &ctx->work);
+ break;
+
+ case CHANNELMSG_OFFERCHANNEL:
+ atomic_inc(&vmbus_connection.offer_in_progress);
+ queue_work_on(vmbus_connection.connect_cpu,
+ vmbus_connection.work_queue,
+ &ctx->work);
+ break;
+
+ default:
+ queue_work(vmbus_connection.work_queue, &ctx->work);
+ }
} else
entry->message_handler(hdr);

--
1.7.1

2017-04-30 23:33:49

by kys

[permalink] [raw]
Subject: [PATCH 3/6] Drivers: hv: vmbus: Fix error code returned by vmbus_post_msg()

From: K. Y. Srinivasan <[email protected]>

ENOBUFS is a more approrpiate error code to be returned
when the hypervisor cannot post the message because of
insufficient buffers. Make the adjustment.

Signed-off-by: K. Y. Srinivasan <[email protected]>
---
drivers/hv/connection.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
index fce27fb..a938fcf 100644
--- a/drivers/hv/connection.c
+++ b/drivers/hv/connection.c
@@ -370,7 +370,7 @@ int vmbus_post_msg(void *buffer, size_t buflen, bool can_sleep)
break;
case HV_STATUS_INSUFFICIENT_MEMORY:
case HV_STATUS_INSUFFICIENT_BUFFERS:
- ret = -ENOMEM;
+ ret = -ENOBUFS;
break;
case HV_STATUS_SUCCESS:
return ret;
--
1.7.1