2017-09-18 03:51:03

by kys

[permalink] [raw]
Subject: [PATCH V2 0/4] Drivers: hv: Miscellaneous fixes

From: "K. Y. Srinivasan" <[email protected]>

Miscellaneous fixes.

V2: Dropped the patch: "vmbus: suppress uevents for hv_sock devices" as
this was only to address test trigerred issues.

Dexuan Cui (1):
vmbus: don't acquire the mutex in vmbus_hvsock_device_unregister()

Olaf Hering (1):
Drivers: hv: fcopy: restore correct transfer length

Stephen Hemminger (2):
vmbus: add per-channel sysfs info
Drivers: hv: vmbus: Expose per-channel interrupts and events counters

Documentation/ABI/stable/sysfs-bus-vmbus | 70 +++++++++++
drivers/hv/channel_mgmt.c | 14 ++-
drivers/hv/connection.c | 2 +
drivers/hv/hv_fcopy.c | 4 +
drivers/hv/hyperv_vmbus.h | 2 +
drivers/hv/vmbus_drv.c | 203 +++++++++++++++++++++++++++++--
include/linux/hyperv.h | 10 ++
7 files changed, 288 insertions(+), 17 deletions(-)

--
2.14.1


2017-09-18 03:55:25

by kys

[permalink] [raw]
Subject: [PATCH V2 1/4] vmbus: don't acquire the mutex in vmbus_hvsock_device_unregister()

From: Dexuan Cui <[email protected]>

Due to commit 54a66265d675 ("Drivers: hv: vmbus: Fix rescind handling"),
we need this patch to resolve the below deadlock:

after we get the mutex in vmbus_hvsock_device_unregister() and call
vmbus_device_unregister() -> device_unregister() -> ... -> device_release()
-> vmbus_device_release(), we'll get a deadlock, because
vmbus_device_release() tries to get the same mutex.

Signed-off-by: Dexuan Cui <[email protected]>
Cc: K. Y. Srinivasan <[email protected]>
Cc: Haiyang Zhang <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Signed-off-by: K. Y. Srinivasan <[email protected]>
---
drivers/hv/channel_mgmt.c | 4 ----
1 file changed, 4 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 968af173c4c1..624d815745e4 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -945,14 +945,10 @@ static void vmbus_onoffer_rescind(struct vmbus_channel_message_header *hdr)

void vmbus_hvsock_device_unregister(struct vmbus_channel *channel)
{
- mutex_lock(&vmbus_connection.channel_mutex);
-
BUG_ON(!is_hvsock_channel(channel));

channel->rescind = true;
vmbus_device_unregister(channel->device_obj);
-
- mutex_unlock(&vmbus_connection.channel_mutex);
}
EXPORT_SYMBOL_GPL(vmbus_hvsock_device_unregister);

--
2.14.1

2017-09-18 03:55:24

by kys

[permalink] [raw]
Subject: [PATCH V2 2/4] Drivers: hv: fcopy: restore correct transfer length

From: Olaf Hering <[email protected]>

Till recently the expected length of bytes read by the
daemon did depend on the context. It was either hv_start_fcopy or
hv_do_fcopy. The daemon had a buffer size of two pages, which was much
larger than needed.

Now the expected length of bytes read by the
daemon changed slightly. For START_FILE_COPY it is still the size of
hv_start_fcopy. But for WRITE_TO_FILE and the other operations it is as
large as the buffer that arrived via vmbus. In case of WRITE_TO_FILE
that is slightly larger than a struct hv_do_fcopy. Since the buffer in
the daemon was still larger everything was fine.

Currently, the daemon reads only what is actually needed.
The new buffer layout is as large as a struct hv_do_fcopy, for the
WRITE_TO_FILE operation. Since the kernel expects a slightly larger
size, hvt_op_read will return -EINVAL because the daemon will read
slightly less than expected. Address this by restoring the expected
buffer size in case of WRITE_TO_FILE.

Fixes: 'commit c7e490fc23eb ("Drivers: hv: fcopy: convert to hv_utils_transport")'
Fixes: 'commit 3f2baa8a7d2e ("Tools: hv: update buffer handling in hv_fcopy_daemon")'

Signed-off-by: Olaf Hering <[email protected]>
Signed-off-by: K. Y. Srinivasan <[email protected]>
---
drivers/hv/hv_fcopy.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c
index daa75bd41f86..2364281d8593 100644
--- a/drivers/hv/hv_fcopy.c
+++ b/drivers/hv/hv_fcopy.c
@@ -170,6 +170,10 @@ static void fcopy_send_data(struct work_struct *dummy)
out_src = smsg_out;
break;

+ case WRITE_TO_FILE:
+ out_src = fcopy_transaction.fcopy_msg;
+ out_len = sizeof(struct hv_do_fcopy);
+ break;
default:
out_src = fcopy_transaction.fcopy_msg;
out_len = fcopy_transaction.recv_len;
--
2.14.1

2017-09-18 03:55:44

by kys

[permalink] [raw]
Subject: [PATCH V2 4/4] Drivers: hv: vmbus: Expose per-channel event counters events counters

From: Stephen Hemminger <[email protected]>

When investigating performance, it is useful to be able to look at
the number of host and guest events per-channel. This is equivalent
to per-device interrupt statistics.

Signed-off-by: Stephen Hemminger <[email protected]>
Signed-off-by: K. Y. Srinivasan <[email protected]>
---
Documentation/ABI/stable/sysfs-bus-vmbus | 14 ++++++++++++++
drivers/hv/connection.c | 2 ++
drivers/hv/vmbus_drv.c | 18 ++++++++++++++++++
include/linux/hyperv.h | 4 ++++
4 files changed, 38 insertions(+)

diff --git a/Documentation/ABI/stable/sysfs-bus-vmbus b/Documentation/ABI/stable/sysfs-bus-vmbus
index 0ebd8a1537a0..d4eca1717adb 100644
--- a/Documentation/ABI/stable/sysfs-bus-vmbus
+++ b/Documentation/ABI/stable/sysfs-bus-vmbus
@@ -97,3 +97,17 @@ KernelVersion: 4.14
Contact: Stephen Hemminger <[email protected]>
Description: Bytes availabble to write
Users: Debuggig tools
+
+What: /sys/bus/vmbus/devices/vmbus_*/channels/relid/events
+Date: September. 2017
+KernelVersion: 4.14
+Contact: Stephen Hemminger <[email protected]>
+Description: Number of times we have signaled the host
+Users: Debuggig tools
+
+What: /sys/bus/vmbus/devices/vmbus_*/channels/relid/interrupts
+Date: September. 2017
+KernelVersion: 4.14
+Contact: Stephen Hemminger <[email protected]>
+Description: Number of times we have taken an interrupt (incoming)
+Users: Debuggig tools
diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
index 59c11ff90d12..2bb0d03c5212 100644
--- a/drivers/hv/connection.c
+++ b/drivers/hv/connection.c
@@ -406,6 +406,8 @@ void vmbus_set_event(struct vmbus_channel *channel)
if (!channel->is_dedicated_interrupt)
vmbus_send_interrupt(child_relid);

+ ++channel->sig_events;
+
hv_do_hypercall(HVCALL_SIGNAL_EVENT, channel->sig_event, NULL);
}
EXPORT_SYMBOL_GPL(vmbus_set_event);
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index c126d3937414..88c4f5e98fc9 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -945,6 +945,8 @@ static void vmbus_chan_sched(struct hv_per_cpu_context *hv_cpu)
if (channel->rescind)
continue;

+ ++channel->interrupts;
+
switch (channel->callback_mode) {
case HV_CALL_ISR:
vmbus_channel_isr(channel);
@@ -1238,6 +1240,20 @@ static ssize_t channel_latency_show(const struct vmbus_channel *channel,
}
VMBUS_CHAN_ATTR(latency, S_IRUGO, channel_latency_show, NULL);

+static ssize_t channel_interrupts_show(const struct vmbus_channel *channel, char *buf)
+{
+ return sprintf(buf, "%u\n", channel->interrupts);
+}
+VMBUS_CHAN_ATTR(interrupts, S_IRUGO, channel_interrupts_show, NULL);
+
+static ssize_t channel_events_show(const struct vmbus_channel *channel, char *buf)
+{
+ return sprintf(buf, "%u\n", channel->sig_events);
+}
+VMBUS_CHAN_ATTR(events, S_IRUGO, channel_events_show, NULL);
+
+
+
static struct attribute *vmbus_chan_attrs[] = {
&chan_attr_out_mask.attr,
&chan_attr_in_mask.attr,
@@ -1246,6 +1262,8 @@ static struct attribute *vmbus_chan_attrs[] = {
&chan_attr_cpu.attr,
&chan_attr_pending.attr,
&chan_attr_latency.attr,
+ &chan_attr_interrupts.attr,
+ &chan_attr_events.attr,
NULL
};

diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 91c88e0e771e..d7c85a475c46 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -731,6 +731,10 @@ struct vmbus_channel {

struct vmbus_close_msg close_msg;

+ /* Statistics */
+ unsigned int interrupts; /* Host to Guest interrupts */
+ unsigned int sig_events; /* Guest to Host events */
+
/* Channel callback's invoked in softirq context */
struct tasklet_struct callback_event;
void (*onchannel_callback)(void *context);
--
2.14.1

2017-09-18 03:55:43

by kys

[permalink] [raw]
Subject: [PATCH V2 3/4] vmbus: add per-channel sysfs info

From: Stephen Hemminger <[email protected]>

This extends existing vmbus related sysfs structure to provide per-channel
state information. This is useful when diagnosing issues with multiple
queues in networking and storage.

The existing sysfs only displayed information about the primary
channel. The one place it reported multiple channels was the
channel_vp_mapping file which violated the sysfs convention
of one value per file.

Signed-off-by: Stephen Hemminger <[email protected]>
Signed-off-by: K. Y. Srinivasan <[email protected]>
---
Documentation/ABI/stable/sysfs-bus-vmbus | 56 ++++++++++
drivers/hv/channel_mgmt.c | 10 +-
drivers/hv/hyperv_vmbus.h | 2 +
drivers/hv/vmbus_drv.c | 185 +++++++++++++++++++++++++++++--
include/linux/hyperv.h | 6 +
5 files changed, 246 insertions(+), 13 deletions(-)

diff --git a/Documentation/ABI/stable/sysfs-bus-vmbus b/Documentation/ABI/stable/sysfs-bus-vmbus
index 5d0125f7bcaf..0ebd8a1537a0 100644
--- a/Documentation/ABI/stable/sysfs-bus-vmbus
+++ b/Documentation/ABI/stable/sysfs-bus-vmbus
@@ -41,3 +41,59 @@ KernelVersion: 4.5
Contact: K. Y. Srinivasan <[email protected]>
Description: The 16 bit vendor ID of the device
Users: tools/hv/lsvmbus and user level RDMA libraries
+
+What: /sys/bus/vmbus/devices/vmbus_*/channels/relid/cpu
+Date: September. 2017
+KernelVersion: 4.14
+Contact: Stephen Hemminger <[email protected]>
+Description: VCPU (sub)channel is affinitized to
+Users: tools/hv/lsvmbus and other debuggig tools
+
+What: /sys/bus/vmbus/devices/vmbus_*/channels/relid/cpu
+Date: September. 2017
+KernelVersion: 4.14
+Contact: Stephen Hemminger <[email protected]>
+Description: VCPU (sub)channel is affinitized to
+Users: tools/hv/lsvmbus and other debuggig tools
+
+What: /sys/bus/vmbus/devices/vmbus_*/channels/relid/in_mask
+Date: September. 2017
+KernelVersion: 4.14
+Contact: Stephen Hemminger <[email protected]>
+Description: Inbound channel signaling state
+Users: Debuggig tools
+
+What: /sys/bus/vmbus/devices/vmbus_*/channels/relid/latency
+Date: September. 2017
+KernelVersion: 4.14
+Contact: Stephen Hemminger <[email protected]>
+Description: Channel signaling latency
+Users: Debuggig tools
+
+What: /sys/bus/vmbus/devices/vmbus_*/channels/relid/out_mask
+Date: September. 2017
+KernelVersion: 4.14
+Contact: Stephen Hemminger <[email protected]>
+Description: Outbound channel signaling state
+Users: Debuggig tools
+
+What: /sys/bus/vmbus/devices/vmbus_*/channels/relid/pending
+Date: September. 2017
+KernelVersion: 4.14
+Contact: Stephen Hemminger <[email protected]>
+Description: Channel interrupt pending state
+Users: Debuggig tools
+
+What: /sys/bus/vmbus/devices/vmbus_*/channels/relid/read_avail
+Date: September. 2017
+KernelVersion: 4.14
+Contact: Stephen Hemminger <[email protected]>
+Description: Bytes availabble to read
+Users: Debuggig tools
+
+What: /sys/bus/vmbus/devices/vmbus_*/channels/relid/write_avail
+Date: September. 2017
+KernelVersion: 4.14
+Contact: Stephen Hemminger <[email protected]>
+Description: Bytes availabble to write
+Users: Debuggig tools
diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 624d815745e4..747887038992 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -350,7 +350,7 @@ static void free_channel(struct vmbus_channel *channel)
{
tasklet_kill(&channel->callback_event);

- kfree_rcu(channel, rcu);
+ kobject_put(&channel->kobj);
}

static void percpu_channel_enq(void *arg)
@@ -513,6 +513,14 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel)
newchannel->state = CHANNEL_OPEN_STATE;

if (!fnew) {
+ struct hv_device *dev
+ = newchannel->primary_channel->device_obj;
+
+ if (vmbus_add_channel_kobj(dev, newchannel)) {
+ atomic_dec(&vmbus_connection.offer_in_progress);
+ goto err_free_chan;
+ }
+
if (channel->sc_creation_callback != NULL)
channel->sc_creation_callback(newchannel);
return;
diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index 1b6a5e0dfa75..bfa2d0af4e1d 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -384,6 +384,8 @@ struct hv_device *vmbus_device_create(const uuid_le *type,

int vmbus_device_register(struct hv_device *child_device_obj);
void vmbus_device_unregister(struct hv_device *device_obj);
+int vmbus_add_channel_kobj(struct hv_device *device_obj,
+ struct vmbus_channel *channel);

struct vmbus_channel *relid2channel(u32 relid);

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 43160a2eafe0..c126d3937414 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -107,28 +107,30 @@ static void print_alias_name(struct hv_device *hv_dev, char *alias_name)
sprintf(&alias_name[i], "%02x", hv_dev->dev_type.b[i/2]);
}

-static u8 channel_monitor_group(struct vmbus_channel *channel)
+static u8 channel_monitor_group(const struct vmbus_channel *channel)
{
return (u8)channel->offermsg.monitorid / 32;
}

-static u8 channel_monitor_offset(struct vmbus_channel *channel)
+static u8 channel_monitor_offset(const struct vmbus_channel *channel)
{
return (u8)channel->offermsg.monitorid % 32;
}

-static u32 channel_pending(struct vmbus_channel *channel,
- struct hv_monitor_page *monitor_page)
+static u32 channel_pending(const struct vmbus_channel *channel,
+ const struct hv_monitor_page *monitor_page)
{
u8 monitor_group = channel_monitor_group(channel);
+
return monitor_page->trigger_group[monitor_group].pending;
}

-static u32 channel_latency(struct vmbus_channel *channel,
- struct hv_monitor_page *monitor_page)
+static u32 channel_latency(const struct vmbus_channel *channel,
+ const struct hv_monitor_page *monitor_page)
{
u8 monitor_group = channel_monitor_group(channel);
u8 monitor_offset = channel_monitor_offset(channel);
+
return monitor_page->latency[monitor_group][monitor_offset];
}

@@ -1134,6 +1136,145 @@ void vmbus_driver_unregister(struct hv_driver *hv_driver)
}
EXPORT_SYMBOL_GPL(vmbus_driver_unregister);

+
+/*
+ * Called when last reference to channel is gone.
+ */
+static void vmbus_chan_release(struct kobject *kobj)
+{
+ struct vmbus_channel *channel
+ = container_of(kobj, struct vmbus_channel, kobj);
+
+ kfree_rcu(channel, rcu);
+}
+
+struct vmbus_chan_attribute {
+ struct attribute attr;
+ ssize_t (*show)(const struct vmbus_channel *chan, char *buf);
+ ssize_t (*store)(struct vmbus_channel *chan,
+ const char *buf, size_t count);
+};
+#define VMBUS_CHAN_ATTR(_name, _mode, _show, _store) \
+ struct vmbus_chan_attribute chan_attr_##_name \
+ = __ATTR(_name, _mode, _show, _store)
+#define VMBUS_CHAN_ATTR_RW(_name) \
+ struct vmbus_chan_attribute chan_attr_##_name = __ATTR_RW(_name)
+#define VMBUS_CHAN_ATTR_RO(_name) \
+ struct vmbus_chan_attribute chan_attr_##_name = __ATTR_RO(_name)
+#define VMBUS_CHAN_ATTR_WO(_name) \
+ struct vmbus_chan_attribute chan_attr_##_name = __ATTR_WO(_name)
+
+static ssize_t vmbus_chan_attr_show(struct kobject *kobj,
+ struct attribute *attr, char *buf)
+{
+ const struct vmbus_chan_attribute *attribute
+ = container_of(attr, struct vmbus_chan_attribute, attr);
+ const struct vmbus_channel *chan
+ = container_of(kobj, struct vmbus_channel, kobj);
+
+ if (!attribute->show)
+ return -EIO;
+
+ return attribute->show(chan, buf);
+}
+
+static const struct sysfs_ops vmbus_chan_sysfs_ops = {
+ .show = vmbus_chan_attr_show,
+};
+
+static ssize_t out_mask_show(const struct vmbus_channel *channel, char *buf)
+{
+ const struct hv_ring_buffer_info *rbi = &channel->outbound;
+
+ return sprintf(buf, "%u\n", rbi->ring_buffer->interrupt_mask);
+}
+VMBUS_CHAN_ATTR_RO(out_mask);
+
+static ssize_t in_mask_show(const struct vmbus_channel *channel, char *buf)
+{
+ const struct hv_ring_buffer_info *rbi = &channel->inbound;
+
+ return sprintf(buf, "%u\n", rbi->ring_buffer->interrupt_mask);
+}
+VMBUS_CHAN_ATTR_RO(in_mask);
+
+static ssize_t read_avail_show(const struct vmbus_channel *channel, char *buf)
+{
+ const struct hv_ring_buffer_info *rbi = &channel->inbound;
+
+ return sprintf(buf, "%u\n", hv_get_bytes_to_read(rbi));
+}
+VMBUS_CHAN_ATTR_RO(read_avail);
+
+static ssize_t write_avail_show(const struct vmbus_channel *channel, char *buf)
+{
+ const struct hv_ring_buffer_info *rbi = &channel->outbound;
+
+ return sprintf(buf, "%u\n", hv_get_bytes_to_write(rbi));
+}
+VMBUS_CHAN_ATTR_RO(write_avail);
+
+static ssize_t show_target_cpu(const struct vmbus_channel *channel, char *buf)
+{
+ return sprintf(buf, "%u\n", channel->target_cpu);
+}
+VMBUS_CHAN_ATTR(cpu, S_IRUGO, show_target_cpu, NULL);
+
+static ssize_t channel_pending_show(const struct vmbus_channel *channel,
+ char *buf)
+{
+ return sprintf(buf, "%d\n",
+ channel_pending(channel,
+ vmbus_connection.monitor_pages[1]));
+}
+VMBUS_CHAN_ATTR(pending, S_IRUGO, channel_pending_show, NULL);
+
+static ssize_t channel_latency_show(const struct vmbus_channel *channel,
+ char *buf)
+{
+ return sprintf(buf, "%d\n",
+ channel_latency(channel,
+ vmbus_connection.monitor_pages[1]));
+}
+VMBUS_CHAN_ATTR(latency, S_IRUGO, channel_latency_show, NULL);
+
+static struct attribute *vmbus_chan_attrs[] = {
+ &chan_attr_out_mask.attr,
+ &chan_attr_in_mask.attr,
+ &chan_attr_read_avail.attr,
+ &chan_attr_write_avail.attr,
+ &chan_attr_cpu.attr,
+ &chan_attr_pending.attr,
+ &chan_attr_latency.attr,
+ NULL
+};
+
+static struct kobj_type vmbus_chan_ktype = {
+ .sysfs_ops = &vmbus_chan_sysfs_ops,
+ .release = vmbus_chan_release,
+ .default_attrs = vmbus_chan_attrs,
+};
+
+/*
+ * vmbus_add_channel_kobj - setup a sub-directory under device/channels
+ */
+int vmbus_add_channel_kobj(struct hv_device *dev, struct vmbus_channel *channel)
+{
+ struct kobject *kobj = &channel->kobj;
+ u32 relid = channel->offermsg.child_relid;
+ int ret;
+
+ kobj->kset = dev->channels_kset;
+ ret = kobject_init_and_add(kobj, &vmbus_chan_ktype, NULL,
+ "%u", relid);
+ if (ret)
+ return ret;
+
+ kobject_uevent(kobj, KOBJ_ADD);
+
+ return 0;
+}
+
/*
* vmbus_device_create - Creates and registers a new child device
* on the vmbus.
@@ -1165,7 +1306,8 @@ struct hv_device *vmbus_device_create(const uuid_le *type,
*/
int vmbus_device_register(struct hv_device *child_device_obj)
{
- int ret = 0;
+ struct kobject *kobj = &child_device_obj->device.kobj;
+ int ret;

dev_set_name(&child_device_obj->device, "%pUl",
child_device_obj->channel->offermsg.offer.if_instance.b);
@@ -1179,13 +1321,32 @@ int vmbus_device_register(struct hv_device *child_device_obj)
* binding...which will eventually call vmbus_match() and vmbus_probe()
*/
ret = device_register(&child_device_obj->device);
-
- if (ret)
+ if (ret) {
pr_err("Unable to register child device\n");
- else
- pr_debug("child device %s registered\n",
- dev_name(&child_device_obj->device));
+ return ret;
+ }
+
+ child_device_obj->channels_kset = kset_create_and_add("channels",
+ NULL, kobj);
+ if (!child_device_obj->channels_kset) {
+ ret = -ENOMEM;
+ goto err_dev_unregister;
+ }
+
+ ret = vmbus_add_channel_kobj(child_device_obj,
+ child_device_obj->channel);
+ if (ret) {
+ pr_err("Unable to register primary channeln");
+ goto err_kset_unregister;
+ }
+
+ return 0;
+
+err_kset_unregister:
+ kset_unregister(child_device_obj->channels_kset);

+err_dev_unregister:
+ device_unregister(&child_device_obj->device);
return ret;
}

diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 07650d0232cc..91c88e0e771e 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -841,6 +841,11 @@ struct vmbus_channel {
*/
struct rcu_head rcu;

+ /*
+ * For sysfs per-channel properties.
+ */
+ struct kobject kobj;
+
/*
* For performance critical channels (storage, networking
* etc,), Hyper-V has a mechanism to enhance the throughput
@@ -1123,6 +1128,7 @@ struct hv_device {
struct device device;

struct vmbus_channel *channel;
+ struct kset *channels_kset;
};


--
2.14.1

2017-09-18 13:57:42

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH V2 1/4] vmbus: don't acquire the mutex in vmbus_hvsock_device_unregister()

On Sun, Sep 17, 2017 at 08:54:16PM -0700, [email protected] wrote:
> From: Dexuan Cui <[email protected]>
>
> Due to commit 54a66265d675 ("Drivers: hv: vmbus: Fix rescind handling"),
> we need this patch to resolve the below deadlock:

So does this patch need a "Fixes:" tag, and a "stable@" tag as well, so
it gets into 4.13?

thanks,

greg k-h

2017-09-18 13:58:56

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH V2 2/4] Drivers: hv: fcopy: restore correct transfer length

On Sun, Sep 17, 2017 at 08:54:17PM -0700, [email protected] wrote:
> From: Olaf Hering <[email protected]>
>
> Till recently the expected length of bytes read by the
> daemon did depend on the context. It was either hv_start_fcopy or
> hv_do_fcopy. The daemon had a buffer size of two pages, which was much
> larger than needed.
>
> Now the expected length of bytes read by the
> daemon changed slightly. For START_FILE_COPY it is still the size of
> hv_start_fcopy. But for WRITE_TO_FILE and the other operations it is as
> large as the buffer that arrived via vmbus. In case of WRITE_TO_FILE
> that is slightly larger than a struct hv_do_fcopy. Since the buffer in
> the daemon was still larger everything was fine.
>
> Currently, the daemon reads only what is actually needed.
> The new buffer layout is as large as a struct hv_do_fcopy, for the
> WRITE_TO_FILE operation. Since the kernel expects a slightly larger
> size, hvt_op_read will return -EINVAL because the daemon will read
> slightly less than expected. Address this by restoring the expected
> buffer size in case of WRITE_TO_FILE.
>
> Fixes: 'commit c7e490fc23eb ("Drivers: hv: fcopy: convert to hv_utils_transport")'
> Fixes: 'commit 3f2baa8a7d2e ("Tools: hv: update buffer handling in hv_fcopy_daemon")'

What's with the 'commit' here?

It should look like:
Fixes: c7e490fc23eb ("Drivers: hv: fcopy: convert to hv_utils_transport")

Please fix...

2017-09-18 13:59:23

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH V2 4/4] Drivers: hv: vmbus: Expose per-channel event counters events counters

On Sun, Sep 17, 2017 at 08:54:19PM -0700, [email protected] wrote:
> From: Stephen Hemminger <[email protected]>
>
> When investigating performance, it is useful to be able to look at
> the number of host and guest events per-channel. This is equivalent
> to per-device interrupt statistics.
>
> Signed-off-by: Stephen Hemminger <[email protected]>
> Signed-off-by: K. Y. Srinivasan <[email protected]>
> ---
> Documentation/ABI/stable/sysfs-bus-vmbus | 14 ++++++++++++++
> drivers/hv/connection.c | 2 ++
> drivers/hv/vmbus_drv.c | 18 ++++++++++++++++++
> include/linux/hyperv.h | 4 ++++
> 4 files changed, 38 insertions(+)

How is this a fix and not just a new feature?

2017-09-18 13:59:45

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH V2 3/4] vmbus: add per-channel sysfs info

On Sun, Sep 17, 2017 at 08:54:18PM -0700, [email protected] wrote:
> From: Stephen Hemminger <[email protected]>
>
> This extends existing vmbus related sysfs structure to provide per-channel
> state information. This is useful when diagnosing issues with multiple
> queues in networking and storage.
>
> The existing sysfs only displayed information about the primary
> channel. The one place it reported multiple channels was the
> channel_vp_mapping file which violated the sysfs convention
> of one value per file.
>
> Signed-off-by: Stephen Hemminger <[email protected]>
> Signed-off-by: K. Y. Srinivasan <[email protected]>
> ---
> Documentation/ABI/stable/sysfs-bus-vmbus | 56 ++++++++++
> drivers/hv/channel_mgmt.c | 10 +-
> drivers/hv/hyperv_vmbus.h | 2 +
> drivers/hv/vmbus_drv.c | 185 +++++++++++++++++++++++++++++--
> include/linux/hyperv.h | 6 +
> 5 files changed, 246 insertions(+), 13 deletions(-)

Feels like a new feature to me, and not needed in 4.14-final.

thanks,

greg k-h

2017-09-18 14:00:14

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH V2 1/4] vmbus: don't acquire the mutex in vmbus_hvsock_device_unregister()

On Sun, Sep 17, 2017 at 08:54:16PM -0700, [email protected] wrote:
> From: Dexuan Cui <[email protected]>
>
> Due to commit 54a66265d675 ("Drivers: hv: vmbus: Fix rescind handling"),
> we need this patch to resolve the below deadlock:
>
> after we get the mutex in vmbus_hvsock_device_unregister() and call
> vmbus_device_unregister() -> device_unregister() -> ... -> device_release()
> -> vmbus_device_release(), we'll get a deadlock, because
> vmbus_device_release() tries to get the same mutex.
>
> Signed-off-by: Dexuan Cui <[email protected]>
> Cc: K. Y. Srinivasan <[email protected]>
> Cc: Haiyang Zhang <[email protected]>
> Cc: Stephen Hemminger <[email protected]>
> Signed-off-by: K. Y. Srinivasan <[email protected]>

As every one of these patches had questions from me, please break them
up into different series, one for 4.14-final, and one for 4.15-rc1.

thanks,

greg k-h

2017-09-18 14:43:55

by KY Srinivasan

[permalink] [raw]
Subject: RE: [PATCH V2 1/4] vmbus: don't acquire the mutex in vmbus_hvsock_device_unregister()



> -----Original Message-----
> From: Greg KH [mailto:[email protected]]
> Sent: Monday, September 18, 2017 7:00 AM
> To: KY Srinivasan <[email protected]>
> Cc: [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; Stephen Hemminger
> <[email protected]>; Haiyang Zhang <[email protected]>
> Subject: Re: [PATCH V2 1/4] vmbus: don't acquire the mutex in
> vmbus_hvsock_device_unregister()
>
> On Sun, Sep 17, 2017 at 08:54:16PM -0700, [email protected]
> wrote:
> > From: Dexuan Cui <[email protected]>
> >
> > Due to commit 54a66265d675 ("Drivers: hv: vmbus: Fix rescind handling"),
> > we need this patch to resolve the below deadlock:
> >
> > after we get the mutex in vmbus_hvsock_device_unregister() and call
> > vmbus_device_unregister() -> device_unregister() -> ... ->
> device_release()
> > -> vmbus_device_release(), we'll get a deadlock, because
> > vmbus_device_release() tries to get the same mutex.
> >
> > Signed-off-by: Dexuan Cui <[email protected]>
> > Cc: K. Y. Srinivasan <[email protected]>
> > Cc: Haiyang Zhang <[email protected]>
> > Cc: Stephen Hemminger <[email protected]>
> > Signed-off-by: K. Y. Srinivasan <[email protected]>
>
> As every one of these patches had questions from me, please break them
> up into different series, one for 4.14-final, and one for 4.15-rc1.

Will do.

Thanks,

K. Y
>
> thanks,
>
> greg k-h

2018-10-18 15:27:04

by Olaf Hering

[permalink] [raw]
Subject: Re: [PATCH V2 3/4] vmbus: add per-channel sysfs info

Am Sun, 17 Sep 2017 20:54:18 -0700
schrieb [email protected]:

> This extends existing vmbus related sysfs structure to provide per-channel
> state information. This is useful when diagnosing issues with multiple
> queues in networking and storage.

> +++ b/drivers/hv/vmbus_drv.c
> +static ssize_t write_avail_show(const struct vmbus_channel *channel, char *buf)
> +{
> + const struct hv_ring_buffer_info *rbi = &channel->outbound;
> +
> + return sprintf(buf, "%u\n", hv_get_bytes_to_write(rbi));
> +}
> +VMBUS_CHAN_ATTR_RO(write_avail);

This is upstream since a year.

But I wonder how this can work if vmbus_device_register is called,
and then something reads the populated sysfs files before vmbus_open returns.
Nothing protects rbi->ring_buffer in this case, which remains NULL
until vmbus_open populates it.

A simple reproduce, with a modular kernel, is to boot with init=/bin/bash
head /sys/bus/vmbus/devices/*/channels/*/*

Olaf


Attachments:
(No filename) (201.00 B)
Digitale Signatur von OpenPGP

2018-10-18 15:33:27

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: RE: [PATCH V2 3/4] vmbus: add per-channel sysfs info

From Olaf Hering Sent: Thursday, October 18, 2018 8:20 AM
>
> > This extends existing vmbus related sysfs structure to provide per-channel
> > state information. This is useful when diagnosing issues with multiple
> > queues in networking and storage.
>
> > +++ b/drivers/hv/vmbus_drv.c
> > +static ssize_t write_avail_show(const struct vmbus_channel *channel, char *buf)
> > +{
> > + const struct hv_ring_buffer_info *rbi = &channel->outbound;
> > +
> > + return sprintf(buf, "%u\n", hv_get_bytes_to_write(rbi));
> > +}
> > +VMBUS_CHAN_ATTR_RO(write_avail);
>
> This is upstream since a year.
>
> But I wonder how this can work if vmbus_device_register is called,
> and then something reads the populated sysfs files before vmbus_open returns.
> Nothing protects rbi->ring_buffer in this case, which remains NULL
> until vmbus_open populates it.
>
> A simple reproduce, with a modular kernel, is to boot with init=/bin/bash
> head /sys/bus/vmbus/devices/*/channels/*/*
>

There are multiple race conditions with this and other VMbus sysfs information.
There's a race on the close path as well. I've got an action on my list to get it
cleaned up.

Michael


2018-10-18 16:39:54

by Stephen Hemminger

[permalink] [raw]
Subject: Re: [PATCH V2 3/4] vmbus: add per-channel sysfs info

On Thu, 18 Oct 2018 17:19:53 +0200
Olaf Hering <[email protected]> wrote:

> Am Sun, 17 Sep 2017 20:54:18 -0700
> schrieb [email protected]:
>
> > This extends existing vmbus related sysfs structure to provide per-channel
> > state information. This is useful when diagnosing issues with multiple
> > queues in networking and storage.
>
> > +++ b/drivers/hv/vmbus_drv.c
> > +static ssize_t write_avail_show(const struct vmbus_channel *channel, char *buf)
> > +{
> > + const struct hv_ring_buffer_info *rbi = &channel->outbound;
> > +
> > + return sprintf(buf, "%u\n", hv_get_bytes_to_write(rbi));
> > +}
> > +VMBUS_CHAN_ATTR_RO(write_avail);
>
> This is upstream since a year.
>
> But I wonder how this can work if vmbus_device_register is called,
> and then something reads the populated sysfs files before vmbus_open returns.
> Nothing protects rbi->ring_buffer in this case, which remains NULL
> until vmbus_open populates it.
>
> A simple reproduce, with a modular kernel, is to boot with init=/bin/bash
> head /sys/bus/vmbus/devices/*/channels/*/*
>
> Olaf


Good catch, actually the problem goes across all of the ring buffer sysfs files
so it existed long before that.

The channel ring buffer could be missing.

I am less worried about the open from init case, and more worried about issues
when channels are closed (as happens when changing number of channels on a net device).

As Al has pointed out for years, sysfs is riddled with dangling reference issues.


2018-10-18 16:43:04

by Stephen Hemminger

[permalink] [raw]
Subject: Re: [PATCH V2 3/4] vmbus: add per-channel sysfs info

On Thu, 18 Oct 2018 15:32:35 +0000
Michael Kelley <[email protected]> wrote:

> From Olaf Hering Sent: Thursday, October 18, 2018 8:20 AM
> >
> > > This extends existing vmbus related sysfs structure to provide per-channel
> > > state information. This is useful when diagnosing issues with multiple
> > > queues in networking and storage.
> >
> > > +++ b/drivers/hv/vmbus_drv.c
> > > +static ssize_t write_avail_show(const struct vmbus_channel *channel, char *buf)
> > > +{
> > > + const struct hv_ring_buffer_info *rbi = &channel->outbound;
> > > +
> > > + return sprintf(buf, "%u\n", hv_get_bytes_to_write(rbi));
> > > +}
> > > +VMBUS_CHAN_ATTR_RO(write_avail);
> >
> > This is upstream since a year.
> >
> > But I wonder how this can work if vmbus_device_register is called,
> > and then something reads the populated sysfs files before vmbus_open returns.
> > Nothing protects rbi->ring_buffer in this case, which remains NULL
> > until vmbus_open populates it.
> >
> > A simple reproduce, with a modular kernel, is to boot with init=/bin/bash
> > head /sys/bus/vmbus/devices/*/channels/*/*
> >
>
> There are multiple race conditions with this and other VMbus sysfs information.
> There's a race on the close path as well. I've got an action on my list to get it
> cleaned up.
>
> Michael
>

There is also a bunch of issues with code like:

static ssize_t id_show(struct device *dev, struct device_attribute *dev_attr,
char *buf)
{
struct hv_device *hv_dev = device_to_hv_device(dev);

if (!hv_dev->channel)
return -ENODEV;
return sprintf(buf, "%d\n", hv_dev->channel->offermsg.child_relid);
}

Which should be using ACCESS_ONCE on hv_dev->channel or doing proper RCU.