2017-09-10 05:54:39

by kys

[permalink] [raw]
Subject: [PATCH 0/5] Drivers: hv: Miscellaneous fixes

From: "K. Y. Srinivasan" <[email protected]>

Miscellaneous fixes.

Dexuan Cui (2):
vmbus: don't acquire the mutex in vmbus_hvsock_device_unregister()
vmbus: suppress uevents for hv_sock devices

Olaf Hering (1):
Drivers: hv: fcopy: restore correct transfer length

Stephen Hemminger (2):
vmbus: add per-channel sysfs info
Drivers: hv: vmbus: Expose per-channel interrupts and events counters

Documentation/ABI/stable/sysfs-bus-vmbus | 70 +++++++++++
drivers/hv/channel_mgmt.c | 14 ++-
drivers/hv/connection.c | 2 +
drivers/hv/hv_fcopy.c | 4 +
drivers/hv/hyperv_vmbus.h | 2 +
drivers/hv/vmbus_drv.c | 205 +++++++++++++++++++++++++++++--
include/linux/hyperv.h | 10 ++
7 files changed, 290 insertions(+), 17 deletions(-)

--
2.14.1


2017-09-10 06:09:55

by kys

[permalink] [raw]
Subject: [PATCH 4/5] vmbus: add per-channel sysfs info

From: Stephen Hemminger <[email protected]>

This extends existing vmbus related sysfs structure to provide per-channel
state information. This is useful when diagnosing issues with multiple
queues in networking and storage.

The existing sysfs only displayed information about the primary
channel. The one place it reported multiple channels was the
channel_vp_mapping file which violated the sysfs convention
of one value per file.

Signed-off-by: Stephen Hemminger <[email protected]>
Signed-off-by: K. Y. Srinivasan <[email protected]>
---
Documentation/ABI/stable/sysfs-bus-vmbus | 56 ++++++++++
drivers/hv/channel_mgmt.c | 10 +-
drivers/hv/hyperv_vmbus.h | 2 +
drivers/hv/vmbus_drv.c | 185 +++++++++++++++++++++++++++++--
include/linux/hyperv.h | 6 +
5 files changed, 246 insertions(+), 13 deletions(-)

diff --git a/Documentation/ABI/stable/sysfs-bus-vmbus b/Documentation/ABI/stable/sysfs-bus-vmbus
index 5d0125f7bcaf..0ebd8a1537a0 100644
--- a/Documentation/ABI/stable/sysfs-bus-vmbus
+++ b/Documentation/ABI/stable/sysfs-bus-vmbus
@@ -41,3 +41,59 @@ KernelVersion: 4.5
Contact: K. Y. Srinivasan <[email protected]>
Description: The 16 bit vendor ID of the device
Users: tools/hv/lsvmbus and user level RDMA libraries
+
+What: /sys/bus/vmbus/devices/vmbus_*/channels/relid/cpu
+Date: September. 2017
+KernelVersion: 4.14
+Contact: Stephen Hemminger <[email protected]>
+Description: VCPU (sub)channel is affinitized to
+Users: tools/hv/lsvmbus and other debuggig tools
+
+What: /sys/bus/vmbus/devices/vmbus_*/channels/relid/cpu
+Date: September. 2017
+KernelVersion: 4.14
+Contact: Stephen Hemminger <[email protected]>
+Description: VCPU (sub)channel is affinitized to
+Users: tools/hv/lsvmbus and other debuggig tools
+
+What: /sys/bus/vmbus/devices/vmbus_*/channels/relid/in_mask
+Date: September. 2017
+KernelVersion: 4.14
+Contact: Stephen Hemminger <[email protected]>
+Description: Inbound channel signaling state
+Users: Debuggig tools
+
+What: /sys/bus/vmbus/devices/vmbus_*/channels/relid/latency
+Date: September. 2017
+KernelVersion: 4.14
+Contact: Stephen Hemminger <[email protected]>
+Description: Channel signaling latency
+Users: Debuggig tools
+
+What: /sys/bus/vmbus/devices/vmbus_*/channels/relid/out_mask
+Date: September. 2017
+KernelVersion: 4.14
+Contact: Stephen Hemminger <[email protected]>
+Description: Outbound channel signaling state
+Users: Debuggig tools
+
+What: /sys/bus/vmbus/devices/vmbus_*/channels/relid/pending
+Date: September. 2017
+KernelVersion: 4.14
+Contact: Stephen Hemminger <[email protected]>
+Description: Channel interrupt pending state
+Users: Debuggig tools
+
+What: /sys/bus/vmbus/devices/vmbus_*/channels/relid/read_avail
+Date: September. 2017
+KernelVersion: 4.14
+Contact: Stephen Hemminger <[email protected]>
+Description: Bytes availabble to read
+Users: Debuggig tools
+
+What: /sys/bus/vmbus/devices/vmbus_*/channels/relid/write_avail
+Date: September. 2017
+KernelVersion: 4.14
+Contact: Stephen Hemminger <[email protected]>
+Description: Bytes availabble to write
+Users: Debuggig tools
diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 624d815745e4..747887038992 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -350,7 +350,7 @@ static void free_channel(struct vmbus_channel *channel)
{
tasklet_kill(&channel->callback_event);

- kfree_rcu(channel, rcu);
+ kobject_put(&channel->kobj);
}

static void percpu_channel_enq(void *arg)
@@ -513,6 +513,14 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel)
newchannel->state = CHANNEL_OPEN_STATE;

if (!fnew) {
+ struct hv_device *dev
+ = newchannel->primary_channel->device_obj;
+
+ if (vmbus_add_channel_kobj(dev, newchannel)) {
+ atomic_dec(&vmbus_connection.offer_in_progress);
+ goto err_free_chan;
+ }
+
if (channel->sc_creation_callback != NULL)
channel->sc_creation_callback(newchannel);
return;
diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index 1b6a5e0dfa75..bfa2d0af4e1d 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -384,6 +384,8 @@ struct hv_device *vmbus_device_create(const uuid_le *type,

int vmbus_device_register(struct hv_device *child_device_obj);
void vmbus_device_unregister(struct hv_device *device_obj);
+int vmbus_add_channel_kobj(struct hv_device *device_obj,
+ struct vmbus_channel *channel);

struct vmbus_channel *relid2channel(u32 relid);

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 2c147702ce08..a2986aac730d 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -107,28 +107,30 @@ static void print_alias_name(struct hv_device *hv_dev, char *alias_name)
sprintf(&alias_name[i], "%02x", hv_dev->dev_type.b[i/2]);
}

-static u8 channel_monitor_group(struct vmbus_channel *channel)
+static u8 channel_monitor_group(const struct vmbus_channel *channel)
{
return (u8)channel->offermsg.monitorid / 32;
}

-static u8 channel_monitor_offset(struct vmbus_channel *channel)
+static u8 channel_monitor_offset(const struct vmbus_channel *channel)
{
return (u8)channel->offermsg.monitorid % 32;
}

-static u32 channel_pending(struct vmbus_channel *channel,
- struct hv_monitor_page *monitor_page)
+static u32 channel_pending(const struct vmbus_channel *channel,
+ const struct hv_monitor_page *monitor_page)
{
u8 monitor_group = channel_monitor_group(channel);
+
return monitor_page->trigger_group[monitor_group].pending;
}

-static u32 channel_latency(struct vmbus_channel *channel,
- struct hv_monitor_page *monitor_page)
+static u32 channel_latency(const struct vmbus_channel *channel,
+ const struct hv_monitor_page *monitor_page)
{
u8 monitor_group = channel_monitor_group(channel);
u8 monitor_offset = channel_monitor_offset(channel);
+
return monitor_page->latency[monitor_group][monitor_offset];
}

@@ -1134,6 +1136,145 @@ void vmbus_driver_unregister(struct hv_driver *hv_driver)
}
EXPORT_SYMBOL_GPL(vmbus_driver_unregister);

+
+/*
+ * Called when last reference to channel is gone.
+ */
+static void vmbus_chan_release(struct kobject *kobj)
+{
+ struct vmbus_channel *channel
+ = container_of(kobj, struct vmbus_channel, kobj);
+
+ kfree_rcu(channel, rcu);
+}
+
+struct vmbus_chan_attribute {
+ struct attribute attr;
+ ssize_t (*show)(const struct vmbus_channel *chan, char *buf);
+ ssize_t (*store)(struct vmbus_channel *chan,
+ const char *buf, size_t count);
+};
+#define VMBUS_CHAN_ATTR(_name, _mode, _show, _store) \
+ struct vmbus_chan_attribute chan_attr_##_name \
+ = __ATTR(_name, _mode, _show, _store)
+#define VMBUS_CHAN_ATTR_RW(_name) \
+ struct vmbus_chan_attribute chan_attr_##_name = __ATTR_RW(_name)
+#define VMBUS_CHAN_ATTR_RO(_name) \
+ struct vmbus_chan_attribute chan_attr_##_name = __ATTR_RO(_name)
+#define VMBUS_CHAN_ATTR_WO(_name) \
+ struct vmbus_chan_attribute chan_attr_##_name = __ATTR_WO(_name)
+
+static ssize_t vmbus_chan_attr_show(struct kobject *kobj,
+ struct attribute *attr, char *buf)
+{
+ const struct vmbus_chan_attribute *attribute
+ = container_of(attr, struct vmbus_chan_attribute, attr);
+ const struct vmbus_channel *chan
+ = container_of(kobj, struct vmbus_channel, kobj);
+
+ if (!attribute->show)
+ return -EIO;
+
+ return attribute->show(chan, buf);
+}
+
+static const struct sysfs_ops vmbus_chan_sysfs_ops = {
+ .show = vmbus_chan_attr_show,
+};
+
+static ssize_t out_mask_show(const struct vmbus_channel *channel, char *buf)
+{
+ const struct hv_ring_buffer_info *rbi = &channel->outbound;
+
+ return sprintf(buf, "%u\n", rbi->ring_buffer->interrupt_mask);
+}
+VMBUS_CHAN_ATTR_RO(out_mask);
+
+static ssize_t in_mask_show(const struct vmbus_channel *channel, char *buf)
+{
+ const struct hv_ring_buffer_info *rbi = &channel->inbound;
+
+ return sprintf(buf, "%u\n", rbi->ring_buffer->interrupt_mask);
+}
+VMBUS_CHAN_ATTR_RO(in_mask);
+
+static ssize_t read_avail_show(const struct vmbus_channel *channel, char *buf)
+{
+ const struct hv_ring_buffer_info *rbi = &channel->inbound;
+
+ return sprintf(buf, "%u\n", hv_get_bytes_to_read(rbi));
+}
+VMBUS_CHAN_ATTR_RO(read_avail);
+
+static ssize_t write_avail_show(const struct vmbus_channel *channel, char *buf)
+{
+ const struct hv_ring_buffer_info *rbi = &channel->outbound;
+
+ return sprintf(buf, "%u\n", hv_get_bytes_to_write(rbi));
+}
+VMBUS_CHAN_ATTR_RO(write_avail);
+
+static ssize_t show_target_cpu(const struct vmbus_channel *channel, char *buf)
+{
+ return sprintf(buf, "%u\n", channel->target_cpu);
+}
+VMBUS_CHAN_ATTR(cpu, S_IRUGO, show_target_cpu, NULL);
+
+static ssize_t channel_pending_show(const struct vmbus_channel *channel,
+ char *buf)
+{
+ return sprintf(buf, "%d\n",
+ channel_pending(channel,
+ vmbus_connection.monitor_pages[1]));
+}
+VMBUS_CHAN_ATTR(pending, S_IRUGO, channel_pending_show, NULL);
+
+static ssize_t channel_latency_show(const struct vmbus_channel *channel,
+ char *buf)
+{
+ return sprintf(buf, "%d\n",
+ channel_latency(channel,
+ vmbus_connection.monitor_pages[1]));
+}
+VMBUS_CHAN_ATTR(latency, S_IRUGO, channel_latency_show, NULL);
+
+static struct attribute *vmbus_chan_attrs[] = {
+ &chan_attr_out_mask.attr,
+ &chan_attr_in_mask.attr,
+ &chan_attr_read_avail.attr,
+ &chan_attr_write_avail.attr,
+ &chan_attr_cpu.attr,
+ &chan_attr_pending.attr,
+ &chan_attr_latency.attr,
+ NULL
+};
+
+static struct kobj_type vmbus_chan_ktype = {
+ .sysfs_ops = &vmbus_chan_sysfs_ops,
+ .release = vmbus_chan_release,
+ .default_attrs = vmbus_chan_attrs,
+};
+
+/*
+ * vmbus_add_channel_kobj - setup a sub-directory under device/channels
+ */
+int vmbus_add_channel_kobj(struct hv_device *dev, struct vmbus_channel *channel)
+{
+ struct kobject *kobj = &channel->kobj;
+ u32 relid = channel->offermsg.child_relid;
+ int ret;
+
+ kobj->kset = dev->channels_kset;
+ ret = kobject_init_and_add(kobj, &vmbus_chan_ktype, NULL,
+ "%u", relid);
+ if (ret)
+ return ret;
+
+ kobject_uevent(kobj, KOBJ_ADD);
+
+ return 0;
+}
+
/*
* vmbus_device_create - Creates and registers a new child device
* on the vmbus.
@@ -1165,7 +1306,8 @@ struct hv_device *vmbus_device_create(const uuid_le *type,
*/
int vmbus_device_register(struct hv_device *child_device_obj)
{
- int ret = 0;
+ struct kobject *kobj = &child_device_obj->device.kobj;
+ int ret;

dev_set_name(&child_device_obj->device, "%pUl",
child_device_obj->channel->offermsg.offer.if_instance.b);
@@ -1181,13 +1323,32 @@ int vmbus_device_register(struct hv_device *child_device_obj)
* binding...which will eventually call vmbus_match() and vmbus_probe()
*/
ret = device_register(&child_device_obj->device);
-
- if (ret)
+ if (ret) {
pr_err("Unable to register child device\n");
- else
- pr_debug("child device %s registered\n",
- dev_name(&child_device_obj->device));
+ return ret;
+ }
+
+ child_device_obj->channels_kset = kset_create_and_add("channels",
+ NULL, kobj);
+ if (!child_device_obj->channels_kset) {
+ ret = -ENOMEM;
+ goto err_dev_unregister;
+ }
+
+ ret = vmbus_add_channel_kobj(child_device_obj,
+ child_device_obj->channel);
+ if (ret) {
+ pr_err("Unable to register primary channeln");
+ goto err_kset_unregister;
+ }
+
+ return 0;
+
+err_kset_unregister:
+ kset_unregister(child_device_obj->channels_kset);

+err_dev_unregister:
+ device_unregister(&child_device_obj->device);
return ret;
}

diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 07650d0232cc..91c88e0e771e 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -841,6 +841,11 @@ struct vmbus_channel {
*/
struct rcu_head rcu;

+ /*
+ * For sysfs per-channel properties.
+ */
+ struct kobject kobj;
+
/*
* For performance critical channels (storage, networking
* etc,), Hyper-V has a mechanism to enhance the throughput
@@ -1123,6 +1128,7 @@ struct hv_device {
struct device device;

struct vmbus_channel *channel;
+ struct kset *channels_kset;
};


--
2.14.1

2017-09-10 06:10:00

by kys

[permalink] [raw]
Subject: [PATCH 1/5] vmbus: don't acquire the mutex in vmbus_hvsock_device_unregister()

From: Dexuan Cui <[email protected]>

Due to commit 54a66265d675 ("Drivers: hv: vmbus: Fix rescind handling"),
we need this patch to resolve the below deadlock:

after we get the mutex in vmbus_hvsock_device_unregister() and call
vmbus_device_unregister() -> device_unregister() -> ... -> device_release()
-> vmbus_device_release(), we'll get a deadlock, because
vmbus_device_release() tries to get the same mutex.

Signed-off-by: Dexuan Cui <[email protected]>
Cc: K. Y. Srinivasan <[email protected]>
Cc: Haiyang Zhang <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Signed-off-by: K. Y. Srinivasan <[email protected]>
---
drivers/hv/channel_mgmt.c | 4 ----
1 file changed, 4 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 968af173c4c1..624d815745e4 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -945,14 +945,10 @@ static void vmbus_onoffer_rescind(struct vmbus_channel_message_header *hdr)

void vmbus_hvsock_device_unregister(struct vmbus_channel *channel)
{
- mutex_lock(&vmbus_connection.channel_mutex);
-
BUG_ON(!is_hvsock_channel(channel));

channel->rescind = true;
vmbus_device_unregister(channel->device_obj);
-
- mutex_unlock(&vmbus_connection.channel_mutex);
}
EXPORT_SYMBOL_GPL(vmbus_hvsock_device_unregister);

--
2.14.1

2017-09-10 06:09:59

by kys

[permalink] [raw]
Subject: [PATCH 5/5] Drivers: hv: vmbus: Expose per-channel event counters

From: Stephen Hemminger <[email protected]>

When investigating performance, it is useful to be able to look at
the number of host and guest events per-channel. This is equivalent
to per-device interrupt statistics.

Signed-off-by: Stephen Hemminger <[email protected]>
Signed-off-by: K. Y. Srinivasan <[email protected]>
---
Documentation/ABI/stable/sysfs-bus-vmbus | 14 ++++++++++++++
drivers/hv/connection.c | 2 ++
drivers/hv/vmbus_drv.c | 18 ++++++++++++++++++
include/linux/hyperv.h | 4 ++++
4 files changed, 38 insertions(+)

diff --git a/Documentation/ABI/stable/sysfs-bus-vmbus b/Documentation/ABI/stable/sysfs-bus-vmbus
index 0ebd8a1537a0..d4eca1717adb 100644
--- a/Documentation/ABI/stable/sysfs-bus-vmbus
+++ b/Documentation/ABI/stable/sysfs-bus-vmbus
@@ -97,3 +97,17 @@ KernelVersion: 4.14
Contact: Stephen Hemminger <[email protected]>
Description: Bytes availabble to write
Users: Debuggig tools
+
+What: /sys/bus/vmbus/devices/vmbus_*/channels/relid/events
+Date: September. 2017
+KernelVersion: 4.14
+Contact: Stephen Hemminger <[email protected]>
+Description: Number of times we have signaled the host
+Users: Debuggig tools
+
+What: /sys/bus/vmbus/devices/vmbus_*/channels/relid/interrupts
+Date: September. 2017
+KernelVersion: 4.14
+Contact: Stephen Hemminger <[email protected]>
+Description: Number of times we have taken an interrupt (incoming)
+Users: Debuggig tools
diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
index 59c11ff90d12..2bb0d03c5212 100644
--- a/drivers/hv/connection.c
+++ b/drivers/hv/connection.c
@@ -406,6 +406,8 @@ void vmbus_set_event(struct vmbus_channel *channel)
if (!channel->is_dedicated_interrupt)
vmbus_send_interrupt(child_relid);

+ ++channel->sig_events;
+
hv_do_hypercall(HVCALL_SIGNAL_EVENT, channel->sig_event, NULL);
}
EXPORT_SYMBOL_GPL(vmbus_set_event);
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index a2986aac730d..2a329a2f286b 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -945,6 +945,8 @@ static void vmbus_chan_sched(struct hv_per_cpu_context *hv_cpu)
if (channel->rescind)
continue;

+ ++channel->interrupts;
+
switch (channel->callback_mode) {
case HV_CALL_ISR:
vmbus_channel_isr(channel);
@@ -1238,6 +1240,20 @@ static ssize_t channel_latency_show(const struct vmbus_channel *channel,
}
VMBUS_CHAN_ATTR(latency, S_IRUGO, channel_latency_show, NULL);

+static ssize_t channel_interrupts_show(const struct vmbus_channel *channel, char *buf)
+{
+ return sprintf(buf, "%u\n", channel->interrupts);
+}
+VMBUS_CHAN_ATTR(interrupts, S_IRUGO, channel_interrupts_show, NULL);
+
+static ssize_t channel_events_show(const struct vmbus_channel *channel, char *buf)
+{
+ return sprintf(buf, "%u\n", channel->sig_events);
+}
+VMBUS_CHAN_ATTR(events, S_IRUGO, channel_events_show, NULL);
+
+
+
static struct attribute *vmbus_chan_attrs[] = {
&chan_attr_out_mask.attr,
&chan_attr_in_mask.attr,
@@ -1246,6 +1262,8 @@ static struct attribute *vmbus_chan_attrs[] = {
&chan_attr_cpu.attr,
&chan_attr_pending.attr,
&chan_attr_latency.attr,
+ &chan_attr_interrupts.attr,
+ &chan_attr_events.attr,
NULL
};

diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 91c88e0e771e..d7c85a475c46 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -731,6 +731,10 @@ struct vmbus_channel {

struct vmbus_close_msg close_msg;

+ /* Statistics */
+ unsigned int interrupts; /* Host to Guest interrupts */
+ unsigned int sig_events; /* Guest to Host events */
+
/* Channel callback's invoked in softirq context */
struct tasklet_struct callback_event;
void (*onchannel_callback)(void *context);
--
2.14.1

2017-09-10 06:09:58

by kys

[permalink] [raw]
Subject: [PATCH 3/5] Drivers: hv: fcopy: restore correct transfer length

From: Olaf Hering <[email protected]>

Till recently the expected length of bytes read by the
daemon did depend on the context. It was either hv_start_fcopy or
hv_do_fcopy. The daemon had a buffer size of two pages, which was much
larger than needed.

Now the expected length of bytes read by the
daemon changed slightly. For START_FILE_COPY it is still the size of
hv_start_fcopy. But for WRITE_TO_FILE and the other operations it is as
large as the buffer that arrived via vmbus. In case of WRITE_TO_FILE
that is slightly larger than a struct hv_do_fcopy. Since the buffer in
the daemon was still larger everything was fine.

Currently, the daemon reads only what is actually needed.
The new buffer layout is as large as a struct hv_do_fcopy, for the
WRITE_TO_FILE operation. Since the kernel expects a slightly larger
size, hvt_op_read will return -EINVAL because the daemon will read
slightly less than expected. Address this by restoring the expected
buffer size in case of WRITE_TO_FILE.

Fixes: 'commit c7e490fc23eb ("Drivers: hv: fcopy: convert to hv_utils_transport")'
Fixes: 'commit 3f2baa8a7d2e ("Tools: hv: update buffer handling in hv_fcopy_daemon")'

Signed-off-by: Olaf Hering <[email protected]>
Signed-off-by: K. Y. Srinivasan <[email protected]>
---
drivers/hv/hv_fcopy.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c
index daa75bd41f86..2364281d8593 100644
--- a/drivers/hv/hv_fcopy.c
+++ b/drivers/hv/hv_fcopy.c
@@ -170,6 +170,10 @@ static void fcopy_send_data(struct work_struct *dummy)
out_src = smsg_out;
break;

+ case WRITE_TO_FILE:
+ out_src = fcopy_transaction.fcopy_msg;
+ out_len = sizeof(struct hv_do_fcopy);
+ break;
default:
out_src = fcopy_transaction.fcopy_msg;
out_len = fcopy_transaction.recv_len;
--
2.14.1

2017-09-10 06:09:54

by kys

[permalink] [raw]
Subject: [PATCH 2/5] vmbus: suppress uevents for hv_sock devices

From: Dexuan Cui <[email protected]>

hv_sock driver is automatically loaded when an application creates an
AF_VSOCK socket, so we don't really need to trigger uevents to the user
space udevd.

And hv_sock devices can appear and disappear frequency, e.g. 100 per
second, so triggering the udevents can cause a high cpu utilization of
udevd, e.g. 30% on a 2-cpu virtual machine. So let's suppress the
uevents to avoid this.

Signed-off-by: Dexuan Cui <[email protected]>
Cc: K. Y. Srinivasan <[email protected]>
Cc: Haiyang Zhang <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Signed-off-by: K. Y. Srinivasan <[email protected]>
---
drivers/hv/vmbus_drv.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 43160a2eafe0..2c147702ce08 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -1174,6 +1174,8 @@ int vmbus_device_register(struct hv_device *child_device_obj)
child_device_obj->device.parent = &hv_acpi_dev->dev;
child_device_obj->device.release = vmbus_device_release;

+ if (is_hvsock_channel(child_device_obj->channel))
+ dev_set_uevent_suppress(&child_device_obj->device, 1);
/*
* Register with the LDM. This will kick off the driver/device
* binding...which will eventually call vmbus_match() and vmbus_probe()
--
2.14.1

2017-09-10 09:21:20

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 2/5] vmbus: suppress uevents for hv_sock devices

On Sat, Sep 09, 2017 at 11:08:46PM -0700, [email protected] wrote:
> From: Dexuan Cui <[email protected]>
>
> hv_sock driver is automatically loaded when an application creates an
> AF_VSOCK socket, so we don't really need to trigger uevents to the user
> space udevd.
>
> And hv_sock devices can appear and disappear frequency, e.g. 100 per
> second, so triggering the udevents can cause a high cpu utilization of
> udevd, e.g. 30% on a 2-cpu virtual machine. So let's suppress the
> uevents to avoid this.

100 per second for a struct device? That's crazy, and the uevent is the
least of your worries. Please fix that, as it's not the correct way to
use the driver model at all.

And really, why is uevent taking all that much cpu time anyway? It
_should_ be pretty fast, unless your distro is doing crazy things with
it...

sorry, am not going to take this patch.

greg k-h

2017-09-18 00:02:38

by KY Srinivasan

[permalink] [raw]
Subject: RE: [PATCH 2/5] vmbus: suppress uevents for hv_sock devices



> -----Original Message-----
> From: Greg KH [mailto:[email protected]]
> Sent: Sunday, September 10, 2017 1:10 AM
> To: KY Srinivasan <[email protected]>
> Cc: [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; Stephen Hemminger
> <[email protected]>; Dexuan Cui <[email protected]>; Haiyang
> Zhang <[email protected]>
> Subject: Re: [PATCH 2/5] vmbus: suppress uevents for hv_sock devices
>
> On Sat, Sep 09, 2017 at 11:08:46PM -0700, [email protected]
> wrote:
> > From: Dexuan Cui <[email protected]>
> >
> > hv_sock driver is automatically loaded when an application creates an
> > AF_VSOCK socket, so we don't really need to trigger uevents to the user
> > space udevd.
> >
> > And hv_sock devices can appear and disappear frequency, e.g. 100 per
> > second, so triggering the udevents can cause a high cpu utilization of
> > udevd, e.g. 30% on a 2-cpu virtual machine. So let's suppress the
> > uevents to avoid this.
>
> 100 per second for a struct device? That's crazy, and the uevent is the
> least of your worries. Please fix that, as it's not the correct way to
> use the driver model at all.
>
> And really, why is uevent taking all that much cpu time anyway? It
> _should_ be pretty fast, unless your distro is doing crazy things with
> it...
>
> sorry, am not going to take this patch.

Greg,

This is not a real problem that needs fixing. The test automation triggered this condition.
I will drop this patch and send the rest.

Regards,

K. Y
>
> greg k-h