2015-04-27 17:04:27

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH v2 0/6] Drivers: hv: vmbus: fair round robin algorithm for vmbus_get_outgoing_channel()

Changes in v2:
- Address Dexuan's review comments:
PATCH 3/6: s,channel,primary_channel;
PATCH 4/6: add a forward declaration instead of moving code around;
PATCH 6/6: fix an off-by-one
- Change the algorithm in PATCH 6/6:
Instead of a simple round robin we first try to find a (sub)channel with
the current_cpu == target_cpu and we fallback to a round robin when we fail
to find one.

K. Y., Dexuan, can you please give it a spin in various testing environments
you have? Thanks!

Original description:

This series is a continuation of the "Drivers: hv: vmbus: Use a round-robin
algorithm for picking the outgoing channel" work. It is supposed to bring two
significant changes:
1) Subchannels for a channel are distributed evenly across all vcpus we have.
Currently we try to distribute all channels (including subchannels) across
all vcpus, this approach doesn't guarantee that the particular channel's
subchannels will be distributed in the same way as we process all offer
requests in some random order. (Patch 05)
2) Channel picking based on the current vcpu is dropped from
vmbus_get_outgoing_channel() in favor of a fair round robin. (Patch 06) (this
is not true anymore, see 'Changes').

Patches 01 - 04 are cleanup/refactoring.

Vitaly Kuznetsov (6):
Drivers: hv: vmbus: unify calls to percpu_channel_enq()
Drivers: hv: vmbus: briefly comment num_sc and next_oc
Drivers: hv: vmbus: decrease num_sc on subchannel removal
Drivers: hv: vmbus: move init_vp_index() call to vmbus_process_offer()
Drivers: hv: vmbus: distribute subchannels among all vcpus
Drivers: hv: vmbus: improve selection of an outgoing channel

drivers/hv/channel_mgmt.c | 127 ++++++++++++++++++++++++++--------------------
include/linux/hyperv.h | 12 +++--
2 files changed, 80 insertions(+), 59 deletions(-)

--
1.9.3


2015-04-27 17:04:39

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH v2 1/6] Drivers: hv: vmbus: unify calls to percpu_channel_enq()

Remove some code duplication, no functional change intended.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
drivers/hv/channel_mgmt.c | 51 +++++++++++++++++------------------------------
1 file changed, 18 insertions(+), 33 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 4b9d89a..b28cbdf 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -233,7 +233,6 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel)
{
struct vmbus_channel *channel;
bool fnew = true;
- bool enq = false;
unsigned long flags;

/* Make sure this is a new offer */
@@ -249,25 +248,12 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel)
}
}

- if (fnew) {
+ if (fnew)
list_add_tail(&newchannel->listentry,
&vmbus_connection.chn_list);
- enq = true;
- }

spin_unlock_irqrestore(&vmbus_connection.channel_lock, flags);

- if (enq) {
- if (newchannel->target_cpu != get_cpu()) {
- put_cpu();
- smp_call_function_single(newchannel->target_cpu,
- percpu_channel_enq,
- newchannel, true);
- } else {
- percpu_channel_enq(newchannel);
- put_cpu();
- }
- }
if (!fnew) {
/*
* Check to see if this is a sub-channel.
@@ -280,26 +266,19 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel)
spin_lock_irqsave(&channel->lock, flags);
list_add_tail(&newchannel->sc_list, &channel->sc_list);
spin_unlock_irqrestore(&channel->lock, flags);
-
- if (newchannel->target_cpu != get_cpu()) {
- put_cpu();
- smp_call_function_single(newchannel->target_cpu,
- percpu_channel_enq,
- newchannel, true);
- } else {
- percpu_channel_enq(newchannel);
- put_cpu();
- }
-
- newchannel->state = CHANNEL_OPEN_STATE;
channel->num_sc++;
- if (channel->sc_creation_callback != NULL)
- channel->sc_creation_callback(newchannel);
-
- return;
- }
+ } else
+ goto err_free_chan;
+ }

- goto err_free_chan;
+ if (newchannel->target_cpu != get_cpu()) {
+ put_cpu();
+ smp_call_function_single(newchannel->target_cpu,
+ percpu_channel_enq,
+ newchannel, true);
+ } else {
+ percpu_channel_enq(newchannel);
+ put_cpu();
}

/*
@@ -309,6 +288,12 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel)
*/
newchannel->state = CHANNEL_OPEN_STATE;

+ if (!fnew) {
+ if (channel->sc_creation_callback != NULL)
+ channel->sc_creation_callback(newchannel);
+ return;
+ }
+
/*
* Start the process of binding this offer to the driver
* We need to set the DeviceObject field before calling
--
1.9.3

2015-04-27 17:04:29

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH v2 2/6] Drivers: hv: vmbus: briefly comment num_sc and next_oc

next_oc and num_sc fields of struct vmbus_channel deserve a description. Move
them closer to sc_list as these fields are related to it.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
include/linux/hyperv.h | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index c739cba..cce7f66 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -727,6 +727,15 @@ struct vmbus_channel {
*/
struct list_head sc_list;
/*
+ * Current number of sub-channels.
+ */
+ int num_sc;
+ /*
+ * Number of a sub-channel (position within sc_list) which is supposed
+ * to be used as the next outgoing channel.
+ */
+ int next_oc;
+ /*
* The primary channel this sub-channel belongs to.
* This will be NULL for the primary channel.
*/
@@ -740,9 +749,6 @@ struct vmbus_channel {
* link up channels based on their CPU affinity.
*/
struct list_head percpu_list;
-
- int num_sc;
- int next_oc;
};

static inline void set_channel_read_state(struct vmbus_channel *c, bool state)
--
1.9.3

2015-04-27 17:04:35

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH v2 3/6] Drivers: hv: vmbus: decrease num_sc on subchannel removal

It is unlikely that that host will ask us to close only one subchannel for a
device but let's be consistent. Do both num_sc++ and num_sc-- with
channel->lock to be on the safe side.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
drivers/hv/channel_mgmt.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index b28cbdf..c53a171 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -205,6 +205,7 @@ void hv_process_channel_removal(struct vmbus_channel *channel, u32 relid)
primary_channel = channel->primary_channel;
spin_lock_irqsave(&primary_channel->lock, flags);
list_del(&channel->sc_list);
+ primary_channel->num_sc--;
spin_unlock_irqrestore(&primary_channel->lock, flags);
}
free_channel(channel);
@@ -265,8 +266,8 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel)
newchannel->primary_channel = channel;
spin_lock_irqsave(&channel->lock, flags);
list_add_tail(&newchannel->sc_list, &channel->sc_list);
- spin_unlock_irqrestore(&channel->lock, flags);
channel->num_sc++;
+ spin_unlock_irqrestore(&channel->lock, flags);
} else
goto err_free_chan;
}
--
1.9.3

2015-04-27 17:04:54

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH v2 4/6] Drivers: hv: vmbus: move init_vp_index() call to vmbus_process_offer()

We need to call init_vp_index() after we added the channel to the appropriate
list (global or subchannel) to be able to use this information when assigning
the channel to the particular vcpu. To do so we need to move a couple of
functions around. The only real change is the init_vp_index() call. This is a
small refactoring without a functional change.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
drivers/hv/channel_mgmt.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index c53a171..655c0a0 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -32,6 +32,9 @@

#include "hyperv_vmbus.h"

+static void init_vp_index(struct vmbus_channel *channel,
+ const uuid_le *type_guid);
+
/**
* vmbus_prep_negotiate_resp() - Create default response for Hyper-V Negotiate message
* @icmsghdrp: Pointer to msg header structure
@@ -272,6 +275,8 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel)
goto err_free_chan;
}

+ init_vp_index(newchannel, &newchannel->offermsg.offer.if_type);
+
if (newchannel->target_cpu != get_cpu()) {
put_cpu();
smp_call_function_single(newchannel->target_cpu,
@@ -476,8 +481,6 @@ static void vmbus_onoffer(struct vmbus_channel_message_header *hdr)
offer->connection_id;
}

- init_vp_index(newchannel, &offer->offer.if_type);
-
memcpy(&newchannel->offermsg, offer,
sizeof(struct vmbus_channel_offer_channel));
newchannel->monitor_grp = (u8)offer->monitorid / 32;
--
1.9.3

2015-04-27 17:07:10

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH v2 5/6] Drivers: hv: vmbus: distribute subchannels among all vcpus

Primary channels are distributed evenly across all vcpus we have. When the host
asks us to create subchannels it usually makes us num_cpus-1 offers and we are
supposed to distribute the work evenly among the channel itself and all its
subchannels. Make sure they are all assigned to different vcpus.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
drivers/hv/channel_mgmt.c | 29 ++++++++++++++++++++++++++++-
1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 655c0a0..1f1417d 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -387,6 +387,8 @@ static void init_vp_index(struct vmbus_channel *channel, const uuid_le *type_gui
int i;
bool perf_chn = false;
u32 max_cpus = num_online_cpus();
+ struct vmbus_channel *primary = channel->primary_channel, *prev;
+ unsigned long flags;

for (i = IDE; i < MAX_PERF_CHN; i++) {
if (!memcmp(type_guid->b, hp_devs[i].guid,
@@ -407,7 +409,32 @@ static void init_vp_index(struct vmbus_channel *channel, const uuid_le *type_gui
channel->target_vp = 0;
return;
}
- cur_cpu = (++next_vp % max_cpus);
+
+ /*
+ * Primary channels are distributed evenly across all vcpus we have.
+ * When the host asks us to create subchannels it usually makes us
+ * num_cpus-1 offers and we are supposed to distribute the work evenly
+ * among the channel itself and all its subchannels. Make sure they are
+ * all assigned to different vcpus.
+ */
+ if (!primary)
+ cur_cpu = (++next_vp % max_cpus);
+ else {
+ /*
+ * Let's assign the first subchannel of a channel to the
+ * primary->target_cpu+1 and all the subsequent channels to
+ * the prev->target_cpu+1.
+ */
+ spin_lock_irqsave(&primary->lock, flags);
+ if (primary->num_sc == 1)
+ cur_cpu = (primary->target_cpu + 1) % max_cpus;
+ else {
+ prev = list_prev_entry(channel, sc_list);
+ cur_cpu = (prev->target_cpu + 1) % max_cpus;
+ }
+ spin_unlock_irqrestore(&primary->lock, flags);
+ }
+
channel->target_cpu = cur_cpu;
channel->target_vp = hv_context.vp_index[cur_cpu];
}
--
1.9.3

2015-04-27 17:06:40

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH v2 6/6] Drivers: hv: vmbus: improve selection of an outgoing channel

vmbus_get_outgoing_channel() implements the following algorithm for selecting
an outgoing channel (despite the comment before the function saying it
distributes the load equally):
1) If we have no subchannels return the primary channel;
2) If primary->next_oc is grater than primary->num_sc reset the primary->next_oc
to 0 and return the primary channel;
3) Aim for the primary->next_oc subchannel, increment primary->next_oc;
4) Loop through all opened subchannels. If we see a channel which has
target_cpu == current_cpu return it. If we reached the primary->next_oc'th
open subchannel return it;
5) Return the primary channel.
The implementation also skips the subchannel No. 0 unless it matches the current
cpu as we assign i to 1 in the initialization.

Improve the algorithm to do the full scan searching for a (sub)channel with the
target_cpu == current_cpu, use round-robin as a fallback.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
drivers/hv/channel_mgmt.c | 39 +++++++++++++++++++--------------------
1 file changed, 19 insertions(+), 20 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 1f1417d..cdc3914 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -823,46 +823,45 @@ cleanup:
}

/*
- * Retrieve the (sub) channel on which to send an outgoing request.
- * When a primary channel has multiple sub-channels, we try to
- * distribute the load equally amongst all available channels.
+ * Retrieve the (sub) channel on which to send an outgoing request. When a
+ * primary channel has multiple sub-channels, we try to find a (sub)channel
+ * with target_cpu == current CPU and we try to distribute the load equally
+ * amongst all available channels when we fail.
*/
struct vmbus_channel *vmbus_get_outgoing_channel(struct vmbus_channel *primary)
{
struct list_head *cur, *tmp;
- int cur_cpu;
struct vmbus_channel *cur_channel;
- struct vmbus_channel *outgoing_channel = primary;
- int next_channel;
- int i = 1;
+ struct vmbus_channel *out_channel_rr = NULL;
+ int i = 0;

if (list_empty(&primary->sc_list))
- return outgoing_channel;
+ return primary;

- next_channel = primary->next_oc++;
-
- if (next_channel > (primary->num_sc)) {
+ if (primary->next_oc >= primary->num_sc) {
primary->next_oc = 0;
- return outgoing_channel;
+ return primary;
}

- cur_cpu = hv_context.vp_index[get_cpu()];
- put_cpu();
+ if (primary->target_cpu == smp_processor_id())
+ return primary;
+
list_for_each_safe(cur, tmp, &primary->sc_list) {
+ i++;
cur_channel = list_entry(cur, struct vmbus_channel, sc_list);
if (cur_channel->state != CHANNEL_OPENED_STATE)
continue;

- if (cur_channel->target_vp == cur_cpu)
+ if (cur_channel->target_cpu == smp_processor_id())
return cur_channel;

- if (i == next_channel)
- return cur_channel;
-
- i++;
+ if (!out_channel_rr && i > primary->next_oc) {
+ primary->next_oc = i;
+ out_channel_rr = cur_channel;
+ }
}

- return outgoing_channel;
+ return out_channel_rr ? out_channel_rr : primary;
}
EXPORT_SYMBOL_GPL(vmbus_get_outgoing_channel);

--
1.9.3

2015-04-28 04:30:23

by Dexuan Cui

[permalink] [raw]
Subject: RE: [PATCH v2 0/6] Drivers: hv: vmbus: fair round robin algorithm for vmbus_get_outgoing_channel()

> -----Original Message-----
> From: Vitaly Kuznetsov [mailto:[email protected]]
> Sent: Tuesday, April 28, 2015 1:04
> To: KY Srinivasan
> Cc: Haiyang Zhang; [email protected]; linux-
> [email protected]; Dexuan Cui
> Subject: [PATCH v2 0/6] Drivers: hv: vmbus: fair round robin algorithm for
> vmbus_get_outgoing_channel()
>
> Changes in v2:
> - Address Dexuan's review comments:
> PATCH 3/6: s,channel,primary_channel;
> PATCH 4/6: add a forward declaration instead of moving code around;
> PATCH 6/6: fix an off-by-one
> - Change the algorithm in PATCH 6/6:
> Instead of a simple round robin we first try to find a (sub)channel with
> the current_cpu == target_cpu and we fallback to a round robin when we
> fail
> to find one.
>
> K. Y., Dexuan, can you please give it a spin in various testing environments
> you have? Thanks!
>
> Original description:
>
> This series is a continuation of the "Drivers: hv: vmbus: Use a round-robin
> algorithm for picking the outgoing channel" work. It is supposed to bring
> two
> significant changes:
> 1) Subchannels for a channel are distributed evenly across all vcpus we have.
> Currently we try to distribute all channels (including subchannels) across
> all vcpus, this approach doesn't guarantee that the particular channel's
> subchannels will be distributed in the same way as we process all offer
> requests in some random order. (Patch 05)
> 2) Channel picking based on the current vcpu is dropped from
> vmbus_get_outgoing_channel() in favor of a fair round robin. (Patch 06)
> (this
> is not true anymore, see 'Changes').
>
> Patches 01 - 04 are cleanup/refactoring.
>
> Vitaly Kuznetsov (6):
> Drivers: hv: vmbus: unify calls to percpu_channel_enq()
> Drivers: hv: vmbus: briefly comment num_sc and next_oc
> Drivers: hv: vmbus: decrease num_sc on subchannel removal
> Drivers: hv: vmbus: move init_vp_index() call to vmbus_process_offer()
> Drivers: hv: vmbus: distribute subchannels among all vcpus
> Drivers: hv: vmbus: improve selection of an outgoing channel
>
> drivers/hv/channel_mgmt.c | 127 ++++++++++++++++++++++++++---------
> -----------
> include/linux/hyperv.h | 12 +++--
> 2 files changed, 80 insertions(+), 59 deletions(-)
>
> --

Patch 1, 2 and 3 are good to me.

We'll have to test 4~6 for performance change.

Thanks,
-- Dexuan