2022-07-11 04:16:35

by Shradha Gupta

Subject: [PATCH v3] Drivers: hv: vm_bus: Handle vmbus rescind calls after vmbus is suspended

Add a flag to indicate that the vmbus is suspended, so that any offer or
rescind message arriving after that point is ignored. Add a new work queue
for rescind messages, so that it can be drained along with the other offer
work queues on suspension.

In some hibernation-related test scenarios, a rescind offer message was
observed arriving after vmbus_bus_suspend(). This would lead to processing
a rescind message for a channel that has already been suspended.

Signed-off-by: Shradha Gupta <[email protected]>
---

Changes in v3:
* Remove unused variable hv_cpu from vmbus_bus_resume() call

---
drivers/hv/connection.c | 11 +++++++++++
drivers/hv/hyperv_vmbus.h | 7 +++++++
drivers/hv/vmbus_drv.c | 27 +++++++++++++++++++--------
3 files changed, 37 insertions(+), 8 deletions(-)

diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
index 6218bbf6863a..eca7afd366d6 100644
--- a/drivers/hv/connection.c
+++ b/drivers/hv/connection.c
@@ -171,6 +171,14 @@ int vmbus_connect(void)
goto cleanup;
}

+ vmbus_connection.rescind_work_queue =
+ create_workqueue("hv_vmbus_rescind");
+ if (!vmbus_connection.rescind_work_queue) {
+ ret = -ENOMEM;
+ goto cleanup;
+ }
+ vmbus_connection.ignore_any_offer_msg = false;
+
vmbus_connection.handle_primary_chan_wq =
create_workqueue("hv_pri_chan");
if (!vmbus_connection.handle_primary_chan_wq) {
@@ -357,6 +365,9 @@ void vmbus_disconnect(void)
if (vmbus_connection.handle_primary_chan_wq)
destroy_workqueue(vmbus_connection.handle_primary_chan_wq);

+ if (vmbus_connection.rescind_work_queue)
+ destroy_workqueue(vmbus_connection.rescind_work_queue);
+
if (vmbus_connection.work_queue)
destroy_workqueue(vmbus_connection.work_queue);

diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index 4f5b824b16cf..dc673edf053c 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -261,6 +261,13 @@ struct vmbus_connection {
struct workqueue_struct *work_queue;
struct workqueue_struct *handle_primary_chan_wq;
struct workqueue_struct *handle_sub_chan_wq;
+ struct workqueue_struct *rescind_work_queue;
+
+ /*
+ * On suspension of the vmbus, the accumulated offer messages
+ * must be dropped.
+ */
+ bool ignore_any_offer_msg;

/*
* The number of sub-channels and hv_sock channels that should be
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 547ae334e5cd..23c680d1a0f5 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -1160,7 +1160,9 @@ void vmbus_on_msg_dpc(unsigned long data)
* work queue: the RESCIND handler can not start to
* run before the OFFER handler finishes.
*/
- schedule_work(&ctx->work);
+ if (vmbus_connection.ignore_any_offer_msg)
+ break;
+ queue_work(vmbus_connection.rescind_work_queue, &ctx->work);
break;

case CHANNELMSG_OFFERCHANNEL:
@@ -1186,6 +1188,8 @@ void vmbus_on_msg_dpc(unsigned long data)
* to the CPUs which will execute the offer & rescind
* works by the time these works will start execution.
*/
+ if (vmbus_connection.ignore_any_offer_msg)
+ break;
atomic_inc(&vmbus_connection.offer_in_progress);
fallthrough;

@@ -2446,15 +2450,20 @@ static int vmbus_acpi_add(struct acpi_device *device)
#ifdef CONFIG_PM_SLEEP
static int vmbus_bus_suspend(struct device *dev)
{
+ struct hv_per_cpu_context *hv_cpu = per_cpu_ptr(
+ hv_context.cpu_context, VMBUS_CONNECT_CPU);
struct vmbus_channel *channel, *sc;

- while (atomic_read(&vmbus_connection.offer_in_progress) != 0) {
- /*
- * We wait here until the completion of any channel
- * offers that are currently in progress.
- */
- usleep_range(1000, 2000);
- }
+ tasklet_disable(&hv_cpu->msg_dpc);
+ vmbus_connection.ignore_any_offer_msg = true;
+ /* The tasklet_enable() takes care of providing a memory barrier */
+ tasklet_enable(&hv_cpu->msg_dpc);
+
+ /* Drain all the workqueues as we are in suspend */
+ drain_workqueue(vmbus_connection.rescind_work_queue);
+ drain_workqueue(vmbus_connection.work_queue);
+ drain_workqueue(vmbus_connection.handle_primary_chan_wq);
+ drain_workqueue(vmbus_connection.handle_sub_chan_wq);

mutex_lock(&vmbus_connection.channel_mutex);
list_for_each_entry(channel, &vmbus_connection.chn_list, listentry) {
@@ -2531,6 +2540,8 @@ static int vmbus_bus_resume(struct device *dev)
size_t msgsize;
int ret;

+ vmbus_connection.ignore_any_offer_msg = false;
+
/*
* We only use the 'vmbus_proto_version', which was in use before
* hibernation, to re-negotiate with the host.
--
2.17.1


2022-07-11 18:11:05

by Michael Kelley (LINUX)

Subject: RE: [PATCH v3] Drivers: hv: vm_bus: Handle vmbus rescind calls after vmbus is suspended

From: Shradha Gupta <[email protected]> Sent: Sunday, July 10, 2022 9:12 PM
>
> Add a flag to indicate that the vmbus is suspended, so that any offer or
> rescind message arriving after that point is ignored. Add a new work queue
> for rescind messages, so that it can be drained along with the other offer
> work queues on suspension.
>
> In some hibernation-related test scenarios, a rescind offer message was
> observed arriving after vmbus_bus_suspend(). This would lead to processing
> a rescind message for a channel that has already been suspended.
>
> Signed-off-by: Shradha Gupta <[email protected]>
> ---
>
> Changes in v3:
> * Remove unused variable hv_cpu from vmbus_bus_resume() call
>
> ---
> drivers/hv/connection.c | 11 +++++++++++
> drivers/hv/hyperv_vmbus.h | 7 +++++++
> drivers/hv/vmbus_drv.c | 27 +++++++++++++++++++--------
> 3 files changed, 37 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
> index 6218bbf6863a..eca7afd366d6 100644
> --- a/drivers/hv/connection.c
> +++ b/drivers/hv/connection.c
> @@ -171,6 +171,14 @@ int vmbus_connect(void)
> goto cleanup;
> }
>
> + vmbus_connection.rescind_work_queue =
> + create_workqueue("hv_vmbus_rescind");
> + if (!vmbus_connection.rescind_work_queue) {
> + ret = -ENOMEM;
> + goto cleanup;
> + }
> + vmbus_connection.ignore_any_offer_msg = false;
> +
> vmbus_connection.handle_primary_chan_wq =
> create_workqueue("hv_pri_chan");
> if (!vmbus_connection.handle_primary_chan_wq) {
> @@ -357,6 +365,9 @@ void vmbus_disconnect(void)
> if (vmbus_connection.handle_primary_chan_wq)
> destroy_workqueue(vmbus_connection.handle_primary_chan_wq);
>
> + if (vmbus_connection.rescind_work_queue)
> + destroy_workqueue(vmbus_connection.rescind_work_queue);
> +
> if (vmbus_connection.work_queue)
> destroy_workqueue(vmbus_connection.work_queue);
>
> diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
> index 4f5b824b16cf..dc673edf053c 100644
> --- a/drivers/hv/hyperv_vmbus.h
> +++ b/drivers/hv/hyperv_vmbus.h
> @@ -261,6 +261,13 @@ struct vmbus_connection {
> struct workqueue_struct *work_queue;
> struct workqueue_struct *handle_primary_chan_wq;
> struct workqueue_struct *handle_sub_chan_wq;
> + struct workqueue_struct *rescind_work_queue;
> +
> + /*
> + * On suspension of the vmbus, the accumulated offer messages
> + * must be dropped.
> + */
> + bool ignore_any_offer_msg;
>
> /*
> * The number of sub-channels and hv_sock channels that should be
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index 547ae334e5cd..23c680d1a0f5 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -1160,7 +1160,9 @@ void vmbus_on_msg_dpc(unsigned long data)
> * work queue: the RESCIND handler can not start to
> * run before the OFFER handler finishes.
> */
> - schedule_work(&ctx->work);
> + if (vmbus_connection.ignore_any_offer_msg)
> + break;
> + queue_work(vmbus_connection.rescind_work_queue, &ctx->work);
> break;
>
> case CHANNELMSG_OFFERCHANNEL:
> @@ -1186,6 +1188,8 @@ void vmbus_on_msg_dpc(unsigned long data)
> * to the CPUs which will execute the offer & rescind
> * works by the time these works will start execution.
> */
> + if (vmbus_connection.ignore_any_offer_msg)
> + break;
> atomic_inc(&vmbus_connection.offer_in_progress);
> fallthrough;
>
> @@ -2446,15 +2450,20 @@ static int vmbus_acpi_add(struct acpi_device *device)
> #ifdef CONFIG_PM_SLEEP
> static int vmbus_bus_suspend(struct device *dev)
> {
> + struct hv_per_cpu_context *hv_cpu = per_cpu_ptr(
> + hv_context.cpu_context, VMBUS_CONNECT_CPU);
> struct vmbus_channel *channel, *sc;
>
> - while (atomic_read(&vmbus_connection.offer_in_progress) != 0) {
> - /*
> - * We wait here until the completion of any channel
> - * offers that are currently in progress.
> - */
> - usleep_range(1000, 2000);
> - }
> + tasklet_disable(&hv_cpu->msg_dpc);
> + vmbus_connection.ignore_any_offer_msg = true;
> + /* The tasklet_enable() takes care of providing a memory barrier */
> + tasklet_enable(&hv_cpu->msg_dpc);
> +
> + /* Drain all the workqueues as we are in suspend */
> + drain_workqueue(vmbus_connection.rescind_work_queue);
> + drain_workqueue(vmbus_connection.work_queue);
> + drain_workqueue(vmbus_connection.handle_primary_chan_wq);
> + drain_workqueue(vmbus_connection.handle_sub_chan_wq);
>
> mutex_lock(&vmbus_connection.channel_mutex);
> list_for_each_entry(channel, &vmbus_connection.chn_list, listentry) {
> @@ -2531,6 +2540,8 @@ static int vmbus_bus_resume(struct device *dev)
> size_t msgsize;
> int ret;
>
> + vmbus_connection.ignore_any_offer_msg = false;
> +
> /*
> * We only use the 'vmbus_proto_version', which was in use before
> * hibernation, to re-negotiate with the host.
> --
> 2.17.1

Reviewed-by: Michael Kelley <[email protected]>

2022-07-11 19:08:03

by Wei Liu

Subject: Re: [PATCH v3] Drivers: hv: vm_bus: Handle vmbus rescind calls after vmbus is suspended

On Mon, Jul 11, 2022 at 05:50:43PM +0000, Michael Kelley (LINUX) wrote:
> From: Shradha Gupta <[email protected]> Sent: Sunday, July 10, 2022 9:12 PM
> >
> > Add a flag to indicate that the vmbus is suspended, so that any offer or
> > rescind message arriving after that point is ignored. Add a new work queue
> > for rescind messages, so that it can be drained along with the other offer
> > work queues on suspension.
> >
> > In some hibernation-related test scenarios, a rescind offer message was
> > observed arriving after vmbus_bus_suspend(). This would lead to processing
> > a rescind message for a channel that has already been suspended.
> >
> > Signed-off-by: Shradha Gupta <[email protected]>
[...]
>
> Reviewed-by: Michael Kelley <[email protected]>
>

Applied to hyperv-next. Thanks.
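
Note on the patch in this thread: the gating it describes (set a flag on
suspend, drop offer and rescind messages while the flag is set, drain what
was already queued) can be illustrated with a small user-space model. This
is only a sketch under stated assumptions, not the kernel code: every name
below (model_bus, model_dispatch(), model_drain(), model_suspend(),
model_resume(), model_selfcheck()) is hypothetical, a plain array stands in
for the work queues, and no locking or tasklet behaviour is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
enum msg_type { MSG_OFFER, MSG_RESCIND, MSG_OTHER };

#define QUEUE_CAP 16

struct model_bus {
	bool ignore_any_offer_msg;      /* mirrors the flag added by the patch */
	enum msg_type queue[QUEUE_CAP]; /* stands in for the work queues */
	size_t queued;                  /* messages waiting to be handled */
	size_t handled;                 /* messages already handled */
};

/* Dispatch step: returns true if the message was queued, false if dropped. */
static bool model_dispatch(struct model_bus *bus, enum msg_type type)
{
	/* While suspended, offer and rescind messages are ignored. */
	if ((type == MSG_OFFER || type == MSG_RESCIND) &&
	    bus->ignore_any_offer_msg)
		return false;
	if (bus->queued == QUEUE_CAP)
		return false; /* model limitation: the queue is full */
	bus->queue[bus->queued++] = type;
	return true;
}

/* Handle everything that is currently queued. */
static void model_drain(struct model_bus *bus)
{
	bus->handled += bus->queued;
	bus->queued = 0;
}

/* Suspend: stop accepting offer/rescind messages first, then drain. */
static void model_suspend(struct model_bus *bus)
{
	bus->ignore_any_offer_msg = true;
	model_drain(bus);
}

/* Resume: accept offer/rescind messages again. */
static void model_resume(struct model_bus *bus)
{
	bus->ignore_any_offer_msg = false;
}

/* Walks through the sequence described in the commit message. */
static void model_selfcheck(void)
{
	struct model_bus bus = { 0 };

	assert(model_dispatch(&bus, MSG_RESCIND));   /* queued before suspend */
	model_suspend(&bus);
	assert(bus.queued == 0 && bus.handled == 1); /* drained on suspend */
	assert(!model_dispatch(&bus, MSG_RESCIND));  /* late rescind dropped */
	model_resume(&bus);
	assert(model_dispatch(&bus, MSG_OFFER));     /* accepted after resume */
}
```

Setting the flag before draining is the point of the ordering: nothing new
can be queued behind the drain, so once the suspend step returns there is no
pending offer or rescind work left in the model.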