2022-06-09 05:56:17

by Saurabh Singh Sengar

[permalink] [raw]
Subject: [PATCH] Drivers: hv: vmbus: Add cpu read lock

Add cpus_read_lock to prevent CPUs from going offline between the
query and the actual use of the cpumask: cpumask_of_node is queried
first and its result is used later; if any CPU goes offline between
these two events, the retry loop can spin indefinitely.

Signed-off-by: Saurabh Sengar <[email protected]>
---
drivers/hv/channel_mgmt.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 85a2142..6a88b7e 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -749,6 +749,9 @@ static void init_vp_index(struct vmbus_channel *channel)
 		return;
 	}
 
+	/* No CPUs should come up or down during this. */
+	cpus_read_lock();
+
 	for (i = 1; i <= ncpu + 1; i++) {
 		while (true) {
 			numa_node = next_numa_node_id++;
@@ -781,6 +784,7 @@ static void init_vp_index(struct vmbus_channel *channel)
 			break;
 	}
 
+	cpus_read_unlock();
 	channel->target_cpu = target_cpu;
 
 	free_cpumask_var(available_mask);
--
1.8.3.1


2022-06-09 14:20:03

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: RE: [PATCH] Drivers: hv: vmbus: Add cpu read lock

From: Saurabh Sengar <[email protected]> Sent: Wednesday, June 8, 2022 10:27 PM
>
> Add cpus_read_lock to prevent CPUs from going offline between the
> query and the actual use of the cpumask: cpumask_of_node is queried
> first and its result is used later; if any CPU goes offline between
> these two events, the retry loop can spin indefinitely.
>
> Signed-off-by: Saurabh Sengar <[email protected]>
> ---
> drivers/hv/channel_mgmt.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> index 85a2142..6a88b7e 100644
> --- a/drivers/hv/channel_mgmt.c
> +++ b/drivers/hv/channel_mgmt.c
> @@ -749,6 +749,9 @@ static void init_vp_index(struct vmbus_channel *channel)
>  		return;
>  	}
> 
> +	/* No CPUs should come up or down during this. */
> +	cpus_read_lock();
> +
>  	for (i = 1; i <= ncpu + 1; i++) {
>  		while (true) {
>  			numa_node = next_numa_node_id++;
> @@ -781,6 +784,7 @@ static void init_vp_index(struct vmbus_channel *channel)
>  			break;
>  	}
> 
> +	cpus_read_unlock();
>  	channel->target_cpu = target_cpu;
> 
>  	free_cpumask_var(available_mask);
> --
> 1.8.3.1

This patch was motivated because I suggested a potential issue here during
a separate conversation with Saurabh, but it turns out I was wrong. :-(

init_vp_index() is only called from vmbus_process_offer(), and the
cpus_read_lock() is already held when init_vp_index() is called. So the
issue doesn't exist, and this patch isn't needed.

However, looking at vmbus_process_offer(), there appears to be a
different problem in that cpus_read_unlock() is not called when taking
the error return because the sub_channel_index is zero.
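
For context, the locking in vmbus_process_offer() is shaped roughly as
follows (a paraphrased outline for illustration, not an exact excerpt):

	static void vmbus_process_offer(struct vmbus_channel *newchannel)
	{
		cpus_read_lock();
		mutex_lock(&vmbus_connection.channel_mutex);

		/*
		 * ... duplicate-offer and sub-channel checks; the
		 * sub_channel_index == 0 error path drops the mutex and
		 * returns without calling cpus_read_unlock() ...
		 */

		init_vp_index(newchannel);	/* lock already held here */

		/* ... */
		cpus_read_unlock();
	}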

Michael


2022-06-09 14:21:26

by Haiyang Zhang

[permalink] [raw]
Subject: RE: [PATCH] Drivers: hv: vmbus: Add cpu read lock



> -----Original Message-----
> From: Michael Kelley (LINUX) <[email protected]>
> Sent: Thursday, June 9, 2022 9:51 AM
> To: Saurabh Sengar <[email protected]>; KY Srinivasan
> <[email protected]>; Haiyang Zhang <[email protected]>; Stephen
> Hemminger <[email protected]>; [email protected]; Dexuan Cui
> <[email protected]>; [email protected]; linux-
> [email protected]; Saurabh Singh Sengar <[email protected]>
> Subject: RE: [PATCH] Drivers: hv: vmbus: Add cpu read lock
>
> From: Saurabh Sengar <[email protected]> Sent: Wednesday, June
> 8, 2022 10:27 PM
> >
> > Add cpus_read_lock to prevent CPUs from going offline between the
> > query and the actual use of the cpumask: cpumask_of_node is queried
> > first and its result is used later; if any CPU goes offline between
> > these two events, the retry loop can spin indefinitely.
> >
> > Signed-off-by: Saurabh Sengar <[email protected]>
> > ---
> > drivers/hv/channel_mgmt.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> > diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> > index 85a2142..6a88b7e 100644
> > --- a/drivers/hv/channel_mgmt.c
> > +++ b/drivers/hv/channel_mgmt.c
> > @@ -749,6 +749,9 @@ static void init_vp_index(struct vmbus_channel *channel)
> >  		return;
> >  	}
> > 
> > +	/* No CPUs should come up or down during this. */
> > +	cpus_read_lock();
> > +
> >  	for (i = 1; i <= ncpu + 1; i++) {
> >  		while (true) {
> >  			numa_node = next_numa_node_id++;
> > @@ -781,6 +784,7 @@ static void init_vp_index(struct vmbus_channel *channel)
> >  			break;
> >  	}
> > 
> > +	cpus_read_unlock();
> >  	channel->target_cpu = target_cpu;
> > 
> >  	free_cpumask_var(available_mask);
> > --
> > 1.8.3.1
>
> This patch was motivated because I suggested a potential issue here during
> a separate conversation with Saurabh, but it turns out I was wrong. :-(
>
> init_vp_index() is only called from vmbus_process_offer(), and the
> cpus_read_lock() is already held when init_vp_index() is called. So the
> issue doesn't exist, and this patch isn't needed.
>
> However, looking at vmbus_process_offer(), there appears to be a
> different problem in that cpus_read_unlock() is not called when taking
> the error return because the sub_channel_index is zero.
>
> Michael
>

	} else {
		/*
		 * Check to see if this is a valid sub-channel.
		 */
		if (newchannel->offermsg.offer.sub_channel_index == 0) {
			mutex_unlock(&vmbus_connection.channel_mutex);
			/*
			 * Don't call free_channel(), because newchannel->kobj
			 * is not initialized yet.
			 */
			kfree(newchannel);
			WARN_ON_ONCE(1);
			return;
		}

If this happens, it should be a host bug. Yes, I also think the cpus_read_unlock()
is missing in this error path.

Thanks,
- Haiyang

2022-06-09 14:35:06

by Saurabh Singh Sengar

[permalink] [raw]
Subject: Re: [PATCH] Drivers: hv: vmbus: Add cpu read lock

On Thu, Jun 09, 2022 at 01:59:02PM +0000, Haiyang Zhang wrote:
>
>
> > -----Original Message-----
> > From: Michael Kelley (LINUX) <[email protected]>
> > Sent: Thursday, June 9, 2022 9:51 AM
> > To: Saurabh Sengar <[email protected]>; KY Srinivasan
> > <[email protected]>; Haiyang Zhang <[email protected]>; Stephen
> > Hemminger <[email protected]>; [email protected]; Dexuan Cui
> > <[email protected]>; [email protected]; linux-
> > [email protected]; Saurabh Singh Sengar <[email protected]>
> > Subject: RE: [PATCH] Drivers: hv: vmbus: Add cpu read lock
> >
> > From: Saurabh Sengar <[email protected]> Sent: Wednesday, June
> > 8, 2022 10:27 PM
> > >
> > > Add cpus_read_lock to prevent CPUs from going offline between the
> > > query and the actual use of the cpumask: cpumask_of_node is queried
> > > first and its result is used later; if any CPU goes offline between
> > > these two events, the retry loop can spin indefinitely.
> > >
> > > Signed-off-by: Saurabh Sengar <[email protected]>
> > > ---
> > > drivers/hv/channel_mgmt.c | 4 ++++
> > > 1 file changed, 4 insertions(+)
> > >
> > > diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> > > index 85a2142..6a88b7e 100644
> > > --- a/drivers/hv/channel_mgmt.c
> > > +++ b/drivers/hv/channel_mgmt.c
> > > @@ -749,6 +749,9 @@ static void init_vp_index(struct vmbus_channel *channel)
> > >  		return;
> > >  	}
> > > 
> > > +	/* No CPUs should come up or down during this. */
> > > +	cpus_read_lock();
> > > +
> > >  	for (i = 1; i <= ncpu + 1; i++) {
> > >  		while (true) {
> > >  			numa_node = next_numa_node_id++;
> > > @@ -781,6 +784,7 @@ static void init_vp_index(struct vmbus_channel *channel)
> > >  			break;
> > >  	}
> > > 
> > > +	cpus_read_unlock();
> > >  	channel->target_cpu = target_cpu;
> > > 
> > >  	free_cpumask_var(available_mask);
> > > --
> > > 1.8.3.1
> >
> > This patch was motivated because I suggested a potential issue here during
> > a separate conversation with Saurabh, but it turns out I was wrong. :-(
> >
> > init_vp_index() is only called from vmbus_process_offer(), and the
> > cpus_read_lock() is already held when init_vp_index() is called. So the
> > issue doesn't exist, and this patch isn't needed.
> >
> > However, looking at vmbus_process_offer(), there appears to be a
> > different problem in that cpus_read_unlock() is not called when taking
> > the error return because the sub_channel_index is zero.
> >
> > Michael
> >
>
> 	} else {
> 		/*
> 		 * Check to see if this is a valid sub-channel.
> 		 */
> 		if (newchannel->offermsg.offer.sub_channel_index == 0) {
> 			mutex_unlock(&vmbus_connection.channel_mutex);
> 			/*
> 			 * Don't call free_channel(), because newchannel->kobj
> 			 * is not initialized yet.
> 			 */
> 			kfree(newchannel);
> 			WARN_ON_ONCE(1);
> 			return;
> 		}
>
> If this happens, it should be a host bug. Yes, I also think the cpus_read_unlock()
> is missing in this error path.
>
> Thanks,
> - Haiyang

I see, will send another patch to fix this.
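
A minimal fix would presumably be a one-liner along these lines
(untested sketch; the hunk header and context are abbreviated):

--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ static void vmbus_process_offer(struct vmbus_channel *newchannel)
 		if (newchannel->offermsg.offer.sub_channel_index == 0) {
 			mutex_unlock(&vmbus_connection.channel_mutex);
+			cpus_read_unlock();
 			/*
 			 * Don't call free_channel(), because newchannel->kobj
 			 * is not initialized yet.
 			 */
 			kfree(newchannel);
 			WARN_ON_ONCE(1);
 			return;
 		}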

Regards,
Saurabh