On Thu, May 26, 2022 at 08:45:33PM +0000, Michael Kelley (LINUX) wrote:
> From: Saurabh Sengar <[email protected]> Sent: Thursday, May 26, 2022 11:55 AM
>
> > Subject: [PATCH] Drivers: hv: vmbus: Adding isolated cpu support for channel interrupts
> > mapping
>
> Let me suggest a more compact and precise Subject:
>
> Drivers: hv: vmbus: Don't assign VMbus channel interrupts to isolated CPUs
[sss]: ok
>
> >
> > Adding support for vmbus channels to take isolated cpu in consideration
> > while assigning interrupt to different cpus. This also prevents user from
> > setting any isolated cpu to vmbus channel interrupt assignment by sysfs
> > entry. Isolated cpu can be configured by kernel command line parameter
> > 'isolcpus=managed_irq,<#cpu>'.
>
> Also, for the commit statement:
>
> When initially assigning a VMbus channel interrupt to a CPU, don't choose
> a managed IRQ isolated CPU (as specified on the kernel boot line with
> parameter 'isolcpus=managed_irq,<#cpu>'). Also, when using sysfs to
> change the CPU that a VMbus channel will interrupt, don't allow changing
> to a managed IRQ isolated CPU.
>
[sss] : ok
> >
> > Signed-off-by: Saurabh Sengar <[email protected]>
> > ---
> > drivers/hv/channel_mgmt.c | 18 ++++++++++++------
> > drivers/hv/vmbus_drv.c | 6 ++++++
> > 2 files changed, 18 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> > index 97d8f56..e1fe029 100644
> > --- a/drivers/hv/channel_mgmt.c
> > +++ b/drivers/hv/channel_mgmt.c
> > @@ -21,6 +21,7 @@
> > #include <linux/cpu.h>
> > #include <linux/hyperv.h>
> > #include <asm/mshyperv.h>
> > +#include <linux/sched/isolation.h>
> >
> > #include "hyperv_vmbus.h"
> >
> > @@ -728,16 +729,20 @@ static void init_vp_index(struct vmbus_channel *channel)
> > u32 i, ncpu = num_online_cpus();
> > cpumask_var_t available_mask;
> > struct cpumask *allocated_mask;
> > + const struct cpumask *hk_mask = housekeeping_cpumask(HK_TYPE_MANAGED_IRQ);
> > u32 target_cpu;
> > int numa_node;
> >
> > if (!perf_chn ||
> > - !alloc_cpumask_var(&available_mask, GFP_KERNEL)) {
> > + !alloc_cpumask_var(&available_mask, GFP_KERNEL) ||
> > + cpumask_empty(hk_mask)) {
> > /*
> > * If the channel is not a performance critical
> > * channel, bind it to VMBUS_CONNECT_CPU.
> > * In case alloc_cpumask_var() fails, bind it to
> > * VMBUS_CONNECT_CPU.
> > + * If all the cpus are isolated, bind it to
> > + * VMBUS_CONNECT_CPU.
> > */
> > channel->target_cpu = VMBUS_CONNECT_CPU;
> > if (perf_chn)
> > @@ -758,17 +763,19 @@ static void init_vp_index(struct vmbus_channel *channel)
> > }
> > allocated_mask = &hv_context.hv_numa_map[numa_node];
> >
> > - if (cpumask_equal(allocated_mask, cpumask_of_node(numa_node))) {
> > +retry:
> > + cpumask_xor(available_mask, allocated_mask, cpumask_of_node(numa_node));
>
> There's a bug here that existed in the code prior to this patch. The code
> checks to make sure cpumask_of_node(numa_node) is not empty, and then
> later references cpumask_of_node(numa_node) again. But in between the
> check and the use, one or more CPUs could go offline, leaving
> cpumask_of_node(numa_node) empty since that array of cpumasks contains
> only online CPUs. In such a case, execution could get stuck in an infinite
> loop with available_mask being empty.
>
> The solution is to call cpus_read_lock() before starting the main "for"
> loop and then call cpus_read_unlock() at the end. This lock will prevent
> CPUs from going offline, and hence ensure that the node mask can't
> become empty. You'll notice that target_cpu_store() uses that lock
> to prevent a similar problem.
>
> Fixing this locking problem should probably be a separate patch.
>
> Michael
[sss] : Got it. I will send the cpus_read_lock() fix as a separate patch once review of this patch is complete.
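
For reference, the selection step under discussion can be modelled in userspace. This is only a sketch under stated assumptions, not the kernel code: mask_t, mask_first() and pick_cpu() are hypothetical names, a 64-bit integer stands in for struct cpumask, and the retry is bounded so that the model returns -1 in the case where, as described above, the kernel loop would keep spinning on an empty mask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for struct cpumask: bit n set means CPU n. */
typedef uint64_t mask_t;

/* Index of the lowest set bit, or -1 if the mask is empty. */
static int mask_first(mask_t m)
{
	int cpu;

	for (cpu = 0; cpu < 64; cpu++)
		if (m & ((mask_t)1 << cpu))
			return cpu;
	return -1;
}

/*
 * Model of the per-node selection step with the housekeeping mask applied.
 * node: CPUs currently in the NUMA node (assumed to model cpumask_of_node()).
 * hk:   CPUs that are not managed-IRQ isolated.
 * Returns the chosen CPU, or -1 when no eligible CPU exists even after the
 * allocated map has been reset (the case where an unbounded retry never ends).
 */
static int pick_cpu(mask_t *allocated, mask_t node, mask_t hk)
{
	mask_t available;
	int attempt, cpu;

	assert(allocated != NULL);

	for (attempt = 0; attempt < 2; attempt++) {
		available = (*allocated ^ node) & hk;
		if (available) {
			cpu = mask_first(available);
			*allocated |= (mask_t)1 << cpu;
			return cpu;
		}
		/* Every eligible CPU in the node has been used: reset and retry once. */
		*allocated = 0;
	}
	return -1;
}
```

With node = 0xF (CPUs 0-3) and hk = 0xB (CPU 2 isolated), successive calls hand out CPUs 0, 1, 3 and then wrap back to 0. With an empty node mask the model returns -1, which is the situation that holding cpus_read_lock() around the loop is meant to rule out.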
>
> > + cpumask_and(available_mask, available_mask, hk_mask);
> > +
> > + if (cpumask_empty(available_mask)) {
> > /*
> > * We have cycled through all the CPUs in the node;
> > * reset the allocated map.
> > */
> > cpumask_clear(allocated_mask);
> > + goto retry;
> > }
> >
> > - cpumask_xor(available_mask, allocated_mask,
> > - cpumask_of_node(numa_node));
> > -
> > target_cpu = cpumask_first(available_mask);
> > cpumask_set_cpu(target_cpu, allocated_mask);
> >
> > @@ -778,7 +785,6 @@ static void init_vp_index(struct vmbus_channel *channel)
> > }
> >
> > channel->target_cpu = target_cpu;
> > -
> > free_cpumask_var(available_mask);
> > }
>
> Removing the blank line above is a gratuitous change that isn't needed.
> Generally, a patch should avoid such changes unless the purpose of
> the patch is code cleanup.
>
[sss] : That change slipped in by mistake; I will restore the blank line.
> >
> > diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> > index 714d549..23660a8 100644
> > --- a/drivers/hv/vmbus_drv.c
> > +++ b/drivers/hv/vmbus_drv.c
> > @@ -21,6 +21,7 @@
> > #include <linux/kernel_stat.h>
> > #include <linux/clockchips.h>
> > #include <linux/cpu.h>
> > +#include <linux/sched/isolation.h>
> > #include <linux/sched/task_stack.h>
> >
> > #include <linux/delay.h>
> > @@ -1770,6 +1771,11 @@ static ssize_t target_cpu_store(struct vmbus_channel
> > *channel,
> > if (target_cpu >= nr_cpumask_bits)
> > return -EINVAL;
> >
> > + if (!cpumask_test_cpu(target_cpu, housekeeping_cpumask(HK_TYPE_MANAGED_IRQ))) {
> > + dev_err(&channel->device_obj->device,
> > + "cpu (%d) is isolated, can't be assigned\n", target_cpu);
>
> I don't think a message should be output here. The other errors in this
> function don't output a message. Generally, the kernel doesn't output
> a message just because a user provided bad input. Doing so makes it
> too easy for a user (even a sysadmin) to cause the kernel to go wild
> outputting messages.
>
> Michael
>
[sss] : Sure, I will remove the dev_err() message.
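
A minimal userspace sketch of the check once the message is dropped, assuming a 64-bit integer in place of the housekeeping cpumask. validate_target_cpu() and its parameters are hypothetical names for illustration only; the point is that bad input is rejected with -EINVAL and nothing is logged.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical model of the input validation in target_cpu_store().
 * nr_cpus stands in for nr_cpumask_bits; hk_mask has bit n set when
 * CPU n is a housekeeping (non-isolated) CPU for managed IRQs.
 * Returns 0 if the CPU may be targeted, -EINVAL otherwise.
 */
static int validate_target_cpu(unsigned int target_cpu, unsigned int nr_cpus,
			       uint64_t hk_mask)
{
	/* Limitation of this model: one 64-bit word of CPUs. */
	assert(nr_cpus <= 64);

	/* Out-of-range CPU number. */
	if (target_cpu >= nr_cpus)
		return -EINVAL;

	/* Isolated CPU: reject silently, like the other error paths. */
	if (!(hk_mask & ((uint64_t)1 << target_cpu)))
		return -EINVAL;

	return 0;
}
```

For example, with four CPUs and hk_mask = 0xB, a request for CPU 2 or CPU 4 fails with -EINVAL, while CPU 1 is accepted.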
> > + return -EINVAL;
> > + }
> > /* No CPUs should come up or down during this. */
> > cpus_read_lock();
> >
> > --
> > 1.8.3.1