From: Long Li <[email protected]>
When the kernel boots with a NUMA topology in which some NUMA nodes are
offline, the PCI driver should only set an online NUMA node on the device.
This can happen during KDUMP, where some NUMA nodes are not made online by
the KDUMP kernel.

This patch also fixes the case where the kernel is booted with "numa=off".

Signed-off-by: Long Li <[email protected]>
Change from v1:
Use numa_map_to_online_node() to assign a node to device (suggested by
Michael Kelly <[email protected]>)
---
drivers/pci/controller/pci-hyperv.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index 6c9efeefae1b..c7519add6f13 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -2130,7 +2130,15 @@ static void hv_pci_assign_numa_node(struct hv_pcibus_device *hbus)
 			continue;

 		if (hv_dev->desc.flags & HV_PCI_DEVICE_FLAG_NUMA_AFFINITY)
-			set_dev_node(&dev->dev, hv_dev->desc.virtual_numa_node);
+			/*
+			 * The kernel may boot with some NUMA nodes offline
+			 * (e.g. in a KDUMP kernel) or with NUMA disabled via
+			 * "numa=off". In those cases, adjust the host provided
+			 * NUMA node to a valid NUMA node used by the kernel.
+			 */
+			set_dev_node(&dev->dev,
+				numa_map_to_online_node(
+					hv_dev->desc.virtual_numa_node));

 		put_pcichild(hv_dev);
 	}
--
2.25.1
From: [email protected] <[email protected]> Sent: Tuesday, January 18, 2022 6:45 PM
>
> When the kernel boots with a NUMA topology in which some NUMA nodes are
> offline, the PCI driver should only set an online NUMA node on the device.
> This can happen during KDUMP, where some NUMA nodes are not made online by
> the KDUMP kernel.
>
> This patch also fixes the case where the kernel is booted with "numa=off".
>
> Signed-off-by: Long Li <[email protected]>
It seems like adding a "Fixes:" tag would be appropriate.
>
> Change from v1:
> Use numa_map_to_online_node() to assign a node to device (suggested by
> Michael Kelly <[email protected]>)
>
> ---
> drivers/pci/controller/pci-hyperv.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> index 6c9efeefae1b..c7519add6f13 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -2130,7 +2130,15 @@ static void hv_pci_assign_numa_node(struct hv_pcibus_device *hbus)
>  			continue;
>
>  		if (hv_dev->desc.flags & HV_PCI_DEVICE_FLAG_NUMA_AFFINITY)
> -			set_dev_node(&dev->dev, hv_dev->desc.virtual_numa_node);
> +			/*
> +			 * The kernel may boot with some NUMA nodes offline
> +			 * (e.g. in a KDUMP kernel) or with NUMA disabled via
> +			 * "numa=off". In those cases, adjust the host provided
> +			 * NUMA node to a valid NUMA node used by the kernel.
> +			 */
> +			set_dev_node(&dev->dev,
> +				numa_map_to_online_node(
> +					hv_dev->desc.virtual_numa_node));
Double-check me, but I think this approach has a flaw in that
numa_map_to_online_node() doesn't check the input node for being
out-of-range. The call to node_online() uses the input node to index into
the node_states bitmap (of type nodemask_t), and could go off the end of
the bitmap if the input node isn't validated to be less than MAX_NUMNODES
(or nr_node_ids).
Michael
>
>  		put_pcichild(hv_dev);
>  	}
> --
> 2.25.1
> Subject: RE: [Patch v2] PCI: hv: Fix NUMA node assignment when kernel boots
> with custom NUMA topology
>
> From: [email protected] <[email protected]> Sent: Tuesday,
> January 18, 2022 6:45 PM
> >
> > When the kernel boots with a NUMA topology in which some NUMA nodes are
> > offline, the PCI driver should only set an online NUMA node on the
> > device. This can happen during KDUMP, where some NUMA nodes are not made
> > online by the KDUMP kernel.
> >
> > This patch also fixes the case where the kernel is booted with "numa=off".
> >
> > Signed-off-by: Long Li <[email protected]>
>
> It seems like adding a "Fixes:" tag would be appropriate.
Okay, sending v3 with "Fixes:".
>
> >
> > Change from v1:
> > Use numa_map_to_online_node() to assign a node to device (suggested by
> > Michael Kelly <[email protected]>)
> >
> > ---
> > drivers/pci/controller/pci-hyperv.c | 10 +++++++++-
> > 1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> > index 6c9efeefae1b..c7519add6f13 100644
> > --- a/drivers/pci/controller/pci-hyperv.c
> > +++ b/drivers/pci/controller/pci-hyperv.c
> > @@ -2130,7 +2130,15 @@ static void hv_pci_assign_numa_node(struct hv_pcibus_device *hbus)
> >  			continue;
> >
> >  		if (hv_dev->desc.flags & HV_PCI_DEVICE_FLAG_NUMA_AFFINITY)
> > -			set_dev_node(&dev->dev, hv_dev->desc.virtual_numa_node);
> > +			/*
> > +			 * The kernel may boot with some NUMA nodes offline
> > +			 * (e.g. in a KDUMP kernel) or with NUMA disabled via
> > +			 * "numa=off". In those cases, adjust the host provided
> > +			 * NUMA node to a valid NUMA node used by the kernel.
> > +			 */
> > +			set_dev_node(&dev->dev,
> > +				numa_map_to_online_node(
> > +					hv_dev->desc.virtual_numa_node));
>
> Double-check me, but I think this approach has a flaw in that
> numa_map_to_online_node() doesn't check the input node for being
> out-of-range. The call to node_online() uses the input node to index into
> the node_states bitmap (of type nodemask_t), and could go off the end of
> the bitmap if the input node isn't validated to be less than MAX_NUMNODES
> (or nr_node_ids).
Indeed, your concern is correct when "numa=off" is set.
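
To make the out-of-range concern concrete, here is a minimal userspace
sketch of the idea, not the driver change itself. Every sketch_* name and
constant is hypothetical and only stands in for the kernel's
MAX_NUMNODES, NUMA_NO_NODE, node_online() and numa_map_to_online_node().
It assumes a 64-node limit and simplifies the mapping to "lowest-numbered
online node", whereas the real helper picks the nearest online node by
distance. The point is only that the host-provided value is range-checked
before it is used to index the online bitmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_NUMNODES 64	/* hypothetical stand-in for MAX_NUMNODES */
#define SKETCH_NO_NODE (-1)	/* hypothetical stand-in for NUMA_NO_NODE */

/* One bit per node; only node 0 is online, as it would be with "numa=off". */
static uint64_t sketch_online_mask = 0x1;

/*
 * Like node_online(), this trusts the caller: the node number indexes the
 * bitmap directly, so an unchecked value would read past its end.
 */
static bool sketch_node_online(int node)
{
	assert(node >= 0 && node < SKETCH_MAX_NUMNODES);
	return (sketch_online_mask >> node) & 1u;
}

/*
 * Simplified mapping (assumption): return the node itself if it is online,
 * otherwise the lowest-numbered online node.
 */
static int sketch_map_to_online_node(int node)
{
	if (sketch_node_online(node))
		return node;

	for (int n = 0; n < SKETCH_MAX_NUMNODES; n++)
		if (sketch_node_online(n))
			return n;

	return SKETCH_NO_NODE;
}

/* Validate the host-provided node before it ever reaches the bitmap. */
static int sketch_pick_dev_node(uint32_t host_node)
{
	if (host_node >= SKETCH_MAX_NUMNODES)
		return SKETCH_NO_NODE;

	return sketch_map_to_online_node((int)host_node);
}
```

With only node 0 online, sketch_pick_dev_node(3) falls back to node 0,
while a value at or above the limit is rejected without touching the
bitmap.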
>
> Michael
>
> >
> >  		put_pcichild(hv_dev);
> >  	}
> > --
> > 2.25.1