Testing
=======
* Patch 1:
The fix for cpumask_local_spread() was tested by creating VFs, loading
the iavf module, and adding a tracepoint to confirm that only
housekeeping CPUs are picked when an appropriate profile is set up, and
that all available CPUs are picked when no CPU isolation is configured.
* Patch 2:
To test the PCI fix, I hotplugged a virtio-net-pci device from the QEMU
console and forced its addition to a specific node to trigger the code
path that includes the proposed fix. I then verified via a tracepoint
that only housekeeping CPUs are included.
* Patch 3:
To test the fix in store_rps_map(), I tried configuring an isolated
CPU by writing it to /sys/class/net/en*/queues/rx*/rps_cpus, which
failed with 'write error: Invalid argument'. Writing a non-isolated
CPU to rps_cpus succeeded without any error.
Changes from v1:
===============
- Included the suggestions made by Bjorn Helgaas in the commit messages.
- Included the 'Reviewed-by' and 'Acked-by' received for Patch-2.
[1] https://patchwork.ozlabs.org/project/netdev/patch/[email protected]/
Alex Belits (3):
lib: Restrict cpumask_local_spread to housekeeping CPUs
PCI: Restrict probe functions to housekeeping CPUs
net: Restrict receive packets queuing to housekeeping CPUs
drivers/pci/pci-driver.c | 5 ++++-
lib/cpumask.c | 43 +++++++++++++++++++++++-----------------
net/core/net-sysfs.c | 10 +++++++++-
3 files changed, 38 insertions(+), 20 deletions(-)
--
On 6/22/20 7:45 PM, Nitesh Narayan Lal wrote:
> [...]
Hi,
It seems that the cover email got messed up while I was sending the patches.
I am putting my intended cover email below for now. I can send a v3 with a
proper cover email if needed. I am not sending it right away because, if I
get comments on the patches, I would prefer to address them in the v3
posting as well.
"
This patch set originates from one of the patches that were posted
earlier as part of the "Task_isolation" mode [1] patch series
by Alex Belits <[email protected]>. I am proposing only a couple of
changes in this patch set compared to what Alex posted earlier.
Context
=======
At a broad level, all three patches in this patch set are meant to
make the driver/library code respect isolated CPUs by not pinning
any job on them. Pinning jobs on isolated CPUs could hurt latency
in RT use cases.
Patches
=======
* Patch1:
The first patch makes cpumask_local_spread() aware of
isolated CPUs. It ensures that the CPUs returned by this
API include only housekeeping CPUs.
* Patch2:
This patch ensures that a probe function that is called
using work_on_cpu() doesn't run any task on an isolated CPU.
* Patch3:
This patch makes store_rps_map() aware of isolated
CPUs so that RPS doesn't queue any jobs on an isolated CPU.
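As a rough illustration of the idea behind Patch1 (a userspace sketch,
not the kernel code), the following models CPU masks as 64-bit integers
and picks the i-th housekeeping CPU, preferring CPUs on the local node.
The names model_weight(), model_nth_cpu() and model_local_spread() are
hypothetical, and the local-first ordering is an assumption made for
this sketch.

```c
#include <stdint.h>

#define MODEL_NR_CPUS 64

/* Count the CPUs set in a mask. */
static unsigned int model_weight(uint64_t mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/* Return the i-th CPU set in mask (0-based), or -1 if there is none. */
static int model_nth_cpu(uint64_t mask, unsigned int i)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (!(mask & (UINT64_C(1) << cpu)))
			continue;
		if (i-- == 0)
			return cpu;
	}
	return -1;
}

/*
 * Pick the i-th housekeeping CPU, node-local CPUs first.
 * Isolated CPUs (bits clear in hk_mask) are never returned.
 * Returns -1 only if hk_mask is empty.
 */
static int model_local_spread(unsigned int i, uint64_t node_mask,
			      uint64_t hk_mask)
{
	uint64_t local = node_mask & hk_mask;
	uint64_t remote = hk_mask & ~node_mask;
	unsigned int total = model_weight(hk_mask);

	if (total == 0)
		return -1;

	i %= total;	/* wrap so that a CPU is always returned */
	if (i < model_weight(local))
		return model_nth_cpu(local, i);
	return model_nth_cpu(remote, i - model_weight(local));
}
```

With a node covering CPUs 0-3 (0x0F) and housekeeping CPUs 0, 1, 4, 5
(0x33), successive indices yield CPUs 0, 1, 4, 5 and then wrap; the
isolated CPUs 2 and 3 are skipped.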
Proposed Changes
================
To fix the above-mentioned issues, Alex used housekeeping_cpumask().
The only changes that I am proposing here are:
- Removing the dependency on CONFIG_TASK_ISOLATION that was proposed by
Alex. It should be safe to rely on housekeeping_cpumask() even when
there are no isolated CPUs, because in that case we want to fall back
to using all available CPUs in each of the above scenarios.
- Using both HK_FLAG_DOMAIN and HK_FLAG_WQ in all three patches. We
want the above fixes to apply not only when isolcpus is used but
also with something like systemd's CPU affinity.
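The fallback described above can be pictured with a small userspace
model. This is only a sketch under stated assumptions: the struct, the
flag values and model_housekeeping_mask() are hypothetical stand-ins,
and the model assumes the lookup returns the restricted mask only when
at least one of the requested flags was configured, and every CPU
otherwise.

```c
#include <stdint.h>

/* Illustrative flag bits; not the kernel's HK_FLAG_* values. */
enum model_hk_flags {
	MODEL_HK_DOMAIN = 1u << 0,
	MODEL_HK_WQ     = 1u << 1,
};

struct model_isolation {
	unsigned int flags;	/* configured classes; 0 means no isolation */
	uint64_t hk_mask;	/* CPUs left for housekeeping work */
	uint64_t all_cpus;	/* every CPU in the system */
};

/*
 * Return the housekeeping CPUs for the requested classes, or all CPUs
 * when none of those classes has been configured.
 */
static uint64_t model_housekeeping_mask(const struct model_isolation *iso,
					unsigned int flags)
{
	if (iso->flags & flags)
		return iso->hk_mask;
	return iso->all_cpus;
}
```

In this model, a caller that asks for both flags gets the restricted
mask if either class was configured, which is the motivation for
passing both.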
Testing
=======
* Patch 1:
The fix for cpumask_local_spread() was tested by creating VFs, loading
the iavf module, and adding a tracepoint to confirm that only
housekeeping CPUs are picked when an appropriate profile is set up, and
that all available CPUs are picked when no CPU isolation is configured.
* Patch 2:
To test the PCI fix, I hotplugged a virtio-net-pci device from the QEMU
console and forced its addition to a specific node to trigger the code
path that includes the proposed fix. I then verified via a tracepoint
that only housekeeping CPUs are included.
* Patch 3:
To test the fix in store_rps_map(), I tried configuring an isolated
CPU by writing it to /sys/class/net/en*/queues/rx*/rps_cpus, which
failed with 'write error: Invalid argument'. Writing a non-isolated
CPU to rps_cpus succeeded without any error.
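The behaviour observed in the Patch 3 test can be modelled in userspace
as follows. This is a sketch, not the store_rps_map() code:
model_store_rps_mask() is a hypothetical name, masks are plain 64-bit
integers, and the handling of a mask that mixes isolated and
housekeeping CPUs (trimming it to the housekeeping CPUs) is an
assumption of the model.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Validate a requested RPS CPU mask against the housekeeping mask.
 * An empty request is accepted as-is. A request that contains only
 * isolated CPUs is rejected with -EINVAL. Otherwise the request is
 * trimmed to its housekeeping CPUs (assumption of this model).
 */
static int model_store_rps_mask(uint64_t *mask, uint64_t hk_mask)
{
	if (*mask == 0)
		return 0;

	*mask &= hk_mask;
	if (*mask == 0)
		return -EINVAL;

	return 0;
}
```

With housekeeping CPUs 0-3 (0x0F), requesting only CPU 4 (0x10) returns
-EINVAL, matching the 'Invalid argument' error seen in the test, while
requesting CPU 0 (0x01) succeeds.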
Changes from v1: [2]
===============
- Included the suggestions made by Bjorn Helgaas in the commit message.
- Included the 'Reviewed-by' and 'Acked-by' received for Patch-2.
[1]
https://patchwork.ozlabs.org/project/netdev/patch/[email protected]/
[2]
https://patchwork.ozlabs.org/project/linux-pci/cover/[email protected]/
Alex Belits (3):
lib: Restrict cpumask_local_spread to housekeeping CPUs
PCI: Restrict probe functions to housekeeping CPUs
net: Restrict receive packets queuing to housekeeping CPUs
drivers/pci/pci-driver.c | 5 ++++-
lib/cpumask.c | 43 +++++++++++++++++++++++-----------------
net/core/net-sysfs.c | 10 +++++++++-
3 files changed, 38 insertions(+), 20 deletions(-)
--
"
--
Thanks
Nitesh
From: Alex Belits <[email protected]>
pci_call_probe() prevents the nesting of work_on_cpu() for a scenario
where a VF device is probed from work_on_cpu() of the PF.
Change the cpumask used in pci_call_probe() from all online CPUs to only
housekeeping CPUs. This ensures that no additional latency overhead is
introduced by pinning jobs on isolated CPUs.
Signed-off-by: Alex Belits <[email protected]>
Signed-off-by: Nitesh Narayan Lal <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Frederic Weisbecker <[email protected]>
---
drivers/pci/pci-driver.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index da6510af1221..449466f71040 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -12,6 +12,7 @@
 #include <linux/string.h>
 #include <linux/slab.h>
 #include <linux/sched.h>
+#include <linux/sched/isolation.h>
 #include <linux/cpu.h>
 #include <linux/pm_runtime.h>
 #include <linux/suspend.h>
@@ -333,6 +334,7 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
 			  const struct pci_device_id *id)
 {
 	int error, node, cpu;
+	int hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ;
 	struct drv_dev_and_id ddi = { drv, dev, id };
 
 	/*
@@ -353,7 +355,8 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
 	    pci_physfn_is_probed(dev))
 		cpu = nr_cpu_ids;
 	else
-		cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask);
+		cpu = cpumask_any_and(cpumask_of_node(node),
+				      housekeeping_cpumask(hk_flags));
 
 	if (cpu < nr_cpu_ids)
 		error = work_on_cpu(cpu, local_pci_probe, &ddi);
--
2.18.4
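
For readers who want to reason about the CPU selection in the hunk
above, here is a minimal userspace sketch of it. It is not the kernel
code: masks are 64-bit integers, MODEL_NR_CPUS stands in for
nr_cpu_ids, and model_any_and() and model_pick_probe_cpu() are
hypothetical names. The model picks the lowest common CPU, and it
assumes that a result of MODEL_NR_CPUS means the probe runs on the
current CPU instead of going through work_on_cpu(); the diff does not
show that branch.

```c
#include <stdint.h>

#define MODEL_NR_CPUS 64

/* Lowest CPU set in both masks, or MODEL_NR_CPUS if they are disjoint. */
static int model_any_and(uint64_t a, uint64_t b)
{
	uint64_t both = a & b;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (both & (UINT64_C(1) << cpu))
			return cpu;
	return MODEL_NR_CPUS;
}

/*
 * Choose the CPU a probe would be queued on.
 * has_node: the device has a usable NUMA node.
 * nested:   the physical function is already being probed.
 * Returns MODEL_NR_CPUS when no remote CPU should be used.
 */
static int model_pick_probe_cpu(int has_node, int nested,
				uint64_t node_mask, uint64_t hk_mask)
{
	if (!has_node || nested)
		return MODEL_NR_CPUS;
	return model_any_and(node_mask, hk_mask);
}
```

With a node covering CPUs 4-7 (0xF0) and housekeeping CPUs 0, 1, 4, 5
(0x33), the model selects CPU 4. If every CPU on the node is isolated,
for example a node of CPUs 6-7 (0xC0), it returns MODEL_NR_CPUS.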