2020-06-23 19:28:25

by Nitesh Narayan Lal

Subject: [PATCH v3 0/3] Preventing job distribution to isolated CPUs

This patch set originated from one of the patches that were posted
earlier as part of the "Task_isolation" mode [1] patch series
by Alex Belits <[email protected]>. I am proposing only a couple of
changes in this patch set compared to what Alex posted earlier.


Context
=======
At a high level, all three patches in this patch set make the
affected driver/library code respect isolated CPUs by not pinning
any job on them. Pinning jobs on isolated CPUs could hurt the
latency values in RT use-cases.


Patches
=======
* Patch1:
The first patch makes cpumask_local_spread() aware of
isolated CPUs. It ensures that the CPUs returned by this
API include only housekeeping CPUs.

* Patch2:
This patch ensures that a probe function that is called
using work_on_cpu() doesn't run any task on an isolated CPU.

* Patch3:
This patch makes store_rps_map() aware of isolated CPUs
so that RPS doesn't queue any jobs on an isolated CPU.
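
As a rough illustration of the selection order described for Patch1,
here is a minimal userspace sketch. It is not the kernel code: the
type cpu_mask_t and the helpers mask_weight() and local_spread() are
hypothetical names, and a 64-bit integer stands in for a cpumask
(bit n represents CPU n). It assumes the i-th CPU is picked from the
housekeeping set, node-local CPUs first, with wrap-around.

```c
#include <stdint.h>

/* Hypothetical model: bit n of the mask represents CPU n. */
typedef uint64_t cpu_mask_t;

/* Count the CPUs set in a mask. */
static unsigned int mask_weight(cpu_mask_t m)
{
	unsigned int n = 0;

	for (; m; m &= m - 1)
		n++;
	return n;
}

/*
 * Return the i-th housekeeping CPU, wrapping i around the number of
 * housekeeping CPUs. CPUs that are both node-local and housekeeping
 * come first, then the remaining housekeeping CPUs. Returns -1 only
 * if the housekeeping mask is empty.
 */
static int local_spread(unsigned int i, cpu_mask_t node_mask,
			cpu_mask_t hk_mask)
{
	unsigned int w = mask_weight(hk_mask);
	int cpu;

	if (w == 0)
		return -1;
	i %= w;

	/* Node-local housekeeping CPUs first. */
	for (cpu = 0; cpu < 64; cpu++) {
		if (((hk_mask & node_mask) >> cpu) & 1) {
			if (i-- == 0)
				return cpu;
		}
	}

	/* Then housekeeping CPUs on other nodes. */
	for (cpu = 0; cpu < 64; cpu++) {
		if (((hk_mask & ~node_mask) >> cpu) & 1) {
			if (i-- == 0)
				return cpu;
		}
	}
	return -1;
}
```

With housekeeping CPUs 0-3 (0x0F) and a node that owns CPUs 2, 3, 6
and 7 (0xCC), local_spread() returns 2, 3, 0, 1 for i = 0..3 and
never returns the isolated CPUs 6 or 7.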


Proposed Changes
================
To fix the above-mentioned issues, Alex used housekeeping_cpumask().
The only changes that I am proposing here are:
- Removing the dependency on CONFIG_TASK_ISOLATION that was proposed by
Alex. It should be safe to rely on housekeeping_cpumask() even when
we don't have any isolated CPUs, since in that case we want to fall
back to using all available CPUs in each of the above scenarios.
- Using both HK_FLAG_DOMAIN and HK_FLAG_WQ in all three patches. This is
because we want the above fixes not only when we have isolcpus but
also with something like systemd's CPU affinity.
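
The fallback behaviour relied on above can be sketched in userspace as
follows. This is only a model under stated assumptions, not the kernel
implementation: struct hk_state, hk_cpumask() and the MODEL_HK_FLAG_*
values are hypothetical, and the sketch assumes that a lookup returns
the housekeeping mask when any of the requested flags was configured
and the full set of possible CPUs otherwise.

```c
#include <stdint.h>

/* Hypothetical model: bit n of the mask represents CPU n. */
typedef uint64_t cpu_mask_t;

/* Illustrative flag values; they only mirror the kernel flag names. */
enum model_hk_flags {
	MODEL_HK_FLAG_DOMAIN = 1u << 0,
	MODEL_HK_FLAG_WQ     = 1u << 1,
};

struct hk_state {
	unsigned int configured_flags; /* 0 when no isolation is set up */
	cpu_mask_t housekeeping;       /* CPUs left for housekeeping work */
	cpu_mask_t possible;           /* all CPUs in the system */
};

/*
 * Return the housekeeping mask if any requested flag was configured,
 * otherwise fall back to every possible CPU.
 */
static cpu_mask_t hk_cpumask(const struct hk_state *s, unsigned int flags)
{
	if (s->configured_flags & flags)
		return s->housekeeping;
	return s->possible;
}
```

In this model, querying with MODEL_HK_FLAG_DOMAIN | MODEL_HK_FLAG_WQ
matches if either kind of isolation was configured, and a system with
no isolation at all simply gets every possible CPU back.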


Testing
=======
* Patch 1:
The fix for cpumask_local_spread() was tested by creating VFs, loading
the iavf module, and adding a tracepoint to confirm that only housekeeping
CPUs are picked when an appropriate profile is set up, and that the
remaining CPUs are picked as well when no CPU isolation is configured.

* Patch 2:
To test the PCI fix, I hotplugged a virtio-net-pci device from the qemu
console and forced its addition to a specific node to trigger the code
path that includes the proposed fix, and verified via a tracepoint that
only housekeeping CPUs are included.

* Patch 3:
To test the fix in store_rps_map(), I tried configuring an isolated
CPU by writing to /sys/class/net/en*/queues/rx*/rps_cpus, which
resulted in a 'write error: Invalid argument' error. When a
non-isolated CPU was written to rps_cpus instead, the operation
succeeded without any error.
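
The behaviour observed in the Patch 3 test can be modelled in
userspace as below. This is a sketch only, not the code in
net/core/net-sysfs.c: cpu_mask_t and rps_filter_mask() are
hypothetical names, and passing an empty request through unchanged is
an assumption of this model. The requested mask is intersected with
the housekeeping mask, and the write is rejected with -EINVAL if
nothing is left.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical model: bit n of the mask represents CPU n. */
typedef uint64_t cpu_mask_t;

/*
 * Restrict a requested RPS CPU mask to housekeeping CPUs.
 * Returns 0 on success (with *requested narrowed in place) or
 * -EINVAL if every requested CPU is isolated.
 */
static int rps_filter_mask(cpu_mask_t *requested, cpu_mask_t hk_mask)
{
	/* Assumption: an empty request is passed through unchanged. */
	if (*requested == 0)
		return 0;

	*requested &= hk_mask;
	if (*requested == 0)
		return -EINVAL;

	return 0;
}
```

With housekeeping CPUs 0-3 (0x0F), a request for the isolated CPU 6
(0x40) fails with -EINVAL, a request for CPU 1 (0x02) succeeds, and a
mixed request for CPUs 1 and 6 (0x42) succeeds but is narrowed to
CPU 1.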


Changes from v2[2]:
==================
- Patch1: Removed the extra while loop from cpumask_local_spread() and
fixed the code styling issues.
- Patch3: Changed to use cpumask_empty() for verifying that the requested
CPUs are available in the housekeeping CPUs.

Changes from v1[3]:
==================
- Included the suggestions made by Bjorn Helgaas in the commit message.
- Included the 'Reviewed-by' and 'Acked-by' received for Patch-2.


[1] https://patchwork.ozlabs.org/project/netdev/patch/[email protected]/
[2] https://patchwork.ozlabs.org/project/linux-pci/cover/[email protected]/
[3] https://patchwork.ozlabs.org/project/linux-pci/cover/[email protected]/


Alex Belits (3):
lib: Restrict cpumask_local_spread to houskeeping CPUs
PCI: Restrict probe functions to housekeeping CPUs
net: Restrict receive packets queuing to housekeeping CPUs

drivers/pci/pci-driver.c | 5 ++++-
lib/cpumask.c | 16 +++++++++++-----
net/core/net-sysfs.c | 10 +++++++++-
3 files changed, 24 insertions(+), 7 deletions(-)

--


2020-06-24 10:11:17

by Peter Zijlstra

Subject: Re: [PATCH v3 0/3] Preventing job distribution to isolated CPUs

On Tue, Jun 23, 2020 at 03:23:28PM -0400, Nitesh Narayan Lal wrote:
> This patch-set is originated from one of the patches that have been
> posted earlier as a part of "Task_isolation" mode [1] patch series
> by Alex Belits <[email protected]>. There are only a couple of
> changes that I am proposing in this patch-set compared to what Alex
> has posted earlier.

>
> Alex Belits (3):
> lib: Restrict cpumask_local_spread to houskeeping CPUs
> PCI: Restrict probe functions to housekeeping CPUs
> net: Restrict receive packets queuing to housekeeping CPUs
>
> drivers/pci/pci-driver.c | 5 ++++-
> lib/cpumask.c | 16 +++++++++++-----
> net/core/net-sysfs.c | 10 +++++++++-
> 3 files changed, 24 insertions(+), 7 deletions(-)

This looks reasonable to me; who is expected to merge this? Should I
take it through the scheduler tree like most of the nohz_full, or what
do we do?