2020-06-22 23:51:52

by Nitesh Narayan Lal

Subject: [Patch v2 1/3] lib: Restrict cpumask_local_spread to housekeeping CPUs

From: Alex Belits <[email protected]>

The current implementation of cpumask_local_spread() does not respect
isolated CPUs, i.e., even if a CPU has been isolated for a real-time task,
it can still be returned to the caller for pinning its IRQ threads. Having
these unwanted IRQ threads on an isolated CPU adds latency overhead.

Restrict the CPUs that are returned for spreading IRQs only to the
available housekeeping CPUs.

Signed-off-by: Alex Belits <[email protected]>
Signed-off-by: Nitesh Narayan Lal <[email protected]>
---
lib/cpumask.c | 43 +++++++++++++++++++++++++------------------
1 file changed, 25 insertions(+), 18 deletions(-)

diff --git a/lib/cpumask.c b/lib/cpumask.c
index fb22fb266f93..cc4311a8c079 100644
--- a/lib/cpumask.c
+++ b/lib/cpumask.c
@@ -6,6 +6,7 @@
 #include <linux/export.h>
 #include <linux/memblock.h>
 #include <linux/numa.h>
+#include <linux/sched/isolation.h>
 
 /**
  * cpumask_next - get the next cpu in a cpumask
@@ -205,28 +206,34 @@ void __init free_bootmem_cpumask_var(cpumask_var_t mask)
  */
 unsigned int cpumask_local_spread(unsigned int i, int node)
 {
-	int cpu;
+	int cpu, m, n, hk_flags;
+	const struct cpumask *mask;
 
+	hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ;
+	mask = housekeeping_cpumask(hk_flags);
+	m = cpumask_weight(mask);
 	/* Wrap: we always want a cpu. */
-	i %= num_online_cpus();
+	n = i % m;
+	while (m-- > 0) {
+		if (node == NUMA_NO_NODE) {
+			for_each_cpu(cpu, mask)
+				if (n-- == 0)
+					return cpu;
+		} else {
+			/* NUMA first. */
+			for_each_cpu_and(cpu, cpumask_of_node(node), mask)
+				if (n-- == 0)
+					return cpu;
 
-	if (node == NUMA_NO_NODE) {
-		for_each_cpu(cpu, cpu_online_mask)
-			if (i-- == 0)
-				return cpu;
-	} else {
-		/* NUMA first. */
-		for_each_cpu_and(cpu, cpumask_of_node(node), cpu_online_mask)
-			if (i-- == 0)
-				return cpu;
+			for_each_cpu(cpu, mask) {
+				/* Skip NUMA nodes, done above. */
+				if (cpumask_test_cpu(cpu,
+				    cpumask_of_node(node)))
+					continue;
 
-		for_each_cpu(cpu, cpu_online_mask) {
-			/* Skip NUMA nodes, done above. */
-			if (cpumask_test_cpu(cpu, cpumask_of_node(node)))
-				continue;
-
-			if (i-- == 0)
-				return cpu;
+				if (n-- == 0)
+					return cpu;
+			}
 		}
 	}
 	BUG();
--
2.18.4


2020-06-23 09:25:11

by Peter Zijlstra

Subject: Re: [Patch v2 1/3] lib: Restrict cpumask_local_spread to housekeeping CPUs

On Mon, Jun 22, 2020 at 07:45:08PM -0400, Nitesh Narayan Lal wrote:
> From: Alex Belits <[email protected]>
>
> The current implementation of cpumask_local_spread() does not respect
> isolated CPUs, i.e., even if a CPU has been isolated for a real-time task,
> it can still be returned to the caller for pinning its IRQ threads. Having
> these unwanted IRQ threads on an isolated CPU adds latency overhead.
>
> Restrict the CPUs that are returned for spreading IRQs only to the
> available housekeeping CPUs.
>
> Signed-off-by: Alex Belits <[email protected]>
> Signed-off-by: Nitesh Narayan Lal <[email protected]>
> ---
> lib/cpumask.c | 43 +++++++++++++++++++++++++------------------
> 1 file changed, 25 insertions(+), 18 deletions(-)
>
> diff --git a/lib/cpumask.c b/lib/cpumask.c
> index fb22fb266f93..cc4311a8c079 100644
> --- a/lib/cpumask.c
> +++ b/lib/cpumask.c
> @@ -6,6 +6,7 @@
> #include <linux/export.h>
> #include <linux/memblock.h>
> #include <linux/numa.h>
> +#include <linux/sched/isolation.h>
>
> /**
> * cpumask_next - get the next cpu in a cpumask
> @@ -205,28 +206,34 @@ void __init free_bootmem_cpumask_var(cpumask_var_t mask)
> */
> unsigned int cpumask_local_spread(unsigned int i, int node)
> {
> - int cpu;
> + int cpu, m, n, hk_flags;
> + const struct cpumask *mask;
>
> + hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ;
> + mask = housekeeping_cpumask(hk_flags);
> + m = cpumask_weight(mask);
> /* Wrap: we always want a cpu. */
> - i %= num_online_cpus();
> + n = i % m;
> + while (m-- > 0) {

I are confuzled. What do we need this outer loop for?

Why isn't something like:

i %= cpumask_weight(mask);

good enough? That avoids having to touch the test.
Still when you're there, at the very least you can fix the horrible
style:


> + if (node == NUMA_NO_NODE) {
> + for_each_cpu(cpu, mask)
> + if (n-- == 0)
> + return cpu;

{ }

> + } else {
> + /* NUMA first. */
> + for_each_cpu_and(cpu, cpumask_of_node(node), mask)
> + if (n-- == 0)
> + return cpu;

{ }

>
> + for_each_cpu(cpu, mask) {
> + /* Skip NUMA nodes, done above. */
> + if (cpumask_test_cpu(cpu,
> + cpumask_of_node(node)))
> + continue;

No linebreak please.

>
> + if (n-- == 0)
> + return cpu;
> + }
> }
> }
> BUG();
> --
> 2.18.4
>
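
A minimal user-space sketch of the simplification suggested above: wrap the
index by the weight of the housekeeping mask once, then keep the original
two-pass walk (node-local CPUs first, then the rest) with no outer loop.
Everything here is a hypothetical stand-in, not kernel code: mask_t is a
plain 64-bit bitmap instead of struct cpumask, NO_NODE replaces
NUMA_NO_NODE, and the housekeeping mask and per-node masks are passed in
as arguments rather than looked up with housekeeping_cpumask() and
cpumask_of_node().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit n set means CPU n is present. */
typedef uint64_t mask_t;

#define NO_NODE (-1)	/* stand-in for NUMA_NO_NODE */

/* Count the set bits in m (stand-in for cpumask_weight()). */
static unsigned int mask_weight(mask_t m)
{
	unsigned int w = 0;

	for (; m; m &= m - 1)
		w++;
	return w;
}

/*
 * Walk the set bits of m in ascending order, consuming one unit of *i per
 * CPU. Returns the CPU at which *i reaches zero, or -1 if m is exhausted
 * first (in which case *i has been reduced by the weight of m).
 */
static int pick(mask_t m, unsigned int *i)
{
	for (int cpu = 0; cpu < 64; cpu++) {
		if (!((m >> cpu) & 1))
			continue;
		if ((*i)-- == 0)
			return cpu;
	}
	return -1;
}

/*
 * Sketch of the simplified selection: a single modulo by the weight of the
 * housekeeping mask guarantees that the two passes below find a CPU, so no
 * outer retry loop is needed.
 */
static int local_spread(unsigned int i, int node, mask_t hk,
			const mask_t *node_masks)
{
	int cpu;

	assert(hk != 0);	/* an empty mask would divide by zero */

	/* Wrap: we always want a cpu. */
	i %= mask_weight(hk);

	if (node == NO_NODE)
		return pick(hk, &i);

	/* NUMA first. */
	cpu = pick(hk & node_masks[node], &i);
	if (cpu >= 0)
		return cpu;

	/* Then the housekeeping CPUs outside the node. */
	return pick(hk & ~node_masks[node], &i);
}
```

With two nodes of four CPUs each and CPUs 2, 3, 6 and 7 isolated, the
housekeeping mask is 0x33. Asking for index 2 on node 1 walks CPUs 4 and 5
first and then falls through to CPU 0, and an index larger than the mask
weight simply wraps around.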

2020-06-23 13:21:47

by Nitesh Narayan Lal

Subject: Re: [Patch v2 1/3] lib: Restrict cpumask_local_spread to housekeeping CPUs


On 6/23/20 5:21 AM, Peter Zijlstra wrote:
> On Mon, Jun 22, 2020 at 07:45:08PM -0400, Nitesh Narayan Lal wrote:
>> From: Alex Belits <[email protected]>
>>
>> The current implementation of cpumask_local_spread() does not respect
>> isolated CPUs, i.e., even if a CPU has been isolated for a real-time task,
>> it can still be returned to the caller for pinning its IRQ threads. Having
>> these unwanted IRQ threads on an isolated CPU adds latency overhead.
>>
>> Restrict the CPUs that are returned for spreading IRQs only to the
>> available housekeeping CPUs.
>>
>> Signed-off-by: Alex Belits <[email protected]>
>> Signed-off-by: Nitesh Narayan Lal <[email protected]>
>> ---
>> lib/cpumask.c | 43 +++++++++++++++++++++++++------------------
>> 1 file changed, 25 insertions(+), 18 deletions(-)
>>
>> diff --git a/lib/cpumask.c b/lib/cpumask.c
>> index fb22fb266f93..cc4311a8c079 100644
>> --- a/lib/cpumask.c
>> +++ b/lib/cpumask.c
>> @@ -6,6 +6,7 @@
>> #include <linux/export.h>
>> #include <linux/memblock.h>
>> #include <linux/numa.h>
>> +#include <linux/sched/isolation.h>
>>
>> /**
>> * cpumask_next - get the next cpu in a cpumask
>> @@ -205,28 +206,34 @@ void __init free_bootmem_cpumask_var(cpumask_var_t mask)
>> */
>> unsigned int cpumask_local_spread(unsigned int i, int node)
>> {
>> - int cpu;
>> + int cpu, m, n, hk_flags;
>> + const struct cpumask *mask;
>>
>> + hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ;
>> + mask = housekeeping_cpumask(hk_flags);
>> + m = cpumask_weight(mask);
>> /* Wrap: we always want a cpu. */
>> - i %= num_online_cpus();
>> + n = i % m;
>> + while (m-- > 0) {
> I are confuzled. What do we need this outer loop for?
>
> Why isn't something like:
>
> i %= cpumask_weight(mask);
>
> good enough? That avoids having to touch the test.

Makes sense.
Thanks

> Still when you're there, at the very least you can fix the horrible
> style:

Sure.

>
>
>> + if (node == NUMA_NO_NODE) {
>> + for_each_cpu(cpu, mask)
>> + if (n-- == 0)
>> + return cpu;
> { }
>
>> + } else {
>> + /* NUMA first. */
>> + for_each_cpu_and(cpu, cpumask_of_node(node), mask)
>> + if (n-- == 0)
>> + return cpu;
> { }
>
>>
>> + for_each_cpu(cpu, mask) {
>> + /* Skip NUMA nodes, done above. */
>> + if (cpumask_test_cpu(cpu,
>> + cpumask_of_node(node)))
>> + continue;
> No linebreak please.
>
>>
>> + if (n-- == 0)
>> + return cpu;
>> + }
>> }
>> }
>> BUG();
>> --
>> 2.18.4
>>
--
Nitesh

