2015-02-25 16:38:21

by Rik van Riel

Subject: [PATCH -v2 0/2] cpusets,isolcpus: resolve conflict between cpusets and isolcpus

-v2 addresses the conflict David Rientjes spotted between my previous
patches and commit e8e6d97c9b ("cpuset: use %*pb[l] to print bitmaps
including cpumasks and nodemasks")

Ensure that cpus specified with the isolcpus= boot commandline
option stay outside of the load balancing in the kernel scheduler.

Operations like load balancing can introduce unwanted latencies,
which is exactly what the isolcpus= commandline is there to prevent.

Previously, simply creating a new cpuset, without even touching the
cpuset.cpus field inside the new cpuset, would undo the effects of
isolcpus=, by creating a scheduler domain spanning the whole system,
and setting up load balancing inside that domain. The cpuset root
cpuset.cpus file is read-only, so there was not even a way to undo
that effect.

This does not impact the majority of cpusets users, since isolcpus=
is a fairly specialized feature used for realtime purposes.
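
To illustrate what the patches compute, here is a minimal userspace
sketch, using a 64-bit word as a stand-in for struct cpumask (the
values and variable names are made up for the example):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	uint64_t cpu_possible = 0xff;	/* CPUs 0-7 present */
	uint64_t cpu_isolated = 0x0c;	/* booted with isolcpus=2,3 */
	uint64_t effective    = 0xff;	/* top cpuset's effective_cpus */

	/* patch 1 does this with cpumask_andnot() ... */
	uint64_t non_isolated = cpu_possible & ~cpu_isolated;

	/* ... and this with cpumask_and(); only these CPUs go into
	   the sched domain, so CPUs 2 and 3 are never load balanced */
	uint64_t dom = effective & non_isolated;

	printf("domain spans %#llx instead of %#llx\n",
	       (unsigned long long)dom, (unsigned long long)effective);
	return 0;
}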


2015-02-25 16:38:42

by Rik van Riel

Subject: [PATCH 1/2] cpusets,isolcpus: exclude isolcpus from load balancing in cpusets

From: Rik van Riel <[email protected]>

Ensure that cpus specified with the isolcpus= boot commandline
option stay outside of the load balancing in the kernel scheduler.

Operations like load balancing can introduce unwanted latencies,
which is exactly what the isolcpus= commandline is there to prevent.

Previously, simply creating a new cpuset, without even touching the
cpuset.cpus field inside the new cpuset, would undo the effects of
isolcpus=, by creating a scheduler domain spanning the whole system,
and setting up load balancing inside that domain. The cpuset root
cpuset.cpus file is read-only, so there was not even a way to undo
that effect.

This does not impact the majority of cpusets users, since isolcpus=
is a fairly specialized feature used for realtime purposes.

Cc: Peter Zijlstra <[email protected]>
Cc: Clark Williams <[email protected]>
Cc: Li Zefan <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Luiz Capitulino <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: [email protected]
Signed-off-by: Rik van Riel <[email protected]>
Tested-by: David Rientjes <[email protected]>
---
include/linux/sched.h | 2 ++
kernel/cpuset.c | 13 +++++++++++--
kernel/sched/core.c | 2 +-
3 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 6d77432e14ff..aeae02435717 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1038,6 +1038,8 @@ static inline struct cpumask *sched_domain_span(struct sched_domain *sd)
extern void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
struct sched_domain_attr *dattr_new);

+extern cpumask_var_t cpu_isolated_map;
+
/* Allocate an array of sched domains, for partition_sched_domains(). */
cpumask_var_t *alloc_sched_domains(unsigned int ndoms);
void free_sched_domains(cpumask_var_t doms[], unsigned int ndoms);
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 1d1fe9361d29..b544e5229d99 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -625,6 +625,7 @@ static int generate_sched_domains(cpumask_var_t **domains,
int csn; /* how many cpuset ptrs in csa so far */
int i, j, k; /* indices for partition finding loops */
cpumask_var_t *doms; /* resulting partition; i.e. sched domains */
+ cpumask_var_t non_isolated_cpus; /* load balanced CPUs */
struct sched_domain_attr *dattr; /* attributes for custom domains */
int ndoms = 0; /* number of sched domains in result */
int nslot; /* next empty doms[] struct cpumask slot */
@@ -634,6 +635,10 @@ static int generate_sched_domains(cpumask_var_t **domains,
dattr = NULL;
csa = NULL;

+ if (!alloc_cpumask_var(&non_isolated_cpus, GFP_KERNEL))
+ goto done;
+ cpumask_andnot(non_isolated_cpus, cpu_possible_mask, cpu_isolated_map);
+
/* Special case for the 99% of systems with one, full, sched domain */
if (is_sched_load_balance(&top_cpuset)) {
ndoms = 1;
@@ -646,7 +651,8 @@ static int generate_sched_domains(cpumask_var_t **domains,
*dattr = SD_ATTR_INIT;
update_domain_attr_tree(dattr, &top_cpuset);
}
- cpumask_copy(doms[0], top_cpuset.effective_cpus);
+ cpumask_and(doms[0], top_cpuset.effective_cpus,
+ non_isolated_cpus);

goto done;
}
@@ -669,7 +675,8 @@ static int generate_sched_domains(cpumask_var_t **domains,
* the corresponding sched domain.
*/
if (!cpumask_empty(cp->cpus_allowed) &&
- !is_sched_load_balance(cp))
+ !(is_sched_load_balance(cp) &&
+ cpumask_intersects(cp->cpus_allowed, non_isolated_cpus)))
continue;

if (is_sched_load_balance(cp))
@@ -751,6 +758,7 @@ static int generate_sched_domains(cpumask_var_t **domains,

if (apn == b->pn) {
cpumask_or(dp, dp, b->effective_cpus);
+ cpumask_and(dp, dp, non_isolated_cpus);
if (dattr)
update_domain_attr_tree(dattr + nslot, b);

@@ -763,6 +771,7 @@ static int generate_sched_domains(cpumask_var_t **domains,
BUG_ON(nslot != ndoms);

done:
+ free_cpumask_var(non_isolated_cpus);
kfree(csa);

/*
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f0f831e8a345..3db1beace19b 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5812,7 +5812,7 @@ cpu_attach_domain(struct sched_domain *sd, struct root_domain *rd, int cpu)
}

/* cpus with isolated domains */
-static cpumask_var_t cpu_isolated_map;
+cpumask_var_t cpu_isolated_map;

/* Setup the mask of cpus configured for isolated domains */
static int __init isolated_cpu_setup(char *str)
--
2.1.0

2015-02-25 16:38:29

by Rik van Riel

Subject: [PATCH 2/2] cpusets,isolcpus: add file to show isolated cpus in cpuset

From: Rik van Riel <[email protected]>

The previous patch makes the code skip over isolcpus when building
scheduler load balancing domains. This makes it hard for a user to
see which of the CPUs in a cpuset are participating in load
balancing, and which ones are isolated cpus.

Add a cpuset.isolcpus file with info on which cpus in a cpuset are
isolated CPUs.

This file is read-only for now. In the future we could extend things
so isolcpus can be changed at run time, for the root (system wide)
cpuset only.
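
For illustration only (not part of the patch), a userspace reader can
be as simple as the sketch below; the mount point is an assumption,
adjust it to wherever the cpuset controller is mounted:

#include <stdio.h>

int main(void)
{
	char buf[256];
	/* assumed mount point, commonly /sys/fs/cgroup/cpuset */
	FILE *f = fopen("/sys/fs/cgroup/cpuset/cpuset.isolcpus", "r");

	if (!f) {
		perror("fopen");
		return 1;
	}
	/* prints a cpulist such as "2,3", or an empty line */
	if (fgets(buf, sizeof(buf), f))
		fputs(buf, stdout);
	fclose(f);
	return 0;
}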

Cc: Peter Zijlstra <[email protected]>
Cc: Clark Williams <[email protected]>
Cc: Li Zefan <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Luiz Capitulino <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: [email protected]
Signed-off-by: Rik van Riel <[email protected]>
---
kernel/cpuset.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index b544e5229d99..94bf59588e23 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1563,6 +1563,7 @@ typedef enum {
FILE_MEMORY_PRESSURE,
FILE_SPREAD_PAGE,
FILE_SPREAD_SLAB,
+ FILE_ISOLCPUS,
} cpuset_filetype_t;

static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
@@ -1704,6 +1705,20 @@ static ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
return retval ?: nbytes;
}

+static void cpuset_seq_print_isolcpus(struct seq_file *sf, struct cpuset *cs)
+{
+ cpumask_var_t my_isolated_cpus;
+
+ if (!alloc_cpumask_var(&my_isolated_cpus, GFP_KERNEL))
+ return;
+
+ cpumask_and(my_isolated_cpus, cs->cpus_allowed, cpu_isolated_map);
+
+ seq_printf(sf, "%*pbl\n", nodemask_pr_args(my_isolated_cpus));
+
+ free_cpumask_var(my_isolated_cpus);
+}
+
/*
* These ascii lists should be read in a single call, by using a user
* buffer large enough to hold the entire map. If read in smaller
@@ -1733,6 +1748,9 @@ static int cpuset_common_seq_show(struct seq_file *sf, void *v)
case FILE_EFFECTIVE_MEMLIST:
seq_printf(sf, "%*pbl\n", nodemask_pr_args(&cs->effective_mems));
break;
+ case FILE_ISOLCPUS:
+ cpuset_seq_print_isolcpus(sf, cs);
+ break;
default:
ret = -EINVAL;
}
@@ -1893,6 +1911,12 @@ static struct cftype files[] = {
.private = FILE_MEMORY_PRESSURE_ENABLED,
},

+ {
+ .name = "isolcpus",
+ .seq_show = cpuset_common_seq_show,
+ .private = FILE_ISOLCPUS,
+ },
+
{ } /* terminate */
};

--
2.1.0

2015-02-25 21:09:29

by David Rientjes

Subject: Re: [PATCH 2/2] cpusets,isolcpus: add file to show isolated cpus in cpuset

On Wed, 25 Feb 2015, [email protected] wrote:

> diff --git a/kernel/cpuset.c b/kernel/cpuset.c
> index b544e5229d99..94bf59588e23 100644
> --- a/kernel/cpuset.c
> +++ b/kernel/cpuset.c
> @@ -1563,6 +1563,7 @@ typedef enum {
> FILE_MEMORY_PRESSURE,
> FILE_SPREAD_PAGE,
> FILE_SPREAD_SLAB,
> + FILE_ISOLCPUS,
> } cpuset_filetype_t;
>
> static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
> @@ -1704,6 +1705,20 @@ static ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
> return retval ?: nbytes;
> }
>
> +static void cpuset_seq_print_isolcpus(struct seq_file *sf, struct cpuset *cs)
> +{
> + cpumask_var_t my_isolated_cpus;
> +
> + if (!alloc_cpumask_var(&my_isolated_cpus, GFP_KERNEL))
> + return;
> +
> + cpumask_and(my_isolated_cpus, cs->cpus_allowed, cpu_isolated_map);
> +
> + seq_printf(sf, "%*pbl\n", nodemask_pr_args(my_isolated_cpus));

That unfortunately won't output anything; it needs to be
cpumask_pr_args(). After that's fixed, feel free to add my

Acked-by: David Rientjes <[email protected]>
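
For reference, the definitions involved (quoted from the headers after
commit e8e6d97c9b, double-check against your tree). The width argument
is what goes wrong here: MAX_NUMNODES can be much smaller than
nr_cpu_ids, so %*pbl truncates the bitmap:

/* include/linux/cpumask.h */
#define cpumask_pr_args(maskp)		nr_cpu_ids, cpumask_bits(maskp)

/* include/linux/nodemask.h */
#define nodemask_pr_args(maskp)		MAX_NUMNODES, (maskp)->bits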

> +
> + free_cpumask_var(my_isolated_cpus);
> +}
> +
> /*
> * These ascii lists should be read in a single call, by using a user
> * buffer large enough to hold the entire map. If read in smaller
> @@ -1733,6 +1748,9 @@ static int cpuset_common_seq_show(struct seq_file *sf, void *v)
> case FILE_EFFECTIVE_MEMLIST:
> seq_printf(sf, "%*pbl\n", nodemask_pr_args(&cs->effective_mems));
> break;
> + case FILE_ISOLCPUS:
> + cpuset_seq_print_isolcpus(sf, cs);
> + break;
> default:
> ret = -EINVAL;
> }
> @@ -1893,6 +1911,12 @@ static struct cftype files[] = {
> .private = FILE_MEMORY_PRESSURE_ENABLED,
> },
>
> + {
> + .name = "isolcpus",
> + .seq_show = cpuset_common_seq_show,
> + .private = FILE_ISOLCPUS,
> + },
> +
> { } /* terminate */
> };
>

2015-02-25 21:22:21

by Rik van Riel

Subject: Re: [PATCH 2/2] cpusets,isolcpus: add file to show isolated cpus in cpuset

On 02/25/2015 04:09 PM, David Rientjes wrote:
> On Wed, 25 Feb 2015, [email protected] wrote:
>
>> diff --git a/kernel/cpuset.c b/kernel/cpuset.c
>> index b544e5229d99..94bf59588e23 100644
>> --- a/kernel/cpuset.c
>> +++ b/kernel/cpuset.c
>> @@ -1563,6 +1563,7 @@ typedef enum {
>> FILE_MEMORY_PRESSURE,
>> FILE_SPREAD_PAGE,
>> FILE_SPREAD_SLAB,
>> + FILE_ISOLCPUS,
>> } cpuset_filetype_t;
>>
>> static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
>> @@ -1704,6 +1705,20 @@ static ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
>> return retval ?: nbytes;
>> }
>>
>> +static void cpuset_seq_print_isolcpus(struct seq_file *sf, struct cpuset *cs)
>> +{
>> + cpumask_var_t my_isolated_cpus;
>> +
>> + if (!alloc_cpumask_var(&my_isolated_cpus, GFP_KERNEL))
>> + return;
>> +
>> + cpumask_and(my_isolated_cpus, cs->cpus_allowed, cpu_isolated_map);
>> +
>> + seq_printf(sf, "%*pbl\n", nodemask_pr_args(my_isolated_cpus));
>
> That unfortunately won't output anything, it needs to be
> cpumask_pr_args(). After that's fixed, feel free to add my
>
> Acked-by: David Rientjes <[email protected]>

Gah. Too many things going on at once.

Let me resend a v3 of just patch 2/2 with your ack.

2015-02-25 21:33:05

by Rik van Riel

Subject: [PATCH v3 2/2] cpusets,isolcpus: add file to show isolated cpus in cpuset

Subject: cpusets,isolcpus: add file to show isolated cpus in cpuset

The previous patch makes the code skip over isolcpus when building
scheduler load balancing domains. This makes it hard for a user to
see which of the CPUs in a cpuset are participating in load
balancing, and which ones are isolated cpus.

Add a cpuset.isolcpus file with info on which cpus in a cpuset are
isolated CPUs.

This file is read-only for now. In the future we could extend things
so isolcpus can be changed at run time, for the root (system wide)
cpuset only.

Acked-by: David Rientjes <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Clark Williams <[email protected]>
Cc: Li Zefan <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Luiz Capitulino <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: [email protected]
Signed-off-by: Rik van Riel <[email protected]>
---
OK, I suck. Thanks to David Rientjes for spotting the silly mistake.

kernel/cpuset.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index b544e5229d99..455df101ceec 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1563,6 +1563,7 @@ typedef enum {
FILE_MEMORY_PRESSURE,
FILE_SPREAD_PAGE,
FILE_SPREAD_SLAB,
+ FILE_ISOLCPUS,
} cpuset_filetype_t;

static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
@@ -1704,6 +1705,20 @@ static ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
return retval ?: nbytes;
}

+static void cpuset_seq_print_isolcpus(struct seq_file *sf, struct cpuset *cs)
+{
+ cpumask_var_t my_isolated_cpus;
+
+ if (!alloc_cpumask_var(&my_isolated_cpus, GFP_KERNEL))
+ return;
+
+ cpumask_and(my_isolated_cpus, cs->cpus_allowed, cpu_isolated_map);
+
+ seq_printf(sf, "%*pbl\n", cpumask_pr_args(my_isolated_cpus));
+
+ free_cpumask_var(my_isolated_cpus);
+}
+
/*
* These ascii lists should be read in a single call, by using a user
* buffer large enough to hold the entire map. If read in smaller
@@ -1733,6 +1748,9 @@ static int cpuset_common_seq_show(struct seq_file *sf, void *v)
case FILE_EFFECTIVE_MEMLIST:
seq_printf(sf, "%*pbl\n", nodemask_pr_args(&cs->effective_mems));
break;
+ case FILE_ISOLCPUS:
+ cpuset_seq_print_isolcpus(sf, cs);
+ break;
default:
ret = -EINVAL;
}
@@ -1893,6 +1911,12 @@ static struct cftype files[] = {
.private = FILE_MEMORY_PRESSURE_ENABLED,
},

+ {
+ .name = "isolcpus",
+ .seq_show = cpuset_common_seq_show,
+ .private = FILE_ISOLCPUS,
+ },
+
{ } /* terminate */
};

2015-02-26 11:06:25

by Zefan Li

Subject: Re: [PATCH 2/2] cpusets,isolcpus: add file to show isolated cpus in cpuset

> +static void cpuset_seq_print_isolcpus(struct seq_file *sf, struct cpuset *cs)
> +{
> + cpumask_var_t my_isolated_cpus;
> +
> + if (!alloc_cpumask_var(&my_isolated_cpus, GFP_KERNEL))
> + return;
> +

Make it return -ENOMEM? Or make it a global variable and allocate memory for it
in cpuset_init().
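
For the first option, something like the sketch below (untested;
cpuset_common_seq_show() would also need to propagate the return
value instead of calling this as void):

static int cpuset_seq_print_isolcpus(struct seq_file *sf, struct cpuset *cs)
{
	cpumask_var_t my_isolated_cpus;

	if (!alloc_cpumask_var(&my_isolated_cpus, GFP_KERNEL))
		return -ENOMEM;

	cpumask_and(my_isolated_cpus, cs->cpus_allowed, cpu_isolated_map);
	seq_printf(sf, "%*pbl\n", cpumask_pr_args(my_isolated_cpus));
	free_cpumask_var(my_isolated_cpus);
	return 0;
}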

> + cpumask_and(my_isolated_cpus, cs->cpus_allowed, cpu_isolated_map);
> +
> + seq_printf(sf, "%*pbl\n", nodemask_pr_args(my_isolated_cpus));
> +
> + free_cpumask_var(my_isolated_cpus);
> +}
> +
> /*
> * These ascii lists should be read in a single call, by using a user
> * buffer large enough to hold the entire map. If read in smaller
> @@ -1733,6 +1748,9 @@ static int cpuset_common_seq_show(struct seq_file *sf, void *v)
> case FILE_EFFECTIVE_MEMLIST:
> seq_printf(sf, "%*pbl\n", nodemask_pr_args(&cs->effective_mems));
> break;
> + case FILE_ISOLCPUS:
> + cpuset_seq_print_isolcpus(sf, cs);
> + break;
> default:
> ret = -EINVAL;
> }
> @@ -1893,6 +1911,12 @@ static struct cftype files[] = {
> .private = FILE_MEMORY_PRESSURE_ENABLED,
> },
>
> + {
> + .name = "isolcpus",
> + .seq_show = cpuset_common_seq_show,
> + .private = FILE_ISOLCPUS,
> + },
> +
> { } /* terminate */
> };
>
>

2015-02-26 15:25:48

by Rik van Riel

Subject: Re: [PATCH 2/2] cpusets,isolcpus: add file to show isolated cpus in cpuset

On 02/26/2015 06:05 AM, Zefan Li wrote:
>> +static void cpuset_seq_print_isolcpus(struct seq_file *sf, struct cpuset *cs)
>> +{
>> + cpumask_var_t my_isolated_cpus;
>> +
>> + if (!alloc_cpumask_var(&my_isolated_cpus, GFP_KERNEL))
>> + return;
>> +
>
> Make it return -ENOMEM? Or make it a global variable and allocate memory for it
> in cpuset_init().

OK, can do.

I see that cpuset_common_seq_show already takes a lock, so having
one global variable for this should not introduce any additional
contention.

I will send a v4.

>> @@ -1733,6 +1748,9 @@ static int cpuset_common_seq_show(struct seq_file *sf, void *v)
>> case FILE_EFFECTIVE_MEMLIST:
>> seq_printf(sf, "%*pbl\n", nodemask_pr_args(&cs->effective_mems));
>> break;
>> + case FILE_ISOLCPUS:
>> + cpuset_seq_print_isolcpus(sf, cs);
>> + break;
>> default:
>> ret = -EINVAL;
>> }


--
All rights reversed

2015-02-26 17:13:12

by Rik van Riel

Subject: [PATCH v4 2/2] cpusets,isolcpus: add file to show isolated cpus in cpuset

On Thu, 26 Feb 2015 19:05:57 +0800
Zefan Li <[email protected]> wrote:

> Make it return -ENOMEM? Or make it a global variable and allocate memory for it
> in cpuset_init().

Here you are. This addresses your concern, as well as the
issue David Rientjes found earlier.

---8<---

Subject: cpusets,isolcpus: add file to show isolated cpus in cpuset

The previous patch makes the code skip over isolcpus when building
scheduler load balancing domains. This makes it hard for a user to
see which of the CPUs in a cpuset are participating in load
balancing, and which ones are isolated cpus.

Add a cpuset.isolcpus file with info on which cpus in a cpuset are
isolated CPUs.

This file is read-only for now. In the future we could extend things
so isolcpus can be changed at run time, for the root (system wide)
cpuset only.

Acked-by: David Rientjes <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Clark Williams <[email protected]>
Cc: Li Zefan <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Luiz Capitulino <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: [email protected]
Signed-off-by: Rik van Riel <[email protected]>
---
kernel/cpuset.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index b544e5229d99..5462e1ca90bd 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1563,6 +1563,7 @@ typedef enum {
FILE_MEMORY_PRESSURE,
FILE_SPREAD_PAGE,
FILE_SPREAD_SLAB,
+ FILE_ISOLCPUS,
} cpuset_filetype_t;

static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
@@ -1704,6 +1705,16 @@ static ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
return retval ?: nbytes;
}

+/* protected by the lock in cpuset_common_seq_show */
+static cpumask_var_t print_isolated_cpus;
+
+static void cpuset_seq_print_isolcpus(struct seq_file *sf, struct cpuset *cs)
+{
+ cpumask_and(print_isolated_cpus, cs->cpus_allowed, cpu_isolated_map);
+
+ seq_printf(sf, "%*pbl\n", cpumask_pr_args(print_isolated_cpus));
+}
+
/*
* These ascii lists should be read in a single call, by using a user
* buffer large enough to hold the entire map. If read in smaller
@@ -1733,6 +1744,9 @@ static int cpuset_common_seq_show(struct seq_file *sf, void *v)
case FILE_EFFECTIVE_MEMLIST:
seq_printf(sf, "%*pbl\n", nodemask_pr_args(&cs->effective_mems));
break;
+ case FILE_ISOLCPUS:
+ cpuset_seq_print_isolcpus(sf, cs);
+ break;
default:
ret = -EINVAL;
}
@@ -1893,6 +1907,12 @@ static struct cftype files[] = {
.private = FILE_MEMORY_PRESSURE_ENABLED,
},

+ {
+ .name = "isolcpus",
+ .seq_show = cpuset_common_seq_show,
+ .private = FILE_ISOLCPUS,
+ },
+
{ } /* terminate */
};

@@ -2070,6 +2090,8 @@ int __init cpuset_init(void)
BUG();
if (!alloc_cpumask_var(&top_cpuset.effective_cpus, GFP_KERNEL))
BUG();
+ if (!alloc_cpumask_var(&print_isolated_cpus, GFP_KERNEL))
+ BUG();

cpumask_setall(top_cpuset.cpus_allowed);
nodes_setall(top_cpuset.mems_allowed);

2015-02-27 09:32:42

by Peter Zijlstra

Subject: Re: [PATCH 1/2] cpusets,isolcpus: exclude isolcpus from load balancing in cpusets

On Wed, Feb 25, 2015 at 11:38:07AM -0500, [email protected] wrote:
> From: Rik van Riel <[email protected]>
>
> Ensure that cpus specified with the isolcpus= boot commandline
> option stay outside of the load balancing in the kernel scheduler.
>
> Operations like load balancing can introduce unwanted latencies,
> which is exactly what the isolcpus= commandline is there to prevent.
>
> Previously, simply creating a new cpuset, without even touching the
> cpuset.cpus field inside the new cpuset, would undo the effects of
> isolcpus=, by creating a scheduler domain spanning the whole system,
> and setting up load balancing inside that domain. The cpuset root
> cpuset.cpus file is read-only, so there was not even a way to undo
> that effect.
>
> This does not impact the majority of cpusets users, since isolcpus=
> is a fairly specialized feature used for realtime purposes.
>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Clark Williams <[email protected]>
> Cc: Li Zefan <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Luiz Capitulino <[email protected]>
> Cc: Mike Galbraith <[email protected]>
> Cc: [email protected]
> Signed-off-by: Rik van Riel <[email protected]>
> Tested-by: David Rientjes <[email protected]>

Might I ask you to update Documentation/cgroups/cpusets.txt with this
knowledge? While it does mention isolcpus, it does not clarify the
interaction between it and cpusets.

Other than that,

Acked-by: Peter Zijlstra (Intel) <[email protected]>

2015-02-27 17:08:39

by Rik van Riel

Subject: [PATCH 3/2] cpusets,isolcpus: document relationship between cpusets & isolcpus

Document the subtly changed relationship between cpusets and isolcpus.
Turns out the old documentation did not quite match the code...

Signed-off-by: Rik van Riel <[email protected]>
Suggested-by: Peter Zijlstra <[email protected]>
---
Documentation/cgroups/cpusets.txt | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/Documentation/cgroups/cpusets.txt b/Documentation/cgroups/cpusets.txt
index f2235a162529..fdf7dff3f607 100644
--- a/Documentation/cgroups/cpusets.txt
+++ b/Documentation/cgroups/cpusets.txt
@@ -392,8 +392,10 @@ Put simply, it costs less to balance between two smaller sched domains
than one big one, but doing so means that overloads in one of the
two domains won't be load balanced to the other one.

-By default, there is one sched domain covering all CPUs, except those
-marked isolated using the kernel boot time "isolcpus=" argument.
+By default, there is one sched domain covering all CPUs, including those
+marked isolated using the kernel boot time "isolcpus=" argument. However,
+the isolated CPUs will not participate in load balancing, and will not
+have tasks running on them unless explicitly assigned.

This default load balancing across all CPUs is not well suited for
the following two situations:
@@ -465,6 +467,10 @@ such partially load balanced cpusets, as they may be artificially
constrained to some subset of the CPUs allowed to them, for lack of
load balancing to the other CPUs.

+CPUs in "cpuset.isolcpus" were excluded from load balancing by the
+isolcpus= kernel boot option, and will never be load balanced regardless
+of the value of "cpuset.sched_load_balance" in any cpuset.
+
1.7.1 sched_load_balance implementation details.
------------------------------------------------

2015-02-27 21:15:24

by David Rientjes

Subject: Re: [PATCH 3/2] cpusets,isolcpus: document relationship between cpusets & isolcpus

On Fri, 27 Feb 2015, Rik van Riel wrote:

> Document the subtly changed relationship between cpusets and isolcpus.
> Turns out the old documentation did not quite match the code...
>
> Signed-off-by: Rik van Riel <[email protected]>
> Suggested-by: Peter Zijlstra <[email protected]>

Acked-by: David Rientjes <[email protected]>

2015-02-28 03:21:57

by Zefan Li

Subject: Re: [PATCH 1/2] cpusets,isolcpus: exclude isolcpus from load balancing in cpusets

On 2015/2/26 0:38, [email protected] wrote:
> From: Rik van Riel <[email protected]>
>
> Ensure that cpus specified with the isolcpus= boot commandline
> option stay outside of the load balancing in the kernel scheduler.
>
> Operations like load balancing can introduce unwanted latencies,
> which is exactly what the isolcpus= commandline is there to prevent.
>
> Previously, simply creating a new cpuset, without even touching the
> cpuset.cpus field inside the new cpuset, would undo the effects of
> isolcpus=, by creating a scheduler domain spanning the whole system,
> and setting up load balancing inside that domain. The cpuset root
> cpuset.cpus file is read-only, so there was not even a way to undo
> that effect.
>
> This does not impact the majority of cpusets users, since isolcpus=
> is a fairly specialized feature used for realtime purposes.
>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Clark Williams <[email protected]>
> Cc: Li Zefan <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Luiz Capitulino <[email protected]>
> Cc: Mike Galbraith <[email protected]>
> Cc: [email protected]
> Signed-off-by: Rik van Riel <[email protected]>
> Tested-by: David Rientjes <[email protected]>

Acked-by: Zefan Li <[email protected]>

2015-02-28 03:22:51

by Zefan Li

Subject: Re: [PATCH v4 2/2] cpusets,isolcpus: add file to show isolated cpus in cpuset

> Subject: cpusets,isolcpus: add file to show isolated cpus in cpuset
>
> The previous patch makes the code skip over isolcpus when building
> scheduler load balancing domains. This makes it hard for a user to
> see which of the CPUs in a cpuset are participating in load
> balancing, and which ones are isolated cpus.
>
> Add a cpuset.isolcpus file with info on which cpus in a cpuset are
> isolated CPUs.
>
> This file is read-only for now. In the future we could extend things
> so isolcpus can be changed at run time, for the root (system wide)
> cpuset only.
>
> Acked-by: David Rientjes <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Clark Williams <[email protected]>
> Cc: Li Zefan <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Luiz Capitulino <[email protected]>
> Cc: David Rientjes <[email protected]>
> Cc: Mike Galbraith <[email protected]>
> Cc: [email protected]
> Signed-off-by: Rik van Riel <[email protected]>

Acked-by: Zefan Li <[email protected]>

2015-02-28 03:23:34

by Zefan Li

Subject: Re: [PATCH 3/2] cpusets,isolcpus: document relationship between cpusets & isolcpus

On 2015/2/28 1:08, Rik van Riel wrote:
> Document the subtly changed relationship between cpusets and isolcpus.
> Turns out the old documentation did not quite match the code...
>
> Signed-off-by: Rik van Riel <[email protected]>
> Suggested-by: Peter Zijlstra <[email protected]>

Acked-by: Zefan Li <[email protected]>