Ingo,
Please pull the core/isolation-v4 branch that can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
core/isolation-v4
HEAD: cf4c55aad44251369c8507c3823f9f9c51d4dc77
Summary of changes:
* Move the housekeeping code that was tied to NO_HZ to its own subsystem.
Currently NO_HZ governs the other isolation features, which is not
right: dynticks is just one isolation feature among the others. We want
to centralize the CPU isolation decisions in a subsystem of its own instead.
* Integrate isolcpus code into housekeeping and treat it as a CPU isolation
feature.
* Reuse the "isolcpus=" kernel parameter to control the CPU isolation.
For now only tick and domains can be isolated after this patchset:
isolcpus=1-7 # isolate domains on CPU range 1 to 7
# "domain" flag is implicit by default to
# keep the current behaviour
isolcpus=domain,1-7 # do the same
isolcpus=nohz,1-7 # apply nohz_full to CPU range 1 to 7
isolcpus=nohz,domain,1-7 # apply nohz_full and isolate domains of
# CPU range 1 to 7
Thanks,
Frederic
---
Frederic Weisbecker (12):
housekeeping: Move housekeeping related code to its own file
watchdog: Use housekeeping_cpumask() instead of ad-hoc version
housekeeping: Provide a dynamic off-case to housekeeping_any_cpu()
housekeeping: Make housekeeping cpumask private
housekeeping: Use its own static key
housekeeping: Rename is_housekeeping_cpu to housekeeping_cpu
housekeeping: Move it under its own config, independent from NO_HZ
housekeeping: Introduce housekeeping flags
housekeeping: Handle nohz_full= parameter
housekeeping: Move isolcpus to housekeeping
housekeeping: Add basic isolcpus flags
housekeeping: Document isolcpus flags
Documentation/admin-guide/kernel-parameters.txt | 33 +++---
drivers/base/cpu.c | 11 +-
drivers/net/ethernet/tile/tilegx.c | 6 +-
include/linux/housekeeping.h | 51 ++++++++
include/linux/sched.h | 2 -
include/linux/tick.h | 39 +------
init/Kconfig | 7 ++
init/main.c | 2 +
kernel/Makefile | 1 +
kernel/cgroup/cpuset.c | 15 +--
kernel/housekeeping.c | 149 ++++++++++++++++++++++++
kernel/rcu/tree_plugin.h | 3 +-
kernel/rcu/update.c | 3 +-
kernel/sched/core.c | 25 +---
kernel/sched/fair.c | 3 +-
kernel/sched/topology.c | 24 +---
kernel/time/tick-sched.c | 31 +----
kernel/watchdog.c | 13 +--
18 files changed, 276 insertions(+), 142 deletions(-)
While trying to disable the watchdog on nohz_full CPUs, the watchdog
implements an ad-hoc version of housekeeping_cpumask(). Let's replace
those re-invented lines.
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Luiz Capitulino <[email protected]>
---
kernel/watchdog.c | 11 +++--------
1 file changed, 3 insertions(+), 8 deletions(-)
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 3cc5596..e672752 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -942,15 +942,10 @@ void __init lockup_detector_init(void)
{
set_sample_period();
-#ifdef CONFIG_NO_HZ_FULL
- if (tick_nohz_full_enabled()) {
+ if (tick_nohz_full_enabled())
pr_info("Disabling watchdog on nohz_full cores by default\n");
- cpumask_copy(&watchdog_cpumask, housekeeping_mask);
- } else
- cpumask_copy(&watchdog_cpumask, cpu_possible_mask);
-#else
- cpumask_copy(&watchdog_cpumask, cpu_possible_mask);
-#endif
+
+ cpumask_copy(&watchdog_cpumask, housekeeping_cpumask());
if (watchdog_enabled)
watchdog_enable_all_cpus();
--
2.7.4
Nobody needs to access the housekeeping cpumask directly.
housekeeping_cpumask() already takes care of it.
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Luiz Capitulino <[email protected]>
---
include/linux/housekeeping.h | 31 ++++++++++---------------------
kernel/housekeeping.c | 33 ++++++++++++++++++++++++++++++++-
2 files changed, 42 insertions(+), 22 deletions(-)
diff --git a/include/linux/housekeeping.h b/include/linux/housekeeping.h
index 64d0ee5..31a1401 100644
--- a/include/linux/housekeeping.h
+++ b/include/linux/housekeeping.h
@@ -6,46 +6,35 @@
#include <linux/tick.h>
#ifdef CONFIG_NO_HZ_FULL
-extern cpumask_var_t housekeeping_mask;
+extern int housekeeping_any_cpu(void);
+extern const struct cpumask *housekeeping_cpumask(void);
+extern void housekeeping_affine(struct task_struct *t);
+extern bool housekeeping_test_cpu(int cpu);
extern void __init housekeeping_init(void);
+
#else
-static inline void housekeeping_init(void) { }
-#endif /* CONFIG_NO_HZ_FULL */
static inline int housekeeping_any_cpu(void)
{
-#ifdef CONFIG_NO_HZ_FULL
- if (tick_nohz_full_enabled())
- return cpumask_any_and(housekeeping_mask, cpu_online_mask);
-#endif
return smp_processor_id();
}
static inline const struct cpumask *housekeeping_cpumask(void)
{
-#ifdef CONFIG_NO_HZ_FULL
- if (tick_nohz_full_enabled())
- return housekeeping_mask;
-#endif
return cpu_possible_mask;
}
+static inline void housekeeping_affine(struct task_struct *t) { }
+static inline void housekeeping_init(void) { }
+#endif /* CONFIG_NO_HZ_FULL */
+
static inline bool is_housekeeping_cpu(int cpu)
{
#ifdef CONFIG_NO_HZ_FULL
if (tick_nohz_full_enabled())
- return cpumask_test_cpu(cpu, housekeeping_mask);
+ return housekeeping_test_cpu(cpu);
#endif
return true;
}
-static inline void housekeeping_affine(struct task_struct *t)
-{
-#ifdef CONFIG_NO_HZ_FULL
- if (tick_nohz_full_enabled())
- set_cpus_allowed_ptr(t, housekeeping_mask);
-
-#endif
-}
-
#endif /* _LINUX_HOUSEKEEPING_H */
diff --git a/kernel/housekeeping.c b/kernel/housekeeping.c
index b41c52e..0e70dc8 100644
--- a/kernel/housekeeping.c
+++ b/kernel/housekeeping.c
@@ -11,7 +11,38 @@
#include <linux/init.h>
#include <linux/kernel.h>
-cpumask_var_t housekeeping_mask;
+static cpumask_var_t housekeeping_mask;
+
+int housekeeping_any_cpu(void)
+{
+ if (tick_nohz_full_enabled())
+ return cpumask_any_and(housekeeping_mask, cpu_online_mask);
+
+ return smp_processor_id();
+}
+
+const struct cpumask *housekeeping_cpumask(void)
+{
+ if (tick_nohz_full_enabled())
+ return housekeeping_mask;
+
+ return cpu_possible_mask;
+}
+
+void housekeeping_affine(struct task_struct *t)
+{
+ if (tick_nohz_full_enabled())
+ set_cpus_allowed_ptr(t, housekeeping_mask);
+}
+
+bool housekeeping_test_cpu(int cpu)
+{
+ if (tick_nohz_full_enabled())
+ return cpumask_test_cpu(cpu, housekeeping_mask);
+
+ return true;
+}
+
void __init housekeeping_init(void)
{
--
2.7.4
Before we implement isolcpus under housekeeping, we need the isolation
features to be more fine-grained. For example, some people want nohz_full
without the full scheduler isolation, while others want full scheduler
isolation without nohz_full.
So let's cut all these isolation features piecewise, at the risk of
overcutting it right now. We can still merge some flags later if they
always make sense together.
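
To illustrate the flag-gated lookups described above, here is a minimal
userspace sketch, not the kernel code itself. It assumes at most 64 CPUs
represented as bits of a uint64_t; struct hk_model and the hk_model_*()
helpers are hypothetical names used only for this illustration, and the
"overridden" boolean stands in for the static key.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values as introduced by this patch. */
enum hk_flags {
	HK_FLAG_TIMER = 1,
	HK_FLAG_RCU   = (1 << 1),
	HK_FLAG_MISC  = (1 << 2),
	HK_FLAG_SCHED = (1 << 3),
};

/* Hypothetical userspace model: one bit per CPU, at most 64 CPUs. */
struct hk_model {
	bool overridden;        /* stands in for the static key */
	unsigned int flags;     /* isolation features currently enabled */
	uint64_t hk_mask;       /* CPUs allowed to run housekeeping work */
	uint64_t possible_mask; /* all possible CPUs */
};

/* Return the housekeeping mask only if one of @flags is enabled. */
static uint64_t hk_model_cpumask(const struct hk_model *m, unsigned int flags)
{
	if (m->overridden && (m->flags & flags))
		return m->hk_mask;
	return m->possible_mask;
}

/* A CPU counts as housekeeping unless the requested feature isolates it. */
static bool hk_model_test_cpu(const struct hk_model *m, int cpu,
			      unsigned int flags)
{
	assert(cpu >= 0 && cpu < 64);
	if (m->overridden && (m->flags & flags))
		return (m->hk_mask >> cpu) & 1;
	return true;
}
```

With flags set to HK_FLAG_TIMER | HK_FLAG_RCU | HK_FLAG_MISC, as
housekeeping_init() does in this patch, a lookup with HK_FLAG_SCHED falls
back to the full possible mask because that feature is not isolated.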
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Luiz Capitulino <[email protected]>
---
drivers/net/ethernet/tile/tilegx.c | 4 ++--
include/linux/housekeeping.h | 26 +++++++++++++++++---------
kernel/housekeeping.c | 26 +++++++++++++++-----------
kernel/rcu/tree_plugin.h | 2 +-
kernel/rcu/update.c | 2 +-
kernel/sched/core.c | 8 ++++----
kernel/sched/fair.c | 2 +-
kernel/watchdog.c | 3 ++-
8 files changed, 43 insertions(+), 30 deletions(-)
diff --git a/drivers/net/ethernet/tile/tilegx.c b/drivers/net/ethernet/tile/tilegx.c
index 8c7ef12..831f2db 100644
--- a/drivers/net/ethernet/tile/tilegx.c
+++ b/drivers/net/ethernet/tile/tilegx.c
@@ -2270,8 +2270,8 @@ static int __init tile_net_init_module(void)
tile_net_dev_init(name, mac);
if (!network_cpus_init())
- cpumask_and(&network_cpus_map, housekeeping_cpumask(),
- cpu_online_mask);
+ cpumask_and(&network_cpus_map,
+ housekeeping_cpumask(HK_FLAG_MISC), cpu_online_mask);
return 0;
}
diff --git a/include/linux/housekeeping.h b/include/linux/housekeeping.h
index dcbec47..b1a62544 100644
--- a/include/linux/housekeeping.h
+++ b/include/linux/housekeeping.h
@@ -5,35 +5,43 @@
#include <linux/init.h>
#include <linux/tick.h>
+enum hk_flags {
+ HK_FLAG_TIMER = 1,
+ HK_FLAG_RCU = (1 << 1),
+ HK_FLAG_MISC = (1 << 2),
+ HK_FLAG_SCHED = (1 << 3),
+};
+
#ifdef CONFIG_CPU_ISOLATION
DECLARE_STATIC_KEY_FALSE(housekeeping_overriden);
-extern int housekeeping_any_cpu(void);
-extern const struct cpumask *housekeeping_cpumask(void);
-extern void housekeeping_affine(struct task_struct *t);
-extern bool housekeeping_test_cpu(int cpu);
+extern int housekeeping_any_cpu(enum hk_flags flags);
+extern const struct cpumask *housekeeping_cpumask(enum hk_flags flags);
+extern void housekeeping_affine(struct task_struct *t, enum hk_flags flags);
+extern bool housekeeping_test_cpu(int cpu, enum hk_flags flags);
extern void __init housekeeping_init(void);
#else
-static inline int housekeeping_any_cpu(void)
+static inline int housekeeping_any_cpu(enum hk_flags flags)
{
return smp_processor_id();
}
-static inline const struct cpumask *housekeeping_cpumask(void)
+static inline const struct cpumask *housekeeping_cpumask(enum hk_flags flags)
{
return cpu_possible_mask;
}
-static inline void housekeeping_affine(struct task_struct *t) { }
+static inline void housekeeping_affine(struct task_struct *t,
+ enum hk_flags flags) { }
static inline void housekeeping_init(void) { }
#endif /* CONFIG_CPU_ISOLATION */
-static inline bool housekeeping_cpu(int cpu)
+static inline bool housekeeping_cpu(int cpu, enum hk_flags flags)
{
#ifdef CONFIG_CPU_ISOLATION
if (static_branch_unlikely(&housekeeping_overriden))
- return housekeeping_test_cpu(cpu);
+ return housekeeping_test_cpu(cpu, flags);
#endif
return true;
}
diff --git a/kernel/housekeeping.c b/kernel/housekeeping.c
index 272c344..2cf52ee 100644
--- a/kernel/housekeeping.c
+++ b/kernel/housekeeping.c
@@ -15,34 +15,36 @@
DEFINE_STATIC_KEY_FALSE(housekeeping_overriden);
EXPORT_SYMBOL_GPL(housekeeping_overriden);
static cpumask_var_t housekeeping_mask;
+static unsigned int housekeeping_flags;
-int housekeeping_any_cpu(void)
+int housekeeping_any_cpu(enum hk_flags flags)
{
if (static_branch_unlikely(&housekeeping_overriden))
- return cpumask_any_and(housekeeping_mask, cpu_online_mask);
-
+ if (housekeeping_flags & flags)
+ return cpumask_any_and(housekeeping_mask, cpu_online_mask);
return smp_processor_id();
}
-const struct cpumask *housekeeping_cpumask(void)
+const struct cpumask *housekeeping_cpumask(enum hk_flags flags)
{
if (static_branch_unlikely(&housekeeping_overriden))
- return housekeeping_mask;
-
+ if (housekeeping_flags & flags)
+ return housekeeping_mask;
return cpu_possible_mask;
}
-void housekeeping_affine(struct task_struct *t)
+void housekeeping_affine(struct task_struct *t, enum hk_flags flags)
{
if (static_branch_unlikely(&housekeeping_overriden))
- set_cpus_allowed_ptr(t, housekeeping_mask);
+ if (housekeeping_flags & flags)
+ set_cpus_allowed_ptr(t, housekeeping_mask);
}
-bool housekeeping_test_cpu(int cpu)
+bool housekeeping_test_cpu(int cpu, enum hk_flags flags)
{
if (static_branch_unlikely(&housekeeping_overriden))
- return cpumask_test_cpu(cpu, housekeeping_mask);
-
+ if (housekeeping_flags & flags)
+ return cpumask_test_cpu(cpu, housekeeping_mask);
return true;
}
@@ -61,6 +63,8 @@ void __init housekeeping_init(void)
cpumask_andnot(housekeeping_mask,
cpu_possible_mask, tick_nohz_full_mask);
+ housekeeping_flags = HK_FLAG_TIMER | HK_FLAG_RCU | HK_FLAG_MISC;
+
static_branch_enable(&housekeeping_overriden);
/* We need at least one CPU to handle housekeeping work */
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 387e0a2..e55ff70 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -2584,7 +2584,7 @@ static void rcu_bind_gp_kthread(void)
if (!tick_nohz_full_enabled())
return;
- housekeeping_affine(current);
+ housekeeping_affine(current, HK_FLAG_RCU);
}
/* Record the current task on dyntick-idle entry. */
diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index 1c003e2..f9df98e 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -719,7 +719,7 @@ static int __noreturn rcu_tasks_kthread(void *arg)
LIST_HEAD(rcu_tasks_holdouts);
/* Run on housekeeping CPUs by default. Sysadm can move if desired. */
- housekeeping_affine(current);
+ housekeeping_affine(current, HK_FLAG_RCU);
/*
* Each pass through the following loop makes one check for
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ec30a98..4b2c6ce 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -527,7 +527,7 @@ int get_nohz_timer_target(void)
int i, cpu = smp_processor_id();
struct sched_domain *sd;
- if (!idle_cpu(cpu) && housekeeping_cpu(cpu))
+ if (!idle_cpu(cpu) && housekeeping_cpu(cpu, HK_FLAG_TIMER))
return cpu;
rcu_read_lock();
@@ -536,15 +536,15 @@ int get_nohz_timer_target(void)
if (cpu == i)
continue;
- if (!idle_cpu(i) && housekeeping_cpu(i)) {
+ if (!idle_cpu(i) && housekeeping_cpu(i, HK_FLAG_TIMER)) {
cpu = i;
goto unlock;
}
}
}
- if (!housekeeping_cpu(cpu))
- cpu = housekeeping_any_cpu();
+ if (!housekeeping_cpu(cpu, HK_FLAG_TIMER))
+ cpu = housekeeping_any_cpu(HK_FLAG_TIMER);
unlock:
rcu_read_unlock();
return cpu;
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 43c4092..0c1564c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8741,7 +8741,7 @@ void nohz_balance_enter_idle(int cpu)
return;
/* Spare idle load balancing on CPUs that don't want to be disturbed: */
- if (!housekeeping_cpu(cpu))
+ if (!housekeeping_cpu(cpu, HK_FLAG_SCHED))
return;
if (test_bit(NOHZ_TICK_STOPPED, nohz_flags(cpu)))
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index e672752..0bb7c74 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -945,7 +945,8 @@ void __init lockup_detector_init(void)
if (tick_nohz_full_enabled())
pr_info("Disabling watchdog on nohz_full cores by default\n");
- cpumask_copy(&watchdog_cpumask, housekeeping_cpumask());
+ cpumask_copy(&watchdog_cpumask,
+ housekeeping_cpumask(HK_FLAG_TIMER));
if (watchdog_enabled)
watchdog_enable_all_cpus();
--
2.7.4
Add flags to control nohz and domain isolations from "isolcpus=", in
order to centralize the isolation features under a common interface. Domain
isolation remains the default so as not to break the existing isolcpus
boot parameter behaviour.
Further flags in the future may include 0hz (1hz tick offload) and timers,
workqueue, RCU, kthread, watchdog, likely all merged together in a
common flag ("async"?). In any case, this will have to be modifiable by
cpusets.
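
The flag prefix handling can be tried out in userspace with a sketch
along these lines. It is an illustration only: isolcpus_parse_flags() and
its cpulist out-parameter are hypothetical, and the flag values are copied
from the hk_flags enum of this series. The kernel version hands the
remaining string to housekeeping_setup() instead of returning it.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Values taken from enum hk_flags in this series. */
#define HK_FLAG_TICK   (1 << 4)
#define HK_FLAG_DOMAIN (1 << 5)

/*
 * Parse the optional flag prefix of an "isolcpus=" value.
 * On success, return the flags and point *cpulist at the CPU list that
 * follows. On an unknown flag, return 0 and set *cpulist to NULL.
 */
static unsigned int isolcpus_parse_flags(const char *str, const char **cpulist)
{
	unsigned int flags = 0;

	while (isalpha((unsigned char)*str)) {
		if (!strncmp(str, "nohz,", 5)) {
			str += 5;
			flags |= HK_FLAG_TICK;
			continue;
		}

		if (!strncmp(str, "domain,", 7)) {
			str += 7;
			flags |= HK_FLAG_DOMAIN;
			continue;
		}

		/* Unknown flag: reject the whole parameter */
		*cpulist = NULL;
		return 0;
	}

	/* Default behaviour for isolcpus without flags */
	if (!flags)
		flags = HK_FLAG_DOMAIN;

	*cpulist = str;
	return flags;
}
```

Because a CPU list starts with a digit, the loop stops at the first
non-alphabetic character, so "1-7" alone still maps to the implicit
"domain" flag.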
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Luiz Capitulino <[email protected]>
---
kernel/housekeeping.c | 26 +++++++++++++++++++++++++-
1 file changed, 25 insertions(+), 1 deletion(-)
diff --git a/kernel/housekeeping.c b/kernel/housekeeping.c
index 30ee98f..089472f 100644
--- a/kernel/housekeeping.c
+++ b/kernel/housekeeping.c
@@ -11,6 +11,7 @@
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/static_key.h>
+#include <linux/ctype.h>
DEFINE_STATIC_KEY_FALSE(housekeeping_overriden);
EXPORT_SYMBOL_GPL(housekeeping_overriden);
@@ -120,6 +121,29 @@ __setup("nohz_full=", housekeeping_nohz_full_setup);
static int __init housekeeping_isolcpus_setup(char *str)
{
- return housekeeping_setup(str, HK_FLAG_DOMAIN);
+ unsigned int flags = 0;
+
+ while (isalpha(*str)) {
+ if (!strncmp(str, "nohz,", 5)) {
+ str += 5;
+ flags |= HK_FLAG_TICK;
+ continue;
+ }
+
+ if (!strncmp(str, "domain,", 7)) {
+ str += 7;
+ flags |= HK_FLAG_DOMAIN;
+ continue;
+ }
+
+ pr_warn("isolcpus: Error, unknown flag\n");
+ return 0;
+ }
+
+ /* Default behaviour for isolcpus without flags */
+ if (!flags)
+ flags |= HK_FLAG_DOMAIN;
+
+ return housekeeping_setup(str, flags);
}
__setup("isolcpus=", housekeeping_isolcpus_setup);
--
2.7.4
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Luiz Capitulino <[email protected]>
---
Documentation/admin-guide/kernel-parameters.txt | 35 +++++++++++++++----------
1 file changed, 21 insertions(+), 14 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 0549662..34ea914 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1727,20 +1727,27 @@
isapnp= [ISAPNP]
Format: <RDP>,<reset>,<pci_scan>,<verbosity>
- isolcpus= [KNL,SMP] Isolate CPUs from the general scheduler.
- The argument is a cpu list, as described above.
-
- This option can be used to specify one or more CPUs
- to isolate from the general SMP balancing and scheduling
- algorithms. You can move a process onto or off an
- "isolated" CPU via the CPU affinity syscalls or cpuset.
- <cpu number> begins at 0 and the maximum value is
- "number of CPUs in system - 1".
-
- This option is the preferred way to isolate CPUs. The
- alternative -- manually setting the CPU mask of all
- tasks in the system -- can cause problems and
- suboptimal load balancer performance.
+ isolcpus= [KNL,SMP] Isolate a given set of CPUs from disturbance.
+ Format: [flag-list,]<cpu-list>
+
+ Specify one or more CPUs to isolate from disturbances
+ specified in the flag list (default: domain):
+
+ nohz
+ Disable the tick when a single task runs.
+ domain
+ Isolate from the general SMP balancing and scheduling
+ algorithms. This option is the preferred way to isolate
+ CPUs from tasks. The alternative -- manually setting the
CPU mask of all tasks in the system -- can cause problems
+ and suboptimal load balancer performance. You can move a
+ process onto or off an "isolated" CPU via the CPU
+ affinity syscalls or cpuset. <cpu number> begins at 0
+ and the maximum value is "number of CPUs in system - 1".
+
+ The format of <cpu-list> is described above.
+
+
iucv= [HW,NET]
--
2.7.4
We want to centralize the isolation features in the housekeeping
subsystem, and scheduler domain isolation is a significant part of it.
No behaviour change is intended; we just reuse the housekeeping cpumask
and core code.
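
As a quick sanity check of the "no behaviour change" claim, the sketch
below models cpumasks as 64-bit words and compares the old
cpumask_andnot() form against the new cpumask_and() form. The helper names
are hypothetical. It assumes cpu_map is a subset of the possible mask and
ignores the corner case where every CPU is isolated, in which
housekeeping_setup() adds the boot CPU back.

```c
#include <assert.h>
#include <stdint.h>

/* Old form: cpumask_andnot(dst, cpu_map, cpu_isolated_map) */
static uint64_t domain_span_old(uint64_t cpu_map, uint64_t isolated)
{
	return cpu_map & ~isolated;
}

/* Housekeeping mask as built at boot: possible CPUs minus isolated ones */
static uint64_t hk_mask_from_isolated(uint64_t possible, uint64_t isolated)
{
	return possible & ~isolated;
}

/* New form: cpumask_and(dst, cpu_map, housekeeping_cpumask(HK_FLAG_DOMAIN)) */
static uint64_t domain_span_new(uint64_t cpu_map, uint64_t hk_mask)
{
	return cpu_map & hk_mask;
}

/* Both forms must select the same CPUs when cpu_map is within possible. */
static void check_no_behaviour_change(uint64_t possible, uint64_t cpu_map,
				      uint64_t isolated)
{
	assert((cpu_map & ~possible) == 0);
	assert(domain_span_old(cpu_map, isolated) ==
	       domain_span_new(cpu_map,
			       hk_mask_from_isolated(possible, isolated)));
}
```

For example, with 8 possible CPUs and isolcpus=1-7, both forms leave only
CPU 0 in the scheduler domain span.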
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Luiz Capitulino <[email protected]>
---
drivers/base/cpu.c | 11 ++++++++-
include/linux/housekeeping.h | 1 +
include/linux/sched.h | 2 --
kernel/cgroup/cpuset.c | 15 ++++--------
kernel/housekeeping.c | 57 +++++++++++++++++++++++++++++++++++---------
kernel/sched/core.c | 16 +------------
kernel/sched/topology.c | 24 +++++--------------
7 files changed, 69 insertions(+), 57 deletions(-)
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 321cd7b..7aa0d12 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -18,6 +18,7 @@
#include <linux/cpufeature.h>
#include <linux/tick.h>
#include <linux/pm_qos.h>
+#include <linux/housekeeping.h>
#include "base.h"
@@ -271,8 +272,16 @@ static ssize_t print_cpus_isolated(struct device *dev,
struct device_attribute *attr, char *buf)
{
int n = 0, len = PAGE_SIZE-2;
+ cpumask_var_t isolated;
- n = scnprintf(buf, len, "%*pbl\n", cpumask_pr_args(cpu_isolated_map));
+ if (!alloc_cpumask_var(&isolated, GFP_KERNEL))
+ return -ENOMEM;
+
+ cpumask_andnot(isolated, cpu_possible_mask,
+ housekeeping_cpumask(HK_FLAG_DOMAIN));
+ n = scnprintf(buf, len, "%*pbl\n", cpumask_pr_args(isolated));
+
+ free_cpumask_var(isolated);
return n;
}
diff --git a/include/linux/housekeeping.h b/include/linux/housekeeping.h
index 35fb197..a99466e 100644
--- a/include/linux/housekeeping.h
+++ b/include/linux/housekeeping.h
@@ -11,6 +11,7 @@ enum hk_flags {
HK_FLAG_MISC = (1 << 2),
HK_FLAG_SCHED = (1 << 3),
HK_FLAG_TICK = (1 << 4),
+ HK_FLAG_DOMAIN = (1 << 5),
};
#ifdef CONFIG_CPU_ISOLATION
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 92fb8dd..962de00 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -166,8 +166,6 @@ struct task_group;
/* Task command name length: */
#define TASK_COMM_LEN 16
-extern cpumask_var_t cpu_isolated_map;
-
extern void scheduler_tick(void);
#define MAX_SCHEDULE_TIMEOUT LONG_MAX
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 4657e29..4156ad9 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -57,7 +57,7 @@
#include <linux/backing-dev.h>
#include <linux/sort.h>
#include <linux/oom.h>
-
+#include <linux/housekeeping.h>
#include <linux/uaccess.h>
#include <linux/atomic.h>
#include <linux/mutex.h>
@@ -656,7 +656,6 @@ static int generate_sched_domains(cpumask_var_t **domains,
int csn; /* how many cpuset ptrs in csa so far */
int i, j, k; /* indices for partition finding loops */
cpumask_var_t *doms; /* resulting partition; i.e. sched domains */
- cpumask_var_t non_isolated_cpus; /* load balanced CPUs */
struct sched_domain_attr *dattr; /* attributes for custom domains */
int ndoms = 0; /* number of sched domains in result */
int nslot; /* next empty doms[] struct cpumask slot */
@@ -666,10 +665,6 @@ static int generate_sched_domains(cpumask_var_t **domains,
dattr = NULL;
csa = NULL;
- if (!alloc_cpumask_var(&non_isolated_cpus, GFP_KERNEL))
- goto done;
- cpumask_andnot(non_isolated_cpus, cpu_possible_mask, cpu_isolated_map);
-
/* Special case for the 99% of systems with one, full, sched domain */
if (is_sched_load_balance(&top_cpuset)) {
ndoms = 1;
@@ -683,7 +678,7 @@ static int generate_sched_domains(cpumask_var_t **domains,
update_domain_attr_tree(dattr, &top_cpuset);
}
cpumask_and(doms[0], top_cpuset.effective_cpus,
- non_isolated_cpus);
+ housekeeping_cpumask(HK_FLAG_DOMAIN));
goto done;
}
@@ -707,7 +702,8 @@ static int generate_sched_domains(cpumask_var_t **domains,
*/
if (!cpumask_empty(cp->cpus_allowed) &&
!(is_sched_load_balance(cp) &&
- cpumask_intersects(cp->cpus_allowed, non_isolated_cpus)))
+ cpumask_intersects(cp->cpus_allowed,
+ housekeeping_cpumask(HK_FLAG_DOMAIN))))
continue;
if (is_sched_load_balance(cp))
@@ -789,7 +785,7 @@ static int generate_sched_domains(cpumask_var_t **domains,
if (apn == b->pn) {
cpumask_or(dp, dp, b->effective_cpus);
- cpumask_and(dp, dp, non_isolated_cpus);
+ cpumask_and(dp, dp, housekeeping_cpumask(HK_FLAG_DOMAIN));
if (dattr)
update_domain_attr_tree(dattr + nslot, b);
@@ -802,7 +798,6 @@ static int generate_sched_domains(cpumask_var_t **domains,
BUG_ON(nslot != ndoms);
done:
- free_cpumask_var(non_isolated_cpus);
kfree(csa);
/*
diff --git a/kernel/housekeeping.c b/kernel/housekeeping.c
index 8fb8d6b..30ee98f 100644
--- a/kernel/housekeeping.c
+++ b/kernel/housekeeping.c
@@ -59,32 +59,67 @@ void __init housekeeping_init(void)
WARN_ON_ONCE(cpumask_empty(housekeeping_mask));
}
-#ifdef CONFIG_NO_HZ_FULL
-static int __init housekeeping_nohz_full_setup(char *str)
+static int __init housekeeping_setup(char *str, enum hk_flags flags)
{
cpumask_var_t non_housekeeping_mask;
alloc_bootmem_cpumask_var(&non_housekeeping_mask);
if (cpulist_parse(str, non_housekeeping_mask) < 0) {
- pr_warn("Housekeeping: Incorrect nohz_full cpumask\n");
+ pr_warn("Housekeeping: nohz_full= or isolcpus= incorrect CPU range\n");
free_bootmem_cpumask_var(non_housekeeping_mask);
return 0;
}
- alloc_bootmem_cpumask_var(&housekeeping_mask);
- cpumask_andnot(housekeeping_mask, cpu_possible_mask, non_housekeeping_mask);
+ if (!housekeeping_flags) {
+ alloc_bootmem_cpumask_var(&housekeeping_mask);
+ cpumask_andnot(housekeeping_mask,
+ cpu_possible_mask, non_housekeeping_mask);
+ if (cpumask_empty(housekeeping_mask))
+ cpumask_set_cpu(smp_processor_id(), housekeeping_mask);
+ } else {
+ cpumask_var_t tmp;
- if (cpumask_empty(housekeeping_mask))
- cpumask_set_cpu(smp_processor_id(), housekeeping_mask);
+ alloc_bootmem_cpumask_var(&tmp);
+ cpumask_andnot(tmp, cpu_possible_mask, non_housekeeping_mask);
+ if (!cpumask_equal(tmp, housekeeping_mask)) {
+ pr_warn("Housekeeping: nohz_full= must match isolcpus=\n");
+ free_bootmem_cpumask_var(tmp);
+ free_bootmem_cpumask_var(non_housekeeping_mask);
+ return 0;
+ }
+ free_bootmem_cpumask_var(tmp);
+ }
- housekeeping_flags = HK_FLAG_TICK | HK_FLAG_TIMER |
- HK_FLAG_RCU | HK_FLAG_MISC;
+ if ((flags & HK_FLAG_TICK) && !(housekeeping_flags & HK_FLAG_TICK)) {
+ if (IS_ENABLED(CONFIG_NO_HZ_FULL)) {
+ tick_nohz_full_setup(non_housekeeping_mask);
+ } else {
+ pr_warn("Housekeeping: nohz unsupported."
+ " Build with CONFIG_NO_HZ_FULL\n");
+ free_bootmem_cpumask_var(non_housekeeping_mask);
+ return 0;
+ }
+ }
- tick_nohz_full_setup(non_housekeeping_mask);
+ housekeeping_flags |= flags;
free_bootmem_cpumask_var(non_housekeeping_mask);
return 1;
}
+
+static int __init housekeeping_nohz_full_setup(char *str)
+{
+ unsigned int flags;
+
+ flags = HK_FLAG_TICK | HK_FLAG_TIMER | HK_FLAG_RCU | HK_FLAG_MISC;
+
+ return housekeeping_setup(str, flags);
+}
__setup("nohz_full=", housekeeping_nohz_full_setup);
-#endif
+
+static int __init housekeeping_isolcpus_setup(char *str)
+{
+ return housekeeping_setup(str, HK_FLAG_DOMAIN);
+}
+__setup("isolcpus=", housekeeping_isolcpus_setup);
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4b2c6ce..be124f9 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -84,9 +84,6 @@ __read_mostly int scheduler_running;
*/
int sysctl_sched_rt_runtime = 950000;
-/* CPUs with isolated domains */
-cpumask_var_t cpu_isolated_map;
-
/*
* __task_rq_lock - lock the rq @p resides on.
*/
@@ -5705,10 +5702,6 @@ static inline void sched_init_smt(void) { }
void __init sched_init_smp(void)
{
- cpumask_var_t non_isolated_cpus;
-
- alloc_cpumask_var(&non_isolated_cpus, GFP_KERNEL);
-
sched_init_numa();
/*
@@ -5718,16 +5711,12 @@ void __init sched_init_smp(void)
*/
mutex_lock(&sched_domains_mutex);
sched_init_domains(cpu_active_mask);
- cpumask_andnot(non_isolated_cpus, cpu_possible_mask, cpu_isolated_map);
- if (cpumask_empty(non_isolated_cpus))
- cpumask_set_cpu(smp_processor_id(), non_isolated_cpus);
mutex_unlock(&sched_domains_mutex);
/* Move init over to a non-isolated CPU */
- if (set_cpus_allowed_ptr(current, non_isolated_cpus) < 0)
+ if (set_cpus_allowed_ptr(current, housekeeping_cpumask(HK_FLAG_DOMAIN)) < 0)
BUG();
sched_init_granularity();
- free_cpumask_var(non_isolated_cpus);
init_sched_rt_class();
init_sched_dl_class();
@@ -5931,9 +5920,6 @@ void __init sched_init(void)
calc_load_update = jiffies + LOAD_FREQ;
#ifdef CONFIG_SMP
- /* May be allocated at isolcpus cmdline parse time */
- if (cpu_isolated_map == NULL)
- zalloc_cpumask_var(&cpu_isolated_map, GFP_NOWAIT);
idle_thread_set_boot_cpu();
set_cpu_rq_start_time(smp_processor_id());
#endif
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index f1cf4f3..c6ec01b 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -3,6 +3,7 @@
*/
#include <linux/sched.h>
#include <linux/mutex.h>
+#include <linux/housekeeping.h>
#include "sched.h"
@@ -463,21 +464,6 @@ cpu_attach_domain(struct sched_domain *sd, struct root_domain *rd, int cpu)
update_top_cache_domain(cpu);
}
-/* Setup the mask of CPUs configured for isolated domains */
-static int __init isolated_cpu_setup(char *str)
-{
- int ret;
-
- alloc_bootmem_cpumask_var(&cpu_isolated_map);
- ret = cpulist_parse(str, cpu_isolated_map);
- if (ret) {
- pr_err("sched: Error, all isolcpus= values must be between 0 and %u\n", nr_cpu_ids);
- return 0;
- }
- return 1;
-}
-__setup("isolcpus=", isolated_cpu_setup);
-
struct s_data {
struct sched_domain ** __percpu sd;
struct root_domain *rd;
@@ -1773,7 +1759,7 @@ int sched_init_domains(const struct cpumask *cpu_map)
doms_cur = alloc_sched_domains(ndoms_cur);
if (!doms_cur)
doms_cur = &fallback_doms;
- cpumask_andnot(doms_cur[0], cpu_map, cpu_isolated_map);
+ cpumask_and(doms_cur[0], cpu_map, housekeeping_cpumask(HK_FLAG_DOMAIN));
err = build_sched_domains(doms_cur[0], NULL);
register_sched_domain_sysctl();
@@ -1856,7 +1842,8 @@ void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
doms_new = alloc_sched_domains(1);
if (doms_new) {
n = 1;
- cpumask_andnot(doms_new[0], cpu_active_mask, cpu_isolated_map);
+ cpumask_and(doms_new[0], cpu_active_mask,
+ housekeeping_cpumask(HK_FLAG_DOMAIN));
}
} else {
n = ndoms_new;
@@ -1879,7 +1866,8 @@ void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
if (!doms_new) {
n = 0;
doms_new = &fallback_doms;
- cpumask_andnot(doms_new[0], cpu_active_mask, cpu_isolated_map);
+ cpumask_and(doms_new[0], cpu_active_mask,
+ housekeeping_cpumask(HK_FLAG_DOMAIN));
}
/* Build new domains: */
--
2.7.4
We want to centralize the isolation management in the housekeeping
subsystem. Therefore we need to handle the nohz_full= parameter from
there.
Since nohz_full= so far has involved unbound timers, watchdog, RCU
and tilegx NAPI isolation, we keep that default behaviour.
nohz_full= is expected to be deprecated in the future. We want to control
the isolation features from the isolcpus= parameter.
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Luiz Capitulino <[email protected]>
---
include/linux/housekeeping.h | 1 +
include/linux/tick.h | 2 ++
init/Kconfig | 1 -
kernel/housekeeping.c | 44 +++++++++++++++++++++++++++++++-------------
kernel/time/tick-sched.c | 13 +++----------
5 files changed, 37 insertions(+), 24 deletions(-)
diff --git a/include/linux/housekeeping.h b/include/linux/housekeeping.h
index b1a62544..35fb197 100644
--- a/include/linux/housekeeping.h
+++ b/include/linux/housekeeping.h
@@ -10,6 +10,7 @@ enum hk_flags {
HK_FLAG_RCU = (1 << 1),
HK_FLAG_MISC = (1 << 2),
HK_FLAG_SCHED = (1 << 3),
+ HK_FLAG_TICK = (1 << 4),
};
#ifdef CONFIG_CPU_ISOLATION
diff --git a/include/linux/tick.h b/include/linux/tick.h
index 68afc09..e2a163a 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -228,6 +228,7 @@ static inline void tick_dep_clear_signal(struct signal_struct *signal,
extern void tick_nohz_full_kick_cpu(int cpu);
extern void __tick_nohz_task_switch(void);
+extern void __init tick_nohz_full_setup(cpumask_var_t cpumask);
#else
static inline bool tick_nohz_full_enabled(void) { return false; }
static inline bool tick_nohz_full_cpu(int cpu) { return false; }
@@ -248,6 +249,7 @@ static inline void tick_dep_clear_signal(struct signal_struct *signal,
static inline void tick_nohz_full_kick_cpu(int cpu) { }
static inline void __tick_nohz_task_switch(void) { }
+static inline void tick_nohz_full_setup(cpumask_var_t cpumask) { }
#endif
static inline void tick_nohz_task_switch(void)
diff --git a/init/Kconfig b/init/Kconfig
index 6f52e6f..f8564df5 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -474,7 +474,6 @@ endmenu # "CPU/Task time and stats accounting"
config CPU_ISOLATION
bool "CPU isolation"
- depends on NO_HZ_FULL
help
Make sure that CPUs running critical tasks are not disturbed by
any source of "noise" such as unbound workqueues, timers, kthreads...
diff --git a/kernel/housekeeping.c b/kernel/housekeeping.c
index 2cf52ee..8fb8d6b 100644
--- a/kernel/housekeeping.c
+++ b/kernel/housekeeping.c
@@ -50,23 +50,41 @@ bool housekeeping_test_cpu(int cpu, enum hk_flags flags)
void __init housekeeping_init(void)
{
- if (!tick_nohz_full_enabled())
+ if (!housekeeping_flags)
return;
- if (!alloc_cpumask_var(&housekeeping_mask, GFP_KERNEL)) {
- WARN(1, "NO_HZ: Can't allocate not-full dynticks cpumask\n");
- cpumask_clear(tick_nohz_full_mask);
- tick_nohz_full_running = false;
- return;
- }
-
- cpumask_andnot(housekeeping_mask,
- cpu_possible_mask, tick_nohz_full_mask);
-
- housekeeping_flags = HK_FLAG_TIMER | HK_FLAG_RCU | HK_FLAG_MISC;
-
static_branch_enable(&housekeeping_overriden);
/* We need at least one CPU to handle housekeeping work */
WARN_ON_ONCE(cpumask_empty(housekeeping_mask));
}
+
+#ifdef CONFIG_NO_HZ_FULL
+static int __init housekeeping_nohz_full_setup(char *str)
+{
+ cpumask_var_t non_housekeeping_mask;
+
+ alloc_bootmem_cpumask_var(&non_housekeeping_mask);
+ if (cpulist_parse(str, non_housekeeping_mask) < 0) {
+ pr_warn("Housekeeping: Incorrect nohz_full cpumask\n");
+ free_bootmem_cpumask_var(non_housekeeping_mask);
+ return 0;
+ }
+
+ alloc_bootmem_cpumask_var(&housekeeping_mask);
+ cpumask_andnot(housekeeping_mask, cpu_possible_mask, non_housekeeping_mask);
+
+ if (cpumask_empty(housekeeping_mask))
+ cpumask_set_cpu(smp_processor_id(), housekeeping_mask);
+
+ housekeeping_flags = HK_FLAG_TICK | HK_FLAG_TIMER |
+ HK_FLAG_RCU | HK_FLAG_MISC;
+
+ tick_nohz_full_setup(non_housekeeping_mask);
+
+ free_bootmem_cpumask_var(non_housekeeping_mask);
+
+ return 1;
+}
+__setup("nohz_full=", housekeeping_nohz_full_setup);
+#endif
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 9d29dee..f09dd43 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -384,20 +384,13 @@ void __tick_nohz_task_switch(void)
local_irq_restore(flags);
}
-/* Parse the boot-time nohz CPU list from the kernel parameters. */
-static int __init tick_nohz_full_setup(char *str)
+/* Get the boot-time nohz CPU list from the kernel parameters. */
+void __init tick_nohz_full_setup(cpumask_var_t cpumask)
{
alloc_bootmem_cpumask_var(&tick_nohz_full_mask);
- if (cpulist_parse(str, tick_nohz_full_mask) < 0) {
- pr_warn("NO_HZ: Incorrect nohz_full cpumask\n");
- free_bootmem_cpumask_var(tick_nohz_full_mask);
- return 1;
- }
+ cpumask_copy(tick_nohz_full_mask, cpumask);
tick_nohz_full_running = true;
-
- return 1;
}
-__setup("nohz_full=", tick_nohz_full_setup);
static int tick_nohz_cpu_down(unsigned int cpu)
{
--
2.7.4
Split the housekeeping config from CONFIG_NO_HZ_FULL. This way we finally
separate the isolation code from nohz.
A dependency on CONFIG_NO_HZ_FULL remains for now, until the housekeeping
code no longer deals with nohz internals directly.
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Luiz Capitulino <[email protected]>
---
include/linux/housekeeping.h | 6 +++---
init/Kconfig | 8 ++++++++
kernel/Makefile | 2 +-
3 files changed, 12 insertions(+), 4 deletions(-)
diff --git a/include/linux/housekeeping.h b/include/linux/housekeeping.h
index 320cc2b..dcbec47 100644
--- a/include/linux/housekeeping.h
+++ b/include/linux/housekeeping.h
@@ -5,7 +5,7 @@
#include <linux/init.h>
#include <linux/tick.h>
-#ifdef CONFIG_NO_HZ_FULL
+#ifdef CONFIG_CPU_ISOLATION
DECLARE_STATIC_KEY_FALSE(housekeeping_overriden);
extern int housekeeping_any_cpu(void);
extern const struct cpumask *housekeeping_cpumask(void);
@@ -27,11 +27,11 @@ static inline const struct cpumask *housekeeping_cpumask(void)
static inline void housekeeping_affine(struct task_struct *t) { }
static inline void housekeeping_init(void) { }
-#endif /* CONFIG_NO_HZ_FULL */
+#endif /* CONFIG_CPU_ISOLATION */
static inline bool housekeeping_cpu(int cpu)
{
-#ifdef CONFIG_NO_HZ_FULL
+#ifdef CONFIG_CPU_ISOLATION
if (static_branch_unlikely(&housekeeping_overriden))
return housekeeping_test_cpu(cpu);
#endif
diff --git a/init/Kconfig b/init/Kconfig
index 78cb246..6f52e6f 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -472,6 +472,14 @@ config TASK_IO_ACCOUNTING
endmenu # "CPU/Task time and stats accounting"
+config CPU_ISOLATION
+ bool "CPU isolation"
+ depends on NO_HZ_FULL
+ help
+ Make sure that CPUs running critical tasks are not disturbed by
+ any source of "noise" such as unbound workqueues, timers, kthreads...
+ Unbound jobs get offloaded to housekeeping CPUs.
+
source "kernel/rcu/Kconfig"
config BUILD_BIN2C
diff --git a/kernel/Makefile b/kernel/Makefile
index c63f893..5b01feb 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -109,7 +109,7 @@ obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
obj-$(CONFIG_JUMP_LABEL) += jump_label.o
obj-$(CONFIG_CONTEXT_TRACKING) += context_tracking.o
obj-$(CONFIG_TORTURE_TEST) += torture.o
-obj-$(CONFIG_NO_HZ_FULL) += housekeeping.o
+obj-$(CONFIG_CPU_ISOLATION) += housekeeping.o
obj-$(CONFIG_HAS_IOMEM) += memremap.o
--
2.7.4
Rename is_housekeeping_cpu() to housekeeping_cpu() to keep a proper
housekeeping namespace.
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Luiz Capitulino <[email protected]>
---
include/linux/housekeeping.h | 2 +-
kernel/sched/core.c | 6 +++---
kernel/sched/fair.c | 2 +-
3 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/include/linux/housekeeping.h b/include/linux/housekeeping.h
index cbe8d63..320cc2b 100644
--- a/include/linux/housekeeping.h
+++ b/include/linux/housekeeping.h
@@ -29,7 +29,7 @@ static inline void housekeeping_affine(struct task_struct *t) { }
static inline void housekeeping_init(void) { }
#endif /* CONFIG_NO_HZ_FULL */
-static inline bool is_housekeeping_cpu(int cpu)
+static inline bool housekeeping_cpu(int cpu)
{
#ifdef CONFIG_NO_HZ_FULL
if (static_branch_unlikely(&housekeeping_overriden))
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b9769d1..ec30a98 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -527,7 +527,7 @@ int get_nohz_timer_target(void)
int i, cpu = smp_processor_id();
struct sched_domain *sd;
- if (!idle_cpu(cpu) && is_housekeeping_cpu(cpu))
+ if (!idle_cpu(cpu) && housekeeping_cpu(cpu))
return cpu;
rcu_read_lock();
@@ -536,14 +536,14 @@ int get_nohz_timer_target(void)
if (cpu == i)
continue;
- if (!idle_cpu(i) && is_housekeeping_cpu(i)) {
+ if (!idle_cpu(i) && housekeeping_cpu(i)) {
cpu = i;
goto unlock;
}
}
}
- if (!is_housekeeping_cpu(cpu))
+ if (!housekeeping_cpu(cpu))
cpu = housekeeping_any_cpu();
unlock:
rcu_read_unlock();
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 3af0630..43c4092 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8741,7 +8741,7 @@ void nohz_balance_enter_idle(int cpu)
return;
/* Spare idle load balancing on CPUs that don't want to be disturbed: */
- if (!is_housekeeping_cpu(cpu))
+ if (!housekeeping_cpu(cpu))
return;
if (test_bit(NOHZ_TICK_STOPPED, nohz_flags(cpu)))
--
2.7.4
Housekeeping code still depends on the nohz_full static key. Since we want
to decouple housekeeping from nohz, let's give housekeeping its own static
key. This is mostly relevant for calls to is_housekeeping_cpu() from the
scheduler.
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Luiz Capitulino <[email protected]>
---
include/linux/housekeeping.h | 3 ++-
kernel/housekeeping.c | 14 +++++++++-----
2 files changed, 11 insertions(+), 6 deletions(-)
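A rough user-space model of the gating introduced here, assuming at most 64 CPUs. A plain bool stands in for the housekeeping_overriden static key: a real static key patches the branch at runtime, whereas this sketch only reproduces the resulting control flow. The names hk_overridden, hk_mask, hk_init() and hk_cpu() are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Plain bool standing in for the housekeeping_overriden static key. */
static bool hk_overridden;
/* One bit per housekeeping CPU; only meaningful once overridden. */
static uint64_t hk_mask;

/*
 * Models housekeeping_init() as of this patch: the override is only
 * switched on when some CPUs are actually isolated.
 */
void hk_init(uint64_t possible, uint64_t isolated)
{
	if (!isolated)
		return;

	hk_mask = possible & ~isolated;
	hk_overridden = true;
}

/* Models is_housekeeping_cpu(): every CPU qualifies until overridden. */
bool hk_cpu(unsigned int cpu)
{
	if (hk_overridden)
		return cpu < 64 && ((hk_mask >> cpu) & 1);
	return true;
}
```

Until hk_init() sees a non-empty isolated mask, hk_cpu() answers true for every CPU without consulting the mask, which is the cheap default path the scheduler callers care about.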
diff --git a/include/linux/housekeeping.h b/include/linux/housekeeping.h
index 31a1401..cbe8d63 100644
--- a/include/linux/housekeeping.h
+++ b/include/linux/housekeeping.h
@@ -6,6 +6,7 @@
#include <linux/tick.h>
#ifdef CONFIG_NO_HZ_FULL
+DECLARE_STATIC_KEY_FALSE(housekeeping_overriden);
extern int housekeeping_any_cpu(void);
extern const struct cpumask *housekeeping_cpumask(void);
extern void housekeeping_affine(struct task_struct *t);
@@ -31,7 +32,7 @@ static inline void housekeeping_init(void) { }
static inline bool is_housekeeping_cpu(int cpu)
{
#ifdef CONFIG_NO_HZ_FULL
- if (tick_nohz_full_enabled())
+ if (static_branch_unlikely(&housekeeping_overriden))
return housekeeping_test_cpu(cpu);
#endif
return true;
diff --git a/kernel/housekeeping.c b/kernel/housekeeping.c
index 0e70dc8..272c344 100644
--- a/kernel/housekeeping.c
+++ b/kernel/housekeeping.c
@@ -10,12 +10,15 @@
#include <linux/tick.h>
#include <linux/init.h>
#include <linux/kernel.h>
+#include <linux/static_key.h>
+DEFINE_STATIC_KEY_FALSE(housekeeping_overriden);
+EXPORT_SYMBOL_GPL(housekeeping_overriden);
static cpumask_var_t housekeeping_mask;
int housekeeping_any_cpu(void)
{
- if (tick_nohz_full_enabled())
+ if (static_branch_unlikely(&housekeeping_overriden))
return cpumask_any_and(housekeeping_mask, cpu_online_mask);
return smp_processor_id();
@@ -23,7 +26,7 @@ int housekeeping_any_cpu(void)
const struct cpumask *housekeeping_cpumask(void)
{
- if (tick_nohz_full_enabled())
+ if (static_branch_unlikely(&housekeeping_overriden))
return housekeeping_mask;
return cpu_possible_mask;
@@ -31,19 +34,18 @@ const struct cpumask *housekeeping_cpumask(void)
void housekeeping_affine(struct task_struct *t)
{
- if (tick_nohz_full_enabled())
+ if (static_branch_unlikely(&housekeeping_overriden))
set_cpus_allowed_ptr(t, housekeeping_mask);
}
bool housekeeping_test_cpu(int cpu)
{
- if (tick_nohz_full_enabled())
+ if (static_branch_unlikely(&housekeeping_overriden))
return cpumask_test_cpu(cpu, housekeeping_mask);
return true;
}
-
void __init housekeeping_init(void)
{
if (!tick_nohz_full_enabled())
@@ -59,6 +61,8 @@ void __init housekeeping_init(void)
cpumask_andnot(housekeeping_mask,
cpu_possible_mask, tick_nohz_full_mask);
+ static_branch_enable(&housekeeping_overriden);
+
/* We need at least one CPU to handle housekeeping work */
WARN_ON_ONCE(cpumask_empty(housekeeping_mask));
}
--
2.7.4
housekeeping_any_cpu() doesn't correctly handle the case where
CONFIG_NO_HZ_FULL=y and no CPU is in nohz_full mode. So far no caller
needs this, but let's handle it now to avoid any future surprise.
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Luiz Capitulino <[email protected]>
---
include/linux/housekeeping.h | 21 ++++++++-------------
1 file changed, 8 insertions(+), 13 deletions(-)
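A small user-space sketch of the off-case this patch adds, assuming at most 64 CPUs and bitmask cpumasks. mask_any_and() is a hypothetical stand-in for cpumask_any_and() that picks the lowest common CPU, and hk_any_cpu() takes the nohz_full state and the current CPU as explicit arguments because the sketch has no kernel context.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 64	/* sketch limit: one bit per CPU */

/*
 * Stand-in for cpumask_any_and(): the lowest CPU set in both masks,
 * or SKETCH_NR_CPUS when the intersection is empty.
 */
unsigned int mask_any_and(uint64_t a, uint64_t b)
{
	uint64_t both = a & b;

	for (unsigned int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		if ((both >> cpu) & 1)
			return cpu;
	return SKETCH_NR_CPUS;
}

/*
 * Models housekeeping_any_cpu() after this patch: the housekeeping mask
 * is only consulted when nohz_full is enabled at runtime.
 */
unsigned int hk_any_cpu(bool nohz_full_enabled, uint64_t hk_mask,
			uint64_t online, unsigned int this_cpu)
{
	if (nohz_full_enabled)
		return mask_any_and(hk_mask, online);
	return this_cpu;
}
```

Without the runtime check, a kernel built with CONFIG_NO_HZ_FULL=y but booted without any nohz_full CPU would search a housekeeping mask that was never set up. In the sketch that corresponds to mask_any_and(0, online), which returns an out-of-range CPU number.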
diff --git a/include/linux/housekeeping.h b/include/linux/housekeeping.h
index 3d6a8e6..64d0ee5 100644
--- a/include/linux/housekeeping.h
+++ b/include/linux/housekeeping.h
@@ -7,24 +7,19 @@
#ifdef CONFIG_NO_HZ_FULL
extern cpumask_var_t housekeeping_mask;
-
-static inline int housekeeping_any_cpu(void)
-{
- return cpumask_any_and(housekeeping_mask, cpu_online_mask);
-}
-
extern void __init housekeeping_init(void);
-
#else
-
-static inline int housekeeping_any_cpu(void)
-{
- return smp_processor_id();
-}
-
static inline void housekeeping_init(void) { }
#endif /* CONFIG_NO_HZ_FULL */
+static inline int housekeeping_any_cpu(void)
+{
+#ifdef CONFIG_NO_HZ_FULL
+ if (tick_nohz_full_enabled())
+ return cpumask_any_and(housekeeping_mask, cpu_online_mask);
+#endif
+ return smp_processor_id();
+}
static inline const struct cpumask *housekeeping_cpumask(void)
{
--
2.7.4
The housekeeping code is currently tied to the nohz code. As we are
planning to make housekeeping independent of it, start by moving
the relevant code to its own file.
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Luiz Capitulino <[email protected]>
---
drivers/net/ethernet/tile/tilegx.c | 2 +-
include/linux/housekeeping.h | 56 ++++++++++++++++++++++++++++++++++++++
include/linux/tick.h | 37 -------------------------
init/main.c | 2 ++
kernel/Makefile | 1 +
kernel/housekeeping.c | 33 ++++++++++++++++++++++
kernel/rcu/tree_plugin.h | 1 +
kernel/rcu/update.c | 1 +
kernel/sched/core.c | 1 +
kernel/sched/fair.c | 1 +
kernel/time/tick-sched.c | 18 ------------
kernel/watchdog.c | 1 +
12 files changed, 98 insertions(+), 56 deletions(-)
create mode 100644 include/linux/housekeeping.h
create mode 100644 kernel/housekeeping.c
diff --git a/drivers/net/ethernet/tile/tilegx.c b/drivers/net/ethernet/tile/tilegx.c
index c00102b..8c7ef12 100644
--- a/drivers/net/ethernet/tile/tilegx.c
+++ b/drivers/net/ethernet/tile/tilegx.c
@@ -40,7 +40,7 @@
#include <linux/tcp.h>
#include <linux/net_tstamp.h>
#include <linux/ptp_clock_kernel.h>
-#include <linux/tick.h>
+#include <linux/housekeeping.h>
#include <asm/checksum.h>
#include <asm/homecache.h>
diff --git a/include/linux/housekeeping.h b/include/linux/housekeeping.h
new file mode 100644
index 0000000..3d6a8e6
--- /dev/null
+++ b/include/linux/housekeeping.h
@@ -0,0 +1,56 @@
+#ifndef _LINUX_HOUSEKEEPING_H
+#define _LINUX_HOUSEKEEPING_H
+
+#include <linux/cpumask.h>
+#include <linux/init.h>
+#include <linux/tick.h>
+
+#ifdef CONFIG_NO_HZ_FULL
+extern cpumask_var_t housekeeping_mask;
+
+static inline int housekeeping_any_cpu(void)
+{
+ return cpumask_any_and(housekeeping_mask, cpu_online_mask);
+}
+
+extern void __init housekeeping_init(void);
+
+#else
+
+static inline int housekeeping_any_cpu(void)
+{
+ return smp_processor_id();
+}
+
+static inline void housekeeping_init(void) { }
+#endif /* CONFIG_NO_HZ_FULL */
+
+
+static inline const struct cpumask *housekeeping_cpumask(void)
+{
+#ifdef CONFIG_NO_HZ_FULL
+ if (tick_nohz_full_enabled())
+ return housekeeping_mask;
+#endif
+ return cpu_possible_mask;
+}
+
+static inline bool is_housekeeping_cpu(int cpu)
+{
+#ifdef CONFIG_NO_HZ_FULL
+ if (tick_nohz_full_enabled())
+ return cpumask_test_cpu(cpu, housekeeping_mask);
+#endif
+ return true;
+}
+
+static inline void housekeeping_affine(struct task_struct *t)
+{
+#ifdef CONFIG_NO_HZ_FULL
+ if (tick_nohz_full_enabled())
+ set_cpus_allowed_ptr(t, housekeeping_mask);
+
+#endif
+}
+
+#endif /* _LINUX_HOUSEKEEPING_H */
diff --git a/include/linux/tick.h b/include/linux/tick.h
index fe01e68..68afc09 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -137,7 +137,6 @@ static inline u64 get_cpu_iowait_time_us(int cpu, u64 *unused) { return -1; }
#ifdef CONFIG_NO_HZ_FULL
extern bool tick_nohz_full_running;
extern cpumask_var_t tick_nohz_full_mask;
-extern cpumask_var_t housekeeping_mask;
static inline bool tick_nohz_full_enabled(void)
{
@@ -161,11 +160,6 @@ static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask)
cpumask_or(mask, mask, tick_nohz_full_mask);
}
-static inline int housekeeping_any_cpu(void)
-{
- return cpumask_any_and(housekeeping_mask, cpu_online_mask);
-}
-
extern void tick_nohz_dep_set(enum tick_dep_bits bit);
extern void tick_nohz_dep_clear(enum tick_dep_bits bit);
extern void tick_nohz_dep_set_cpu(int cpu, enum tick_dep_bits bit);
@@ -235,10 +229,6 @@ static inline void tick_dep_clear_signal(struct signal_struct *signal,
extern void tick_nohz_full_kick_cpu(int cpu);
extern void __tick_nohz_task_switch(void);
#else
-static inline int housekeeping_any_cpu(void)
-{
- return smp_processor_id();
-}
static inline bool tick_nohz_full_enabled(void) { return false; }
static inline bool tick_nohz_full_cpu(int cpu) { return false; }
static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
@@ -260,33 +250,6 @@ static inline void tick_nohz_full_kick_cpu(int cpu) { }
static inline void __tick_nohz_task_switch(void) { }
#endif
-static inline const struct cpumask *housekeeping_cpumask(void)
-{
-#ifdef CONFIG_NO_HZ_FULL
- if (tick_nohz_full_enabled())
- return housekeeping_mask;
-#endif
- return cpu_possible_mask;
-}
-
-static inline bool is_housekeeping_cpu(int cpu)
-{
-#ifdef CONFIG_NO_HZ_FULL
- if (tick_nohz_full_enabled())
- return cpumask_test_cpu(cpu, housekeeping_mask);
-#endif
- return true;
-}
-
-static inline void housekeeping_affine(struct task_struct *t)
-{
-#ifdef CONFIG_NO_HZ_FULL
- if (tick_nohz_full_enabled())
- set_cpus_allowed_ptr(t, housekeeping_mask);
-
-#endif
-}
-
static inline void tick_nohz_task_switch(void)
{
if (tick_nohz_full_enabled())
diff --git a/init/main.c b/init/main.c
index 0ee9c686..bf138b9 100644
--- a/init/main.c
+++ b/init/main.c
@@ -46,6 +46,7 @@
#include <linux/cgroup.h>
#include <linux/efi.h>
#include <linux/tick.h>
+#include <linux/housekeeping.h>
#include <linux/interrupt.h>
#include <linux/taskstats_kern.h>
#include <linux/delayacct.h>
@@ -606,6 +607,7 @@ asmlinkage __visible void __init start_kernel(void)
early_irq_init();
init_IRQ();
tick_init();
+ housekeeping_init();
rcu_init_nohz();
init_timers();
hrtimers_init();
diff --git a/kernel/Makefile b/kernel/Makefile
index ed470aa..c63f893 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -109,6 +109,7 @@ obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
obj-$(CONFIG_JUMP_LABEL) += jump_label.o
obj-$(CONFIG_CONTEXT_TRACKING) += context_tracking.o
obj-$(CONFIG_TORTURE_TEST) += torture.o
+obj-$(CONFIG_NO_HZ_FULL) += housekeeping.o
obj-$(CONFIG_HAS_IOMEM) += memremap.o
diff --git a/kernel/housekeeping.c b/kernel/housekeeping.c
new file mode 100644
index 0000000..b41c52e
--- /dev/null
+++ b/kernel/housekeeping.c
@@ -0,0 +1,33 @@
+/*
+ * Housekeeping management. Manage the targets for routine code that can run on
+ * any CPU: unbound workqueues, timers, kthreads and any offloadable work.
+ *
+ * Copyright (C) 2017 Red Hat, Inc., Frederic Weisbecker
+ *
+ */
+
+#include <linux/housekeeping.h>
+#include <linux/tick.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+
+cpumask_var_t housekeeping_mask;
+
+void __init housekeeping_init(void)
+{
+ if (!tick_nohz_full_enabled())
+ return;
+
+ if (!alloc_cpumask_var(&housekeeping_mask, GFP_KERNEL)) {
+ WARN(1, "NO_HZ: Can't allocate not-full dynticks cpumask\n");
+ cpumask_clear(tick_nohz_full_mask);
+ tick_nohz_full_running = false;
+ return;
+ }
+
+ cpumask_andnot(housekeeping_mask,
+ cpu_possible_mask, tick_nohz_full_mask);
+
+ /* We need at least one CPU to handle housekeeping work */
+ WARN_ON_ONCE(cpumask_empty(housekeeping_mask));
+}
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index e012b9b..387e0a2 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -29,6 +29,7 @@
#include <linux/oom.h>
#include <linux/sched/debug.h>
#include <linux/smpboot.h>
+#include <linux/housekeeping.h>
#include <uapi/linux/sched/types.h>
#include "../time/tick-internal.h"
diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index 5033b66..1c003e2 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -51,6 +51,7 @@
#include <linux/kthread.h>
#include <linux/tick.h>
#include <linux/rcupdate_wait.h>
+#include <linux/housekeeping.h>
#define CREATE_TRACE_POINTS
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 18a6966..b9769d1 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -26,6 +26,7 @@
#include <linux/profile.h>
#include <linux/security.h>
#include <linux/syscalls.h>
+#include <linux/housekeeping.h>
#include <asm/switch_to.h>
#include <asm/tlb.h>
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 70ba32e..3af0630 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -32,6 +32,7 @@
#include <linux/mempolicy.h>
#include <linux/migrate.h>
#include <linux/task_work.h>
+#include <linux/housekeeping.h>
#include <trace/events/sched.h>
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index c7a899c..9d29dee 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -165,7 +165,6 @@ static void tick_sched_handle(struct tick_sched *ts, struct pt_regs *regs)
#ifdef CONFIG_NO_HZ_FULL
cpumask_var_t tick_nohz_full_mask;
-cpumask_var_t housekeeping_mask;
bool tick_nohz_full_running;
static atomic_t tick_dep_mask;
@@ -437,13 +436,6 @@ void __init tick_nohz_init(void)
return;
}
- if (!alloc_cpumask_var(&housekeeping_mask, GFP_KERNEL)) {
- WARN(1, "NO_HZ: Can't allocate not-full dynticks cpumask\n");
- cpumask_clear(tick_nohz_full_mask);
- tick_nohz_full_running = false;
- return;
- }
-
/*
* Full dynticks uses irq work to drive the tick rescheduling on safe
* locking contexts. But then we need irq work to raise its own
@@ -452,7 +444,6 @@ void __init tick_nohz_init(void)
if (!arch_irq_work_has_interrupt()) {
pr_warn("NO_HZ: Can't run full dynticks because arch doesn't support irq work self-IPIs\n");
cpumask_clear(tick_nohz_full_mask);
- cpumask_copy(housekeeping_mask, cpu_possible_mask);
tick_nohz_full_running = false;
return;
}
@@ -465,9 +456,6 @@ void __init tick_nohz_init(void)
cpumask_clear_cpu(cpu, tick_nohz_full_mask);
}
- cpumask_andnot(housekeeping_mask,
- cpu_possible_mask, tick_nohz_full_mask);
-
for_each_cpu(cpu, tick_nohz_full_mask)
context_tracking_cpu_set(cpu);
@@ -477,12 +465,6 @@ void __init tick_nohz_init(void)
WARN_ON(ret < 0);
pr_info("NO_HZ: Full dynticks CPUs: %*pbl.\n",
cpumask_pr_args(tick_nohz_full_mask));
-
- /*
- * We need at least one CPU to handle housekeeping work such
- * as timekeeping, unbound timers, workqueues, ...
- */
- WARN_ON_ONCE(cpumask_empty(housekeeping_mask));
}
#endif
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index f5d5202..3cc5596 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -24,6 +24,7 @@
#include <linux/workqueue.h>
#include <linux/sched/clock.h>
#include <linux/sched/debug.h>
+#include <linux/housekeeping.h>
#include <asm/irq_regs.h>
#include <linux/kvm_para.h>
--
2.7.4
* Frederic Weisbecker <[email protected]> wrote:
> [...]
Yeah, so while I agree that all this functionality needs to be factored
out and organized, I have a problem with this specific high level organization.
The main problem, I think, is that it's all called "housekeeping", which is pretty
fuzzy and opaque. What exactly does 'housekeeping' mean? A dozen details if you
look at the code - and this name does not make it much easier to think about this
whole problem category.
So how about introducing _two_ new high level concepts:
1) 'global time handling'
2) 'double async CPU callbacks'
The notion of 'global time' handling is obvious to everyone I think: it involves
the system-global guarantee that certain kernel jobs will be executed
periodically. At least one CPU in the system needs to handle 'global time'.
The notion of 'double async CPU callbacks' is less obvious: it involves the action
of invoking a callback on a CPU, that might be executed on _another_ CPU.
I.e. there are 3 CPUs involved:
- the invoking CPU
- the target CPU
- the CPU(s!) that will handle the callback (the housekeeping CPU mask)
For example the kmem-cache on_each_cpu() calls in mm/slab.c would fall into this
category.
I don't know to what extent it makes sense to formalize and unify these
facilities: it's certain that the (former) housekeeping CPU mask should be shared
by these two facilities: the CPU executing global time callbacks periodically
should be one of the CPUs that execute double-async CPU callbacks.
But by separating all this functionality into these two categories, it's already
much easier for me to argue about which bit does what and why.
What do you think?
Thanks,
Ingo
On 28/09/2017 11:54, Ingo Molnar wrote:
>
> * Frederic Weisbecker <[email protected]> wrote:
>
>> [...]
>
> Yeah, so while I agree that all this functionality needs to be factored
> out and organized, I have a problem with this specific high level organization.
>
> The main problem I think is that it's all called "housekeeping", which is pretty
> fuzzy and opaque. What does 'housekeeping' exactly mean? A dozen of details if you
> look at the code - and this name does not make it much easier to think about this
> whole problem category.
Indeed I feel that housekeeping is probably not the best concept to
express all these things. I'm all for something clearer.
>
> So how about introducing _two_ new high level concepts:
>
> 1) 'global time handling'
> 2) 'double async CPU callbacks'
>
> The notion of 'global time' handling is obvious to everyone I think: it involves
> the system-global guarantee that certain kernel jobs will be executed
> periodically. At least one CPU in the system needs to handle 'global time'.
>
> The notion of 'double async CPU callbacks' is less obvious: it involves the action
> of invoking a callback on a CPU, that might be executed on _another_ CPU.
>
> I.e. there are 3 CPUs involved:
>
> - the invoking CPU
> - the target CPU
> - the CPU(s!) that will handle the callback (the housekeeping CPU mask)
>
> For example the kmem-cache on_each_cpu() calls in mm/slab.c would fall into this
> category.
Hmm, I'm not clear on this one. Do you mean work items that can be executed
concurrently?
>
> I don't know to what extent it makes sense to formalize and unify these
> facilities: it's certain that the (former) housekeeping CPU mask should be shared
> by these two facilities: the CPU executing global time callbacks periodically
> should be one of the CPUs that execute double-async CPU callbacks.
>
> But by separating all this functionality into these two categories, it's already
> much easier to me to argue about which bit does what and why.
Note that some housekeeping concepts may not fall into any of these
categories. For example domain isolation.
Thanks.
>
> What do you think?
>
> Thanks,
>
> Ingo
>