2017-09-01 16:41:59

by Frederic Weisbecker

Subject: [PATCH 00/12] Introduce housekeeping subsystem v3

This time I didn't change the implementation of isolcpus, which relies
on NULL domains; I only moved it into the housekeeping subsystem.

Summary of changes from v2:

* "isolcpus=" takes flags, which allows us to control nohz through it.
For example:
isolcpus=nohz,1-7 -- enable nohz_full on CPUs 1 to 7
isolcpus=nohz,domain,1-7 -- enable nohz_full on and isolate CPUs 1 to 7

If no flags are passed, the default flag is "domain", so the kernel
parameter is backward compatible.

More flags should be added in the future to isolate a CPU from more
kinds of disturbance. We just need to debate how fine-grained we want
that to be.

We also want to make sure that what is passed to isolcpus is later
modifiable through cpusets.

* Remove workqueue and kthread isolation; we'll think about those later
once we have an interface for them in the isolcpus flags.
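
The flag syntax described above can be modelled in user space. The sketch
below is only an illustration that mirrors the parsing loop of patch 11;
parse_isolcpus_flags() is a hypothetical helper name, and the enum is a
stand-in that reuses the HK_FLAG_TICK and HK_FLAG_DOMAIN bit values from
this series.

```c
#include <ctype.h>
#include <string.h>

/* Stand-ins for the kernel's HK_FLAG_* bits (values taken from the series). */
enum hk_flags {
	HK_FLAG_TICK   = 1 << 4,
	HK_FLAG_DOMAIN = 1 << 5,
};

/*
 * Hypothetical helper: parse the optional flag prefix of an "isolcpus="
 * value. On success, return the flag set and point *cpulist at the CPU
 * list that follows the flags. Return 0 on an unknown flag, in which
 * case *cpulist is left untouched.
 */
static unsigned int parse_isolcpus_flags(const char *str, const char **cpulist)
{
	unsigned int flags = 0;

	/* Flags are alphabetic words, each terminated by a comma. */
	while (isalpha((unsigned char)*str)) {
		if (!strncmp(str, "nohz,", 5)) {
			str += 5;
			flags |= HK_FLAG_TICK;
			continue;
		}
		if (!strncmp(str, "domain,", 7)) {
			str += 7;
			flags |= HK_FLAG_DOMAIN;
			continue;
		}
		return 0;	/* unknown flag */
	}

	/* No flags given: keep the historical isolcpus= behaviour. */
	if (!flags)
		flags = HK_FLAG_DOMAIN;

	*cpulist = str;
	return flags;
}
```

With this model, "1-7" yields HK_FLAG_DOMAIN alone, while "nohz,1-7"
yields HK_FLAG_TICK alone: the "domain" default applies only when no flag
is passed at all.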

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
core/isolation-v3

HEAD: f1292ff748c3f5cdff0c29a3a43b2231ac5005cf

Thanks,
Frederic
---

Frederic Weisbecker (12):
housekeeping: Move housekeeping related code to its own file
watchdog: Use housekeeping_cpumask() instead of ad-hoc version
housekeeping: Provide a dynamic off-case to housekeeping_any_cpu()
housekeeping: Make housekeeping cpumask private
housekeeping: Use its own static key
housekeeping: Rename is_housekeeping_cpu to housekeeping_cpu
housekeeping: Move it under own config, independent from NO_HZ
housekeeping: Introduce housekeeping flags
housekeeping: Handle nohz_full= parameter
housekeeping: Move isolcpus to housekeeping
housekeeping: Add basic isolcpus flags
housekeeping: Document isolcpus flags


Documentation/admin-guide/kernel-parameters.txt | 33 +++---
drivers/base/cpu.c | 11 +-
drivers/net/ethernet/tile/tilegx.c | 6 +-
include/linux/housekeeping.h | 51 ++++++++
include/linux/sched.h | 2 -
include/linux/tick.h | 38 +-----
init/Kconfig | 7 ++
init/main.c | 2 +
kernel/Makefile | 1 +
kernel/cgroup/cpuset.c | 14 +--
kernel/housekeeping.c | 148 ++++++++++++++++++++++++
kernel/rcu/tree_plugin.h | 3 +-
kernel/rcu/update.c | 3 +-
kernel/sched/core.c | 25 +---
kernel/sched/fair.c | 3 +-
kernel/sched/topology.c | 21 +---
kernel/time/tick-sched.c | 31 +----
kernel/watchdog.c | 13 +--
18 files changed, 272 insertions(+), 140 deletions(-)


2017-09-01 16:42:06

by Frederic Weisbecker

Subject: [PATCH 03/12] housekeeping: Provide a dynamic off-case to housekeeping_any_cpu()

housekeeping_any_cpu() doesn't correctly handle the case where
CONFIG_NO_HZ_FULL=y and no CPU is in nohz_full mode. So far no caller
needs this, but let's handle it now to avoid any future surprise.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Luiz Capitulino <[email protected]>
---
include/linux/housekeeping.h | 21 ++++++++-------------
1 file changed, 8 insertions(+), 13 deletions(-)

diff --git a/include/linux/housekeeping.h b/include/linux/housekeeping.h
index 3d6a8e6..64d0ee5 100644
--- a/include/linux/housekeeping.h
+++ b/include/linux/housekeeping.h
@@ -7,24 +7,19 @@

#ifdef CONFIG_NO_HZ_FULL
extern cpumask_var_t housekeeping_mask;
-
-static inline int housekeeping_any_cpu(void)
-{
- return cpumask_any_and(housekeeping_mask, cpu_online_mask);
-}
-
extern void __init housekeeping_init(void);
-
#else
-
-static inline int housekeeping_any_cpu(void)
-{
- return smp_processor_id();
-}
-
static inline void housekeeping_init(void) { }
#endif /* CONFIG_NO_HZ_FULL */

+static inline int housekeeping_any_cpu(void)
+{
+#ifdef CONFIG_NO_HZ_FULL
+ if (tick_nohz_full_enabled())
+ return cpumask_any_and(housekeeping_mask, cpu_online_mask);
+#endif
+ return smp_processor_id();
+}

static inline const struct cpumask *housekeeping_cpumask(void)
{
--
2.7.4

2017-09-01 16:42:10

by Frederic Weisbecker

Subject: [PATCH 05/12] housekeeping: Use its own static key

Housekeeping code still depends on the nohz_full static key. Since we want
to decouple housekeeping from nohz, let's give housekeeping its own static
key. It's mostly relevant for calls to is_housekeeping_cpu() from the
scheduler.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Luiz Capitulino <[email protected]>
---
include/linux/housekeeping.h | 3 ++-
kernel/housekeeping.c | 14 +++++++++-----
2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/include/linux/housekeeping.h b/include/linux/housekeeping.h
index 31a1401..cbe8d63 100644
--- a/include/linux/housekeeping.h
+++ b/include/linux/housekeeping.h
@@ -6,6 +6,7 @@
#include <linux/tick.h>

#ifdef CONFIG_NO_HZ_FULL
+DECLARE_STATIC_KEY_FALSE(housekeeping_overriden);
extern int housekeeping_any_cpu(void);
extern const struct cpumask *housekeeping_cpumask(void);
extern void housekeeping_affine(struct task_struct *t);
@@ -31,7 +32,7 @@ static inline void housekeeping_init(void) { }
static inline bool is_housekeeping_cpu(int cpu)
{
#ifdef CONFIG_NO_HZ_FULL
- if (tick_nohz_full_enabled())
+ if (static_branch_unlikely(&housekeeping_overriden))
return housekeeping_test_cpu(cpu);
#endif
return true;
diff --git a/kernel/housekeeping.c b/kernel/housekeeping.c
index 0183e75..f8be7e6 100644
--- a/kernel/housekeeping.c
+++ b/kernel/housekeeping.c
@@ -9,12 +9,15 @@
#include <linux/housekeeping.h>
#include <linux/tick.h>
#include <linux/init.h>
+#include <linux/static_key.h>

+DEFINE_STATIC_KEY_FALSE(housekeeping_overriden);
+EXPORT_SYMBOL_GPL(housekeeping_overriden);
static cpumask_var_t housekeeping_mask;

int housekeeping_any_cpu(void)
{
- if (tick_nohz_full_enabled())
+ if (static_branch_unlikely(&housekeeping_overriden))
return cpumask_any_and(housekeeping_mask, cpu_online_mask);

return smp_processor_id();
@@ -22,7 +25,7 @@ int housekeeping_any_cpu(void)

const struct cpumask *housekeeping_cpumask(void)
{
- if (tick_nohz_full_enabled())
+ if (static_branch_unlikely(&housekeeping_overriden))
return housekeeping_mask;

return cpu_possible_mask;
@@ -30,19 +33,18 @@ const struct cpumask *housekeeping_cpumask(void)

void housekeeping_affine(struct task_struct *t)
{
- if (tick_nohz_full_enabled())
+ if (static_branch_unlikely(&housekeeping_overriden))
set_cpus_allowed_ptr(t, housekeeping_mask);
}

bool housekeeping_test_cpu(int cpu)
{
- if (tick_nohz_full_enabled())
+ if (static_branch_unlikely(&housekeeping_overriden))
return cpumask_test_cpu(cpu, housekeeping_mask);

return true;
}

-
void __init housekeeping_init(void)
{
if (!tick_nohz_full_enabled())
@@ -58,6 +60,8 @@ void __init housekeeping_init(void)
cpumask_andnot(housekeeping_mask,
cpu_possible_mask, tick_nohz_full_mask);

+ static_branch_enable(&housekeeping_overriden);
+
/* We need at least one CPU to handle housekeeping work */
WARN_ON_ONCE(cpumask_empty(housekeeping_mask));
}
--
2.7.4

2017-09-01 16:42:18

by Frederic Weisbecker

Subject: [PATCH 11/12] housekeeping: Add basic isolcpus flags

Add flags to control nohz and domain isolation from "isolcpus=", in
order to centralize the isolation features under a common interface. Domain
isolation remains the default so as not to break the existing isolcpus
boot parameter behaviour.

Further flags in the future may include 0hz (1hz tick offload) and timers,
workqueue, RCU, kthread, watchdog, likely all merged together in a
common flag ("async"?). In any case, this will have to be modifiable by
cpusets.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Luiz Capitulino <[email protected]>
---
kernel/housekeeping.c | 26 +++++++++++++++++++++++++-
1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/kernel/housekeeping.c b/kernel/housekeeping.c
index 5f0e7ec..6476e7c 100644
--- a/kernel/housekeeping.c
+++ b/kernel/housekeeping.c
@@ -10,6 +10,7 @@
#include <linux/tick.h>
#include <linux/init.h>
#include <linux/static_key.h>
+#include <linux/ctype.h>

DEFINE_STATIC_KEY_FALSE(housekeeping_overriden);
EXPORT_SYMBOL_GPL(housekeeping_overriden);
@@ -119,6 +120,29 @@ __setup("nohz_full=", housekeeping_nohz_full_setup);

static int __init housekeeping_isolcpus_setup(char *str)
{
- return housekeeping_setup(str, HK_FLAG_DOMAIN);
+ unsigned int flags = 0;
+
+ while (isalpha(*str)) {
+ if (!strncmp(str, "nohz,", 5)) {
+ str += 5;
+ flags |= HK_FLAG_TICK;
+ continue;
+ }
+
+ if (!strncmp(str, "domain,", 7)) {
+ str += 7;
+ flags |= HK_FLAG_DOMAIN;
+ continue;
+ }
+
+ pr_warn("isolcpus: Error, unknown flag\n");
+ return 0;
+ }
+
+ /* Default behaviour for isolcpus without flags */
+ if (!flags)
+ flags |= HK_FLAG_DOMAIN;
+
+ return housekeeping_setup(str, flags);
}
__setup("isolcpus=", housekeeping_isolcpus_setup);
--
2.7.4

2017-09-01 16:42:16

by Frederic Weisbecker

Subject: [PATCH 08/12] housekeeping: Introduce housekeeping flags

Before we implement isolcpus under housekeeping, we need the isolation
features to be more fine-grained. For example, some people want nohz_full
without the full scheduler isolation, others want full scheduler
isolation without nohz_full.

So let's split these isolation features into separate pieces, at the
risk of splitting them too finely for now. We can still merge some flags
later if they always make sense together.
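
As a rough user-space model of the flag-gated lookups this patch adds (a
sketch only: struct hk_state and hk_test_cpu() are hypothetical names, and
a plain bitmask stands in for struct cpumask), a CPU only counts as
isolated from a feature when that feature's flag has been set:

```c
#include <stdbool.h>

/* Same bit values as the enum introduced by this patch. */
enum hk_flags {
	HK_FLAG_TIMER = 1,
	HK_FLAG_RCU   = 1 << 1,
	HK_FLAG_MISC  = 1 << 2,
	HK_FLAG_SCHED = 1 << 3,
};

/* Hypothetical model: one bit per CPU instead of a struct cpumask. */
struct hk_state {
	unsigned long housekeeping_mask;  /* CPUs that do housekeeping work */
	unsigned int  housekeeping_flags; /* features that are being isolated */
};

/*
 * Model of housekeeping_test_cpu(): the mask is consulted only if the
 * requested feature is isolated; otherwise every CPU is a housekeeper.
 */
static bool hk_test_cpu(const struct hk_state *hk, int cpu, enum hk_flags flags)
{
	if (hk->housekeeping_flags & flags)
		return (hk->housekeeping_mask >> cpu) & 1UL;
	return true;
}
```

For instance, with housekeeping_flags set to TIMER | RCU | MISC and only
CPU 0 in the mask, CPU 3 is rejected for HK_FLAG_TIMER but still accepted
for HK_FLAG_SCHED, because scheduler isolation was not requested.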

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Luiz Capitulino <[email protected]>
---
drivers/net/ethernet/tile/tilegx.c | 4 ++--
include/linux/housekeeping.h | 26 +++++++++++++++++---------
kernel/housekeeping.c | 26 +++++++++++++++-----------
kernel/rcu/tree_plugin.h | 2 +-
kernel/rcu/update.c | 2 +-
kernel/sched/core.c | 8 ++++----
kernel/sched/fair.c | 2 +-
kernel/watchdog.c | 3 ++-
8 files changed, 43 insertions(+), 30 deletions(-)

diff --git a/drivers/net/ethernet/tile/tilegx.c b/drivers/net/ethernet/tile/tilegx.c
index eb74e09..0bd765b 100644
--- a/drivers/net/ethernet/tile/tilegx.c
+++ b/drivers/net/ethernet/tile/tilegx.c
@@ -2270,8 +2270,8 @@ static int __init tile_net_init_module(void)
tile_net_dev_init(name, mac);

if (!network_cpus_init())
- cpumask_and(&network_cpus_map, housekeeping_cpumask(),
- cpu_online_mask);
+ cpumask_and(&network_cpus_map,
+ housekeeping_cpumask(HK_FLAG_MISC), cpu_online_mask);

return 0;
}
diff --git a/include/linux/housekeeping.h b/include/linux/housekeeping.h
index dcbec47..b1a62544 100644
--- a/include/linux/housekeeping.h
+++ b/include/linux/housekeeping.h
@@ -5,35 +5,43 @@
#include <linux/init.h>
#include <linux/tick.h>

+enum hk_flags {
+ HK_FLAG_TIMER = 1,
+ HK_FLAG_RCU = (1 << 1),
+ HK_FLAG_MISC = (1 << 2),
+ HK_FLAG_SCHED = (1 << 3),
+};
+
#ifdef CONFIG_CPU_ISOLATION
DECLARE_STATIC_KEY_FALSE(housekeeping_overriden);
-extern int housekeeping_any_cpu(void);
-extern const struct cpumask *housekeeping_cpumask(void);
-extern void housekeeping_affine(struct task_struct *t);
-extern bool housekeeping_test_cpu(int cpu);
+extern int housekeeping_any_cpu(enum hk_flags flags);
+extern const struct cpumask *housekeeping_cpumask(enum hk_flags flags);
+extern void housekeeping_affine(struct task_struct *t, enum hk_flags flags);
+extern bool housekeeping_test_cpu(int cpu, enum hk_flags flags);
extern void __init housekeeping_init(void);

#else

-static inline int housekeeping_any_cpu(void)
+static inline int housekeeping_any_cpu(enum hk_flags flags)
{
return smp_processor_id();
}

-static inline const struct cpumask *housekeeping_cpumask(void)
+static inline const struct cpumask *housekeeping_cpumask(enum hk_flags flags)
{
return cpu_possible_mask;
}

-static inline void housekeeping_affine(struct task_struct *t) { }
+static inline void housekeeping_affine(struct task_struct *t,
+ enum hk_flags flags) { }
static inline void housekeeping_init(void) { }
#endif /* CONFIG_CPU_ISOLATION */

-static inline bool housekeeping_cpu(int cpu)
+static inline bool housekeeping_cpu(int cpu, enum hk_flags flags)
{
#ifdef CONFIG_CPU_ISOLATION
if (static_branch_unlikely(&housekeeping_overriden))
- return housekeeping_test_cpu(cpu);
+ return housekeeping_test_cpu(cpu, flags);
#endif
return true;
}
diff --git a/kernel/housekeeping.c b/kernel/housekeeping.c
index f8be7e6..e2196d1 100644
--- a/kernel/housekeeping.c
+++ b/kernel/housekeeping.c
@@ -14,34 +14,36 @@
DEFINE_STATIC_KEY_FALSE(housekeeping_overriden);
EXPORT_SYMBOL_GPL(housekeeping_overriden);
static cpumask_var_t housekeeping_mask;
+static unsigned int housekeeping_flags;

-int housekeeping_any_cpu(void)
+int housekeeping_any_cpu(enum hk_flags flags)
{
if (static_branch_unlikely(&housekeeping_overriden))
- return cpumask_any_and(housekeeping_mask, cpu_online_mask);
-
+ if (housekeeping_flags & flags)
+ return cpumask_any_and(housekeeping_mask, cpu_online_mask);
return smp_processor_id();
}

-const struct cpumask *housekeeping_cpumask(void)
+const struct cpumask *housekeeping_cpumask(enum hk_flags flags)
{
if (static_branch_unlikely(&housekeeping_overriden))
- return housekeeping_mask;
-
+ if (housekeeping_flags & flags)
+ return housekeeping_mask;
return cpu_possible_mask;
}

-void housekeeping_affine(struct task_struct *t)
+void housekeeping_affine(struct task_struct *t, enum hk_flags flags)
{
if (static_branch_unlikely(&housekeeping_overriden))
- set_cpus_allowed_ptr(t, housekeeping_mask);
+ if (housekeeping_flags & flags)
+ set_cpus_allowed_ptr(t, housekeeping_mask);
}

-bool housekeeping_test_cpu(int cpu)
+bool housekeeping_test_cpu(int cpu, enum hk_flags flags)
{
if (static_branch_unlikely(&housekeeping_overriden))
- return cpumask_test_cpu(cpu, housekeeping_mask);
-
+ if (housekeeping_flags & flags)
+ return cpumask_test_cpu(cpu, housekeeping_mask);
return true;
}

@@ -60,6 +62,8 @@ void __init housekeeping_init(void)
cpumask_andnot(housekeeping_mask,
cpu_possible_mask, tick_nohz_full_mask);

+ housekeeping_flags = HK_FLAG_TIMER | HK_FLAG_RCU | HK_FLAG_MISC;
+
static_branch_enable(&housekeeping_overriden);

/* We need at least one CPU to handle housekeeping work */
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index c66d162..47f2865 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -2542,7 +2542,7 @@ static void rcu_bind_gp_kthread(void)

if (!tick_nohz_full_enabled())
return;
- housekeeping_affine(current);
+ housekeeping_affine(current, HK_FLAG_RCU);
}

/* Record the current task on dyntick-idle entry. */
diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index bfe973d..7acfd74 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -719,7 +719,7 @@ static int __noreturn rcu_tasks_kthread(void *arg)
LIST_HEAD(rcu_tasks_holdouts);

/* Run on housekeeping CPUs by default. Sysadm can move if desired. */
- housekeeping_affine(current);
+ housekeeping_affine(current, HK_FLAG_RCU);

/*
* Each pass through the following loop makes one check for
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index fd00ae3..877c85d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -527,7 +527,7 @@ int get_nohz_timer_target(void)
int i, cpu = smp_processor_id();
struct sched_domain *sd;

- if (!idle_cpu(cpu) && housekeeping_cpu(cpu))
+ if (!idle_cpu(cpu) && housekeeping_cpu(cpu, HK_FLAG_TIMER))
return cpu;

rcu_read_lock();
@@ -536,15 +536,15 @@ int get_nohz_timer_target(void)
if (cpu == i)
continue;

- if (!idle_cpu(i) && housekeeping_cpu(i)) {
+ if (!idle_cpu(i) && housekeeping_cpu(i, HK_FLAG_TIMER)) {
cpu = i;
goto unlock;
}
}
}

- if (!housekeeping_cpu(cpu))
- cpu = housekeeping_any_cpu();
+ if (!housekeeping_cpu(cpu, HK_FLAG_TIMER))
+ cpu = housekeeping_any_cpu(HK_FLAG_TIMER);
unlock:
rcu_read_unlock();
return cpu;
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0d8f7b1..74af955 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8739,7 +8739,7 @@ void nohz_balance_enter_idle(int cpu)
return;

/* Spare idle load balancing on CPUs that don't want to be disturbed: */
- if (!housekeeping_cpu(cpu))
+ if (!housekeeping_cpu(cpu, HK_FLAG_SCHED))
return;

if (test_bit(NOHZ_TICK_STOPPED, nohz_flags(cpu)))
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index cdd0d11..631131c 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -944,7 +944,8 @@ void __init lockup_detector_init(void)
if (tick_nohz_full_enabled())
pr_info("Disabling watchdog on nohz_full cores by default\n");

- cpumask_copy(&watchdog_cpumask, housekeeping_cpumask());
+ cpumask_copy(&watchdog_cpumask,
+ housekeeping_cpumask(HK_FLAG_TIMER));

if (watchdog_enabled)
watchdog_enable_all_cpus();
--
2.7.4

2017-09-01 16:42:34

by Frederic Weisbecker

Subject: [PATCH 12/12] housekeeping: Document isolcpus flags

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Luiz Capitulino <[email protected]>
---
Documentation/admin-guide/kernel-parameters.txt | 35 +++++++++++++++----------
1 file changed, 21 insertions(+), 14 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index d9c171c..a34de40 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1727,20 +1727,27 @@
isapnp= [ISAPNP]
Format: <RDP>,<reset>,<pci_scan>,<verbosity>

- isolcpus= [KNL,SMP] Isolate CPUs from the general scheduler.
- The argument is a cpu list, as described above.
-
- This option can be used to specify one or more CPUs
- to isolate from the general SMP balancing and scheduling
- algorithms. You can move a process onto or off an
- "isolated" CPU via the CPU affinity syscalls or cpuset.
- <cpu number> begins at 0 and the maximum value is
- "number of CPUs in system - 1".
-
- This option is the preferred way to isolate CPUs. The
- alternative -- manually setting the CPU mask of all
- tasks in the system -- can cause problems and
- suboptimal load balancer performance.
+ isolcpus= [KNL,SMP] Isolate a given set of CPUs from disturbance.
+ Format: [flag-list,]<cpu-list>
+
+ Specify one or more CPUs to isolate from disturbances
+ specified in the flag list (default: domain):
+
+ nohz
+ Disable the tick when a single task runs.
+ domain
+ Isolate from the general SMP balancing and scheduling
+ algorithms. This option is the preferred way to isolate
+ CPUs from tasks. The alternative -- manually setting the
+ CPU mask of all tasks in the system, can cause problems
+ and suboptimal load balancer performance. You can move a
+ process onto or off an "isolated" CPU via the CPU
+ affinity syscalls or cpuset. <cpu number> begins at 0
+ and the maximum value is "number of CPUs in system - 1".
+
+ The format of <cpu-list> is described above.
+
+

iucv= [HW,NET]

--
2.7.4

2017-09-01 16:43:10

by Frederic Weisbecker

Subject: [PATCH 10/12] housekeeping: Move isolcpus to housekeeping

We want to centralize the isolation features in the housekeeping
subsystem, and scheduler domain isolation is a significant part of it.

No intended behaviour change, we just reuse the housekeeping cpumask
and core code.
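
One consequence of sharing a single cpumask is that nohz_full= and
isolcpus= must name the same CPUs when both are given. The following
user-space sketch models that rule from housekeeping_setup(); hk_setup()
and struct hk_state are hypothetical names, bitmasks stand in for
cpumasks, and the CONFIG_NO_HZ_FULL check is left out.

```c
#include <stdbool.h>

/* Hypothetical model: one bit per CPU instead of a struct cpumask. */
struct hk_state {
	unsigned long possible_mask;      /* all CPUs that may exist */
	unsigned long housekeeping_mask;  /* CPUs left for housekeeping */
	unsigned int  housekeeping_flags; /* features isolated so far */
};

/*
 * Model of one boot parameter being processed. The first parameter
 * defines the housekeeping mask; any later one must produce the same
 * mask or it is rejected. Returns true if the parameter is accepted.
 */
static bool hk_setup(struct hk_state *hk, unsigned long isolated,
		     unsigned int flags, int boot_cpu)
{
	unsigned long mask = hk->possible_mask & ~isolated;

	if (!hk->housekeeping_flags) {
		/* Keep at least one CPU for housekeeping work. */
		if (!mask)
			mask = 1UL << boot_cpu;
		hk->housekeeping_mask = mask;
	} else if (mask != hk->housekeeping_mask) {
		return false;	/* e.g. nohz_full= does not match isolcpus= */
	}

	hk->housekeeping_flags |= flags;
	return true;
}
```

In this model a rejected parameter leaves both the mask and the
accumulated flags untouched.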

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Luiz Capitulino <[email protected]>
---
drivers/base/cpu.c | 11 ++++++++-
include/linux/housekeeping.h | 1 +
include/linux/sched.h | 2 --
kernel/cgroup/cpuset.c | 14 ++++-------
kernel/housekeeping.c | 57 +++++++++++++++++++++++++++++++++++---------
kernel/sched/core.c | 16 +------------
kernel/sched/topology.c | 21 ++++------------
7 files changed, 67 insertions(+), 55 deletions(-)

diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 2c3b359..8e33b9e 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -18,6 +18,7 @@
#include <linux/cpufeature.h>
#include <linux/tick.h>
#include <linux/pm_qos.h>
+#include <linux/housekeeping.h>

#include "base.h"

@@ -271,8 +272,16 @@ static ssize_t print_cpus_isolated(struct device *dev,
struct device_attribute *attr, char *buf)
{
int n = 0, len = PAGE_SIZE-2;
+ cpumask_var_t isolated;

- n = scnprintf(buf, len, "%*pbl\n", cpumask_pr_args(cpu_isolated_map));
+ if (!alloc_cpumask_var(&isolated, GFP_KERNEL))
+ return -ENOMEM;
+
+ cpumask_andnot(isolated, cpu_possible_mask,
+ housekeeping_cpumask(HK_FLAG_DOMAIN));
+ n = scnprintf(buf, len, "%*pbl\n", cpumask_pr_args(isolated));
+
+ free_cpumask_var(isolated);

return n;
}
diff --git a/include/linux/housekeeping.h b/include/linux/housekeeping.h
index 35fb197..a99466e 100644
--- a/include/linux/housekeeping.h
+++ b/include/linux/housekeeping.h
@@ -11,6 +11,7 @@ enum hk_flags {
HK_FLAG_MISC = (1 << 2),
HK_FLAG_SCHED = (1 << 3),
HK_FLAG_TICK = (1 << 4),
+ HK_FLAG_DOMAIN = (1 << 5),
};

#ifdef CONFIG_CPU_ISOLATION
diff --git a/include/linux/sched.h b/include/linux/sched.h
index c28b182..816ff52 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -166,8 +166,6 @@ struct task_group;
/* Task command name length: */
#define TASK_COMM_LEN 16

-extern cpumask_var_t cpu_isolated_map;
-
extern void scheduler_tick(void);

#define MAX_SCHEDULE_TIMEOUT LONG_MAX
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 8d51516..c65e73a 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -56,6 +56,7 @@
#include <linux/time64.h>
#include <linux/backing-dev.h>
#include <linux/sort.h>
+#include <linux/housekeeping.h>

#include <linux/uaccess.h>
#include <linux/atomic.h>
@@ -639,7 +640,6 @@ static int generate_sched_domains(cpumask_var_t **domains,
int csn; /* how many cpuset ptrs in csa so far */
int i, j, k; /* indices for partition finding loops */
cpumask_var_t *doms; /* resulting partition; i.e. sched domains */
- cpumask_var_t non_isolated_cpus; /* load balanced CPUs */
struct sched_domain_attr *dattr; /* attributes for custom domains */
int ndoms = 0; /* number of sched domains in result */
int nslot; /* next empty doms[] struct cpumask slot */
@@ -649,10 +649,6 @@ static int generate_sched_domains(cpumask_var_t **domains,
dattr = NULL;
csa = NULL;

- if (!alloc_cpumask_var(&non_isolated_cpus, GFP_KERNEL))
- goto done;
- cpumask_andnot(non_isolated_cpus, cpu_possible_mask, cpu_isolated_map);
-
/* Special case for the 99% of systems with one, full, sched domain */
if (is_sched_load_balance(&top_cpuset)) {
ndoms = 1;
@@ -666,7 +662,7 @@ static int generate_sched_domains(cpumask_var_t **domains,
update_domain_attr_tree(dattr, &top_cpuset);
}
cpumask_and(doms[0], top_cpuset.effective_cpus,
- non_isolated_cpus);
+ housekeeping_cpumask(HK_FLAG_DOMAIN));

goto done;
}
@@ -690,7 +686,8 @@ static int generate_sched_domains(cpumask_var_t **domains,
*/
if (!cpumask_empty(cp->cpus_allowed) &&
!(is_sched_load_balance(cp) &&
- cpumask_intersects(cp->cpus_allowed, non_isolated_cpus)))
+ cpumask_intersects(cp->cpus_allowed,
+ housekeeping_cpumask(HK_FLAG_DOMAIN))))
continue;

if (is_sched_load_balance(cp))
@@ -772,7 +769,7 @@ static int generate_sched_domains(cpumask_var_t **domains,

if (apn == b->pn) {
cpumask_or(dp, dp, b->effective_cpus);
- cpumask_and(dp, dp, non_isolated_cpus);
+ cpumask_and(dp, dp, housekeeping_cpumask(HK_FLAG_DOMAIN));
if (dattr)
update_domain_attr_tree(dattr + nslot, b);

@@ -785,7 +782,6 @@ static int generate_sched_domains(cpumask_var_t **domains,
BUG_ON(nslot != ndoms);

done:
- free_cpumask_var(non_isolated_cpus);
kfree(csa);

/*
diff --git a/kernel/housekeeping.c b/kernel/housekeeping.c
index b2006bd..5f0e7ec 100644
--- a/kernel/housekeeping.c
+++ b/kernel/housekeeping.c
@@ -58,32 +58,67 @@ void __init housekeeping_init(void)
WARN_ON_ONCE(cpumask_empty(housekeeping_mask));
}

-#ifdef CONFIG_NO_HZ_FULL
-static int __init housekeeping_nohz_full_setup(char *str)
+static int __init housekeeping_setup(char *str, enum hk_flags flags)
{
cpumask_var_t non_housekeeping_mask;

alloc_bootmem_cpumask_var(&non_housekeeping_mask);
if (cpulist_parse(str, non_housekeeping_mask) < 0) {
- pr_warn("Housekeeping: Incorrect nohz_full cpumask\n");
+ pr_warn("Housekeeping: nohz_full= or isolcpus= incorrect CPU range\n");
free_bootmem_cpumask_var(non_housekeeping_mask);
return 0;
}

- alloc_bootmem_cpumask_var(&housekeeping_mask);
- cpumask_andnot(housekeeping_mask, cpu_possible_mask, non_housekeeping_mask);
+ if (!housekeeping_flags) {
+ alloc_bootmem_cpumask_var(&housekeeping_mask);
+ cpumask_andnot(housekeeping_mask,
+ cpu_possible_mask, non_housekeeping_mask);
+ if (cpumask_empty(housekeeping_mask))
+ cpumask_set_cpu(smp_processor_id(), housekeeping_mask);
+ } else {
+ cpumask_var_t tmp;

- if (cpumask_empty(housekeeping_mask))
- cpumask_set_cpu(smp_processor_id(), housekeeping_mask);
+ alloc_bootmem_cpumask_var(&tmp);
+ cpumask_andnot(tmp, cpu_possible_mask, non_housekeeping_mask);
+ if (!cpumask_equal(tmp, housekeeping_mask)) {
+ pr_warn("Housekeeping: nohz_full= must match isolcpus=\n");
+ free_bootmem_cpumask_var(tmp);
+ free_bootmem_cpumask_var(non_housekeeping_mask);
+ return 0;
+ }
+ free_bootmem_cpumask_var(tmp);
+ }

- housekeeping_flags = HK_FLAG_TICK | HK_FLAG_TIMER |
- HK_FLAG_RCU | HK_FLAG_MISC;
+ if ((flags & HK_FLAG_TICK) && !(housekeeping_flags & HK_FLAG_TICK)) {
+ if (IS_ENABLED(CONFIG_NO_HZ_FULL)) {
+ tick_nohz_full_setup(non_housekeeping_mask);
+ } else {
+ pr_warn("Housekeeping: nohz unsupported."
+ " Build with CONFIG_NO_HZ_FULL\n");
+ free_bootmem_cpumask_var(non_housekeeping_mask);
+ return 0;
+ }
+ }

- tick_nohz_full_setup(non_housekeeping_mask);
+ housekeeping_flags |= flags;

free_bootmem_cpumask_var(non_housekeeping_mask);

return 1;
}
+
+static int __init housekeeping_nohz_full_setup(char *str)
+{
+ unsigned int flags;
+
+ flags = HK_FLAG_TICK | HK_FLAG_TIMER | HK_FLAG_RCU | HK_FLAG_MISC;
+
+ return housekeeping_setup(str, flags);
+}
__setup("nohz_full=", housekeeping_nohz_full_setup);
-#endif
+
+static int __init housekeeping_isolcpus_setup(char *str)
+{
+ return housekeeping_setup(str, HK_FLAG_DOMAIN);
+}
+__setup("isolcpus=", housekeeping_isolcpus_setup);
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 877c85d..de20aa1 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -84,9 +84,6 @@ __read_mostly int scheduler_running;
*/
int sysctl_sched_rt_runtime = 950000;

-/* CPUs with isolated domains */
-cpumask_var_t cpu_isolated_map;
-
/*
* __task_rq_lock - lock the rq @p resides on.
*/
@@ -5672,10 +5669,6 @@ static inline void sched_init_smt(void) { }

void __init sched_init_smp(void)
{
- cpumask_var_t non_isolated_cpus;
-
- alloc_cpumask_var(&non_isolated_cpus, GFP_KERNEL);
-
sched_init_numa();

/*
@@ -5685,16 +5678,12 @@ void __init sched_init_smp(void)
*/
mutex_lock(&sched_domains_mutex);
sched_init_domains(cpu_active_mask);
- cpumask_andnot(non_isolated_cpus, cpu_possible_mask, cpu_isolated_map);
- if (cpumask_empty(non_isolated_cpus))
- cpumask_set_cpu(smp_processor_id(), non_isolated_cpus);
mutex_unlock(&sched_domains_mutex);

/* Move init over to a non-isolated CPU */
- if (set_cpus_allowed_ptr(current, non_isolated_cpus) < 0)
+ if (set_cpus_allowed_ptr(current, housekeeping_cpumask(HK_FLAG_DOMAIN)) < 0)
BUG();
sched_init_granularity();
- free_cpumask_var(non_isolated_cpus);

init_sched_rt_class();
init_sched_dl_class();
@@ -5898,9 +5887,6 @@ void __init sched_init(void)
calc_load_update = jiffies + LOAD_FREQ;

#ifdef CONFIG_SMP
- /* May be allocated at isolcpus cmdline parse time */
- if (cpu_isolated_map == NULL)
- zalloc_cpumask_var(&cpu_isolated_map, GFP_NOWAIT);
idle_thread_set_boot_cpu();
set_cpu_rq_start_time(smp_processor_id());
#endif
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index bd8b6d6..1e03138 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -3,6 +3,7 @@
*/
#include <linux/sched.h>
#include <linux/mutex.h>
+#include <linux/housekeeping.h>

#include "sched.h"

@@ -466,21 +467,6 @@ cpu_attach_domain(struct sched_domain *sd, struct root_domain *rd, int cpu)
update_top_cache_domain(cpu);
}

-/* Setup the mask of CPUs configured for isolated domains */
-static int __init isolated_cpu_setup(char *str)
-{
- int ret;
-
- alloc_bootmem_cpumask_var(&cpu_isolated_map);
- ret = cpulist_parse(str, cpu_isolated_map);
- if (ret) {
- pr_err("sched: Error, all isolcpus= values must be between 0 and %d\n", nr_cpu_ids);
- return 0;
- }
- return 1;
-}
-__setup("isolcpus=", isolated_cpu_setup);
-
struct s_data {
struct sched_domain ** __percpu sd;
struct root_domain *rd;
@@ -1775,7 +1761,7 @@ int sched_init_domains(const struct cpumask *cpu_map)
doms_cur = alloc_sched_domains(ndoms_cur);
if (!doms_cur)
doms_cur = &fallback_doms;
- cpumask_andnot(doms_cur[0], cpu_map, cpu_isolated_map);
+ cpumask_and(doms_cur[0], cpu_map, housekeeping_cpumask(HK_FLAG_DOMAIN));
err = build_sched_domains(doms_cur[0], NULL);
register_sched_domain_sysctl();

@@ -1871,7 +1857,8 @@ void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
if (doms_new == NULL) {
n = 0;
doms_new = &fallback_doms;
- cpumask_andnot(doms_new[0], cpu_active_mask, cpu_isolated_map);
+ cpumask_and(doms_new[0], cpu_active_mask,
+ housekeeping_cpumask(HK_FLAG_DOMAIN));
WARN_ON_ONCE(dattr_new);
}

--
2.7.4

2017-09-01 16:43:12

by Frederic Weisbecker

Subject: [PATCH 09/12] housekeeping: Handle nohz_full= parameter

We want to centralize isolation management in the housekeeping
subsystem. Therefore we need to handle the nohz_full= parameter from
there.

Since nohz_full= so far has involved unbound timers, watchdog, RCU
and tilegx NAPI isolation, we keep that default behaviour.

nohz_full= is expected to be deprecated in the future. We want to control
the isolation features from the isolcpus= parameter.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Luiz Capitulino <[email protected]>
---
include/linux/housekeeping.h | 1 +
include/linux/tick.h | 1 +
kernel/housekeeping.c | 44 +++++++++++++++++++++++++++++++-------------
kernel/time/tick-sched.c | 13 +++----------
4 files changed, 36 insertions(+), 23 deletions(-)

diff --git a/include/linux/housekeeping.h b/include/linux/housekeeping.h
index b1a62544..35fb197 100644
--- a/include/linux/housekeeping.h
+++ b/include/linux/housekeeping.h
@@ -10,6 +10,7 @@ enum hk_flags {
HK_FLAG_RCU = (1 << 1),
HK_FLAG_MISC = (1 << 2),
HK_FLAG_SCHED = (1 << 3),
+ HK_FLAG_TICK = (1 << 4),
};

#ifdef CONFIG_CPU_ISOLATION
diff --git a/include/linux/tick.h b/include/linux/tick.h
index 68afc09..3c82cf5 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -228,6 +228,7 @@ static inline void tick_dep_clear_signal(struct signal_struct *signal,

extern void tick_nohz_full_kick_cpu(int cpu);
extern void __tick_nohz_task_switch(void);
+extern void __init tick_nohz_full_setup(cpumask_var_t cpumask);
#else
static inline bool tick_nohz_full_enabled(void) { return false; }
static inline bool tick_nohz_full_cpu(int cpu) { return false; }
diff --git a/kernel/housekeeping.c b/kernel/housekeeping.c
index e2196d1..b2006bd 100644
--- a/kernel/housekeeping.c
+++ b/kernel/housekeeping.c
@@ -49,23 +49,41 @@ bool housekeeping_test_cpu(int cpu, enum hk_flags flags)

void __init housekeeping_init(void)
{
- if (!tick_nohz_full_enabled())
+ if (!housekeeping_flags)
return;

- if (!alloc_cpumask_var(&housekeeping_mask, GFP_KERNEL)) {
- WARN(1, "NO_HZ: Can't allocate not-full dynticks cpumask\n");
- cpumask_clear(tick_nohz_full_mask);
- tick_nohz_full_running = false;
- return;
- }
-
- cpumask_andnot(housekeeping_mask,
- cpu_possible_mask, tick_nohz_full_mask);
-
- housekeeping_flags = HK_FLAG_TIMER | HK_FLAG_RCU | HK_FLAG_MISC;
-
static_branch_enable(&housekeeping_overriden);

/* We need at least one CPU to handle housekeeping work */
WARN_ON_ONCE(cpumask_empty(housekeeping_mask));
}
+
+#ifdef CONFIG_NO_HZ_FULL
+static int __init housekeeping_nohz_full_setup(char *str)
+{
+ cpumask_var_t non_housekeeping_mask;
+
+ alloc_bootmem_cpumask_var(&non_housekeeping_mask);
+ if (cpulist_parse(str, non_housekeeping_mask) < 0) {
+ pr_warn("Housekeeping: Incorrect nohz_full cpumask\n");
+ free_bootmem_cpumask_var(non_housekeeping_mask);
+ return 0;
+ }
+
+ alloc_bootmem_cpumask_var(&housekeeping_mask);
+ cpumask_andnot(housekeeping_mask, cpu_possible_mask, non_housekeeping_mask);
+
+ if (cpumask_empty(housekeeping_mask))
+ cpumask_set_cpu(smp_processor_id(), housekeeping_mask);
+
+ housekeeping_flags = HK_FLAG_TICK | HK_FLAG_TIMER |
+ HK_FLAG_RCU | HK_FLAG_MISC;
+
+ tick_nohz_full_setup(non_housekeeping_mask);
+
+ free_bootmem_cpumask_var(non_housekeeping_mask);
+
+ return 1;
+}
+__setup("nohz_full=", housekeeping_nohz_full_setup);
+#endif
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 9d29dee..f09dd43 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -384,20 +384,13 @@ void __tick_nohz_task_switch(void)
local_irq_restore(flags);
}

-/* Parse the boot-time nohz CPU list from the kernel parameters. */
-static int __init tick_nohz_full_setup(char *str)
+/* Get the boot-time nohz CPU list from the kernel parameters. */
+void __init tick_nohz_full_setup(cpumask_var_t cpumask)
{
alloc_bootmem_cpumask_var(&tick_nohz_full_mask);
- if (cpulist_parse(str, tick_nohz_full_mask) < 0) {
- pr_warn("NO_HZ: Incorrect nohz_full cpumask\n");
- free_bootmem_cpumask_var(tick_nohz_full_mask);
- return 1;
- }
+ cpumask_copy(tick_nohz_full_mask, cpumask);
tick_nohz_full_running = true;
-
- return 1;
}
-__setup("nohz_full=", tick_nohz_full_setup);

static int tick_nohz_cpu_down(unsigned int cpu)
{
--
2.7.4

2017-09-01 16:43:42

by Frederic Weisbecker

Subject: [PATCH 07/12] housekeeping: Move it under own config, independent from NO_HZ

Complete the housekeeping split from CONFIG_NO_HZ_FULL by moving it
under its own config. This way we finally separate the isolation code
from nohz.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Luiz Capitulino <[email protected]>
---
include/linux/housekeeping.h | 6 +++---
init/Kconfig | 7 +++++++
kernel/Makefile | 2 +-
3 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/include/linux/housekeeping.h b/include/linux/housekeeping.h
index 320cc2b..dcbec47 100644
--- a/include/linux/housekeeping.h
+++ b/include/linux/housekeeping.h
@@ -5,7 +5,7 @@
#include <linux/init.h>
#include <linux/tick.h>

-#ifdef CONFIG_NO_HZ_FULL
+#ifdef CONFIG_CPU_ISOLATION
DECLARE_STATIC_KEY_FALSE(housekeeping_overriden);
extern int housekeeping_any_cpu(void);
extern const struct cpumask *housekeeping_cpumask(void);
@@ -27,11 +27,11 @@ static inline const struct cpumask *housekeeping_cpumask(void)

static inline void housekeeping_affine(struct task_struct *t) { }
static inline void housekeeping_init(void) { }
-#endif /* CONFIG_NO_HZ_FULL */
+#endif /* CONFIG_CPU_ISOLATION */

static inline bool housekeeping_cpu(int cpu)
{
-#ifdef CONFIG_NO_HZ_FULL
+#ifdef CONFIG_CPU_ISOLATION
if (static_branch_unlikely(&housekeeping_overriden))
return housekeeping_test_cpu(cpu);
#endif
diff --git a/init/Kconfig b/init/Kconfig
index 8514b25..f35bdae 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -472,6 +472,13 @@ config TASK_IO_ACCOUNTING

endmenu # "CPU/Task time and stats accounting"

+config CPU_ISOLATION
+ bool "CPU isolation"
+ help
+ Make sure that CPUs running critical tasks are not disturbed by
+ any source of "noise" such as unbound workqueues, timers, kthreads...
+ Unbound jobs get offloaded to housekeeping CPUs.
+
source "kernel/rcu/Kconfig"

config BUILD_BIN2C
diff --git a/kernel/Makefile b/kernel/Makefile
index 8a85c4b..445f876 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -109,7 +109,7 @@ obj-$(CONFIG_JUMP_LABEL) += jump_label.o
obj-$(CONFIG_CONTEXT_TRACKING) += context_tracking.o
obj-$(CONFIG_TORTURE_TEST) += torture.o
obj-$(CONFIG_MEMBARRIER) += membarrier.o
-obj-$(CONFIG_NO_HZ_FULL) += housekeeping.o
+obj-$(CONFIG_CPU_ISOLATION) += housekeeping.o

obj-$(CONFIG_HAS_IOMEM) += memremap.o

--
2.7.4

2017-09-01 16:43:58

by Frederic Weisbecker

Subject: [PATCH 06/12] housekeeping: Rename is_housekeeping_cpu to housekeeping_cpu

To keep a proper housekeeping namespace.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Luiz Capitulino <[email protected]>
---
include/linux/housekeeping.h | 2 +-
kernel/sched/core.c | 6 +++---
kernel/sched/fair.c | 2 +-
3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/linux/housekeeping.h b/include/linux/housekeeping.h
index cbe8d63..320cc2b 100644
--- a/include/linux/housekeeping.h
+++ b/include/linux/housekeeping.h
@@ -29,7 +29,7 @@ static inline void housekeeping_affine(struct task_struct *t) { }
static inline void housekeeping_init(void) { }
#endif /* CONFIG_NO_HZ_FULL */

-static inline bool is_housekeeping_cpu(int cpu)
+static inline bool housekeeping_cpu(int cpu)
{
#ifdef CONFIG_NO_HZ_FULL
if (static_branch_unlikely(&housekeeping_overriden))
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 536d6a5..fd00ae3 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -527,7 +527,7 @@ int get_nohz_timer_target(void)
int i, cpu = smp_processor_id();
struct sched_domain *sd;

- if (!idle_cpu(cpu) && is_housekeeping_cpu(cpu))
+ if (!idle_cpu(cpu) && housekeeping_cpu(cpu))
return cpu;

rcu_read_lock();
@@ -536,14 +536,14 @@ int get_nohz_timer_target(void)
if (cpu == i)
continue;

- if (!idle_cpu(i) && is_housekeeping_cpu(i)) {
+ if (!idle_cpu(i) && housekeeping_cpu(i)) {
cpu = i;
goto unlock;
}
}
}

- if (!is_housekeeping_cpu(cpu))
+ if (!housekeeping_cpu(cpu))
cpu = housekeeping_any_cpu();
unlock:
rcu_read_unlock();
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 5455e98..0d8f7b1 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8739,7 +8739,7 @@ void nohz_balance_enter_idle(int cpu)
return;

/* Spare idle load balancing on CPUs that don't want to be disturbed: */
- if (!is_housekeeping_cpu(cpu))
+ if (!housekeeping_cpu(cpu))
return;

if (test_bit(NOHZ_TICK_STOPPED, nohz_flags(cpu)))
--
2.7.4

2017-09-01 16:44:18

by Frederic Weisbecker

Subject: [PATCH 04/12] housekeeping: Make housekeeping cpumask private

Nobody needs to access this detail. housekeeping_cpumask() already
takes care of it.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Luiz Capitulino <[email protected]>
---
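Reading aid, not part of the commit: the encapsulation this patch moves
to can be pictured with the userspace model below, where the mask is
file-local and callers only see accessors. It is a sketch under stated
assumptions: a 64-bit word replaces struct cpumask, hk_enabled stands in
for tick_nohz_full_enabled(), the machine has 8 possible CPUs, and every
hk_model_* name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for struct cpumask: bit N set means CPU N is in the mask. */
typedef uint64_t mask_t;

/* Private state: nothing outside this file can name these. */
static mask_t hk_mask;
static bool hk_enabled;				/* models tick_nohz_full_enabled() */
static const mask_t possible_mask = 0xff;	/* hypothetical 8-CPU machine */

/* Models housekeeping_init(): housekeeping CPUs are the non-isolated ones. */
void hk_model_init(mask_t isolated)
{
	hk_mask = possible_mask & ~isolated;
	hk_enabled = true;
}

/* Models housekeeping_cpumask(): all possible CPUs unless isolation is on. */
mask_t hk_model_cpumask(void)
{
	return hk_enabled ? hk_mask : possible_mask;
}

/* Models housekeeping_test_cpu(): every CPU qualifies unless isolation is on. */
bool hk_model_test_cpu(int cpu)
{
	if (hk_enabled)
		return (hk_mask >> cpu) & 1;

	return true;
}
```

Callers that used to read housekeeping_mask directly would go through
the accessors instead, so the off-case fallback lives in one place.
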
include/linux/housekeeping.h | 31 ++++++++++---------------------
kernel/housekeeping.c | 33 ++++++++++++++++++++++++++++++++-
2 files changed, 42 insertions(+), 22 deletions(-)

diff --git a/include/linux/housekeeping.h b/include/linux/housekeeping.h
index 64d0ee5..31a1401 100644
--- a/include/linux/housekeeping.h
+++ b/include/linux/housekeeping.h
@@ -6,46 +6,35 @@
#include <linux/tick.h>

#ifdef CONFIG_NO_HZ_FULL
-extern cpumask_var_t housekeeping_mask;
+extern int housekeeping_any_cpu(void);
+extern const struct cpumask *housekeeping_cpumask(void);
+extern void housekeeping_affine(struct task_struct *t);
+extern bool housekeeping_test_cpu(int cpu);
extern void __init housekeeping_init(void);
+
#else
-static inline void housekeeping_init(void) { }
-#endif /* CONFIG_NO_HZ_FULL */

static inline int housekeeping_any_cpu(void)
{
-#ifdef CONFIG_NO_HZ_FULL
- if (tick_nohz_full_enabled())
- return cpumask_any_and(housekeeping_mask, cpu_online_mask);
-#endif
return smp_processor_id();
}

static inline const struct cpumask *housekeeping_cpumask(void)
{
-#ifdef CONFIG_NO_HZ_FULL
- if (tick_nohz_full_enabled())
- return housekeeping_mask;
-#endif
return cpu_possible_mask;
}

+static inline void housekeeping_affine(struct task_struct *t) { }
+static inline void housekeeping_init(void) { }
+#endif /* CONFIG_NO_HZ_FULL */
+
static inline bool is_housekeeping_cpu(int cpu)
{
#ifdef CONFIG_NO_HZ_FULL
if (tick_nohz_full_enabled())
- return cpumask_test_cpu(cpu, housekeeping_mask);
+ return housekeeping_test_cpu(cpu);
#endif
return true;
}

-static inline void housekeeping_affine(struct task_struct *t)
-{
-#ifdef CONFIG_NO_HZ_FULL
- if (tick_nohz_full_enabled())
- set_cpus_allowed_ptr(t, housekeeping_mask);
-
-#endif
-}
-
#endif /* _LINUX_HOUSEKEEPING_H */
diff --git a/kernel/housekeeping.c b/kernel/housekeeping.c
index 6d8afd5..0183e75 100644
--- a/kernel/housekeeping.c
+++ b/kernel/housekeeping.c
@@ -10,7 +10,38 @@
#include <linux/tick.h>
#include <linux/init.h>

-cpumask_var_t housekeeping_mask;
+static cpumask_var_t housekeeping_mask;
+
+int housekeeping_any_cpu(void)
+{
+ if (tick_nohz_full_enabled())
+ return cpumask_any_and(housekeeping_mask, cpu_online_mask);
+
+ return smp_processor_id();
+}
+
+const struct cpumask *housekeeping_cpumask(void)
+{
+ if (tick_nohz_full_enabled())
+ return housekeeping_mask;
+
+ return cpu_possible_mask;
+}
+
+void housekeeping_affine(struct task_struct *t)
+{
+ if (tick_nohz_full_enabled())
+ set_cpus_allowed_ptr(t, housekeeping_mask);
+}
+
+bool housekeeping_test_cpu(int cpu)
+{
+ if (tick_nohz_full_enabled())
+ return cpumask_test_cpu(cpu, housekeeping_mask);
+
+ return true;
+}
+

void __init housekeeping_init(void)
{
--
2.7.4

2017-09-01 16:42:04

by Frederic Weisbecker

Subject: [PATCH 01/12] housekeeping: Move housekeeping related code to its own file

The housekeeping code is currently tied to the nohz code. As we are
planning to make housekeeping independent from it, start by moving
the relevant code to its own file.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Luiz Capitulino <[email protected]>
---
drivers/net/ethernet/tile/tilegx.c | 2 +-
include/linux/housekeeping.h | 56 ++++++++++++++++++++++++++++++++++++++
include/linux/tick.h | 37 -------------------------
init/main.c | 2 ++
kernel/Makefile | 1 +
kernel/housekeeping.c | 32 ++++++++++++++++++++++
kernel/rcu/tree_plugin.h | 1 +
kernel/rcu/update.c | 1 +
kernel/sched/core.c | 1 +
kernel/sched/fair.c | 1 +
kernel/time/tick-sched.c | 18 ------------
kernel/watchdog.c | 1 +
12 files changed, 97 insertions(+), 56 deletions(-)
create mode 100644 include/linux/housekeeping.h
create mode 100644 kernel/housekeeping.c

diff --git a/drivers/net/ethernet/tile/tilegx.c b/drivers/net/ethernet/tile/tilegx.c
index aec9538..eb74e09 100644
--- a/drivers/net/ethernet/tile/tilegx.c
+++ b/drivers/net/ethernet/tile/tilegx.c
@@ -40,7 +40,7 @@
#include <linux/tcp.h>
#include <linux/net_tstamp.h>
#include <linux/ptp_clock_kernel.h>
-#include <linux/tick.h>
+#include <linux/housekeeping.h>

#include <asm/checksum.h>
#include <asm/homecache.h>
diff --git a/include/linux/housekeeping.h b/include/linux/housekeeping.h
new file mode 100644
index 0000000..3d6a8e6
--- /dev/null
+++ b/include/linux/housekeeping.h
@@ -0,0 +1,56 @@
+#ifndef _LINUX_HOUSEKEEPING_H
+#define _LINUX_HOUSEKEEPING_H
+
+#include <linux/cpumask.h>
+#include <linux/init.h>
+#include <linux/tick.h>
+
+#ifdef CONFIG_NO_HZ_FULL
+extern cpumask_var_t housekeeping_mask;
+
+static inline int housekeeping_any_cpu(void)
+{
+ return cpumask_any_and(housekeeping_mask, cpu_online_mask);
+}
+
+extern void __init housekeeping_init(void);
+
+#else
+
+static inline int housekeeping_any_cpu(void)
+{
+ return smp_processor_id();
+}
+
+static inline void housekeeping_init(void) { }
+#endif /* CONFIG_NO_HZ_FULL */
+
+
+static inline const struct cpumask *housekeeping_cpumask(void)
+{
+#ifdef CONFIG_NO_HZ_FULL
+ if (tick_nohz_full_enabled())
+ return housekeeping_mask;
+#endif
+ return cpu_possible_mask;
+}
+
+static inline bool is_housekeeping_cpu(int cpu)
+{
+#ifdef CONFIG_NO_HZ_FULL
+ if (tick_nohz_full_enabled())
+ return cpumask_test_cpu(cpu, housekeeping_mask);
+#endif
+ return true;
+}
+
+static inline void housekeeping_affine(struct task_struct *t)
+{
+#ifdef CONFIG_NO_HZ_FULL
+ if (tick_nohz_full_enabled())
+ set_cpus_allowed_ptr(t, housekeeping_mask);
+
+#endif
+}
+
+#endif /* _LINUX_HOUSEKEEPING_H */
diff --git a/include/linux/tick.h b/include/linux/tick.h
index fe01e68..68afc09 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -137,7 +137,6 @@ static inline u64 get_cpu_iowait_time_us(int cpu, u64 *unused) { return -1; }
#ifdef CONFIG_NO_HZ_FULL
extern bool tick_nohz_full_running;
extern cpumask_var_t tick_nohz_full_mask;
-extern cpumask_var_t housekeeping_mask;

static inline bool tick_nohz_full_enabled(void)
{
@@ -161,11 +160,6 @@ static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask)
cpumask_or(mask, mask, tick_nohz_full_mask);
}

-static inline int housekeeping_any_cpu(void)
-{
- return cpumask_any_and(housekeeping_mask, cpu_online_mask);
-}
-
extern void tick_nohz_dep_set(enum tick_dep_bits bit);
extern void tick_nohz_dep_clear(enum tick_dep_bits bit);
extern void tick_nohz_dep_set_cpu(int cpu, enum tick_dep_bits bit);
@@ -235,10 +229,6 @@ static inline void tick_dep_clear_signal(struct signal_struct *signal,
extern void tick_nohz_full_kick_cpu(int cpu);
extern void __tick_nohz_task_switch(void);
#else
-static inline int housekeeping_any_cpu(void)
-{
- return smp_processor_id();
-}
static inline bool tick_nohz_full_enabled(void) { return false; }
static inline bool tick_nohz_full_cpu(int cpu) { return false; }
static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
@@ -260,33 +250,6 @@ static inline void tick_nohz_full_kick_cpu(int cpu) { }
static inline void __tick_nohz_task_switch(void) { }
#endif

-static inline const struct cpumask *housekeeping_cpumask(void)
-{
-#ifdef CONFIG_NO_HZ_FULL
- if (tick_nohz_full_enabled())
- return housekeeping_mask;
-#endif
- return cpu_possible_mask;
-}
-
-static inline bool is_housekeeping_cpu(int cpu)
-{
-#ifdef CONFIG_NO_HZ_FULL
- if (tick_nohz_full_enabled())
- return cpumask_test_cpu(cpu, housekeeping_mask);
-#endif
- return true;
-}
-
-static inline void housekeeping_affine(struct task_struct *t)
-{
-#ifdef CONFIG_NO_HZ_FULL
- if (tick_nohz_full_enabled())
- set_cpus_allowed_ptr(t, housekeeping_mask);
-
-#endif
-}
-
static inline void tick_nohz_task_switch(void)
{
if (tick_nohz_full_enabled())
diff --git a/init/main.c b/init/main.c
index 881d624..4047c85 100644
--- a/init/main.c
+++ b/init/main.c
@@ -46,6 +46,7 @@
#include <linux/cgroup.h>
#include <linux/efi.h>
#include <linux/tick.h>
+#include <linux/housekeeping.h>
#include <linux/interrupt.h>
#include <linux/taskstats_kern.h>
#include <linux/delayacct.h>
@@ -604,6 +605,7 @@ asmlinkage __visible void __init start_kernel(void)
early_irq_init();
init_IRQ();
tick_init();
+ housekeeping_init();
rcu_init_nohz();
init_timers();
hrtimers_init();
diff --git a/kernel/Makefile b/kernel/Makefile
index 4cb8e8b..8a85c4b 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -109,6 +109,7 @@ obj-$(CONFIG_JUMP_LABEL) += jump_label.o
obj-$(CONFIG_CONTEXT_TRACKING) += context_tracking.o
obj-$(CONFIG_TORTURE_TEST) += torture.o
obj-$(CONFIG_MEMBARRIER) += membarrier.o
+obj-$(CONFIG_NO_HZ_FULL) += housekeeping.o

obj-$(CONFIG_HAS_IOMEM) += memremap.o

diff --git a/kernel/housekeeping.c b/kernel/housekeeping.c
new file mode 100644
index 0000000..6d8afd5
--- /dev/null
+++ b/kernel/housekeeping.c
@@ -0,0 +1,32 @@
+/*
+ * Housekeeping management. Manage the targets for routine code that can run on
+ * any CPU: unbound workqueues, timers, kthreads and any offloadable work.
+ *
+ * Copyright (C) 2017 Red Hat, Inc., Frederic Weisbecker
+ *
+ */
+
+#include <linux/housekeeping.h>
+#include <linux/tick.h>
+#include <linux/init.h>
+
+cpumask_var_t housekeeping_mask;
+
+void __init housekeeping_init(void)
+{
+ if (!tick_nohz_full_enabled())
+ return;
+
+ if (!alloc_cpumask_var(&housekeeping_mask, GFP_KERNEL)) {
+ WARN(1, "NO_HZ: Can't allocate not-full dynticks cpumask\n");
+ cpumask_clear(tick_nohz_full_mask);
+ tick_nohz_full_running = false;
+ return;
+ }
+
+ cpumask_andnot(housekeeping_mask,
+ cpu_possible_mask, tick_nohz_full_mask);
+
+ /* We need at least one CPU to handle housekeeping work */
+ WARN_ON_ONCE(cpumask_empty(housekeeping_mask));
+}
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 908b309..c66d162 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -29,6 +29,7 @@
#include <linux/oom.h>
#include <linux/sched/debug.h>
#include <linux/smpboot.h>
+#include <linux/housekeeping.h>
#include <uapi/linux/sched/types.h>
#include "../time/tick-internal.h"

diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index 00e77c4..bfe973d 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -51,6 +51,7 @@
#include <linux/kthread.h>
#include <linux/tick.h>
#include <linux/rcupdate_wait.h>
+#include <linux/housekeeping.h>

#define CREATE_TRACE_POINTS

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f9f9948..536d6a5 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -26,6 +26,7 @@
#include <linux/profile.h>
#include <linux/security.h>
#include <linux/syscalls.h>
+#include <linux/housekeeping.h>

#include <asm/switch_to.h>
#include <asm/tlb.h>
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8d58687..5455e98 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -32,6 +32,7 @@
#include <linux/mempolicy.h>
#include <linux/migrate.h>
#include <linux/task_work.h>
+#include <linux/housekeeping.h>

#include <trace/events/sched.h>

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index c7a899c..9d29dee 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -165,7 +165,6 @@ static void tick_sched_handle(struct tick_sched *ts, struct pt_regs *regs)

#ifdef CONFIG_NO_HZ_FULL
cpumask_var_t tick_nohz_full_mask;
-cpumask_var_t housekeeping_mask;
bool tick_nohz_full_running;
static atomic_t tick_dep_mask;

@@ -437,13 +436,6 @@ void __init tick_nohz_init(void)
return;
}

- if (!alloc_cpumask_var(&housekeeping_mask, GFP_KERNEL)) {
- WARN(1, "NO_HZ: Can't allocate not-full dynticks cpumask\n");
- cpumask_clear(tick_nohz_full_mask);
- tick_nohz_full_running = false;
- return;
- }
-
/*
* Full dynticks uses irq work to drive the tick rescheduling on safe
* locking contexts. But then we need irq work to raise its own
@@ -452,7 +444,6 @@ void __init tick_nohz_init(void)
if (!arch_irq_work_has_interrupt()) {
pr_warn("NO_HZ: Can't run full dynticks because arch doesn't support irq work self-IPIs\n");
cpumask_clear(tick_nohz_full_mask);
- cpumask_copy(housekeeping_mask, cpu_possible_mask);
tick_nohz_full_running = false;
return;
}
@@ -465,9 +456,6 @@ void __init tick_nohz_init(void)
cpumask_clear_cpu(cpu, tick_nohz_full_mask);
}

- cpumask_andnot(housekeeping_mask,
- cpu_possible_mask, tick_nohz_full_mask);
-
for_each_cpu(cpu, tick_nohz_full_mask)
context_tracking_cpu_set(cpu);

@@ -477,12 +465,6 @@ void __init tick_nohz_init(void)
WARN_ON(ret < 0);
pr_info("NO_HZ: Full dynticks CPUs: %*pbl.\n",
cpumask_pr_args(tick_nohz_full_mask));
-
- /*
- * We need at least one CPU to handle housekeeping work such
- * as timekeeping, unbound timers, workqueues, ...
- */
- WARN_ON_ONCE(cpumask_empty(housekeeping_mask));
}
#endif

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 06d3389..7a9df162 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -24,6 +24,7 @@
#include <linux/workqueue.h>
#include <linux/sched/clock.h>
#include <linux/sched/debug.h>
+#include <linux/housekeeping.h>

#include <asm/irq_regs.h>
#include <linux/kvm_para.h>
--
2.7.4

2017-09-01 16:44:48

by Frederic Weisbecker

Subject: [PATCH 02/12] watchdog: Use housekeeping_cpumask() instead of ad-hoc version

While trying to disable the watchdog on nohz_full CPUs, the watchdog
implements an ad-hoc version of housekeeping_cpumask(). Let's replace
those re-invented lines.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Luiz Capitulino <[email protected]>
---
kernel/watchdog.c | 11 +++--------
1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 7a9df162..cdd0d11 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -941,15 +941,10 @@ void __init lockup_detector_init(void)
{
set_sample_period();

-#ifdef CONFIG_NO_HZ_FULL
- if (tick_nohz_full_enabled()) {
+ if (tick_nohz_full_enabled())
pr_info("Disabling watchdog on nohz_full cores by default\n");
- cpumask_copy(&watchdog_cpumask, housekeeping_mask);
- } else
- cpumask_copy(&watchdog_cpumask, cpu_possible_mask);
-#else
- cpumask_copy(&watchdog_cpumask, cpu_possible_mask);
-#endif
+
+ cpumask_copy(&watchdog_cpumask, housekeeping_cpumask());

if (watchdog_enabled)
watchdog_enable_all_cpus();
--
2.7.4