2015-05-12 22:30:43

by Paul E. McKenney

Subject: [PATCH tip/core/rcu 0/24] Initialization/Kconfig updates for 4.2

Hello!

This series updates initialization and Kconfig:

1. Control grace-period delays directly from value, avoiding
pointless Kconfig questions.

2. Modulate grace-period slow init to normalize delay.

3. Shut up spurious gcc uninitialized-variable warning.

4. Panic if RCU tree can not accommodate all CPUs, courtesy of
Alexander Gordeev.

5. Remove superfluous local variable in rcu_init_geometry(), courtesy
of Alexander Gordeev.

6. Clean up rcu_init_geometry() code and arithmetic, courtesy of
Alexander Gordeev.

7. Simplify rcu_init_geometry() capacity arithmetic, courtesy of
Alexander Gordeev.

8. Limit rcu_state::levelcnt[] to RCU_NUM_LVLS items, courtesy of
Alexander Gordeev.

9. Limit rcu_capacity[] size to RCU_NUM_LVLS items, courtesy of
Alexander Gordeev.

10. Remove unnecessary fields from rcu_state structure, courtesy
of Alexander Gordeev.

11. Limit count of static data to the number of RCU levels, courtesy
of Alexander Gordeev.

12. Simplify calculation of the number of RCU nodes, courtesy of
Alexander Gordeev.

13. Provide diagnostic option to slow down grace-period scans.

14. Directly drive CONFIG_TASKS_RCU from Kconfig.

15. Directly drive CONFIG_RCU_USER_QS from Kconfig.

16. Convert CONFIG_RCU_FANOUT_EXACT to boot parameter.

17. Enable diagnostic dump of rcu_node combining tree.

18. Create CONFIG_RCU_EXPERT Kconfig and hide boolean Kconfig parameters
behind it.

19. Break dependency of CONFIG_RCU_FANOUT_LEAF on CONFIG_RCU_FANOUT.

20. Make RCU able to tolerate undefined CONFIG_RCU_FANOUT.

21. Make RCU able to tolerate undefined CONFIG_RCU_FANOUT_LEAF.

22. Make RCU able to tolerate undefined CONFIG_RCU_KTHREAD_PRIO.

23. Remove prompt for RCU implementation.

24. Conditionally compile RCU's eqs warnings in order to shave a bit
of time off of kernel/user and kernel/idle transitions.

Thanx, Paul

------------------------------------------------------------------------

b/Documentation/kernel-parameters.txt | 27 +
b/init/Kconfig | 74 +--
b/kernel/rcu/tree.c | 268 ++++++++------
b/kernel/rcu/tree.h | 63 ++-
b/kernel/rcu/tree_plugin.h | 18
b/lib/Kconfig.debug | 67 +++
b/tools/testing/selftests/rcutorture/configs/rcu/CFcommon | 2
7 files changed, 338 insertions(+), 181 deletions(-)


2015-05-12 22:31:09

by Paul E. McKenney

Subject: [PATCH tip/core/rcu 01/24] rcu: Control grace-period delays directly from value

From: "Paul E. McKenney" <[email protected]>

In a misguided attempt to avoid an #ifdef, the use of the
gp_init_delay module parameter was conditioned on the corresponding
RCU_TORTURE_TEST_SLOW_INIT Kconfig variable, using IS_ENABLED() at
the point of use in the code. This meant that the compiler always saw
the delay, which meant that RCU_TORTURE_TEST_SLOW_INIT_DELAY had to be
unconditionally defined. This in turn caused "make oldconfig" to ask
pointless questions about the value of RCU_TORTURE_TEST_SLOW_INIT_DELAY
in cases where it was not even used.

This commit avoids these pointless questions by defining gp_init_delay
under #ifdef. In one branch, gp_init_delay is initialized to
RCU_TORTURE_TEST_SLOW_INIT_DELAY and is also a module parameter (thus
allowing boot-time modification), and in the other branch gp_init_delay
is a const variable initialized by default to zero.

This approach also simplifies the code at the delay point by eliminating
the IS_ENABLED(). Because gp_init_delay is a constant zero in the no-delay
case intended for production use, the "gp_init_delay > 0" check causes
the delay to become dead code, as desired in this case. In addition,
this commit replaces magic constant "10" with the preprocessor variable
PER_RCU_NODE_PERIOD, which controls the number of grace periods that
are allowed to elapse at full speed before a delay is inserted.
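
The pattern described above can be sketched outside the kernel as follows. This is a minimal illustration, not the kernel code: SLOW_INIT_ENABLED and SLOW_INIT_DELAY are hypothetical stand-ins for the Kconfig symbols, and should_delay() is a hypothetical helper standing in for the check at the delay point. With the macro left undefined, the delay variable is a constant zero, so the compiler is free to discard the delay branch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the Kconfig symbols; not kernel names. */
/* Leave SLOW_INIT_ENABLED undefined to model a production build. */
#define SLOW_INIT_DELAY 3

#ifdef SLOW_INIT_ENABLED
static int gp_init_delay = SLOW_INIT_DELAY;	/* tunable in debug builds */
#else
static const int gp_init_delay;			/* constant zero otherwise */
#endif

#define PER_RCU_NODE_PERIOD 10	/* grace periods between delays */

/* Returns true when grace period number gpnum should be slowed down. */
bool should_delay(unsigned long gpnum, int num_nodes)
{
	assert(num_nodes > 0);
	return gp_init_delay > 0 &&
	       !(gpnum % ((unsigned long)num_nodes * PER_RCU_NODE_PERIOD));
}
```

Because the first operand of the && is a compile-time false in the no-delay build, the modulo is never evaluated there and no #ifdef is needed at the point of use.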

Reported-by: Linus Torvalds <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
kernel/rcu/tree.c | 16 +++++++++-------
lib/Kconfig.debug | 1 +
2 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 4d3299577d7b..0628df155970 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -162,11 +162,14 @@ static void invoke_rcu_callbacks(struct rcu_state *rsp, struct rcu_data *rdp);
static int kthread_prio = CONFIG_RCU_KTHREAD_PRIO;
module_param(kthread_prio, int, 0644);

-/* Delay in jiffies for grace-period initialization delays. */
-static int gp_init_delay = IS_ENABLED(CONFIG_RCU_TORTURE_TEST_SLOW_INIT)
- ? CONFIG_RCU_TORTURE_TEST_SLOW_INIT_DELAY
- : 0;
+/* Delay in jiffies for grace-period initialization delays, debug only. */
+#ifdef CONFIG_RCU_TORTURE_TEST_SLOW_INIT
+static int gp_init_delay = CONFIG_RCU_TORTURE_TEST_SLOW_INIT_DELAY;
module_param(gp_init_delay, int, 0644);
+#else /* #ifdef CONFIG_RCU_TORTURE_TEST_SLOW_INIT */
+static const int gp_init_delay;
+#endif /* #else #ifdef CONFIG_RCU_TORTURE_TEST_SLOW_INIT */
+#define PER_RCU_NODE_PERIOD 10 /* Number of grace periods between delays. */

/*
* Track the rcutorture test sequence number and the update version
@@ -1844,9 +1847,8 @@ static int rcu_gp_init(struct rcu_state *rsp)
raw_spin_unlock_irq(&rnp->lock);
cond_resched_rcu_qs();
WRITE_ONCE(rsp->gp_activity, jiffies);
- if (IS_ENABLED(CONFIG_RCU_TORTURE_TEST_SLOW_INIT) &&
- gp_init_delay > 0 &&
- !(rsp->gpnum % (rcu_num_nodes * 10)))
+ if (gp_init_delay > 0 &&
+ !(rsp->gpnum % (rcu_num_nodes * PER_RCU_NODE_PERIOD)))
schedule_timeout_uninterruptible(gp_init_delay);
}

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 17670573dda8..ba2b0c87e65b 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1281,6 +1281,7 @@ config RCU_TORTURE_TEST_SLOW_INIT_DELAY
int "How much to slow down RCU grace-period initialization"
range 0 5
default 3
+ depends on RCU_TORTURE_TEST_SLOW_INIT
help
This option specifies the number of jiffies to wait between
each rcu_node structure initialization.
--
1.8.1.5

2015-05-12 22:38:07

by Paul E. McKenney

Subject: [PATCH tip/core/rcu 02/24] rcu: Modulate grace-period slow init to normalize delay

From: "Paul E. McKenney" <[email protected]>

Currently, the larger the gp_init_delay boot parameter, the slower
rcutorture will sequence through grace periods. This commit avoids this
issue by decreasing the probability of slowing initialization of a given
grace period as the degree of slowness increases.
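
The normalization can be checked with a small model. The sketch below is an assumption-laden simplification, not kernel code: total_delay_jiffies() is a hypothetical helper, and it assumes that a delayed grace period sleeps for the configured delay once per rcu_node, so that it costs num_nodes * delay jiffies. Under that model the delay cancels out of the average, leaving roughly 1 / PER_RCU_NODE_PERIOD jiffies of added delay per grace period.

```c
#include <assert.h>

#define PER_RCU_NODE_PERIOD 3	/* same value as in the patch */

/*
 * Simplified model: total jiffies of added delay over ngps grace periods.
 * A delayed grace period is assumed to sleep `delay` jiffies once per
 * rcu_node, so it costs num_nodes * delay jiffies.
 */
unsigned long total_delay_jiffies(unsigned long ngps, int num_nodes, int delay)
{
	unsigned long period, gp, total = 0;

	assert(num_nodes > 0 && delay >= 0);
	if (delay == 0)
		return 0;	/* no-delay case: nothing is ever added */

	/* One grace period out of every `period` is slowed down. */
	period = (unsigned long)num_nodes * PER_RCU_NODE_PERIOD * delay;
	for (gp = 1; gp <= ngps; gp++)
		if (!(gp % period))
			total += (unsigned long)num_nodes * delay;
	return total;
}
```

For example, with four nodes and 600 grace periods, a delay of 1 and a delay of 5 both add the same total number of jiffies in this model; only the spacing and length of the individual delays differ.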

Signed-off-by: Paul E. McKenney <[email protected]>
---
kernel/rcu/tree.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 0628df155970..c34422d92aa9 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -169,7 +169,17 @@ module_param(gp_init_delay, int, 0644);
#else /* #ifdef CONFIG_RCU_TORTURE_TEST_SLOW_INIT */
static const int gp_init_delay;
#endif /* #else #ifdef CONFIG_RCU_TORTURE_TEST_SLOW_INIT */
-#define PER_RCU_NODE_PERIOD 10 /* Number of grace periods between delays. */
+
+/*
+ * Number of grace periods between delays, normalized by the duration of
+ * the delay. The longer the delay, the more the grace periods between
+ * each delay. The reason for this normalization is that it means that,
+ * for non-zero delays, the overall slowdown of grace periods is constant
+ * regardless of the duration of the delay. This arrangement balances
+ * the need for long delays to increase some race probabilities with the
+ * need for fast grace periods to increase other race probabilities.
+ */
+#define PER_RCU_NODE_PERIOD 3 /* Number of grace periods between delays. */

/*
* Track the rcutorture test sequence number and the update version
@@ -1848,7 +1858,8 @@ static int rcu_gp_init(struct rcu_state *rsp)
cond_resched_rcu_qs();
WRITE_ONCE(rsp->gp_activity, jiffies);
if (gp_init_delay > 0 &&
- !(rsp->gpnum % (rcu_num_nodes * PER_RCU_NODE_PERIOD)))
+ !(rsp->gpnum %
+ (rcu_num_nodes * PER_RCU_NODE_PERIOD * gp_init_delay)))
schedule_timeout_uninterruptible(gp_init_delay);
}

--
1.8.1.5

2015-05-12 22:31:16

by Paul E. McKenney

Subject: [PATCH tip/core/rcu 03/24] rcu: Shut up spurious gcc uninitialized-variable warning

From: "Paul E. McKenney" <[email protected]>

Because gcc doesn't realize that rcu_num_lvls must be strictly greater
than zero, some versions give a spurious warning about levelcnt[0] being
uninitialized in rcu_init_one(). This commit updates the condition on
the pre-existing panic() in order to educate gcc on this point.

Signed-off-by: Paul E. McKenney <[email protected]>
---
kernel/rcu/tree.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index c34422d92aa9..9b076b284695 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3982,9 +3982,9 @@ static void __init rcu_init_one(struct rcu_state *rsp,

BUILD_BUG_ON(MAX_RCU_LVLS > ARRAY_SIZE(buf)); /* Fix buf[] init! */

- /* Silence gcc 4.8 warning about array index out of range. */
- if (rcu_num_lvls > RCU_NUM_LVLS)
- panic("rcu_init_one: rcu_num_lvls overflow");
+ /* Silence gcc 4.8 false positive about array index out of range. */
+ if (rcu_num_lvls <= 0 || rcu_num_lvls > RCU_NUM_LVLS)
+ panic("rcu_init_one: rcu_num_lvls out of range");

/* Initialize the level-tracking arrays. */

--
1.8.1.5

2015-05-12 22:31:06

by Paul E. McKenney

Subject: [PATCH tip/core/rcu 04/24] rcu: Panic if RCU tree can not accommodate all CPUs

From: Alexander Gordeev <[email protected]>

Currently, the case where the RCU tree cannot accommodate
the configured number of CPUs is not permitted and causes
a fall back to compile-time values. However, the code has no
means of exceeding the RCU tree capacity, either at compile
time or at run time. Therefore, if the condition is met at
run time, it indicates a serious problem elsewhere and should
be handled with a panic.

Cc: "Paul E. McKenney" <[email protected]>
Signed-off-by: Alexander Gordeev <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
kernel/rcu/tree.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 9b076b284695..f374f2caaf25 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4087,16 +4087,19 @@ static void __init rcu_init_geometry(void)
rcu_capacity[i] = rcu_capacity[i - 1] * CONFIG_RCU_FANOUT;

/*
+ * The tree must be able to accommodate the configured number of CPUs.
+ * If this limit is exceeded then we have a serious problem elsewhere.
+ *
* The boot-time rcu_fanout_leaf parameter is only permitted
* to increase the leaf-level fanout, not decrease it. Of course,
* the leaf-level fanout cannot exceed the number of bits in
- * the rcu_node masks. Finally, the tree must be able to accommodate
- * the configured number of CPUs. Complain and fall back to the
- * compile-time values if these limits are exceeded.
+ * the rcu_node masks. Complain and fall back to the compile-
+ * time values if these limits are exceeded.
*/
- if (rcu_fanout_leaf < CONFIG_RCU_FANOUT_LEAF ||
- rcu_fanout_leaf > sizeof(unsigned long) * 8 ||
- n > rcu_capacity[MAX_RCU_LVLS]) {
+ if (n > rcu_capacity[MAX_RCU_LVLS])
+ panic("rcu_init_geometry: rcu_capacity[] is too small");
+ else if (rcu_fanout_leaf < CONFIG_RCU_FANOUT_LEAF ||
+ rcu_fanout_leaf > sizeof(unsigned long) * 8) {
WARN_ON(1);
return;
}
--
1.8.1.5

2015-05-12 22:31:12

by Paul E. McKenney

Subject: [PATCH tip/core/rcu 05/24] rcu: Remove superfluous local variable in rcu_init_geometry()

From: Alexander Gordeev <[email protected]>

Local variable 'n' mimics 'nr_cpu_ids', yet both are
used within one function. There is no reason for 'n' to
exist whatsoever.

Cc: "Paul E. McKenney" <[email protected]>
Signed-off-by: Alexander Gordeev <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
kernel/rcu/tree.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index f374f2caaf25..99431ca4d395 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4053,7 +4053,6 @@ static void __init rcu_init_geometry(void)
ulong d;
int i;
int j;
- int n = nr_cpu_ids;
int rcu_capacity[MAX_RCU_LVLS + 1];

/*
@@ -4096,7 +4095,7 @@ static void __init rcu_init_geometry(void)
* the rcu_node masks. Complain and fall back to the compile-
* time values if these limits are exceeded.
*/
- if (n > rcu_capacity[MAX_RCU_LVLS])
+ if (nr_cpu_ids > rcu_capacity[MAX_RCU_LVLS])
panic("rcu_init_geometry: rcu_capacity[] is too small");
else if (rcu_fanout_leaf < CONFIG_RCU_FANOUT_LEAF ||
rcu_fanout_leaf > sizeof(unsigned long) * 8) {
@@ -4106,10 +4105,11 @@ static void __init rcu_init_geometry(void)

/* Calculate the number of rcu_nodes at each level of the tree. */
for (i = 1; i <= MAX_RCU_LVLS; i++)
- if (n <= rcu_capacity[i]) {
- for (j = 0; j <= i; j++)
- num_rcu_lvl[j] =
- DIV_ROUND_UP(n, rcu_capacity[i - j]);
+ if (nr_cpu_ids <= rcu_capacity[i]) {
+ for (j = 0; j <= i; j++) {
+ int cap = rcu_capacity[i - j];
+ num_rcu_lvl[j] = DIV_ROUND_UP(nr_cpu_ids, cap);
+ }
rcu_num_lvls = i;
for (j = i + 1; j <= MAX_RCU_LVLS; j++)
num_rcu_lvl[j] = 0;
@@ -4120,7 +4120,7 @@ static void __init rcu_init_geometry(void)
rcu_num_nodes = 0;
for (i = 0; i <= MAX_RCU_LVLS; i++)
rcu_num_nodes += num_rcu_lvl[i];
- rcu_num_nodes -= n;
+ rcu_num_nodes -= nr_cpu_ids;
}

void __init rcu_init(void)
--
1.8.1.5

2015-05-12 22:32:13

by Paul E. McKenney

Subject: [PATCH tip/core/rcu 06/24] rcu: Cleanup rcu_init_geometry() code and arithmetics

From: Alexander Gordeev <[email protected]>

This update simplifies rcu_init_geometry() code flow
and makes calculation of the total number of rcu_node
structures easier to read.

The update relies on the fact that num_rcu_lvl[] is never
accessed beyond the rcu_num_lvls index by the rest of the
code. Therefore, there is no need to initialize the whole
num_rcu_lvl[].
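
The two-step flow (first the number of levels, then the nodes per level) can be sketched in user space as below. This is only an illustration under assumptions: tree_geometry() and MAX_LVLS are hypothetical names, the fanout values are passed in as parameters rather than taken from Kconfig, and the sketch requires more than one CPU because with cap[0] = 1 a single CPU would stop the first loop at level zero.

```c
#include <assert.h>

#define MAX_LVLS 4	/* hypothetical stand-in for MAX_RCU_LVLS */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Fill num_lvl[] with the node count per level (root first) and return
 * the total number of nodes. *lvls receives the number of levels.
 */
int tree_geometry(int ncpus, int leaf_fanout, int fanout,
		  int num_lvl[MAX_LVLS], int *lvls)
{
	int cap[MAX_LVLS + 1];
	int i, total = 0;

	/* cap[i] is the number of CPUs an i-level tree can hold. */
	cap[0] = 1;
	cap[1] = leaf_fanout;
	for (i = 2; i <= MAX_LVLS; i++)
		cap[i] = cap[i - 1] * fanout;
	assert(ncpus > 1 && ncpus <= cap[MAX_LVLS]);

	/* Number of levels: index of the first capacity that fits. */
	for (i = 0; ncpus > cap[i]; i++)
		;
	*lvls = i;

	/* Nodes per level, summed only over the levels in use. */
	for (i = 0; i < *lvls; i++) {
		num_lvl[i] = DIV_ROUND_UP(ncpus, cap[*lvls - i]);
		total += num_lvl[i];
	}
	return total;
}
```

With 100 CPUs, a leaf fanout of 16 and an interior fanout of 64, this yields two levels holding one root and seven leaf nodes. Because the sum stops at the last level in use, no pseudo-level of CPUs has to be subtracted afterwards.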

Cc: "Paul E. McKenney" <[email protected]>
Signed-off-by: Alexander Gordeev <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
kernel/rcu/tree.c | 24 ++++++++++--------------
1 file changed, 10 insertions(+), 14 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 99431ca4d395..d37a7f80ad9c 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4052,7 +4052,6 @@ static void __init rcu_init_geometry(void)
{
ulong d;
int i;
- int j;
int rcu_capacity[MAX_RCU_LVLS + 1];

/*
@@ -4103,24 +4102,21 @@ static void __init rcu_init_geometry(void)
return;
}

+ /* Calculate the number of levels in the tree. */
+ for (i = 0; nr_cpu_ids > rcu_capacity[i]; i++) {
+ }
+ rcu_num_lvls = i;
+
/* Calculate the number of rcu_nodes at each level of the tree. */
- for (i = 1; i <= MAX_RCU_LVLS; i++)
- if (nr_cpu_ids <= rcu_capacity[i]) {
- for (j = 0; j <= i; j++) {
- int cap = rcu_capacity[i - j];
- num_rcu_lvl[j] = DIV_ROUND_UP(nr_cpu_ids, cap);
- }
- rcu_num_lvls = i;
- for (j = i + 1; j <= MAX_RCU_LVLS; j++)
- num_rcu_lvl[j] = 0;
- break;
- }
+ for (i = 0; i < rcu_num_lvls; i++) {
+ int cap = rcu_capacity[rcu_num_lvls - i];
+ num_rcu_lvl[i] = DIV_ROUND_UP(nr_cpu_ids, cap);
+ }

/* Calculate the total number of rcu_node structures. */
rcu_num_nodes = 0;
- for (i = 0; i <= MAX_RCU_LVLS; i++)
+ for (i = 0; i < rcu_num_lvls; i++)
rcu_num_nodes += num_rcu_lvl[i];
- rcu_num_nodes -= nr_cpu_ids;
}

void __init rcu_init(void)
--
1.8.1.5

2015-05-12 22:33:53

by Paul E. McKenney

Subject: [PATCH tip/core/rcu 07/24] rcu: Simplify rcu_init_geometry() capacity arithmetics

From: Alexander Gordeev <[email protected]>

The current code suggests that adding an extra level to the
rcu_capacity[] array makes some of the arithmetic easier.
In fact, it is rather confusing and unnecessary.
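
The change of indexing can be sanity-checked with a small comparison. The sketch below is not the kernel code: levels_old() and levels_new() are hypothetical helpers that model the level calculation before and after this patch, with the fanouts passed as parameters. For two or more CPUs and a leaf fanout of at least two, both are expected to give the same level count.

```c
#include <assert.h>

#define MAX_LVLS 4	/* hypothetical stand-in for MAX_RCU_LVLS */

/* Old scheme: cap[0] = 1 is an extra entry, level count is the index. */
int levels_old(int ncpus, int leaf_fanout, int fanout)
{
	int cap[MAX_LVLS + 1];
	int i;

	cap[0] = 1;
	cap[1] = leaf_fanout;
	for (i = 2; i <= MAX_LVLS; i++)
		cap[i] = cap[i - 1] * fanout;
	assert(ncpus <= cap[MAX_LVLS]);
	for (i = 0; ncpus > cap[i]; i++)
		;
	return i;
}

/* New scheme: cap[i] is the capacity of an (i + 1)-level tree. */
int levels_new(int ncpus, int leaf_fanout, int fanout)
{
	int cap[MAX_LVLS];
	int i;

	cap[0] = leaf_fanout;
	for (i = 1; i < MAX_LVLS; i++)
		cap[i] = cap[i - 1] * fanout;
	assert(ncpus <= cap[MAX_LVLS - 1]);
	for (i = 0; ncpus > cap[i]; i++)
		;
	return i + 1;
}
```

The new form needs one array element fewer, and the "+ 1" and "- 1" adjustments in the diff below compensate for the shifted index.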

Cc: "Paul E. McKenney" <[email protected]>
Signed-off-by: Alexander Gordeev <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
kernel/rcu/tree.c | 16 +++++++---------
1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index d37a7f80ad9c..2473a60161a3 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4052,7 +4052,7 @@ static void __init rcu_init_geometry(void)
{
ulong d;
int i;
- int rcu_capacity[MAX_RCU_LVLS + 1];
+ int rcu_capacity[MAX_RCU_LVLS];

/*
* Initialize any unspecified boot parameters.
@@ -4076,12 +4076,10 @@ static void __init rcu_init_geometry(void)

/*
* Compute number of nodes that can be handled an rcu_node tree
- * with the given number of levels. Setting rcu_capacity[0] makes
- * some of the arithmetic easier.
+ * with the given number of levels.
*/
- rcu_capacity[0] = 1;
- rcu_capacity[1] = rcu_fanout_leaf;
- for (i = 2; i <= MAX_RCU_LVLS; i++)
+ rcu_capacity[0] = rcu_fanout_leaf;
+ for (i = 1; i < MAX_RCU_LVLS; i++)
rcu_capacity[i] = rcu_capacity[i - 1] * CONFIG_RCU_FANOUT;

/*
@@ -4094,7 +4092,7 @@ static void __init rcu_init_geometry(void)
* the rcu_node masks. Complain and fall back to the compile-
* time values if these limits are exceeded.
*/
- if (nr_cpu_ids > rcu_capacity[MAX_RCU_LVLS])
+ if (nr_cpu_ids > rcu_capacity[MAX_RCU_LVLS - 1])
panic("rcu_init_geometry: rcu_capacity[] is too small");
else if (rcu_fanout_leaf < CONFIG_RCU_FANOUT_LEAF ||
rcu_fanout_leaf > sizeof(unsigned long) * 8) {
@@ -4105,11 +4103,11 @@ static void __init rcu_init_geometry(void)
/* Calculate the number of levels in the tree. */
for (i = 0; nr_cpu_ids > rcu_capacity[i]; i++) {
}
- rcu_num_lvls = i;
+ rcu_num_lvls = i + 1;

/* Calculate the number of rcu_nodes at each level of the tree. */
for (i = 0; i < rcu_num_lvls; i++) {
- int cap = rcu_capacity[rcu_num_lvls - i];
+ int cap = rcu_capacity[(rcu_num_lvls - 1) - i];
num_rcu_lvl[i] = DIV_ROUND_UP(nr_cpu_ids, cap);
}

--
1.8.1.5

2015-05-12 22:37:42

by Paul E. McKenney

Subject: [PATCH tip/core/rcu 08/24] rcu: Limit rcu_state::levelcnt[] to RCU_NUM_LVLS items

From: Alexander Gordeev <[email protected]>

Variable rcu_num_lvls is limited by the RCU_NUM_LVLS macro.
In turn, the rcu_state::levelcnt[] array is never accessed
beyond rcu_num_lvls. Thus, it is safe to limit
rcu_state::levelcnt[] to RCU_NUM_LVLS items.

Since rcu_num_lvls can change during boot (as a result of
updating the rcutree.rcu_fanout_leaf kernel parameter), one might
assume that a new value could exceed RCU_NUM_LVLS.
However, that is not the case: the leaf-level fanout is only
permitted to increase, so rcu_num_lvls can only stay the same
or decrease.
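
The monotonicity argument can be illustrated with a small model. This is a sketch under assumptions, not the kernel calculation: tree_levels() is a hypothetical helper that counts the levels needed for a given CPU count, leaf fanout and interior fanout.

```c
#include <assert.h>

/* Number of tree levels needed to cover ncpus CPUs. */
int tree_levels(int ncpus, int leaf_fanout, int fanout)
{
	long long cap = leaf_fanout;	/* capacity of a one-level tree */
	int lvls = 1;

	assert(ncpus > 0 && leaf_fanout > 0 && fanout > 1);
	while (ncpus > cap) {
		cap *= fanout;		/* each extra level multiplies capacity */
		lvls++;
	}
	return lvls;
}
```

For 4096 CPUs and an interior fanout of 64, raising the leaf fanout from 16 to 64 takes the model from three levels to two. A larger leaf fanout never requires more levels, which is why an array sized for the compile-time level count remains large enough.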

Cc: "Paul E. McKenney" <[email protected]>
Signed-off-by: Alexander Gordeev <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
kernel/rcu/tree.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index a69d3dab2ec4..7194efd3e1d3 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -423,7 +423,7 @@ do { \
struct rcu_state {
struct rcu_node node[NUM_RCU_NODES]; /* Hierarchy. */
struct rcu_node *level[RCU_NUM_LVLS]; /* Hierarchy levels. */
- u32 levelcnt[MAX_RCU_LVLS + 1]; /* # nodes in each level. */
+ u32 levelcnt[RCU_NUM_LVLS]; /* # nodes in each level. */
u8 levelspread[RCU_NUM_LVLS]; /* kids/node in each level. */
u8 flavor_mask; /* bit in flavor mask. */
struct rcu_data __percpu *rda; /* pointer of percu rcu_data. */
--
1.8.1.5

2015-05-12 22:32:26

by Paul E. McKenney

Subject: [PATCH tip/core/rcu 09/24] rcu: Limit rcu_capacity[] size to RCU_NUM_LVLS items

From: Alexander Gordeev <[email protected]>

The number of items in the rcu_capacity[] array is defined by
the MAX_RCU_LVLS macro. However, that array is never accessed
beyond the RCU_NUM_LVLS index. Therefore, we can limit the array
to RCU_NUM_LVLS items and eliminate MAX_RCU_LVLS. As a result,
memory is conserved in most cases.

Cc: "Paul E. McKenney" <[email protected]>
Signed-off-by: Alexander Gordeev <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
kernel/rcu/tree.c | 12 ++++++------
kernel/rcu/tree.h | 1 -
2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 2473a60161a3..69c957ea3723 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3968,19 +3968,19 @@ static void __init rcu_init_one(struct rcu_state *rsp,
"rcu_node_0",
"rcu_node_1",
"rcu_node_2",
- "rcu_node_3" }; /* Match MAX_RCU_LVLS */
+ "rcu_node_3" };
static const char * const fqs[] = {
"rcu_node_fqs_0",
"rcu_node_fqs_1",
"rcu_node_fqs_2",
- "rcu_node_fqs_3" }; /* Match MAX_RCU_LVLS */
+ "rcu_node_fqs_3" };
static u8 fl_mask = 0x1;
int cpustride = 1;
int i;
int j;
struct rcu_node *rnp;

- BUILD_BUG_ON(MAX_RCU_LVLS > ARRAY_SIZE(buf)); /* Fix buf[] init! */
+ BUILD_BUG_ON(RCU_NUM_LVLS > ARRAY_SIZE(buf)); /* Fix buf[] init! */

/* Silence gcc 4.8 false positive about array index out of range. */
if (rcu_num_lvls <= 0 || rcu_num_lvls > RCU_NUM_LVLS)
@@ -4052,7 +4052,7 @@ static void __init rcu_init_geometry(void)
{
ulong d;
int i;
- int rcu_capacity[MAX_RCU_LVLS];
+ int rcu_capacity[RCU_NUM_LVLS];

/*
* Initialize any unspecified boot parameters.
@@ -4079,7 +4079,7 @@ static void __init rcu_init_geometry(void)
* with the given number of levels.
*/
rcu_capacity[0] = rcu_fanout_leaf;
- for (i = 1; i < MAX_RCU_LVLS; i++)
+ for (i = 1; i < RCU_NUM_LVLS; i++)
rcu_capacity[i] = rcu_capacity[i - 1] * CONFIG_RCU_FANOUT;

/*
@@ -4092,7 +4092,7 @@ static void __init rcu_init_geometry(void)
* the rcu_node masks. Complain and fall back to the compile-
* time values if these limits are exceeded.
*/
- if (nr_cpu_ids > rcu_capacity[MAX_RCU_LVLS - 1])
+ if (nr_cpu_ids > rcu_capacity[RCU_NUM_LVLS - 1])
panic("rcu_init_geometry: rcu_capacity[] is too small");
else if (rcu_fanout_leaf < CONFIG_RCU_FANOUT_LEAF ||
rcu_fanout_leaf > sizeof(unsigned long) * 8) {
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 7194efd3e1d3..8c0a27861d2f 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -35,7 +35,6 @@
* In practice, this did work well going from three levels to four.
* Of course, your mileage may vary.
*/
-#define MAX_RCU_LVLS 4
#define RCU_FANOUT_1 (CONFIG_RCU_FANOUT_LEAF)
#define RCU_FANOUT_2 (RCU_FANOUT_1 * CONFIG_RCU_FANOUT)
#define RCU_FANOUT_3 (RCU_FANOUT_2 * CONFIG_RCU_FANOUT)
--
1.8.1.5

2015-05-12 22:34:01

by Paul E. McKenney

Subject: [PATCH tip/core/rcu 10/24] rcu: Remove unnecessary fields from rcu_state structure

From: Alexander Gordeev <[email protected]>

Members rcu_state::levelcnt[] and rcu_state::levelspread[]
are only used at init. There is no reason to keep them
afterwards.
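
The effect of moving the arrays out of the structure can be sketched as a helper that works on caller-owned arrays. The code below is a user-space approximation of the balanced (non-exact) branch only: init_levelspread() is a hypothetical name, and the level count and CPU count are passed as parameters instead of being read from globals.

```c
#include <assert.h>

/*
 * Balanced per-level fanout: walk from the leaves to the root,
 * dividing the number of children by the number of nodes at each
 * level, rounding up.
 */
void init_levelspread(int *levelspread, const int *levelcnt,
		      int num_lvls, int ncpus)
{
	int cprv = ncpus;	/* children below the current level */
	int i;

	assert(num_lvls > 0);
	for (i = num_lvls - 1; i >= 0; i--) {
		int ccur = levelcnt[i];	/* nodes at the current level */

		levelspread[i] = (cprv + ccur - 1) / ccur;
		cprv = ccur;
	}
}
```

A caller can keep both arrays on its stack for the duration of initialization, for example with levelcnt = { 1, 7 } and 100 CPUs, and nothing has to persist in a long-lived structure afterwards.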

Cc: "Paul E. McKenney" <[email protected]>
Signed-off-by: Alexander Gordeev <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
kernel/rcu/tree.c | 27 +++++++++++++++------------
kernel/rcu/tree.h | 2 --
2 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 69c957ea3723..9ae53b08a2fd 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3937,22 +3937,22 @@ void rcu_scheduler_starting(void)
* Compute the per-level fanout, either using the exact fanout specified
* or balancing the tree, depending on CONFIG_RCU_FANOUT_EXACT.
*/
-static void __init rcu_init_levelspread(struct rcu_state *rsp)
+static void __init rcu_init_levelspread(int *levelspread, const int *levelcnt)
{
int i;

if (IS_ENABLED(CONFIG_RCU_FANOUT_EXACT)) {
- rsp->levelspread[rcu_num_lvls - 1] = rcu_fanout_leaf;
+ levelspread[rcu_num_lvls - 1] = rcu_fanout_leaf;
for (i = rcu_num_lvls - 2; i >= 0; i--)
- rsp->levelspread[i] = CONFIG_RCU_FANOUT;
+ levelspread[i] = CONFIG_RCU_FANOUT;
} else {
int ccur;
int cprv;

cprv = nr_cpu_ids;
for (i = rcu_num_lvls - 1; i >= 0; i--) {
- ccur = rsp->levelcnt[i];
- rsp->levelspread[i] = (cprv + ccur - 1) / ccur;
+ ccur = levelcnt[i];
+ levelspread[i] = (cprv + ccur - 1) / ccur;
cprv = ccur;
}
}
@@ -3975,6 +3975,9 @@ static void __init rcu_init_one(struct rcu_state *rsp,
"rcu_node_fqs_2",
"rcu_node_fqs_3" };
static u8 fl_mask = 0x1;
+
+ int levelcnt[RCU_NUM_LVLS]; /* # nodes in each level. */
+ int levelspread[RCU_NUM_LVLS]; /* kids/node in each level. */
int cpustride = 1;
int i;
int j;
@@ -3989,19 +3992,19 @@ static void __init rcu_init_one(struct rcu_state *rsp,
/* Initialize the level-tracking arrays. */

for (i = 0; i < rcu_num_lvls; i++)
- rsp->levelcnt[i] = num_rcu_lvl[i];
+ levelcnt[i] = num_rcu_lvl[i];
for (i = 1; i < rcu_num_lvls; i++)
- rsp->level[i] = rsp->level[i - 1] + rsp->levelcnt[i - 1];
- rcu_init_levelspread(rsp);
+ rsp->level[i] = rsp->level[i - 1] + levelcnt[i - 1];
+ rcu_init_levelspread(levelspread, levelcnt);
rsp->flavor_mask = fl_mask;
fl_mask <<= 1;

/* Initialize the elements themselves, starting from the leaves. */

for (i = rcu_num_lvls - 1; i >= 0; i--) {
- cpustride *= rsp->levelspread[i];
+ cpustride *= levelspread[i];
rnp = rsp->level[i];
- for (j = 0; j < rsp->levelcnt[i]; j++, rnp++) {
+ for (j = 0; j < levelcnt[i]; j++, rnp++) {
raw_spin_lock_init(&rnp->lock);
lockdep_set_class_and_name(&rnp->lock,
&rcu_node_class[i], buf[i]);
@@ -4021,10 +4024,10 @@ static void __init rcu_init_one(struct rcu_state *rsp,
rnp->grpmask = 0;
rnp->parent = NULL;
} else {
- rnp->grpnum = j % rsp->levelspread[i - 1];
+ rnp->grpnum = j % levelspread[i - 1];
rnp->grpmask = 1UL << rnp->grpnum;
rnp->parent = rsp->level[i - 1] +
- j / rsp->levelspread[i - 1];
+ j / levelspread[i - 1];
}
rnp->level = i;
INIT_LIST_HEAD(&rnp->blkd_tasks);
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 8c0a27861d2f..740100a17f72 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -422,8 +422,6 @@ do { \
struct rcu_state {
struct rcu_node node[NUM_RCU_NODES]; /* Hierarchy. */
struct rcu_node *level[RCU_NUM_LVLS]; /* Hierarchy levels. */
- u32 levelcnt[RCU_NUM_LVLS]; /* # nodes in each level. */
- u8 levelspread[RCU_NUM_LVLS]; /* kids/node in each level. */
u8 flavor_mask; /* bit in flavor mask. */
struct rcu_data __percpu *rda; /* pointer of percu rcu_data. */
void (*call)(struct rcu_head *head, /* call_rcu() flavor. */
--
1.8.1.5

2015-05-12 22:34:06

by Paul E. McKenney

Subject: [PATCH tip/core/rcu 11/24] rcu: Limit count of static data to the number of RCU levels

From: Alexander Gordeev <[email protected]>

Although the number of RCU levels may be less than the current
maximum of four, some static data associated with each level
are allocated for all four levels. As a result, the extra data
never get accessed and just waste memory. This update limits
the count of allocated items to the number of used RCU levels.
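
The technique of sizing a static array through a per-configuration initializer macro can be sketched as follows. The names LVLS, NODE_NAME_INIT, node_names and node_name_count() are hypothetical and chosen only for illustration; the sketch covers just one and two levels.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-level configuration; names are illustrative only. */
#define LVLS 2
#if LVLS == 1
# define NODE_NAME_INIT { "node_0" }
#elif LVLS == 2
# define NODE_NAME_INIT { "node_0", "node_1" }
#else
# error "sketch only covers one or two levels"
#endif

/* The array gets exactly as many entries as the initializer supplies. */
static const char * const node_names[] = NODE_NAME_INIT;

/* Compile-time check that the initializer matches the level count. */
static_assert(sizeof(node_names) / sizeof(node_names[0]) == LVLS,
	      "initializer must match level count");

size_t node_name_count(void)
{
	return sizeof(node_names) / sizeof(node_names[0]);
}
```

Since the array is declared without an explicit size, selecting a shorter initializer at preprocessing time is enough to shrink the static data.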

Cc: "Paul E. McKenney" <[email protected]>
Signed-off-by: Alexander Gordeev <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
kernel/rcu/tree.c | 21 ++++-----------------
kernel/rcu/tree.h | 12 ++++++++++++
2 files changed, 16 insertions(+), 17 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 9ae53b08a2fd..56eeb01d581e 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -117,13 +117,8 @@ LIST_HEAD(rcu_struct_flavors);
static int rcu_fanout_leaf = CONFIG_RCU_FANOUT_LEAF;
module_param(rcu_fanout_leaf, int, 0444);
int rcu_num_lvls __read_mostly = RCU_NUM_LVLS;
-static int num_rcu_lvl[] = { /* Number of rcu_nodes at specified level. */
- NUM_RCU_LVL_0,
- NUM_RCU_LVL_1,
- NUM_RCU_LVL_2,
- NUM_RCU_LVL_3,
- NUM_RCU_LVL_4,
-};
+/* Number of rcu_nodes at specified level. */
+static int num_rcu_lvl[] = NUM_RCU_LVL_INIT;
int rcu_num_nodes __read_mostly = NUM_RCU_NODES; /* Total # rcu_nodes in use. */

/*
@@ -3964,16 +3959,8 @@ static void __init rcu_init_levelspread(int *levelspread, const int *levelcnt)
static void __init rcu_init_one(struct rcu_state *rsp,
struct rcu_data __percpu *rda)
{
- static const char * const buf[] = {
- "rcu_node_0",
- "rcu_node_1",
- "rcu_node_2",
- "rcu_node_3" };
- static const char * const fqs[] = {
- "rcu_node_fqs_0",
- "rcu_node_fqs_1",
- "rcu_node_fqs_2",
- "rcu_node_fqs_3" };
+ static const char * const buf[] = RCU_NODE_NAME_INIT;
+ static const char * const fqs[] = RCU_FQS_NAME_INIT;
static u8 fl_mask = 0x1;

int levelcnt[RCU_NUM_LVLS]; /* # nodes in each level. */
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 740100a17f72..c1ac06ad6239 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -47,6 +47,9 @@
# define NUM_RCU_LVL_2 0
# define NUM_RCU_LVL_3 0
# define NUM_RCU_LVL_4 0
+# define NUM_RCU_LVL_INIT { NUM_RCU_LVL_0 }
+# define RCU_NODE_NAME_INIT { "rcu_node_0" }
+# define RCU_FQS_NAME_INIT { "rcu_node_fqs_0" }
#elif NR_CPUS <= RCU_FANOUT_2
# define RCU_NUM_LVLS 2
# define NUM_RCU_LVL_0 1
@@ -54,6 +57,9 @@
# define NUM_RCU_LVL_2 (NR_CPUS)
# define NUM_RCU_LVL_3 0
# define NUM_RCU_LVL_4 0
+# define NUM_RCU_LVL_INIT { NUM_RCU_LVL_0, NUM_RCU_LVL_1 }
+# define RCU_NODE_NAME_INIT { "rcu_node_0", "rcu_node_1" }
+# define RCU_FQS_NAME_INIT { "rcu_node_fqs_0", "rcu_node_fqs_1" }
#elif NR_CPUS <= RCU_FANOUT_3
# define RCU_NUM_LVLS 3
# define NUM_RCU_LVL_0 1
@@ -61,6 +67,9 @@
# define NUM_RCU_LVL_2 DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_1)
# define NUM_RCU_LVL_3 (NR_CPUS)
# define NUM_RCU_LVL_4 0
+# define NUM_RCU_LVL_INIT { NUM_RCU_LVL_0, NUM_RCU_LVL_1, NUM_RCU_LVL_2 }
+# define RCU_NODE_NAME_INIT { "rcu_node_0", "rcu_node_1", "rcu_node_2" }
+# define RCU_FQS_NAME_INIT { "rcu_node_fqs_0", "rcu_node_fqs_1", "rcu_node_fqs_2" }
#elif NR_CPUS <= RCU_FANOUT_4
# define RCU_NUM_LVLS 4
# define NUM_RCU_LVL_0 1
@@ -68,6 +77,9 @@
# define NUM_RCU_LVL_2 DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_2)
# define NUM_RCU_LVL_3 DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_1)
# define NUM_RCU_LVL_4 (NR_CPUS)
+# define NUM_RCU_LVL_INIT { NUM_RCU_LVL_0, NUM_RCU_LVL_1, NUM_RCU_LVL_2, NUM_RCU_LVL_3 }
+# define RCU_NODE_NAME_INIT { "rcu_node_0", "rcu_node_1", "rcu_node_2", "rcu_node_3" }
+# define RCU_FQS_NAME_INIT { "rcu_node_fqs_0", "rcu_node_fqs_1", "rcu_node_fqs_2", "rcu_node_fqs_3" }
#else
# error "CONFIG_RCU_FANOUT insufficient for NR_CPUS"
#endif /* #if (NR_CPUS) <= RCU_FANOUT_1 */
--
1.8.1.5

2015-05-12 22:32:11

by Paul E. McKenney

Subject: [PATCH tip/core/rcu 12/24] rcu: Simplify arithmetic to calculate number of RCU nodes

From: Alexander Gordeev <[email protected]>

This update makes the arithmetic that calculates the number of
RCU nodes more straightforward and easier to read.
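
The equivalence of the old and new arithmetic can be checked on a small example. The macros below are hypothetical stand-ins for a two-level configuration with 100 CPUs and a leaf fanout of 16; they are not the kernel's definitions.

```c
#include <assert.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical two-level configuration, chosen for illustration. */
#define NCPUS		100
#define FANOUT_LEAF	16

#define LVL_0	1
#define LVL_1	DIV_ROUND_UP(NCPUS, FANOUT_LEAF)

/* Old style: a pseudo-level holding the CPUs is added, then removed. */
#define OLD_SUM		(LVL_0 + LVL_1 + NCPUS)
#define OLD_NUM_NODES	(OLD_SUM - NCPUS)

/* New style: sum only the levels that hold node structures. */
#define NEW_NUM_NODES	(LVL_0 + LVL_1)

static_assert(OLD_NUM_NODES == NEW_NUM_NODES, "both forms must agree");

int num_nodes(void)
{
	return NEW_NUM_NODES;
}
```

Both forms give one root plus seven leaves here; the new form simply avoids adding the CPU count and subtracting it again.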

Cc: "Paul E. McKenney" <[email protected]>
Signed-off-by: Alexander Gordeev <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
kernel/rcu/tree.h | 17 ++++-------------
kernel/rcu/tree_plugin.h | 4 ++--
2 files changed, 6 insertions(+), 15 deletions(-)

diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index c1ac06ad6239..5f294267ed20 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -43,10 +43,7 @@
#if NR_CPUS <= RCU_FANOUT_1
# define RCU_NUM_LVLS 1
# define NUM_RCU_LVL_0 1
-# define NUM_RCU_LVL_1 (NR_CPUS)
-# define NUM_RCU_LVL_2 0
-# define NUM_RCU_LVL_3 0
-# define NUM_RCU_LVL_4 0
+# define NUM_RCU_NODES NUM_RCU_LVL_0
# define NUM_RCU_LVL_INIT { NUM_RCU_LVL_0 }
# define RCU_NODE_NAME_INIT { "rcu_node_0" }
# define RCU_FQS_NAME_INIT { "rcu_node_fqs_0" }
@@ -54,9 +51,7 @@
# define RCU_NUM_LVLS 2
# define NUM_RCU_LVL_0 1
# define NUM_RCU_LVL_1 DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_1)
-# define NUM_RCU_LVL_2 (NR_CPUS)
-# define NUM_RCU_LVL_3 0
-# define NUM_RCU_LVL_4 0
+# define NUM_RCU_NODES (NUM_RCU_LVL_0 + NUM_RCU_LVL_1)
# define NUM_RCU_LVL_INIT { NUM_RCU_LVL_0, NUM_RCU_LVL_1 }
# define RCU_NODE_NAME_INIT { "rcu_node_0", "rcu_node_1" }
# define RCU_FQS_NAME_INIT { "rcu_node_fqs_0", "rcu_node_fqs_1" }
@@ -65,8 +60,7 @@
# define NUM_RCU_LVL_0 1
# define NUM_RCU_LVL_1 DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_2)
# define NUM_RCU_LVL_2 DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_1)
-# define NUM_RCU_LVL_3 (NR_CPUS)
-# define NUM_RCU_LVL_4 0
+# define NUM_RCU_NODES (NUM_RCU_LVL_0 + NUM_RCU_LVL_1 + NUM_RCU_LVL_2)
# define NUM_RCU_LVL_INIT { NUM_RCU_LVL_0, NUM_RCU_LVL_1, NUM_RCU_LVL_2 }
# define RCU_NODE_NAME_INIT { "rcu_node_0", "rcu_node_1", "rcu_node_2" }
# define RCU_FQS_NAME_INIT { "rcu_node_fqs_0", "rcu_node_fqs_1", "rcu_node_fqs_2" }
@@ -76,7 +70,7 @@
# define NUM_RCU_LVL_1 DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_3)
# define NUM_RCU_LVL_2 DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_2)
# define NUM_RCU_LVL_3 DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_1)
-# define NUM_RCU_LVL_4 (NR_CPUS)
+# define NUM_RCU_NODES (NUM_RCU_LVL_0 + NUM_RCU_LVL_1 + NUM_RCU_LVL_2 + NUM_RCU_LVL_3)
# define NUM_RCU_LVL_INIT { NUM_RCU_LVL_0, NUM_RCU_LVL_1, NUM_RCU_LVL_2, NUM_RCU_LVL_3 }
# define RCU_NODE_NAME_INIT { "rcu_node_0", "rcu_node_1", "rcu_node_2", "rcu_node_3" }
# define RCU_FQS_NAME_INIT { "rcu_node_fqs_0", "rcu_node_fqs_1", "rcu_node_fqs_2", "rcu_node_fqs_3" }
@@ -84,9 +78,6 @@
# error "CONFIG_RCU_FANOUT insufficient for NR_CPUS"
#endif /* #if (NR_CPUS) <= RCU_FANOUT_1 */

-#define RCU_SUM (NUM_RCU_LVL_0 + NUM_RCU_LVL_1 + NUM_RCU_LVL_2 + NUM_RCU_LVL_3 + NUM_RCU_LVL_4)
-#define NUM_RCU_NODES (RCU_SUM - NR_CPUS)
-
extern int rcu_num_lvls;
extern int rcu_num_nodes;

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 58b1ebdc4387..9562dd76542e 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -74,8 +74,8 @@ static void __init rcu_bootup_announce_oddness(void)
pr_info("\tRCU torture testing starts during boot.\n");
if (IS_ENABLED(CONFIG_RCU_CPU_STALL_INFO))
pr_info("\tAdditional per-CPU info printed with stalls.\n");
- if (NUM_RCU_LVL_4 != 0)
- pr_info("\tFour-level hierarchy is enabled.\n");
+ if (RCU_NUM_LVLS >= 4)
+ pr_info("\tFour(or more)-level hierarchy is enabled.\n");
if (CONFIG_RCU_FANOUT_LEAF != 16)
pr_info("\tBuild-time adjustment of leaf fanout to %d.\n",
CONFIG_RCU_FANOUT_LEAF);
--
1.8.1.5
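
The node-count arithmetic in patch 12/24 above can be sketched in userspace C as follows. This is only an illustration under stated assumptions: div_round_up() and num_rcu_nodes() are hypothetical helpers, not kernel functions, and the tree is described by run-time arguments rather than by the NUM_RCU_LVL_* macros. Each level below the root contributes one ceiling division, and the sum is taken directly instead of adding a per-CPU pseudo-level and then subtracting NR_CPUS.

```c
#include <assert.h>

/* Ceiling division, the same effect as the kernel's DIV_ROUND_UP(). */
static int div_round_up(int n, int d)
{
	assert(d > 0);
	return (n + d - 1) / d;
}

/*
 * Hypothetical helper: count the rcu_node structures in a tree of
 * num_lvls levels (1..4).  Level 0 is the single root node, and the
 * deepest level has one node per leaf_fanout CPUs.
 */
static int num_rcu_nodes(int nr_cpus, int leaf_fanout, int fanout, int num_lvls)
{
	int capacity = leaf_fanout;	/* CPUs covered by one leaf node */
	int total = 1;			/* the root node */
	int lvl;

	assert(num_lvls >= 1 && num_lvls <= 4);
	/* Walk upward from the leaf level, stopping before the root. */
	for (lvl = num_lvls - 1; lvl >= 1; lvl--) {
		total += div_round_up(nr_cpus, capacity);
		capacity *= fanout;	/* one level up covers fanout times more */
	}
	return total;
}
```

For example, 4096 CPUs with a leaf fanout of 16 and an interior fanout of 64 need three levels: 256 leaf nodes, 4 interior nodes, and the root.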

2015-05-12 22:36:02

by Paul E. McKenney

Subject: [PATCH tip/core/rcu 13/24] rcu: Provide diagnostic option to slow down grace-period scans

From: "Paul E. McKenney" <[email protected]>

Grace-period scans of the rcu_node combining tree normally
proceed quite quickly, so that it is very difficult to reproduce
races against them. This commit therefore allows grace-period
pre-initialization and cleanup to be artificially slowed down,
increasing race-reproduction probability. Two pairs of new Kconfig
parameters are provided. RCU_TORTURE_TEST_SLOW_PREINIT enables
slowing of the propagation of CPU-hotplug changes up the combining
tree, and RCU_TORTURE_TEST_SLOW_PREINIT_DELAY specifies that delay
in jiffies. RCU_TORTURE_TEST_SLOW_CLEANUP enables slowing of the
end-of-grace-period cleanup scan, and RCU_TORTURE_TEST_SLOW_CLEANUP_DELAY
specifies that delay in jiffies. The rcutree.gp_preinit_delay and
rcutree.gp_cleanup_delay parameters allow these delays to be overridden
at boot time.

Signed-off-by: Paul E. McKenney <[email protected]>
---
Documentation/kernel-parameters.txt | 16 ++++++-
kernel/rcu/tree.c | 29 ++++++++++--
lib/Kconfig.debug | 54 +++++++++++++++++++++-
.../selftests/rcutorture/configs/rcu/CFcommon | 2 +
4 files changed, 93 insertions(+), 8 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index f6befa9855c1..d4afdbfbc5da 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2992,11 +2992,23 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
Set maximum number of finished RCU callbacks to
process in one batch.

+ rcutree.gp_cleanup_delay= [KNL]
+ Set the number of jiffies to delay each step of
+ RCU grace-period cleanup. This only has effect
+ when CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP is set.
+
rcutree.gp_init_delay= [KNL]
Set the number of jiffies to delay each step of
RCU grace-period initialization. This only has
- effect when CONFIG_RCU_TORTURE_TEST_SLOW_INIT is
- set.
+ effect when CONFIG_RCU_TORTURE_TEST_SLOW_INIT
+ is set.
+
+ rcutree.gp_preinit_delay= [KNL]
+ Set the number of jiffies to delay each step of
+ RCU grace-period pre-initialization, that is,
+ the propagation of recent CPU-hotplug changes up
+ the rcu_node combining tree. This only has effect
+ when CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT is set.

rcutree.rcu_fanout_leaf= [KNL]
Increase the number of CPUs assigned to each
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 56eeb01d581e..69dd18af71da 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -158,6 +158,14 @@ static int kthread_prio = CONFIG_RCU_KTHREAD_PRIO;
module_param(kthread_prio, int, 0644);

/* Delay in jiffies for grace-period initialization delays, debug only. */
+
+#ifdef CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT
+static int gp_preinit_delay = CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT_DELAY;
+module_param(gp_preinit_delay, int, 0644);
+#else /* #ifdef CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT */
+static const int gp_preinit_delay;
+#endif /* #else #ifdef CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT */
+
#ifdef CONFIG_RCU_TORTURE_TEST_SLOW_INIT
static int gp_init_delay = CONFIG_RCU_TORTURE_TEST_SLOW_INIT_DELAY;
module_param(gp_init_delay, int, 0644);
@@ -165,6 +173,13 @@ module_param(gp_init_delay, int, 0644);
static const int gp_init_delay;
#endif /* #else #ifdef CONFIG_RCU_TORTURE_TEST_SLOW_INIT */

+#ifdef CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP
+static int gp_cleanup_delay = CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP_DELAY;
+module_param(gp_cleanup_delay, int, 0644);
+#else /* #ifdef CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP */
+static const int gp_cleanup_delay;
+#endif /* #else #ifdef CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP */
+
/*
* Number of grace periods between delays, normalized by the duration of
* the delay. The longer the the delay, the more the grace periods between
@@ -1737,6 +1752,13 @@ static void note_gp_changes(struct rcu_state *rsp, struct rcu_data *rdp)
rcu_gp_kthread_wake(rsp);
}

+static void rcu_gp_slow(struct rcu_state *rsp, int delay)
+{
+ if (delay > 0 &&
+ !(rsp->gpnum % (rcu_num_nodes * PER_RCU_NODE_PERIOD * delay)))
+ schedule_timeout_uninterruptible(delay);
+}
+
/*
* Initialize a new grace period. Return 0 if no grace period required.
*/
@@ -1779,6 +1801,7 @@ static int rcu_gp_init(struct rcu_state *rsp)
* will handle subsequent offline CPUs.
*/
rcu_for_each_leaf_node(rsp, rnp) {
+ rcu_gp_slow(rsp, gp_preinit_delay);
raw_spin_lock_irq(&rnp->lock);
smp_mb__after_unlock_lock();
if (rnp->qsmaskinit == rnp->qsmaskinitnext &&
@@ -1835,6 +1858,7 @@ static int rcu_gp_init(struct rcu_state *rsp)
* process finishes, because this kthread handles both.
*/
rcu_for_each_node_breadth_first(rsp, rnp) {
+ rcu_gp_slow(rsp, gp_init_delay);
raw_spin_lock_irq(&rnp->lock);
smp_mb__after_unlock_lock();
rdp = this_cpu_ptr(rsp->rda);
@@ -1852,10 +1876,6 @@ static int rcu_gp_init(struct rcu_state *rsp)
raw_spin_unlock_irq(&rnp->lock);
cond_resched_rcu_qs();
WRITE_ONCE(rsp->gp_activity, jiffies);
- if (gp_init_delay > 0 &&
- !(rsp->gpnum %
- (rcu_num_nodes * PER_RCU_NODE_PERIOD * gp_init_delay)))
- schedule_timeout_uninterruptible(gp_init_delay);
}

return 1;
@@ -1950,6 +1970,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
raw_spin_unlock_irq(&rnp->lock);
cond_resched_rcu_qs();
WRITE_ONCE(rsp->gp_activity, jiffies);
+ rcu_gp_slow(rsp, gp_cleanup_delay);
}
rnp = rcu_get_root(rsp);
raw_spin_lock_irq(&rnp->lock);
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index ba2b0c87e65b..e1af93ae246b 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1261,12 +1261,38 @@ config RCU_TORTURE_TEST_RUNNABLE
Say N here if you want the RCU torture tests to start only
after being manually enabled via /proc.

+config RCU_TORTURE_TEST_SLOW_PREINIT
+ bool "Slow down RCU grace-period pre-initialization to expose races"
+ depends on RCU_TORTURE_TEST
+ help
+ This option delays grace-period pre-initialization (the
+ propagation of CPU-hotplug changes up the rcu_node combining
+ tree) for a few jiffies between initializing each pair of
+ consecutive rcu_node structures. This helps to expose races
+ involving grace-period pre-initialization, in other words, it
+ makes your kernel less stable. It can also greatly increase
+ grace-period latency, especially on systems with large numbers
+ of CPUs. This is useful when torture-testing RCU, but in
+ almost no other circumstance.
+
+ Say Y here if you want your system to crash and hang more often.
+ Say N if you want a sane system.
+
+config RCU_TORTURE_TEST_SLOW_PREINIT_DELAY
+ int "How much to slow down RCU grace-period pre-initialization"
+ range 0 5
+ default 3
+ depends on RCU_TORTURE_TEST_SLOW_PREINIT
+ help
+ This option specifies the number of jiffies to wait between
+ each rcu_node structure pre-initialization step.
+
config RCU_TORTURE_TEST_SLOW_INIT
bool "Slow down RCU grace-period initialization to expose races"
depends on RCU_TORTURE_TEST
help
- This option makes grace-period initialization block for a
- few jiffies between initializing each pair of consecutive
+ This option delays grace-period initialization for a few
+ jiffies between initializing each pair of consecutive
rcu_node structures. This helps to expose races involving
grace-period initialization, in other words, it makes your
kernel less stable. It can also greatly increase grace-period
@@ -1286,6 +1312,30 @@ config RCU_TORTURE_TEST_SLOW_INIT_DELAY
This option specifies the number of jiffies to wait between
each rcu_node structure initialization.

+config RCU_TORTURE_TEST_SLOW_CLEANUP
+ bool "Slow down RCU grace-period cleanup to expose races"
+ depends on RCU_TORTURE_TEST
+ help
+ This option delays grace-period cleanup for a few jiffies
+ between cleaning up each pair of consecutive rcu_node
+ structures. This helps to expose races involving grace-period
+ cleanup, in other words, it makes your kernel less stable.
+ It can also greatly increase grace-period latency, especially
+ on systems with large numbers of CPUs. This is useful when
+ torture-testing RCU, but in almost no other circumstance.
+
+ Say Y here if you want your system to crash and hang more often.
+ Say N if you want a sane system.
+
+config RCU_TORTURE_TEST_SLOW_CLEANUP_DELAY
+ int "How much to slow down RCU grace-period cleanup"
+ range 0 5
+ default 3
+ depends on RCU_TORTURE_TEST_SLOW_CLEANUP
+ help
+ This option specifies the number of jiffies to wait between
+ each rcu_node structure cleanup operation.
+
config RCU_CPU_STALL_TIMEOUT
int "RCU CPU stall timeout in seconds"
depends on RCU_STALL_COMMON
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/CFcommon b/tools/testing/selftests/rcutorture/configs/rcu/CFcommon
index 49701218dc62..f824b4c9d9d9 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/CFcommon
+++ b/tools/testing/selftests/rcutorture/configs/rcu/CFcommon
@@ -1,3 +1,5 @@
CONFIG_RCU_TORTURE_TEST=y
CONFIG_PRINTK_TIME=y
+CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP=y
CONFIG_RCU_TORTURE_TEST_SLOW_INIT=y
+CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT=y
--
1.8.1.5
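
The gating test in rcu_gp_slow() from patch 13/24 above can be tried out in userspace with a sketch like the one below. The helper name gp_slow_should_delay() is hypothetical, and DEMO_PER_RCU_NODE_PERIOD is an assumed value, because the real PER_RCU_NODE_PERIOD constant is defined outside the hunks shown here. The point is that a longer delay or a larger tree makes the delay fire on proportionally fewer grace periods, so total slowdown stays roughly constant.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value; the real constant is not part of the hunks shown above. */
#define DEMO_PER_RCU_NODE_PERIOD 3

/*
 * Hypothetical userspace mirror of the condition in rcu_gp_slow():
 * delay only when the grace-period number is a multiple of
 * num_nodes * period * delay, and never when delay is zero or negative.
 */
static bool gp_slow_should_delay(unsigned long gpnum, int num_nodes, int delay)
{
	unsigned long stride;

	assert(num_nodes > 0);
	if (delay <= 0)
		return false;
	stride = (unsigned long)num_nodes * DEMO_PER_RCU_NODE_PERIOD * delay;
	return gpnum % stride == 0;
}
```

With 5 rcu_node structures and a 3-jiffy delay, the stride is 45, so only every 45th grace period would be slowed under these assumptions.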

2015-05-12 22:31:20

by Paul E. McKenney

Subject: [PATCH tip/core/rcu 14/24] rcu: Directly drive TASKS_RCU from Kconfig

From: "Paul E. McKenney" <[email protected]>

Currently, Kconfig will ask the user whether TASKS_RCU should be set.
This is silly because Kconfig already has all the information that it
needs to set this parameter. This commit therefore directly drives
the value of TASKS_RCU via "select" statements. This means that any
subsystem requiring TASKS_RCU will need to add a "select" statement
of its own.

Reported-by: Ingo Molnar <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Cc: Steven Rostedt <[email protected]>
Reviewed-by: Pranith Kumar <[email protected]>
---
init/Kconfig | 4 +---
lib/Kconfig.debug | 1 +
2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index dc24dec60232..73db30a76afa 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -509,7 +509,7 @@ config SRCU
sections.

config TASKS_RCU
- bool "Task_based RCU implementation using voluntary context switch"
+ bool
default n
select SRCU
help
@@ -517,8 +517,6 @@ config TASKS_RCU
only voluntary context switch (not preemption!), idle, and
user-mode execution as quiescent states.

- If unsure, say N.
-
config RCU_STALL_COMMON
def_bool ( TREE_RCU || PREEMPT_RCU || RCU_TRACE )
help
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index e1af93ae246b..c4e1cf04cf57 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1233,6 +1233,7 @@ config RCU_TORTURE_TEST
depends on DEBUG_KERNEL
select TORTURE_TEST
select SRCU
+ select TASKS_RCU
default n
help
This option provides a kernel module that runs torture tests
--
1.8.1.5

2015-05-12 22:35:44

by Paul E. McKenney

Subject: [PATCH tip/core/rcu 15/24] rcu: Directly drive RCU_USER_QS from Kconfig

From: "Paul E. McKenney" <[email protected]>

Currently, Kconfig will ask the user whether RCU_USER_QS should be set.
This is silly because Kconfig already has all the information that it
needs to set this parameter. This commit therefore directly drives
the value of RCU_USER_QS via NO_HZ_FULL's "select" statement.

Reported-by: Ingo Molnar <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Reviewed-by: Pranith Kumar <[email protected]>
---
init/Kconfig | 10 +---------
1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index 73db30a76afa..927210810189 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -529,9 +529,7 @@ config CONTEXT_TRACKING
bool

config RCU_USER_QS
- bool "Consider userspace as in RCU extended quiescent state"
- depends on HAVE_CONTEXT_TRACKING && SMP
- select CONTEXT_TRACKING
+ bool
help
This option sets hooks on kernel / userspace boundaries and
puts RCU in extended quiescent state when the CPU runs in
@@ -539,12 +537,6 @@ config RCU_USER_QS
excluded from the global RCU state machine and thus doesn't
try to keep the timer tick on for RCU.

- Unless you want to hack and help the development of the full
- dynticks mode, you shouldn't enable this option. It also
- adds unnecessary overhead.
-
- If unsure say N
-
config CONTEXT_TRACKING_FORCE
bool "Force context tracking"
depends on CONTEXT_TRACKING
--
1.8.1.5

2015-05-12 22:33:49

by Paul E. McKenney

Subject: [PATCH tip/core/rcu 16/24] rcu: Convert CONFIG_RCU_FANOUT_EXACT to boot parameter

From: "Paul E. McKenney" <[email protected]>

The CONFIG_RCU_FANOUT_EXACT Kconfig parameter is used primarily (and
perhaps only) by rcutorture to verify that RCU works correctly in specific
rcu_node combining-tree configurations. It therefore does not make
much sense to have this as a question to people attempting to configure
their kernels. So this commit creates an rcutree.rcu_fanout_exact=
boot parameter that rcutorture can use, and eliminates the original
CONFIG_RCU_FANOUT_EXACT Kconfig parameter.

Reported-by: Ingo Molnar <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Reviewed-by: Pranith Kumar <[email protected]>
---
Documentation/kernel-parameters.txt | 6 ++++++
init/Kconfig | 14 --------------
kernel/rcu/tree.c | 7 +++++--
kernel/rcu/tree_plugin.h | 2 +-
4 files changed, 12 insertions(+), 17 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index d4afdbfbc5da..73ac8aa84608 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -3010,6 +3010,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
the rcu_node combining tree. This only has effect
when CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT is set.

+ rcutree.rcu_fanout_exact= [KNL]
+ Disable autobalancing of the rcu_node combining
+ tree. This is used by rcutorture, and might
+ possibly be useful for architectures having high
+ cache-to-cache transfer latencies.
+
rcutree.rcu_fanout_leaf= [KNL]
Increase the number of CPUs assigned to each
leaf rcu_node structure. Useful for very large
diff --git a/init/Kconfig b/init/Kconfig
index 927210810189..0ec82362cfc0 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -611,20 +611,6 @@ config RCU_FANOUT_LEAF

Take the default if unsure.

-config RCU_FANOUT_EXACT
- bool "Disable tree-based hierarchical RCU auto-balancing"
- depends on TREE_RCU || PREEMPT_RCU
- default n
- help
- This option forces use of the exact RCU_FANOUT value specified,
- regardless of imbalances in the hierarchy. This is useful for
- testing RCU itself, and might one day be useful on systems with
- strong NUMA behavior.
-
- Without RCU_FANOUT_EXACT, the code will balance the hierarchy.
-
- Say N if unsure.
-
config RCU_FAST_NO_HZ
bool "Accelerate last non-dyntick-idle CPU's grace periods"
depends on NO_HZ_COMMON && SMP
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 69dd18af71da..227b4f2cfb1e 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -113,6 +113,9 @@ RCU_STATE_INITIALIZER(rcu_bh, 'b', call_rcu_bh);
static struct rcu_state *rcu_state_p;
LIST_HEAD(rcu_struct_flavors);

+/* Control rcu_node-tree auto-balancing at boot time. */
+static bool rcu_fanout_exact;
+module_param(rcu_fanout_exact, bool, 0444);
/* Increase (but not decrease) the CONFIG_RCU_FANOUT_LEAF at boot time. */
static int rcu_fanout_leaf = CONFIG_RCU_FANOUT_LEAF;
module_param(rcu_fanout_leaf, int, 0444);
@@ -3951,13 +3954,13 @@ void rcu_scheduler_starting(void)

/*
* Compute the per-level fanout, either using the exact fanout specified
- * or balancing the tree, depending on CONFIG_RCU_FANOUT_EXACT.
+ * or balancing the tree, depending on the rcu_fanout_exact boot parameter.
*/
static void __init rcu_init_levelspread(int *levelspread, const int *levelcnt)
{
int i;

- if (IS_ENABLED(CONFIG_RCU_FANOUT_EXACT)) {
+ if (rcu_fanout_exact) {
levelspread[rcu_num_lvls - 1] = rcu_fanout_leaf;
for (i = rcu_num_lvls - 2; i >= 0; i--)
levelspread[i] = CONFIG_RCU_FANOUT;
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 9562dd76542e..3eb1cda880c8 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -64,7 +64,7 @@ static void __init rcu_bootup_announce_oddness(void)
(!IS_ENABLED(CONFIG_64BIT) && CONFIG_RCU_FANOUT != 32))
pr_info("\tCONFIG_RCU_FANOUT set to non-default value of %d\n",
CONFIG_RCU_FANOUT);
- if (IS_ENABLED(CONFIG_RCU_FANOUT_EXACT))
+ if (rcu_fanout_exact)
pr_info("\tHierarchical RCU autobalancing is disabled.\n");
if (IS_ENABLED(CONFIG_RCU_FAST_NO_HZ))
pr_info("\tRCU dyntick-idle grace-period acceleration is enabled.\n");
--
1.8.1.5
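
To make the effect of rcutree.rcu_fanout_exact from patch 16/24 above concrete, here is a userspace sketch of the two ways of filling in the per-level fanout. The exact branch follows the lines visible in the diff. The balanced branch is not shown in these hunks, so the ceiling-division rule used for it here is an assumption made for illustration, as are the function name init_levelspread() and its explicit parameters.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical userspace version of the levelspread computation.
 * levelcnt[i] is the number of nodes at level i (level 0 is the root).
 */
static void init_levelspread(int *levelspread, const int *levelcnt,
			     int num_lvls, int nr_cpus,
			     int fanout, int fanout_leaf, bool exact)
{
	int i;

	assert(num_lvls >= 1);
	if (exact) {
		/* Use the configured fanouts as-is, balanced or not. */
		levelspread[num_lvls - 1] = fanout_leaf;
		for (i = num_lvls - 2; i >= 0; i--)
			levelspread[i] = fanout;
	} else {
		/* Assumed rule: spread the children evenly over the parents. */
		int cprv = nr_cpus;	/* children below the current level */

		for (i = num_lvls - 1; i >= 0; i--) {
			int ccur = levelcnt[i];

			assert(ccur > 0);
			levelspread[i] = (cprv + ccur - 1) / ccur;
			cprv = ccur;
		}
	}
}
```

With 40 CPUs, one root, and four leaf nodes, the exact setting yields a spread of 64 at the root and 16 at the leaves, while the balanced sketch yields 4 and 10.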

2015-05-12 22:35:50

by Paul E. McKenney

Subject: [PATCH tip/core/rcu 17/24] rcu: Enable diagnostic dump of rcu_node combining tree

From: "Paul E. McKenney" <[email protected]>

The purpose of this commit is to make it easier to verify that RCU's
combining tree is set up correctly, which is useful to have when making
changes in how that tree is initialized.

Signed-off-by: Paul E. McKenney <[email protected]>
Reviewed-by: Pranith Kumar <[email protected]>
[ paulmck: Fold fix found by Fengguang's 0-day test robot. ]
---
Documentation/kernel-parameters.txt | 5 +++++
kernel/rcu/tree.c | 27 +++++++++++++++++++++++++++
2 files changed, 32 insertions(+)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 73ac8aa84608..5a8d9f7a66fe 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2992,6 +2992,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
Set maximum number of finished RCU callbacks to
process in one batch.

+ rcutree.dump_tree= [KNL]
+ Dump the structure of the rcu_node combining tree
+ out at early boot. This is used for diagnostic
+ purposes, to verify correct tree setup.
+
rcutree.gp_cleanup_delay= [KNL]
Set the number of jiffies to delay each step of
RCU grace-period cleanup. This only has effect
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 227b4f2cfb1e..01719402fe8e 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -113,6 +113,9 @@ RCU_STATE_INITIALIZER(rcu_bh, 'b', call_rcu_bh);
static struct rcu_state *rcu_state_p;
LIST_HEAD(rcu_struct_flavors);

+/* Dump rcu_node combining tree at boot to verify correct setup. */
+static bool dump_tree;
+module_param(dump_tree, bool, 0444);
/* Control rcu_node-tree auto-balancing at boot time. */
static bool rcu_fanout_exact;
module_param(rcu_fanout_exact, bool, 0444);
@@ -4131,6 +4134,28 @@ static void __init rcu_init_geometry(void)
rcu_num_nodes += num_rcu_lvl[i];
}

+/*
+ * Dump out the structure of the rcu_node combining tree associated
+ * with the rcu_state structure referenced by rsp.
+ */
+static void __init rcu_dump_rcu_node_tree(struct rcu_state *rsp)
+{
+ int level = 0;
+ struct rcu_node *rnp;
+
+ pr_info("rcu_node tree layout dump\n");
+ pr_info(" ");
+ rcu_for_each_node_breadth_first(rsp, rnp) {
+ if (rnp->level != level) {
+ pr_cont("\n");
+ pr_info(" ");
+ level = rnp->level;
+ }
+ pr_cont("%d:%d ^%d ", rnp->grplo, rnp->grphi, rnp->grpnum);
+ }
+ pr_cont("\n");
+}
+
void __init rcu_init(void)
{
int cpu;
@@ -4141,6 +4166,8 @@ void __init rcu_init(void)
rcu_init_geometry();
rcu_init_one(&rcu_bh_state, &rcu_bh_data);
rcu_init_one(&rcu_sched_state, &rcu_sched_data);
+ if (dump_tree)
+ rcu_dump_rcu_node_tree(&rcu_sched_state);
__rcu_init_preempt();
open_softirq(RCU_SOFTIRQ, rcu_process_callbacks);

--
1.8.1.5
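
The output format produced by rcu_dump_rcu_node_tree() in patch 17/24 above can be previewed with a userspace sketch. Everything here is illustrative: struct demo_node and dump_tree() are hypothetical stand-ins, the nodes are assumed to be supplied already in breadth-first order, and the text goes into a caller-supplied buffer instead of the kernel log. As in the patch, each node prints as "grplo:grphi ^grpnum" and a change of level starts a new line.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in carrying only the fields that the dump prints. */
struct demo_node {
	int level;	/* 0 for the root */
	int grplo;	/* lowest-numbered CPU covered */
	int grphi;	/* highest-numbered CPU covered */
	int grpnum;	/* position within the parent */
};

/*
 * Format n nodes, given in breadth-first order, into buf with one
 * tree level per line.  Returns the number of characters written.
 */
static size_t dump_tree(const struct demo_node *nodes, int n,
			char *buf, size_t len)
{
	int level = 0;
	size_t off = 0;
	int i;

	assert(len > 0);
	buf[0] = '\0';
	for (i = 0; i < n; i++) {
		int w;

		if (nodes[i].level != level) {
			/* Level changed: start a new output line. */
			w = snprintf(buf + off, len - off, "\n");
			assert(w >= 0 && (size_t)w < len - off);
			off += (size_t)w;
			level = nodes[i].level;
		}
		w = snprintf(buf + off, len - off, "%d:%d ^%d ",
			     nodes[i].grplo, nodes[i].grphi, nodes[i].grpnum);
		assert(w >= 0 && (size_t)w < len - off);
		off += (size_t)w;
	}
	return off;
}
```

For a 32-CPU tree with one root and two leaves, this produces a first line of "0:31 ^0 " and a second line of "0:15 ^0 16:31 ^1 ".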

2015-05-12 22:33:57

by Paul E. McKenney

Subject: [PATCH tip/core/rcu 18/24] rcu: Create RCU_EXPERT Kconfig and hide booleans behind it

From: "Paul E. McKenney" <[email protected]>

This commit creates an RCU_EXPERT Kconfig and hides the independent
boolean RCU-related user-visible Kconfig parameters behind it, namely
RCU_FAST_NO_HZ and RCU_BOOST. This prevents Kconfig from asking about
these parameters unless the user really wants to be asked.

Reported-by: Linus Torvalds <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Reviewed-by: Pranith Kumar <[email protected]>
---
init/Kconfig | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index 0ec82362cfc0..7eb4c7b3543c 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -501,6 +501,21 @@ config TINY_RCU

endchoice

+config RCU_EXPERT
+ bool "Make expert-level adjustments to RCU configuration"
+ default n
+ help
+ This option needs to be enabled if you wish to make
+ expert-level adjustments to RCU configuration. By default,
+ no such adjustments can be made, which has the often-beneficial
+ side-effect of preventing "make oldconfig" from asking you all
+ sorts of detailed questions about how you would like numerous
+ obscure RCU options to be set up.
+
+ Say Y if you need to make expert-level adjustments to RCU.
+
+ Say N if you are unsure.
+
config SRCU
bool
help
@@ -613,7 +628,7 @@ config RCU_FANOUT_LEAF

config RCU_FAST_NO_HZ
bool "Accelerate last non-dyntick-idle CPU's grace periods"
- depends on NO_HZ_COMMON && SMP
+ depends on NO_HZ_COMMON && SMP && RCU_EXPERT
default n
help
This option permits CPUs to enter dynticks-idle state even if
@@ -639,7 +654,7 @@ config TREE_RCU_TRACE

config RCU_BOOST
bool "Enable RCU priority boosting"
- depends on RT_MUTEXES && PREEMPT_RCU
+ depends on RT_MUTEXES && PREEMPT_RCU && RCU_EXPERT
default n
help
This option boosts the priority of preempted RCU readers that
--
1.8.1.5

2015-05-12 22:32:30

by Paul E. McKenney

Subject: [PATCH tip/core/rcu 19/24] rcu: Break dependency of RCU_FANOUT_LEAF on RCU_FANOUT

From: "Paul E. McKenney" <[email protected]>

RCU_FANOUT_LEAF's range and default values depend on the value of
RCU_FANOUT, which at the time seemed like a cute way to save two lines
of Kconfig code. However, adding a dependency from both of these
Kconfig parameters on RCU_EXPERT requires that RCU_FANOUT_LEAF operate
correctly even if RCU_FANOUT is undefined. This commit therefore
allows RCU_FANOUT_LEAF to take on the full range of permitted values,
even in cases where RCU_FANOUT is undefined.

Signed-off-by: Paul E. McKenney <[email protected]>
[ paulmck: Eliminate redundant "default" as suggested by Pranith Kumar. ]
Reviewed-by: Pranith Kumar <[email protected]>
---
init/Kconfig | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index 7eb4c7b3543c..ac5386937d37 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -601,8 +601,8 @@ config RCU_FANOUT

config RCU_FANOUT_LEAF
int "Tree-based hierarchical RCU leaf-level fanout value"
- range 2 RCU_FANOUT if 64BIT
- range 2 RCU_FANOUT if !64BIT
+ range 2 64 if 64BIT
+ range 2 32 if !64BIT
depends on TREE_RCU || PREEMPT_RCU
default 16
help
--
1.8.1.5

2015-05-12 22:33:51

by Paul E. McKenney

Subject: [PATCH tip/core/rcu 20/24] rcu: Make RCU able to tolerate undefined CONFIG_RCU_FANOUT

From: "Paul E. McKenney" <[email protected]>

This commit introduces an RCU_FANOUT C-preprocessor macro so that RCU will
build even when CONFIG_RCU_FANOUT is undefined. The RCU_FANOUT macro is
set to the value of CONFIG_RCU_FANOUT when defined, otherwise it is set
to 32 for 32-bit systems and 64 for 64-bit systems. This commit then
makes CONFIG_RCU_FANOUT depend on CONFIG_RCU_EXPERT, so that Kconfig
users won't be asked about CONFIG_RCU_FANOUT unless they want to be.

Reported-by: Ingo Molnar <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Reviewed-by: Pranith Kumar <[email protected]>
---
init/Kconfig | 2 +-
kernel/rcu/tree.c | 4 ++--
kernel/rcu/tree.h | 17 ++++++++++++++---
kernel/rcu/tree_plugin.h | 6 +++---
4 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index ac5386937d37..fd2d4fb517ca 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -583,7 +583,7 @@ config RCU_FANOUT
int "Tree-based hierarchical RCU fanout value"
range 2 64 if 64BIT
range 2 32 if !64BIT
- depends on TREE_RCU || PREEMPT_RCU
+ depends on (TREE_RCU || PREEMPT_RCU) && RCU_EXPERT
default 64 if 64BIT
default 32 if !64BIT
help
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 01719402fe8e..4b27a7c0926f 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3966,7 +3966,7 @@ static void __init rcu_init_levelspread(int *levelspread, const int *levelcnt)
if (rcu_fanout_exact) {
levelspread[rcu_num_lvls - 1] = rcu_fanout_leaf;
for (i = rcu_num_lvls - 2; i >= 0; i--)
- levelspread[i] = CONFIG_RCU_FANOUT;
+ levelspread[i] = RCU_FANOUT;
} else {
int ccur;
int cprv;
@@ -4097,7 +4097,7 @@ static void __init rcu_init_geometry(void)
*/
rcu_capacity[0] = rcu_fanout_leaf;
for (i = 1; i < RCU_NUM_LVLS; i++)
- rcu_capacity[i] = rcu_capacity[i - 1] * CONFIG_RCU_FANOUT;
+ rcu_capacity[i] = rcu_capacity[i - 1] * RCU_FANOUT;

/*
* The tree must be able to accommodate the configured number of CPUs.
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 5f294267ed20..7ca17b646428 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -35,10 +35,21 @@
* In practice, this did work well going from three levels to four.
* Of course, your mileage may vary.
*/
+
+#ifdef CONFIG_RCU_FANOUT
+#define RCU_FANOUT CONFIG_RCU_FANOUT
+#else /* #ifdef CONFIG_RCU_FANOUT */
+# ifdef CONFIG_64BIT
+# define RCU_FANOUT 64
+# else
+# define RCU_FANOUT 32
+# endif
+#endif /* #else #ifdef CONFIG_RCU_FANOUT */
+
#define RCU_FANOUT_1 (CONFIG_RCU_FANOUT_LEAF)
-#define RCU_FANOUT_2 (RCU_FANOUT_1 * CONFIG_RCU_FANOUT)
-#define RCU_FANOUT_3 (RCU_FANOUT_2 * CONFIG_RCU_FANOUT)
-#define RCU_FANOUT_4 (RCU_FANOUT_3 * CONFIG_RCU_FANOUT)
+#define RCU_FANOUT_2 (RCU_FANOUT_1 * RCU_FANOUT)
+#define RCU_FANOUT_3 (RCU_FANOUT_2 * RCU_FANOUT)
+#define RCU_FANOUT_4 (RCU_FANOUT_3 * RCU_FANOUT)

#if NR_CPUS <= RCU_FANOUT_1
# define RCU_NUM_LVLS 1
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 3eb1cda880c8..ca0f54cbba0e 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -60,10 +60,10 @@ static void __init rcu_bootup_announce_oddness(void)
{
if (IS_ENABLED(CONFIG_RCU_TRACE))
pr_info("\tRCU debugfs-based tracing is enabled.\n");
- if ((IS_ENABLED(CONFIG_64BIT) && CONFIG_RCU_FANOUT != 64) ||
- (!IS_ENABLED(CONFIG_64BIT) && CONFIG_RCU_FANOUT != 32))
+ if ((IS_ENABLED(CONFIG_64BIT) && RCU_FANOUT != 64) ||
+ (!IS_ENABLED(CONFIG_64BIT) && RCU_FANOUT != 32))
pr_info("\tCONFIG_RCU_FANOUT set to non-default value of %d\n",
- CONFIG_RCU_FANOUT);
+ RCU_FANOUT);
if (rcu_fanout_exact)
pr_info("\tHierarchical RCU autobalancing is disabled.\n");
if (IS_ENABLED(CONFIG_RCU_FAST_NO_HZ))
--
1.8.1.5
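
The fallback pattern from patch 20/24 above can be exercised outside the kernel with a sketch like this one. All DEMO_ names are hypothetical stand-ins for the Kconfig symbols and tree.h macros, and demo_num_lvls() is a run-time rendering of the #if ladder that tree.h evaluates at build time. Leaving DEMO_CONFIG_RCU_FANOUT undefined simulates a kernel built without RCU_EXPERT.

```c
#include <assert.h>

/* Hypothetical stand-ins for Kconfig output; edit to simulate a .config. */
/* #define DEMO_CONFIG_RCU_FANOUT 8 */
#define DEMO_CONFIG_64BIT 1
#define DEMO_CONFIG_RCU_FANOUT_LEAF 16

/* Use the Kconfig value if present, else a word-size-based default. */
#ifdef DEMO_CONFIG_RCU_FANOUT
# define DEMO_RCU_FANOUT DEMO_CONFIG_RCU_FANOUT
#else
# ifdef DEMO_CONFIG_64BIT
#  define DEMO_RCU_FANOUT 64
# else
#  define DEMO_RCU_FANOUT 32
# endif
#endif

/* CPUs covered by a tree of one, two, three, and four levels. */
#define DEMO_RCU_FANOUT_1 (DEMO_CONFIG_RCU_FANOUT_LEAF)
#define DEMO_RCU_FANOUT_2 (DEMO_RCU_FANOUT_1 * DEMO_RCU_FANOUT)
#define DEMO_RCU_FANOUT_3 (DEMO_RCU_FANOUT_2 * DEMO_RCU_FANOUT)
#define DEMO_RCU_FANOUT_4 (DEMO_RCU_FANOUT_3 * DEMO_RCU_FANOUT)

/* Levels needed for nr_cpus, or -1 if four levels are not enough. */
static int demo_num_lvls(int nr_cpus)
{
	assert(nr_cpus > 0);
	if (nr_cpus <= DEMO_RCU_FANOUT_1)
		return 1;
	if (nr_cpus <= DEMO_RCU_FANOUT_2)
		return 2;
	if (nr_cpus <= DEMO_RCU_FANOUT_3)
		return 3;
	if (nr_cpus <= DEMO_RCU_FANOUT_4)
		return 4;
	return -1;	/* the kernel header rejects this case with #error */
}
```

Because the capacity macros are written in terms of DEMO_RCU_FANOUT rather than the raw configuration symbol, they keep working whether or not that symbol is defined, which is the property the patch is after.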

2015-05-12 22:32:18

by Paul E. McKenney

Subject: [PATCH tip/core/rcu 21/24] rcu: Make RCU able to tolerate undefined CONFIG_RCU_FANOUT_LEAF

From: "Paul E. McKenney" <[email protected]>

This commit introduces an RCU_FANOUT_LEAF C-preprocessor macro so
that RCU will build even when CONFIG_RCU_FANOUT_LEAF is undefined.
The RCU_FANOUT_LEAF macro is set to the value of CONFIG_RCU_FANOUT_LEAF
when defined, otherwise it is set to 32 for 32-bit systems and 64 for
64-bit systems. This commit then makes CONFIG_RCU_FANOUT_LEAF depend
on CONFIG_RCU_EXPERT, so that Kconfig users won't be asked about
CONFIG_RCU_FANOUT_LEAF unless they want to be.

Reported-by: Ingo Molnar <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Reviewed-by: Pranith Kumar <[email protected]>
---
init/Kconfig | 2 +-
kernel/rcu/tree.c | 8 ++++----
kernel/rcu/tree.h | 12 +++++++++++-
kernel/rcu/tree_plugin.h | 6 +++---
4 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index fd2d4fb517ca..78176001f73b 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -603,7 +603,7 @@ config RCU_FANOUT_LEAF
int "Tree-based hierarchical RCU leaf-level fanout value"
range 2 64 if 64BIT
range 2 32 if !64BIT
- depends on TREE_RCU || PREEMPT_RCU
+ depends on (TREE_RCU || PREEMPT_RCU) && RCU_EXPERT
default 16
help
This option controls the leaf-level fanout of hierarchical
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 4b27a7c0926f..960054da4ae5 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -119,8 +119,8 @@ module_param(dump_tree, bool, 0444);
/* Control rcu_node-tree auto-balancing at boot time. */
static bool rcu_fanout_exact;
module_param(rcu_fanout_exact, bool, 0444);
-/* Increase (but not decrease) the CONFIG_RCU_FANOUT_LEAF at boot time. */
-static int rcu_fanout_leaf = CONFIG_RCU_FANOUT_LEAF;
+/* Increase (but not decrease) the RCU_FANOUT_LEAF at boot time. */
+static int rcu_fanout_leaf = RCU_FANOUT_LEAF;
module_param(rcu_fanout_leaf, int, 0444);
int rcu_num_lvls __read_mostly = RCU_NUM_LVLS;
/* Number of rcu_nodes at specified level. */
@@ -4085,7 +4085,7 @@ static void __init rcu_init_geometry(void)
jiffies_till_next_fqs = d;

/* If the compile-time values are accurate, just leave. */
- if (rcu_fanout_leaf == CONFIG_RCU_FANOUT_LEAF &&
+ if (rcu_fanout_leaf == RCU_FANOUT_LEAF &&
nr_cpu_ids == NR_CPUS)
return;
pr_info("RCU: Adjusting geometry for rcu_fanout_leaf=%d, nr_cpu_ids=%d\n",
@@ -4111,7 +4111,7 @@ static void __init rcu_init_geometry(void)
*/
if (nr_cpu_ids > rcu_capacity[RCU_NUM_LVLS - 1])
panic("rcu_init_geometry: rcu_capacity[] is too small");
- else if (rcu_fanout_leaf < CONFIG_RCU_FANOUT_LEAF ||
+ else if (rcu_fanout_leaf < RCU_FANOUT_LEAF ||
rcu_fanout_leaf > sizeof(unsigned long) * 8) {
WARN_ON(1);
return;
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 7ca17b646428..23234dbc1070 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -46,7 +46,17 @@
# endif
#endif /* #else #ifdef CONFIG_RCU_FANOUT */

-#define RCU_FANOUT_1 (CONFIG_RCU_FANOUT_LEAF)
+#ifdef CONFIG_RCU_FANOUT_LEAF
+#define RCU_FANOUT_LEAF CONFIG_RCU_FANOUT_LEAF
+#else /* #ifdef CONFIG_RCU_FANOUT_LEAF */
+# ifdef CONFIG_64BIT
+# define RCU_FANOUT_LEAF 64
+# else
+# define RCU_FANOUT_LEAF 32
+# endif
+#endif /* #else #ifdef CONFIG_RCU_FANOUT_LEAF */
+
+#define RCU_FANOUT_1 (RCU_FANOUT_LEAF)
#define RCU_FANOUT_2 (RCU_FANOUT_1 * RCU_FANOUT)
#define RCU_FANOUT_3 (RCU_FANOUT_2 * RCU_FANOUT)
#define RCU_FANOUT_4 (RCU_FANOUT_3 * RCU_FANOUT)
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index ca0f54cbba0e..9634b48ae084 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -76,10 +76,10 @@ static void __init rcu_bootup_announce_oddness(void)
pr_info("\tAdditional per-CPU info printed with stalls.\n");
if (RCU_NUM_LVLS >= 4)
pr_info("\tFour(or more)-level hierarchy is enabled.\n");
- if (CONFIG_RCU_FANOUT_LEAF != 16)
+ if (RCU_FANOUT_LEAF != 16)
pr_info("\tBuild-time adjustment of leaf fanout to %d.\n",
- CONFIG_RCU_FANOUT_LEAF);
- if (rcu_fanout_leaf != CONFIG_RCU_FANOUT_LEAF)
+ RCU_FANOUT_LEAF);
+ if (rcu_fanout_leaf != RCU_FANOUT_LEAF)
pr_info("\tBoot-time adjustment of leaf fanout to %d.\n", rcu_fanout_leaf);
if (nr_cpu_ids != NR_CPUS)
pr_info("\tRCU restricting CPUs from NR_CPUS=%d to nr_cpu_ids=%d.\n", NR_CPUS, nr_cpu_ids);
--
1.8.1.5
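
As a rough illustration of the range check in the rcu_init_geometry() hunk
above: the boot-time rcu_fanout_leaf may raise the leaf fanout but not lower
it, and it may not exceed the number of bits in an unsigned long. The names
below (SKETCH_FANOUT_LEAF, sketch_leaf_fanout_ok) are hypothetical, and the
build-time value of 16 is only an assumed example, not something this patch
fixes.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for the build-time RCU_FANOUT_LEAF value. */
#define SKETCH_FANOUT_LEAF 16

/* Bits in an unsigned long: 64 on LP64 targets, 32 on ILP32 ones. */
#define SKETCH_BITS_PER_LONG ((int)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Mirror of the range check in the hunk above: the requested value must
 * be at least the build-time value and at most the width of an
 * unsigned long.
 */
static bool sketch_leaf_fanout_ok(int requested)
{
	return requested >= SKETCH_FANOUT_LEAF &&
	       requested <= SKETCH_BITS_PER_LONG;
}
```

With these assumptions, a request of 32 is accepted, while 15, 65, and any
negative value are rejected.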

2015-05-12 22:35:41

by Paul E. McKenney

Subject: [PATCH tip/core/rcu 22/24] rcu: Make RCU able to tolerate undefined CONFIG_RCU_KTHREAD_PRIO

From: "Paul E. McKenney" <[email protected]>

This commit updates the initialization of the kthread_prio boot parameter
so that RCU will build even when CONFIG_RCU_KTHREAD_PRIO is undefined.
The kthread_prio boot parameter is set to CONFIG_RCU_KTHREAD_PRIO if
that is defined, otherwise to 1 if CONFIG_RCU_BOOST is defined and
to zero otherwise. This commit then makes CONFIG_RCU_KTHREAD_PRIO
depend on CONFIG_RCU_EXPERT, so that Kconfig users won't be asked about
CONFIG_RCU_KTHREAD_PRIO unless they want to be.

Reported-by: Linus Torvalds <[email protected]>
Reported-by: Ingo Molnar <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Reviewed-by: Pranith Kumar <[email protected]>
---
init/Kconfig | 1 +
kernel/rcu/tree.c | 4 ++++
2 files changed, 5 insertions(+)

diff --git a/init/Kconfig b/init/Kconfig
index 78176001f73b..af2c93c4a105 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -671,6 +671,7 @@ config RCU_KTHREAD_PRIO
range 0 99 if !RCU_BOOST
default 1 if RCU_BOOST
default 0 if !RCU_BOOST
+ depends on RCU_EXPERT
help
This option specifies the SCHED_FIFO priority value that will be
assigned to the rcuc/n and rcub/n threads and is also the value
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 960054da4ae5..0ab9e711a649 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -160,7 +160,11 @@ static void invoke_rcu_core(void);
static void invoke_rcu_callbacks(struct rcu_state *rsp, struct rcu_data *rdp);

/* rcuc/rcub kthread realtime priority */
+#ifdef CONFIG_RCU_KTHREAD_PRIO
static int kthread_prio = CONFIG_RCU_KTHREAD_PRIO;
+#else /* #ifdef CONFIG_RCU_KTHREAD_PRIO */
+static int kthread_prio = IS_ENABLED(CONFIG_RCU_BOOST) ? 1 : 0;
+#endif /* #else #ifdef CONFIG_RCU_KTHREAD_PRIO */
module_param(kthread_prio, int, 0644);

/* Delay in jiffies for grace-period initialization delays, debug only. */
--
1.8.1.5
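
The fallback described in the commit log can be tried outside the kernel
with a simplified imitation of IS_ENABLED(). This is only a sketch: the
SKETCH_* macros are hypothetical stand-ins, and the real IS_ENABLED() also
handles options built as modules, which is left out here.

```c
/* Simplified userspace imitation of the kernel's IS_ENABLED() trick. */
#define SKETCH_PLACEHOLDER_1 0,
#define SKETCH_SECOND(ignored, val, ...) val
#define SKETCH_ENABLED__(arg_or_junk) SKETCH_SECOND(arg_or_junk 1, 0, 0)
#define SKETCH_ENABLED_(val) SKETCH_ENABLED__(SKETCH_PLACEHOLDER_##val)
#define SKETCH_ENABLED(option) SKETCH_ENABLED_(option)

/* Pretend Kconfig set RCU_BOOST=y but left RCU_KTHREAD_PRIO undefined. */
#define SKETCH_RCU_BOOST 1

#ifdef SKETCH_RCU_KTHREAD_PRIO
static int sketch_kthread_prio = SKETCH_RCU_KTHREAD_PRIO;
#else
/* Same shape as the added line in tree.c: 1 with boosting, else 0. */
static int sketch_kthread_prio = SKETCH_ENABLED(SKETCH_RCU_BOOST) ? 1 : 0;
#endif

/* An option that is never defined evaluates to 0 instead of breaking the build. */
static int sketch_unset_option = SKETCH_ENABLED(SKETCH_NOT_DEFINED_ANYWHERE);
```

The point of the design is that the macro yields a constant 0 or 1 even when
the option name is not defined at all, so the initializer remains a valid
constant expression.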

2015-05-12 22:32:23

by Paul E. McKenney

Subject: [PATCH tip/core/rcu 23/24] rcu: Remove prompt for RCU implementation

From: Pranith Kumar <[email protected]>

The RCU implementation is chosen based on PREEMPT and SMP config options
and is not really a user-selectable choice. This commit removes the
menu entry, given that there is not much point in calling something a
choice when there is in fact no choice. The TINY_RCU, TREE_RCU, and
PREEMPT_RCU Kconfig options continue to be selected based solely on the
values of the PREEMPT and SMP options.

Signed-off-by: Pranith Kumar <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
init/Kconfig | 18 ++++++------------
1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index af2c93c4a105..4c08197044f1 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -465,13 +465,9 @@ endmenu # "CPU/Task time and stats accounting"

menu "RCU Subsystem"

-choice
- prompt "RCU Implementation"
- default TREE_RCU
-
config TREE_RCU
- bool "Tree-based hierarchical RCU"
- depends on !PREEMPT && SMP
+ bool
+ default y if !PREEMPT && SMP
help
This option selects the RCU implementation that is
designed for very large SMP system with hundreds or
@@ -479,8 +475,8 @@ config TREE_RCU
smaller systems.

config PREEMPT_RCU
- bool "Preemptible tree-based hierarchical RCU"
- depends on PREEMPT
+ bool
+ default y if PREEMPT
help
This option selects the RCU implementation that is
designed for very large SMP systems with hundreds or
@@ -491,16 +487,14 @@ config PREEMPT_RCU
Select this option if you are unsure.

config TINY_RCU
- bool "UP-only small-memory-footprint RCU"
- depends on !PREEMPT && !SMP
+ bool
+ default y if !PREEMPT && !SMP
help
This option selects the RCU implementation that is
designed for UP systems from which real-time response
is not required. This option greatly reduces the
memory footprint of RCU.

-endchoice
-
config RCU_EXPERT
bool "Make expert-level adjustments to RCU configuration"
default n
--
1.8.1.5
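
The three "default y if ..." lines in the diff above amount to a small
decision table over PREEMPT and SMP. A sketch of that table follows; the
enum and function names are hypothetical and exist only for illustration.

```c
#include <stdbool.h>

enum sketch_rcu_flavor {
	SKETCH_TINY_RCU,
	SKETCH_TREE_RCU,
	SKETCH_PREEMPT_RCU
};

/* Pick the RCU flavor the way the new Kconfig defaults do. */
static enum sketch_rcu_flavor sketch_pick_rcu(bool preempt, bool smp)
{
	if (preempt)
		return SKETCH_PREEMPT_RCU;	/* default y if PREEMPT */
	if (smp)
		return SKETCH_TREE_RCU;		/* default y if !PREEMPT && SMP */
	return SKETCH_TINY_RCU;			/* default y if !PREEMPT && !SMP */
}
```

Every combination of the two inputs maps to exactly one flavor, which is why
a "choice" menu adds nothing.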

2015-05-12 22:35:57

by Paul E. McKenney

Subject: [PATCH tip/core/rcu 24/24] rcu: Conditionally compile RCU's eqs warnings

From: "Paul E. McKenney" <[email protected]>

This commit applies some warning-omission micro-optimizations to RCU's
various extended-quiescent-state functions, which are on the kernel/user
hotpath for CONFIG_NO_HZ_FULL=y.

Reported-by: Rik van Riel <[email protected]>
Reported-by: Mike Galbraith <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
kernel/rcu/tree.c | 23 +++++++++++++++--------
lib/Kconfig.debug | 11 +++++++++++
2 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 0ab9e711a649..01a4d2a0046e 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -615,7 +615,8 @@ static void rcu_eqs_enter_common(long long oldval, bool user)
struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);

trace_rcu_dyntick(TPS("Start"), oldval, rdtp->dynticks_nesting);
- if (!user && !is_idle_task(current)) {
+ if (IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
+ !user && !is_idle_task(current)) {
struct task_struct *idle __maybe_unused =
idle_task(smp_processor_id());

@@ -634,7 +635,8 @@ static void rcu_eqs_enter_common(long long oldval, bool user)
smp_mb__before_atomic(); /* See above. */
atomic_inc(&rdtp->dynticks);
smp_mb__after_atomic(); /* Force ordering with next sojourn. */
- WARN_ON_ONCE(atomic_read(&rdtp->dynticks) & 0x1);
+ WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
+ atomic_read(&rdtp->dynticks) & 0x1);
rcu_dynticks_task_enter();

/*
@@ -660,7 +662,8 @@ static void rcu_eqs_enter(bool user)

rdtp = this_cpu_ptr(&rcu_dynticks);
oldval = rdtp->dynticks_nesting;
- WARN_ON_ONCE((oldval & DYNTICK_TASK_NEST_MASK) == 0);
+ WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
+ (oldval & DYNTICK_TASK_NEST_MASK) == 0);
if ((oldval & DYNTICK_TASK_NEST_MASK) == DYNTICK_TASK_NEST_VALUE) {
rdtp->dynticks_nesting = 0;
rcu_eqs_enter_common(oldval, user);
@@ -733,7 +736,8 @@ void rcu_irq_exit(void)
rdtp = this_cpu_ptr(&rcu_dynticks);
oldval = rdtp->dynticks_nesting;
rdtp->dynticks_nesting--;
- WARN_ON_ONCE(rdtp->dynticks_nesting < 0);
+ WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
+ rdtp->dynticks_nesting < 0);
if (rdtp->dynticks_nesting)
trace_rcu_dyntick(TPS("--="), oldval, rdtp->dynticks_nesting);
else
@@ -758,10 +762,12 @@ static void rcu_eqs_exit_common(long long oldval, int user)
atomic_inc(&rdtp->dynticks);
/* CPUs seeing atomic_inc() must see later RCU read-side crit sects */
smp_mb__after_atomic(); /* See above. */
- WARN_ON_ONCE(!(atomic_read(&rdtp->dynticks) & 0x1));
+ WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
+ !(atomic_read(&rdtp->dynticks) & 0x1));
rcu_cleanup_after_idle();
trace_rcu_dyntick(TPS("End"), oldval, rdtp->dynticks_nesting);
- if (!user && !is_idle_task(current)) {
+ if (IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
+ !user && !is_idle_task(current)) {
struct task_struct *idle __maybe_unused =
idle_task(smp_processor_id());

@@ -785,7 +791,7 @@ static void rcu_eqs_exit(bool user)

rdtp = this_cpu_ptr(&rcu_dynticks);
oldval = rdtp->dynticks_nesting;
- WARN_ON_ONCE(oldval < 0);
+ WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && oldval < 0);
if (oldval & DYNTICK_TASK_NEST_MASK) {
rdtp->dynticks_nesting += DYNTICK_TASK_NEST_VALUE;
} else {
@@ -858,7 +864,8 @@ void rcu_irq_enter(void)
rdtp = this_cpu_ptr(&rcu_dynticks);
oldval = rdtp->dynticks_nesting;
rdtp->dynticks_nesting++;
- WARN_ON_ONCE(rdtp->dynticks_nesting == 0);
+ WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
+ rdtp->dynticks_nesting == 0);
if (oldval)
trace_rcu_dyntick(TPS("++="), oldval, rdtp->dynticks_nesting);
else
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index c4e1cf04cf57..b908048f8d6a 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1373,6 +1373,17 @@ config RCU_TRACE
Say Y here if you want to enable RCU tracing
Say N if you are unsure.

+config RCU_EQS_DEBUG
+ bool "Use this when adding any sort of NO_HZ support to your arch"
+ depends on DEBUG_KERNEL
+ help
+ This option provides consistency checks in RCU's handling of
+ NO_HZ. These checks have proven quite helpful in detecting
+ bugs in arch-specific NO_HZ code.
+
+ Say N here if you need ultimate kernel/user switch latencies
+ Say Y if you are unsure
+
endmenu # "RCU Debugging"

config DEBUG_BLOCK_EXT_DEVT
--
1.8.1.5
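
The pattern used throughout the tree.c hunks above is to put a compile-time
constant in front of the runtime condition, so that the && short-circuits
and the check can be dropped when debugging is off. A minimal userspace
sketch, with hypothetical names and a plain 0/1 macro standing in for
IS_ENABLED(CONFIG_RCU_EQS_DEBUG):

```c
#include <stdbool.h>

/* Hypothetical build-time switch standing in for CONFIG_RCU_EQS_DEBUG. */
#ifndef SKETCH_EQS_DEBUG
#define SKETCH_EQS_DEBUG 0
#endif

static int sketch_checks_run;	/* how many times the costly check executed */
static int sketch_warnings;	/* how many warnings fired */

/* Stand-in for a consistency check that costs something at run time. */
static bool sketch_costly_check(int dynticks)
{
	sketch_checks_run++;
	return dynticks & 0x1;
}

static void sketch_eqs_enter(int dynticks)
{
	/*
	 * With SKETCH_EQS_DEBUG == 0 the && short-circuits on a constant,
	 * so the check is never evaluated and the compiler is free to
	 * discard it entirely.
	 */
	if (SKETCH_EQS_DEBUG && sketch_costly_check(dynticks))
		sketch_warnings++;
}
```

Unlike an #ifdef block, the guarded expression is still parsed and
type-checked in every configuration.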

2015-05-13 00:37:58

by Frederic Weisbecker

Subject: Re: [PATCH tip/core/rcu 15/24] rcu: Directly drive RCU_USER_QS from Kconfig

On Tue, May 12, 2015 at 03:30:45PM -0700, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <[email protected]>
>
> Currently, Kconfig will ask the user whether RCU_USER_QS should be set.
> This is silly because Kconfig already has all the information that it
> needs to set this parameter. This commit therefore directly drives
> the value of RCU_USER_QS via NO_HZ_FULL's "select" statement.
>
> Reported-by: Ingo Molnar <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>
> Cc: Frederic Weisbecker <[email protected]>

ACK. And we should remove it completely and use NO_HZ_FULL instead.
There don't seem to be any other users.

> Reviewed-by: Pranith Kumar <[email protected]>
> ---
> init/Kconfig | 10 +---------
> 1 file changed, 1 insertion(+), 9 deletions(-)
>
> diff --git a/init/Kconfig b/init/Kconfig
> index 73db30a76afa..927210810189 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -529,9 +529,7 @@ config CONTEXT_TRACKING
> bool
>
> config RCU_USER_QS
> - bool "Consider userspace as in RCU extended quiescent state"
> - depends on HAVE_CONTEXT_TRACKING && SMP
> - select CONTEXT_TRACKING
> + bool
> help
> This option sets hooks on kernel / userspace boundaries and
> puts RCU in extended quiescent state when the CPU runs in
> @@ -539,12 +537,6 @@ config RCU_USER_QS
> excluded from the global RCU state machine and thus doesn't
> try to keep the timer tick on for RCU.
>
> - Unless you want to hack and help the development of the full
> - dynticks mode, you shouldn't enable this option. It also
> - adds unnecessary overhead.
> -
> - If unsure say N
> -
> config CONTEXT_TRACKING_FORCE
> bool "Force context tracking"
> depends on CONTEXT_TRACKING
> --
> 1.8.1.5
>

2015-05-13 02:59:21

by Steven Rostedt

Subject: Re: [PATCH tip/core/rcu 01/24] rcu: Control grace-period delays directly from value

On Tue, 12 May 2015 15:30:31 -0700
"Paul E. McKenney" <[email protected]> wrote:

>
> Reported-by: Linus Torvalds <[email protected]> Signed-off-by:
> Paul E. McKenney <[email protected]>
> ---

Line wrap issues?

-- Steve

2015-05-13 03:22:19

by Steven Rostedt

Subject: Re: [PATCH tip/core/rcu 06/24] rcu: Cleanup rcu_init_geometry() code and arithmetics

On Tue, 12 May 2015 15:30:36 -0700
"Paul E. McKenney" <[email protected]> wrote:

> @@ -4103,24 +4102,21 @@ static void __init rcu_init_geometry(void)
> return;
> }
>
> + /* Calculate the number of levels in the tree. */
> + for (i = 0; nr_cpu_ids > rcu_capacity[i]; i++) {

Should this start at i = 1 as it used to? Also, should there be a safety
check too:

for (i = 1; i <= MAX_RCU_LVLS && nr_cpu_ids > rcu_capacity[i]; i++) {


> + }
> + rcu_num_lvls = i;
> +
> /* Calculate the number of rcu_nodes at each level of the tree. */
> - for (i = 1; i <= MAX_RCU_LVLS; i++)
> - if (nr_cpu_ids <= rcu_capacity[i]) {
> - for (j = 0; j <= i; j++) {
> - int cap = rcu_capacity[i - j];
> - num_rcu_lvl[j] = DIV_ROUND_UP(nr_cpu_ids, cap);
> - }
> - rcu_num_lvls = i;
> - for (j = i + 1; j <= MAX_RCU_LVLS; j++)
> - num_rcu_lvl[j] = 0;
> - break;
> - }
> + for (i = 0; i < rcu_num_lvls; i++) {

Hmm, up above we have: for (j = 0; j <= i; j++)

and now we have rcu_num_lvls = i, so shouldn't this be:

for (i = 0; i <= rcu_num_lvls; i++)

?

-- Steve

> + int cap = rcu_capacity[rcu_num_lvls - i];
> + num_rcu_lvl[i] = DIV_ROUND_UP(nr_cpu_ids, cap);
> + }
>
> /* Calculate the total number of rcu_node structures. */
> rcu_num_nodes = 0;
> - for (i = 0; i <= MAX_RCU_LVLS; i++)
> + for (i = 0; i < rcu_num_lvls; i++)
> rcu_num_nodes += num_rcu_lvl[i];
> - rcu_num_nodes -= nr_cpu_ids;
> }
>
> void __init rcu_init(void)
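
One way to look at the "<" versus "<=" question is to compare the node
counts produced by the removed and the added loops. The sketch below is a
reading of the quoted hunk, not an answer given in this message. It assumes
that rcu_capacity[0] is 1 (a single CPU) and uses illustrative fanouts of
16 and 64; all sketch_* names are hypothetical.

```c
#define SKETCH_MAX_LVLS 4
#define SKETCH_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Assumed layout: capacity[0] = 1 CPU, capacity[1] = one leaf, and so on. */
static const int sketch_capacity[SKETCH_MAX_LVLS + 1] = {
	1, 16, 16 * 64, 16 * 64 * 64, 16 * 64 * 64 * 64
};

/* Levels as the removed loop computed them (search starts at index 1). */
static int sketch_levels(int nr_cpus)
{
	int i;

	for (i = 1; i <= SKETCH_MAX_LVLS; i++)
		if (nr_cpus <= sketch_capacity[i])
			return i;
	return -1;	/* too many CPUs; the kernel panics before this */
}

/* Removed code: sum levels 0..lvls inclusive, then subtract the CPUs. */
static int sketch_nodes_old(int nr_cpus)
{
	int lvls = sketch_levels(nr_cpus), j, nodes = 0;

	for (j = 0; j <= lvls; j++)
		nodes += SKETCH_DIV_ROUND_UP(nr_cpus, sketch_capacity[lvls - j]);
	return nodes - nr_cpus;
}

/* Added code: sum levels 0..lvls-1 only, with no subtraction. */
static int sketch_nodes_new(int nr_cpus)
{
	int lvls = sketch_levels(nr_cpus), i, nodes = 0;

	for (i = 0; i < lvls; i++)
		nodes += SKETCH_DIV_ROUND_UP(nr_cpus, sketch_capacity[lvls - i]);
	return nodes;
}
```

Under these assumptions the extra iteration of the old loop only adds
nr_cpu_ids, which the old code then subtracted again, so both versions
return the same total whenever they agree on the number of levels.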

2015-05-13 13:21:07

by Paul E. McKenney

Subject: Re: [PATCH tip/core/rcu 06/24] rcu: Cleanup rcu_init_geometry() code and arithmetics

On Tue, May 12, 2015 at 11:22:32PM -0400, Steven Rostedt wrote:
> On Tue, 12 May 2015 15:30:36 -0700
> "Paul E. McKenney" <[email protected]> wrote:
>
> > @@ -4103,24 +4102,21 @@ static void __init rcu_init_geometry(void)
> > return;
> > }
> >
> > + /* Calculate the number of levels in the tree. */
> > + for (i = 0; nr_cpu_ids > rcu_capacity[i]; i++) {
>
> Should this start at i = 1 as it used to? Also, should there be a safety
> check too:
>
> for (i = 1; i <= MAX_RCU_LVLS && nr_cpu_ids > rcu_capacity[i]; i++) {

Alexander, these questions are for you.

Thanx, Paul

> > + }
> > + rcu_num_lvls = i;
> > +
> > /* Calculate the number of rcu_nodes at each level of the tree. */
> > - for (i = 1; i <= MAX_RCU_LVLS; i++)
> > - if (nr_cpu_ids <= rcu_capacity[i]) {
> > - for (j = 0; j <= i; j++) {
> > - int cap = rcu_capacity[i - j];
> > - num_rcu_lvl[j] = DIV_ROUND_UP(nr_cpu_ids, cap);
> > - }
> > - rcu_num_lvls = i;
> > - for (j = i + 1; j <= MAX_RCU_LVLS; j++)
> > - num_rcu_lvl[j] = 0;
> > - break;
> > - }
> > + for (i = 0; i < rcu_num_lvls; i++) {
>
> Hmm, up above we have: for (j = 0; j <= i; j++)
>
> and now we have rcu_num_lvls = i, so shouldn't this be:
>
> for (i = 0; i <= rcu_num_lvls; i++)
>
> ?
>
> -- Steve
>
> > + int cap = rcu_capacity[rcu_num_lvls - i];
> > + num_rcu_lvl[i] = DIV_ROUND_UP(nr_cpu_ids, cap);
> > + }
> >
> > /* Calculate the total number of rcu_node structures. */
> > rcu_num_nodes = 0;
> > - for (i = 0; i <= MAX_RCU_LVLS; i++)
> > + for (i = 0; i < rcu_num_lvls; i++)
> > rcu_num_nodes += num_rcu_lvl[i];
> > - rcu_num_nodes -= nr_cpu_ids;
> > }
> >
> > void __init rcu_init(void)
>

2015-05-13 13:31:29

by Steven Rostedt

Subject: Re: [PATCH tip/core/rcu 14/24] rcu: Directly drive TASKS_RCU from Kconfig

On Tue, 12 May 2015 15:30:44 -0700
"Paul E. McKenney" <[email protected]> wrote:

> From: "Paul E. McKenney" <[email protected]>
>
> Currently, Kconfig will ask the user whether TASKS_RCU should be set.
> This is silly because Kconfig already has all the information that it
> needs to set this parameter. This commit therefore directly drives
> the value of TASKS_RCU via "select" statements. Which means that
> as subsystems require TASKS_RCU, those subsystems will need to add
> "select" statements of their own.
>
> Reported-by: Ingo Molnar <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>
> Cc: Steven Rostedt <[email protected]>

Which reminds me. I need to write the code to implement the usage of
this :-)

-- Steve

> Reviewed-by: Pranith Kumar <[email protected]>
> ---

2015-05-13 13:46:54

by Steven Rostedt

Subject: Re: [PATCH tip/core/rcu 24/24] rcu: Conditionally compile RCU's eqs warnings

On Tue, 12 May 2015 15:30:54 -0700
"Paul E. McKenney" <[email protected]> wrote:

> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -1373,6 +1373,17 @@ config RCU_TRACE
> Say Y here if you want to enable RCU tracing
> Say N if you are unsure.
>
> +config RCU_EQS_DEBUG
> + bool "Use this when adding any sort of NO_HZ support to your arch"
> + depends on DEBUG_KERNEL

Should we add "depends on NO_HZ" ?

-- Steve

> + help
> + This option provides consistency checks in RCU's handling of
> + NO_HZ. These checks have proven quite helpful in detecting
> + bugs in arch-specific NO_HZ code.
> +
> + Say N here if you need ultimate kernel/user switch latencies
> + Say Y if you are unsure
> +
> endmenu # "RCU Debugging"
>
> config DEBUG_BLOCK_EXT_DEVT

2015-05-13 20:58:50

by Paul E. McKenney

Subject: Re: [PATCH tip/core/rcu 24/24] rcu: Conditionally compile RCU's eqs warnings

On Wed, May 13, 2015 at 09:46:48AM -0400, Steven Rostedt wrote:
> On Tue, 12 May 2015 15:30:54 -0700
> "Paul E. McKenney" <[email protected]> wrote:
>
> > --- a/lib/Kconfig.debug
> > +++ b/lib/Kconfig.debug
> > @@ -1373,6 +1373,17 @@ config RCU_TRACE
> > Say Y here if you want to enable RCU tracing
> > Say N if you are unsure.
> >
> > +config RCU_EQS_DEBUG
> > + bool "Use this when adding any sort of NO_HZ support to your arch"
> > + depends on DEBUG_KERNEL
>
> Should we add "depends on NO_HZ" ?

No, because RCU now uses the EQS code even for NO_HZ_PERIODIC to detect
idle time.

Thanx, Paul

> -- Steve
>
> > + help
> > + This option provides consistency checks in RCU's handling of
> > + NO_HZ. These checks have proven quite helpful in detecting
> > + bugs in arch-specific NO_HZ code.
> > +
> > + Say N here if you need ultimate kernel/user switch latencies
> > + Say Y if you are unsure
> > +
> > endmenu # "RCU Debugging"
> >
> > config DEBUG_BLOCK_EXT_DEVT
>

2015-05-13 20:58:43

by Paul E. McKenney

Subject: Re: [PATCH tip/core/rcu 14/24] rcu: Directly drive TASKS_RCU from Kconfig

On Wed, May 13, 2015 at 09:31:19AM -0400, Steven Rostedt wrote:
> On Tue, 12 May 2015 15:30:44 -0700
> "Paul E. McKenney" <[email protected]> wrote:
>
> > From: "Paul E. McKenney" <[email protected]>
> >
> > Currently, Kconfig will ask the user whether TASKS_RCU should be set.
> > This is silly because Kconfig already has all the information that it
> > needs to set this parameter. This commit therefore directly drives
> > the value of TASKS_RCU via "select" statements. Which means that
> > as subsystems require TASKS_RCU, those subsystems will need to add
> > "select" statements of their own.
> >
> > Reported-by: Ingo Molnar <[email protected]>
> > Signed-off-by: Paul E. McKenney <[email protected]>
> > Cc: Steven Rostedt <[email protected]>
>
> Which reminds me. I need to write the code to implement the usage of
> this :-)

Indeed, before the unused-feature mafia forces me to remove it. ;-)

Thanx, Paul

2015-05-13 21:00:24

by Paul E. McKenney

Subject: Re: [PATCH tip/core/rcu 01/24] rcu: Control grace-period delays directly from value

On Tue, May 12, 2015 at 10:59:29PM -0400, Steven Rostedt wrote:
> On Tue, 12 May 2015 15:30:31 -0700
> "Paul E. McKenney" <[email protected]> wrote:
>
> >
> > Reported-by: Linus Torvalds <[email protected]> Signed-off-by:
> > Paul E. McKenney <[email protected]>
> > ---
>
> Line wrap issues?

Not sure how I managed to do that... Good catch, fixed!

Thanx, Paul

2015-05-13 20:58:39

by Paul E. McKenney

Subject: Re: [PATCH tip/core/rcu 15/24] rcu: Directly drive RCU_USER_QS from Kconfig

On Wed, May 13, 2015 at 02:37:52AM +0200, Frederic Weisbecker wrote:
> On Tue, May 12, 2015 at 03:30:45PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" <[email protected]>
> >
> > Currently, Kconfig will ask the user whether RCU_USER_QS should be set.
> > This is silly because Kconfig already has all the information that it
> > needs to set this parameter. This commit therefore directly drives
> > the value of RCU_USER_QS via NO_HZ_FULL's "select" statement.
> >
> > Reported-by: Ingo Molnar <[email protected]>
> > Signed-off-by: Paul E. McKenney <[email protected]>
> > Cc: Frederic Weisbecker <[email protected]>
>
> ACK. And we should remove it completely and use NO_HZ_FULL instead.
> There don't seem to be any other users.

Good point! I have queued the patch shown below for 4.3.

Thanx, Paul

------------------------------------------------------------------------

rcu: Drop RCU_USER_QS in favor of NO_HZ_FULL

The RCU_USER_QS Kconfig parameter is now just a synonym for NO_HZ_FULL,
so this commit eliminates RCU_USER_QS, replacing all uses with NO_HZ_FULL.

Reported-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 03a899aabd17..18e377b92875 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -307,7 +307,7 @@ static inline void rcu_sysrq_end(void)
}
#endif /* #else #ifdef CONFIG_RCU_STALL_COMMON */

-#ifdef CONFIG_RCU_USER_QS
+#ifdef CONFIG_NO_HZ_FULL
void rcu_user_enter(void);
void rcu_user_exit(void);
#else
@@ -315,7 +315,7 @@ static inline void rcu_user_enter(void) { }
static inline void rcu_user_exit(void) { }
static inline void rcu_user_hooks_switch(struct task_struct *prev,
struct task_struct *next) { }
-#endif /* CONFIG_RCU_USER_QS */
+#endif /* CONFIG_NO_HZ_FULL */

#ifdef CONFIG_RCU_NOCB_CPU
void rcu_init_nohz(void);
diff --git a/init/Kconfig b/init/Kconfig
index 4c08197044f1..5b8726c10685 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -537,15 +537,6 @@ config RCU_STALL_COMMON
config CONTEXT_TRACKING
bool

-config RCU_USER_QS
- bool
- help
- This option sets hooks on kernel / userspace boundaries and
- puts RCU in extended quiescent state when the CPU runs in
- userspace. It means that when a CPU runs in userspace, it is
- excluded from the global RCU state machine and thus doesn't
- try to keep the timer tick on for RCU.
-
config CONTEXT_TRACKING_FORCE
bool "Force context tracking"
depends on CONTEXT_TRACKING
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 7651d7dd982c..012cbee9d354 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -696,7 +696,7 @@ void rcu_idle_enter(void)
}
EXPORT_SYMBOL_GPL(rcu_idle_enter);

-#ifdef CONFIG_RCU_USER_QS
+#ifdef CONFIG_NO_HZ_FULL
/**
* rcu_user_enter - inform RCU that we are resuming userspace.
*
@@ -709,7 +709,7 @@ void rcu_user_enter(void)
{
rcu_eqs_enter(1);
}
-#endif /* CONFIG_RCU_USER_QS */
+#endif /* CONFIG_NO_HZ_FULL */

/**
* rcu_irq_exit - inform RCU that current CPU is exiting irq towards idle
@@ -823,7 +823,7 @@ void rcu_idle_exit(void)
}
EXPORT_SYMBOL_GPL(rcu_idle_exit);

-#ifdef CONFIG_RCU_USER_QS
+#ifdef CONFIG_NO_HZ_FULL
/**
* rcu_user_exit - inform RCU that we are exiting userspace.
*
@@ -834,7 +834,7 @@ void rcu_user_exit(void)
{
rcu_eqs_exit(1);
}
-#endif /* CONFIG_RCU_USER_QS */
+#endif /* CONFIG_NO_HZ_FULL */

/**
* rcu_irq_enter - inform RCU that current CPU is entering irq away from idle
diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
index 579ce1b929af..4008d9f95dd7 100644
--- a/kernel/time/Kconfig
+++ b/kernel/time/Kconfig
@@ -92,12 +92,10 @@ config NO_HZ_FULL
depends on !ARCH_USES_GETTIMEOFFSET && GENERIC_CLOCKEVENTS
# We need at least one periodic CPU for timekeeping
depends on SMP
- # RCU_USER_QS dependency
depends on HAVE_CONTEXT_TRACKING
# VIRT_CPU_ACCOUNTING_GEN dependency
depends on HAVE_VIRT_CPU_ACCOUNTING_GEN
select NO_HZ_COMMON
- select RCU_USER_QS
select RCU_NOCB_CPU
select VIRT_CPU_ACCOUNTING_GEN
select IRQ_WORK
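
The rcupdate.h hunk above relies on a common header pattern: real functions
when the option is set, empty inline stubs otherwise, so that callers need
no conditional compilation of their own. A small sketch of that pattern
follows; SKETCH_NO_HZ_FULL and the sketch_* functions are hypothetical
stand-ins, not the kernel's definitions.

```c
/* Hypothetical switch standing in for CONFIG_NO_HZ_FULL; off by default. */
/* #define SKETCH_NO_HZ_FULL 1 */

static int sketch_user_transitions;	/* counts real enter/exit hooks */

#ifdef SKETCH_NO_HZ_FULL
static void sketch_user_enter(void) { sketch_user_transitions++; }
static void sketch_user_exit(void) { sketch_user_transitions++; }
#else
/* Empty inline stubs: callers need no #ifdef of their own. */
static inline void sketch_user_enter(void) { }
static inline void sketch_user_exit(void) { }
#endif

/* A caller that compiles unchanged under either configuration. */
static int sketch_syscall_round_trip(void)
{
	sketch_user_exit();	/* entering the kernel from userspace */
	sketch_user_enter();	/* returning to userspace */
	return sketch_user_transitions;
}
```

With the switch left undefined, the round trip records no transitions and
the stubs compile away.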

2015-05-14 00:28:36

by Frederic Weisbecker

Subject: Re: [PATCH tip/core/rcu 15/24] rcu: Directly drive RCU_USER_QS from Kconfig

On Wed, May 13, 2015 at 10:45:05AM -0700, Paul E. McKenney wrote:
> On Wed, May 13, 2015 at 02:37:52AM +0200, Frederic Weisbecker wrote:
> > On Tue, May 12, 2015 at 03:30:45PM -0700, Paul E. McKenney wrote:
> > > From: "Paul E. McKenney" <[email protected]>
> > >
> > > Currently, Kconfig will ask the user whether RCU_USER_QS should be set.
> > > This is silly because Kconfig already has all the information that it
> > > needs to set this parameter. This commit therefore directly drives
> > > the value of RCU_USER_QS via NO_HZ_FULL's "select" statement.
> > >
> > > Reported-by: Ingo Molnar <[email protected]>
> > > Signed-off-by: Paul E. McKenney <[email protected]>
> > > Cc: Frederic Weisbecker <[email protected]>
> >
> > ACK. And we should remove it completely and use NO_HZ_FULL instead.
> > There don't seem to be any other users.
>
> Good point! I have queued the patch shown below for 4.3.
>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> rcu: Drop RCU_USER_QS in favor of NO_HZ_FULL
>
> The RCU_USER_QS Kconfig parameter is now just a synonym for NO_HZ_FULL,
> so this commit eliminates RCU_USER_QS, replacing all uses with NO_HZ_FULL.
>
> Reported-by: Frederic Weisbecker <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>

Excellent! ACK+!

Thanks.

>
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index 03a899aabd17..18e377b92875 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -307,7 +307,7 @@ static inline void rcu_sysrq_end(void)
> }
> #endif /* #else #ifdef CONFIG_RCU_STALL_COMMON */
>
> -#ifdef CONFIG_RCU_USER_QS
> +#ifdef CONFIG_NO_HZ_FULL
> void rcu_user_enter(void);
> void rcu_user_exit(void);
> #else
> @@ -315,7 +315,7 @@ static inline void rcu_user_enter(void) { }
> static inline void rcu_user_exit(void) { }
> static inline void rcu_user_hooks_switch(struct task_struct *prev,
> struct task_struct *next) { }
> -#endif /* CONFIG_RCU_USER_QS */
> +#endif /* CONFIG_NO_HZ_FULL */
>
> #ifdef CONFIG_RCU_NOCB_CPU
> void rcu_init_nohz(void);
> diff --git a/init/Kconfig b/init/Kconfig
> index 4c08197044f1..5b8726c10685 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -537,15 +537,6 @@ config RCU_STALL_COMMON
> config CONTEXT_TRACKING
> bool
>
> -config RCU_USER_QS
> - bool
> - help
> - This option sets hooks on kernel / userspace boundaries and
> - puts RCU in extended quiescent state when the CPU runs in
> - userspace. It means that when a CPU runs in userspace, it is
> - excluded from the global RCU state machine and thus doesn't
> - try to keep the timer tick on for RCU.
> -
> config CONTEXT_TRACKING_FORCE
> bool "Force context tracking"
> depends on CONTEXT_TRACKING
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 7651d7dd982c..012cbee9d354 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -696,7 +696,7 @@ void rcu_idle_enter(void)
> }
> EXPORT_SYMBOL_GPL(rcu_idle_enter);
>
> -#ifdef CONFIG_RCU_USER_QS
> +#ifdef CONFIG_NO_HZ_FULL
> /**
> * rcu_user_enter - inform RCU that we are resuming userspace.
> *
> @@ -709,7 +709,7 @@ void rcu_user_enter(void)
> {
> rcu_eqs_enter(1);
> }
> -#endif /* CONFIG_RCU_USER_QS */
> +#endif /* CONFIG_NO_HZ_FULL */
>
> /**
> * rcu_irq_exit - inform RCU that current CPU is exiting irq towards idle
> @@ -823,7 +823,7 @@ void rcu_idle_exit(void)
> }
> EXPORT_SYMBOL_GPL(rcu_idle_exit);
>
> -#ifdef CONFIG_RCU_USER_QS
> +#ifdef CONFIG_NO_HZ_FULL
> /**
> * rcu_user_exit - inform RCU that we are exiting userspace.
> *
> @@ -834,7 +834,7 @@ void rcu_user_exit(void)
> {
> rcu_eqs_exit(1);
> }
> -#endif /* CONFIG_RCU_USER_QS */
> +#endif /* CONFIG_NO_HZ_FULL */
>
> /**
> * rcu_irq_enter - inform RCU that current CPU is entering irq away from idle
> diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
> index 579ce1b929af..4008d9f95dd7 100644
> --- a/kernel/time/Kconfig
> +++ b/kernel/time/Kconfig
> @@ -92,12 +92,10 @@ config NO_HZ_FULL
> depends on !ARCH_USES_GETTIMEOFFSET && GENERIC_CLOCKEVENTS
> # We need at least one periodic CPU for timekeeping
> depends on SMP
> - # RCU_USER_QS dependency
> depends on HAVE_CONTEXT_TRACKING
> # VIRT_CPU_ACCOUNTING_GEN dependency
> depends on HAVE_VIRT_CPU_ACCOUNTING_GEN
> select NO_HZ_COMMON
> - select RCU_USER_QS
> select RCU_NOCB_CPU
> select VIRT_CPU_ACCOUNTING_GEN
> select IRQ_WORK
>

2015-05-14 21:12:33

by Paul E. McKenney

Subject: Re: [PATCH tip/core/rcu 15/24] rcu: Directly drive RCU_USER_QS from Kconfig

On Thu, May 14, 2015 at 02:27:48AM +0200, Frederic Weisbecker wrote:
> On Wed, May 13, 2015 at 10:45:05AM -0700, Paul E. McKenney wrote:
> > On Wed, May 13, 2015 at 02:37:52AM +0200, Frederic Weisbecker wrote:
> > > On Tue, May 12, 2015 at 03:30:45PM -0700, Paul E. McKenney wrote:
> > > > From: "Paul E. McKenney" <[email protected]>
> > > >
> > > > Currently, Kconfig will ask the user whether RCU_USER_QS should be set.
> > > > This is silly because Kconfig already has all the information that it
> > > > needs to set this parameter. This commit therefore directly drives
> > > > the value of RCU_USER_QS via NO_HZ_FULL's "select" statement.
> > > >
> > > > Reported-by: Ingo Molnar <[email protected]>
> > > > Signed-off-by: Paul E. McKenney <[email protected]>
> > > > Cc: Frederic Weisbecker <[email protected]>
> > >
> > > ACK. And we should remove it completely and use NO_HZ_FULL instead.
> > > There don't seem to be any other users.
> >
> > Good point! I have queued the patch shown below for 4.3.
> >
> > Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > rcu: Drop RCU_USER_QS in favor of NO_HZ_FULL
> >
> > The RCU_USER_QS Kconfig parameter is now just a synonym for NO_HZ_FULL,
> > so this commit eliminates RCU_USER_QS, replacing all uses with NO_HZ_FULL.
> >
> > Reported-by: Frederic Weisbecker <[email protected]>
> > Signed-off-by: Paul E. McKenney <[email protected]>
>
> Excellent! ACK+!

Very good! I applied your Acked-by to both patches, please let me know
if you intended something else.

Thanx, Paul

> Thanks.
>
> >
> > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > index 03a899aabd17..18e377b92875 100644
> > --- a/include/linux/rcupdate.h
> > +++ b/include/linux/rcupdate.h
> > @@ -307,7 +307,7 @@ static inline void rcu_sysrq_end(void)
> > }
> > #endif /* #else #ifdef CONFIG_RCU_STALL_COMMON */
> >
> > -#ifdef CONFIG_RCU_USER_QS
> > +#ifdef CONFIG_NO_HZ_FULL
> > void rcu_user_enter(void);
> > void rcu_user_exit(void);
> > #else
> > @@ -315,7 +315,7 @@ static inline void rcu_user_enter(void) { }
> > static inline void rcu_user_exit(void) { }
> > static inline void rcu_user_hooks_switch(struct task_struct *prev,
> > struct task_struct *next) { }
> > -#endif /* CONFIG_RCU_USER_QS */
> > +#endif /* CONFIG_NO_HZ_FULL */
> >
> > #ifdef CONFIG_RCU_NOCB_CPU
> > void rcu_init_nohz(void);
> > diff --git a/init/Kconfig b/init/Kconfig
> > index 4c08197044f1..5b8726c10685 100644
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -537,15 +537,6 @@ config RCU_STALL_COMMON
> > config CONTEXT_TRACKING
> > bool
> >
> > -config RCU_USER_QS
> > - bool
> > - help
> > - This option sets hooks on kernel / userspace boundaries and
> > - puts RCU in extended quiescent state when the CPU runs in
> > - userspace. It means that when a CPU runs in userspace, it is
> > - excluded from the global RCU state machine and thus doesn't
> > - try to keep the timer tick on for RCU.
> > -
> > config CONTEXT_TRACKING_FORCE
> > bool "Force context tracking"
> > depends on CONTEXT_TRACKING
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 7651d7dd982c..012cbee9d354 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -696,7 +696,7 @@ void rcu_idle_enter(void)
> > }
> > EXPORT_SYMBOL_GPL(rcu_idle_enter);
> >
> > -#ifdef CONFIG_RCU_USER_QS
> > +#ifdef CONFIG_NO_HZ_FULL
> > /**
> > * rcu_user_enter - inform RCU that we are resuming userspace.
> > *
> > @@ -709,7 +709,7 @@ void rcu_user_enter(void)
> > {
> > rcu_eqs_enter(1);
> > }
> > -#endif /* CONFIG_RCU_USER_QS */
> > +#endif /* CONFIG_NO_HZ_FULL */
> >
> > /**
> > * rcu_irq_exit - inform RCU that current CPU is exiting irq towards idle
> > @@ -823,7 +823,7 @@ void rcu_idle_exit(void)
> > }
> > EXPORT_SYMBOL_GPL(rcu_idle_exit);
> >
> > -#ifdef CONFIG_RCU_USER_QS
> > +#ifdef CONFIG_NO_HZ_FULL
> > /**
> > * rcu_user_exit - inform RCU that we are exiting userspace.
> > *
> > @@ -834,7 +834,7 @@ void rcu_user_exit(void)
> > {
> > rcu_eqs_exit(1);
> > }
> > -#endif /* CONFIG_RCU_USER_QS */
> > +#endif /* CONFIG_NO_HZ_FULL */
> >
> > /**
> > * rcu_irq_enter - inform RCU that current CPU is entering irq away from idle
> > diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
> > index 579ce1b929af..4008d9f95dd7 100644
> > --- a/kernel/time/Kconfig
> > +++ b/kernel/time/Kconfig
> > @@ -92,12 +92,10 @@ config NO_HZ_FULL
> > depends on !ARCH_USES_GETTIMEOFFSET && GENERIC_CLOCKEVENTS
> > # We need at least one periodic CPU for timekeeping
> > depends on SMP
> > - # RCU_USER_QS dependency
> > depends on HAVE_CONTEXT_TRACKING
> > # VIRT_CPU_ACCOUNTING_GEN dependency
> > depends on HAVE_VIRT_CPU_ACCOUNTING_GEN
> > select NO_HZ_COMMON
> > - select RCU_USER_QS
> > select RCU_NOCB_CPU
> > select VIRT_CPU_ACCOUNTING_GEN
> > select IRQ_WORK
> >
>

2015-05-14 21:03:11

by Alexander Gordeev

Subject: Re: [PATCH tip/core/rcu 06/24] rcu: Cleanup rcu_init_geometry() code and arithmetics

On Tue, May 12, 2015 at 11:22:32PM -0400, Steven Rostedt wrote:
> On Tue, 12 May 2015 15:30:36 -0700
> "Paul E. McKenney" <[email protected]> wrote:
>
> > @@ -4103,24 +4102,21 @@ static void __init rcu_init_geometry(void)
> > return;
> > }
> >
> > + /* Calculate the number of levels in the tree. */
> > + for (i = 0; nr_cpu_ids > rcu_capacity[i]; i++) {
>
> Should this start at i = 1 as it use to? Also, should there be a safety
> check too:
>
> for (i = 1; i <= MAX_RCU_LVLS && nr_cpu_ids > rcu_capacity[i]; i++) {

The safety check is not needed, as it is indirectly done a few lines above:

if (nr_cpu_ids > rcu_capacity[MAX_RCU_LVLS])
panic("rcu_init_geometry: rcu_capacity[] is too small");


Starting at i = 0 does indeed appear incorrect. With NR_CPUS of 1 that might
yield rcu_num_lvls = 0, which is wrong. I will check it.
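
The single-CPU case can be reproduced outside the kernel. The following is a
minimal userspace sketch, not the kernel code: the toy_ names and the table
sizes (fanout of 16, at most 4 levels) are hypothetical, and it assumes that
entry [0] of the capacity table is 1, as the remark about j == i further down
implies.

```c
#include <assert.h>

/* Hypothetical sizes: leaf fanout 16, interior fanout 16, at most 4 levels. */
#define TOY_MAX_LVLS 4

/* toy_capacity[i]: CPUs a tree with i levels can hold; [0] is assumed to be 1. */
static const int toy_capacity[TOY_MAX_LVLS + 1] = { 1, 16, 256, 4096, 65536 };

/* Count levels the way the quoted loop does, starting at index 'start'. */
static int toy_num_lvls(int start, int nr_cpu_ids)
{
	int i;

	/* Plays the role of the panic() guard: the table must cover all CPUs. */
	assert(start >= 0 && start <= TOY_MAX_LVLS);
	assert(nr_cpu_ids <= toy_capacity[TOY_MAX_LVLS]);

	for (i = start; nr_cpu_ids > toy_capacity[i]; i++)
		;
	return i;
}
```

Under these assumptions toy_num_lvls(0, 1) returns 0, while toy_num_lvls(1, 1)
returns 1. For any CPU count above 1 both starting points give the same result,
so only the single-CPU case is affected.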


>
> > + }
> > + rcu_num_lvls = i;
> > +
> > /* Calculate the number of rcu_nodes at each level of the tree. */
> > - for (i = 1; i <= MAX_RCU_LVLS; i++)
> > - if (nr_cpu_ids <= rcu_capacity[i]) {
> > - for (j = 0; j <= i; j++) {
> > - int cap = rcu_capacity[i - j];
> > - num_rcu_lvl[j] = DIV_ROUND_UP(nr_cpu_ids, cap);

[1] In case j == i, num_rcu_lvl[j] = nr_cpu_ids

> > - }
> > - rcu_num_lvls = i;
> > - for (j = i + 1; j <= MAX_RCU_LVLS; j++)
> > - num_rcu_lvl[j] = 0;
> > - break;
> > - }
> > + for (i = 0; i < rcu_num_lvls; i++) {
>
> Hmm, up above we have: for (j = 0; j <= i; j++)
>
> and now we have rcu_num_lvls = i, so shouldn't this be;
>
> for (i = 0; i <= rcu_num_lvls; i++)
>
> ?

No, it should not. See [1] above and [2] below.

> -- Steve
>
> > + int cap = rcu_capacity[rcu_num_lvls - i];
> > + num_rcu_lvl[i] = DIV_ROUND_UP(nr_cpu_ids, cap);
> > + }
> >
> > /* Calculate the total number of rcu_node structures. */
> > rcu_num_nodes = 0;
> > - for (i = 0; i <= MAX_RCU_LVLS; i++)
> > + for (i = 0; i < rcu_num_lvls; i++)
> > rcu_num_nodes += num_rcu_lvl[i];
> > - rcu_num_nodes -= nr_cpu_ids;

[2] So nr_cpu_ids is added to rcu_num_nodes in the loop and subtracted from
rcu_num_nodes afterwards.

The new version of the code does neither [1] nor [2].
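
That the two calculations agree can be checked with a small userspace model.
This is only a sketch under stated assumptions: the toy_ names and the capacity
table (fanout of 16, at most 4 levels, entry [0] equal to 1) are hypothetical,
and the new-style level count starts at 1, which is the variant discussed
earlier in this message rather than the loop as posted.

```c
#include <assert.h>

#define TOY_MAX_LVLS 4
#define TOY_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical capacity table; entry [0] is assumed to be 1. */
static const int toy_capacity[TOY_MAX_LVLS + 1] = { 1, 16, 256, 4096, 65536 };

/* Old scheme: fills in an extra per-CPU entry [1], then subtracts it [2]. */
static int toy_nodes_old(int nr_cpu_ids)
{
	int num_lvl[TOY_MAX_LVLS + 1] = { 0 };
	int i, j, total = 0;

	assert(nr_cpu_ids >= 1 && nr_cpu_ids <= toy_capacity[TOY_MAX_LVLS]);
	for (i = 1; i <= TOY_MAX_LVLS; i++)
		if (nr_cpu_ids <= toy_capacity[i]) {
			for (j = 0; j <= i; j++)
				num_lvl[j] = TOY_DIV_ROUND_UP(nr_cpu_ids,
							      toy_capacity[i - j]);
			break;
		}
	for (i = 0; i <= TOY_MAX_LVLS; i++)
		total += num_lvl[i];
	return total - nr_cpu_ids;
}

/* New scheme: only real levels are summed, so nothing is subtracted. */
static int toy_nodes_new(int nr_cpu_ids)
{
	int lvls, i, total = 0;

	assert(nr_cpu_ids >= 1 && nr_cpu_ids <= toy_capacity[TOY_MAX_LVLS]);
	for (lvls = 1; nr_cpu_ids > toy_capacity[lvls]; lvls++)
		;
	for (i = 0; i < lvls; i++)
		total += TOY_DIV_ROUND_UP(nr_cpu_ids, toy_capacity[lvls - i]);
	return total;
}
```

With this table, 17 CPUs give 3 nodes either way: one root plus two leaves. The
old scheme gets there as 1 + 2 + 17 - 17, the new one as 1 + 2.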

> > }
> >
> > void __init rcu_init(void)
>

--
Regards,
Alexander Gordeev
[email protected]

2015-06-29 09:39:19

by Geert Uytterhoeven

Subject: Re: [PATCH tip/core/rcu 24/24] rcu: Conditionally compile RCU's eqs warnings

Hi Paul,

On Wed, May 13, 2015 at 12:30 AM, Paul E. McKenney
<[email protected]> wrote:
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -1373,6 +1373,17 @@ config RCU_TRACE
> Say Y here if you want to enable RCU tracing
> Say N if you are unsure.
>
> +config RCU_EQS_DEBUG
> + bool "Use this when adding any sort of NO_HZ support to your arch"

This sounds a bit fuzzy. Can you please provide a better one-line description?
Thanks!

> + depends on DEBUG_KERNEL
> + help
> + This option provides consistency checks in RCU's handling of
> + NO_HZ. These checks have proven quite helpful in detecting
> + bugs in arch-specific NO_HZ code.
> +
> + Say N here if you need ultimate kernel/user switch latencies
> + Say Y if you are unsure

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2015-06-29 20:55:44

by Paul E. McKenney

Subject: Re: [PATCH tip/core/rcu 24/24] rcu: Conditionally compile RCU's eqs warnings

On Mon, Jun 29, 2015 at 11:39:13AM +0200, Geert Uytterhoeven wrote:
> Hi Paul,
>
> On Wed, May 13, 2015 at 12:30 AM, Paul E. McKenney
> <[email protected]> wrote:
> > --- a/lib/Kconfig.debug
> > +++ b/lib/Kconfig.debug
> > @@ -1373,6 +1373,17 @@ config RCU_TRACE
> > Say Y here if you want to enable RCU tracing
> > Say N if you are unsure.
> >
> > +config RCU_EQS_DEBUG
> > + bool "Use this when adding any sort of NO_HZ support to your arch"
>
> This sounds a bit fuzzy. Can you please provide a better one-line description?
> Thanks!

So the point of this Kconfig option is to provide WARN_ON()s that catch
bugs such as telling RCU that a given CPU entered idle, but failing to
tell RCU when that CPU later leaves idle. So how about this?

+ bool "Provide debugging asserts for adding NO_HZ support to an arch"
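
The kind of check described above can be illustrated with a toy userspace
model. This is a sketch only, not the kernel implementation: struct toy_cpu,
toy_warn_on() and the TOY_EQS_DEBUG macro are hypothetical stand-ins for the
per-CPU state, WARN_ON() and the Kconfig option.

```c
#include <stdbool.h>

/* Stand-in for the Kconfig switch; build with -DTOY_EQS_DEBUG=0 to drop checks. */
#ifndef TOY_EQS_DEBUG
#define TOY_EQS_DEBUG 1
#endif

struct toy_cpu {
	bool in_idle;          /* what the model believes about this CPU */
	unsigned int warnings; /* how many consistency checks have fired */
};

/* Count a warning when debugging is enabled and the condition holds. */
static void toy_warn_on(struct toy_cpu *cpu, bool cond)
{
	if (TOY_EQS_DEBUG && cond)
		cpu->warnings++;
}

/* Entering idle twice without an exit in between is a bug. */
static void toy_idle_enter(struct toy_cpu *cpu)
{
	toy_warn_on(cpu, cpu->in_idle);
	cpu->in_idle = true;
}

/* Leaving idle without having entered it is a bug. */
static void toy_idle_exit(struct toy_cpu *cpu)
{
	toy_warn_on(cpu, !cpu->in_idle);
	cpu->in_idle = false;
}
```

An arch that forgets the exit call ends up calling toy_idle_enter() twice in a
row, and the second call fires the check. With TOY_EQS_DEBUG defined as 0 the
condition is a compile-time constant false, so a compiler can drop the test
from the fast path.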

Thanx, Paul

> > + depends on DEBUG_KERNEL
> > + help
> > + This option provides consistency checks in RCU's handling of
> > + NO_HZ. These checks have proven quite helpful in detecting
> > + bugs in arch-specific NO_HZ code.
> > +
> > + Say N here if you need ultimate kernel/user switch latencies
> > + Say Y if you are unsure
>
> Gr{oetje,eeting}s,
>
> Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
> -- Linus Torvalds
>

2015-06-29 20:58:58

by Geert Uytterhoeven

Subject: Re: [PATCH tip/core/rcu 24/24] rcu: Conditionally compile RCU's eqs warnings

Hi Paul,

On Mon, Jun 29, 2015 at 10:55 PM, Paul E. McKenney
<[email protected]> wrote:
> On Mon, Jun 29, 2015 at 11:39:13AM +0200, Geert Uytterhoeven wrote:
>> On Wed, May 13, 2015 at 12:30 AM, Paul E. McKenney
>> <[email protected]> wrote:
>> > --- a/lib/Kconfig.debug
>> > +++ b/lib/Kconfig.debug
>> > @@ -1373,6 +1373,17 @@ config RCU_TRACE
>> > Say Y here if you want to enable RCU tracing
>> > Say N if you are unsure.
>> >
>> > +config RCU_EQS_DEBUG
>> > + bool "Use this when adding any sort of NO_HZ support to your arch"
>>
>> This sounds a bit fuzzy. Can you please provide a better one-line description?
>> Thanks!
>
> So the point of this Kconfig option is to provide WARN_ON()s that catch
> bug such as telling RCU that a given CPU entered idle, but failing to
> tell RCU when that CPU later leaves idle. So how about this?
>
> + bool "Provide debugging asserts for adding NO_HZ support to an arch"

Much better, thanks!

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2015-06-30 16:58:04

by Paul E. McKenney

Subject: Re: [PATCH tip/core/rcu 24/24] rcu: Conditionally compile RCU's eqs warnings

On Mon, Jun 29, 2015 at 10:58:51PM +0200, Geert Uytterhoeven wrote:
> Hi Paul,
>
> On Mon, Jun 29, 2015 at 10:55 PM, Paul E. McKenney
> <[email protected]> wrote:
> > On Mon, Jun 29, 2015 at 11:39:13AM +0200, Geert Uytterhoeven wrote:
> >> On Wed, May 13, 2015 at 12:30 AM, Paul E. McKenney
> >> <[email protected]> wrote:
> >> > --- a/lib/Kconfig.debug
> >> > +++ b/lib/Kconfig.debug
> >> > @@ -1373,6 +1373,17 @@ config RCU_TRACE
> >> > Say Y here if you want to enable RCU tracing
> >> > Say N if you are unsure.
> >> >
> >> > +config RCU_EQS_DEBUG
> >> > + bool "Use this when adding any sort of NO_HZ support to your arch"
> >>
> >> This sounds a bit fuzzy. Can you please provide a better one-line description?
> >> Thanks!
> >
> > So the point of this Kconfig option is to provide WARN_ON()s that catch
> > bug such as telling RCU that a given CPU entered idle, but failing to
> > tell RCU when that CPU later leaves idle. So how about this?
> >
> > + bool "Provide debugging asserts for adding NO_HZ support to an arch"
>
> Much better, thanks!

Queued the following for v4.3, thank you!

Thanx, Paul

------------------------------------------------------------------------

commit bb651123a8ed43543f3ec8ac84e6152a866e2639
Author: Paul E. McKenney <[email protected]>
Date: Tue Jun 30 09:56:31 2015 -0700

rcu: Clarify CONFIG_RCU_EQS_DEBUG help text

Reported-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 6be521990d61..80efaade5e59 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1360,7 +1360,7 @@ config RCU_TRACE
Say N if you are unsure.

config RCU_EQS_DEBUG
- bool "Use this when adding any sort of NO_HZ support to your arch"
+ bool "Provide debugging asserts for adding NO_HZ support to an arch"
depends on DEBUG_KERNEL
help
This option provides consistency checks in RCU's handling of