2012-08-30 21:06:31

by Paul E. McKenney

[permalink] [raw]
Subject: [PATCH tip/core/rcu 0/26] idle-related changes

Hello!

This series handles changes to the interaction between RCU and idle,
including adaptive ticks. Almost all of these patches are courtesy
of Frederic Weisbecker. The patches are as follows:

1. Allow non-idle tasks to enter dyntick-idle mode from RCU's
perspective in order to enable adaptive ticks.
2. Add rcu_user_enter_irq() and rcu_user_exit_irq() to enable
switching into and out of dyntick-idle mode in interrupt
handlers to handle wakeups.
3. Modify RCU_FAST_NO_HZ to accommodate adaptive ticks.
4. Add a Kconfig option to enable extended quiescent states while
in usermode execution.
5. Allow multiple rcu_user_enter() calls to match an rcu_user_exit().
6. Introcude runtime control for RCU's classification of user-mode
execution as an extended quiescent state.
7. Update adaptive-tick state on context switch in order to handle
task migration.
8. On x86, notify RCU of slowpath syscall entry to and exit from
usermode execution so that RCU can track the resulting extended
quiescent states.
9. On c86, notify RCU of exception-path entry to and exit from
usermode execution so that RCU can again strack the resulting
extended quiescent states.
10. Exit RCU extended QS on kernel preemption after irq/exception.
11. Exit RCU extended QS on user preemption.
12. On x86, use the new schedule_user API on userspace preemption.
13. On x86, exit RCU extended QS on notify resumes.
14. Provide a new RCU_USER_QS_FORCE kconfig option that enables
userspace RCU extended quiescent states on all CPUs for testing
purposes.
15-26. Fix idle-loop breakage introduced on 3.3. This affects all
architectures that do not implement NO_HZ.

Thanx, Paul

------------------------------------------------------------------------

arch/alpha/kernel/process.c | 3
b/arch/Kconfig | 10 +++
b/arch/alpha/kernel/process.c | 3
b/arch/alpha/kernel/smp.c | 1
b/arch/cris/kernel/process.c | 3
b/arch/frv/kernel/process.c | 3
b/arch/h8300/kernel/process.c | 3
b/arch/ia64/kernel/process.c | 3
b/arch/m32r/kernel/process.c | 3
b/arch/m68k/kernel/process.c | 3
b/arch/mn10300/kernel/process.c | 3
b/arch/parisc/kernel/process.c | 3
b/arch/score/kernel/process.c | 4 -
b/arch/um/drivers/mconsole_kern.c | 1
b/arch/x86/Kconfig | 1
b/arch/x86/include/asm/rcu.h | 20 ++++++
b/arch/x86/include/asm/thread_info.h | 10 ++-
b/arch/x86/kernel/entry_64.S | 8 +-
b/arch/x86/kernel/ptrace.c | 5 +
b/arch/x86/kernel/signal.c | 4 +
b/arch/x86/kernel/traps.c | 30 ++++++---
b/arch/x86/mm/fault.c | 13 +++
b/arch/xtensa/kernel/process.c | 3
b/include/linux/rcupdate.h | 2
b/include/linux/sched.h | 8 ++
b/init/Kconfig | 10 +++
b/kernel/rcutree.c | 115 ++++++++++++++++++++++++-----------
b/kernel/rcutree.h | 3
b/kernel/rcutree_plugin.h | 20 ++++++
b/kernel/sched/core.c | 1
include/linux/rcupdate.h | 12 +++
init/Kconfig | 8 ++
kernel/rcutree.c | 111 ++++++++++++++++++++++++++++++---
kernel/rcutree.h | 1
kernel/sched/core.c | 8 ++
35 files changed, 375 insertions(+), 64 deletions(-)


2012-08-30 21:14:03

by Paul E. McKenney

[permalink] [raw]
Subject: [PATCH tip/core/rcu 24/26] score: Add missing RCU idle APIs on idle loop

From: Frederic Weisbecker <[email protected]>

In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in the scores's idle loop.

Reported-by: Paul E. McKenney <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Chen Liqin <[email protected]>
Cc: Lennox Wu <[email protected]>
Cc: 3.2.x.. <[email protected]>
Cc: Paul E. McKenney <[email protected]>
---
arch/score/kernel/process.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/arch/score/kernel/process.c b/arch/score/kernel/process.c
index 2707023..637970c 100644
--- a/arch/score/kernel/process.c
+++ b/arch/score/kernel/process.c
@@ -27,6 +27,7 @@
#include <linux/reboot.h>
#include <linux/elfcore.h>
#include <linux/pm.h>
+#include <linux/rcupdate.h>

void (*pm_power_off)(void);
EXPORT_SYMBOL(pm_power_off);
@@ -50,9 +51,10 @@ void __noreturn cpu_idle(void)
{
/* endless idle loop with no priority at all */
while (1) {
+ rcu_idle_enter();
while (!need_resched())
barrier();
-
+ rcu_idle_exit();
schedule_preempt_disabled();
}
}
--
1.7.8

2012-08-30 21:14:06

by Paul E. McKenney

[permalink] [raw]
Subject: [PATCH tip/core/rcu 26/26] ia64: Add missing RCU idle APIs on idle loop

From: "Paul E. McKenney" <[email protected]>

Traditionally, the entire idle task served as an RCU quiescent state.
But when RCU read side critical sections started appearing within the
idle loop, this traditional strategy became untenable. The fix was to
create new RCU APIs named rcu_idle_enter() and rcu_idle_exit(), which
must be called by each architecture's idle loop so that RCU can tell
when it is safe to ignore a given idle CPU.

Unfortunately, this fix was never applied to ia64, a shortcoming remedied
by this commit.

Reported by: Tony Luck <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Tested by: Tony Luck <[email protected]>
---
arch/ia64/kernel/process.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/ia64/kernel/process.c b/arch/ia64/kernel/process.c
index dd6fc14..3e316ec 100644
--- a/arch/ia64/kernel/process.c
+++ b/arch/ia64/kernel/process.c
@@ -29,6 +29,7 @@
#include <linux/kdebug.h>
#include <linux/utsname.h>
#include <linux/tracehook.h>
+#include <linux/rcupdate.h>

#include <asm/cpu.h>
#include <asm/delay.h>
@@ -279,6 +280,7 @@ cpu_idle (void)

/* endless idle loop with no priority at all */
while (1) {
+ rcu_idle_enter();
if (can_do_pal_halt) {
current_thread_info()->status &= ~TS_POLLING;
/*
@@ -309,6 +311,7 @@ cpu_idle (void)
normal_xtp();
#endif
}
+ rcu_idle_exit();
schedule_preempt_disabled();
check_pgt_cache();
if (cpu_is_offline(cpu))
--
1.7.8

2012-08-30 21:14:17

by Paul E. McKenney

[permalink] [raw]
Subject: [PATCH tip/core/rcu 07/26] rcu: Switch task's syscall hooks on context switch

From: Frederic Weisbecker <[email protected]>

Clear the syscalls hook of a task when it's scheduled out so that if
the task migrates, it doesn't run the syscall slow path on a CPU
that might not need it.

Also set the syscalls hook on the next task if needed.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Alessio Igor Bogani <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Avi Kivity <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Geoff Levand <[email protected]>
Cc: Gilad Ben Yossef <[email protected]>
Cc: Hakan Akkan <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Kevin Hilman <[email protected]>
Cc: Max Krasnyansky <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Sven-Thorsten Dietrich <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
arch/um/drivers/mconsole_kern.c | 1 +
include/linux/rcupdate.h | 2 ++
include/linux/sched.h | 8 ++++++++
kernel/rcutree.c | 15 +++++++++++++++
kernel/sched/core.c | 1 +
5 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/arch/um/drivers/mconsole_kern.c b/arch/um/drivers/mconsole_kern.c
index 664a60e..c17de0d 100644
--- a/arch/um/drivers/mconsole_kern.c
+++ b/arch/um/drivers/mconsole_kern.c
@@ -705,6 +705,7 @@ static void stack_proc(void *arg)
struct task_struct *from = current, *to = arg;

to->thread.saved_task = from;
+ rcu_switch(from, to);
switch_to(from, to, from);
}

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index e411117..1fc0a0e 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -197,6 +197,8 @@ extern void rcu_user_enter(void);
extern void rcu_user_exit(void);
extern void rcu_user_enter_irq(void);
extern void rcu_user_exit_irq(void);
+extern void rcu_user_hooks_switch(struct task_struct *prev,
+ struct task_struct *next);
#else
static inline void rcu_user_enter(void) { }
static inline void rcu_user_exit(void) { }
diff --git a/include/linux/sched.h b/include/linux/sched.h
index c147e70..e4d5936 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1894,6 +1894,14 @@ static inline void rcu_copy_process(struct task_struct *p)

#endif

+static inline void rcu_switch(struct task_struct *prev,
+ struct task_struct *next)
+{
+#ifdef CONFIG_RCU_USER_QS
+ rcu_user_hooks_switch(prev, next);
+#endif
+}
+
static inline void tsk_restore_flags(struct task_struct *task,
unsigned long orig_flags, unsigned long flags)
{
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index e2fd370..af92681 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -721,6 +721,21 @@ int rcu_is_cpu_idle(void)
}
EXPORT_SYMBOL(rcu_is_cpu_idle);

+#ifdef CONFIG_RCU_USER_QS
+void rcu_user_hooks_switch(struct task_struct *prev,
+ struct task_struct *next)
+{
+ struct rcu_dynticks *rdtp;
+
+ /* Interrupts are disabled in context switch */
+ rdtp = &__get_cpu_var(rcu_dynticks);
+ if (!rdtp->ignore_user_qs) {
+ clear_tsk_thread_flag(prev, TIF_NOHZ);
+ set_tsk_thread_flag(next, TIF_NOHZ);
+ }
+}
+#endif /* #ifdef CONFIG_RCU_USER_QS */
+
#if defined(CONFIG_PROVE_RCU) && defined(CONFIG_HOTPLUG_CPU)

/*
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index d325c4b..07c6d9a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2081,6 +2081,7 @@ context_switch(struct rq *rq, struct task_struct *prev,
#endif

/* Here we just switch the register state and the stack. */
+ rcu_switch(prev, next);
switch_to(prev, next, prev);

barrier();
--
1.7.8

2012-08-30 21:14:27

by Paul E. McKenney

[permalink] [raw]
Subject: [PATCH tip/core/rcu 14/26] rcu: Userspace RCU extended QS selftest

From: Frederic Weisbecker <[email protected]>

Provide a config option that enables the userspace
RCU extended quiescent state on every CPUs by default.

This is for testing purpose.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Alessio Igor Bogani <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Avi Kivity <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Geoff Levand <[email protected]>
Cc: Gilad Ben Yossef <[email protected]>
Cc: Hakan Akkan <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Kevin Hilman <[email protected]>
Cc: Max Krasnyansky <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Sven-Thorsten Dietrich <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
init/Kconfig | 8 ++++++++
kernel/rcutree.c | 2 +-
2 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index f6a1830..c26b8a1 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -451,6 +451,14 @@ config RCU_USER_QS
excluded from the global RCU state machine and thus doesn't
to keep the timer tick on for RCU.

+config RCU_USER_QS_FORCE
+ bool "Force userspace extended QS by default"
+ depends on RCU_USER_QS
+ help
+ Set the hooks in user/kernel boundaries by default in order to
+ test this feature that treats userspace as an extended quiescent
+ state until we have a real user like a full adaptive nohz option.
+
config RCU_FANOUT
int "Tree-based hierarchical RCU fanout value"
range 2 64 if 64BIT
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index af92681..ccf3cbf 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -210,7 +210,7 @@ EXPORT_SYMBOL_GPL(rcu_note_context_switch);
DEFINE_PER_CPU(struct rcu_dynticks, rcu_dynticks) = {
.dynticks_nesting = DYNTICK_TASK_EXIT_IDLE,
.dynticks = ATOMIC_INIT(1),
-#ifdef CONFIG_RCU_USER_QS
+#if defined(CONFIG_RCU_USER_QS) && !defined(CONFIG_RCU_USER_QS_FORCE)
.ignore_user_qs = true,
#endif
};
--
1.7.8

2012-08-30 21:14:35

by Paul E. McKenney

[permalink] [raw]
Subject: [PATCH tip/core/rcu 02/26] rcu: New rcu_user_enter_irq() and rcu_user_exit_irq() APIs

From: Frederic Weisbecker <[email protected]>

In some cases, it is necessary to enter or exit userspace-RCU-idle mode
from an interrupt handler, for example, if some other CPU sends this
CPU a resched IPI. In this case, the current CPU would enter the IPI
handler in userspace-RCU-idle mode, but would need to exit the IPI handler
after having exited that mode.

To allow this to work, this commit adds two new APIs to TREE_RCU:

- rcu_user_enter_irq(). This must be called from an interrupt between
rcu_irq_enter() and rcu_irq_exit(). After the irq calls rcu_irq_exit(),
the irq handler will return into an RCU extended quiescent state.
In theory, this interrupt is never a nested interrupt, but in practice
it might interrupt softirq, which looks to RCU like a nested interrupt.

- rcu_user_exit_irq(). This must be called from a non-nesting
interrupt, interrupting an RCU extended quiescent state, also
between rcu_irq_enter() and rcu_irq_exit(). After the irq calls
rcu_irq_exit(), the irq handler will return in an RCU non-quiescent
state.

[ Combined with "Allow calls to rcu_exit_user_irq from nesting irqs." ]

Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
include/linux/rcupdate.h | 2 ++
kernel/rcutree.c | 43 +++++++++++++++++++++++++++++++++++++++++++
2 files changed, 45 insertions(+), 0 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 2a7549c..81d3d5c 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -193,6 +193,8 @@ extern void rcu_irq_enter(void);
extern void rcu_irq_exit(void);
extern void rcu_user_enter(void);
extern void rcu_user_exit(void);
+extern void rcu_user_enter_irq(void);
+extern void rcu_user_exit_irq(void);
extern void exit_rcu(void);

/**
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index c0507b7..8fdea17 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -440,6 +440,27 @@ EXPORT_SYMBOL_GPL(rcu_user_enter);


/**
+ * rcu_user_enter_irq - inform RCU that we are going to resume userspace
+ * after the current irq returns.
+ *
+ * This is similar to rcu_user_enter() but in the context of a non-nesting
+ * irq. After this call, RCU enters into idle mode when the interrupt
+ * returns.
+ */
+void rcu_user_enter_irq(void)
+{
+ unsigned long flags;
+ struct rcu_dynticks *rdtp;
+
+ local_irq_save(flags);
+ rdtp = &__get_cpu_var(rcu_dynticks);
+ /* Ensure this irq is interrupting a non-idle RCU state. */
+ WARN_ON_ONCE(!(rdtp->dynticks_nesting & DYNTICK_TASK_MASK));
+ rdtp->dynticks_nesting = 1;
+ local_irq_restore(flags);
+}
+
+/**
* rcu_irq_exit - inform RCU that current CPU is exiting irq towards idle
*
* Exit from an interrupt handler, which might possibly result in entering
@@ -554,6 +575,28 @@ void rcu_user_exit(void)
EXPORT_SYMBOL_GPL(rcu_user_exit);

/**
+ * rcu_user_exit_irq - inform RCU that we won't resume to userspace
+ * idle mode after the current non-nesting irq returns.
+ *
+ * This is similar to rcu_user_exit() but in the context of an irq.
+ * This is called when the irq has interrupted a userspace RCU idle mode
+ * context. When the current non-nesting interrupt returns after this call,
+ * the CPU won't restore the RCU idle mode.
+ */
+void rcu_user_exit_irq(void)
+{
+ unsigned long flags;
+ struct rcu_dynticks *rdtp;
+
+ local_irq_save(flags);
+ rdtp = &__get_cpu_var(rcu_dynticks);
+ /* Ensure we are interrupting an RCU idle mode. */
+ WARN_ON_ONCE(rdtp->dynticks_nesting & DYNTICK_TASK_NEST_MASK);
+ rdtp->dynticks_nesting += DYNTICK_TASK_EXIT_IDLE;
+ local_irq_restore(flags);
+}
+
+/**
* rcu_irq_enter - inform RCU that current CPU is entering irq away from idle
*
* Enter an interrupt handler, which might possibly result in exiting
--
1.7.8

2012-08-30 21:14:33

by Paul E. McKenney

[permalink] [raw]
Subject: [PATCH tip/core/rcu 17/26] cris: Add missing RCU idle APIs on idle loop

From: Frederic Weisbecker <[email protected]>

In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in the Cris's idle loop.

Reported-by: Paul E. McKenney <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Mikael Starvik <[email protected]>
Cc: Jesper Nilsson <[email protected]>
Cc: Cris <[email protected]>
Cc: 3.2.x.. <[email protected]>
Cc: Paul E. McKenney <[email protected]>
---
arch/cris/kernel/process.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/cris/kernel/process.c b/arch/cris/kernel/process.c
index 66fd017..7f65be6 100644
--- a/arch/cris/kernel/process.c
+++ b/arch/cris/kernel/process.c
@@ -25,6 +25,7 @@
#include <linux/elfcore.h>
#include <linux/mqueue.h>
#include <linux/reboot.h>
+#include <linux/rcupdate.h>

//#define DEBUG

@@ -74,6 +75,7 @@ void cpu_idle (void)
{
/* endless idle loop with no priority at all */
while (1) {
+ rcu_idle_enter();
while (!need_resched()) {
void (*idle)(void);
/*
@@ -86,6 +88,7 @@ void cpu_idle (void)
idle = default_idle;
idle();
}
+ rcu_idle_exit();
schedule_preempt_disabled();
}
}
--
1.7.8

2012-08-30 21:14:30

by Paul E. McKenney

[permalink] [raw]
Subject: [PATCH tip/core/rcu 18/26] frv: Add missing RCU idle APIs on idle loop

From: Frederic Weisbecker <[email protected]>

In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in the Frv's idle loop.

Reported-by: Paul E. McKenney <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: David Howells <[email protected]>
Cc: 3.2.x.. <[email protected]>
Cc: Paul E. McKenney <[email protected]>
---
arch/frv/kernel/process.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/frv/kernel/process.c b/arch/frv/kernel/process.c
index ff95f50..2eb7fa5 100644
--- a/arch/frv/kernel/process.c
+++ b/arch/frv/kernel/process.c
@@ -25,6 +25,7 @@
#include <linux/reboot.h>
#include <linux/interrupt.h>
#include <linux/pagemap.h>
+#include <linux/rcupdate.h>

#include <asm/asm-offsets.h>
#include <asm/uaccess.h>
@@ -69,12 +70,14 @@ void cpu_idle(void)
{
/* endless idle loop with no priority at all */
while (1) {
+ rcu_idle_enter();
while (!need_resched()) {
check_pgt_cache();

if (!frv_dma_inprogress && idle)
idle();
}
+ rcu_idle_exit();

schedule_preempt_disabled();
}
--
1.7.8

2012-08-30 21:15:47

by Paul E. McKenney

[permalink] [raw]
Subject: [PATCH tip/core/rcu 09/26] x86: Exception hooks for userspace RCU extended QS

From: Frederic Weisbecker <[email protected]>

Add necessary hooks to x86 exception for userspace
RCU extended quiescent state support.

This includes traps, page fault, debug exceptions, etc...

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Alessio Igor Bogani <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Avi Kivity <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Geoff Levand <[email protected]>
Cc: Gilad Ben Yossef <[email protected]>
Cc: Hakan Akkan <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Kevin Hilman <[email protected]>
Cc: Max Krasnyansky <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Sven-Thorsten Dietrich <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
arch/x86/include/asm/rcu.h | 20 ++++++++++++++++++++
arch/x86/kernel/traps.c | 30 ++++++++++++++++++++++--------
arch/x86/mm/fault.c | 13 +++++++++++--
3 files changed, 53 insertions(+), 10 deletions(-)
create mode 100644 arch/x86/include/asm/rcu.h

diff --git a/arch/x86/include/asm/rcu.h b/arch/x86/include/asm/rcu.h
new file mode 100644
index 0000000..439815b
--- /dev/null
+++ b/arch/x86/include/asm/rcu.h
@@ -0,0 +1,20 @@
+#ifndef _ASM_X86_RCU_H
+#define _ASM_X86_RCU_H
+
+#include <linux/rcupdate.h>
+#include <asm/ptrace.h>
+
+static inline void exception_enter(struct pt_regs *regs)
+{
+ rcu_user_exit();
+}
+
+static inline void exception_exit(struct pt_regs *regs)
+{
+#ifdef CONFIG_RCU_USER_QS
+ if (user_mode(regs))
+ rcu_user_enter();
+#endif
+}
+
+#endif
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index b481341..ab82cbd 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -55,6 +55,7 @@
#include <asm/i387.h>
#include <asm/fpu-internal.h>
#include <asm/mce.h>
+#include <asm/rcu.h>

#include <asm/mach_traps.h>

@@ -180,11 +181,15 @@ vm86_trap:
#define DO_ERROR(trapnr, signr, str, name) \
dotraplinkage void do_##name(struct pt_regs *regs, long error_code) \
{ \
- if (notify_die(DIE_TRAP, str, regs, error_code, trapnr, signr) \
- == NOTIFY_STOP) \
+ exception_enter(regs); \
+ if (notify_die(DIE_TRAP, str, regs, error_code, \
+ trapnr, signr) == NOTIFY_STOP) { \
+ exception_exit(regs); \
return; \
+ } \
conditional_sti(regs); \
do_trap(trapnr, signr, str, regs, error_code, NULL); \
+ exception_exit(regs); \
}

#define DO_ERROR_INFO(trapnr, signr, str, name, sicode, siaddr) \
@@ -195,11 +200,15 @@ dotraplinkage void do_##name(struct pt_regs *regs, long error_code) \
info.si_errno = 0; \
info.si_code = sicode; \
info.si_addr = (void __user *)siaddr; \
- if (notify_die(DIE_TRAP, str, regs, error_code, trapnr, signr) \
- == NOTIFY_STOP) \
+ exception_enter(regs); \
+ if (notify_die(DIE_TRAP, str, regs, error_code, \
+ trapnr, signr) == NOTIFY_STOP) { \
+ exception_exit(regs); \
return; \
+ } \
conditional_sti(regs); \
do_trap(trapnr, signr, str, regs, error_code, &info); \
+ exception_exit(regs); \
}

DO_ERROR_INFO(X86_TRAP_DE, SIGFPE, "divide error", divide_error, FPE_INTDIV,
@@ -312,6 +321,7 @@ dotraplinkage void __kprobes notrace do_int3(struct pt_regs *regs, long error_co
ftrace_int3_handler(regs))
return;
#endif
+ exception_enter(regs);
#ifdef CONFIG_KGDB_LOW_LEVEL_TRAP
if (kgdb_ll_trap(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
SIGTRAP) == NOTIFY_STOP)
@@ -331,6 +341,7 @@ dotraplinkage void __kprobes notrace do_int3(struct pt_regs *regs, long error_co
do_trap(X86_TRAP_BP, SIGTRAP, "int3", regs, error_code, NULL);
preempt_conditional_cli(regs);
debug_stack_usage_dec();
+ exception_exit(regs);
}

#ifdef CONFIG_X86_64
@@ -391,6 +402,8 @@ dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
unsigned long dr6;
int si_code;

+ exception_enter(regs);
+
get_debugreg(dr6, 6);

/* Filter out all the reserved bits which are preset to 1 */
@@ -406,7 +419,7 @@ dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)

/* Catch kmemcheck conditions first of all! */
if ((dr6 & DR_STEP) && kmemcheck_trap(regs))
- return;
+ goto exit;

/* DR6 may or may not be cleared by the CPU */
set_debugreg(0, 6);
@@ -421,7 +434,7 @@ dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)

if (notify_die(DIE_DEBUG, "debug", regs, PTR_ERR(&dr6), error_code,
SIGTRAP) == NOTIFY_STOP)
- return;
+ goto exit;

/*
* Let others (NMI) know that the debug stack is in use
@@ -437,7 +450,7 @@ dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
X86_TRAP_DB);
preempt_conditional_cli(regs);
debug_stack_usage_dec();
- return;
+ goto exit;
}

/*
@@ -458,7 +471,8 @@ dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
preempt_conditional_cli(regs);
debug_stack_usage_dec();

- return;
+exit:
+ exception_exit(regs);
}

/*
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 76dcd9d..7dde46d 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -18,6 +18,7 @@
#include <asm/pgalloc.h> /* pgd_*(), ... */
#include <asm/kmemcheck.h> /* kmemcheck_*(), ... */
#include <asm/fixmap.h> /* VSYSCALL_START */
+#include <asm/rcu.h> /* exception_enter(), ... */

/*
* Page fault error code bits:
@@ -1000,8 +1001,8 @@ static int fault_in_kernel_space(unsigned long address)
* and the problem, and then passes it off to one of the appropriate
* routines.
*/
-dotraplinkage void __kprobes
-do_page_fault(struct pt_regs *regs, unsigned long error_code)
+static void __kprobes
+__do_page_fault(struct pt_regs *regs, unsigned long error_code)
{
struct vm_area_struct *vma;
struct task_struct *tsk;
@@ -1209,3 +1210,11 @@ good_area:

up_read(&mm->mmap_sem);
}
+
+dotraplinkage void __kprobes
+do_page_fault(struct pt_regs *regs, unsigned long error_code)
+{
+ exception_enter(regs);
+ __do_page_fault(regs, error_code);
+ exception_exit(regs);
+}
--
1.7.8

2012-08-30 21:15:46

by Paul E. McKenney

[permalink] [raw]
Subject: [PATCH tip/core/rcu 04/26] rcu: Settle config for userspace extended quiescent state

From: Frederic Weisbecker <[email protected]>

Create a new config option under the RCU menu that put
CPUs under RCU extended quiescent state (as in dynticks
idle mode) when they run in userspace. This require
some contribution from architectures to hook into kernel
and userspace boundaries.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Alessio Igor Bogani <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Avi Kivity <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Geoff Levand <[email protected]>
Cc: Gilad Ben Yossef <[email protected]>
Cc: Hakan Akkan <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Kevin Hilman <[email protected]>
Cc: Max Krasnyansky <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Sven-Thorsten Dietrich <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
arch/Kconfig | 10 ++++++++++
include/linux/rcupdate.h | 8 ++++++++
init/Kconfig | 10 ++++++++++
kernel/rcutree.c | 5 ++++-
4 files changed, 32 insertions(+), 1 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 72f2fa1..1401a75 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -281,4 +281,14 @@ config SECCOMP_FILTER

See Documentation/prctl/seccomp_filter.txt for details.

+config HAVE_RCU_USER_QS
+ bool
+ help
+ Provide kernel entry/exit hooks necessary for userspace
+ RCU extended quiescent state. Syscalls need to be wrapped inside
+ rcu_user_exit()-rcu_user_enter() through the slow path using
+ TIF_NOHZ flag. Exceptions handlers must be wrapped as well. Irqs
+ are already protected inside rcu_irq_enter/rcu_irq_exit() but
+ preemption or signal handling on irq exit still need to be protected.
+
source "kernel/gcov/Kconfig"
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 81d3d5c..e411117 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -191,10 +191,18 @@ extern void rcu_idle_enter(void);
extern void rcu_idle_exit(void);
extern void rcu_irq_enter(void);
extern void rcu_irq_exit(void);
+
+#ifdef CONFIG_RCU_USER_QS
extern void rcu_user_enter(void);
extern void rcu_user_exit(void);
extern void rcu_user_enter_irq(void);
extern void rcu_user_exit_irq(void);
+#else
+static inline void rcu_user_enter(void) { }
+static inline void rcu_user_exit(void) { }
+#endif /* CONFIG_RCU_USER_QS */
+
+
extern void exit_rcu(void);

/**
diff --git a/init/Kconfig b/init/Kconfig
index af6c7f8..f6a1830 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -441,6 +441,16 @@ config PREEMPT_RCU
This option enables preemptible-RCU code that is common between
the TREE_PREEMPT_RCU and TINY_PREEMPT_RCU implementations.

+config RCU_USER_QS
+ bool "Consider userspace as in RCU extended quiescent state"
+ depends on HAVE_RCU_USER_QS && SMP
+ help
+ This option sets hooks on kernel / userspace boundaries and
+ puts RCU in extended quiescent state when the CPU runs in
+ userspace. It means that when a CPU runs in userspace, it is
+ excluded from the global RCU state machine and thus doesn't
+ to keep the timer tick on for RCU.
+
config RCU_FANOUT
int "Tree-based hierarchical RCU fanout value"
range 2 64 if 64BIT
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 8fdea17..e287c4a 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -424,6 +424,7 @@ void rcu_idle_enter(void)
}
EXPORT_SYMBOL_GPL(rcu_idle_enter);

+#ifdef CONFIG_RCU_USER_QS
/**
* rcu_user_enter - inform RCU that we are resuming userspace.
*
@@ -438,7 +439,6 @@ void rcu_user_enter(void)
}
EXPORT_SYMBOL_GPL(rcu_user_enter);

-
/**
* rcu_user_enter_irq - inform RCU that we are going to resume userspace
* after the current irq returns.
@@ -459,6 +459,7 @@ void rcu_user_enter_irq(void)
rdtp->dynticks_nesting = 1;
local_irq_restore(flags);
}
+#endif

/**
* rcu_irq_exit - inform RCU that current CPU is exiting irq towards idle
@@ -562,6 +563,7 @@ void rcu_idle_exit(void)
}
EXPORT_SYMBOL_GPL(rcu_idle_exit);

+#ifdef CONFIG_RCU_USER_QS
/**
* rcu_user_exit - inform RCU that we are exiting userspace.
*
@@ -595,6 +597,7 @@ void rcu_user_exit_irq(void)
rdtp->dynticks_nesting += DYNTICK_TASK_EXIT_IDLE;
local_irq_restore(flags);
}
+#endif

/**
* rcu_irq_enter - inform RCU that current CPU is entering irq away from idle
--
1.7.8

2012-08-30 21:16:19

by Paul E. McKenney

[permalink] [raw]
Subject: [PATCH tip/core/rcu 13/26] x86: Exit RCU extended QS on notify resume

From: Frederic Weisbecker <[email protected]>

do_notify_resume() may be called on irq or exception
exit. But at that time the exception has already called
rcu_user_enter() and the irq has already called rcu_irq_exit().

Since it can use RCU read side critical section, we must call
rcu_user_exit() before doing anything there. Then we must call
back rcu_user_enter() after this function because we know we are
going to userspace from there.

This complete support for userspace RCU extended quiescent state
in x86-64.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Alessio Igor Bogani <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Avi Kivity <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Geoff Levand <[email protected]>
Cc: Gilad Ben Yossef <[email protected]>
Cc: Hakan Akkan <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Kevin Hilman <[email protected]>
Cc: Max Krasnyansky <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Sven-Thorsten Dietrich <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
arch/x86/Kconfig | 1 +
arch/x86/kernel/signal.c | 4 ++++
2 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index ba2657c..5cd953a 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -97,6 +97,7 @@ config X86
select KTIME_SCALAR if X86_32
select GENERIC_STRNCPY_FROM_USER
select GENERIC_STRNLEN_USER
+ select HAVE_RCU_USER_QS if X86_64

config INSTRUCTION_DECODER
def_bool (KPROBES || PERF_EVENTS || UPROBES)
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index b280908..bca0ab9 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -779,6 +779,8 @@ static void do_signal(struct pt_regs *regs)
void
do_notify_resume(struct pt_regs *regs, void *unused, __u32 thread_info_flags)
{
+ rcu_user_exit();
+
#ifdef CONFIG_X86_MCE
/* notify userspace of pending MCEs */
if (thread_info_flags & _TIF_MCE_NOTIFY)
@@ -804,6 +806,8 @@ do_notify_resume(struct pt_regs *regs, void *unused, __u32 thread_info_flags)
#ifdef CONFIG_X86_32
clear_thread_flag(TIF_IRET);
#endif /* CONFIG_X86_32 */
+
+ rcu_user_enter();
}

void signal_fault(struct pt_regs *regs, void __user *frame, char *where)
--
1.7.8

2012-08-30 21:14:24

by Paul E. McKenney

[permalink] [raw]
Subject: [PATCH tip/core/rcu 15/26] alpha: Fix preemption handling in idle loop

From: Frederic Weisbecker <[email protected]>

cpu_idle() is called on the boot CPU by the init code with
preemption disabled. But the cpu_idle() function in alpha
doesn't handle this when it calls schedule() directly.

Fix it by converting it into schedule_preempt_disabled().

Also disable preemption before calling cpu_idle() from
secondary CPU entry code to stay consistent with this
state.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Richard Henderson <[email protected]>
Cc: Ivan Kokshaysky <[email protected]>
Cc: Matt Turner <[email protected]>
Cc: alpha <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Michael Cree <[email protected]>
---
arch/alpha/kernel/process.c | 3 ++-
arch/alpha/kernel/smp.c | 1 +
2 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/arch/alpha/kernel/process.c b/arch/alpha/kernel/process.c
index 153d3fc..eac5e01 100644
--- a/arch/alpha/kernel/process.c
+++ b/arch/alpha/kernel/process.c
@@ -56,7 +56,8 @@ cpu_idle(void)

while (!need_resched())
cpu_relax();
- schedule();
+
+ schedule_preempt_disabled();
}
}

diff --git a/arch/alpha/kernel/smp.c b/arch/alpha/kernel/smp.c
index 35ddc02..a41ad90 100644
--- a/arch/alpha/kernel/smp.c
+++ b/arch/alpha/kernel/smp.c
@@ -166,6 +166,7 @@ smp_callin(void)
DBGS(("smp_callin: commencing CPU %d current %p active_mm %p\n",
cpuid, current, current->active_mm));

+ preempt_disable();
/* Do nothing. */
cpu_idle();
}
--
1.7.8

2012-08-30 21:16:36

by Paul E. McKenney

[permalink] [raw]
Subject: [PATCH tip/core/rcu 20/26] m32r: Add missing RCU idle APIs on idle loop

From: Frederic Weisbecker <[email protected]>

In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in the m32r's idle loop.

Reported-by: Paul E. McKenney <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Hirokazu Takata <[email protected]>
Cc: 3.2.x.. <[email protected]>
Cc: Paul E. McKenney <[email protected]>
---
arch/m32r/kernel/process.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/m32r/kernel/process.c b/arch/m32r/kernel/process.c
index 3a4a32b..384e63f 100644
--- a/arch/m32r/kernel/process.c
+++ b/arch/m32r/kernel/process.c
@@ -26,6 +26,7 @@
#include <linux/ptrace.h>
#include <linux/unistd.h>
#include <linux/hardirq.h>
+#include <linux/rcupdate.h>

#include <asm/io.h>
#include <asm/uaccess.h>
@@ -82,6 +83,7 @@ void cpu_idle (void)
{
/* endless idle loop with no priority at all */
while (1) {
+ rcu_idle_enter();
while (!need_resched()) {
void (*idle)(void) = pm_idle;

@@ -90,6 +92,7 @@ void cpu_idle (void)

idle();
}
+ rcu_idle_exit();
schedule_preempt_disabled();
}
}
--
1.7.8

2012-08-30 21:16:58

by Paul E. McKenney

[permalink] [raw]
Subject: [PATCH tip/core/rcu 10/26] rcu: Exit RCU extended QS on kernel preemption after irq/exception

From: Frederic Weisbecker <[email protected]>

When an exception or an irq exits, and we are going to resume into
interrupted kernel code, the low level architecture code calls
preempt_schedule_irq() if there is a need to reschedule.

If the interrupt/exception occured between a call to rcu_user_enter()
(from syscall exit, exception exit, do_notify_resume exit, ...) and
a real resume to userspace (iret,...), preempt_schedule_irq() can be
called whereas RCU thinks we are in userspace. But preempt_schedule_irq()
is going to run kernel code and may be some RCU read side critical
section. We must exit the userspace extended quiescent state before
we call it.

To solve this, just call rcu_user_exit() in the beginning of
preempt_schedule_irq().

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Alessio Igor Bogani <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Avi Kivity <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Geoff Levand <[email protected]>
Cc: Gilad Ben Yossef <[email protected]>
Cc: Hakan Akkan <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Kevin Hilman <[email protected]>
Cc: Max Krasnyansky <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Sven-Thorsten Dietrich <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
kernel/sched/core.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 07c6d9a..0bd599b 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3564,6 +3564,7 @@ asmlinkage void __sched preempt_schedule_irq(void)
/* Catch callers which need to be fixed */
BUG_ON(ti->preempt_count || !irqs_disabled());

+ rcu_user_exit();
do {
add_preempt_count(PREEMPT_ACTIVE);
local_irq_enable();
--
1.7.8

2012-08-30 21:17:00

by Paul E. McKenney

[permalink] [raw]
Subject: [PATCH tip/core/rcu 06/26] rcu: Ignore userspace extended quiescent state by default

From: Frederic Weisbecker <[email protected]>

By default we don't want to enter into RCU extended quiescent
state while in userspace because doing this produces some overhead
(eg: use of syscall slowpath). Set it off by default and ready to
run when some feature like adaptive tickless need it.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Alessio Igor Bogani <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Avi Kivity <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Geoff Levand <[email protected]>
Cc: Gilad Ben Yossef <[email protected]>
Cc: Hakan Akkan <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Kevin Hilman <[email protected]>
Cc: Max Krasnyansky <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Sven-Thorsten Dietrich <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
kernel/rcutree.c | 5 ++++-
kernel/rcutree.h | 1 +
2 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 8bbc7fb..e2fd370 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -210,6 +210,9 @@ EXPORT_SYMBOL_GPL(rcu_note_context_switch);
DEFINE_PER_CPU(struct rcu_dynticks, rcu_dynticks) = {
.dynticks_nesting = DYNTICK_TASK_EXIT_IDLE,
.dynticks = ATOMIC_INIT(1),
+#ifdef CONFIG_RCU_USER_QS
+ .ignore_user_qs = true,
+#endif
};

static int blimit = 10; /* Maximum callbacks per rcu_do_batch. */
@@ -443,7 +446,7 @@ void rcu_user_enter(void)

local_irq_save(flags);
rdtp = &__get_cpu_var(rcu_dynticks);
- if (!rdtp->in_user) {
+ if (!rdtp->ignore_user_qs && !rdtp->in_user) {
rdtp->in_user = true;
rcu_eqs_enter(1);
}
diff --git a/kernel/rcutree.h b/kernel/rcutree.h
index 0dd5fd6..c190582 100644
--- a/kernel/rcutree.h
+++ b/kernel/rcutree.h
@@ -103,6 +103,7 @@ struct rcu_dynticks {
int tick_nohz_enabled_snap; /* Previously seen value from sysfs. */
#endif /* #ifdef CONFIG_RCU_FAST_NO_HZ */
#ifdef CONFIG_RCU_USER_QS
+ bool ignore_user_qs; /* Treat userspace as extended QS or not */
bool in_user; /* Is the CPU in userland from RCU POV? */
#endif
};
--
1.7.8

2012-08-30 21:16:56

by Paul E. McKenney

[permalink] [raw]
Subject: [PATCH tip/core/rcu 16/26] alpha: Add missing RCU idle APIs on idle loop

From: Frederic Weisbecker <[email protected]>

In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in the Alpha's idle loop.

Reported-by: Paul E. McKenney <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Richard Henderson <[email protected]>
Cc: Ivan Kokshaysky <[email protected]>
Cc: Matt Turner <[email protected]>
Cc: alpha <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Michael Cree <[email protected]>
Cc: 3.2.x.. <[email protected]>
---
arch/alpha/kernel/process.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/alpha/kernel/process.c b/arch/alpha/kernel/process.c
index eac5e01..eb9558c 100644
--- a/arch/alpha/kernel/process.c
+++ b/arch/alpha/kernel/process.c
@@ -28,6 +28,7 @@
#include <linux/tty.h>
#include <linux/console.h>
#include <linux/slab.h>
+#include <linux/rcupdate.h>

#include <asm/reg.h>
#include <asm/uaccess.h>
@@ -54,9 +55,11 @@ cpu_idle(void)
/* FIXME -- EV6 and LCA45 know how to power down
the CPU. */

+ rcu_idle_enter();
while (!need_resched())
cpu_relax();

+ rcu_idle_exit();
schedule_preempt_disabled();
}
}
--
1.7.8

2012-08-30 21:16:55

by Paul E. McKenney

[permalink] [raw]
Subject: [PATCH tip/core/rcu 01/26] rcu: New rcu_user_enter() and rcu_user_exit() APIs

From: Frederic Weisbecker <[email protected]>

RCU currently insists that only idle tasks can enter RCU idle mode, which
prohibits an adaptive tickless kernel (AKA nohz cpusets), which in turn
would mean that usermode execution would always take scheduling-clock
interrupts, even when there is only one task runnable on the CPU in
question.

This commit therefore adds rcu_user_enter() and rcu_user_exit(), which
allow non-idle tasks to enter RCU idle mode. These are quite similar
to rcu_idle_enter() and rcu_idle_exit(), respectively, except that they
omit the idle-task checks.

[ Updated to use "user" flag rather than separate check functions. ]

Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Cc: Alessio Igor Bogani <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Avi Kivity <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Daniel Lezcano <[email protected]>
Cc: Geoff Levand <[email protected]>
Cc: Gilad Ben Yossef <[email protected]>
Cc: Hakan Akkan <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Kevin Hilman <[email protected]>
Cc: Max Krasnyansky <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Sven-Thorsten Dietrich <[email protected]>
Cc: Thomas Gleixner <[email protected]>
---
include/linux/rcupdate.h | 2 +
kernel/rcutree.c | 115 ++++++++++++++++++++++++++++++++--------------
2 files changed, 83 insertions(+), 34 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 115ead2..2a7549c 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -191,6 +191,8 @@ extern void rcu_idle_enter(void);
extern void rcu_idle_exit(void);
extern void rcu_irq_enter(void);
extern void rcu_irq_exit(void);
+extern void rcu_user_enter(void);
+extern void rcu_user_exit(void);
extern void exit_rcu(void);

/**
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index f280e54..c0507b7 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -346,16 +346,17 @@ static int rcu_implicit_offline_qs(struct rcu_data *rdp)
}

/*
- * rcu_idle_enter_common - inform RCU that current CPU is moving towards idle
+ * rcu_eqs_enter_common - current CPU is moving towards extended quiescent state
*
* If the new value of the ->dynticks_nesting counter now is zero,
* we really have entered idle, and must do the appropriate accounting.
* The caller must have disabled interrupts.
*/
-static void rcu_idle_enter_common(struct rcu_dynticks *rdtp, long long oldval)
+static void rcu_eqs_enter_common(struct rcu_dynticks *rdtp, long long oldval,
+ bool user)
{
trace_rcu_dyntick("Start", oldval, 0);
- if (!is_idle_task(current)) {
+ if (!is_idle_task(current) && !user) {
struct task_struct *idle = idle_task(smp_processor_id());

trace_rcu_dyntick("Error on entry: not idle task", oldval, 0);
@@ -372,7 +373,7 @@ static void rcu_idle_enter_common(struct rcu_dynticks *rdtp, long long oldval)
WARN_ON_ONCE(atomic_read(&rdtp->dynticks) & 0x1);

/*
- * The idle task is not permitted to enter the idle loop while
+ * It is illegal to enter an extended quiescent state while
* in an RCU read-side critical section.
*/
rcu_lockdep_assert(!lock_is_held(&rcu_lock_map),
@@ -383,19 +384,11 @@ static void rcu_idle_enter_common(struct rcu_dynticks *rdtp, long long oldval)
"Illegal idle entry in RCU-sched read-side critical section.");
}

-/**
- * rcu_idle_enter - inform RCU that current CPU is entering idle
- *
- * Enter idle mode, in other words, -leave- the mode in which RCU
- * read-side critical sections can occur. (Though RCU read-side
- * critical sections can occur in irq handlers in idle, a possibility
- * handled by irq_enter() and irq_exit().)
- *
- * We crowbar the ->dynticks_nesting field to zero to allow for
- * the possibility of usermode upcalls having messed up our count
- * of interrupt nesting level during the prior busy period.
+/*
+ * Enter an RCU extended quiescent state, which can be either the
+ * idle loop or adaptive-tickless usermode execution.
*/
-void rcu_idle_enter(void)
+static void rcu_eqs_enter(bool user)
{
unsigned long flags;
long long oldval;
@@ -409,12 +402,44 @@ void rcu_idle_enter(void)
rdtp->dynticks_nesting = 0;
else
rdtp->dynticks_nesting -= DYNTICK_TASK_NEST_VALUE;
- rcu_idle_enter_common(rdtp, oldval);
+ rcu_eqs_enter_common(rdtp, oldval, user);
local_irq_restore(flags);
}
+
+/**
+ * rcu_idle_enter - inform RCU that current CPU is entering idle
+ *
+ * Enter idle mode, in other words, -leave- the mode in which RCU
+ * read-side critical sections can occur. (Though RCU read-side
+ * critical sections can occur in irq handlers in idle, a possibility
+ * handled by irq_enter() and irq_exit().)
+ *
+ * We crowbar the ->dynticks_nesting field to zero to allow for
+ * the possibility of usermode upcalls having messed up our count
+ * of interrupt nesting level during the prior busy period.
+ */
+void rcu_idle_enter(void)
+{
+ rcu_eqs_enter(0);
+}
EXPORT_SYMBOL_GPL(rcu_idle_enter);

/**
+ * rcu_user_enter - inform RCU that we are resuming userspace.
+ *
+ * Enter RCU idle mode right before resuming userspace. No use of RCU
+ * is permitted between this call and rcu_user_exit(). This way the
+ * CPU doesn't need to maintain the tick for RCU maintenance purposes
+ * when the CPU runs in userspace.
+ */
+void rcu_user_enter(void)
+{
+ rcu_eqs_enter(1);
+}
+EXPORT_SYMBOL_GPL(rcu_user_enter);
+
+
+/**
* rcu_irq_exit - inform RCU that current CPU is exiting irq towards idle
*
* Exit from an interrupt handler, which might possibly result in entering
@@ -444,18 +469,19 @@ void rcu_irq_exit(void)
if (rdtp->dynticks_nesting)
trace_rcu_dyntick("--=", oldval, rdtp->dynticks_nesting);
else
- rcu_idle_enter_common(rdtp, oldval);
+ rcu_eqs_enter_common(rdtp, oldval, 1);
local_irq_restore(flags);
}

/*
- * rcu_idle_exit_common - inform RCU that current CPU is moving away from idle
+ * rcu_eqs_exit_common - current CPU moving away from extended quiescent state
*
* If the new value of the ->dynticks_nesting counter was previously zero,
* we really have exited idle, and must do the appropriate accounting.
* The caller must have disabled interrupts.
*/
-static void rcu_idle_exit_common(struct rcu_dynticks *rdtp, long long oldval)
+static void rcu_eqs_exit_common(struct rcu_dynticks *rdtp, long long oldval,
+ int user)
{
smp_mb__before_atomic_inc(); /* Force ordering w/previous sojourn. */
atomic_inc(&rdtp->dynticks);
@@ -464,7 +490,7 @@ static void rcu_idle_exit_common(struct rcu_dynticks *rdtp, long long oldval)
WARN_ON_ONCE(!(atomic_read(&rdtp->dynticks) & 0x1));
rcu_cleanup_after_idle(smp_processor_id());
trace_rcu_dyntick("End", oldval, rdtp->dynticks_nesting);
- if (!is_idle_task(current)) {
+ if (!is_idle_task(current) && !user) {
struct task_struct *idle = idle_task(smp_processor_id());

trace_rcu_dyntick("Error on exit: not idle task",
@@ -476,18 +502,11 @@ static void rcu_idle_exit_common(struct rcu_dynticks *rdtp, long long oldval)
}
}

-/**
- * rcu_idle_exit - inform RCU that current CPU is leaving idle
- *
- * Exit idle mode, in other words, -enter- the mode in which RCU
- * read-side critical sections can occur.
- *
- * We crowbar the ->dynticks_nesting field to DYNTICK_TASK_NEST to
- * allow for the possibility of usermode upcalls messing up our count
- * of interrupt nesting level during the busy period that is just
- * now starting.
+/*
+ * Exit an RCU extended quiescent state, which can be either the
+ * idle loop or adaptive-tickless usermode execution.
*/
-void rcu_idle_exit(void)
+static void rcu_eqs_exit(bool user)
{
unsigned long flags;
struct rcu_dynticks *rdtp;
@@ -501,12 +520,40 @@ void rcu_idle_exit(void)
rdtp->dynticks_nesting += DYNTICK_TASK_NEST_VALUE;
else
rdtp->dynticks_nesting = DYNTICK_TASK_EXIT_IDLE;
- rcu_idle_exit_common(rdtp, oldval);
+ rcu_eqs_exit_common(rdtp, oldval, user);
local_irq_restore(flags);
}
+
+/**
+ * rcu_idle_exit - inform RCU that current CPU is leaving idle
+ *
+ * Exit idle mode, in other words, -enter- the mode in which RCU
+ * read-side critical sections can occur.
+ *
+ * We crowbar the ->dynticks_nesting field to DYNTICK_TASK_NEST to
+ * allow for the possibility of usermode upcalls messing up our count
+ * of interrupt nesting level during the busy period that is just
+ * now starting.
+ */
+void rcu_idle_exit(void)
+{
+ rcu_eqs_exit(0);
+}
EXPORT_SYMBOL_GPL(rcu_idle_exit);

/**
+ * rcu_user_exit - inform RCU that we are exiting userspace.
+ *
+ * Exit RCU idle mode while entering the kernel because it can
+ * run a RCU read side critical section anytime.
+ */
+void rcu_user_exit(void)
+{
+ rcu_eqs_exit(1);
+}
+EXPORT_SYMBOL_GPL(rcu_user_exit);
+
+/**
* rcu_irq_enter - inform RCU that current CPU is entering irq away from idle
*
* Enter an interrupt handler, which might possibly result in exiting
@@ -539,7 +586,7 @@ void rcu_irq_enter(void)
if (oldval)
trace_rcu_dyntick("++=", oldval, rdtp->dynticks_nesting);
else
- rcu_idle_exit_common(rdtp, oldval);
+ rcu_eqs_exit_common(rdtp, oldval, 1);
local_irq_restore(flags);
}

--
1.7.8

2012-08-30 21:17:55

by Paul E. McKenney

[permalink] [raw]
Subject: [PATCH tip/core/rcu 11/26] rcu: Exit RCU extended QS on user preemption

From: Frederic Weisbecker <[email protected]>

When exceptions or irq are about to resume userspace, if
the task needs to be rescheduled, the arch low level code
calls schedule() directly.

At that time we may be in extended quiescent state from RCU
POV: the exception is not anymore protected inside
rcu_user_exit() - rcu_user_enter() and the irq has called
rcu_irq_exit() already.

Create a new API schedule_user() that calls schedule() inside
rcu_user_exit()-rcu_user_enter() in order to protect it. Archs
will need to rely on it now to implement user preemption safely.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Alessio Igor Bogani <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Avi Kivity <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Geoff Levand <[email protected]>
Cc: Gilad Ben Yossef <[email protected]>
Cc: Hakan Akkan <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Kevin Hilman <[email protected]>
Cc: Max Krasnyansky <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Sven-Thorsten Dietrich <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
kernel/sched/core.c | 7 +++++++
1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 0bd599b..e841dfc 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3463,6 +3463,13 @@ asmlinkage void __sched schedule(void)
}
EXPORT_SYMBOL(schedule);

+asmlinkage void __sched schedule_user(void)
+{
+ rcu_user_exit();
+ schedule();
+ rcu_user_enter();
+}
+
/**
* schedule_preempt_disabled - called with preemption disabled
*
--
1.7.8

2012-08-30 21:17:53

by Paul E. McKenney

[permalink] [raw]
Subject: [PATCH tip/core/rcu 08/26] x86: Syscall hooks for userspace RCU extended QS

From: Frederic Weisbecker <[email protected]>

Add syscall slow path hooks to notify syscall entry
and exit on CPUs that want to support userspace RCU
extended quiescent state.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Alessio Igor Bogani <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Avi Kivity <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Geoff Levand <[email protected]>
Cc: Gilad Ben Yossef <[email protected]>
Cc: Hakan Akkan <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Kevin Hilman <[email protected]>
Cc: Max Krasnyansky <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Sven-Thorsten Dietrich <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
arch/x86/include/asm/thread_info.h | 10 +++++++---
arch/x86/kernel/ptrace.c | 5 +++++
2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index 89f794f..c535d84 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -89,6 +89,7 @@ struct thread_info {
#define TIF_NOTSC 16 /* TSC is not accessible in userland */
#define TIF_IA32 17 /* IA32 compatibility process */
#define TIF_FORK 18 /* ret_from_fork */
+#define TIF_NOHZ 19 /* in adaptive nohz mode */
#define TIF_MEMDIE 20 /* is terminating due to OOM killer */
#define TIF_DEBUG 21 /* uses debug registers */
#define TIF_IO_BITMAP 22 /* uses I/O bitmap */
@@ -114,6 +115,7 @@ struct thread_info {
#define _TIF_NOTSC (1 << TIF_NOTSC)
#define _TIF_IA32 (1 << TIF_IA32)
#define _TIF_FORK (1 << TIF_FORK)
+#define _TIF_NOHZ (1 << TIF_NOHZ)
#define _TIF_DEBUG (1 << TIF_DEBUG)
#define _TIF_IO_BITMAP (1 << TIF_IO_BITMAP)
#define _TIF_FORCED_TF (1 << TIF_FORCED_TF)
@@ -126,12 +128,13 @@ struct thread_info {
/* work to do in syscall_trace_enter() */
#define _TIF_WORK_SYSCALL_ENTRY \
(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_EMU | _TIF_SYSCALL_AUDIT | \
- _TIF_SECCOMP | _TIF_SINGLESTEP | _TIF_SYSCALL_TRACEPOINT)
+ _TIF_SECCOMP | _TIF_SINGLESTEP | _TIF_SYSCALL_TRACEPOINT | \
+ _TIF_NOHZ)

/* work to do in syscall_trace_leave() */
#define _TIF_WORK_SYSCALL_EXIT \
(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | _TIF_SINGLESTEP | \
- _TIF_SYSCALL_TRACEPOINT)
+ _TIF_SYSCALL_TRACEPOINT | _TIF_NOHZ)

/* work to do on interrupt/exception return */
#define _TIF_WORK_MASK \
@@ -141,7 +144,8 @@ struct thread_info {

/* work to do on any return to user space */
#define _TIF_ALLWORK_MASK \
- ((0x0000FFFF & ~_TIF_SECCOMP) | _TIF_SYSCALL_TRACEPOINT)
+ ((0x0000FFFF & ~_TIF_SECCOMP) | _TIF_SYSCALL_TRACEPOINT | \
+ _TIF_NOHZ)

/* Only used for 64 bit */
#define _TIF_DO_NOTIFY_MASK \
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index c4c6a5c..9f94f8e 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -21,6 +21,7 @@
#include <linux/signal.h>
#include <linux/perf_event.h>
#include <linux/hw_breakpoint.h>
+#include <linux/rcupdate.h>

#include <asm/uaccess.h>
#include <asm/pgtable.h>
@@ -1463,6 +1464,8 @@ long syscall_trace_enter(struct pt_regs *regs)
{
long ret = 0;

+ rcu_user_exit();
+
/*
* If we stepped into a sysenter/syscall insn, it trapped in
* kernel mode; do_debug() cleared TF and set TIF_SINGLESTEP.
@@ -1526,4 +1529,6 @@ void syscall_trace_leave(struct pt_regs *regs)
!test_thread_flag(TIF_SYSCALL_EMU);
if (step || test_thread_flag(TIF_SYSCALL_TRACE))
tracehook_report_syscall_exit(regs, step);
+
+ rcu_user_enter();
}
--
1.7.8

2012-08-30 21:18:27

by Paul E. McKenney

[permalink] [raw]
Subject: [PATCH tip/core/rcu 03/26] rcu: Make RCU_FAST_NO_HZ handle adaptive ticks

From: "Paul E. McKenney" <[email protected]>

The current implementation of RCU_FAST_NO_HZ tries reasonably hard to rid
the current CPU of RCU callbacks. This is appropriate when the CPU is
entering idle, where it doesn't have much useful to do anyway, but is most
definitely not what you want when transitioning to user-mode execution.
This commit therefore detects the adaptive-tick case, and refrains from
burning CPU time getting rid of RCU callbacks in that case.

Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
kernel/rcutree_plugin.h | 20 ++++++++++++++++++++
1 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 7f3244c..b0f09d6 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -1997,6 +1997,26 @@ static void rcu_prepare_for_idle(int cpu)
if (!tne)
return;

+ /* Adaptive-tick mode, where usermode execution is idle to RCU. */
+ if (!is_idle_task(current)) {
+ rdtp->dyntick_holdoff = jiffies - 1;
+ if (rcu_cpu_has_nonlazy_callbacks(cpu)) {
+ trace_rcu_prep_idle("User dyntick with callbacks");
+ rdtp->idle_gp_timer_expires =
+ round_up(jiffies + RCU_IDLE_GP_DELAY,
+ RCU_IDLE_GP_DELAY);
+ } else if (rcu_cpu_has_callbacks(cpu)) {
+ rdtp->idle_gp_timer_expires =
+ round_jiffies(jiffies + RCU_IDLE_LAZY_GP_DELAY);
+ trace_rcu_prep_idle("User dyntick with lazy callbacks");
+ } else {
+ return;
+ }
+ tp = &rdtp->idle_gp_timer;
+ mod_timer_pinned(tp, rdtp->idle_gp_timer_expires);
+ return;
+ }
+
/*
* If this is an idle re-entry, for example, due to use of
* RCU_NONIDLE() or the new idle-loop tracing API within the idle
--
1.7.8

2012-08-30 21:18:45

by Paul E. McKenney

[permalink] [raw]
Subject: [PATCH tip/core/rcu 25/26] xtensa: Add missing RCU idle APIs on idle loop

From: Frederic Weisbecker <[email protected]>

In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in the xtensa's idle loop.

Reported-by: Paul E. McKenney <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Chris Zankel <[email protected]>
Cc: 3.2.x.. <[email protected]>
Cc: Paul E. McKenney <[email protected]>
---
arch/xtensa/kernel/process.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/xtensa/kernel/process.c b/arch/xtensa/kernel/process.c
index 2c8d6a3..bc44311 100644
--- a/arch/xtensa/kernel/process.c
+++ b/arch/xtensa/kernel/process.c
@@ -31,6 +31,7 @@
#include <linux/mqueue.h>
#include <linux/fs.h>
#include <linux/slab.h>
+#include <linux/rcupdate.h>

#include <asm/pgtable.h>
#include <asm/uaccess.h>
@@ -110,8 +111,10 @@ void cpu_idle(void)

/* endless idle loop with no priority at all */
while (1) {
+ rcu_idle_enter();
while (!need_resched())
platform_idle();
+ rcu_idle_exit();
schedule_preempt_disabled();
}
}
--
1.7.8

2012-08-30 21:18:44

by Paul E. McKenney

[permalink] [raw]
Subject: [PATCH tip/core/rcu 23/26] parisc: Add missing RCU idle APIs on idle loop

From: Frederic Weisbecker <[email protected]>

In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in the parisc's idle loop.

Reported-by: Paul E. McKenney <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: James E.J. Bottomley <[email protected]>
Cc: Helge Deller <[email protected]>
Cc: Parisc <[email protected]>
Cc: 3.2.x.. <[email protected]>
Cc: Paul E. McKenney <[email protected]>
---
arch/parisc/kernel/process.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/parisc/kernel/process.c b/arch/parisc/kernel/process.c
index d4b94b3..c54a4db 100644
--- a/arch/parisc/kernel/process.c
+++ b/arch/parisc/kernel/process.c
@@ -48,6 +48,7 @@
#include <linux/unistd.h>
#include <linux/kallsyms.h>
#include <linux/uaccess.h>
+#include <linux/rcupdate.h>

#include <asm/io.h>
#include <asm/asm-offsets.h>
@@ -69,8 +70,10 @@ void cpu_idle(void)

/* endless idle loop with no priority at all */
while (1) {
+ rcu_idle_enter();
while (!need_resched())
barrier();
+ rcu_idle_exit();
schedule_preempt_disabled();
check_pgt_cache();
}
--
1.7.8

2012-08-30 21:14:14

by Paul E. McKenney

[permalink] [raw]
Subject: [PATCH tip/core/rcu 22/26] mn10300: Add missing RCU idle APIs on idle loop

From: Frederic Weisbecker <[email protected]>

In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in the mn10300's idle loop.

Reported-by: Paul E. McKenney <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: David Howells <[email protected]>
Cc: Koichi Yasutake <[email protected]>
Cc: 3.2.x.. <[email protected]>
Cc: Paul E. McKenney <[email protected]>
---
arch/mn10300/kernel/process.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/mn10300/kernel/process.c b/arch/mn10300/kernel/process.c
index 7dab0cd..e9cceba 100644
--- a/arch/mn10300/kernel/process.c
+++ b/arch/mn10300/kernel/process.c
@@ -25,6 +25,7 @@
#include <linux/err.h>
#include <linux/fs.h>
#include <linux/slab.h>
+#include <linux/rcupdate.h>
#include <asm/uaccess.h>
#include <asm/pgtable.h>
#include <asm/io.h>
@@ -107,6 +108,7 @@ void cpu_idle(void)
{
/* endless idle loop with no priority at all */
for (;;) {
+ rcu_idle_enter();
while (!need_resched()) {
void (*idle)(void);

@@ -121,6 +123,7 @@ void cpu_idle(void)
}
idle();
}
+ rcu_idle_exit();

schedule_preempt_disabled();
}
--
1.7.8

2012-08-30 21:19:19

by Paul E. McKenney

[permalink] [raw]
Subject: [PATCH tip/core/rcu 05/26] rcu: Allow rcu_user_enter()/exit() to nest

From: Frederic Weisbecker <[email protected]>

Allow calls to rcu_user_enter() even if we are already
in userspace (as seen by RCU) and allow calls to rcu_user_exit()
even if we are already in the kernel.

This makes the APIs more flexible to be called from architectures.
Exception entries for example won't need to know if they come from
userspace before calling rcu_user_exit().

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Alessio Igor Bogani <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Avi Kivity <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Geoff Levand <[email protected]>
Cc: Gilad Ben Yossef <[email protected]>
Cc: Hakan Akkan <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Kevin Hilman <[email protected]>
Cc: Max Krasnyansky <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Sven-Thorsten Dietrich <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
kernel/rcutree.c | 41 +++++++++++++++++++++++++++++++++--------
kernel/rcutree.h | 3 +++
2 files changed, 36 insertions(+), 8 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index e287c4a..8bbc7fb 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -390,11 +390,9 @@ static void rcu_eqs_enter_common(struct rcu_dynticks *rdtp, long long oldval,
*/
static void rcu_eqs_enter(bool user)
{
- unsigned long flags;
long long oldval;
struct rcu_dynticks *rdtp;

- local_irq_save(flags);
rdtp = &__get_cpu_var(rcu_dynticks);
oldval = rdtp->dynticks_nesting;
WARN_ON_ONCE((oldval & DYNTICK_TASK_NEST_MASK) == 0);
@@ -403,7 +401,6 @@ static void rcu_eqs_enter(bool user)
else
rdtp->dynticks_nesting -= DYNTICK_TASK_NEST_VALUE;
rcu_eqs_enter_common(rdtp, oldval, user);
- local_irq_restore(flags);
}

/**
@@ -420,7 +417,11 @@ static void rcu_eqs_enter(bool user)
*/
void rcu_idle_enter(void)
{
+ unsigned long flags;
+
+ local_irq_save(flags);
rcu_eqs_enter(0);
+ local_irq_restore(flags);
}
EXPORT_SYMBOL_GPL(rcu_idle_enter);

@@ -435,7 +436,18 @@ EXPORT_SYMBOL_GPL(rcu_idle_enter);
*/
void rcu_user_enter(void)
{
- rcu_eqs_enter(1);
+ unsigned long flags;
+ struct rcu_dynticks *rdtp;
+
+ WARN_ON_ONCE(!current->mm);
+
+ local_irq_save(flags);
+ rdtp = &__get_cpu_var(rcu_dynticks);
+ if (!rdtp->in_user) {
+ rdtp->in_user = true;
+ rcu_eqs_enter(1);
+ }
+ local_irq_restore(flags);
}
EXPORT_SYMBOL_GPL(rcu_user_enter);

@@ -530,11 +542,9 @@ static void rcu_eqs_exit_common(struct rcu_dynticks *rdtp, long long oldval,
*/
static void rcu_eqs_exit(bool user)
{
- unsigned long flags;
struct rcu_dynticks *rdtp;
long long oldval;

- local_irq_save(flags);
rdtp = &__get_cpu_var(rcu_dynticks);
oldval = rdtp->dynticks_nesting;
WARN_ON_ONCE(oldval < 0);
@@ -543,7 +553,6 @@ static void rcu_eqs_exit(bool user)
else
rdtp->dynticks_nesting = DYNTICK_TASK_EXIT_IDLE;
rcu_eqs_exit_common(rdtp, oldval, user);
- local_irq_restore(flags);
}

/**
@@ -559,7 +568,11 @@ static void rcu_eqs_exit(bool user)
*/
void rcu_idle_exit(void)
{
+ unsigned long flags;
+
+ local_irq_save(flags);
rcu_eqs_exit(0);
+ local_irq_restore(flags);
}
EXPORT_SYMBOL_GPL(rcu_idle_exit);

@@ -572,7 +585,16 @@ EXPORT_SYMBOL_GPL(rcu_idle_exit);
*/
void rcu_user_exit(void)
{
- rcu_eqs_exit(1);
+ unsigned long flags;
+ struct rcu_dynticks *rdtp;
+
+ local_irq_save(flags);
+ rdtp = &__get_cpu_var(rcu_dynticks);
+ if (rdtp->in_user) {
+ rdtp->in_user = false;
+ rcu_eqs_exit(1);
+ }
+ local_irq_restore(flags);
}
EXPORT_SYMBOL_GPL(rcu_user_exit);

@@ -2590,6 +2612,9 @@ rcu_boot_init_percpu_data(int cpu, struct rcu_state *rsp)
rdp->dynticks = &per_cpu(rcu_dynticks, cpu);
WARN_ON_ONCE(rdp->dynticks->dynticks_nesting != DYNTICK_TASK_EXIT_IDLE);
WARN_ON_ONCE(atomic_read(&rdp->dynticks->dynticks) != 1);
+#ifdef CONFIG_RCU_USER_QS
+ WARN_ON_ONCE(rdp->dynticks->in_user);
+#endif
rdp->cpu = cpu;
rdp->rsp = rsp;
raw_spin_unlock_irqrestore(&rnp->lock, flags);
diff --git a/kernel/rcutree.h b/kernel/rcutree.h
index 4d29169..0dd5fd6 100644
--- a/kernel/rcutree.h
+++ b/kernel/rcutree.h
@@ -102,6 +102,9 @@ struct rcu_dynticks {
/* idle-period nonlazy_posted snapshot. */
int tick_nohz_enabled_snap; /* Previously seen value from sysfs. */
#endif /* #ifdef CONFIG_RCU_FAST_NO_HZ */
+#ifdef CONFIG_RCU_USER_QS
+ bool in_user; /* Is the CPU in userland from RCU POV? */
+#endif
};

/* RCU's kthread states for tracing. */
--
1.7.8

2012-08-30 21:19:22

by Paul E. McKenney

[permalink] [raw]
Subject: [PATCH tip/core/rcu 21/26] m68k: Add missing RCU idle APIs on idle loop

From: Frederic Weisbecker <[email protected]>

In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in the m68k's idle loop.

Reported-by: Paul E. McKenney <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Acked-by: Geert Uytterhoeven <[email protected]>
Cc: m68k <[email protected]>
Cc: 3.2.x.. <[email protected]>
Cc: Paul E. McKenney <[email protected]>
---
arch/m68k/kernel/process.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/m68k/kernel/process.c b/arch/m68k/kernel/process.c
index c488e3c..ac2892e 100644
--- a/arch/m68k/kernel/process.c
+++ b/arch/m68k/kernel/process.c
@@ -25,6 +25,7 @@
#include <linux/reboot.h>
#include <linux/init_task.h>
#include <linux/mqueue.h>
+#include <linux/rcupdate.h>

#include <asm/uaccess.h>
#include <asm/traps.h>
@@ -75,8 +76,10 @@ void cpu_idle(void)
{
/* endless idle loop with no priority at all */
while (1) {
+ rcu_idle_enter();
while (!need_resched())
idle();
+ rcu_idle_exit();
schedule_preempt_disabled();
}
}
--
1.7.8

2012-08-30 21:14:12

by Paul E. McKenney

[permalink] [raw]
Subject: [PATCH tip/core/rcu 19/26] h8300: Add missing RCU idle APIs on idle loop

From: Frederic Weisbecker <[email protected]>

In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in the h8300's idle loop.

Reported-by: Paul E. McKenney <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Cc: 3.2.x.. <[email protected]>
Cc: Paul E. McKenney <[email protected]>
---
arch/h8300/kernel/process.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/h8300/kernel/process.c b/arch/h8300/kernel/process.c
index 0e9c315..f153ed1 100644
--- a/arch/h8300/kernel/process.c
+++ b/arch/h8300/kernel/process.c
@@ -36,6 +36,7 @@
#include <linux/reboot.h>
#include <linux/fs.h>
#include <linux/slab.h>
+#include <linux/rcupdate.h>

#include <asm/uaccess.h>
#include <asm/traps.h>
@@ -78,8 +79,10 @@ void (*idle)(void) = default_idle;
void cpu_idle(void)
{
while (1) {
+ rcu_idle_enter();
while (!need_resched())
idle();
+ rcu_idle_exit();
schedule_preempt_disabled();
}
}
--
1.7.8

2012-08-30 21:20:16

by Paul E. McKenney

[permalink] [raw]
Subject: [PATCH tip/core/rcu 12/26] x86: Use the new schedule_user API on userspace preemption

From: Frederic Weisbecker <[email protected]>

This way we can exit the RCU extended quiescent state before
we schedule a new task from irq/exception exit.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Alessio Igor Bogani <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Avi Kivity <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Geoff Levand <[email protected]>
Cc: Gilad Ben Yossef <[email protected]>
Cc: Hakan Akkan <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Kevin Hilman <[email protected]>
Cc: Max Krasnyansky <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Sven-Thorsten Dietrich <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
arch/x86/kernel/entry_64.S | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 69babd8..6230487 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -565,7 +565,7 @@ sysret_careful:
TRACE_IRQS_ON
ENABLE_INTERRUPTS(CLBR_NONE)
pushq_cfi %rdi
- call schedule
+ call schedule_user
popq_cfi %rdi
jmp sysret_check

@@ -678,7 +678,7 @@ int_careful:
TRACE_IRQS_ON
ENABLE_INTERRUPTS(CLBR_NONE)
pushq_cfi %rdi
- call schedule
+ call schedule_user
popq_cfi %rdi
DISABLE_INTERRUPTS(CLBR_NONE)
TRACE_IRQS_OFF
@@ -974,7 +974,7 @@ retint_careful:
TRACE_IRQS_ON
ENABLE_INTERRUPTS(CLBR_NONE)
pushq_cfi %rdi
- call schedule
+ call schedule_user
popq_cfi %rdi
GET_THREAD_INFO(%rcx)
DISABLE_INTERRUPTS(CLBR_NONE)
@@ -1449,7 +1449,7 @@ paranoid_userspace:
paranoid_schedule:
TRACE_IRQS_ON
ENABLE_INTERRUPTS(CLBR_ANY)
- call schedule
+ call schedule_user
DISABLE_INTERRUPTS(CLBR_ANY)
TRACE_IRQS_OFF
jmp paranoid_userspace
--
1.7.8

2012-08-31 19:07:47

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH tip/core/rcu 01/26] rcu: New rcu_user_enter() and rcu_user_exit() APIs

On Thu, Aug 30, 2012 at 02:05:18PM -0700, Paul E. McKenney wrote:
> From: Frederic Weisbecker <[email protected]>
>
> RCU currently insists that only idle tasks can enter RCU idle mode, which
> prohibits an adaptive tickless kernel (AKA nohz cpusets), which in turn
> would mean that usermode execution would always take scheduling-clock
> interrupts, even when there is only one task runnable on the CPU in
> question.
>
> This commit therefore adds rcu_user_enter() and rcu_user_exit(), which
> allow non-idle tasks to enter RCU idle mode. These are quite similar
> to rcu_idle_enter() and rcu_idle_exit(), respectively, except that they
> omit the idle-task checks.
>
> [ Updated to use "user" flag rather than separate check functions. ]
>
> Signed-off-by: Frederic Weisbecker <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>
> Cc: Alessio Igor Bogani <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Avi Kivity <[email protected]>
> Cc: Chris Metcalf <[email protected]>
> Cc: Christoph Lameter <[email protected]>
> Cc: Daniel Lezcano <[email protected]>
> Cc: Geoff Levand <[email protected]>
> Cc: Gilad Ben Yossef <[email protected]>
> Cc: Hakan Akkan <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Kevin Hilman <[email protected]>
> Cc: Max Krasnyansky <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Stephen Hemminger <[email protected]>
> Cc: Steven Rostedt <[email protected]>
> Cc: Sven-Thorsten Dietrich <[email protected]>
> Cc: Thomas Gleixner <[email protected]>

A few suggestions below: an optional microoptimization and some bugfixes.
With the bugfixes, and with or without the microoptimization:

Reviewed-by: Josh Triplett <[email protected]>

> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
[...]
> -static void rcu_idle_enter_common(struct rcu_dynticks *rdtp, long long oldval)
> +static void rcu_eqs_enter_common(struct rcu_dynticks *rdtp, long long oldval,
> + bool user)
> {
> trace_rcu_dyntick("Start", oldval, 0);
> - if (!is_idle_task(current)) {
> + if (!is_idle_task(current) && !user) {

Microoptimization: putting the !user check first (here and in the exit
function) would allow the compiler to partially inline rcu_eqs_*_common
into the two trivial wrappers and constant-fold away the test for !user.

> +void rcu_idle_enter(void)
> +{
> + rcu_eqs_enter(0);
> +}

s/0/false/

> +void rcu_user_enter(void)
> +{
> + rcu_eqs_enter(1);
> +}

s/1/true/

> -static void rcu_idle_exit_common(struct rcu_dynticks *rdtp, long long oldval)
> +static void rcu_eqs_exit_common(struct rcu_dynticks *rdtp, long long oldval,
> + int user)
> {
> smp_mb__before_atomic_inc(); /* Force ordering w/previous sojourn. */
> atomic_inc(&rdtp->dynticks);
> @@ -464,7 +490,7 @@ static void rcu_idle_exit_common(struct rcu_dynticks *rdtp, long long oldval)
> WARN_ON_ONCE(!(atomic_read(&rdtp->dynticks) & 0x1));
> rcu_cleanup_after_idle(smp_processor_id());
> trace_rcu_dyntick("End", oldval, rdtp->dynticks_nesting);
> - if (!is_idle_task(current)) {
> + if (!is_idle_task(current) && !user) {

Same micro-optimization as the enter function.

> +void rcu_idle_exit(void)
> +{
> + rcu_eqs_exit(0);
> +}

s/0/false/

> +void rcu_user_exit(void)
> +{
> + rcu_eqs_exit(1);
> +}

s/1/true/

> @@ -539,7 +586,7 @@ void rcu_irq_enter(void)
> if (oldval)
> trace_rcu_dyntick("++=", oldval, rdtp->dynticks_nesting);
> else
> - rcu_idle_exit_common(rdtp, oldval);
> + rcu_eqs_exit_common(rdtp, oldval, 1);

s/1/true/, and likewise in rcu_irq_exit.

- Josh Triplett

2012-08-31 19:13:31

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH tip/core/rcu 02/26] rcu: New rcu_user_enter_irq() and rcu_user_exit_irq() APIs

On Thu, Aug 30, 2012 at 02:05:19PM -0700, Paul E. McKenney wrote:
> From: Frederic Weisbecker <[email protected]>
>
> In some cases, it is necessary to enter or exit userspace-RCU-idle mode
> from an interrupt handler, for example, if some other CPU sends this
> CPU a resched IPI. In this case, the current CPU would enter the IPI
> handler in userspace-RCU-idle mode, but would need to exit the IPI handler
> after having exited that mode.
>
> To allow this to work, this commit adds two new APIs to TREE_RCU:
>
> - rcu_user_enter_irq(). This must be called from an interrupt between
> rcu_irq_enter() and rcu_irq_exit(). After the irq calls rcu_irq_exit(),
> the irq handler will return into an RCU extended quiescent state.
> In theory, this interrupt is never a nested interrupt, but in practice
> it might interrupt softirq, which looks to RCU like a nested interrupt.
>
> - rcu_user_exit_irq(). This must be called from a non-nesting
> interrupt, interrupting an RCU extended quiescent state, also
> between rcu_irq_enter() and rcu_irq_exit(). After the irq calls
> rcu_irq_exit(), the irq handler will return in an RCU non-quiescent
> state.

These names seem a bit confusing. From the descriptions, it sounds like
you don't always need to pair them; rcu_irq_exit() will return to a
non-quiescent state, unless you call rcu_user_enter_irq and *don't* call
rcu_user_exit_irq. Did I get that semantic right?

Given that, the "enter" and "exit" names seem confusing. This seems
more like a flag you can set and clear, rather than a delimited region
as suggested by an enter/exit pair.

How about something vaguely like rcu_user_irq_set_eqs and
rcu_user_irq_clear_eqs?

- Josh Triplett

2012-08-31 19:54:42

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [PATCH tip/core/rcu 02/26] rcu: New rcu_user_enter_irq() and rcu_user_exit_irq() APIs

2012/8/31 Josh Triplett <[email protected]>:
> On Thu, Aug 30, 2012 at 02:05:19PM -0700, Paul E. McKenney wrote:
>> From: Frederic Weisbecker <[email protected]>
>>
>> In some cases, it is necessary to enter or exit userspace-RCU-idle mode
>> from an interrupt handler, for example, if some other CPU sends this
>> CPU a resched IPI. In this case, the current CPU would enter the IPI
>> handler in userspace-RCU-idle mode, but would need to exit the IPI handler
>> after having exited that mode.
>>
>> To allow this to work, this commit adds two new APIs to TREE_RCU:
>>
>> - rcu_user_enter_irq(). This must be called from an interrupt between
>> rcu_irq_enter() and rcu_irq_exit(). After the irq calls rcu_irq_exit(),
>> the irq handler will return into an RCU extended quiescent state.
>> In theory, this interrupt is never a nested interrupt, but in practice
>> it might interrupt softirq, which looks to RCU like a nested interrupt.
>>
>> - rcu_user_exit_irq(). This must be called from a non-nesting
>> interrupt, interrupting an RCU extended quiescent state, also
>> between rcu_irq_enter() and rcu_irq_exit(). After the irq calls
>> rcu_irq_exit(), the irq handler will return in an RCU non-quiescent
>> state.
>
> These names seem a bit confusing. From the descriptions, it sounds like
> you don't always need to pair them; rcu_irq_exit() will return to a
> non-quiescent state, unless you call rcu_user_enter_irq and *don't* call
> rcu_user_exit_irq. Did I get that semantic right?

Yeah. They indeed don't always need to be paired. We can enter into
user (from rcu POV) with rcu_user_enter_irq() and exit user with
rcu_user_exit().

It's just a matter of context: from where do we set/unset RCU as in
user mode: irq or not. The only thing that is paired is the fact we
enter/exit that RCU user mode. There are just different APIs to do so.

> Given that, the "enter" and "exit" names seem confusing. This seems
> more like a flag you can set and clear, rather than a delimited region
> as suggested by an enter/exit pair.
>
> How about something vaguely like rcu_user_irq_set_eqs and
> rcu_user_irq_clear_eqs?

I'd rather suggest rcu_user_enter_after_irq and
rcu_user_exit_after_irq. It describes precisely what it does.
>
> - Josh Triplett

2012-08-31 21:38:54

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH tip/core/rcu 02/26] rcu: New rcu_user_enter_irq() and rcu_user_exit_irq() APIs

On Fri, Aug 31, 2012 at 09:54:39PM +0200, Frederic Weisbecker wrote:
> 2012/8/31 Josh Triplett <[email protected]>:
> > Given that, the "enter" and "exit" names seem confusing. This seems
> > more like a flag you can set and clear, rather than a delimited region
> > as suggested by an enter/exit pair.
> >
> > How about something vaguely like rcu_user_irq_set_eqs and
> > rcu_user_irq_clear_eqs?
>
> I'd rather suggest rcu_user_enter_after_irq and
> rcu_user_exit_after_irq. It describes precisely what it does.

Those names sound reasonable, sure; in the context of "after",
enter/exit sounds less confusing.

- Josh Triplett

2012-08-31 23:40:32

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH tip/core/rcu 03/26] rcu: Make RCU_FAST_NO_HZ handle adaptive ticks

On Thu, Aug 30, 2012 at 02:05:20PM -0700, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <[email protected]>
>
> The current implementation of RCU_FAST_NO_HZ tries reasonably hard to rid
> the current CPU of RCU callbacks. This is appropriate when the CPU is
> entering idle, where it doesn't have much useful to do anyway, but is most
> definitely not what you want when transitioning to user-mode execution.
> This commit therefore detects the adaptive-tick case, and refrains from
> burning CPU time getting rid of RCU callbacks in that case.

With the OOM handler from your other patch series, I don't know that it
makes as much sense in the idle case, either; perhaps it would make more
sense to wait and batch up more callbacks as long as you have memory,
and then run them in one big burst.

> Signed-off-by: Paul E. McKenney <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>

Reviewed-by: Josh Triplett <[email protected]>

> kernel/rcutree_plugin.h | 20 ++++++++++++++++++++
> 1 files changed, 20 insertions(+), 0 deletions(-)
>
> diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
> index 7f3244c..b0f09d6 100644
> --- a/kernel/rcutree_plugin.h
> +++ b/kernel/rcutree_plugin.h
> @@ -1997,6 +1997,26 @@ static void rcu_prepare_for_idle(int cpu)
> if (!tne)
> return;
>
> + /* Adaptive-tick mode, where usermode execution is idle to RCU. */
> + if (!is_idle_task(current)) {
> + rdtp->dyntick_holdoff = jiffies - 1;
> + if (rcu_cpu_has_nonlazy_callbacks(cpu)) {
> + trace_rcu_prep_idle("User dyntick with callbacks");
> + rdtp->idle_gp_timer_expires =
> + round_up(jiffies + RCU_IDLE_GP_DELAY,
> + RCU_IDLE_GP_DELAY);
> + } else if (rcu_cpu_has_callbacks(cpu)) {
> + rdtp->idle_gp_timer_expires =
> + round_jiffies(jiffies + RCU_IDLE_LAZY_GP_DELAY);
> + trace_rcu_prep_idle("User dyntick with lazy callbacks");
> + } else {
> + return;
> + }
> + tp = &rdtp->idle_gp_timer;
> + mod_timer_pinned(tp, rdtp->idle_gp_timer_expires);
> + return;
> + }
> +
> /*
> * If this is an idle re-entry, for example, due to use of
> * RCU_NONIDLE() or the new idle-loop tracing API within the idle
> --
> 1.7.8
>

2012-08-31 23:44:13

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH tip/core/rcu 04/26] rcu: Settle config for userspace extended quiescent state

On Thu, Aug 30, 2012 at 02:05:21PM -0700, Paul E. McKenney wrote:
> From: Frederic Weisbecker <[email protected]>
>
> Create a new config option under the RCU menu that put
> CPUs under RCU extended quiescent state (as in dynticks
> idle mode) when they run in userspace. This require
> some contribution from architectures to hook into kernel
> and userspace boundaries.
>
> Signed-off-by: Frederic Weisbecker <[email protected]>
> Cc: Alessio Igor Bogani <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Avi Kivity <[email protected]>
> Cc: Chris Metcalf <[email protected]>
> Cc: Christoph Lameter <[email protected]>
> Cc: Geoff Levand <[email protected]>
> Cc: Gilad Ben Yossef <[email protected]>
> Cc: Hakan Akkan <[email protected]>
> Cc: H. Peter Anvin <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Josh Triplett <[email protected]>
> Cc: Kevin Hilman <[email protected]>
> Cc: Max Krasnyansky <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Stephen Hemminger <[email protected]>
> Cc: Steven Rostedt <[email protected]>
> Cc: Sven-Thorsten Dietrich <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>

One question below, but nonethelesss:

Reviewed-by: Josh Triplett <[email protected]>

> arch/Kconfig | 10 ++++++++++
> include/linux/rcupdate.h | 8 ++++++++
> init/Kconfig | 10 ++++++++++
> kernel/rcutree.c | 5 ++++-
> 4 files changed, 32 insertions(+), 1 deletions(-)
>
> diff --git a/arch/Kconfig b/arch/Kconfig
> index 72f2fa1..1401a75 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -281,4 +281,14 @@ config SECCOMP_FILTER
>
> See Documentation/prctl/seccomp_filter.txt for details.
>
> +config HAVE_RCU_USER_QS
> + bool
> + help
> + Provide kernel entry/exit hooks necessary for userspace
> + RCU extended quiescent state. Syscalls need to be wrapped inside
> + rcu_user_exit()-rcu_user_enter() through the slow path using
> + TIF_NOHZ flag. Exceptions handlers must be wrapped as well. Irqs
> + are already protected inside rcu_irq_enter/rcu_irq_exit() but
> + preemption or signal handling on irq exit still need to be protected.
> +
> source "kernel/gcov/Kconfig"
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index 81d3d5c..e411117 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -191,10 +191,18 @@ extern void rcu_idle_enter(void);
> extern void rcu_idle_exit(void);
> extern void rcu_irq_enter(void);
> extern void rcu_irq_exit(void);
> +
> +#ifdef CONFIG_RCU_USER_QS
> extern void rcu_user_enter(void);
> extern void rcu_user_exit(void);
> extern void rcu_user_enter_irq(void);
> extern void rcu_user_exit_irq(void);
> +#else
> +static inline void rcu_user_enter(void) { }
> +static inline void rcu_user_exit(void) { }
> +#endif /* CONFIG_RCU_USER_QS */
> +
> +
> extern void exit_rcu(void);
>
> /**
> diff --git a/init/Kconfig b/init/Kconfig
> index af6c7f8..f6a1830 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -441,6 +441,16 @@ config PREEMPT_RCU
> This option enables preemptible-RCU code that is common between
> the TREE_PREEMPT_RCU and TINY_PREEMPT_RCU implementations.
>
> +config RCU_USER_QS
> + bool "Consider userspace as in RCU extended quiescent state"
> + depends on HAVE_RCU_USER_QS && SMP

Does this actually depend on SMP, or does it depend on the non-TINY RCU
implementation? If the latter, it should depend on that rather than
SMP.

(I assume that the tiny RCU implementation simply doesn't need all this
machinery because it doesn't need coordinated quiescence at all? Or
does tiny RCU still cause a periodic wakeup on UP?)

> + help
> + This option sets hooks on kernel / userspace boundaries and
> + puts RCU in extended quiescent state when the CPU runs in
> + userspace. It means that when a CPU runs in userspace, it is
> + excluded from the global RCU state machine and thus doesn't
> + to keep the timer tick on for RCU.
> +
> config RCU_FANOUT
> int "Tree-based hierarchical RCU fanout value"
> range 2 64 if 64BIT
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 8fdea17..e287c4a 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -424,6 +424,7 @@ void rcu_idle_enter(void)
> }
> EXPORT_SYMBOL_GPL(rcu_idle_enter);
>
> +#ifdef CONFIG_RCU_USER_QS
> /**
> * rcu_user_enter - inform RCU that we are resuming userspace.
> *
> @@ -438,7 +439,6 @@ void rcu_user_enter(void)
> }
> EXPORT_SYMBOL_GPL(rcu_user_enter);
>
> -
> /**
> * rcu_user_enter_irq - inform RCU that we are going to resume userspace
> * after the current irq returns.
> @@ -459,6 +459,7 @@ void rcu_user_enter_irq(void)
> rdtp->dynticks_nesting = 1;
> local_irq_restore(flags);
> }
> +#endif
>
> /**
> * rcu_irq_exit - inform RCU that current CPU is exiting irq towards idle
> @@ -562,6 +563,7 @@ void rcu_idle_exit(void)
> }
> EXPORT_SYMBOL_GPL(rcu_idle_exit);
>
> +#ifdef CONFIG_RCU_USER_QS
> /**
> * rcu_user_exit - inform RCU that we are exiting userspace.
> *
> @@ -595,6 +597,7 @@ void rcu_user_exit_irq(void)
> rdtp->dynticks_nesting += DYNTICK_TASK_EXIT_IDLE;
> local_irq_restore(flags);
> }
> +#endif
>
> /**
> * rcu_irq_enter - inform RCU that current CPU is entering irq away from idle
> --
> 1.7.8
>

2012-08-31 23:46:03

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH tip/core/rcu 05/26] rcu: Allow rcu_user_enter()/exit() to nest

On Thu, Aug 30, 2012 at 02:05:22PM -0700, Paul E. McKenney wrote:
> From: Frederic Weisbecker <[email protected]>
>
> Allow calls to rcu_user_enter() even if we are already
> in userspace (as seen by RCU) and allow calls to rcu_user_exit()
> even if we are already in the kernel.
>
> This makes the APIs more flexible to be called from architectures.
> Exception entries for example won't need to know if they come from
> userspace before calling rcu_user_exit().
>
> Signed-off-by: Frederic Weisbecker <[email protected]>
> Cc: Alessio Igor Bogani <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Avi Kivity <[email protected]>
> Cc: Chris Metcalf <[email protected]>
> Cc: Christoph Lameter <[email protected]>
> Cc: Geoff Levand <[email protected]>
> Cc: Gilad Ben Yossef <[email protected]>
> Cc: Hakan Akkan <[email protected]>
> Cc: H. Peter Anvin <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Josh Triplett <[email protected]>
> Cc: Kevin Hilman <[email protected]>
> Cc: Max Krasnyansky <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Stephen Hemminger <[email protected]>
> Cc: Steven Rostedt <[email protected]>
> Cc: Sven-Thorsten Dietrich <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>

Reviewed-by: Josh Triplett <[email protected]>

> kernel/rcutree.c | 41 +++++++++++++++++++++++++++++++++--------
> kernel/rcutree.h | 3 +++
> 2 files changed, 36 insertions(+), 8 deletions(-)
>
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index e287c4a..8bbc7fb 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -390,11 +390,9 @@ static void rcu_eqs_enter_common(struct rcu_dynticks *rdtp, long long oldval,
> */
> static void rcu_eqs_enter(bool user)
> {
> - unsigned long flags;
> long long oldval;
> struct rcu_dynticks *rdtp;
>
> - local_irq_save(flags);
> rdtp = &__get_cpu_var(rcu_dynticks);
> oldval = rdtp->dynticks_nesting;
> WARN_ON_ONCE((oldval & DYNTICK_TASK_NEST_MASK) == 0);
> @@ -403,7 +401,6 @@ static void rcu_eqs_enter(bool user)
> else
> rdtp->dynticks_nesting -= DYNTICK_TASK_NEST_VALUE;
> rcu_eqs_enter_common(rdtp, oldval, user);
> - local_irq_restore(flags);
> }
>
> /**
> @@ -420,7 +417,11 @@ static void rcu_eqs_enter(bool user)
> */
> void rcu_idle_enter(void)
> {
> + unsigned long flags;
> +
> + local_irq_save(flags);
> rcu_eqs_enter(0);
> + local_irq_restore(flags);
> }
> EXPORT_SYMBOL_GPL(rcu_idle_enter);
>
> @@ -435,7 +436,18 @@ EXPORT_SYMBOL_GPL(rcu_idle_enter);
> */
> void rcu_user_enter(void)
> {
> - rcu_eqs_enter(1);
> + unsigned long flags;
> + struct rcu_dynticks *rdtp;
> +
> + WARN_ON_ONCE(!current->mm);
> +
> + local_irq_save(flags);
> + rdtp = &__get_cpu_var(rcu_dynticks);
> + if (!rdtp->in_user) {
> + rdtp->in_user = true;
> + rcu_eqs_enter(1);
> + }
> + local_irq_restore(flags);
> }
> EXPORT_SYMBOL_GPL(rcu_user_enter);
>
> @@ -530,11 +542,9 @@ static void rcu_eqs_exit_common(struct rcu_dynticks *rdtp, long long oldval,
> */
> static void rcu_eqs_exit(bool user)
> {
> - unsigned long flags;
> struct rcu_dynticks *rdtp;
> long long oldval;
>
> - local_irq_save(flags);
> rdtp = &__get_cpu_var(rcu_dynticks);
> oldval = rdtp->dynticks_nesting;
> WARN_ON_ONCE(oldval < 0);
> @@ -543,7 +553,6 @@ static void rcu_eqs_exit(bool user)
> else
> rdtp->dynticks_nesting = DYNTICK_TASK_EXIT_IDLE;
> rcu_eqs_exit_common(rdtp, oldval, user);
> - local_irq_restore(flags);
> }
>
> /**
> @@ -559,7 +568,11 @@ static void rcu_eqs_exit(bool user)
> */
> void rcu_idle_exit(void)
> {
> + unsigned long flags;
> +
> + local_irq_save(flags);
> rcu_eqs_exit(0);
> + local_irq_restore(flags);
> }
> EXPORT_SYMBOL_GPL(rcu_idle_exit);
>
> @@ -572,7 +585,16 @@ EXPORT_SYMBOL_GPL(rcu_idle_exit);
> */
> void rcu_user_exit(void)
> {
> - rcu_eqs_exit(1);
> + unsigned long flags;
> + struct rcu_dynticks *rdtp;
> +
> + local_irq_save(flags);
> + rdtp = &__get_cpu_var(rcu_dynticks);
> + if (rdtp->in_user) {
> + rdtp->in_user = false;
> + rcu_eqs_exit(1);
> + }
> + local_irq_restore(flags);
> }
> EXPORT_SYMBOL_GPL(rcu_user_exit);
>
> @@ -2590,6 +2612,9 @@ rcu_boot_init_percpu_data(int cpu, struct rcu_state *rsp)
> rdp->dynticks = &per_cpu(rcu_dynticks, cpu);
> WARN_ON_ONCE(rdp->dynticks->dynticks_nesting != DYNTICK_TASK_EXIT_IDLE);
> WARN_ON_ONCE(atomic_read(&rdp->dynticks->dynticks) != 1);
> +#ifdef CONFIG_RCU_USER_QS
> + WARN_ON_ONCE(rdp->dynticks->in_user);
> +#endif
> rdp->cpu = cpu;
> rdp->rsp = rsp;
> raw_spin_unlock_irqrestore(&rnp->lock, flags);
> diff --git a/kernel/rcutree.h b/kernel/rcutree.h
> index 4d29169..0dd5fd6 100644
> --- a/kernel/rcutree.h
> +++ b/kernel/rcutree.h
> @@ -102,6 +102,9 @@ struct rcu_dynticks {
> /* idle-period nonlazy_posted snapshot. */
> int tick_nohz_enabled_snap; /* Previously seen value from sysfs. */
> #endif /* #ifdef CONFIG_RCU_FAST_NO_HZ */
> +#ifdef CONFIG_RCU_USER_QS
> + bool in_user; /* Is the CPU in userland from RCU POV? */
> +#endif
> };
>
> /* RCU's kthread states for tracing. */
> --
> 1.7.8
>

2012-08-31 23:47:06

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH tip/core/rcu 06/26] rcu: Ignore userspace extended quiescent state by default

On Thu, Aug 30, 2012 at 02:05:23PM -0700, Paul E. McKenney wrote:
> From: Frederic Weisbecker <[email protected]>
>
> By default we don't want to enter into RCU extended quiescent
> state while in userspace because doing this produces some overhead
> (eg: use of syscall slowpath). Set it off by default and ready to
> run when some feature like adaptive tickless need it.
>
> Signed-off-by: Frederic Weisbecker <[email protected]>
> Cc: Alessio Igor Bogani <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Avi Kivity <[email protected]>
> Cc: Chris Metcalf <[email protected]>
> Cc: Christoph Lameter <[email protected]>
> Cc: Geoff Levand <[email protected]>
> Cc: Gilad Ben Yossef <[email protected]>
> Cc: Hakan Akkan <[email protected]>
> Cc: H. Peter Anvin <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Josh Triplett <[email protected]>
> Cc: Kevin Hilman <[email protected]>
> Cc: Max Krasnyansky <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Stephen Hemminger <[email protected]>
> Cc: Steven Rostedt <[email protected]>
> Cc: Sven-Thorsten Dietrich <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>

Reviewed-by: Josh Triplett <[email protected]>

> kernel/rcutree.c | 5 ++++-
> kernel/rcutree.h | 1 +
> 2 files changed, 5 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 8bbc7fb..e2fd370 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -210,6 +210,9 @@ EXPORT_SYMBOL_GPL(rcu_note_context_switch);
> DEFINE_PER_CPU(struct rcu_dynticks, rcu_dynticks) = {
> .dynticks_nesting = DYNTICK_TASK_EXIT_IDLE,
> .dynticks = ATOMIC_INIT(1),
> +#ifdef CONFIG_RCU_USER_QS
> + .ignore_user_qs = true,
> +#endif
> };
>
> static int blimit = 10; /* Maximum callbacks per rcu_do_batch. */
> @@ -443,7 +446,7 @@ void rcu_user_enter(void)
>
> local_irq_save(flags);
> rdtp = &__get_cpu_var(rcu_dynticks);
> - if (!rdtp->in_user) {
> + if (!rdtp->ignore_user_qs && !rdtp->in_user) {
> rdtp->in_user = true;
> rcu_eqs_enter(1);
> }
> diff --git a/kernel/rcutree.h b/kernel/rcutree.h
> index 0dd5fd6..c190582 100644
> --- a/kernel/rcutree.h
> +++ b/kernel/rcutree.h
> @@ -103,6 +103,7 @@ struct rcu_dynticks {
> int tick_nohz_enabled_snap; /* Previously seen value from sysfs. */
> #endif /* #ifdef CONFIG_RCU_FAST_NO_HZ */
> #ifdef CONFIG_RCU_USER_QS
> + bool ignore_user_qs; /* Treat userspace as extended QS or not */
> bool in_user; /* Is the CPU in userland from RCU POV? */
> #endif
> };
> --
> 1.7.8
>

2012-08-31 23:48:47

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH tip/core/rcu 07/26] rcu: Switch task's syscall hooks on context switch

On Thu, Aug 30, 2012 at 02:05:24PM -0700, Paul E. McKenney wrote:
> From: Frederic Weisbecker <[email protected]>
>
> Clear the syscalls hook of a task when it's scheduled out so that if
> the task migrates, it doesn't run the syscall slow path on a CPU
> that might not need it.
>
> Also set the syscalls hook on the next task if needed.
>
> Signed-off-by: Frederic Weisbecker <[email protected]>
> Cc: Alessio Igor Bogani <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Avi Kivity <[email protected]>
> Cc: Chris Metcalf <[email protected]>
> Cc: Christoph Lameter <[email protected]>
> Cc: Geoff Levand <[email protected]>
> Cc: Gilad Ben Yossef <[email protected]>
> Cc: Hakan Akkan <[email protected]>
> Cc: H. Peter Anvin <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Josh Triplett <[email protected]>
> Cc: Kevin Hilman <[email protected]>
> Cc: Max Krasnyansky <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Stephen Hemminger <[email protected]>
> Cc: Steven Rostedt <[email protected]>
> Cc: Sven-Thorsten Dietrich <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>

Reviewed-by: Josh Triplett <[email protected]>

> arch/um/drivers/mconsole_kern.c | 1 +
> include/linux/rcupdate.h | 2 ++
> include/linux/sched.h | 8 ++++++++
> kernel/rcutree.c | 15 +++++++++++++++
> kernel/sched/core.c | 1 +
> 5 files changed, 27 insertions(+), 0 deletions(-)
>
> diff --git a/arch/um/drivers/mconsole_kern.c b/arch/um/drivers/mconsole_kern.c
> index 664a60e..c17de0d 100644
> --- a/arch/um/drivers/mconsole_kern.c
> +++ b/arch/um/drivers/mconsole_kern.c
> @@ -705,6 +705,7 @@ static void stack_proc(void *arg)
> struct task_struct *from = current, *to = arg;
>
> to->thread.saved_task = from;
> + rcu_switch(from, to);
> switch_to(from, to, from);
> }
>
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index e411117..1fc0a0e 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -197,6 +197,8 @@ extern void rcu_user_enter(void);
> extern void rcu_user_exit(void);
> extern void rcu_user_enter_irq(void);
> extern void rcu_user_exit_irq(void);
> +extern void rcu_user_hooks_switch(struct task_struct *prev,
> + struct task_struct *next);
> #else
> static inline void rcu_user_enter(void) { }
> static inline void rcu_user_exit(void) { }
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index c147e70..e4d5936 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1894,6 +1894,14 @@ static inline void rcu_copy_process(struct task_struct *p)
>
> #endif
>
> +static inline void rcu_switch(struct task_struct *prev,
> + struct task_struct *next)
> +{
> +#ifdef CONFIG_RCU_USER_QS
> + rcu_user_hooks_switch(prev, next);
> +#endif
> +}
> +
> static inline void tsk_restore_flags(struct task_struct *task,
> unsigned long orig_flags, unsigned long flags)
> {
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index e2fd370..af92681 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -721,6 +721,21 @@ int rcu_is_cpu_idle(void)
> }
> EXPORT_SYMBOL(rcu_is_cpu_idle);
>
> +#ifdef CONFIG_RCU_USER_QS
> +void rcu_user_hooks_switch(struct task_struct *prev,
> + struct task_struct *next)
> +{
> + struct rcu_dynticks *rdtp;
> +
> + /* Interrupts are disabled in context switch */
> + rdtp = &__get_cpu_var(rcu_dynticks);
> + if (!rdtp->ignore_user_qs) {
> + clear_tsk_thread_flag(prev, TIF_NOHZ);
> + set_tsk_thread_flag(next, TIF_NOHZ);
> + }
> +}
> +#endif /* #ifdef CONFIG_RCU_USER_QS */
> +
> #if defined(CONFIG_PROVE_RCU) && defined(CONFIG_HOTPLUG_CPU)
>
> /*
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index d325c4b..07c6d9a 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2081,6 +2081,7 @@ context_switch(struct rq *rq, struct task_struct *prev,
> #endif
>
> /* Here we just switch the register state and the stack. */
> + rcu_switch(prev, next);
> switch_to(prev, next, prev);
>
> barrier();
> --
> 1.7.8
>

2012-08-31 23:51:31

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH tip/core/rcu 09/26] x86: Exception hooks for userspace RCU extended QS

On Thu, Aug 30, 2012 at 02:05:26PM -0700, Paul E. McKenney wrote:
> From: Frederic Weisbecker <[email protected]>
>
> Add necessary hooks to x86 exception for userspace
> RCU extended quiescent state support.
>
> This includes traps, page fault, debug exceptions, etc...
>
> Signed-off-by: Frederic Weisbecker <[email protected]>
> Cc: Alessio Igor Bogani <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Avi Kivity <[email protected]>
> Cc: Chris Metcalf <[email protected]>
> Cc: Christoph Lameter <[email protected]>
> Cc: Geoff Levand <[email protected]>
> Cc: Gilad Ben Yossef <[email protected]>
> Cc: Hakan Akkan <[email protected]>
> Cc: H. Peter Anvin <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Josh Triplett <[email protected]>
> Cc: Kevin Hilman <[email protected]>
> Cc: Max Krasnyansky <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Stephen Hemminger <[email protected]>
> Cc: Steven Rostedt <[email protected]>
> Cc: Sven-Thorsten Dietrich <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>

Reviewed-by: Josh Triplett <[email protected]>

> arch/x86/include/asm/rcu.h | 20 ++++++++++++++++++++
> arch/x86/kernel/traps.c | 30 ++++++++++++++++++++++--------
> arch/x86/mm/fault.c | 13 +++++++++++--
> 3 files changed, 53 insertions(+), 10 deletions(-)
> create mode 100644 arch/x86/include/asm/rcu.h
>
> diff --git a/arch/x86/include/asm/rcu.h b/arch/x86/include/asm/rcu.h
> new file mode 100644
> index 0000000..439815b
> --- /dev/null
> +++ b/arch/x86/include/asm/rcu.h
> @@ -0,0 +1,20 @@
> +#ifndef _ASM_X86_RCU_H
> +#define _ASM_X86_RCU_H
> +
> +#include <linux/rcupdate.h>
> +#include <asm/ptrace.h>
> +
> +static inline void exception_enter(struct pt_regs *regs)
> +{
> + rcu_user_exit();
> +}
> +
> +static inline void exception_exit(struct pt_regs *regs)
> +{
> +#ifdef CONFIG_RCU_USER_QS
> + if (user_mode(regs))
> + rcu_user_enter();
> +#endif
> +}
> +
> +#endif
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index b481341..ab82cbd 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -55,6 +55,7 @@
> #include <asm/i387.h>
> #include <asm/fpu-internal.h>
> #include <asm/mce.h>
> +#include <asm/rcu.h>
>
> #include <asm/mach_traps.h>
>
> @@ -180,11 +181,15 @@ vm86_trap:
> #define DO_ERROR(trapnr, signr, str, name) \
> dotraplinkage void do_##name(struct pt_regs *regs, long error_code) \
> { \
> - if (notify_die(DIE_TRAP, str, regs, error_code, trapnr, signr) \
> - == NOTIFY_STOP) \
> + exception_enter(regs); \
> + if (notify_die(DIE_TRAP, str, regs, error_code, \
> + trapnr, signr) == NOTIFY_STOP) { \
> + exception_exit(regs); \
> return; \
> + } \
> conditional_sti(regs); \
> do_trap(trapnr, signr, str, regs, error_code, NULL); \
> + exception_exit(regs); \
> }
>
> #define DO_ERROR_INFO(trapnr, signr, str, name, sicode, siaddr) \
> @@ -195,11 +200,15 @@ dotraplinkage void do_##name(struct pt_regs *regs, long error_code) \
> info.si_errno = 0; \
> info.si_code = sicode; \
> info.si_addr = (void __user *)siaddr; \
> - if (notify_die(DIE_TRAP, str, regs, error_code, trapnr, signr) \
> - == NOTIFY_STOP) \
> + exception_enter(regs); \
> + if (notify_die(DIE_TRAP, str, regs, error_code, \
> + trapnr, signr) == NOTIFY_STOP) { \
> + exception_exit(regs); \
> return; \
> + } \
> conditional_sti(regs); \
> do_trap(trapnr, signr, str, regs, error_code, &info); \
> + exception_exit(regs); \
> }
>
> DO_ERROR_INFO(X86_TRAP_DE, SIGFPE, "divide error", divide_error, FPE_INTDIV,
> @@ -312,6 +321,7 @@ dotraplinkage void __kprobes notrace do_int3(struct pt_regs *regs, long error_co
> ftrace_int3_handler(regs))
> return;
> #endif
> + exception_enter(regs);
> #ifdef CONFIG_KGDB_LOW_LEVEL_TRAP
> if (kgdb_ll_trap(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
> SIGTRAP) == NOTIFY_STOP)
> @@ -331,6 +341,7 @@ dotraplinkage void __kprobes notrace do_int3(struct pt_regs *regs, long error_co
> do_trap(X86_TRAP_BP, SIGTRAP, "int3", regs, error_code, NULL);
> preempt_conditional_cli(regs);
> debug_stack_usage_dec();
> + exception_exit(regs);
> }
>
> #ifdef CONFIG_X86_64
> @@ -391,6 +402,8 @@ dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
> unsigned long dr6;
> int si_code;
>
> + exception_enter(regs);
> +
> get_debugreg(dr6, 6);
>
> /* Filter out all the reserved bits which are preset to 1 */
> @@ -406,7 +419,7 @@ dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
>
> /* Catch kmemcheck conditions first of all! */
> if ((dr6 & DR_STEP) && kmemcheck_trap(regs))
> - return;
> + goto exit;
>
> /* DR6 may or may not be cleared by the CPU */
> set_debugreg(0, 6);
> @@ -421,7 +434,7 @@ dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
>
> if (notify_die(DIE_DEBUG, "debug", regs, PTR_ERR(&dr6), error_code,
> SIGTRAP) == NOTIFY_STOP)
> - return;
> + goto exit;
>
> /*
> * Let others (NMI) know that the debug stack is in use
> @@ -437,7 +450,7 @@ dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
> X86_TRAP_DB);
> preempt_conditional_cli(regs);
> debug_stack_usage_dec();
> - return;
> + goto exit;
> }
>
> /*
> @@ -458,7 +471,8 @@ dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
> preempt_conditional_cli(regs);
> debug_stack_usage_dec();
>
> - return;
> +exit:
> + exception_exit(regs);
> }
>
> /*
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index 76dcd9d..7dde46d 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -18,6 +18,7 @@
> #include <asm/pgalloc.h> /* pgd_*(), ... */
> #include <asm/kmemcheck.h> /* kmemcheck_*(), ... */
> #include <asm/fixmap.h> /* VSYSCALL_START */
> +#include <asm/rcu.h> /* exception_enter(), ... */
>
> /*
> * Page fault error code bits:
> @@ -1000,8 +1001,8 @@ static int fault_in_kernel_space(unsigned long address)
> * and the problem, and then passes it off to one of the appropriate
> * routines.
> */
> -dotraplinkage void __kprobes
> -do_page_fault(struct pt_regs *regs, unsigned long error_code)
> +static void __kprobes
> +__do_page_fault(struct pt_regs *regs, unsigned long error_code)
> {
> struct vm_area_struct *vma;
> struct task_struct *tsk;
> @@ -1209,3 +1210,11 @@ good_area:
>
> up_read(&mm->mmap_sem);
> }
> +
> +dotraplinkage void __kprobes
> +do_page_fault(struct pt_regs *regs, unsigned long error_code)
> +{
> + exception_enter(regs);
> + __do_page_fault(regs, error_code);
> + exception_exit(regs);
> +}
> --
> 1.7.8
>

2012-08-31 23:52:09

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH tip/core/rcu 10/26] rcu: Exit RCU extended QS on kernel preemption after irq/exception

On Thu, Aug 30, 2012 at 02:05:27PM -0700, Paul E. McKenney wrote:
> From: Frederic Weisbecker <[email protected]>
>
> When an exception or an irq exits, and we are going to resume into
> interrupted kernel code, the low level architecture code calls
> preempt_schedule_irq() if there is a need to reschedule.
>
> If the interrupt/exception occured between a call to rcu_user_enter()
> (from syscall exit, exception exit, do_notify_resume exit, ...) and
> a real resume to userspace (iret,...), preempt_schedule_irq() can be
> called whereas RCU thinks we are in userspace. But preempt_schedule_irq()
> is going to run kernel code and may be some RCU read side critical
> section. We must exit the userspace extended quiescent state before
> we call it.
>
> To solve this, just call rcu_user_exit() in the beginning of
> preempt_schedule_irq().
>
> Signed-off-by: Frederic Weisbecker <[email protected]>
> Cc: Alessio Igor Bogani <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Avi Kivity <[email protected]>
> Cc: Chris Metcalf <[email protected]>
> Cc: Christoph Lameter <[email protected]>
> Cc: Geoff Levand <[email protected]>
> Cc: Gilad Ben Yossef <[email protected]>
> Cc: Hakan Akkan <[email protected]>
> Cc: H. Peter Anvin <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Josh Triplett <[email protected]>
> Cc: Kevin Hilman <[email protected]>
> Cc: Max Krasnyansky <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Stephen Hemminger <[email protected]>
> Cc: Steven Rostedt <[email protected]>
> Cc: Sven-Thorsten Dietrich <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>

Reviewed-by: Josh Triplett <[email protected]>

> kernel/sched/core.c | 1 +
> 1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 07c6d9a..0bd599b 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3564,6 +3564,7 @@ asmlinkage void __sched preempt_schedule_irq(void)
> /* Catch callers which need to be fixed */
> BUG_ON(ti->preempt_count || !irqs_disabled());
>
> + rcu_user_exit();
> do {
> add_preempt_count(PREEMPT_ACTIVE);
> local_irq_enable();
> --
> 1.7.8
>

2012-08-31 23:52:36

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH tip/core/rcu 11/26] rcu: Exit RCU extended QS on user preemption

On Thu, Aug 30, 2012 at 02:05:28PM -0700, Paul E. McKenney wrote:
> From: Frederic Weisbecker <[email protected]>
>
> When exceptions or irq are about to resume userspace, if
> the task needs to be rescheduled, the arch low level code
> calls schedule() directly.
>
> At that time we may be in extended quiescent state from RCU
> POV: the exception is not anymore protected inside
> rcu_user_exit() - rcu_user_enter() and the irq has called
> rcu_irq_exit() already.
>
> Create a new API schedule_user() that calls schedule() inside
> rcu_user_exit()-rcu_user_enter() in order to protect it. Archs
> will need to rely on it now to implement user preemption safely.
>
> Signed-off-by: Frederic Weisbecker <[email protected]>
> Cc: Alessio Igor Bogani <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Avi Kivity <[email protected]>
> Cc: Chris Metcalf <[email protected]>
> Cc: Christoph Lameter <[email protected]>
> Cc: Geoff Levand <[email protected]>
> Cc: Gilad Ben Yossef <[email protected]>
> Cc: Hakan Akkan <[email protected]>
> Cc: H. Peter Anvin <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Josh Triplett <[email protected]>
> Cc: Kevin Hilman <[email protected]>
> Cc: Max Krasnyansky <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Stephen Hemminger <[email protected]>
> Cc: Steven Rostedt <[email protected]>
> Cc: Sven-Thorsten Dietrich <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>

Reviewed-by: Josh Triplett <[email protected]>

> kernel/sched/core.c | 7 +++++++
> 1 files changed, 7 insertions(+), 0 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 0bd599b..e841dfc 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3463,6 +3463,13 @@ asmlinkage void __sched schedule(void)
> }
> EXPORT_SYMBOL(schedule);
>
> +asmlinkage void __sched schedule_user(void)
> +{
> + rcu_user_exit();
> + schedule();
> + rcu_user_enter();
> +}
> +
> /**
> * schedule_preempt_disabled - called with preemption disabled
> *
> --
> 1.7.8
>

2012-08-31 23:53:58

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH tip/core/rcu 12/26] x86: Use the new schedule_user API on userspace preemption

On Thu, Aug 30, 2012 at 02:05:29PM -0700, Paul E. McKenney wrote:
> From: Frederic Weisbecker <[email protected]>
>
> This way we can exit the RCU extended quiescent state before
> we schedule a new task from irq/exception exit.
>
> Signed-off-by: Frederic Weisbecker <[email protected]>
> Cc: Alessio Igor Bogani <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Avi Kivity <[email protected]>
> Cc: Chris Metcalf <[email protected]>
> Cc: Christoph Lameter <[email protected]>
> Cc: Geoff Levand <[email protected]>
> Cc: Gilad Ben Yossef <[email protected]>
> Cc: Hakan Akkan <[email protected]>
> Cc: H. Peter Anvin <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Josh Triplett <[email protected]>
> Cc: Kevin Hilman <[email protected]>
> Cc: Max Krasnyansky <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Stephen Hemminger <[email protected]>
> Cc: Steven Rostedt <[email protected]>
> Cc: Sven-Thorsten Dietrich <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>

Reviewed-by: Josh Triplett <[email protected]>

> arch/x86/kernel/entry_64.S | 8 ++++----
> 1 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
> index 69babd8..6230487 100644
> --- a/arch/x86/kernel/entry_64.S
> +++ b/arch/x86/kernel/entry_64.S
> @@ -565,7 +565,7 @@ sysret_careful:
> TRACE_IRQS_ON
> ENABLE_INTERRUPTS(CLBR_NONE)
> pushq_cfi %rdi
> - call schedule
> + call schedule_user
> popq_cfi %rdi
> jmp sysret_check
>
> @@ -678,7 +678,7 @@ int_careful:
> TRACE_IRQS_ON
> ENABLE_INTERRUPTS(CLBR_NONE)
> pushq_cfi %rdi
> - call schedule
> + call schedule_user
> popq_cfi %rdi
> DISABLE_INTERRUPTS(CLBR_NONE)
> TRACE_IRQS_OFF
> @@ -974,7 +974,7 @@ retint_careful:
> TRACE_IRQS_ON
> ENABLE_INTERRUPTS(CLBR_NONE)
> pushq_cfi %rdi
> - call schedule
> + call schedule_user
> popq_cfi %rdi
> GET_THREAD_INFO(%rcx)
> DISABLE_INTERRUPTS(CLBR_NONE)
> @@ -1449,7 +1449,7 @@ paranoid_userspace:
> paranoid_schedule:
> TRACE_IRQS_ON
> ENABLE_INTERRUPTS(CLBR_ANY)
> - call schedule
> + call schedule_user
> DISABLE_INTERRUPTS(CLBR_ANY)
> TRACE_IRQS_OFF
> jmp paranoid_userspace
> --
> 1.7.8
>

2012-08-31 23:54:23

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH tip/core/rcu 13/26] x86: Exit RCU extended QS on notify resume

On Thu, Aug 30, 2012 at 02:05:30PM -0700, Paul E. McKenney wrote:
> From: Frederic Weisbecker <[email protected]>
>
> do_notify_resume() may be called on irq or exception
> exit. But at that time the exception has already called
> rcu_user_enter() and the irq has already called rcu_irq_exit().
>
> Since it can use RCU read side critical section, we must call
> rcu_user_exit() before doing anything there. Then we must call
> back rcu_user_enter() after this function because we know we are
> going to userspace from there.
>
> This complete support for userspace RCU extended quiescent state
> in x86-64.
>
> Signed-off-by: Frederic Weisbecker <[email protected]>
> Cc: Alessio Igor Bogani <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Avi Kivity <[email protected]>
> Cc: Chris Metcalf <[email protected]>
> Cc: Christoph Lameter <[email protected]>
> Cc: Geoff Levand <[email protected]>
> Cc: Gilad Ben Yossef <[email protected]>
> Cc: Hakan Akkan <[email protected]>
> Cc: H. Peter Anvin <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Josh Triplett <[email protected]>
> Cc: Kevin Hilman <[email protected]>
> Cc: Max Krasnyansky <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Stephen Hemminger <[email protected]>
> Cc: Steven Rostedt <[email protected]>
> Cc: Sven-Thorsten Dietrich <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>

Reviewed-by: Josh Triplett <[email protected]>

> arch/x86/Kconfig | 1 +
> arch/x86/kernel/signal.c | 4 ++++
> 2 files changed, 5 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index ba2657c..5cd953a 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -97,6 +97,7 @@ config X86
> select KTIME_SCALAR if X86_32
> select GENERIC_STRNCPY_FROM_USER
> select GENERIC_STRNLEN_USER
> + select HAVE_RCU_USER_QS if X86_64
>
> config INSTRUCTION_DECODER
> def_bool (KPROBES || PERF_EVENTS || UPROBES)
> diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
> index b280908..bca0ab9 100644
> --- a/arch/x86/kernel/signal.c
> +++ b/arch/x86/kernel/signal.c
> @@ -779,6 +779,8 @@ static void do_signal(struct pt_regs *regs)
> void
> do_notify_resume(struct pt_regs *regs, void *unused, __u32 thread_info_flags)
> {
> + rcu_user_exit();
> +
> #ifdef CONFIG_X86_MCE
> /* notify userspace of pending MCEs */
> if (thread_info_flags & _TIF_MCE_NOTIFY)
> @@ -804,6 +806,8 @@ do_notify_resume(struct pt_regs *regs, void *unused, __u32 thread_info_flags)
> #ifdef CONFIG_X86_32
> clear_thread_flag(TIF_IRET);
> #endif /* CONFIG_X86_32 */
> +
> + rcu_user_enter();
> }
>
> void signal_fault(struct pt_regs *regs, void __user *frame, char *where)
> --
> 1.7.8
>

2012-08-31 23:54:57

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH tip/core/rcu 14/26] rcu: Userspace RCU extended QS selftest

On Thu, Aug 30, 2012 at 02:05:31PM -0700, Paul E. McKenney wrote:
> From: Frederic Weisbecker <[email protected]>
>
> Provide a config option that enables the userspace
> RCU extended quiescent state on every CPUs by default.
>
> This is for testing purpose.
>
> Signed-off-by: Frederic Weisbecker <[email protected]>
> Cc: Alessio Igor Bogani <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Avi Kivity <[email protected]>
> Cc: Chris Metcalf <[email protected]>
> Cc: Christoph Lameter <[email protected]>
> Cc: Geoff Levand <[email protected]>
> Cc: Gilad Ben Yossef <[email protected]>
> Cc: Hakan Akkan <[email protected]>
> Cc: H. Peter Anvin <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Josh Triplett <[email protected]>
> Cc: Kevin Hilman <[email protected]>
> Cc: Max Krasnyansky <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Stephen Hemminger <[email protected]>
> Cc: Steven Rostedt <[email protected]>
> Cc: Sven-Thorsten Dietrich <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>

Reviewed-by: Josh Triplett <[email protected]>

> init/Kconfig | 8 ++++++++
> kernel/rcutree.c | 2 +-
> 2 files changed, 9 insertions(+), 1 deletions(-)
>
> diff --git a/init/Kconfig b/init/Kconfig
> index f6a1830..c26b8a1 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -451,6 +451,14 @@ config RCU_USER_QS
> excluded from the global RCU state machine and thus doesn't
> to keep the timer tick on for RCU.
>
> +config RCU_USER_QS_FORCE
> + bool "Force userspace extended QS by default"
> + depends on RCU_USER_QS
> + help
> + Set the hooks in user/kernel boundaries by default in order to
> + test this feature that treats userspace as an extended quiescent
> + state until we have a real user like a full adaptive nohz option.
> +
> config RCU_FANOUT
> int "Tree-based hierarchical RCU fanout value"
> range 2 64 if 64BIT
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index af92681..ccf3cbf 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -210,7 +210,7 @@ EXPORT_SYMBOL_GPL(rcu_note_context_switch);
> DEFINE_PER_CPU(struct rcu_dynticks, rcu_dynticks) = {
> .dynticks_nesting = DYNTICK_TASK_EXIT_IDLE,
> .dynticks = ATOMIC_INIT(1),
> -#ifdef CONFIG_RCU_USER_QS
> +#if defined(CONFIG_RCU_USER_QS) && !defined(CONFIG_RCU_USER_QS_FORCE)
> .ignore_user_qs = true,
> #endif
> };
> --
> 1.7.8
>

2012-08-31 23:55:57

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH tip/core/rcu 15/26] alpha: Fix preemption handling in idle loop

On Thu, Aug 30, 2012 at 02:05:32PM -0700, Paul E. McKenney wrote:
> From: Frederic Weisbecker <[email protected]>
>
> cpu_idle() is called on the boot CPU by the init code with
> preemption disabled. But the cpu_idle() function in alpha
> doesn't handle this when it calls schedule() directly.
>
> Fix it by converting it into schedule_preempt_disabled().
>
> Also disable preemption before calling cpu_idle() from
> secondary CPU entry code to stay consistent with this
> state.
>
> Signed-off-by: Frederic Weisbecker <[email protected]>
> Cc: Richard Henderson <[email protected]>
> Cc: Ivan Kokshaysky <[email protected]>
> Cc: Matt Turner <[email protected]>
> Cc: alpha <[email protected]>
> Cc: Paul E. McKenney <[email protected]>
> Cc: Michael Cree <[email protected]>

Reviewed-by: Josh Triplett <[email protected]>

> arch/alpha/kernel/process.c | 3 ++-
> arch/alpha/kernel/smp.c | 1 +
> 2 files changed, 3 insertions(+), 1 deletions(-)
>
> diff --git a/arch/alpha/kernel/process.c b/arch/alpha/kernel/process.c
> index 153d3fc..eac5e01 100644
> --- a/arch/alpha/kernel/process.c
> +++ b/arch/alpha/kernel/process.c
> @@ -56,7 +56,8 @@ cpu_idle(void)
>
> while (!need_resched())
> cpu_relax();
> - schedule();
> +
> + schedule_preempt_disabled();
> }
> }
>
> diff --git a/arch/alpha/kernel/smp.c b/arch/alpha/kernel/smp.c
> index 35ddc02..a41ad90 100644
> --- a/arch/alpha/kernel/smp.c
> +++ b/arch/alpha/kernel/smp.c
> @@ -166,6 +166,7 @@ smp_callin(void)
> DBGS(("smp_callin: commencing CPU %d current %p active_mm %p\n",
> cpuid, current, current->active_mm));
>
> + preempt_disable();
> /* Do nothing. */
> cpu_idle();
> }
> --
> 1.7.8
>

2012-08-31 23:59:21

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH tip/core/rcu 08/26] x86: Syscall hooks for userspace RCU extended QS

On Thu, Aug 30, 2012 at 02:05:25PM -0700, Paul E. McKenney wrote:
> From: Frederic Weisbecker <[email protected]>
>
> Add syscall slow path hooks to notify syscall entry
> and exit on CPUs that want to support userspace RCU
> extended quiescent state.
>
> Signed-off-by: Frederic Weisbecker <[email protected]>
> Cc: Alessio Igor Bogani <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Avi Kivity <[email protected]>
> Cc: Chris Metcalf <[email protected]>
> Cc: Christoph Lameter <[email protected]>
> Cc: Geoff Levand <[email protected]>
> Cc: Gilad Ben Yossef <[email protected]>
> Cc: Hakan Akkan <[email protected]>
> Cc: H. Peter Anvin <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Josh Triplett <[email protected]>
> Cc: Kevin Hilman <[email protected]>
> Cc: Max Krasnyansky <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Stephen Hemminger <[email protected]>
> Cc: Steven Rostedt <[email protected]>
> Cc: Sven-Thorsten Dietrich <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>

This seems reasonable; presumably you plan to add something actually
setting TIF_NOHZ in a subsequent patch series?

Reviewed-by: Josh Triplett <[email protected]>

> arch/x86/include/asm/thread_info.h | 10 +++++++---
> arch/x86/kernel/ptrace.c | 5 +++++
> 2 files changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
> index 89f794f..c535d84 100644
> --- a/arch/x86/include/asm/thread_info.h
> +++ b/arch/x86/include/asm/thread_info.h
> @@ -89,6 +89,7 @@ struct thread_info {
> #define TIF_NOTSC 16 /* TSC is not accessible in userland */
> #define TIF_IA32 17 /* IA32 compatibility process */
> #define TIF_FORK 18 /* ret_from_fork */
> +#define TIF_NOHZ 19 /* in adaptive nohz mode */
> #define TIF_MEMDIE 20 /* is terminating due to OOM killer */
> #define TIF_DEBUG 21 /* uses debug registers */
> #define TIF_IO_BITMAP 22 /* uses I/O bitmap */
> @@ -114,6 +115,7 @@ struct thread_info {
> #define _TIF_NOTSC (1 << TIF_NOTSC)
> #define _TIF_IA32 (1 << TIF_IA32)
> #define _TIF_FORK (1 << TIF_FORK)
> +#define _TIF_NOHZ (1 << TIF_NOHZ)
> #define _TIF_DEBUG (1 << TIF_DEBUG)
> #define _TIF_IO_BITMAP (1 << TIF_IO_BITMAP)
> #define _TIF_FORCED_TF (1 << TIF_FORCED_TF)
> @@ -126,12 +128,13 @@ struct thread_info {
> /* work to do in syscall_trace_enter() */
> #define _TIF_WORK_SYSCALL_ENTRY \
> (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_EMU | _TIF_SYSCALL_AUDIT | \
> - _TIF_SECCOMP | _TIF_SINGLESTEP | _TIF_SYSCALL_TRACEPOINT)
> + _TIF_SECCOMP | _TIF_SINGLESTEP | _TIF_SYSCALL_TRACEPOINT | \
> + _TIF_NOHZ)
>
> /* work to do in syscall_trace_leave() */
> #define _TIF_WORK_SYSCALL_EXIT \
> (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | _TIF_SINGLESTEP | \
> - _TIF_SYSCALL_TRACEPOINT)
> + _TIF_SYSCALL_TRACEPOINT | _TIF_NOHZ)
>
> /* work to do on interrupt/exception return */
> #define _TIF_WORK_MASK \
> @@ -141,7 +144,8 @@ struct thread_info {
>
> /* work to do on any return to user space */
> #define _TIF_ALLWORK_MASK \
> - ((0x0000FFFF & ~_TIF_SECCOMP) | _TIF_SYSCALL_TRACEPOINT)
> + ((0x0000FFFF & ~_TIF_SECCOMP) | _TIF_SYSCALL_TRACEPOINT | \
> + _TIF_NOHZ)
>
> /* Only used for 64 bit */
> #define _TIF_DO_NOTIFY_MASK \
> diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
> index c4c6a5c..9f94f8e 100644
> --- a/arch/x86/kernel/ptrace.c
> +++ b/arch/x86/kernel/ptrace.c
> @@ -21,6 +21,7 @@
> #include <linux/signal.h>
> #include <linux/perf_event.h>
> #include <linux/hw_breakpoint.h>
> +#include <linux/rcupdate.h>
>
> #include <asm/uaccess.h>
> #include <asm/pgtable.h>
> @@ -1463,6 +1464,8 @@ long syscall_trace_enter(struct pt_regs *regs)
> {
> long ret = 0;
>
> + rcu_user_exit();
> +
> /*
> * If we stepped into a sysenter/syscall insn, it trapped in
> * kernel mode; do_debug() cleared TF and set TIF_SINGLESTEP.
> @@ -1526,4 +1529,6 @@ void syscall_trace_leave(struct pt_regs *regs)
> !test_thread_flag(TIF_SYSCALL_EMU);
> if (step || test_thread_flag(TIF_SYSCALL_TRACE))
> tracehook_report_syscall_exit(regs, step);
> +
> + rcu_user_enter();
> }
> --
> 1.7.8
>

2012-09-01 00:01:06

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH tip/core/rcu 16/26] alpha: Add missing RCU idle APIs on idle loop

On Thu, Aug 30, 2012 at 02:05:33PM -0700, Paul E. McKenney wrote:
> From: Frederic Weisbecker <[email protected]>
>
> In the old times, the whole idle task was considered
> as an RCU quiescent state. But as RCU became more and
> more successful overtime, some RCU read side critical
> section have been added even in the code of some
> architectures idle tasks, for tracing for example.
>
> So nowadays, rcu_idle_enter() and rcu_idle_exit() must
> be called by the architecture to tell RCU about the part
> in the idle loop that doesn't make use of rcu read side
> critical sections, typically the part that puts the CPU
> in low power mode.
>
> This is necessary for RCU to find the quiescent states in
> idle in order to complete grace periods.
>
> Add this missing pair of calls in the Alpha's idle loop.
>
> Reported-by: Paul E. McKenney <[email protected]>
> Signed-off-by: Frederic Weisbecker <[email protected]>
> Cc: Richard Henderson <[email protected]>
> Cc: Ivan Kokshaysky <[email protected]>
> Cc: Matt Turner <[email protected]>
> Cc: alpha <[email protected]>
> Cc: Paul E. McKenney <[email protected]>
> Cc: Michael Cree <[email protected]>
> Cc: 3.2.x.. <[email protected]>

Reviewed-by: Josh Triplett <[email protected]>

> arch/alpha/kernel/process.c | 3 +++
> 1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/arch/alpha/kernel/process.c b/arch/alpha/kernel/process.c
> index eac5e01..eb9558c 100644
> --- a/arch/alpha/kernel/process.c
> +++ b/arch/alpha/kernel/process.c
> @@ -28,6 +28,7 @@
> #include <linux/tty.h>
> #include <linux/console.h>
> #include <linux/slab.h>
> +#include <linux/rcupdate.h>
>
> #include <asm/reg.h>
> #include <asm/uaccess.h>
> @@ -54,9 +55,11 @@ cpu_idle(void)
> /* FIXME -- EV6 and LCA45 know how to power down
> the CPU. */
>
> + rcu_idle_enter();
> while (!need_resched())
> cpu_relax();
>
> + rcu_idle_exit();
> schedule_preempt_disabled();
> }
> }
> --
> 1.7.8
>

2012-09-01 00:01:20

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH tip/core/rcu 17/26] cris: Add missing RCU idle APIs on idle loop

On Thu, Aug 30, 2012 at 02:05:34PM -0700, Paul E. McKenney wrote:
> From: Frederic Weisbecker <[email protected]>
>
> In the old times, the whole idle task was considered
> as an RCU quiescent state. But as RCU became more and
> more successful overtime, some RCU read side critical
> section have been added even in the code of some
> architectures idle tasks, for tracing for example.
>
> So nowadays, rcu_idle_enter() and rcu_idle_exit() must
> be called by the architecture to tell RCU about the part
> in the idle loop that doesn't make use of rcu read side
> critical sections, typically the part that puts the CPU
> in low power mode.
>
> This is necessary for RCU to find the quiescent states in
> idle in order to complete grace periods.
>
> Add this missing pair of calls in the Cris's idle loop.
>
> Reported-by: Paul E. McKenney <[email protected]>
> Signed-off-by: Frederic Weisbecker <[email protected]>
> Cc: Mikael Starvik <[email protected]>
> Cc: Jesper Nilsson <[email protected]>
> Cc: Cris <[email protected]>
> Cc: 3.2.x.. <[email protected]>
> Cc: Paul E. McKenney <[email protected]>

Reviewed-by: Josh Triplett <[email protected]>

> arch/cris/kernel/process.c | 3 +++
> 1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/arch/cris/kernel/process.c b/arch/cris/kernel/process.c
> index 66fd017..7f65be6 100644
> --- a/arch/cris/kernel/process.c
> +++ b/arch/cris/kernel/process.c
> @@ -25,6 +25,7 @@
> #include <linux/elfcore.h>
> #include <linux/mqueue.h>
> #include <linux/reboot.h>
> +#include <linux/rcupdate.h>
>
> //#define DEBUG
>
> @@ -74,6 +75,7 @@ void cpu_idle (void)
> {
> /* endless idle loop with no priority at all */
> while (1) {
> + rcu_idle_enter();
> while (!need_resched()) {
> void (*idle)(void);
> /*
> @@ -86,6 +88,7 @@ void cpu_idle (void)
> idle = default_idle;
> idle();
> }
> + rcu_idle_exit();
> schedule_preempt_disabled();
> }
> }
> --
> 1.7.8
>

2012-09-01 00:01:35

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH tip/core/rcu 18/26] frv: Add missing RCU idle APIs on idle loop

On Thu, Aug 30, 2012 at 02:05:35PM -0700, Paul E. McKenney wrote:
> From: Frederic Weisbecker <[email protected]>
>
> In the old times, the whole idle task was considered
> as an RCU quiescent state. But as RCU became more and
> more successful overtime, some RCU read side critical
> section have been added even in the code of some
> architectures idle tasks, for tracing for example.
>
> So nowadays, rcu_idle_enter() and rcu_idle_exit() must
> be called by the architecture to tell RCU about the part
> in the idle loop that doesn't make use of rcu read side
> critical sections, typically the part that puts the CPU
> in low power mode.
>
> This is necessary for RCU to find the quiescent states in
> idle in order to complete grace periods.
>
> Add this missing pair of calls in the Frv's idle loop.
>
> Reported-by: Paul E. McKenney <[email protected]>
> Signed-off-by: Frederic Weisbecker <[email protected]>
> Cc: David Howells <[email protected]>
> Cc: 3.2.x.. <[email protected]>
> Cc: Paul E. McKenney <[email protected]>

Reviewed-by: Josh Triplett <[email protected]>

> arch/frv/kernel/process.c | 3 +++
> 1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/arch/frv/kernel/process.c b/arch/frv/kernel/process.c
> index ff95f50..2eb7fa5 100644
> --- a/arch/frv/kernel/process.c
> +++ b/arch/frv/kernel/process.c
> @@ -25,6 +25,7 @@
> #include <linux/reboot.h>
> #include <linux/interrupt.h>
> #include <linux/pagemap.h>
> +#include <linux/rcupdate.h>
>
> #include <asm/asm-offsets.h>
> #include <asm/uaccess.h>
> @@ -69,12 +70,14 @@ void cpu_idle(void)
> {
> /* endless idle loop with no priority at all */
> while (1) {
> + rcu_idle_enter();
> while (!need_resched()) {
> check_pgt_cache();
>
> if (!frv_dma_inprogress && idle)
> idle();
> }
> + rcu_idle_exit();
>
> schedule_preempt_disabled();
> }
> --
> 1.7.8
>

2012-09-01 00:02:12

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH tip/core/rcu 19/26] h8300: Add missing RCU idle APIs on idle loop

On Thu, Aug 30, 2012 at 02:05:36PM -0700, Paul E. McKenney wrote:
> From: Frederic Weisbecker <[email protected]>
>
> In the old times, the whole idle task was considered
> as an RCU quiescent state. But as RCU became more and
> more successful overtime, some RCU read side critical
> section have been added even in the code of some
> architectures idle tasks, for tracing for example.
>
> So nowadays, rcu_idle_enter() and rcu_idle_exit() must
> be called by the architecture to tell RCU about the part
> in the idle loop that doesn't make use of rcu read side
> critical sections, typically the part that puts the CPU
> in low power mode.
>
> This is necessary for RCU to find the quiescent states in
> idle in order to complete grace periods.
>
> Add this missing pair of calls in the h8300's idle loop.
>
> Reported-by: Paul E. McKenney <[email protected]>
> Signed-off-by: Frederic Weisbecker <[email protected]>
> Cc: Yoshinori Sato <[email protected]>
> Cc: 3.2.x.. <[email protected]>
> Cc: Paul E. McKenney <[email protected]>

Reviewed-by: Josh Triplett <[email protected]>

> arch/h8300/kernel/process.c | 3 +++
> 1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/arch/h8300/kernel/process.c b/arch/h8300/kernel/process.c
> index 0e9c315..f153ed1 100644
> --- a/arch/h8300/kernel/process.c
> +++ b/arch/h8300/kernel/process.c
> @@ -36,6 +36,7 @@
> #include <linux/reboot.h>
> #include <linux/fs.h>
> #include <linux/slab.h>
> +#include <linux/rcupdate.h>
>
> #include <asm/uaccess.h>
> #include <asm/traps.h>
> @@ -78,8 +79,10 @@ void (*idle)(void) = default_idle;
> void cpu_idle(void)
> {
> while (1) {
> + rcu_idle_enter();
> while (!need_resched())
> idle();
> + rcu_idle_exit();
> schedule_preempt_disabled();
> }
> }
> --
> 1.7.8
>

2012-09-01 00:02:30

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH tip/core/rcu 20/26] m32r: Add missing RCU idle APIs on idle loop

On Thu, Aug 30, 2012 at 02:05:37PM -0700, Paul E. McKenney wrote:
> From: Frederic Weisbecker <[email protected]>
>
> In the old times, the whole idle task was considered
> as an RCU quiescent state. But as RCU became more and
> more successful overtime, some RCU read side critical
> section have been added even in the code of some
> architectures idle tasks, for tracing for example.
>
> So nowadays, rcu_idle_enter() and rcu_idle_exit() must
> be called by the architecture to tell RCU about the part
> in the idle loop that doesn't make use of rcu read side
> critical sections, typically the part that puts the CPU
> in low power mode.
>
> This is necessary for RCU to find the quiescent states in
> idle in order to complete grace periods.
>
> Add this missing pair of calls in the m32r's idle loop.
>
> Reported-by: Paul E. McKenney <[email protected]>
> Signed-off-by: Frederic Weisbecker <[email protected]>
> Cc: Hirokazu Takata <[email protected]>
> Cc: 3.2.x.. <[email protected]>
> Cc: Paul E. McKenney <[email protected]>

Reviewed-by: Josh Triplett <[email protected]>

> arch/m32r/kernel/process.c | 3 +++
> 1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/arch/m32r/kernel/process.c b/arch/m32r/kernel/process.c
> index 3a4a32b..384e63f 100644
> --- a/arch/m32r/kernel/process.c
> +++ b/arch/m32r/kernel/process.c
> @@ -26,6 +26,7 @@
> #include <linux/ptrace.h>
> #include <linux/unistd.h>
> #include <linux/hardirq.h>
> +#include <linux/rcupdate.h>
>
> #include <asm/io.h>
> #include <asm/uaccess.h>
> @@ -82,6 +83,7 @@ void cpu_idle (void)
> {
> /* endless idle loop with no priority at all */
> while (1) {
> + rcu_idle_enter();
> while (!need_resched()) {
> void (*idle)(void) = pm_idle;
>
> @@ -90,6 +92,7 @@ void cpu_idle (void)
>
> idle();
> }
> + rcu_idle_exit();
> schedule_preempt_disabled();
> }
> }
> --
> 1.7.8
>

2012-09-01 00:02:53

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH tip/core/rcu 21/26] m68k: Add missing RCU idle APIs on idle loop

On Thu, Aug 30, 2012 at 02:05:38PM -0700, Paul E. McKenney wrote:
> From: Frederic Weisbecker <[email protected]>
>
> In the old times, the whole idle task was considered
> as an RCU quiescent state. But as RCU became more and
> more successful overtime, some RCU read side critical
> section have been added even in the code of some
> architectures idle tasks, for tracing for example.
>
> So nowadays, rcu_idle_enter() and rcu_idle_exit() must
> be called by the architecture to tell RCU about the part
> in the idle loop that doesn't make use of rcu read side
> critical sections, typically the part that puts the CPU
> in low power mode.
>
> This is necessary for RCU to find the quiescent states in
> idle in order to complete grace periods.
>
> Add this missing pair of calls in the m68k's idle loop.
>
> Reported-by: Paul E. McKenney <[email protected]>
> Signed-off-by: Frederic Weisbecker <[email protected]>
> Acked-by: Geert Uytterhoeven <[email protected]>
> Cc: m68k <[email protected]>
> Cc: 3.2.x.. <[email protected]>
> Cc: Paul E. McKenney <[email protected]>

Reviewed-by: Josh Triplett <[email protected]>

> arch/m68k/kernel/process.c | 3 +++
> 1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/arch/m68k/kernel/process.c b/arch/m68k/kernel/process.c
> index c488e3c..ac2892e 100644
> --- a/arch/m68k/kernel/process.c
> +++ b/arch/m68k/kernel/process.c
> @@ -25,6 +25,7 @@
> #include <linux/reboot.h>
> #include <linux/init_task.h>
> #include <linux/mqueue.h>
> +#include <linux/rcupdate.h>
>
> #include <asm/uaccess.h>
> #include <asm/traps.h>
> @@ -75,8 +76,10 @@ void cpu_idle(void)
> {
> /* endless idle loop with no priority at all */
> while (1) {
> + rcu_idle_enter();
> while (!need_resched())
> idle();
> + rcu_idle_exit();
> schedule_preempt_disabled();
> }
> }
> --
> 1.7.8
>

2012-09-01 00:03:11

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH tip/core/rcu 22/26] mn10300: Add missing RCU idle APIs on idle loop

On Thu, Aug 30, 2012 at 02:05:39PM -0700, Paul E. McKenney wrote:
> From: Frederic Weisbecker <[email protected]>
>
> In the old times, the whole idle task was considered
> as an RCU quiescent state. But as RCU became more and
> more successful overtime, some RCU read side critical
> section have been added even in the code of some
> architectures idle tasks, for tracing for example.
>
> So nowadays, rcu_idle_enter() and rcu_idle_exit() must
> be called by the architecture to tell RCU about the part
> in the idle loop that doesn't make use of rcu read side
> critical sections, typically the part that puts the CPU
> in low power mode.
>
> This is necessary for RCU to find the quiescent states in
> idle in order to complete grace periods.
>
> Add this missing pair of calls in the mn10300's idle loop.
>
> Reported-by: Paul E. McKenney <[email protected]>
> Signed-off-by: Frederic Weisbecker <[email protected]>
> Cc: David Howells <[email protected]>
> Cc: Koichi Yasutake <[email protected]>
> Cc: 3.2.x.. <[email protected]>
> Cc: Paul E. McKenney <[email protected]>

Reviewed-by: Josh Triplett <[email protected]>

> arch/mn10300/kernel/process.c | 3 +++
> 1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/arch/mn10300/kernel/process.c b/arch/mn10300/kernel/process.c
> index 7dab0cd..e9cceba 100644
> --- a/arch/mn10300/kernel/process.c
> +++ b/arch/mn10300/kernel/process.c
> @@ -25,6 +25,7 @@
> #include <linux/err.h>
> #include <linux/fs.h>
> #include <linux/slab.h>
> +#include <linux/rcupdate.h>
> #include <asm/uaccess.h>
> #include <asm/pgtable.h>
> #include <asm/io.h>
> @@ -107,6 +108,7 @@ void cpu_idle(void)
> {
> /* endless idle loop with no priority at all */
> for (;;) {
> + rcu_idle_enter();
> while (!need_resched()) {
> void (*idle)(void);
>
> @@ -121,6 +123,7 @@ void cpu_idle(void)
> }
> idle();
> }
> + rcu_idle_exit();
>
> schedule_preempt_disabled();
> }
> --
> 1.7.8
>

2012-09-01 00:03:25

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH tip/core/rcu 23/26] parisc: Add missing RCU idle APIs on idle loop

On Thu, Aug 30, 2012 at 02:05:40PM -0700, Paul E. McKenney wrote:
> From: Frederic Weisbecker <[email protected]>
>
> In the old times, the whole idle task was considered
> as an RCU quiescent state. But as RCU became more and
> more successful overtime, some RCU read side critical
> section have been added even in the code of some
> architectures idle tasks, for tracing for example.
>
> So nowadays, rcu_idle_enter() and rcu_idle_exit() must
> be called by the architecture to tell RCU about the part
> in the idle loop that doesn't make use of rcu read side
> critical sections, typically the part that puts the CPU
> in low power mode.
>
> This is necessary for RCU to find the quiescent states in
> idle in order to complete grace periods.
>
> Add this missing pair of calls in the parisc's idle loop.
>
> Reported-by: Paul E. McKenney <[email protected]>
> Signed-off-by: Frederic Weisbecker <[email protected]>
> Cc: James E.J. Bottomley <[email protected]>
> Cc: Helge Deller <[email protected]>
> Cc: Parisc <[email protected]>
> Cc: 3.2.x.. <[email protected]>
> Cc: Paul E. McKenney <[email protected]>

Reviewed-by: Josh Triplett <[email protected]>

> arch/parisc/kernel/process.c | 3 +++
> 1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/arch/parisc/kernel/process.c b/arch/parisc/kernel/process.c
> index d4b94b3..c54a4db 100644
> --- a/arch/parisc/kernel/process.c
> +++ b/arch/parisc/kernel/process.c
> @@ -48,6 +48,7 @@
> #include <linux/unistd.h>
> #include <linux/kallsyms.h>
> #include <linux/uaccess.h>
> +#include <linux/rcupdate.h>
>
> #include <asm/io.h>
> #include <asm/asm-offsets.h>
> @@ -69,8 +70,10 @@ void cpu_idle(void)
>
> /* endless idle loop with no priority at all */
> while (1) {
> + rcu_idle_enter();
> while (!need_resched())
> barrier();
> + rcu_idle_exit();
> schedule_preempt_disabled();
> check_pgt_cache();
> }
> --
> 1.7.8
>

2012-09-01 00:04:47

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH tip/core/rcu 24/26] score: Add missing RCU idle APIs on idle loop

On Thu, Aug 30, 2012 at 02:05:41PM -0700, Paul E. McKenney wrote:
> From: Frederic Weisbecker <[email protected]>
>
> In the old times, the whole idle task was considered
> as an RCU quiescent state. But as RCU became more and
> more successful overtime, some RCU read side critical
> section have been added even in the code of some
> architectures idle tasks, for tracing for example.
>
> So nowadays, rcu_idle_enter() and rcu_idle_exit() must
> be called by the architecture to tell RCU about the part
> in the idle loop that doesn't make use of rcu read side
> critical sections, typically the part that puts the CPU
> in low power mode.
>
> This is necessary for RCU to find the quiescent states in
> idle in order to complete grace periods.
>
> Add this missing pair of calls in the scores's idle loop.

s/scores's/score/ or s/the scores's/score's/

> Reported-by: Paul E. McKenney <[email protected]>
> Signed-off-by: Frederic Weisbecker <[email protected]>
> Cc: Chen Liqin <[email protected]>
> Cc: Lennox Wu <[email protected]>
> Cc: 3.2.x.. <[email protected]>
> Cc: Paul E. McKenney <[email protected]>

With the fix above,
Reviewed-by: Josh Triplett <[email protected]>

> arch/score/kernel/process.c | 4 +++-
> 1 files changed, 3 insertions(+), 1 deletions(-)
>
> diff --git a/arch/score/kernel/process.c b/arch/score/kernel/process.c
> index 2707023..637970c 100644
> --- a/arch/score/kernel/process.c
> +++ b/arch/score/kernel/process.c
> @@ -27,6 +27,7 @@
> #include <linux/reboot.h>
> #include <linux/elfcore.h>
> #include <linux/pm.h>
> +#include <linux/rcupdate.h>
>
> void (*pm_power_off)(void);
> EXPORT_SYMBOL(pm_power_off);
> @@ -50,9 +51,10 @@ void __noreturn cpu_idle(void)
> {
> /* endless idle loop with no priority at all */
> while (1) {
> + rcu_idle_enter();
> while (!need_resched())
> barrier();
> -
> + rcu_idle_exit();
> schedule_preempt_disabled();
> }
> }
> --
> 1.7.8
>

2012-09-01 00:05:16

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH tip/core/rcu 25/26] xtensa: Add missing RCU idle APIs on idle loop

On Thu, Aug 30, 2012 at 02:05:42PM -0700, Paul E. McKenney wrote:
> From: Frederic Weisbecker <[email protected]>
>
> In the old times, the whole idle task was considered
> as an RCU quiescent state. But as RCU became more and
> more successful overtime, some RCU read side critical
> section have been added even in the code of some
> architectures idle tasks, for tracing for example.
>
> So nowadays, rcu_idle_enter() and rcu_idle_exit() must
> be called by the architecture to tell RCU about the part
> in the idle loop that doesn't make use of rcu read side
> critical sections, typically the part that puts the CPU
> in low power mode.
>
> This is necessary for RCU to find the quiescent states in
> idle in order to complete grace periods.
>
> Add this missing pair of calls in the xtensa's idle loop.
>
> Reported-by: Paul E. McKenney <[email protected]>
> Signed-off-by: Frederic Weisbecker <[email protected]>
> Cc: Chris Zankel <[email protected]>
> Cc: 3.2.x.. <[email protected]>
> Cc: Paul E. McKenney <[email protected]>

Reviewed-by: Josh Triplett <[email protected]>

> arch/xtensa/kernel/process.c | 3 +++
> 1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/arch/xtensa/kernel/process.c b/arch/xtensa/kernel/process.c
> index 2c8d6a3..bc44311 100644
> --- a/arch/xtensa/kernel/process.c
> +++ b/arch/xtensa/kernel/process.c
> @@ -31,6 +31,7 @@
> #include <linux/mqueue.h>
> #include <linux/fs.h>
> #include <linux/slab.h>
> +#include <linux/rcupdate.h>
>
> #include <asm/pgtable.h>
> #include <asm/uaccess.h>
> @@ -110,8 +111,10 @@ void cpu_idle(void)
>
> /* endless idle loop with no priority at all */
> while (1) {
> + rcu_idle_enter();
> while (!need_resched())
> platform_idle();
> + rcu_idle_exit();
> schedule_preempt_disabled();
> }
> }
> --
> 1.7.8
>

2012-09-01 00:05:33

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH tip/core/rcu 26/26] ia64: Add missing RCU idle APIs on idle loop

On Thu, Aug 30, 2012 at 02:05:43PM -0700, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <[email protected]>
>
> Traditionally, the entire idle task served as an RCU quiescent state.
> But when RCU read side critical sections started appearing within the
> idle loop, this traditional strategy became untenable. The fix was to
> create new RCU APIs named rcu_idle_enter() and rcu_idle_exit(), which
> must be called by each architecture's idle loop so that RCU can tell
> when it is safe to ignore a given idle CPU.
>
> Unfortunately, this fix was never applied to ia64, a shortcoming remedied
> by this commit.
>
> Reported by: Tony Luck <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>
> Tested by: Tony Luck <[email protected]>

Reviewed-by: Josh Triplett <[email protected]>

> arch/ia64/kernel/process.c | 3 +++
> 1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/arch/ia64/kernel/process.c b/arch/ia64/kernel/process.c
> index dd6fc14..3e316ec 100644
> --- a/arch/ia64/kernel/process.c
> +++ b/arch/ia64/kernel/process.c
> @@ -29,6 +29,7 @@
> #include <linux/kdebug.h>
> #include <linux/utsname.h>
> #include <linux/tracehook.h>
> +#include <linux/rcupdate.h>
>
> #include <asm/cpu.h>
> #include <asm/delay.h>
> @@ -279,6 +280,7 @@ cpu_idle (void)
>
> /* endless idle loop with no priority at all */
> while (1) {
> + rcu_idle_enter();
> if (can_do_pal_halt) {
> current_thread_info()->status &= ~TS_POLLING;
> /*
> @@ -309,6 +311,7 @@ cpu_idle (void)
> normal_xtp();
> #endif
> }
> + rcu_idle_exit();
> schedule_preempt_disabled();
> check_pgt_cache();
> if (cpu_is_offline(cpu))
> --
> 1.7.8
>