2012-02-02 00:43:04

by Paul E. McKenney

Subject: [PATCH RFC idle] Make arm, sh, and x86 stop using RCU when idle

Hello!

RCU's shiny new diagnostics (thank you, Frederic!) for using RCU when idle
located a few problems in arm, sh, and x86. This patch series contains
alleged fixes for these problems. And they are real problems -- if RCU
believes that the CPU is idle, it is ignoring it. Which means that the
idle CPU can say "rcu_read_lock()" all it likes, but there will be no
useful effect.
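
To make the failure mode concrete, here is a rough userspace sketch of the
bookkeeping involved. It is not kernel code: every model_* name and the
struct below are hypothetical stand-ins invented for illustration, and the
real RCU implementation tracks this state quite differently. The sketch only
shows that a read-side critical section entered while RCU considers the CPU
idle does nothing to hold off a grace period.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one CPU as seen by RCU (not a kernel structure). */
struct model_cpu {
	bool rcu_watching;   /* false between idle enter and idle exit */
	int  read_nesting;   /* reader depth that RCU actually accounts for */
	int  ignored_locks;  /* read locks taken while RCU was not watching */
};

static void model_cpu_init(struct model_cpu *cpu)
{
	cpu->rcu_watching = true;
	cpu->read_nesting = 0;
	cpu->ignored_locks = 0;
}

/* Stand-ins for rcu_idle_enter() and rcu_idle_exit(). */
static void model_rcu_idle_enter(struct model_cpu *cpu)
{
	cpu->rcu_watching = false;
}

static void model_rcu_idle_exit(struct model_cpu *cpu)
{
	cpu->rcu_watching = true;
}

/* Stand-in for rcu_read_lock(): only counts if RCU is watching this CPU. */
static void model_rcu_read_lock(struct model_cpu *cpu)
{
	if (!cpu->rcu_watching) {
		cpu->ignored_locks++;   /* the bug the new diagnostics report */
		return;
	}
	cpu->read_nesting++;
}

/* Stand-in for rcu_read_unlock(). */
static void model_rcu_read_unlock(struct model_cpu *cpu)
{
	if (cpu->rcu_watching && cpu->read_nesting > 0)
		cpu->read_nesting--;
}

/* A grace period waits for this CPU only if it is watched and in a reader. */
static bool model_gp_must_wait_for(const struct model_cpu *cpu)
{
	return cpu->rcu_watching && cpu->read_nesting > 0;
}
```

In this model, a model_rcu_read_lock() issued after model_rcu_idle_enter()
leaves model_gp_must_wait_for() false, so an updater would be free to
reclaim whatever the "reader" is looking at.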

I was tempted to break these up, but doing so is bad for bisectability.

Thanx, Paul

------------------------------------------------------------------------

arch/arm/kernel/process.c | 2 --
arch/arm/mach-at91/cpuidle.c | 3 +++
arch/arm/mach-davinci/cpuidle.c | 3 +++
arch/arm/mach-exynos/common.c | 2 ++
arch/arm/mach-highbank/pm.c | 12 ++++++++++++
arch/arm/mach-imx/mm-imx3.c | 3 +++
arch/arm/mach-imx/pm-imx27.c | 4 ++++
arch/arm/mach-imx/pm-imx6q.c | 4 ++++
arch/arm/mach-kirkwood/cpuidle.c | 3 +++
arch/arm/mach-mx5/mm.c | 3 +++
arch/arm/mach-mx5/pm-imx5.c | 3 +++
arch/arm/mach-mxs/pm.c | 4 ++++
arch/arm/mach-omap1/pm.c | 4 ++++
arch/arm/mach-omap2/pm24xx.c | 2 ++
arch/arm/mach-omap2/pm34xx.c | 2 ++
arch/arm/mach-omap2/pm44xx.c | 3 +++
arch/arm/mach-pnx4008/pm.c | 2 ++
arch/arm/mach-prima2/pm.c | 4 ++++
arch/arm/mach-s5p64x0/common.c | 2 ++
arch/arm/mach-s5pc100/common.c | 2 ++
arch/arm/mach-s5pv210/common.c | 2 ++
arch/arm/mach-shmobile/cpuidle.c | 3 +++
arch/arm/mach-shmobile/pm-sh7372.c | 8 ++++++++
arch/sh/kernel/idle.c | 11 ++++++++---
arch/x86/kernel/process.c | 13 ++++++++++++-
arch/x86/kernel/process_32.c | 2 --
arch/x86/kernel/process_64.c | 4 ----
drivers/idle/intel_idle.c | 2 ++
28 files changed, 100 insertions(+), 12 deletions(-)


2012-02-02 00:43:39

by Paul E. McKenney

Subject: [PATCH RFC idle 3/3] sh: Avoid invoking RCU when CPU is idle

From: "Paul E. McKenney" <[email protected]>

The idle loop is a quiescent state for RCU, which means that RCU ignores
CPUs that have told RCU that they are idle via rcu_idle_enter().
There are nevertheless quite a few places where idle CPUs use RCU, most
commonly indirectly via tracing. This patch fixes these problems for SH.

Many of these bugs have been in the kernel for quite some time, but
Frederic's recent change now gives warnings.

This patch takes the straightforward approach of pushing the
rcu_idle_enter()/rcu_idle_exit() pair further down into the core of the
idle loop.

Signed-off-by: Paul E. McKenney <[email protected]>
Cc: Paul Mundt <[email protected]>
Cc: Mike Frysinger <[email protected]>
Cc: [email protected]
---
arch/sh/kernel/idle.c | 11 ++++++++---
1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/sh/kernel/idle.c b/arch/sh/kernel/idle.c
index 406508d..d125668 100644
--- a/arch/sh/kernel/idle.c
+++ b/arch/sh/kernel/idle.c
@@ -53,8 +53,10 @@ static inline int hlt_works(void)
static void poll_idle(void)
{
local_irq_enable();
+ rcu_idle_enter();
while (!need_resched())
cpu_relax();
+ rcu_idle_exit();
}

void default_idle(void)
@@ -66,9 +68,14 @@ void default_idle(void)
set_bl_bit();
if (!need_resched()) {
local_irq_enable();
+ rcu_idle_enter();
cpu_sleep();
- } else
+ rcu_idle_exit();
+ } else {
local_irq_enable();
+ rcu_idle_enter();
+ rcu_idle_exit();
+ }

set_thread_flag(TIF_POLLING_NRFLAG);
clear_bl_bit();
@@ -90,7 +97,6 @@ void cpu_idle(void)
/* endless idle loop with no priority at all */
while (1) {
tick_nohz_idle_enter();
- rcu_idle_enter();

while (!need_resched()) {
check_pgt_cache();
@@ -112,7 +118,6 @@ void cpu_idle(void)
start_critical_timings();
}

- rcu_idle_exit();
tick_nohz_idle_exit();
preempt_enable_no_resched();
schedule();
--
1.7.8

2012-02-02 00:43:59

by Paul E. McKenney

Subject: [PATCH RFC idle 1/3] x86: Avoid invoking RCU when CPU is idle

From: "Paul E. McKenney" <[email protected]>

The idle loop is a quiescent state for RCU, which means that RCU ignores
CPUs that have told RCU that they are idle via rcu_idle_enter(). There
are nevertheless quite a few places where idle CPUs use RCU, most commonly
indirectly via tracing. This patch fixes these problems for x86.

Many of these bugs have been in the kernel for quite some time, but
Frederic's recent change now gives warnings.

This patch takes the straightforward approach of pushing the
rcu_idle_enter()/rcu_idle_exit() pair further down into the core
of the idle loop.

Reported-by: Eric Dumazet <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Tested-by: Eric Dumazet <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Kamalesh Babulal <[email protected]>
Cc: Stephen Wilson <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
arch/x86/kernel/process.c | 13 ++++++++++++-
arch/x86/kernel/process_32.c | 2 --
arch/x86/kernel/process_64.c | 4 ----
drivers/idle/intel_idle.c | 2 ++
4 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 15763af..f6978b0 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -386,17 +386,21 @@ void default_idle(void)
*/
smp_mb();

+ rcu_idle_enter();
if (!need_resched())
safe_halt(); /* enables interrupts racelessly */
else
local_irq_enable();
+ rcu_idle_exit();
current_thread_info()->status |= TS_POLLING;
trace_power_end(smp_processor_id());
trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
} else {
local_irq_enable();
/* loop is done by the caller */
+ rcu_idle_enter();
cpu_relax();
+ rcu_idle_exit();
}
}
#ifdef CONFIG_APM_MODULE
@@ -457,14 +461,19 @@ static void mwait_idle(void)

__monitor((void *)&current_thread_info()->flags, 0, 0);
smp_mb();
+ rcu_idle_enter();
if (!need_resched())
__sti_mwait(0, 0);
else
local_irq_enable();
+ rcu_idle_exit();
trace_power_end(smp_processor_id());
trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
- } else
+ } else {
local_irq_enable();
+ rcu_idle_enter();
+ rcu_idle_exit();
+ }
}

/*
@@ -477,8 +486,10 @@ static void poll_idle(void)
trace_power_start(POWER_CSTATE, 0, smp_processor_id());
trace_cpu_idle(0, smp_processor_id());
local_irq_enable();
+ rcu_idle_enter();
while (!need_resched())
cpu_relax();
+ rcu_idle_exit();
trace_power_end(smp_processor_id());
trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
}
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 485204f..6d9d4d5 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -100,7 +100,6 @@ void cpu_idle(void)
/* endless idle loop with no priority at all */
while (1) {
tick_nohz_idle_enter();
- rcu_idle_enter();
while (!need_resched()) {

check_pgt_cache();
@@ -117,7 +116,6 @@ void cpu_idle(void)
pm_idle();
start_critical_timings();
}
- rcu_idle_exit();
tick_nohz_idle_exit();
preempt_enable_no_resched();
schedule();
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 9b9fe4a..55a1a35 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -140,13 +140,9 @@ void cpu_idle(void)
/* Don't trace irqs off for idle */
stop_critical_timings();

- /* enter_idle() needs rcu for notifiers */
- rcu_idle_enter();
-
if (cpuidle_idle_call())
pm_idle();

- rcu_idle_exit();
start_critical_timings();

/* In many cases the interrupt that ended idle
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 20bce51..a9ddab8 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -261,6 +261,7 @@ static int intel_idle(struct cpuidle_device *dev,
kt_before = ktime_get_real();

stop_critical_timings();
+ rcu_idle_enter();
if (!need_resched()) {

__monitor((void *)&current_thread_info()->flags, 0, 0);
@@ -268,6 +269,7 @@ static int intel_idle(struct cpuidle_device *dev,
if (!need_resched())
__mwait(eax, ecx);
}
+ rcu_idle_exit();

start_critical_timings();

--
1.7.8

2012-02-02 00:48:51

by Josh Triplett

Subject: Re: [PATCH RFC idle] Make arm, sh, and x86 stop using RCU when idle

On Wed, Feb 01, 2012 at 04:42:53PM -0800, Paul E. McKenney wrote:
> RCU's shiny new diagnostics (thank you, Frederic!) for using RCU when idle
> located a few problems in arm, sh, and x86. This patch series contains
> alleged fixes for these problems. And they are real problems -- if RCU
> believes that the CPU is idle, it is ignoring it. Which means that the
> idle CPU can say "rcu_read_lock()" all it likes, but there will be no
> useful effect.

Having to put these calls down in every idle driver seems like such an
ugly layering violation. Not that I have a better alternative to
suggest...

- Josh Triplett

2012-02-02 01:14:58

by Paul E. McKenney

Subject: Re: [PATCH RFC idle] Make arm, sh, and x86 stop using RCU when idle

On Wed, Feb 01, 2012 at 04:48:29PM -0800, Josh Triplett wrote:
> On Wed, Feb 01, 2012 at 04:42:53PM -0800, Paul E. McKenney wrote:
> > RCU's shiny new diagnostics (thank you, Frederic!) for using RCU when idle
> > located a few problems in arm, sh, and x86. This patch series contains
> > alleged fixes for these problems. And they are real problems -- if RCU
> > believes that the CPU is idle, it is ignoring it. Which means that the
> > idle CPU can say "rcu_read_lock()" all it likes, but there will be no
> > useful effect.
>
> Having to put these calls down in every idle driver seems like such an
> ugly layering violation. Not that I have a better alternative to
> suggest...

If you do happen to think of one, I would very much like to hear of it! ;-)

Thanx, Paul

2012-02-02 01:54:39

by Frederic Weisbecker

Subject: Re: [PATCH RFC idle 1/3] x86: Avoid invoking RCU when CPU is idle

On Wed, Feb 01, 2012 at 04:43:22PM -0800, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <[email protected]>
>
> The idle loop is a quiescent state for RCU, which means that RCU ignores
> CPUs that have told RCU that they are idle via rcu_idle_enter(). There
> are nevertheless quite a few places where idle CPUs use RCU, most commonly
> indirectly via tracing. This patch fixes these problems for x86.
>
> Many of these bugs have been in the kernel for quite some time, but
> Frederic's recent change now gives warnings.
>
> This patch takes the straightforward approach of pushing the
> rcu_idle_enter()/rcu_idle_exit() pair further down into the core
> of the idle loop.

I guess this is about the trace_power_*() things, right?

Acked-by: Frederic Weisbecker <[email protected]>

Thanks.

2012-02-02 02:31:17

by Paul Mundt

Subject: Re: [PATCH RFC idle] Make arm, sh, and x86 stop using RCU when idle

On Wed, Feb 01, 2012 at 04:42:53PM -0800, Paul E. McKenney wrote:
> Hello!
>
> RCU's shiny new diagnostics (thank you, Frederic!) for using RCU when idle
> located a few problems in arm, sh, and x86. This patch series contains
> alleged fixes for these problems. And they are real problems -- if RCU
> believes that the CPU is idle, it is ignoring it. Which means that the
> idle CPU can say "rcu_read_lock()" all it likes, but there will be no
> useful effect.
>
> I was tempted to break these up, but doing so is bad for bisectability.
>
Presumably the same changes will also need to be reflected in cpuidle?

If so, here's a start:

Signed-off-by: Paul Mundt <[email protected]>

---

diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index 59f4261..97adcd4 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -18,6 +18,7 @@
#include <linux/ktime.h>
#include <linux/hrtimer.h>
#include <linux/module.h>
+#include <linux/rcupdate.h>
#include <trace/events/power.h>

#include "cpuidle.h"
@@ -89,6 +90,8 @@ int cpuidle_idle_call(void)
next_state = cpuidle_curr_governor->select(drv, dev);
if (need_resched()) {
local_irq_enable();
+ rcu_idle_enter();
+ rcu_idle_exit();
return 0;
}

@@ -96,9 +99,11 @@ int cpuidle_idle_call(void)

trace_power_start(POWER_CSTATE, next_state, dev->cpu);
trace_cpu_idle(next_state, dev->cpu);
+ rcu_idle_enter();

entered_state = target_state->enter(dev, drv, next_state);

+ rcu_idle_exit();
trace_power_end(dev->cpu);
trace_cpu_idle(PWR_EVENT_EXIT, dev->cpu);

@@ -173,8 +178,10 @@ static int poll_idle(struct cpuidle_device *dev,

t1 = ktime_get();
local_irq_enable();
+ rcu_idle_enter();
while (!need_resched())
cpu_relax();
+ rcu_idle_exit();

t2 = ktime_get();
diff = ktime_to_us(ktime_sub(t2, t1));

2012-02-02 04:55:52

by Paul E. McKenney

Subject: Re: [PATCH RFC idle 1/3] x86: Avoid invoking RCU when CPU is idle

On Thu, Feb 02, 2012 at 02:54:30AM +0100, Frederic Weisbecker wrote:
> On Wed, Feb 01, 2012 at 04:43:22PM -0800, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" <[email protected]>
> >
> > The idle loop is a quiescent state for RCU, which means that RCU ignores
> > CPUs that have told RCU that they are idle via rcu_idle_enter(). There
> > are nevertheless quite a few places where idle CPUs use RCU, most commonly
> > indirectly via tracing. This patch fixes these problems for x86.
> >
> > Many of these bugs have been in the kernel for quite some time, but
> > Frederic's recent change now gives warnings.
> >
> > This patch takes the straightforward approach of pushing the
> > rcu_idle_enter()/rcu_idle_exit() pair further down into the core
> > of the idle loop.
>
> I guess this is about the trace_power_*() things, right?

Indeed it is! Alternative suggestions gladly accepted. ;-)

Thanx, Paul

> Acked-by: Frederic Weisbecker <[email protected]>
>
> Thanks.
>

2012-02-02 04:58:54

by Paul E. McKenney

Subject: Re: [PATCH RFC idle] Make arm, sh, and x86 stop using RCU when idle

On Thu, Feb 02, 2012 at 11:29:59AM +0900, Paul Mundt wrote:
> On Wed, Feb 01, 2012 at 04:42:53PM -0800, Paul E. McKenney wrote:
> > Hello!
> >
> > RCU's shiny new diagnostics (thank you, Frederic!) for using RCU when idle
> > located a few problems in arm, sh, and x86. This patch series contains
> > alleged fixes for these problems. And they are real problems -- if RCU
> > believes that the CPU is idle, it is ignoring it. Which means that the
> > idle CPU can say "rcu_read_lock()" all it likes, but there will be no
> > useful effect.
> >
> > I was tempted to break these up, but doing so is bad for bisectability.
> >
> Presumably the same changes will also need to be reflected in cpuidle?

Hmmm... I do need to check that...

> If so, here's a start:
>
> Signed-off-by: Paul Mundt <[email protected]>

The ARM guys are choking on this, so there might be an update. Some
of them are thinking in terms of removing the tracing from the inner
idle loop. Any solution works for me -- as long as there is no use
of RCU between the rcu_idle_enter() and the matching rcu_idle_exit(),
both RCU and I are happy. ;-)
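
The rule above can be sketched as a small userspace model. This is only an
illustration under stated assumptions: the model_* functions are
hypothetical, model_trace_event() stands in for a tracepoint such as
trace_power_start() that uses RCU internally, and nothing here is the
kernel's actual diagnostic code. It compares the old placement, with the
enter/exit pair in the outer idle loop, against a placement that wraps only
the low-power wait.

```c
#include <assert.h>
#include <stdbool.h>

static bool rcu_watching = true; /* false inside the idle window */
static int violations;           /* RCU uses seen inside the idle window */

/* Stand-ins for rcu_idle_enter() and rcu_idle_exit(). */
static void model_idle_enter(void) { rcu_watching = false; }
static void model_idle_exit(void)  { rcu_watching = true; }

/* Stand-in for a tracepoint, which relies on RCU read-side protection. */
static void model_trace_event(void)
{
	if (!rcu_watching)
		violations++;
}

/* Stand-in for the actual halt or mwait; it makes no use of RCU. */
static void model_low_power_wait(void)
{
}

/* Old placement: the pair surrounds tracing as well as the wait. */
static int model_outer_placement(void)
{
	violations = 0;
	model_idle_enter();
	model_trace_event();     /* RCU used while idle: violation */
	model_low_power_wait();
	model_trace_event();     /* RCU used while idle: violation */
	model_idle_exit();
	return violations;
}

/* Pushed-down placement: the pair surrounds only the wait itself. */
static int model_inner_placement(void)
{
	violations = 0;
	model_trace_event();     /* RCU is still watching here */
	model_idle_enter();
	model_low_power_wait();
	model_idle_exit();
	model_trace_event();     /* RCU is watching again */
	return violations;
}
```

Removing the tracing from the inner idle loop, as some of the ARM folks
suggest, would also bring the outer placement's count to zero in this
model, since the window would then contain no RCU users at all.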

Thanx, Paul

> ---
>
> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
> index 59f4261..97adcd4 100644
> --- a/drivers/cpuidle/cpuidle.c
> +++ b/drivers/cpuidle/cpuidle.c
> @@ -18,6 +18,7 @@
> #include <linux/ktime.h>
> #include <linux/hrtimer.h>
> #include <linux/module.h>
> +#include <linux/rcupdate.h>
> #include <trace/events/power.h>
>
> #include "cpuidle.h"
> @@ -89,6 +90,8 @@ int cpuidle_idle_call(void)
> next_state = cpuidle_curr_governor->select(drv, dev);
> if (need_resched()) {
> local_irq_enable();
> + rcu_idle_enter();
> + rcu_idle_exit();
> return 0;
> }
>
> @@ -96,9 +99,11 @@ int cpuidle_idle_call(void)
>
> trace_power_start(POWER_CSTATE, next_state, dev->cpu);
> trace_cpu_idle(next_state, dev->cpu);
> + rcu_idle_enter();
>
> entered_state = target_state->enter(dev, drv, next_state);
>
> + rcu_idle_exit();
> trace_power_end(dev->cpu);
> trace_cpu_idle(PWR_EVENT_EXIT, dev->cpu);
>
> @@ -173,8 +178,10 @@ static int poll_idle(struct cpuidle_device *dev,
>
> t1 = ktime_get();
> local_irq_enable();
> + rcu_idle_enter();
> while (!need_resched())
> cpu_relax();
> + rcu_idle_exit();
>
> t2 = ktime_get();
> diff = ktime_to_us(ktime_sub(t2, t1));