Lots of things take locks; due to a wee bug, rcu_lockdep didn't notice
that the locking tracepoints were using RCU.

Push rcu_idle_{enter,exit}() as deep as possible into the idle paths;
this also resolves a lot of _rcuidle()/RCU_NONIDLE() usage.

Specifically, sched_clock_idle_wakeup_event() will use ktime, which
will use seqlocks, which will tickle lockdep, and
stop_critical_timings() uses locks.

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Steven Rostedt (VMware) <[email protected]>
Tested-by: Marco Elver <[email protected]>
---
drivers/cpuidle/cpuidle.c | 12 ++++++++----
kernel/sched/idle.c | 22 ++++++++--------------
2 files changed, 16 insertions(+), 18 deletions(-)
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -145,13 +145,14 @@ static void enter_s2idle_proper(struct c
* executing it contains RCU usage regarded as invalid in the idle
* context, so tell RCU about that.
*/
- RCU_NONIDLE(tick_freeze());
+ tick_freeze();
/*
* The state used here cannot be a "coupled" one, because the "coupled"
* cpuidle mechanism enables interrupts and doing that with timekeeping
* suspended is generally unsafe.
*/
stop_critical_timings();
+ rcu_idle_enter();
drv->states[index].enter_s2idle(dev, drv, index);
if (WARN_ON_ONCE(!irqs_disabled()))
local_irq_disable();
@@ -160,7 +161,8 @@ static void enter_s2idle_proper(struct c
* first CPU executing it calls functions containing RCU read-side
* critical sections, so tell RCU about that.
*/
- RCU_NONIDLE(tick_unfreeze());
+ rcu_idle_exit();
+ tick_unfreeze();
start_critical_timings();
time_end = ns_to_ktime(local_clock());
@@ -229,16 +231,18 @@ int cpuidle_enter_state(struct cpuidle_d
/* Take note of the planned idle state. */
sched_idle_set_state(target_state);
- trace_cpu_idle_rcuidle(index, dev->cpu);
+ trace_cpu_idle(index, dev->cpu);
time_start = ns_to_ktime(local_clock());
stop_critical_timings();
+ rcu_idle_enter();
entered_state = target_state->enter(dev, drv, index);
+ rcu_idle_exit();
start_critical_timings();
sched_clock_idle_wakeup_event();
time_end = ns_to_ktime(local_clock());
- trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, dev->cpu);
+ trace_cpu_idle(PWR_EVENT_EXIT, dev->cpu);
/* The cpu is no longer idle or about to enter idle. */
sched_idle_set_state(NULL);
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -54,17 +54,18 @@ __setup("hlt", cpu_idle_nopoll_setup);
static noinline int __cpuidle cpu_idle_poll(void)
{
+ trace_cpu_idle(0, smp_processor_id());
+ stop_critical_timings();
rcu_idle_enter();
- trace_cpu_idle_rcuidle(0, smp_processor_id());
local_irq_enable();
- stop_critical_timings();
while (!tif_need_resched() &&
- (cpu_idle_force_poll || tick_check_broadcast_expired()))
+ (cpu_idle_force_poll || tick_check_broadcast_expired()))
cpu_relax();
- start_critical_timings();
- trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, smp_processor_id());
+
rcu_idle_exit();
+ start_critical_timings();
+ trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
return 1;
}
@@ -91,7 +92,9 @@ void __cpuidle default_idle_call(void)
local_irq_enable();
} else {
stop_critical_timings();
+ rcu_idle_enter();
arch_cpu_idle();
+ rcu_idle_exit();
start_critical_timings();
}
}
@@ -158,7 +161,6 @@ static void cpuidle_idle_call(void)
if (cpuidle_not_available(drv, dev)) {
tick_nohz_idle_stop_tick();
- rcu_idle_enter();
default_idle_call();
goto exit_idle;
@@ -178,21 +180,17 @@ static void cpuidle_idle_call(void)
u64 max_latency_ns;
if (idle_should_enter_s2idle()) {
- rcu_idle_enter();
entered_state = call_cpuidle_s2idle(drv, dev);
if (entered_state > 0)
goto exit_idle;
- rcu_idle_exit();
-
max_latency_ns = U64_MAX;
} else {
max_latency_ns = dev->forced_idle_latency_limit_ns;
}
tick_nohz_idle_stop_tick();
- rcu_idle_enter();
next_state = cpuidle_find_deepest_state(drv, dev, max_latency_ns);
call_cpuidle(drv, dev, next_state);
@@ -209,8 +207,6 @@ static void cpuidle_idle_call(void)
else
tick_nohz_idle_retain_tick();
- rcu_idle_enter();
-
entered_state = call_cpuidle(drv, dev, next_state);
/*
* Give the governor an opportunity to reflect on the outcome
@@ -226,8 +222,6 @@ static void cpuidle_idle_call(void)
*/
if (WARN_ON_ONCE(irqs_disabled()))
local_irq_enable();
-
- rcu_idle_exit();
}
/*
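
For context on why the RCU_NONIDLE() wrappers above can simply be dropped: the
macro is roughly the following (a simplified sketch of the definition in
include/linux/rcupdate.h; details vary between kernel versions):

	#define RCU_NONIDLE(a) \
		do { \
			rcu_irq_enter_irqson();	/* temporarily make RCU watch */ \
			do { a; } while (0);	/* run the wrapped statement */ \
			rcu_irq_exit_irqson();	/* and stop watching again */ \
		} while (0)

With rcu_idle_enter() now called after tick_freeze() and rcu_idle_exit() called
before tick_unfreeze(), RCU is still watching at those call sites, so there is
nothing left for the wrapper to paper over.
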
On Fri, Aug 21, 2020 at 10:47:41AM +0200, Peter Zijlstra wrote:
> Lots of things take locks; due to a wee bug, rcu_lockdep didn't notice
> that the locking tracepoints were using RCU.
>
> Push rcu_idle_{enter,exit}() as deep as possible into the idle paths;
> this also resolves a lot of _rcuidle()/RCU_NONIDLE() usage.
>
> Specifically, sched_clock_idle_wakeup_event() will use ktime, which
> will use seqlocks, which will tickle lockdep, and
> stop_critical_timings() uses locks.

I was wondering if those tracepoints should just use _rcuidle variant of the
trace call. But that's a terrible idea considering that would add unwanted
overhead I think.
Reviewed-by: Joel Fernandes (Google) <[email protected]>
thanks,
- Joel
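
For readers wondering where that overhead would come from: a trace_foo_rcuidle()
call ends up doing roughly the following around the probe invocation (a heavily
simplified sketch of __DO_TRACE() in include/linux/tracepoint.h; exact details
depend on the kernel version):

	if (rcuidle) {
		/* sched-RCU is not watching from idle, so use SRCU instead... */
		idx = srcu_read_lock_notrace(&tracepoint_srcu);
		/* ...and temporarily make RCU watch while the probes run */
		rcu_irq_enter_irqson();
	}

	/* invoke the attached probe functions */

	if (rcuidle) {
		rcu_irq_exit_irqson();
		srcu_read_unlock_notrace(&tracepoint_srcu, idx);
	}

Paying for an rcu_irq_enter_irqson()/rcu_irq_exit_irqson() pair on every
tracepoint hit is the overhead in question; moving rcu_idle_enter()/
rcu_idle_exit() instead lets the plain tracepoints be used.
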
On Wed, Aug 26, 2020 at 09:18:26PM -0400, Joel Fernandes wrote:
> I was wondering if those tracepoints should just use _rcuidle variant of the
> trace call. But that's a terrible idea considering that would add unwanted
> overhead I think.
>
> Reviewed-by: Joel Fernandes (Google) <[email protected]>
BTW, if tracepoint is converted to use RCU-trace flavor, then these kinds of
issues go away, no? That RCU flavor is always watching.
thanks,
- Joel
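
The "RCU-trace flavor" here presumably refers to RCU Tasks Trace, whose readers
are legal even where ordinary RCU is not watching. A reader section, sketched
from include/linux/rcupdate_trace.h (names as in that header; simplified):

	rcu_read_lock_trace();
	/* ... dereference data protected by RCU Tasks Trace,
	 *     even from the idle path ... */
	rcu_read_unlock_trace();

Converting the tracepoint machinery to that flavor is the alternative being
weighed against pushing rcu_idle_enter()/rcu_idle_exit() deeper.
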
>
> thanks,
>
> - Joel
>
> >
> > Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> > Reviewed-by: Steven Rostedt (VMware) <[email protected]>
> > Tested-by: Marco Elver <[email protected]>
> > ---
> > drivers/cpuidle/cpuidle.c | 12 ++++++++----
> > kernel/sched/idle.c | 22 ++++++++--------------
> > 2 files changed, 16 insertions(+), 18 deletions(-)
> >
> > --- a/drivers/cpuidle/cpuidle.c
> > +++ b/drivers/cpuidle/cpuidle.c
> > @@ -145,13 +145,14 @@ static void enter_s2idle_proper(struct c
> > * executing it contains RCU usage regarded as invalid in the idle
> > * context, so tell RCU about that.
> > */
> > - RCU_NONIDLE(tick_freeze());
> > + tick_freeze();
> > /*
> > * The state used here cannot be a "coupled" one, because the "coupled"
> > * cpuidle mechanism enables interrupts and doing that with timekeeping
> > * suspended is generally unsafe.
> > */
> > stop_critical_timings();
> > + rcu_idle_enter();
> > drv->states[index].enter_s2idle(dev, drv, index);
> > if (WARN_ON_ONCE(!irqs_disabled()))
> > local_irq_disable();
> > @@ -160,7 +161,8 @@ static void enter_s2idle_proper(struct c
> > * first CPU executing it calls functions containing RCU read-side
> > * critical sections, so tell RCU about that.
> > */
> > - RCU_NONIDLE(tick_unfreeze());
> > + rcu_idle_exit();
> > + tick_unfreeze();
> > start_critical_timings();
> >
> > time_end = ns_to_ktime(local_clock());
> > @@ -229,16 +231,18 @@ int cpuidle_enter_state(struct cpuidle_d
> > /* Take note of the planned idle state. */
> > sched_idle_set_state(target_state);
> >
> > - trace_cpu_idle_rcuidle(index, dev->cpu);
> > + trace_cpu_idle(index, dev->cpu);
> > time_start = ns_to_ktime(local_clock());
> >
> > stop_critical_timings();
> > + rcu_idle_enter();
> > entered_state = target_state->enter(dev, drv, index);
> > + rcu_idle_exit();
> > start_critical_timings();
> >
> > sched_clock_idle_wakeup_event();
> > time_end = ns_to_ktime(local_clock());
> > - trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, dev->cpu);
> > + trace_cpu_idle(PWR_EVENT_EXIT, dev->cpu);
> >
> > /* The cpu is no longer idle or about to enter idle. */
> > sched_idle_set_state(NULL);
> > --- a/kernel/sched/idle.c
> > +++ b/kernel/sched/idle.c
> > @@ -54,17 +54,18 @@ __setup("hlt", cpu_idle_nopoll_setup);
> >
> > static noinline int __cpuidle cpu_idle_poll(void)
> > {
> > + trace_cpu_idle(0, smp_processor_id());
> > + stop_critical_timings();
> > rcu_idle_enter();
> > - trace_cpu_idle_rcuidle(0, smp_processor_id());
> > local_irq_enable();
> > - stop_critical_timings();
> >
> > while (!tif_need_resched() &&
> > - (cpu_idle_force_poll || tick_check_broadcast_expired()))
> > + (cpu_idle_force_poll || tick_check_broadcast_expired()))
> > cpu_relax();
> > - start_critical_timings();
> > - trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, smp_processor_id());
> > +
> > rcu_idle_exit();
> > + start_critical_timings();
> > + trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
> >
> > return 1;
> > }
> > @@ -91,7 +92,9 @@ void __cpuidle default_idle_call(void)
> > local_irq_enable();
> > } else {
> > stop_critical_timings();
> > + rcu_idle_enter();
> > arch_cpu_idle();
> > + rcu_idle_exit();
> > start_critical_timings();
> > }
> > }
> > @@ -158,7 +161,6 @@ static void cpuidle_idle_call(void)
> >
> > if (cpuidle_not_available(drv, dev)) {
> > tick_nohz_idle_stop_tick();
> > - rcu_idle_enter();
> >
> > default_idle_call();
> > goto exit_idle;
> > @@ -178,21 +180,17 @@ static void cpuidle_idle_call(void)
> > u64 max_latency_ns;
> >
> > if (idle_should_enter_s2idle()) {
> > - rcu_idle_enter();
> >
> > entered_state = call_cpuidle_s2idle(drv, dev);
> > if (entered_state > 0)
> > goto exit_idle;
> >
> > - rcu_idle_exit();
> > -
> > max_latency_ns = U64_MAX;
> > } else {
> > max_latency_ns = dev->forced_idle_latency_limit_ns;
> > }
> >
> > tick_nohz_idle_stop_tick();
> > - rcu_idle_enter();
> >
> > next_state = cpuidle_find_deepest_state(drv, dev, max_latency_ns);
> > call_cpuidle(drv, dev, next_state);
> > @@ -209,8 +207,6 @@ static void cpuidle_idle_call(void)
> > else
> > tick_nohz_idle_retain_tick();
> >
> > - rcu_idle_enter();
> > -
> > entered_state = call_cpuidle(drv, dev, next_state);
> > /*
> > * Give the governor an opportunity to reflect on the outcome
> > @@ -226,8 +222,6 @@ static void cpuidle_idle_call(void)
> > */
> > if (WARN_ON_ONCE(irqs_disabled()))
> > local_irq_enable();
> > -
> > - rcu_idle_exit();
> > }
> >
> > /*
> >
> >
On Wed, Aug 26, 2020 at 09:24:19PM -0400, Joel Fernandes wrote:
> On Wed, Aug 26, 2020 at 09:18:26PM -0400, Joel Fernandes wrote:
> > I was wondering if those tracepoints should just use _rcuidle variant of the
> > trace call. But that's a terrible idea considering that would add unwanted
> > overhead I think.
> >
> > Reviewed-by: Joel Fernandes (Google) <[email protected]>
>
> BTW, if tracepoint is converted to use RCU-trace flavor, then these kinds of
> issues go away, no? That RCU flavor is always watching.
All trace_*_rcuidle() and RCU_NONIDLE() usage is a bug IMO.
Ideally RCU-trace goes away too.
On Thu, Aug 27 2020 at 09:47, peterz wrote:
> On Wed, Aug 26, 2020 at 09:24:19PM -0400, Joel Fernandes wrote:
>>
>> BTW, if tracepoint is converted to use RCU-trace flavor, then these kinds of
>> issues go away, no? That RCU flavor is always watching.
>
> All trace_*_rcuidle() and RCU_NONIDLE() usage is a bug IMO.
It's the same problem as low-level entry/exit. And that stuff is a hack
which papers over the problem instead of fixing it from the ground up. But
we are talking about tracing, right?
Thanks,
tglx
The following commit has been merged into the locking/core branch of tip:
Commit-ID: 1098582a0f6c4e8fd28da0a6305f9233d02c9c1d
Gitweb: https://git.kernel.org/tip/1098582a0f6c4e8fd28da0a6305f9233d02c9c1d
Author: Peter Zijlstra <[email protected]>
AuthorDate: Fri, 07 Aug 2020 20:50:19 +02:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Wed, 26 Aug 2020 12:41:53 +02:00
sched,idle,rcu: Push rcu_idle deeper into the idle path
Lots of things take locks; due to a wee bug, rcu_lockdep didn't notice
that the locking tracepoints were using RCU.

Push rcu_idle_{enter,exit}() as deep as possible into the idle paths;
this also resolves a lot of _rcuidle()/RCU_NONIDLE() usage.

Specifically, sched_clock_idle_wakeup_event() will use ktime, which
will use seqlocks, which will tickle lockdep, and
stop_critical_timings() uses locks.

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Steven Rostedt (VMware) <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Acked-by: Rafael J. Wysocki <[email protected]>
Tested-by: Marco Elver <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
drivers/cpuidle/cpuidle.c | 12 ++++++++----
kernel/sched/idle.c | 22 ++++++++--------------
2 files changed, 16 insertions(+), 18 deletions(-)
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index 2fe4f3c..9bcda41 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -145,13 +145,14 @@ static void enter_s2idle_proper(struct cpuidle_driver *drv,
* executing it contains RCU usage regarded as invalid in the idle
* context, so tell RCU about that.
*/
- RCU_NONIDLE(tick_freeze());
+ tick_freeze();
/*
* The state used here cannot be a "coupled" one, because the "coupled"
* cpuidle mechanism enables interrupts and doing that with timekeeping
* suspended is generally unsafe.
*/
stop_critical_timings();
+ rcu_idle_enter();
drv->states[index].enter_s2idle(dev, drv, index);
if (WARN_ON_ONCE(!irqs_disabled()))
local_irq_disable();
@@ -160,7 +161,8 @@ static void enter_s2idle_proper(struct cpuidle_driver *drv,
* first CPU executing it calls functions containing RCU read-side
* critical sections, so tell RCU about that.
*/
- RCU_NONIDLE(tick_unfreeze());
+ rcu_idle_exit();
+ tick_unfreeze();
start_critical_timings();
time_end = ns_to_ktime(local_clock());
@@ -229,16 +231,18 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv,
/* Take note of the planned idle state. */
sched_idle_set_state(target_state);
- trace_cpu_idle_rcuidle(index, dev->cpu);
+ trace_cpu_idle(index, dev->cpu);
time_start = ns_to_ktime(local_clock());
stop_critical_timings();
+ rcu_idle_enter();
entered_state = target_state->enter(dev, drv, index);
+ rcu_idle_exit();
start_critical_timings();
sched_clock_idle_wakeup_event();
time_end = ns_to_ktime(local_clock());
- trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, dev->cpu);
+ trace_cpu_idle(PWR_EVENT_EXIT, dev->cpu);
/* The cpu is no longer idle or about to enter idle. */
sched_idle_set_state(NULL);
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index 6bf3498..ea3a098 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -54,17 +54,18 @@ __setup("hlt", cpu_idle_nopoll_setup);
static noinline int __cpuidle cpu_idle_poll(void)
{
+ trace_cpu_idle(0, smp_processor_id());
+ stop_critical_timings();
rcu_idle_enter();
- trace_cpu_idle_rcuidle(0, smp_processor_id());
local_irq_enable();
- stop_critical_timings();
while (!tif_need_resched() &&
- (cpu_idle_force_poll || tick_check_broadcast_expired()))
+ (cpu_idle_force_poll || tick_check_broadcast_expired()))
cpu_relax();
- start_critical_timings();
- trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, smp_processor_id());
+
rcu_idle_exit();
+ start_critical_timings();
+ trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
return 1;
}
@@ -91,7 +92,9 @@ void __cpuidle default_idle_call(void)
local_irq_enable();
} else {
stop_critical_timings();
+ rcu_idle_enter();
arch_cpu_idle();
+ rcu_idle_exit();
start_critical_timings();
}
}
@@ -158,7 +161,6 @@ static void cpuidle_idle_call(void)
if (cpuidle_not_available(drv, dev)) {
tick_nohz_idle_stop_tick();
- rcu_idle_enter();
default_idle_call();
goto exit_idle;
@@ -178,21 +180,17 @@ static void cpuidle_idle_call(void)
u64 max_latency_ns;
if (idle_should_enter_s2idle()) {
- rcu_idle_enter();
entered_state = call_cpuidle_s2idle(drv, dev);
if (entered_state > 0)
goto exit_idle;
- rcu_idle_exit();
-
max_latency_ns = U64_MAX;
} else {
max_latency_ns = dev->forced_idle_latency_limit_ns;
}
tick_nohz_idle_stop_tick();
- rcu_idle_enter();
next_state = cpuidle_find_deepest_state(drv, dev, max_latency_ns);
call_cpuidle(drv, dev, next_state);
@@ -209,8 +207,6 @@ static void cpuidle_idle_call(void)
else
tick_nohz_idle_retain_tick();
- rcu_idle_enter();
-
entered_state = call_cpuidle(drv, dev, next_state);
/*
* Give the governor an opportunity to reflect on the outcome
@@ -226,8 +222,6 @@ exit_idle:
*/
if (WARN_ON_ONCE(irqs_disabled()))
local_irq_enable();
-
- rcu_idle_exit();
}
/*
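
Applying the kernel/sched/idle.c hunk above, the polling idle loop ends up
reading roughly as follows (reconstructed from the diff; whitespace
approximated):

	static noinline int __cpuidle cpu_idle_poll(void)
	{
		/* Tracepoint and critical-timings bookkeeping first, while
		 * RCU is still watching... */
		trace_cpu_idle(0, smp_processor_id());
		stop_critical_timings();
		rcu_idle_enter();
		local_irq_enable();

		while (!tif_need_resched() &&
		       (cpu_idle_force_poll || tick_check_broadcast_expired()))
			cpu_relax();

		rcu_idle_exit();
		/* ...and the exit-side counterparts only after RCU watches again. */
		start_critical_timings();
		trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());

		return 1;
	}
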
On Thu, Aug 27, 2020 at 09:47:48AM +0200, [email protected] wrote:
> On Wed, Aug 26, 2020 at 09:24:19PM -0400, Joel Fernandes wrote:
> >
> > BTW, if tracepoint is converted to use RCU-trace flavor, then these kinds of
> > issues go away, no? That RCU flavor is always watching.
>
> All trace_*_rcuidle() and RCU_NONIDLE() usage is a bug IMO.
>
> Ideally RCU-trace goes away too.
I was thinking that unless the rcu_idle_enter/exit calls coincide with points
in the kernel where instrumentation is not allowed, there is always a chance
somebody wants to use tracepoints after rcu_idle_enter or before exit. In
this case, trace_*_rcuidle() is unavoidable unless you can move the
rcu_idle_enter/exit calls deeper as you are doing.
Maybe objtool can help with that?
The other solution is RCU-trace, if you can't push the rcu_idle_enter/exit
calls any deeper and still want to get rid of the trace_*_rcuidle thingies.
Mathieu and I were talking at LPC about converting tracepoints to RCU-trace,
and we can work on it if that's the right direction.
thanks,
- Joel
On Thu, Aug 27, 2020 at 06:30:01PM -0400, Joel Fernandes wrote:
> On Thu, Aug 27, 2020 at 09:47:48AM +0200, [email protected] wrote:
> > All trace_*_rcuidle() and RCU_NONIDLE() usage is a bug IMO.
> >
> > Ideally RCU-trace goes away too.
>
> I was thinking that unless the rcu_idle_enter/exit calls coincide with points
> in the kernel where instrumentation is not allowed, there is always a chance
> somebody wants to use tracepoints after rcu_idle_enter or before exit. In
> this case, trace_*_rcuidle() is unavoidable unless you can move the
> rcu_idle_enter/exit calls deeper as you are doing.
>
> Maybe objtool can help with that?
The eventual goal is to mark idle code noinstr too. There's more work
before we can do that.
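
For reference, noinstr at the time was roughly the following annotation
(sketched from include/linux/compiler_types.h; the exact attribute list has
grown since):

	/* Section for code which can't be instrumented at all */
	#define noinstr							\
		noinline notrace __attribute((__section__(".noinstr.text")))	\
		__no_kcsan __no_sanitize_address

i.e. the function is placed in .noinstr.text, never traced, and excluded from
KCSAN/KASAN instrumentation, which in turn helps objtool check that nothing
instrumentable is called from it.
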