2022-06-08 15:19:33

by Peter Zijlstra

[permalink] [raw]
Subject: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
Xeons") wrecked intel_idle in two ways:

- must not have tracing in idle functions
- must return with IRQs disabled

Additionally, it added a branch for no good reason.

Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
---
drivers/idle/intel_idle.c | 48 +++++++++++++++++++++++++++++++++++-----------
1 file changed, 37 insertions(+), 11 deletions(-)

--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -129,21 +137,37 @@ static unsigned int mwait_substates __in
*
* Must be called under local_irq_disable().
*/
+
-static __cpuidle int intel_idle(struct cpuidle_device *dev,
- struct cpuidle_driver *drv, int index)
+static __always_inline int __intel_idle(struct cpuidle_device *dev,
+ struct cpuidle_driver *drv, int index)
{
struct cpuidle_state *state = &drv->states[index];
unsigned long eax = flg2MWAIT(state->flags);
unsigned long ecx = 1; /* break on interrupt flag */

- if (state->flags & CPUIDLE_FLAG_IRQ_ENABLE)
- local_irq_enable();
-
mwait_idle_with_hints(eax, ecx);

return index;
}

+static __cpuidle int intel_idle(struct cpuidle_device *dev,
+ struct cpuidle_driver *drv, int index)
+{
+ return __intel_idle(dev, drv, index);
+}
+
+static __cpuidle int intel_idle_irq(struct cpuidle_device *dev,
+ struct cpuidle_driver *drv, int index)
+{
+ int ret;
+
+ raw_local_irq_enable();
+ ret = __intel_idle(dev, drv, index);
+ raw_local_irq_disable();
+
+ return ret;
+}
+
/**
* intel_idle_s2idle - Ask the processor to enter the given idle state.
* @dev: cpuidle device of the target CPU.
@@ -1801,6 +1824,9 @@ static void __init intel_idle_init_cstat
/* Structure copy. */
drv->states[drv->state_count] = cpuidle_state_table[cstate];

+ if (cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_IRQ_ENABLE)
+ drv->states[drv->state_count].enter = intel_idle_irq;
+
if ((disabled_states_mask & BIT(drv->state_count)) ||
((icpu->use_acpi || force_use_acpi) &&
intel_idle_off_by_default(mwait_hint) &&



2022-06-08 16:17:26

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

On Wed, Jun 08, 2022 at 05:01:05PM +0200, Rafael J. Wysocki wrote:
> On Wed, Jun 8, 2022 at 4:47 PM Peter Zijlstra <[email protected]> wrote:
> >
> > Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> > Xeons") wrecked intel_idle in two ways:
> >
> > - must not have tracing in idle functions
> > - must return with IRQs disabled
> >
> > Additionally, it added a branch for no good reason.
> >
> > Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> > Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
>
> Acked-by: Rafael J. Wysocki <[email protected]>
>
> And do I think correctly that this can be applied without the rest of
> the series?

Yeah, I don't think this relies on any of the preceding patches. If you
want to route this through the pm/fixes tree that's fine.

Thanks!

2022-06-08 16:36:17

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

On Wed, Jun 8, 2022 at 5:48 PM Peter Zijlstra <[email protected]> wrote:
>
> On Wed, Jun 08, 2022 at 05:01:05PM +0200, Rafael J. Wysocki wrote:
> > On Wed, Jun 8, 2022 at 4:47 PM Peter Zijlstra <[email protected]> wrote:
> > >
> > > Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> > > Xeons") wrecked intel_idle in two ways:
> > >
> > > - must not have tracing in idle functions
> > > - must return with IRQs disabled
> > >
> > > Additionally, it added a branch for no good reason.
> > >
> > > Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> > > Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> >
> > Acked-by: Rafael J. Wysocki <[email protected]>
> >
> > And do I think correctly that this can be applied without the rest of
> > the series?
>
> Yeah, I don't think this relies on any of the preceding patches. If you
> want to route this through the pm/fixes tree that's fine.

OK, thanks, applied (and I moved the intel_idle() kerneldoc so it is
next to the function to avoid the docs build warning).

2022-06-10 00:07:13

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

Hi Peter,

On Wed, 08 Jun 2022 16:27:27 +0200, Peter Zijlstra <[email protected]>
wrote:

> Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> Xeons") wrecked intel_idle in two ways:
>
> - must not have tracing in idle functions
> - must return with IRQs disabled
>
> Additionally, it added a branch for no good reason.
>
> Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> ---
> drivers/idle/intel_idle.c | 48
> +++++++++++++++++++++++++++++++++++----------- 1 file changed, 37
> insertions(+), 11 deletions(-)
>
> --- a/drivers/idle/intel_idle.c
> +++ b/drivers/idle/intel_idle.c
> @@ -129,21 +137,37 @@ static unsigned int mwait_substates __in
> *
> * Must be called under local_irq_disable().
> */
nit: this comment is no long true, right?

> +
> -static __cpuidle int intel_idle(struct cpuidle_device *dev,
> - struct cpuidle_driver *drv, int index)
> +static __always_inline int __intel_idle(struct cpuidle_device *dev,
> + struct cpuidle_driver *drv, int
> index) {
> struct cpuidle_state *state = &drv->states[index];
> unsigned long eax = flg2MWAIT(state->flags);
> unsigned long ecx = 1; /* break on interrupt flag */
>
> - if (state->flags & CPUIDLE_FLAG_IRQ_ENABLE)
> - local_irq_enable();
> -
> mwait_idle_with_hints(eax, ecx);
>
> return index;
> }
>
> +static __cpuidle int intel_idle(struct cpuidle_device *dev,
> + struct cpuidle_driver *drv, int index)
> +{
> + return __intel_idle(dev, drv, index);
> +}
> +
> +static __cpuidle int intel_idle_irq(struct cpuidle_device *dev,
> + struct cpuidle_driver *drv, int
> index) +{
> + int ret;
> +
> + raw_local_irq_enable();
> + ret = __intel_idle(dev, drv, index);
> + raw_local_irq_disable();
> +
> + return ret;
> +}
> +
> /**
> * intel_idle_s2idle - Ask the processor to enter the given idle state.
> * @dev: cpuidle device of the target CPU.
> @@ -1801,6 +1824,9 @@ static void __init intel_idle_init_cstat
> /* Structure copy. */
> drv->states[drv->state_count] =
> cpuidle_state_table[cstate];
> + if (cpuidle_state_table[cstate].flags &
> CPUIDLE_FLAG_IRQ_ENABLE)
> + drv->states[drv->state_count].enter =
> intel_idle_irq; +
> if ((disabled_states_mask & BIT(drv->state_count)) ||
> ((icpu->use_acpi || force_use_acpi) &&
> intel_idle_off_by_default(mwait_hint) &&
>
>


Thanks,

Jacob

2022-06-13 09:22:45

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

On Thu, Jun 09, 2022 at 04:49:21PM -0700, Jacob Pan wrote:
> Hi Peter,
>
> On Wed, 08 Jun 2022 16:27:27 +0200, Peter Zijlstra <[email protected]>
> wrote:
>
> > Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> > Xeons") wrecked intel_idle in two ways:
> >
> > - must not have tracing in idle functions
> > - must return with IRQs disabled
> >
> > Additionally, it added a branch for no good reason.
> >
> > Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> > Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> > ---
> > drivers/idle/intel_idle.c | 48
> > +++++++++++++++++++++++++++++++++++----------- 1 file changed, 37
> > insertions(+), 11 deletions(-)
> >
> > --- a/drivers/idle/intel_idle.c
> > +++ b/drivers/idle/intel_idle.c
> > @@ -129,21 +137,37 @@ static unsigned int mwait_substates __in
> > *
> > * Must be called under local_irq_disable().
> > */
> nit: this comment is no long true, right?

It still is, all the idle routines are called with interrupts disabled,
but must also exit with interrupts disabled.

If the idle method requires interrupts to be enabled, it must be sure to
disable them again before returning. Given all the RCU/tracing concerns
it must use raw_local_irq_*() for this though.

2022-06-16 21:33:41

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

Hi Peter,

On Mon, 13 Jun 2022 10:44:22 +0200, Peter Zijlstra <[email protected]>
wrote:

> On Thu, Jun 09, 2022 at 04:49:21PM -0700, Jacob Pan wrote:
> > Hi Peter,
> >
> > On Wed, 08 Jun 2022 16:27:27 +0200, Peter Zijlstra
> > <[email protected]> wrote:
> >
> > > Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> > > Xeons") wrecked intel_idle in two ways:
> > >
> > > - must not have tracing in idle functions
> > > - must return with IRQs disabled
> > >
> > > Additionally, it added a branch for no good reason.
> > >
> > > Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on
> > > Xeons") Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> > > ---
> > > drivers/idle/intel_idle.c | 48
> > > +++++++++++++++++++++++++++++++++++----------- 1 file changed, 37
> > > insertions(+), 11 deletions(-)
> > >
> > > --- a/drivers/idle/intel_idle.c
> > > +++ b/drivers/idle/intel_idle.c
> > > @@ -129,21 +137,37 @@ static unsigned int mwait_substates __in
> > > *
> > > * Must be called under local_irq_disable().
> > > */
> > nit: this comment is no long true, right?
>
> It still is, all the idle routines are called with interrupts disabled,
> but must also exit with interrupts disabled.
>
> If the idle method requires interrupts to be enabled, it must be sure to
> disable them again before returning. Given all the RCU/tracing concerns
> it must use raw_local_irq_*() for this though.
Makes sense, it is just little confusing when the immediate caller does
raw_local_irq_enable() which does not cancel out local_irq_disable().

Thanks,

Jacob

2022-07-25 19:51:34

by Michel Lespinasse

[permalink] [raw]
Subject: Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

On Wed, Jun 08, 2022 at 04:27:27PM +0200, Peter Zijlstra wrote:
> Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> Xeons") wrecked intel_idle in two ways:
>
> - must not have tracing in idle functions
> - must return with IRQs disabled
>
> Additionally, it added a branch for no good reason.
>
> Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>

After this change was introduced, I am seeing "WARNING: suspicious RCU
usage" when booting a kernel with debug options compiled in. Please
see the attached dmesg output. The issue starts with commit 32d4fd5751ea
and is still present in v5.19-rc8.

I'm not sure, is this too late to fix or revert in v5.19 final ?

Thanks,

--
Michel "walken" Lespinasse


Attachments:
(No filename) (832.00 B)
dmesg (75.01 kB)
Download all attachments

2022-07-28 17:22:31

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

On Mon, Jul 25, 2022 at 12:43:06PM -0700, Michel Lespinasse wrote:
> On Wed, Jun 08, 2022 at 04:27:27PM +0200, Peter Zijlstra wrote:
> > Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> > Xeons") wrecked intel_idle in two ways:
> >
> > - must not have tracing in idle functions
> > - must return with IRQs disabled
> >
> > Additionally, it added a branch for no good reason.
> >
> > Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> > Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
>
> After this change was introduced, I am seeing "WARNING: suspicious RCU
> usage" when booting a kernel with debug options compiled in. Please
> see the attached dmesg output. The issue starts with commit 32d4fd5751ea
> and is still present in v5.19-rc8.
>
> I'm not sure, is this too late to fix or revert in v5.19 final ?

I finally got a chance to take a quick look at this.

The rcu_eqs_exit() function is making a lockdep complaint about
being invoked with interrupts enabled. This function is called from
rcu_idle_exit(), which is an expected code path from cpuidle_enter_state()
via its call to rcu_idle_exit(). Except that rcu_idle_exit() disables
interrupts before invoking rcu_eqs_exit().

The only other call to rcu_idle_exit() does not disable interrupts,
but it is via rcu_user_exit(), which would be a very odd choice for
cpuidle_enter_state().

It seems unlikely, but it might be that it is the use of local_irq_save()
instead of raw_local_irq_save() within rcu_idle_exit() that is causing
the trouble. If this is the case, then the commit shown below would
help. Note that this commit removes the warning from lockdep, so it
is necessary to build the kernel with CONFIG_RCU_EQS_DEBUG=y to enable
equivalent debugging.

Could you please try your test with the -rce commit shown below applied?

Thanx, Paul

------------------------------------------------------------------------

commit ed4ae5eff4b38797607cbdd80da394149110fb37
Author: Paul E. McKenney <[email protected]>
Date: Tue May 17 21:00:04 2022 -0700

rcu: Apply noinstr to rcu_idle_enter() and rcu_idle_exit()

This commit applies the "noinstr" tag to the rcu_idle_enter() and
rcu_idle_exit() functions, which are invoked from portions of the idle
loop that cannot be instrumented. These tags require reworking the
rcu_eqs_enter() and rcu_eqs_exit() functions that these two functions
invoke in order to cause them to use normal assertions rather than
lockdep. In addition, within rcu_idle_exit(), the raw versions of
local_irq_save() and local_irq_restore() are used, again to avoid issues
with lockdep in uninstrumented code.

This patch is based in part on an earlier patch by Jiri Olsa, discussions
with Peter Zijlstra and Frederic Weisbecker, earlier changes by Thomas
Gleixner, and off-list discussions with Yonghong Song.

Link: https://lore.kernel.org/lkml/[email protected]/
Reported-by: Jiri Olsa <[email protected]>
Reported-by: Alexei Starovoitov <[email protected]>
Reported-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Reviewed-by: Yonghong Song <[email protected]>

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index c25ba442044a6..9a5edab5558c9 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -631,8 +631,8 @@ static noinstr void rcu_eqs_enter(bool user)
return;
}

- lockdep_assert_irqs_disabled();
instrumentation_begin();
+ lockdep_assert_irqs_disabled();
trace_rcu_dyntick(TPS("Start"), rdp->dynticks_nesting, 0, atomic_read(&rdp->dynticks));
WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
rcu_preempt_deferred_qs(current);
@@ -659,9 +659,9 @@ static noinstr void rcu_eqs_enter(bool user)
* If you add or remove a call to rcu_idle_enter(), be sure to test with
* CONFIG_RCU_EQS_DEBUG=y.
*/
-void rcu_idle_enter(void)
+void noinstr rcu_idle_enter(void)
{
- lockdep_assert_irqs_disabled();
+ WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled());
rcu_eqs_enter(false);
}
EXPORT_SYMBOL_GPL(rcu_idle_enter);
@@ -861,7 +861,7 @@ static void noinstr rcu_eqs_exit(bool user)
struct rcu_data *rdp;
long oldval;

- lockdep_assert_irqs_disabled();
+ WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled());
rdp = this_cpu_ptr(&rcu_data);
oldval = rdp->dynticks_nesting;
WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && oldval < 0);
@@ -896,13 +896,13 @@ static void noinstr rcu_eqs_exit(bool user)
* If you add or remove a call to rcu_idle_exit(), be sure to test with
* CONFIG_RCU_EQS_DEBUG=y.
*/
-void rcu_idle_exit(void)
+void noinstr rcu_idle_exit(void)
{
unsigned long flags;

- local_irq_save(flags);
+ raw_local_irq_save(flags);
rcu_eqs_exit(false);
- local_irq_restore(flags);
+ raw_local_irq_restore(flags);
}
EXPORT_SYMBOL_GPL(rcu_idle_exit);

2022-07-29 10:28:33

by Michel Lespinasse

[permalink] [raw]
Subject: Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

On Thu, Jul 28, 2022 at 10:20:53AM -0700, Paul E. McKenney wrote:
> On Mon, Jul 25, 2022 at 12:43:06PM -0700, Michel Lespinasse wrote:
> > On Wed, Jun 08, 2022 at 04:27:27PM +0200, Peter Zijlstra wrote:
> > > Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> > > Xeons") wrecked intel_idle in two ways:
> > >
> > > - must not have tracing in idle functions
> > > - must return with IRQs disabled
> > >
> > > Additionally, it added a branch for no good reason.
> > >
> > > Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> > > Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> >
> > After this change was introduced, I am seeing "WARNING: suspicious RCU
> > usage" when booting a kernel with debug options compiled in. Please
> > see the attached dmesg output. The issue starts with commit 32d4fd5751ea
> > and is still present in v5.19-rc8.
> >
> > I'm not sure, is this too late to fix or revert in v5.19 final ?
>
> I finally got a chance to take a quick look at this.
>
> The rcu_eqs_exit() function is making a lockdep complaint about
> being invoked with interrupts enabled. This function is called from
> rcu_idle_exit(), which is an expected code path from cpuidle_enter_state()
> via its call to rcu_idle_exit(). Except that rcu_idle_exit() disables
> interrupts before invoking rcu_eqs_exit().
>
> The only other call to rcu_idle_exit() does not disable interrupts,
> but it is via rcu_user_exit(), which would be a very odd choice for
> cpuidle_enter_state().
>
> It seems unlikely, but it might be that it is the use of local_irq_save()
> instead of raw_local_irq_save() within rcu_idle_exit() that is causing
> the trouble. If this is the case, then the commit shown below would
> help. Note that this commit removes the warning from lockdep, so it
> is necessary to build the kernel with CONFIG_RCU_EQS_DEBUG=y to enable
> equivalent debugging.
>
> Could you please try your test with the -rce commit shown below applied?

Thanks for looking into it.

After checking out Peter's commit 32d4fd5751ea,
cherry picking your commit ed4ae5eff4b3,
and setting CONFIG_RCU_EQS_DEBUG=y in addition of my usual debug config,
I am now seeing this a few seconds into the boot:

[ 3.010650] ------------[ cut here ]------------
[ 3.010651] WARNING: CPU: 0 PID: 0 at kernel/sched/clock.c:397 sched_clock_tick+0x27/0x60
[ 3.010657] Modules linked in:
[ 3.010660] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0-rc1-test-00005-g1be22fea0611 #1
[ 3.010662] Hardware name: LENOVO 30BFS44D00/1036, BIOS S03KT51A 01/17/2022
[ 3.010663] RIP: 0010:sched_clock_tick+0x27/0x60
[ 3.010665] Code: 1f 40 00 53 eb 02 5b c3 66 90 8b 05 2f c3 40 01 85 c0 74 18 65 8b 05 60 88 8f 4e 85 c0 75 0d 65 8b 05 a9 85 8f 4e 85 c0 74 02 <0f> 0b e8 e2 6c 89 00 48 c7 c3 40 d5 02 00
89 c0 48 03 1c c5 c0 98
[ 3.010667] RSP: 0000:ffffffffb2803e28 EFLAGS: 00010002
[ 3.010670] RAX: 0000000000000001 RBX: ffffc8ce7fa07060 RCX: 0000000000000001
[ 3.010671] RDX: 0000000000000000 RSI: ffffffffb268dd21 RDI: ffffffffb269ab13
[ 3.010673] RBP: 0000000000000001 R08: ffffffffffc300d5 R09: 000000000002be80
[ 3.010674] R10: 000003625b53183a R11: ffffa012b802b7a4 R12: ffffffffb2aa9e80
[ 3.010675] R13: ffffffffb2aa9e00 R14: 0000000000000001 R15: 0000000000000000
[ 3.010677] FS: 0000000000000000(0000) GS:ffffa012b8000000(0000) knlGS:0000000000000000
[ 3.010678] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3.010680] CR2: ffffa012f81ff000 CR3: 0000000c99612001 CR4: 00000000003706f0
[ 3.010681] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3.010682] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 3.010683] Call Trace:
[ 3.010685] <TASK>
[ 3.010688] cpuidle_enter_state+0xb7/0x4b0
[ 3.010694] cpuidle_enter+0x29/0x40
[ 3.010697] do_idle+0x1d4/0x210
[ 3.010702] cpu_startup_entry+0x19/0x20
[ 3.010704] rest_init+0x117/0x1a0
[ 3.010708] arch_call_rest_init+0xa/0x10
[ 3.010711] start_kernel+0x6d8/0x6ff
[ 3.010716] secondary_startup_64_no_verify+0xce/0xdb
[ 3.010728] </TASK>
[ 3.010729] irq event stamp: 44179
[ 3.010730] hardirqs last enabled at (44179): [<ffffffffb2000ccb>] asm_sysvec_apic_timer_interrupt+0x1b/0x20
[ 3.010734] hardirqs last disabled at (44177): [<ffffffffb22003f0>] __do_softirq+0x3f0/0x498
[ 3.010736] softirqs last enabled at (44178): [<ffffffffb2200332>] __do_softirq+0x332/0x498
[ 3.010738] softirqs last disabled at (44171): [<ffffffffb16c760b>] irq_exit_rcu+0xab/0xf0
[ 3.010741] ---[ end trace 0000000000000000 ]---

2022-07-29 15:15:53

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

On Fri, Jul 29, 2022 at 12:25 PM Michel Lespinasse
<[email protected]> wrote:
>
> On Thu, Jul 28, 2022 at 10:20:53AM -0700, Paul E. McKenney wrote:
> > On Mon, Jul 25, 2022 at 12:43:06PM -0700, Michel Lespinasse wrote:
> > > On Wed, Jun 08, 2022 at 04:27:27PM +0200, Peter Zijlstra wrote:
> > > > Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> > > > Xeons") wrecked intel_idle in two ways:
> > > >
> > > > - must not have tracing in idle functions
> > > > - must return with IRQs disabled
> > > >
> > > > Additionally, it added a branch for no good reason.
> > > >
> > > > Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> > > > Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> > >
> > > After this change was introduced, I am seeing "WARNING: suspicious RCU
> > > usage" when booting a kernel with debug options compiled in. Please
> > > see the attached dmesg output. The issue starts with commit 32d4fd5751ea
> > > and is still present in v5.19-rc8.
> > >
> > > I'm not sure, is this too late to fix or revert in v5.19 final ?
> >
> > I finally got a chance to take a quick look at this.
> >
> > The rcu_eqs_exit() function is making a lockdep complaint about
> > being invoked with interrupts enabled. This function is called from
> > rcu_idle_exit(), which is an expected code path from cpuidle_enter_state()
> > via its call to rcu_idle_exit(). Except that rcu_idle_exit() disables
> > interrupts before invoking rcu_eqs_exit().
> >
> > The only other call to rcu_idle_exit() does not disable interrupts,
> > but it is via rcu_user_exit(), which would be a very odd choice for
> > cpuidle_enter_state().
> >
> > It seems unlikely, but it might be that it is the use of local_irq_save()
> > instead of raw_local_irq_save() within rcu_idle_exit() that is causing
> > the trouble. If this is the case, then the commit shown below would
> > help. Note that this commit removes the warning from lockdep, so it
> > is necessary to build the kernel with CONFIG_RCU_EQS_DEBUG=y to enable
> > equivalent debugging.
> >
> > Could you please try your test with the -rce commit shown below applied?
>
> Thanks for looking into it.
>
> After checking out Peter's commit 32d4fd5751ea,
> cherry picking your commit ed4ae5eff4b3,
> and setting CONFIG_RCU_EQS_DEBUG=y in addition of my usual debug config,
> I am now seeing this a few seconds into the boot:
>
> [ 3.010650] ------------[ cut here ]------------
> [ 3.010651] WARNING: CPU: 0 PID: 0 at kernel/sched/clock.c:397 sched_clock_tick+0x27/0x60
> [ 3.010657] Modules linked in:
> [ 3.010660] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0-rc1-test-00005-g1be22fea0611 #1
> [ 3.010662] Hardware name: LENOVO 30BFS44D00/1036, BIOS S03KT51A 01/17/2022
> [ 3.010663] RIP: 0010:sched_clock_tick+0x27/0x60
> [ 3.010665] Code: 1f 40 00 53 eb 02 5b c3 66 90 8b 05 2f c3 40 01 85 c0 74 18 65 8b 05 60 88 8f 4e 85 c0 75 0d 65 8b 05 a9 85 8f 4e 85 c0 74 02 <0f> 0b e8 e2 6c 89 00 48 c7 c3 40 d5 02 00
> 89 c0 48 03 1c c5 c0 98
> [ 3.010667] RSP: 0000:ffffffffb2803e28 EFLAGS: 00010002
> [ 3.010670] RAX: 0000000000000001 RBX: ffffc8ce7fa07060 RCX: 0000000000000001
> [ 3.010671] RDX: 0000000000000000 RSI: ffffffffb268dd21 RDI: ffffffffb269ab13
> [ 3.010673] RBP: 0000000000000001 R08: ffffffffffc300d5 R09: 000000000002be80
> [ 3.010674] R10: 000003625b53183a R11: ffffa012b802b7a4 R12: ffffffffb2aa9e80
> [ 3.010675] R13: ffffffffb2aa9e00 R14: 0000000000000001 R15: 0000000000000000
> [ 3.010677] FS: 0000000000000000(0000) GS:ffffa012b8000000(0000) knlGS:0000000000000000
> [ 3.010678] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 3.010680] CR2: ffffa012f81ff000 CR3: 0000000c99612001 CR4: 00000000003706f0
> [ 3.010681] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 3.010682] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 3.010683] Call Trace:
> [ 3.010685] <TASK>
> [ 3.010688] cpuidle_enter_state+0xb7/0x4b0
> [ 3.010694] cpuidle_enter+0x29/0x40
> [ 3.010697] do_idle+0x1d4/0x210
> [ 3.010702] cpu_startup_entry+0x19/0x20
> [ 3.010704] rest_init+0x117/0x1a0
> [ 3.010708] arch_call_rest_init+0xa/0x10
> [ 3.010711] start_kernel+0x6d8/0x6ff
> [ 3.010716] secondary_startup_64_no_verify+0xce/0xdb
> [ 3.010728] </TASK>
> [ 3.010729] irq event stamp: 44179
> [ 3.010730] hardirqs last enabled at (44179): [<ffffffffb2000ccb>] asm_sysvec_apic_timer_interrupt+0x1b/0x20
> [ 3.010734] hardirqs last disabled at (44177): [<ffffffffb22003f0>] __do_softirq+0x3f0/0x498
> [ 3.010736] softirqs last enabled at (44178): [<ffffffffb2200332>] __do_softirq+0x332/0x498
> [ 3.010738] softirqs last disabled at (44171): [<ffffffffb16c760b>] irq_exit_rcu+0xab/0xf0
> [ 3.010741] ---[ end trace 0000000000000000 ]---

Can you please give this patch a go:
https://patchwork.kernel.org/project/linux-pm/patch/Yt/[email protected]/
?

2022-07-29 15:36:55

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

Or better yet, try the patch that Rafael proposed. ;-)

Thanx, Paul

On Fri, Jul 29, 2022 at 08:26:22AM -0700, Paul E. McKenney wrote:
> On Fri, Jul 29, 2022 at 03:24:58AM -0700, Michel Lespinasse wrote:
> > On Thu, Jul 28, 2022 at 10:20:53AM -0700, Paul E. McKenney wrote:
> > > On Mon, Jul 25, 2022 at 12:43:06PM -0700, Michel Lespinasse wrote:
> > > > On Wed, Jun 08, 2022 at 04:27:27PM +0200, Peter Zijlstra wrote:
> > > > > Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> > > > > Xeons") wrecked intel_idle in two ways:
> > > > >
> > > > > - must not have tracing in idle functions
> > > > > - must return with IRQs disabled
> > > > >
> > > > > Additionally, it added a branch for no good reason.
> > > > >
> > > > > Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> > > > > Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> > > >
> > > > After this change was introduced, I am seeing "WARNING: suspicious RCU
> > > > usage" when booting a kernel with debug options compiled in. Please
> > > > see the attached dmesg output. The issue starts with commit 32d4fd5751ea
> > > > and is still present in v5.19-rc8.
> > > >
> > > > I'm not sure, is this too late to fix or revert in v5.19 final ?
> > >
> > > I finally got a chance to take a quick look at this.
> > >
> > > The rcu_eqs_exit() function is making a lockdep complaint about
> > > being invoked with interrupts enabled. This function is called from
> > > rcu_idle_exit(), which is an expected code path from cpuidle_enter_state()
> > > via its call to rcu_idle_exit(). Except that rcu_idle_exit() disables
> > > interrupts before invoking rcu_eqs_exit().
> > >
> > > The only other call to rcu_idle_exit() does not disable interrupts,
> > > but it is via rcu_user_exit(), which would be a very odd choice for
> > > cpuidle_enter_state().
> > >
> > > It seems unlikely, but it might be that it is the use of local_irq_save()
> > > instead of raw_local_irq_save() within rcu_idle_exit() that is causing
> > > the trouble. If this is the case, then the commit shown below would
> > > help. Note that this commit removes the warning from lockdep, so it
> > > is necessary to build the kernel with CONFIG_RCU_EQS_DEBUG=y to enable
> > > equivalent debugging.
> > >
> > > Could you please try your test with the -rce commit shown below applied?
> >
> > Thanks for looking into it.
>
> And thank you for trying this shot in the dark!
>
> > After checking out Peter's commit 32d4fd5751ea,
> > cherry picking your commit ed4ae5eff4b3,
> > and setting CONFIG_RCU_EQS_DEBUG=y in addition of my usual debug config,
> > I am now seeing this a few seconds into the boot:
> >
> > [ 3.010650] ------------[ cut here ]------------
> > [ 3.010651] WARNING: CPU: 0 PID: 0 at kernel/sched/clock.c:397 sched_clock_tick+0x27/0x60
>
> And this is again a complaint about interrupts not being disabled.
>
> But it does appear that the problem was the lockdep complaint, and
> eliminating that did take care of part of the problem. But lockdep
> remained enabled, and you therefore hit the next complaint.
>
> > [ 3.010657] Modules linked in:
> > [ 3.010660] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0-rc1-test-00005-g1be22fea0611 #1
> > [ 3.010662] Hardware name: LENOVO 30BFS44D00/1036, BIOS S03KT51A 01/17/2022
> > [ 3.010663] RIP: 0010:sched_clock_tick+0x27/0x60
>
> The most straightforward way to get to sched_clock_tick() from
> cpuidle_enter_state() is via the call to sched_clock_idle_wakeup_event().
>
> Except that it disables interrupts before invoking sched_clock_tick().
>
> > [ 3.010665] Code: 1f 40 00 53 eb 02 5b c3 66 90 8b 05 2f c3 40 01 85 c0 74 18 65 8b 05 60 88 8f 4e 85 c0 75 0d 65 8b 05 a9 85 8f 4e 85 c0 74 02 <0f> 0b e8 e2 6c 89 00 48 c7 c3 40 d5 02 00
> > 89 c0 48 03 1c c5 c0 98
> > [ 3.010667] RSP: 0000:ffffffffb2803e28 EFLAGS: 00010002
> > [ 3.010670] RAX: 0000000000000001 RBX: ffffc8ce7fa07060 RCX: 0000000000000001
> > [ 3.010671] RDX: 0000000000000000 RSI: ffffffffb268dd21 RDI: ffffffffb269ab13
> > [ 3.010673] RBP: 0000000000000001 R08: ffffffffffc300d5 R09: 000000000002be80
> > [ 3.010674] R10: 000003625b53183a R11: ffffa012b802b7a4 R12: ffffffffb2aa9e80
> > [ 3.010675] R13: ffffffffb2aa9e00 R14: 0000000000000001 R15: 0000000000000000
> > [ 3.010677] FS: 0000000000000000(0000) GS:ffffa012b8000000(0000) knlGS:0000000000000000
> > [ 3.010678] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 3.010680] CR2: ffffa012f81ff000 CR3: 0000000c99612001 CR4: 00000000003706f0
> > [ 3.010681] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [ 3.010682] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [ 3.010683] Call Trace:
> > [ 3.010685] <TASK>
> > [ 3.010688] cpuidle_enter_state+0xb7/0x4b0
> > [ 3.010694] cpuidle_enter+0x29/0x40
> > [ 3.010697] do_idle+0x1d4/0x210
> > [ 3.010702] cpu_startup_entry+0x19/0x20
> > [ 3.010704] rest_init+0x117/0x1a0
> > [ 3.010708] arch_call_rest_init+0xa/0x10
> > [ 3.010711] start_kernel+0x6d8/0x6ff
> > [ 3.010716] secondary_startup_64_no_verify+0xce/0xdb
> > [ 3.010728] </TASK>
> > [ 3.010729] irq event stamp: 44179
> > [ 3.010730] hardirqs last enabled at (44179): [<ffffffffb2000ccb>] asm_sysvec_apic_timer_interrupt+0x1b/0x20
> > [ 3.010734] hardirqs last disabled at (44177): [<ffffffffb22003f0>] __do_softirq+0x3f0/0x498
> > [ 3.010736] softirqs last enabled at (44178): [<ffffffffb2200332>] __do_softirq+0x332/0x498
> > [ 3.010738] softirqs last disabled at (44171): [<ffffffffb16c760b>] irq_exit_rcu+0xab/0xf0
> > [ 3.010741] ---[ end trace 0000000000000000 ]---
>
> Would you be willing to try another shot in the dark, but untested
> this time? I freely admit that this is getting strange.
>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
> index e374c0c923dae..279f557bf60bb 100644
> --- a/kernel/sched/clock.c
> +++ b/kernel/sched/clock.c
> @@ -394,7 +394,7 @@ notrace void sched_clock_tick(void)
> if (!static_branch_likely(&sched_clock_running))
> return;
>
> - lockdep_assert_irqs_disabled();
> + WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled());
>
> scd = this_scd();
> __scd_stamp(scd);

2022-07-29 15:38:22

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

On Fri, Jul 29, 2022 at 03:24:58AM -0700, Michel Lespinasse wrote:
> On Thu, Jul 28, 2022 at 10:20:53AM -0700, Paul E. McKenney wrote:
> > On Mon, Jul 25, 2022 at 12:43:06PM -0700, Michel Lespinasse wrote:
> > > On Wed, Jun 08, 2022 at 04:27:27PM +0200, Peter Zijlstra wrote:
> > > > Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> > > > Xeons") wrecked intel_idle in two ways:
> > > >
> > > > - must not have tracing in idle functions
> > > > - must return with IRQs disabled
> > > >
> > > > Additionally, it added a branch for no good reason.
> > > >
> > > > Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> > > > Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> > >
> > > After this change was introduced, I am seeing "WARNING: suspicious RCU
> > > usage" when booting a kernel with debug options compiled in. Please
> > > see the attached dmesg output. The issue starts with commit 32d4fd5751ea
> > > and is still present in v5.19-rc8.
> > >
> > > I'm not sure, is this too late to fix or revert in v5.19 final ?
> >
> > I finally got a chance to take a quick look at this.
> >
> > The rcu_eqs_exit() function is making a lockdep complaint about
> > being invoked with interrupts enabled. This function is called from
> > rcu_idle_exit(), which is an expected code path from cpuidle_enter_state()
> > via its call to rcu_idle_exit(). Except that rcu_idle_exit() disables
> > interrupts before invoking rcu_eqs_exit().
> >
> > The only other call to rcu_idle_exit() does not disable interrupts,
> > but it is via rcu_user_exit(), which would be a very odd choice for
> > cpuidle_enter_state().
> >
> > It seems unlikely, but it might be that it is the use of local_irq_save()
> > instead of raw_local_irq_save() within rcu_idle_exit() that is causing
> > the trouble. If this is the case, then the commit shown below would
> > help. Note that this commit removes the warning from lockdep, so it
> > is necessary to build the kernel with CONFIG_RCU_EQS_DEBUG=y to enable
> > equivalent debugging.
> >
> > Could you please try your test with the -rce commit shown below applied?
>
> Thanks for looking into it.

And thank you for trying this shot in the dark!

> After checking out Peter's commit 32d4fd5751ea,
> cherry picking your commit ed4ae5eff4b3,
> and setting CONFIG_RCU_EQS_DEBUG=y in addition of my usual debug config,
> I am now seeing this a few seconds into the boot:
>
> [ 3.010650] ------------[ cut here ]------------
> [ 3.010651] WARNING: CPU: 0 PID: 0 at kernel/sched/clock.c:397 sched_clock_tick+0x27/0x60

And this is again a complaint about interrupts not being disabled.

But it does appear that the problem was the lockdep complaint, and
eliminating that did take care of part of the problem. But lockdep
remained enabled, and you therefore hit the next complaint.

> [ 3.010657] Modules linked in:
> [ 3.010660] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0-rc1-test-00005-g1be22fea0611 #1
> [ 3.010662] Hardware name: LENOVO 30BFS44D00/1036, BIOS S03KT51A 01/17/2022
> [ 3.010663] RIP: 0010:sched_clock_tick+0x27/0x60

The most straightforward way to get to sched_clock_tick() from
cpuidle_enter_state() is via the call to sched_clock_idle_wakeup_event().

Except that it disables interrupts before invoking sched_clock_tick().

> [ 3.010665] Code: 1f 40 00 53 eb 02 5b c3 66 90 8b 05 2f c3 40 01 85 c0 74 18 65 8b 05 60 88 8f 4e 85 c0 75 0d 65 8b 05 a9 85 8f 4e 85 c0 74 02 <0f> 0b e8 e2 6c 89 00 48 c7 c3 40 d5 02 00
> 89 c0 48 03 1c c5 c0 98
> [ 3.010667] RSP: 0000:ffffffffb2803e28 EFLAGS: 00010002
> [ 3.010670] RAX: 0000000000000001 RBX: ffffc8ce7fa07060 RCX: 0000000000000001
> [ 3.010671] RDX: 0000000000000000 RSI: ffffffffb268dd21 RDI: ffffffffb269ab13
> [ 3.010673] RBP: 0000000000000001 R08: ffffffffffc300d5 R09: 000000000002be80
> [ 3.010674] R10: 000003625b53183a R11: ffffa012b802b7a4 R12: ffffffffb2aa9e80
> [ 3.010675] R13: ffffffffb2aa9e00 R14: 0000000000000001 R15: 0000000000000000
> [ 3.010677] FS: 0000000000000000(0000) GS:ffffa012b8000000(0000) knlGS:0000000000000000
> [ 3.010678] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 3.010680] CR2: ffffa012f81ff000 CR3: 0000000c99612001 CR4: 00000000003706f0
> [ 3.010681] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 3.010682] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 3.010683] Call Trace:
> [ 3.010685] <TASK>
> [ 3.010688] cpuidle_enter_state+0xb7/0x4b0
> [ 3.010694] cpuidle_enter+0x29/0x40
> [ 3.010697] do_idle+0x1d4/0x210
> [ 3.010702] cpu_startup_entry+0x19/0x20
> [ 3.010704] rest_init+0x117/0x1a0
> [ 3.010708] arch_call_rest_init+0xa/0x10
> [ 3.010711] start_kernel+0x6d8/0x6ff
> [ 3.010716] secondary_startup_64_no_verify+0xce/0xdb
> [ 3.010728] </TASK>
> [ 3.010729] irq event stamp: 44179
> [ 3.010730] hardirqs last enabled at (44179): [<ffffffffb2000ccb>] asm_sysvec_apic_timer_interrupt+0x1b/0x20
> [ 3.010734] hardirqs last disabled at (44177): [<ffffffffb22003f0>] __do_softirq+0x3f0/0x498
> [ 3.010736] softirqs last enabled at (44178): [<ffffffffb2200332>] __do_softirq+0x332/0x498
> [ 3.010738] softirqs last disabled at (44171): [<ffffffffb16c760b>] irq_exit_rcu+0xab/0xf0
> [ 3.010741] ---[ end trace 0000000000000000 ]---

Would you be willing to try another shot in the dark, but untested
this time? I freely admit that this is getting strange.

Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
index e374c0c923dae..279f557bf60bb 100644
--- a/kernel/sched/clock.c
+++ b/kernel/sched/clock.c
@@ -394,7 +394,7 @@ notrace void sched_clock_tick(void)
if (!static_branch_likely(&sched_clock_running))
return;

- lockdep_assert_irqs_disabled();
+ WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled());

scd = this_scd();
__scd_stamp(scd);

2022-07-30 09:54:32

by Michel Lespinasse

[permalink] [raw]
Subject: Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

On Fri, Jul 29, 2022 at 04:59:50PM +0200, Rafael J. Wysocki wrote:
> On Fri, Jul 29, 2022 at 12:25 PM Michel Lespinasse
> <[email protected]> wrote:
> >
> > On Thu, Jul 28, 2022 at 10:20:53AM -0700, Paul E. McKenney wrote:
> > > On Mon, Jul 25, 2022 at 12:43:06PM -0700, Michel Lespinasse wrote:
> > > > On Wed, Jun 08, 2022 at 04:27:27PM +0200, Peter Zijlstra wrote:
> > > > > Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> > > > > Xeons") wrecked intel_idle in two ways:
> > > > >
> > > > > - must not have tracing in idle functions
> > > > > - must return with IRQs disabled
> > > > >
> > > > > Additionally, it added a branch for no good reason.
> > > > >
> > > > > Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> > > > > Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> > > >
> > > > After this change was introduced, I am seeing "WARNING: suspicious RCU
> > > > usage" when booting a kernel with debug options compiled in. Please
> > > > see the attached dmesg output. The issue starts with commit 32d4fd5751ea
> > > > and is still present in v5.19-rc8.
> > > >
> > > > I'm not sure, is this too late to fix or revert in v5.19 final ?
> > >
> > > I finally got a chance to take a quick look at this.
> > >
> > > The rcu_eqs_exit() function is making a lockdep complaint about
> > > being invoked with interrupts enabled. This function is called from
> > > rcu_idle_exit(), which is an expected code path from cpuidle_enter_state()
> > > via its call to rcu_idle_exit(). Except that rcu_idle_exit() disables
> > > interrupts before invoking rcu_eqs_exit().
> > >
> > > The only other call to rcu_idle_exit() does not disable interrupts,
> > > but it is via rcu_user_exit(), which would be a very odd choice for
> > > cpuidle_enter_state().
> > >
> > > It seems unlikely, but it might be that it is the use of local_irq_save()
> > > instead of raw_local_irq_save() within rcu_idle_exit() that is causing
> > > the trouble. If this is the case, then the commit shown below would
> > > help. Note that this commit removes the warning from lockdep, so it
> > > is necessary to build the kernel with CONFIG_RCU_EQS_DEBUG=y to enable
> > > equivalent debugging.
> > >
> > > Could you please try your test with the -rce commit shown below applied?
> >
> > Thanks for looking into it.
> >
> > After checking out Peter's commit 32d4fd5751ea,
> > cherry picking your commit ed4ae5eff4b3,
> > and setting CONFIG_RCU_EQS_DEBUG=y in addition of my usual debug config,
> > I am now seeing this a few seconds into the boot:
> >
> > [ 3.010650] ------------[ cut here ]------------
> > [ 3.010651] WARNING: CPU: 0 PID: 0 at kernel/sched/clock.c:397 sched_clock_tick+0x27/0x60
> > [ 3.010657] Modules linked in:
> > [ 3.010660] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0-rc1-test-00005-g1be22fea0611 #1
> > [ 3.010662] Hardware name: LENOVO 30BFS44D00/1036, BIOS S03KT51A 01/17/2022
> > [ 3.010663] RIP: 0010:sched_clock_tick+0x27/0x60
> > [ 3.010665] Code: 1f 40 00 53 eb 02 5b c3 66 90 8b 05 2f c3 40 01 85 c0 74 18 65 8b 05 60 88 8f 4e 85 c0 75 0d 65 8b 05 a9 85 8f 4e 85 c0 74 02 <0f> 0b e8 e2 6c 89 00 48 c7 c3 40 d5 02 00
> > 89 c0 48 03 1c c5 c0 98
> > [ 3.010667] RSP: 0000:ffffffffb2803e28 EFLAGS: 00010002
> > [ 3.010670] RAX: 0000000000000001 RBX: ffffc8ce7fa07060 RCX: 0000000000000001
> > [ 3.010671] RDX: 0000000000000000 RSI: ffffffffb268dd21 RDI: ffffffffb269ab13
> > [ 3.010673] RBP: 0000000000000001 R08: ffffffffffc300d5 R09: 000000000002be80
> > [ 3.010674] R10: 000003625b53183a R11: ffffa012b802b7a4 R12: ffffffffb2aa9e80
> > [ 3.010675] R13: ffffffffb2aa9e00 R14: 0000000000000001 R15: 0000000000000000
> > [ 3.010677] FS: 0000000000000000(0000) GS:ffffa012b8000000(0000) knlGS:0000000000000000
> > [ 3.010678] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 3.010680] CR2: ffffa012f81ff000 CR3: 0000000c99612001 CR4: 00000000003706f0
> > [ 3.010681] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [ 3.010682] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [ 3.010683] Call Trace:
> > [ 3.010685] <TASK>
> > [ 3.010688] cpuidle_enter_state+0xb7/0x4b0
> > [ 3.010694] cpuidle_enter+0x29/0x40
> > [ 3.010697] do_idle+0x1d4/0x210
> > [ 3.010702] cpu_startup_entry+0x19/0x20
> > [ 3.010704] rest_init+0x117/0x1a0
> > [ 3.010708] arch_call_rest_init+0xa/0x10
> > [ 3.010711] start_kernel+0x6d8/0x6ff
> > [ 3.010716] secondary_startup_64_no_verify+0xce/0xdb
> > [ 3.010728] </TASK>
> > [ 3.010729] irq event stamp: 44179
> > [ 3.010730] hardirqs last enabled at (44179): [<ffffffffb2000ccb>] asm_sysvec_apic_timer_interrupt+0x1b/0x20
> > [ 3.010734] hardirqs last disabled at (44177): [<ffffffffb22003f0>] __do_softirq+0x3f0/0x498
> > [ 3.010736] softirqs last enabled at (44178): [<ffffffffb2200332>] __do_softirq+0x332/0x498
> > [ 3.010738] softirqs last disabled at (44171): [<ffffffffb16c760b>] irq_exit_rcu+0xab/0xf0
> > [ 3.010741] ---[ end trace 0000000000000000 ]---
>
> Can you please give this patch a go:
> https://patchwork.kernel.org/project/linux-pm/patch/Yt/[email protected]/
> ?

I tried, but it didn't change the picture for me.

I'm not sure if that was the patch you meant to send though, as it
seems it's only adding a tracepoint so shouldn't make any difference
if I'm not actually using the tracepoint ?

2022-07-30 09:57:46

by Michel Lespinasse

[permalink] [raw]
Subject: Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

On Fri, Jul 29, 2022 at 08:26:22AM -0700, Paul E. McKenney wrote:> Would you be willing to try another shot in the dark, but untested
> this time? I freely admit that this is getting strange.
>
> Thanx, Paul

Yes, adding this second change got rid of the boot time warning for me.

> ------------------------------------------------------------------------
>
> diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
> index e374c0c923dae..279f557bf60bb 100644
> --- a/kernel/sched/clock.c
> +++ b/kernel/sched/clock.c
> @@ -394,7 +394,7 @@ notrace void sched_clock_tick(void)
> if (!static_branch_likely(&sched_clock_running))
> return;
>
> - lockdep_assert_irqs_disabled();
> + WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled());
>
> scd = this_scd();
> __scd_stamp(scd);

2022-07-30 18:17:51

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

On Sat, Jul 30, 2022 at 02:40:32AM -0700, Michel Lespinasse wrote:
> On Fri, Jul 29, 2022 at 08:26:22AM -0700, Paul E. McKenney wrote:> Would you be willing to try another shot in the dark, but untested
> > this time? I freely admit that this is getting strange.
> >
> > Thanx, Paul
>
> Yes, adding this second change got rid of the boot time warning for me.

OK, I will make a real patch. May I have your Tested-by?

Thanx, Paul

> > ------------------------------------------------------------------------
> >
> > diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
> > index e374c0c923dae..279f557bf60bb 100644
> > --- a/kernel/sched/clock.c
> > +++ b/kernel/sched/clock.c
> > @@ -394,7 +394,7 @@ notrace void sched_clock_tick(void)
> > if (!static_branch_likely(&sched_clock_running))
> > return;
> >
> > - lockdep_assert_irqs_disabled();
> > + WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled());
> >
> > scd = this_scd();
> > __scd_stamp(scd);

2022-07-30 19:55:28

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

On Sat, Jul 30, 2022 at 11:48 AM Michel Lespinasse
<[email protected]> wrote:
>
> On Fri, Jul 29, 2022 at 04:59:50PM +0200, Rafael J. Wysocki wrote:
> > On Fri, Jul 29, 2022 at 12:25 PM Michel Lespinasse
> > <[email protected]> wrote:
> > >
> > > On Thu, Jul 28, 2022 at 10:20:53AM -0700, Paul E. McKenney wrote:
> > > > On Mon, Jul 25, 2022 at 12:43:06PM -0700, Michel Lespinasse wrote:
> > > > > On Wed, Jun 08, 2022 at 04:27:27PM +0200, Peter Zijlstra wrote:
> > > > > > Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> > > > > > Xeons") wrecked intel_idle in two ways:
> > > > > >
> > > > > > - must not have tracing in idle functions
> > > > > > - must return with IRQs disabled
> > > > > >
> > > > > > Additionally, it added a branch for no good reason.
> > > > > >
> > > > > > Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> > > > > > Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> > > > >
> > > > > After this change was introduced, I am seeing "WARNING: suspicious RCU
> > > > > usage" when booting a kernel with debug options compiled in. Please
> > > > > see the attached dmesg output. The issue starts with commit 32d4fd5751ea
> > > > > and is still present in v5.19-rc8.
> > > > >
> > > > > I'm not sure, is this too late to fix or revert in v5.19 final ?
> > > >
> > > > I finally got a chance to take a quick look at this.
> > > >
> > > > The rcu_eqs_exit() function is making a lockdep complaint about
> > > > being invoked with interrupts enabled. This function is called from
> > > > rcu_idle_exit(), which is an expected code path from cpuidle_enter_state()
> > > > via its call to rcu_idle_exit(). Except that rcu_idle_exit() disables
> > > > interrupts before invoking rcu_eqs_exit().
> > > >
> > > > The only other call to rcu_idle_exit() does not disable interrupts,
> > > > but it is via rcu_user_exit(), which would be a very odd choice for
> > > > cpuidle_enter_state().
> > > >
> > > > It seems unlikely, but it might be that it is the use of local_irq_save()
> > > > instead of raw_local_irq_save() within rcu_idle_exit() that is causing
> > > > the trouble. If this is the case, then the commit shown below would
> > > > help. Note that this commit removes the warning from lockdep, so it
> > > > is necessary to build the kernel with CONFIG_RCU_EQS_DEBUG=y to enable
> > > > equivalent debugging.
> > > >
> > > > Could you please try your test with the -rce commit shown below applied?
> > >
> > > Thanks for looking into it.
> > >
> > > After checking out Peter's commit 32d4fd5751ea,
> > > cherry picking your commit ed4ae5eff4b3,
> > > and setting CONFIG_RCU_EQS_DEBUG=y in addition of my usual debug config,
> > > I am now seeing this a few seconds into the boot:
> > >
> > > [ 3.010650] ------------[ cut here ]------------
> > > [ 3.010651] WARNING: CPU: 0 PID: 0 at kernel/sched/clock.c:397 sched_clock_tick+0x27/0x60
> > > [ 3.010657] Modules linked in:
> > > [ 3.010660] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0-rc1-test-00005-g1be22fea0611 #1
> > > [ 3.010662] Hardware name: LENOVO 30BFS44D00/1036, BIOS S03KT51A 01/17/2022
> > > [ 3.010663] RIP: 0010:sched_clock_tick+0x27/0x60
> > > [ 3.010665] Code: 1f 40 00 53 eb 02 5b c3 66 90 8b 05 2f c3 40 01 85 c0 74 18 65 8b 05 60 88 8f 4e 85 c0 75 0d 65 8b 05 a9 85 8f 4e 85 c0 74 02 <0f> 0b e8 e2 6c 89 00 48 c7 c3 40 d5 02 00
> > > 89 c0 48 03 1c c5 c0 98
> > > [ 3.010667] RSP: 0000:ffffffffb2803e28 EFLAGS: 00010002
> > > [ 3.010670] RAX: 0000000000000001 RBX: ffffc8ce7fa07060 RCX: 0000000000000001
> > > [ 3.010671] RDX: 0000000000000000 RSI: ffffffffb268dd21 RDI: ffffffffb269ab13
> > > [ 3.010673] RBP: 0000000000000001 R08: ffffffffffc300d5 R09: 000000000002be80
> > > [ 3.010674] R10: 000003625b53183a R11: ffffa012b802b7a4 R12: ffffffffb2aa9e80
> > > [ 3.010675] R13: ffffffffb2aa9e00 R14: 0000000000000001 R15: 0000000000000000
> > > [ 3.010677] FS: 0000000000000000(0000) GS:ffffa012b8000000(0000) knlGS:0000000000000000
> > > [ 3.010678] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [ 3.010680] CR2: ffffa012f81ff000 CR3: 0000000c99612001 CR4: 00000000003706f0
> > > [ 3.010681] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > [ 3.010682] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > [ 3.010683] Call Trace:
> > > [ 3.010685] <TASK>
> > > [ 3.010688] cpuidle_enter_state+0xb7/0x4b0
> > > [ 3.010694] cpuidle_enter+0x29/0x40
> > > [ 3.010697] do_idle+0x1d4/0x210
> > > [ 3.010702] cpu_startup_entry+0x19/0x20
> > > [ 3.010704] rest_init+0x117/0x1a0
> > > [ 3.010708] arch_call_rest_init+0xa/0x10
> > > [ 3.010711] start_kernel+0x6d8/0x6ff
> > > [ 3.010716] secondary_startup_64_no_verify+0xce/0xdb
> > > [ 3.010728] </TASK>
> > > [ 3.010729] irq event stamp: 44179
> > > [ 3.010730] hardirqs last enabled at (44179): [<ffffffffb2000ccb>] asm_sysvec_apic_timer_interrupt+0x1b/0x20
> > > [ 3.010734] hardirqs last disabled at (44177): [<ffffffffb22003f0>] __do_softirq+0x3f0/0x498
> > > [ 3.010736] softirqs last enabled at (44178): [<ffffffffb2200332>] __do_softirq+0x332/0x498
> > > [ 3.010738] softirqs last disabled at (44171): [<ffffffffb16c760b>] irq_exit_rcu+0xab/0xf0
> > > [ 3.010741] ---[ end trace 0000000000000000 ]---
> >
> > Can you please give this patch a go:
> > https://patchwork.kernel.org/project/linux-pm/patch/Yt/[email protected]/
> > ?
>
> I tried, but it didn't change the picture for me.
>
> I'm not sure if that was the patch you meant to send though, as it
> seems it's only adding a tracepoint so shouldn't make any difference
> if I'm not actually using the tracepoint ?

You are right, it looks like I pasted a link to a different patch by
mistake. Sorry about that.

I meant this one:

https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/commit/?h=pm&id=d295ad34f236c3518634fb6403d4c0160456e470

which will appear in the final 5.19.

2022-07-31 00:04:15

by Michel Lespinasse

[permalink] [raw]
Subject: Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

On Sat, Jul 30, 2022 at 09:52:34PM +0200, Rafael J. Wysocki wrote:
> On Sat, Jul 30, 2022 at 11:48 AM Michel Lespinasse
> <[email protected]> wrote:
> > I'm not sure if that was the patch you meant to send though, as it
> > seems it's only adding a tracepoint so shouldn't make any difference
> > if I'm not actually using the tracepoint ?
>
> You are right, it looks like I pasted a link to a different patch by
> mistake. Sorry about that.
>
> I meant this one:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/commit/?h=pm&id=d295ad34f236c3518634fb6403d4c0160456e470
>
> which will appear in the final 5.19.

Thanks. I can confirm that this patch fixes the boot time debug
warnings for me. And I see that linus already merged it, nice!

--
Michel "walken" Lespinasse.