The call to rcu_cpu_starting() in smp_init_secondary() is not early
enough in the CPU-hotplug onlining process, which results in lockdep
splats as follows:
WARNING: suspicious RCU usage
-----------------------------
kernel/locking/lockdep.c:3497 RCU-list traversed in non-reader section!!
other info that might help us debug this:
RCU used illegally from offline CPU!
rcu_scheduler_active = 1, debug_locks = 1
no locks held by swapper/1/0.
Call Trace:
show_stack+0x158/0x1f0
dump_stack+0x1f2/0x238
__lock_acquire+0x2640/0x4dd0
lock_acquire+0x3a8/0xd08
_raw_spin_lock_irqsave+0xc0/0xf0
clockevents_register_device+0xa8/0x528
init_cpu_timer+0x33e/0x468
smp_init_secondary+0x11a/0x328
smp_start_secondary+0x82/0x88
This is avoided by moving the call to rcu_cpu_starting up near the
beginning of the smp_init_secondary() function. Note that the
raw_smp_processor_id() is required in order to avoid calling into
lockdep before RCU has declared the CPU to be watched for readers.
Link: https://lore.kernel.org/lkml/160223032121.7002.1269740091547117869.tip-bot2@tip-bot2/
Signed-off-by: Qian Cai <[email protected]>
---
arch/s390/kernel/smp.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c
index ebfe86d097f0..390d97daa2b3 100644
--- a/arch/s390/kernel/smp.c
+++ b/arch/s390/kernel/smp.c
@@ -855,13 +855,14 @@ void __init smp_detect_cpus(void)
static void smp_init_secondary(void)
{
- int cpu = smp_processor_id();
+ int cpu = raw_smp_processor_id();
S390_lowcore.last_update_clock = get_tod_clock();
restore_access_regs(S390_lowcore.access_regs_save_area);
set_cpu_flag(CIF_ASCE_PRIMARY);
set_cpu_flag(CIF_ASCE_SECONDARY);
cpu_init();
+ rcu_cpu_starting(cpu);
preempt_disable();
init_cpu_timer();
vtime_init();
--
2.28.0
On Wed, Oct 28, 2020 at 02:27:42PM -0400, Qian Cai wrote:
> The call to rcu_cpu_starting() in smp_init_secondary() is not early
> enough in the CPU-hotplug onlining process, which results in lockdep
> splats as follows:
>
> WARNING: suspicious RCU usage
> -----------------------------
> kernel/locking/lockdep.c:3497 RCU-list traversed in non-reader section!!
>
> other info that might help us debug this:
>
> RCU used illegally from offline CPU!
> rcu_scheduler_active = 1, debug_locks = 1
> no locks held by swapper/1/0.
>
> Call Trace:
> show_stack+0x158/0x1f0
> dump_stack+0x1f2/0x238
> __lock_acquire+0x2640/0x4dd0
> lock_acquire+0x3a8/0xd08
> _raw_spin_lock_irqsave+0xc0/0xf0
> clockevents_register_device+0xa8/0x528
> init_cpu_timer+0x33e/0x468
> smp_init_secondary+0x11a/0x328
> smp_start_secondary+0x82/0x88
>
> This is avoided by moving the call to rcu_cpu_starting up near the
> beginning of the smp_init_secondary() function. Note that the
> raw_smp_processor_id() is required in order to avoid calling into
> lockdep before RCU has declared the CPU to be watched for readers.
>
> Link: https://lore.kernel.org/lkml/160223032121.7002.1269740091547117869.tip-bot2@tip-bot2/
> Signed-off-by: Qian Cai <[email protected]>
> ---
> arch/s390/kernel/smp.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
Could you provide the config you used? I'm wondering why I can't
reproduce this even though I have lot's of debug options enabled.
I will apply it anyway after rc2 has been released, just curious.
On Sat, 2020-10-31 at 19:37 +0100, Heiko Carstens wrote:
> On Wed, Oct 28, 2020 at 02:27:42PM -0400, Qian Cai wrote:
> > The call to rcu_cpu_starting() in smp_init_secondary() is not early
> > enough in the CPU-hotplug onlining process, which results in lockdep
> > splats as follows:
> >
> > WARNING: suspicious RCU usage
> > -----------------------------
> > kernel/locking/lockdep.c:3497 RCU-list traversed in non-reader section!!
> >
> > other info that might help us debug this:
> >
> > RCU used illegally from offline CPU!
> > rcu_scheduler_active = 1, debug_locks = 1
> > no locks held by swapper/1/0.
> >
> > Call Trace:
> > show_stack+0x158/0x1f0
> > dump_stack+0x1f2/0x238
> > __lock_acquire+0x2640/0x4dd0
> > lock_acquire+0x3a8/0xd08
> > _raw_spin_lock_irqsave+0xc0/0xf0
> > clockevents_register_device+0xa8/0x528
> > init_cpu_timer+0x33e/0x468
> > smp_init_secondary+0x11a/0x328
> > smp_start_secondary+0x82/0x88
> >
> > This is avoided by moving the call to rcu_cpu_starting up near the
> > beginning of the smp_init_secondary() function. Note that the
> > raw_smp_processor_id() is required in order to avoid calling into
> > lockdep before RCU has declared the CPU to be watched for readers.
> >
> > Link: https://lore.kernel.org/lkml/160223032121.7002.1269740091547117869.tip-bot2@tip-bot2/
> > Signed-off-by: Qian Cai <[email protected]>
> > ---
> > arch/s390/kernel/smp.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
>
> Could you provide the config you used? I'm wondering why I can't
> reproduce this even though I have lot's of debug options enabled.
https://cailca.coding.net/public/linux/mm/git/files/master/s390.config
Essentially, I believe it requires CONFIG_PROVE_RCU_LIST=y. Also, it occurs to
me that this only starts to happen after the commit mentioned in the above link.
On Sat, Oct 31, 2020 at 07:38:52PM -0400, Qian Cai wrote:
> > > This is avoided by moving the call to rcu_cpu_starting up near the
> > > beginning of the smp_init_secondary() function. Note that the
> > > raw_smp_processor_id() is required in order to avoid calling into
> > > lockdep before RCU has declared the CPU to be watched for readers.
> > >
> > > Link: https://lore.kernel.org/lkml/160223032121.7002.1269740091547117869.tip-bot2@tip-bot2/
> > > Signed-off-by: Qian Cai <[email protected]>
> > > ---
> > > arch/s390/kernel/smp.c | 3 ++-
> > > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > Could you provide the config you used? I'm wondering why I can't
> > reproduce this even though I have lot's of debug options enabled.
> https://cailca.coding.net/public/linux/mm/git/files/master/s390.config
>
> Essentially, I believe it requires CONFIG_PROVE_RCU_LIST=y. Also, it occurs to
> me that this only starts to happen after the commit mentioned in the above link.
Yes, with that enabled I can reprocuce it. Thanks! It depends on
CONFIG_RCU_EXPERT. I can't image why I didn't had that enabled.. :)