Hello!
I am seeing the following splat in rcutorture testing of v4.11-rc1:
[ 30.694013] =============================
[ 30.694013] WARNING: suspicious RCU usage
[ 30.694013] 4.11.0-rc1+ #1 Not tainted
[ 30.694013] -----------------------------
[ 30.694013] /home/git/linux-2.6-tip/kernel/workqueue.c:712 sched RCU or wq_pool_mutex should be held!
[ 30.694013]
[ 30.694013] other info that might help us debug this:
[ 30.694013]
[ 30.694013]
[ 30.694013] RCU used illegally from offline CPU!
[ 30.694013] rcu_scheduler_active = 2, debug_locks = 0
[ 30.694013] no locks held by swapper/1/0.
[ 30.694013]
[ 30.694013] stack backtrace:
[ 30.694013] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.11.0-rc1+ #1
[ 30.694013] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[ 30.694013] Call Trace:
[ 30.694013] dump_stack+0x67/0x99
[ 30.694013] lockdep_rcu_suspicious+0xe7/0x120
[ 30.694013] get_work_pool+0x82/0x90
[ 30.694013] __queue_work+0x70/0x5f0
[ 30.694013] queue_work_on+0x33/0x70
[ 30.694013] clear_sched_clock_stable+0x33/0x40
[ 30.694013] early_init_intel+0xe7/0x2f0
[ 30.694013] init_intel+0x11/0x350
[ 30.694013] identify_cpu+0x344/0x5a0
[ 30.694013] identify_secondary_cpu+0x18/0x80
[ 30.694013] smp_store_cpu_info+0x39/0x40
[ 30.694013] start_secondary+0x4e/0x100
[ 30.694013] start_cpu+0x14/0x14
Here is the relevant code from x86's smp_callin():
/*
* Save our processor parameters. Note: this information
* is needed for clock calibration.
*/
smp_store_cpu_info(cpuid);
/*
* Get our bogomips.
* Update loops_per_jiffy in cpu_data. Previous call to
* smp_store_cpu_info() stored a value that is close but not as
* accurate as the value just calculated.
*/
calibrate_delay();
cpu_data(cpuid).loops_per_jiffy = loops_per_jiffy;
pr_debug("Stack at about %p\n", &cpuid);
/*
* This must be done before setting cpu_online_mask
* or calling notify_cpu_starting.
*/
set_cpu_sibling_map(raw_smp_processor_id());
wmb();
notify_cpu_starting(cpuid);
The problem is that smp_store_cpu_info() indirectly invokes
schedule_work(), which wants to use RCU. But RCU isn't informed
of the incoming CPU until the call to notify_cpu_starting(), which
causes lockdep to complain bitterly about the use of RCU by the
premature call to schedule_work().
I considered just moving the notify_cpu_starting() earlier in the
sequence, but the comments make it seem like this would not be
a wise choice.
Any suggestions?
Thanx, Paul
On Wed, Mar 08, 2017 at 02:16:56PM -0800, Paul E. McKenney wrote:
> Hello!
>
> I am seeing the following splat in rcutorture testing of v4.11-rc1:
>
> [ 30.694013] =============================
> [ 30.694013] WARNING: suspicious RCU usage
> [ 30.694013] 4.11.0-rc1+ #1 Not tainted
> [ 30.694013] -----------------------------
> [ 30.694013] /home/git/linux-2.6-tip/kernel/workqueue.c:712 sched RCU or wq_pool_mutex should be held!
> [ 30.694013]
> [ 30.694013] other info that might help us debug this:
> [ 30.694013]
> [ 30.694013]
> [ 30.694013] RCU used illegally from offline CPU!
> [ 30.694013] rcu_scheduler_active = 2, debug_locks = 0
> [ 30.694013] no locks held by swapper/1/0.
> [ 30.694013]
> [ 30.694013] stack backtrace:
> [ 30.694013] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.11.0-rc1+ #1
> [ 30.694013] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> [ 30.694013] Call Trace:
> [ 30.694013] dump_stack+0x67/0x99
> [ 30.694013] lockdep_rcu_suspicious+0xe7/0x120
> [ 30.694013] get_work_pool+0x82/0x90
> [ 30.694013] __queue_work+0x70/0x5f0
> [ 30.694013] queue_work_on+0x33/0x70
> [ 30.694013] clear_sched_clock_stable+0x33/0x40
> [ 30.694013] early_init_intel+0xe7/0x2f0
> [ 30.694013] init_intel+0x11/0x350
> [ 30.694013] identify_cpu+0x344/0x5a0
> [ 30.694013] identify_secondary_cpu+0x18/0x80
> [ 30.694013] smp_store_cpu_info+0x39/0x40
> [ 30.694013] start_secondary+0x4e/0x100
> [ 30.694013] start_cpu+0x14/0x14
>
> Here is the relevant code from x86's smp_callin():
>
> /*
> * Save our processor parameters. Note: this information
> * is needed for clock calibration.
> */
> smp_store_cpu_info(cpuid);
>
> /*
> * Get our bogomips.
> * Update loops_per_jiffy in cpu_data. Previous call to
> * smp_store_cpu_info() stored a value that is close but not as
> * accurate as the value just calculated.
> */
> calibrate_delay();
> cpu_data(cpuid).loops_per_jiffy = loops_per_jiffy;
> pr_debug("Stack at about %p\n", &cpuid);
>
> /*
> * This must be done before setting cpu_online_mask
> * or calling notify_cpu_starting.
> */
> set_cpu_sibling_map(raw_smp_processor_id());
> wmb();
>
> notify_cpu_starting(cpuid);
>
> The problem is that smp_store_cpu_info() indirectly invokes
> schedule_work(), which wants to use RCU. But RCU isn't informed
> of the incoming CPU until the call to notify_cpu_starting(), which
> causes lockdep to complain bitterly about the use of RCU by the
> premature call to schedule_work().
>
> I considered just moving the notify_cpu_starting() earlier in the
> sequence, but the comments make it seem like this would not be
> a wise choice.
>
> Any suggestions?
And I should hasten to mention my default approach, which would be
to provide a Kconfig option that causes notify_cpu_starting() to
skip calling rcu_cpu_starting(), and to add a call to this function
in x86's smp_callin() just before the call to smp_store_cpu_info().
But this seemed a bit crude, so I figured I should check with you
guys before taking that approach.
But what is the fun in that? Please see below for the corresponding
lightly tested patch. ;-)
Thanx, Paul
------------------------------------------------------------------------
commit aa805dad9ae66cc4f5106d004b6bb5102fd84434
Author: Paul E. McKenney <[email protected]>
Date: Wed Mar 8 15:32:29 2017 -0800
cpu: Move RCU earlier in x86 CPU-online procedure
Running rcutorture on v4.11-rc1 results in the following splat on x86
on kernels built with both CPU hotplug and lockdep:
[ 30.694013] =============================
[ 30.694013] WARNING: suspicious RCU usage
[ 30.694013] 4.11.0-rc1+ #1 Not tainted
[ 30.694013] -----------------------------
[ 30.694013] /home/git/linux-2.6-tip/kernel/workqueue.c:712 sched RCU or wq_pool_mutex should be held!
[ 30.694013]
[ 30.694013] other info that might help us debug this:
[ 30.694013]
[ 30.694013]
[ 30.694013] RCU used illegally from offline CPU!
[ 30.694013] rcu_scheduler_active = 2, debug_locks = 0
[ 30.694013] no locks held by swapper/1/0.
[ 30.694013]
[ 30.694013] stack backtrace:
[ 30.694013] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.11.0-rc1+ #1
[ 30.694013] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[ 30.694013] Call Trace:
[ 30.694013] dump_stack+0x67/0x99
[ 30.694013] lockdep_rcu_suspicious+0xe7/0x120
[ 30.694013] get_work_pool+0x82/0x90
[ 30.694013] __queue_work+0x70/0x5f0
[ 30.694013] queue_work_on+0x33/0x70
[ 30.694013] clear_sched_clock_stable+0x33/0x40
[ 30.694013] early_init_intel+0xe7/0x2f0
[ 30.694013] init_intel+0x11/0x350
[ 30.694013] identify_cpu+0x344/0x5a0
[ 30.694013] identify_secondary_cpu+0x18/0x80
[ 30.694013] smp_store_cpu_info+0x39/0x40
[ 30.694013] start_secondary+0x4e/0x100
[ 30.694013] start_cpu+0x14/0x14
This is caused by the fact that smp_store_cpu_info() indirectly invokes
schedule_work(), which wants to use RCU. But RCU isn't informed of the
incoming CPU until the later call to notify_cpu_starting(), which causes
lockdep to complain bitterly about the use of RCU by the premature call
to schedule_work(). It is said to be unwise to move the call to
notify_cpu_starting() to precede that to smp_store_cpu_info(), so this
commit adds a ARCH_RCU_ONLINE_EARLY Kconfig option that is selected by
the x86 architecture. This option suppresses the call to rcu_cpu_starting()
from notify_cpu_starting(), and a call to rcu_cpu_starting() is added
preceding the call to smp_store_cpu_info().
Signed-off-by: Paul E. McKenney <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Thomas Gleixner <[email protected]>
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index cc98d5a294ee..bb42b76ec2dd 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -63,6 +63,7 @@ config X86
select ARCH_MIGHT_HAVE_ACPI_PDC if ACPI
select ARCH_MIGHT_HAVE_PC_PARPORT
select ARCH_MIGHT_HAVE_PC_SERIO
+ select ARCH_RCU_ONLINE_EARLY
select ARCH_SUPPORTS_ATOMIC_RMW
select ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT
select ARCH_SUPPORTS_NUMA_BALANCING if X86_64
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index bd1f1ad35284..ca7386d04194 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -188,6 +188,12 @@ static void smp_callin(void)
apic_ap_setup();
/*
+ * Enable RCU readers on this CPU, needed for workqueues,
+ * which are used in smp_store_cpu_info().
+ */
+ rcu_cpu_starting(cpuid);
+
+ /*
* Save our processor parameters. Note: this information
* is needed for clock calibration.
*/
diff --git a/init/Kconfig b/init/Kconfig
index c859c993c26f..31c2d5e08f16 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -543,6 +543,9 @@ config RCU_STALL_COMMON
the tiny variants to disable RCU CPU stall warnings, while
making these warnings mandatory for the tree variants.
+config ARCH_RCU_ONLINE_EARLY
+ bool
+
config CONTEXT_TRACKING
bool
diff --git a/kernel/cpu.c b/kernel/cpu.c
index f7c063239fa5..9d98f2785fa0 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -851,7 +851,8 @@ void notify_cpu_starting(unsigned int cpu)
struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);
enum cpuhp_state target = min((int)st->target, CPUHP_AP_ONLINE);
- rcu_cpu_starting(cpu); /* Enables RCU usage on this CPU. */
+ if (!IS_ENABLED(CONFIG_ARCH_RCU_ONLINE_EARLY))
+ rcu_cpu_starting(cpu); /* Enables RCU usage on this CPU. */
while (st->state < target) {
st->state++;
cpuhp_invoke_callback(cpu, st->state, true, NULL);
On Wed, Mar 08, 2017 at 03:41:52PM -0800, Paul E. McKenney wrote:
> On Wed, Mar 08, 2017 at 02:16:56PM -0800, Paul E. McKenney wrote:
> > Hello!
> >
> > I am seeing the following splat in rcutorture testing of v4.11-rc1:
> >
> > [ 30.694013] =============================
> > [ 30.694013] WARNING: suspicious RCU usage
> > [ 30.694013] 4.11.0-rc1+ #1 Not tainted
> > [ 30.694013] -----------------------------
> > [ 30.694013] /home/git/linux-2.6-tip/kernel/workqueue.c:712 sched RCU or wq_pool_mutex should be held!
> > [ 30.694013]
> > [ 30.694013] other info that might help us debug this:
> > [ 30.694013]
> > [ 30.694013]
> > [ 30.694013] RCU used illegally from offline CPU!
> > [ 30.694013] rcu_scheduler_active = 2, debug_locks = 0
> > [ 30.694013] no locks held by swapper/1/0.
> > [ 30.694013]
> > [ 30.694013] stack backtrace:
> > [ 30.694013] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.11.0-rc1+ #1
> > [ 30.694013] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> > [ 30.694013] Call Trace:
> > [ 30.694013] dump_stack+0x67/0x99
> > [ 30.694013] lockdep_rcu_suspicious+0xe7/0x120
> > [ 30.694013] get_work_pool+0x82/0x90
> > [ 30.694013] __queue_work+0x70/0x5f0
> > [ 30.694013] queue_work_on+0x33/0x70
> > [ 30.694013] clear_sched_clock_stable+0x33/0x40
> > [ 30.694013] early_init_intel+0xe7/0x2f0
> > [ 30.694013] init_intel+0x11/0x350
> > [ 30.694013] identify_cpu+0x344/0x5a0
> > [ 30.694013] identify_secondary_cpu+0x18/0x80
> > [ 30.694013] smp_store_cpu_info+0x39/0x40
> > [ 30.694013] start_secondary+0x4e/0x100
> > [ 30.694013] start_cpu+0x14/0x14
> >
> > Here is the relevant code from x86's smp_callin():
> >
> > /*
> > * Save our processor parameters. Note: this information
> > * is needed for clock calibration.
> > */
> > smp_store_cpu_info(cpuid);
> >
> > /*
> > * Get our bogomips.
> > * Update loops_per_jiffy in cpu_data. Previous call to
> > * smp_store_cpu_info() stored a value that is close but not as
> > * accurate as the value just calculated.
> > */
> > calibrate_delay();
> > cpu_data(cpuid).loops_per_jiffy = loops_per_jiffy;
> > pr_debug("Stack at about %p\n", &cpuid);
> >
> > /*
> > * This must be done before setting cpu_online_mask
> > * or calling notify_cpu_starting.
> > */
> > set_cpu_sibling_map(raw_smp_processor_id());
> > wmb();
> >
> > notify_cpu_starting(cpuid);
> >
> > The problem is that smp_store_cpu_info() indirectly invokes
> > schedule_work(), which wants to use RCU. But RCU isn't informed
> > of the incoming CPU until the call to notify_cpu_starting(), which
> > causes lockdep to complain bitterly about the use of RCU by the
> > premature call to schedule_work().
> >
> > I considered just moving the notify_cpu_starting() earlier in the
> > sequence, but the comments make it seem like this would not be
> > a wise choice.
> >
> > Any suggestions?
Calling schedule_work() from an offline (booting) CPU doesn't sound like a good
idea in the first place. And neither is it a good idea to allow using
RCU on a CPU that is not yet online.
Perhaps we could delay this sched clock stability check to a later
stage in the secondary CPU boot-up code? Once the CPU is online
and RCU is initialized? For example it could be a CPU_ONLINE hotplug
callback.
On Thu, Mar 09, 2017 at 04:55:29AM +0100, Frederic Weisbecker wrote:
> On Wed, Mar 08, 2017 at 03:41:52PM -0800, Paul E. McKenney wrote:
> > On Wed, Mar 08, 2017 at 02:16:56PM -0800, Paul E. McKenney wrote:
> > > Hello!
> > >
> > > I am seeing the following splat in rcutorture testing of v4.11-rc1:
> > >
> > > [ 30.694013] =============================
> > > [ 30.694013] WARNING: suspicious RCU usage
> > > [ 30.694013] 4.11.0-rc1+ #1 Not tainted
> > > [ 30.694013] -----------------------------
> > > [ 30.694013] /home/git/linux-2.6-tip/kernel/workqueue.c:712 sched RCU or wq_pool_mutex should be held!
> > > [ 30.694013]
> > > [ 30.694013] other info that might help us debug this:
> > > [ 30.694013]
> > > [ 30.694013]
> > > [ 30.694013] RCU used illegally from offline CPU!
> > > [ 30.694013] rcu_scheduler_active = 2, debug_locks = 0
> > > [ 30.694013] no locks held by swapper/1/0.
> > > [ 30.694013]
> > > [ 30.694013] stack backtrace:
> > > [ 30.694013] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.11.0-rc1+ #1
> > > [ 30.694013] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> > > [ 30.694013] Call Trace:
> > > [ 30.694013] dump_stack+0x67/0x99
> > > [ 30.694013] lockdep_rcu_suspicious+0xe7/0x120
> > > [ 30.694013] get_work_pool+0x82/0x90
> > > [ 30.694013] __queue_work+0x70/0x5f0
> > > [ 30.694013] queue_work_on+0x33/0x70
> > > [ 30.694013] clear_sched_clock_stable+0x33/0x40
> > > [ 30.694013] early_init_intel+0xe7/0x2f0
> > > [ 30.694013] init_intel+0x11/0x350
> > > [ 30.694013] identify_cpu+0x344/0x5a0
> > > [ 30.694013] identify_secondary_cpu+0x18/0x80
> > > [ 30.694013] smp_store_cpu_info+0x39/0x40
> > > [ 30.694013] start_secondary+0x4e/0x100
> > > [ 30.694013] start_cpu+0x14/0x14
> > >
> > > Here is the relevant code from x86's smp_callin():
> > >
> > > /*
> > > * Save our processor parameters. Note: this information
> > > * is needed for clock calibration.
> > > */
> > > smp_store_cpu_info(cpuid);
> > >
> > > /*
> > > * Get our bogomips.
> > > * Update loops_per_jiffy in cpu_data. Previous call to
> > > * smp_store_cpu_info() stored a value that is close but not as
> > > * accurate as the value just calculated.
> > > */
> > > calibrate_delay();
> > > cpu_data(cpuid).loops_per_jiffy = loops_per_jiffy;
> > > pr_debug("Stack at about %p\n", &cpuid);
> > >
> > > /*
> > > * This must be done before setting cpu_online_mask
> > > * or calling notify_cpu_starting.
> > > */
> > > set_cpu_sibling_map(raw_smp_processor_id());
> > > wmb();
> > >
> > > notify_cpu_starting(cpuid);
> > >
> > > The problem is that smp_store_cpu_info() indirectly invokes
> > > schedule_work(), which wants to use RCU. But RCU isn't informed
> > > of the incoming CPU until the call to notify_cpu_starting(), which
> > > causes lockdep to complain bitterly about the use of RCU by the
> > > premature call to schedule_work().
> > >
> > > I considered just moving the notify_cpu_starting() earlier in the
> > > sequence, but the comments make it seem like this would not be
> > > a wise choice.
> > >
> > > Any suggestions?
>
> Calling schedule_work() from an offline (booting) CPU doesn't sound like a good
> idea in the first place. And neither is it a good idea to allow using
> RCU on a CPU that is not yet online.
Fair point, though it is only RCU readers that are in use, so no
RCU core code would be running, at least not until interrupts are
enabled and so on. But I needed the patch to get this out of the
way of my rcutorture testing, so it is serving a purpose whether or
not it eventually hits mainline.
> Perhaps we could delay this sched clock stability check to a later
> stage in the secondary CPU boot-up code? Once the CPU is online
> and RCU is initialized? For example it could be a CPU_ONLINE hotplug
> callback.
In theory, this does seem like a cleaner solution to me. In practice,
I must defer to those who know the code better than I do.
Thanx, Paul
On Wed, 8 Mar 2017, Paul E. McKenney wrote:
> [ 30.694013] lockdep_rcu_suspicious+0xe7/0x120
> [ 30.694013] get_work_pool+0x82/0x90
> [ 30.694013] __queue_work+0x70/0x5f0
> [ 30.694013] queue_work_on+0x33/0x70
> [ 30.694013] clear_sched_clock_stable+0x33/0x40
> [ 30.694013] early_init_intel+0xe7/0x2f0
> [ 30.694013] init_intel+0x11/0x350
> [ 30.694013] identify_cpu+0x344/0x5a0
> [ 30.694013] identify_secondary_cpu+0x18/0x80
> [ 30.694013] smp_store_cpu_info+0x39/0x40
> [ 30.694013] start_secondary+0x4e/0x100
> [ 30.694013] start_cpu+0x14/0x14
>
> Here is the relevant code from x86's smp_callin():
>
> /*
> * Save our processor parameters. Note: this information
> * is needed for clock calibration.
> */
> smp_store_cpu_info(cpuid);
>
> The problem is that smp_store_cpu_info() indirectly invokes
> schedule_work(), which wants to use RCU. But RCU isn't informed
> of the incoming CPU until the call to notify_cpu_starting(), which
> causes lockdep to complain bitterly about the use of RCU by the
> premature call to schedule_work().
Right. And that want's to be fixed, not hacked around by silencing RCU.
Peter????
Thanks,
tglx
On Thu, Mar 09, 2017 at 02:08:23PM +0100, Thomas Gleixner wrote:
> On Wed, 8 Mar 2017, Paul E. McKenney wrote:
> > [ 30.694013] lockdep_rcu_suspicious+0xe7/0x120
> > [ 30.694013] get_work_pool+0x82/0x90
> > [ 30.694013] __queue_work+0x70/0x5f0
> > [ 30.694013] queue_work_on+0x33/0x70
> > [ 30.694013] clear_sched_clock_stable+0x33/0x40
> > [ 30.694013] early_init_intel+0xe7/0x2f0
> > [ 30.694013] init_intel+0x11/0x350
> > [ 30.694013] identify_cpu+0x344/0x5a0
> > [ 30.694013] identify_secondary_cpu+0x18/0x80
> > [ 30.694013] smp_store_cpu_info+0x39/0x40
> > [ 30.694013] start_secondary+0x4e/0x100
> > [ 30.694013] start_cpu+0x14/0x14
> >
> > Here is the relevant code from x86's smp_callin():
> >
> > /*
> > * Save our processor parameters. Note: this information
> > * is needed for clock calibration.
> > */
> > smp_store_cpu_info(cpuid);
> >
> > The problem is that smp_store_cpu_info() indirectly invokes
> > schedule_work(), which wants to use RCU. But RCU isn't informed
> > of the incoming CPU until the call to notify_cpu_starting(), which
> > causes lockdep to complain bitterly about the use of RCU by the
> > premature call to schedule_work().
>
> Right. And that want's to be fixed, not hacked around by silencing RCU.
>
> Peter????
I'm thinking this is hotplug? 30 seconds after boot is far too late for
SMP bringup, or you have a stupid slow machine.
Because it only calls schedule_work() after SMP-init. In which case
there's then two cases, either:
- TSC was stable, hotplug wrecked it, TSC is now unstable, and we're
screwed.
- TSC was unstable, hotplug triggers and we want to mark it unstable
_again_.
If this is the second, the below should fix it, if its the first, I've
no idea yet on how to fix that properly :/
Bloody hotplug..
---
kernel/sched/clock.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
index a08795e..eecf388 100644
--- a/kernel/sched/clock.c
+++ b/kernel/sched/clock.c
@@ -172,7 +172,7 @@ void clear_sched_clock_stable(void)
smp_mb(); /* matches sched_clock_init_late() */
- if (sched_clock_running == 2)
+ if (sched_clock_running == 2 && sched_clock_stable())
schedule_work(&sched_clock_work);
}
On Thu, Mar 09, 2017 at 02:08:23PM +0100, Thomas Gleixner wrote:
> On Wed, 8 Mar 2017, Paul E. McKenney wrote:
> > [ 30.694013] lockdep_rcu_suspicious+0xe7/0x120
> > [ 30.694013] get_work_pool+0x82/0x90
> > [ 30.694013] __queue_work+0x70/0x5f0
> > [ 30.694013] queue_work_on+0x33/0x70
> > [ 30.694013] clear_sched_clock_stable+0x33/0x40
> > [ 30.694013] early_init_intel+0xe7/0x2f0
> > [ 30.694013] init_intel+0x11/0x350
> > [ 30.694013] identify_cpu+0x344/0x5a0
> > [ 30.694013] identify_secondary_cpu+0x18/0x80
> > [ 30.694013] smp_store_cpu_info+0x39/0x40
> > [ 30.694013] start_secondary+0x4e/0x100
> > [ 30.694013] start_cpu+0x14/0x14
> >
> > Here is the relevant code from x86's smp_callin():
> >
> > /*
> > * Save our processor parameters. Note: this information
> > * is needed for clock calibration.
> > */
> > smp_store_cpu_info(cpuid);
> >
> > The problem is that smp_store_cpu_info() indirectly invokes
> > schedule_work(), which wants to use RCU. But RCU isn't informed
> > of the incoming CPU until the call to notify_cpu_starting(), which
> > causes lockdep to complain bitterly about the use of RCU by the
> > premature call to schedule_work().
>
> Right. And that want's to be fixed, not hacked around by silencing RCU.
Fair enough!
I have updated my commit to indicate yours and Frederic's discomfort with
it and marked it as not intended for upstream. If we get an alternative
fix shortly, I will drop my commit -- but failing that at some point I
will of course start pushing this patch again.
Thanx, Paul
> Peter????
>
> Thanks,
>
> tglx
>
On Thu, Mar 09, 2017 at 04:12:55PM +0100, Peter Zijlstra wrote:
> On Thu, Mar 09, 2017 at 02:08:23PM +0100, Thomas Gleixner wrote:
> > On Wed, 8 Mar 2017, Paul E. McKenney wrote:
> > > [ 30.694013] lockdep_rcu_suspicious+0xe7/0x120
> > > [ 30.694013] get_work_pool+0x82/0x90
> > > [ 30.694013] __queue_work+0x70/0x5f0
> > > [ 30.694013] queue_work_on+0x33/0x70
> > > [ 30.694013] clear_sched_clock_stable+0x33/0x40
> > > [ 30.694013] early_init_intel+0xe7/0x2f0
> > > [ 30.694013] init_intel+0x11/0x350
> > > [ 30.694013] identify_cpu+0x344/0x5a0
> > > [ 30.694013] identify_secondary_cpu+0x18/0x80
> > > [ 30.694013] smp_store_cpu_info+0x39/0x40
> > > [ 30.694013] start_secondary+0x4e/0x100
> > > [ 30.694013] start_cpu+0x14/0x14
> > >
> > > Here is the relevant code from x86's smp_callin():
> > >
> > > /*
> > > * Save our processor parameters. Note: this information
> > > * is needed for clock calibration.
> > > */
> > > smp_store_cpu_info(cpuid);
> > >
> > > The problem is that smp_store_cpu_info() indirectly invokes
> > > schedule_work(), which wants to use RCU. But RCU isn't informed
> > > of the incoming CPU until the call to notify_cpu_starting(), which
> > > causes lockdep to complain bitterly about the use of RCU by the
> > > premature call to schedule_work().
> >
> > Right. And that want's to be fixed, not hacked around by silencing RCU.
> >
> > Peter????
>
> I'm thinking this is hotplug? 30 seconds after boot is far too late for
> SMP bringup, or you have a stupid slow machine.
And this certainly does qualify as "shortly", thank you!
Yes, this only happens on hotplug with lockdep enabled, specifically
on rcutorture scenarios TASKS01 and TREE05.
> Because it only calls schedule_work() after SMP-init. In which case
> there's then two cases, either:
>
> - TSC was stable, hotplug wrecked it, TSC is now unstable, and we're
> screwed.
>
> - TSC was unstable, hotplug triggers and we want to mark it unstable
> _again_.
>
> If this is the second, the below should fix it, if its the first, I've
> no idea yet on how to fix that properly :/
I have applied this patch and started tests on TREE05 and TASKS01, should
get results shortly.
Thanx, Paul
> Bloody hotplug..
>
> ---
> kernel/sched/clock.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
> index a08795e..eecf388 100644
> --- a/kernel/sched/clock.c
> +++ b/kernel/sched/clock.c
> @@ -172,7 +172,7 @@ void clear_sched_clock_stable(void)
>
> smp_mb(); /* matches sched_clock_init_late() */
>
> - if (sched_clock_running == 2)
> + if (sched_clock_running == 2 && sched_clock_stable())
> schedule_work(&sched_clock_work);
> }
>
>
On Thu, Mar 09, 2017 at 07:29:26AM -0800, Paul E. McKenney wrote:
> On Thu, Mar 09, 2017 at 04:12:55PM +0100, Peter Zijlstra wrote:
> > On Thu, Mar 09, 2017 at 02:08:23PM +0100, Thomas Gleixner wrote:
> > > On Wed, 8 Mar 2017, Paul E. McKenney wrote:
> > > > [ 30.694013] lockdep_rcu_suspicious+0xe7/0x120
> > > > [ 30.694013] get_work_pool+0x82/0x90
> > > > [ 30.694013] __queue_work+0x70/0x5f0
> > > > [ 30.694013] queue_work_on+0x33/0x70
> > > > [ 30.694013] clear_sched_clock_stable+0x33/0x40
> > > > [ 30.694013] early_init_intel+0xe7/0x2f0
> > > > [ 30.694013] init_intel+0x11/0x350
> > > > [ 30.694013] identify_cpu+0x344/0x5a0
> > > > [ 30.694013] identify_secondary_cpu+0x18/0x80
> > > > [ 30.694013] smp_store_cpu_info+0x39/0x40
> > > > [ 30.694013] start_secondary+0x4e/0x100
> > > > [ 30.694013] start_cpu+0x14/0x14
> > > >
> > > > Here is the relevant code from x86's smp_callin():
> > > >
> > > > /*
> > > > * Save our processor parameters. Note: this information
> > > > * is needed for clock calibration.
> > > > */
> > > > smp_store_cpu_info(cpuid);
> > > >
> > > > The problem is that smp_store_cpu_info() indirectly invokes
> > > > schedule_work(), which wants to use RCU. But RCU isn't informed
> > > > of the incoming CPU until the call to notify_cpu_starting(), which
> > > > causes lockdep to complain bitterly about the use of RCU by the
> > > > premature call to schedule_work().
> > >
> > > Right. And that want's to be fixed, not hacked around by silencing RCU.
> > >
> > > Peter????
> >
> > I'm thinking this is hotplug? 30 seconds after boot is far too late for
> > SMP bringup, or you have a stupid slow machine.
>
> And this certainly does qualify as "shortly", thank you!
>
> Yes, this only happens on hotplug with lockdep enabled, specifically
> on rcutorture scenarios TASKS01 and TREE05.
>
> > Because it only calls schedule_work() after SMP-init. In which case
> > there's then two cases, either:
> >
> > - TSC was stable, hotplug wrecked it, TSC is now unstable, and we're
> > screwed.
> >
> > - TSC was unstable, hotplug triggers and we want to mark it unstable
> > _again_.
> >
> > If this is the second, the below should fix it, if its the first, I've
> > no idea yet on how to fix that properly :/
>
> I have applied this patch and started tests on TREE05 and TASKS01, should
> get results shortly.
And the below patch passed light rcutorture testing, so looking good!
Thanx, Paul
> > Bloody hotplug..
> >
> > ---
> > kernel/sched/clock.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
> > index a08795e..eecf388 100644
> > --- a/kernel/sched/clock.c
> > +++ b/kernel/sched/clock.c
> > @@ -172,7 +172,7 @@ void clear_sched_clock_stable(void)
> >
> > smp_mb(); /* matches sched_clock_init_late() */
> >
> > - if (sched_clock_running == 2)
> > + if (sched_clock_running == 2 && sched_clock_stable())
> > schedule_work(&sched_clock_work);
> > }
> >
> >
On Mon, Mar 20, 2017 at 09:32:37AM +0100, Tomeu Vizoso wrote:
> On 9 March 2017 at 16:50, Paul E. McKenney <[email protected]> wrote:
> >
> > On Thu, Mar 09, 2017 at 07:29:26AM -0800, Paul E. McKenney wrote:
> > > On Thu, Mar 09, 2017 at 04:12:55PM +0100, Peter Zijlstra wrote:
> > > > On Thu, Mar 09, 2017 at 02:08:23PM +0100, Thomas Gleixner wrote:
> > > > > On Wed, 8 Mar 2017, Paul E. McKenney wrote:
> > > > > > [ 30.694013] lockdep_rcu_suspicious+0xe7/0x120
> > > > > > [ 30.694013] get_work_pool+0x82/0x90
> > > > > > [ 30.694013] __queue_work+0x70/0x5f0
> > > > > > [ 30.694013] queue_work_on+0x33/0x70
> > > > > > [ 30.694013] clear_sched_clock_stable+0x33/0x40
> > > > > > [ 30.694013] early_init_intel+0xe7/0x2f0
> > > > > > [ 30.694013] init_intel+0x11/0x350
> > > > > > [ 30.694013] identify_cpu+0x344/0x5a0
> > > > > > [ 30.694013] identify_secondary_cpu+0x18/0x80
> > > > > > [ 30.694013] smp_store_cpu_info+0x39/0x40
> > > > > > [ 30.694013] start_secondary+0x4e/0x100
> > > > > > [ 30.694013] start_cpu+0x14/0x14
> > > > > >
> > > > > > Here is the relevant code from x86's smp_callin():
> > > > > >
> > > > > > /*
> > > > > > * Save our processor parameters. Note: this information
> > > > > > * is needed for clock calibration.
> > > > > > */
> > > > > > smp_store_cpu_info(cpuid);
> > > > > >
> > > > > > The problem is that smp_store_cpu_info() indirectly invokes
> > > > > > schedule_work(), which wants to use RCU. But RCU isn't informed
> > > > > > of the incoming CPU until the call to notify_cpu_starting(), which
> > > > > > causes lockdep to complain bitterly about the use of RCU by the
> > > > > > premature call to schedule_work().
> > > > >
> > > > > Right. And that want's to be fixed, not hacked around by silencing RCU.
> > > > >
> > > > > Peter????
> > > >
> > > > I'm thinking this is hotplug? 30 seconds after boot is far too late for
> > > > SMP bringup, or you have a stupid slow machine.
> > >
> > > And this certainly does qualify as "shortly", thank you!
> > >
> > > Yes, this only happens on hotplug with lockdep enabled, specifically
> > > on rcutorture scenarios TASKS01 and TREE05.
> > >
> > > > Because it only calls schedule_work() after SMP-init. In which case
> > > > there's then two cases, either:
> > > >
> > > > - TSC was stable, hotplug wrecked it, TSC is now unstable, and we're
> > > > screwed.
> > > >
> > > > - TSC was unstable, hotplug triggers and we want to mark it unstable
> > > > _again_.
> > > >
> > > > If this is the second, the below should fix it, if its the first, I've
> > > > no idea yet on how to fix that properly :/
> > >
> > > I have applied this patch and started tests on TREE05 and TASKS01, should
> > > get results shortly.
> >
> > And the below patch passed light rcutorture testing, so looking good!
>
> I'm having trouble finding this patch in linux-next, has it been pushed already?
Peter pointed out that this v4.11-rc2 patch should fix the problem, see
Message-ID: <[email protected]>.
I rebased to v4.11-rc2, and haven't seen the problem, so I dropped the
patch referred to above.
f94c8d116997 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface")
I am not sure whether or not Peter is sending another patch or if he
was instead was going to amend f94c8d116997's changelog.
Thanx, Paul
Thanx, Paul
On 9 March 2017 at 16:50, Paul E. McKenney <[email protected]> wrote:
>
> On Thu, Mar 09, 2017 at 07:29:26AM -0800, Paul E. McKenney wrote:
> > On Thu, Mar 09, 2017 at 04:12:55PM +0100, Peter Zijlstra wrote:
> > > On Thu, Mar 09, 2017 at 02:08:23PM +0100, Thomas Gleixner wrote:
> > > > On Wed, 8 Mar 2017, Paul E. McKenney wrote:
> > > > > [ 30.694013] lockdep_rcu_suspicious+0xe7/0x120
> > > > > [ 30.694013] get_work_pool+0x82/0x90
> > > > > [ 30.694013] __queue_work+0x70/0x5f0
> > > > > [ 30.694013] queue_work_on+0x33/0x70
> > > > > [ 30.694013] clear_sched_clock_stable+0x33/0x40
> > > > > [ 30.694013] early_init_intel+0xe7/0x2f0
> > > > > [ 30.694013] init_intel+0x11/0x350
> > > > > [ 30.694013] identify_cpu+0x344/0x5a0
> > > > > [ 30.694013] identify_secondary_cpu+0x18/0x80
> > > > > [ 30.694013] smp_store_cpu_info+0x39/0x40
> > > > > [ 30.694013] start_secondary+0x4e/0x100
> > > > > [ 30.694013] start_cpu+0x14/0x14
> > > > >
> > > > > Here is the relevant code from x86's smp_callin():
> > > > >
> > > > > /*
> > > > > * Save our processor parameters. Note: this information
> > > > > * is needed for clock calibration.
> > > > > */
> > > > > smp_store_cpu_info(cpuid);
> > > > >
> > > > > The problem is that smp_store_cpu_info() indirectly invokes
> > > > > schedule_work(), which wants to use RCU. But RCU isn't informed
> > > > > of the incoming CPU until the call to notify_cpu_starting(), which
> > > > > causes lockdep to complain bitterly about the use of RCU by the
> > > > > premature call to schedule_work().
> > > >
> > > > Right. And that want's to be fixed, not hacked around by silencing RCU.
> > > >
> > > > Peter????
> > >
> > > I'm thinking this is hotplug? 30 seconds after boot is far too late for
> > > SMP bringup, or you have a stupid slow machine.
> >
> > And this certainly does qualify as "shortly", thank you!
> >
> > Yes, this only happens on hotplug with lockdep enabled, specifically
> > on rcutorture scenarios TASKS01 and TREE05.
> >
> > > Because it only calls schedule_work() after SMP-init. In which case
> > > there's then two cases, either:
> > >
> > > - TSC was stable, hotplug wrecked it, TSC is now unstable, and we're
> > > screwed.
> > >
> > > - TSC was unstable, hotplug triggers and we want to mark it unstable
> > > _again_.
> > >
> > > If this is the second, the below should fix it, if its the first, I've
> > > no idea yet on how to fix that properly :/
> >
> > I have applied this patch and started tests on TREE05 and TASKS01, should
> > get results shortly.
>
> And the below patch passed light rcutorture testing, so looking good!
I'm having trouble finding this patch in linux-next, has it been pushed already?
Thanks,
Tomeu