Am seeing this at boot:
[ 0.344028] Switching to clocksource hpet
[ 0.344114]
[ 0.344115] =======================================================
[ 0.344125] [ INFO: possible circular locking dependency detected ]
[ 0.344133] 2.6.32-rc5-mmotm1101 #2
[ 0.344139] -------------------------------------------------------
[ 0.344147] swapper/1 is trying to acquire lock:
[ 0.344154] (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff8103c222>] cpu_maps_update_begin+0x12/0x14
[ 0.344174]
[ 0.344175] but task is already holding lock:
[ 0.344183] (setup_lock){+.+.+.}, at: [<ffffffff81078755>] stop_machine_create+0x12/0x9b
[ 0.344200]
[ 0.344201] which lock already depends on the new lock.
[ 0.344202]
[ 0.344212]
[ 0.344213] the existing dependency chain (in reverse order) is:
[ 0.344222]
[ 0.344223] -> #3 (setup_lock){+.+.+.}:
[ 0.344235] [<ffffffff81067898>] __lock_acquire+0xb31/0xcd9
[ 0.344246] [<ffffffff81067b0b>] lock_acquire+0xcb/0xe8
[ 0.344255] [<ffffffff814bbf41>] __mutex_lock_common+0x5c/0x5aa
[ 0.344268] [<ffffffff814bc535>] mutex_lock_nested+0x34/0x39
[ 0.344278] [<ffffffff81078755>] stop_machine_create+0x12/0x9b
[ 0.344289] [<ffffffff810787f9>] stop_machine+0x1b/0x4e
[ 0.344298] [<ffffffff8105e312>] timekeeping_notify+0x1e/0x25
[ 0.344310] [<ffffffff8105f40f>] clocksource_select+0xbd/0xc5
[ 0.344320] [<ffffffff81b1f403>] clocksource_done_booting+0x28/0x38
[ 0.344333] [<ffffffff810001f7>] do_one_initcall+0x59/0x14e
[ 0.344344] [<ffffffff81b0c66c>] kernel_init+0x15f/0x1b5
[ 0.344355] [<ffffffff8100337a>] child_rip+0xa/0x20
[ 0.344366]
[ 0.344366] -> #2 (clocksource_mutex){+.+.+.}:
[ 0.344378] [<ffffffff81067898>] __lock_acquire+0xb31/0xcd9
[ 0.344388] [<ffffffff81067b0b>] lock_acquire+0xcb/0xe8
[ 0.344398] [<ffffffff814bbf41>] __mutex_lock_common+0x5c/0x5aa
[ 0.344408] [<ffffffff814bc535>] mutex_lock_nested+0x34/0x39
[ 0.344419] [<ffffffff8105f608>] clocksource_change_rating+0x1b/0x37
[ 0.344430] [<ffffffff81008449>] mark_tsc_unstable+0x3f/0x4d
[ 0.344440] [<ffffffff814b7135>] check_tsc_sync_source+0xc8/0x10d
[ 0.344451] [<ffffffff814b6e50>] native_cpu_up+0xfb/0x13e
[ 0.344461] [<ffffffff814b82ff>] _cpu_up+0xa8/0x14a
[ 0.344471] [<ffffffff814b83fc>] cpu_up+0x5b/0x68
[ 0.344480] [<ffffffff81b0c5d2>] kernel_init+0xc5/0x1b5
[ 0.344490] [<ffffffff8100337a>] child_rip+0xa/0x20
[ 0.344499]
[ 0.344500] -> #1 (cpu_hotplug.lock){+.+.+.}:
[ 0.344512] [<ffffffff81067898>] __lock_acquire+0xb31/0xcd9
[ 0.344522] [<ffffffff81067b0b>] lock_acquire+0xcb/0xe8
[ 0.344531] [<ffffffff814bbf41>] __mutex_lock_common+0x5c/0x5aa
[ 0.344542] [<ffffffff814bc535>] mutex_lock_nested+0x34/0x39
[ 0.344552] [<ffffffff8103c25d>] cpu_hotplug_begin+0x27/0x68
[ 0.344562] [<ffffffff814b82af>] _cpu_up+0x58/0x14a
[ 0.344572] [<ffffffff814b83fc>] cpu_up+0x5b/0x68
[ 0.344581] [<ffffffff81b0c5d2>] kernel_init+0xc5/0x1b5
[ 0.344591] [<ffffffff8100337a>] child_rip+0xa/0x20
[ 0.344600]
[ 0.344601] -> #0 (cpu_add_remove_lock){+.+.+.}:
[ 0.344613] [<ffffffff81067742>] __lock_acquire+0x9db/0xcd9
[ 0.344623] [<ffffffff81067b0b>] lock_acquire+0xcb/0xe8
[ 0.344632] [<ffffffff814bbf41>] __mutex_lock_common+0x5c/0x5aa
[ 0.344643] [<ffffffff814bc535>] mutex_lock_nested+0x34/0x39
[ 0.344653] [<ffffffff8103c222>] cpu_maps_update_begin+0x12/0x14
[ 0.344664] [<ffffffff81052461>] __create_workqueue_key+0x146/0x1f4
[ 0.344675] [<ffffffff8107877d>] stop_machine_create+0x3a/0x9b
[ 0.344686] [<ffffffff810787f9>] stop_machine+0x1b/0x4e
[ 0.344696] [<ffffffff8105e312>] timekeeping_notify+0x1e/0x25
[ 0.344706] [<ffffffff8105f40f>] clocksource_select+0xbd/0xc5
[ 0.344716] [<ffffffff81b1f403>] clocksource_done_booting+0x28/0x38
[ 0.344727] [<ffffffff810001f7>] do_one_initcall+0x59/0x14e
[ 0.344737] [<ffffffff81b0c66c>] kernel_init+0x15f/0x1b5
[ 0.344747] [<ffffffff8100337a>] child_rip+0xa/0x20
[ 0.344756]
[ 0.344757] other info that might help us debug this:
[ 0.344758]
[ 0.344769] 2 locks held by swapper/1:
[ 0.344774] #0: (clocksource_mutex){+.+.+.}, at: [<ffffffff81b1f3fe>] clocksource_done_booting+0x23/0x38
[ 0.344792] #1: (setup_lock){+.+.+.}, at: [<ffffffff81078755>] stop_machine_create+0x12/0x9b
[ 0.344809]
[ 0.344810] stack backtrace:
[ 0.344820] Pid: 1, comm: swapper Not tainted 2.6.32-rc5-mmotm1101 #2
[ 0.344829] Call Trace:
[ 0.344837] [<ffffffff81066944>] print_circular_bug+0xa8/0xb7
[ 0.344847] [<ffffffff81067742>] __lock_acquire+0x9db/0xcd9
[ 0.344857] [<ffffffff8103c222>] ? cpu_maps_update_begin+0x12/0x14
[ 0.344867] [<ffffffff81067b0b>] lock_acquire+0xcb/0xe8
[ 0.344876] [<ffffffff8103c222>] ? cpu_maps_update_begin+0x12/0x14
[ 0.344887] [<ffffffff814bbf41>] __mutex_lock_common+0x5c/0x5aa
[ 0.344896] [<ffffffff8103c222>] ? cpu_maps_update_begin+0x12/0x14
[ 0.344906] [<ffffffff814bbcea>] ? mutex_unlock+0x9/0xb
[ 0.344916] [<ffffffff810d743d>] ? pcpu_alloc+0x80c/0x826
[ 0.344926] [<ffffffff8103c222>] ? cpu_maps_update_begin+0x12/0x14
[ 0.344936] [<ffffffff810d4fda>] ? kmem_cache_alloc+0xbb/0x183
[ 0.344946] [<ffffffff81066295>] ? trace_hardirqs_on_caller+0x118/0x13c
[ 0.344956] [<ffffffff814bc535>] mutex_lock_nested+0x34/0x39
[ 0.344966] [<ffffffff8103c222>] cpu_maps_update_begin+0x12/0x14
[ 0.344976] [<ffffffff81052461>] __create_workqueue_key+0x146/0x1f4
[ 0.344986] [<ffffffff8105dbc4>] ? change_clocksource+0x0/0x5b
[ 0.344996] [<ffffffff8107877d>] stop_machine_create+0x3a/0x9b
[ 0.344999] [<ffffffff810787f9>] stop_machine+0x1b/0x4e
[ 0.344999] [<ffffffff8105e312>] timekeeping_notify+0x1e/0x25
[ 0.344999] [<ffffffff8105f40f>] clocksource_select+0xbd/0xc5
[ 0.344999] [<ffffffff81b1f3db>] ? clocksource_done_booting+0x0/0x38
[ 0.344999] [<ffffffff81b1f403>] clocksource_done_booting+0x28/0x38
[ 0.344999] [<ffffffff810001f7>] do_one_initcall+0x59/0x14e
[ 0.344999] [<ffffffff81b0c66c>] kernel_init+0x15f/0x1b5
[ 0.344999] [<ffffffff8100337a>] child_rip+0xa/0x20
[ 0.344999] [<ffffffff814be2c0>] ? restore_args+0x0/0x30
[ 0.344999] [<ffffffff81b0c50d>] ? kernel_init+0x0/0x1b5
[ 0.344999] [<ffffffff81003370>] ? child_rip+0x0/0x20
[ 0.345393] Warning: could not register annotated branches stats
[ 0.349818] pnp: PnP ACPI init
[ 0.349843] ACPI: bus type pnp registered
On Thu, 5 Nov 2009 02:41:24 am [email protected] wrote:
[ 0.344147] swapper/1 is trying to acquire lock:
> [ 0.344154] (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff8103c222>] cpu_maps_update_begin+0x12/0x14
> [ 0.344174]
> [ 0.344175] but task is already holding lock:
> [ 0.344183] (setup_lock){+.+.+.}, at: [<ffffffff81078755>] stop_machine_create+0x12/0x9b
> [ 0.344200]
> [ 0.344201] which lock already depends on the new lock.
Hi Vladis!
Sigh. I always find reading these a complete mindfuck.
stop_machine_create: setup_lock then cpu_add_remove_lock
(in create_workqueue_key() -> cpu_maps_update_begin())
clocksource_done_booting: clocksource_mutex then setup_lock
(in stop_machine_create(), as above)
cpu_up: cpu_add_remove_lock then clocksource_mutex
(in mark_tsc_unstable() -> clocksource_change_rating())
AFAICT this is our circular dependency. But I'm no closer to knowing how to
solve it.
Oleg (CC'd) made workqueues use cpu_maps_update_begin() instead of the
more obvious get_online_cpus() in 3da1c84c00c7e5f. Reverting that seems like
a bad idea.
Or, if the clocksource list wasn't ordered, we could change the rating without
a lock.
Either way, the locking shark is well and truly jumped...
Rusty.
On 11/05, Rusty Russell wrote:
>
> On Thu, 5 Nov 2009 02:41:24 am [email protected] wrote:
> [ 0.344147] swapper/1 is trying to acquire lock:
> > [ 0.344154] (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff8103c222>] cpu_maps_update_begin+0x12/0x14
> > [ 0.344174]
> > [ 0.344175] but task is already holding lock:
> > [ 0.344183] (setup_lock){+.+.+.}, at: [<ffffffff81078755>] stop_machine_create+0x12/0x9b
> > [ 0.344200]
> > [ 0.344201] which lock already depends on the new lock.
>
> Hi Vladis!
>
> Sigh. I always find reading these a complete mindfuck.
>
> stop_machine_create: setup_lock then cpu_add_remove_lock
> (in create_workqueue_key() -> cpu_maps_update_begin())
> clocksource_done_booting: clocksource_mutex then setup_lock
> (in stop_machine_create(), as above)
> cpu_up: cpu_add_remove_lock then clocksource_mutex
> (in mark_tsc_unstable() -> clocksource_change_rating())
>
> AFAICT this is our circular dependency. But I'm no closer to knowing how to
> solve it.
Not sure I understand this correctly, but afaics this dependency is
even simpler:
cpu_up()->clocksource_change_rating() path takes clocksource_mutex
under CPU hotplug locks.
clocksource_done_booting()->create_workueue() path takes CPU hotplug
locks under clocksource_mutex.
> Oleg (CC'd) made workqueues use cpu_maps_update_begin() instead of the
> more obvious get_online_cpus() in 3da1c84c00c7e5f. Reverting that seems like
> a bad idea.
Even if create_workueue() used get_online_cpus() instead of cpu_add_remove_lock,
we have the same problem: _cpu_up() takes cpu_hotplug.lock which is needed for
get_online_cpus(). The dependency above becomes:
cpu_up()->clocksource_change_rating() takes clocksource_mutex under
cpu_hotplug.lock (cpu_hotplug_begin)
clocksource_done_booting()->create_workueue() takes cpu_hotplug.lock
(get_online_cpus) under clocksource_mutex
Oleg.
That is imho caused by this patch snippet from the linux-next.patch from mmotd:
diff -puN arch/x86/kernel/tsc.c~linux-next arch/x86/kernel/tsc.c
--- a/arch/x86/kernel/tsc.c~linux-next
+++ a/arch/x86/kernel/tsc.c
@@ -763,14 +763,12 @@ void mark_tsc_unstable(char *reason)
{
if (!tsc_unstable) {
tsc_unstable = 1;
- printk(KERN_INFO "Marking TSC unstable due to %s\n", reason);
+ printk("Marking TSC unstable due to %s\n", reason);
/* Change only the rating, when not registered */
if (clocksource_tsc.mult)
- clocksource_mark_unstable(&clocksource_tsc);
- else {
- clocksource_tsc.flags |= CLOCK_SOURCE_UNSTABLE;
+ clocksource_change_rating(&clocksource_tsc, 0);
+ else
clocksource_tsc.rating = 0;
- }
}
}
The upstream variant of mark_tsc_unstable uses clocksource_mark_unstable
which does not take the mutex. The reason is the clocksource mutex vs. the
cpu_add_remove_lock spinlock.
--
blue skies,
Martin
On Thu, 5 Nov 2009, Oleg Nesterov wrote:
> > Oleg (CC'd) made workqueues use cpu_maps_update_begin() instead of the
> > more obvious get_online_cpus() in 3da1c84c00c7e5f. Reverting that seems like
> > a bad idea.
>
> Even if create_workueue() used get_online_cpus() instead of cpu_add_remove_lock,
> we have the same problem: _cpu_up() takes cpu_hotplug.lock which is needed for
> get_online_cpus(). The dependency above becomes:
>
> cpu_up()->clocksource_change_rating() takes clocksource_mutex under
> cpu_hotplug.lock (cpu_hotplug_begin)
>
> clocksource_done_booting()->create_workueue() takes cpu_hotplug.lock
> (get_online_cpus) under clocksource_mutex
Hmm, we fixed all this lock madness in mainline already.
clocksource_done_booting() does not longer call
create_workqueue(). How got this code reverted in motm ?
Thanks,
tglx
On 11/05, Thomas Gleixner wrote:
>
> On Thu, 5 Nov 2009, Oleg Nesterov wrote:
> > > Oleg (CC'd) made workqueues use cpu_maps_update_begin() instead of the
> > > more obvious get_online_cpus() in 3da1c84c00c7e5f. Reverting that seems like
> > > a bad idea.
> >
> > Even if create_workueue() used get_online_cpus() instead of cpu_add_remove_lock,
> > we have the same problem: _cpu_up() takes cpu_hotplug.lock which is needed for
> > get_online_cpus(). The dependency above becomes:
> >
> > cpu_up()->clocksource_change_rating() takes clocksource_mutex under
> > cpu_hotplug.lock (cpu_hotplug_begin)
> >
> > clocksource_done_booting()->create_workueue() takes cpu_hotplug.lock
> > (get_online_cpus) under clocksource_mutex
>
> Hmm, we fixed all this lock madness in mainline already.
> clocksource_done_booting() does not longer call
> create_workqueue(). How got this code reverted in motm ?
I don't understand this code at all. But this is what I see
in Linus's tree:
clocksource_done_booting
clocksource_select
timekeeping_notify
stop_machine
stop_machine_create
OTOH, I don't see where native_cpu_up() path calls clocksource_change_rating(),
perhaps this was changed in -mm.
Oleg.
On 11/05, Oleg Nesterov wrote:
>
> On 11/05, Thomas Gleixner wrote:
> >
> > Hmm, we fixed all this lock madness in mainline already.
> > clocksource_done_booting() does not longer call
> > create_workqueue(). How got this code reverted in motm ?
>
> I don't understand this code at all. But this is what I see
> in Linus's tree:
>
> clocksource_done_booting
> clocksource_select
> timekeeping_notify
> stop_machine
> stop_machine_create
>
>
> OTOH, I don't see where native_cpu_up() path calls clocksource_change_rating(),
> perhaps this was changed in -mm.
I believe Martin knows the answer:
On 11/05, Martin Schwidefsky wrote:
>
> That is imho caused by this patch snippet from the linux-next.patch from mmotd:
>
> diff -puN arch/x86/kernel/tsc.c~linux-next arch/x86/kernel/tsc.c
> --- a/arch/x86/kernel/tsc.c~linux-next
> +++ a/arch/x86/kernel/tsc.c
> @@ -763,14 +763,12 @@ void mark_tsc_unstable(char *reason)
> {
> if (!tsc_unstable) {
> tsc_unstable = 1;
> - printk(KERN_INFO "Marking TSC unstable due to %s\n", reason);
> + printk("Marking TSC unstable due to %s\n", reason);
> /* Change only the rating, when not registered */
> if (clocksource_tsc.mult)
> - clocksource_mark_unstable(&clocksource_tsc);
> - else {
> - clocksource_tsc.flags |= CLOCK_SOURCE_UNSTABLE;
> + clocksource_change_rating(&clocksource_tsc, 0);
^^^^^^^^^^^^^^^^^^^^^^^^^
Oleg.
On Thu, 5 Nov 2009, Oleg Nesterov wrote:
> On 11/05, Oleg Nesterov wrote:
> >
> > On 11/05, Thomas Gleixner wrote:
> > >
> > > Hmm, we fixed all this lock madness in mainline already.
> > > clocksource_done_booting() does not longer call
> > > create_workqueue(). How got this code reverted in motm ?
> >
> > I don't understand this code at all. But this is what I see
> > in Linus's tree:
> >
> > clocksource_done_booting
> > clocksource_select
> > timekeeping_notify
> > stop_machine
> > stop_machine_create
> >
> >
> > OTOH, I don't see where native_cpu_up() path calls clocksource_change_rating(),
> > perhaps this was changed in -mm.
>
> I believe Martin knows the answer:
>
> On 11/05, Martin Schwidefsky wrote:
> >
> > That is imho caused by this patch snippet from the linux-next.patch from mmotd:
> >
> > diff -puN arch/x86/kernel/tsc.c~linux-next arch/x86/kernel/tsc.c
> > --- a/arch/x86/kernel/tsc.c~linux-next
> > +++ a/arch/x86/kernel/tsc.c
> > @@ -763,14 +763,12 @@ void mark_tsc_unstable(char *reason)
> > {
> > if (!tsc_unstable) {
> > tsc_unstable = 1;
> > - printk(KERN_INFO "Marking TSC unstable due to %s\n", reason);
> > + printk("Marking TSC unstable due to %s\n", reason);
> > /* Change only the rating, when not registered */
> > if (clocksource_tsc.mult)
> > - clocksource_mark_unstable(&clocksource_tsc);
> > - else {
> > - clocksource_tsc.flags |= CLOCK_SOURCE_UNSTABLE;
> > + clocksource_change_rating(&clocksource_tsc, 0);
>
> ^^^^^^^^^^^^^^^^^^^^^^^^^
Which basically reverts commit 7285dd7fd375763bfb8ab1ac9cf3f1206f503c16
Thanks,
tglx
>
> Which basically reverts commit 7285dd7fd375763bfb8ab1ac9cf3f1206f503c16
>
It is this commit in linux-next:
commit 66b1a72ceaa7da811db025855f19482f0a38e815
Author: Ingo Molnar <[email protected]>
Date: Thu Sep 10 19:39:52 2009 +0200
Revert "clocksource: Resolve cpu hotplug dead lock with TSC unstable"
This reverts commit 7285dd7fd375763bfb8ab1ac9cf3f1206f503c16.
causes crashes.
Signed-off-by: Ingo Molnar <[email protected]>
--
blue skies,
Martin
On Thu, 5 Nov 2009, Martin Schwidefsky wrote:
> >
> > Which basically reverts commit 7285dd7fd375763bfb8ab1ac9cf3f1206f503c16
> >
>
> It is this commit in linux-next:
>
> commit 66b1a72ceaa7da811db025855f19482f0a38e815
> Author: Ingo Molnar <[email protected]>
> Date: Thu Sep 10 19:39:52 2009 +0200
>
> Revert "clocksource: Resolve cpu hotplug dead lock with TSC unstable"
>
> This reverts commit 7285dd7fd375763bfb8ab1ac9cf3f1206f503c16.
>
> causes crashes.
>
> Signed-off-by: Ingo Molnar <[email protected]>
Eeek. We fixed this in mainline via:
commit f79e0258ea1f04d63db499479b5fb855dff6dbc5
Author: Martin Schwidefsky <[email protected]>
Date: Fri Sep 11 15:33:05 2009 +0200
clocksource: Resolve cpu hotplug dead lock with TSC unstable, fix crash
and
commit 54a6bc0b071c50150bc6d1da16c2cd9a963e288c
Author: Thomas Gleixner <[email protected]>
Date: Mon Sep 14 19:49:02 2009 +0200
clocksource: Delay clocksource down rating to late boot
These patches came via the tip tree right at the beginning of the
merge window several days _before_ I asked Linus to pull the timer
code for 32.
So there is something weird going on.
Thanks,
tglx
On Thu, 5 Nov 2009, Thomas Gleixner wrote:
> On Thu, 5 Nov 2009, Martin Schwidefsky wrote:
> > >
> > > Which basically reverts commit 7285dd7fd375763bfb8ab1ac9cf3f1206f503c16
> > >
> >
> > It is this commit in linux-next:
> >
> > commit 66b1a72ceaa7da811db025855f19482f0a38e815
> > Author: Ingo Molnar <[email protected]>
> > Date: Thu Sep 10 19:39:52 2009 +0200
> >
> > Revert "clocksource: Resolve cpu hotplug dead lock with TSC unstable"
> >
> > This reverts commit 7285dd7fd375763bfb8ab1ac9cf3f1206f503c16.
> >
> > causes crashes.
> >
> > Signed-off-by: Ingo Molnar <[email protected]>
>
> Eeek. We fixed this in mainline via:
>
> commit f79e0258ea1f04d63db499479b5fb855dff6dbc5
> Author: Martin Schwidefsky <[email protected]>
> Date: Fri Sep 11 15:33:05 2009 +0200
>
> clocksource: Resolve cpu hotplug dead lock with TSC unstable, fix crash
>
> and
>
> commit 54a6bc0b071c50150bc6d1da16c2cd9a963e288c
> Author: Thomas Gleixner <[email protected]>
> Date: Mon Sep 14 19:49:02 2009 +0200
>
> clocksource: Delay clocksource down rating to late boot
>
>
> These patches came via the tip tree right at the beginning of the
> merge window several days _before_ I asked Linus to pull the timer
> code for 32.
>
> So there is something weird going on.
Ok, figured out what happened.
The revert has been in next via the tip-tree for some days when
testing turned up the problem and the fixes were not ready.
So it looks like Andrew picked exactly such a linux-next snapshot to
create his baseline patch.
linux-next, tip, mainline of today all have the correct and functional
code.
Thanks,
tglx