2007-12-18 14:03:24

by Miles Lane

[permalink] [raw]
Subject: 2.6.24-rc5-mm1 -- INFO: possible circular locking dependency detected -- pm-suspend/5800 is trying to acquire lock

I have only seen this happen once, and cannot reproduce it. I'll keep
trying, though.

Dec 16 22:10:48 syntropy kernel: [ 231.718023]
=======================================================
Dec 16 22:10:48 syntropy kernel: [ 231.718025] [ INFO: possible
circular locking dependency detected ]
Dec 16 22:10:48 syntropy kernel: [ 231.718028] 2.6.24-rc5-mm1 #7
Dec 16 22:10:48 syntropy kernel: [ 231.718029]
-------------------------------------------------------
Dec 16 22:10:48 syntropy kernel: [ 231.718032] pm-suspend/5800 is
trying to acquire lock:
Dec 16 22:10:48 syntropy kernel: [ 231.718034]
(&(&priv->init_alive_start)->work){--..}, at:
[__cancel_work_timer+0x8a/0x17f] __cancel_work_timer+0x8a/0x17f
Dec 16 22:10:48 syntropy kernel: [ 231.718045]
Dec 16 22:10:48 syntropy kernel: [ 231.718046] but task is already
holding lock:
Dec 16 22:10:48 syntropy kernel: [ 231.718047] (&priv->mutex){--..},
at: [<f8a587e7>] iwl3945_pci_suspend+0x1d/0x65 [iwl3945]
Dec 16 22:10:48 syntropy kernel: [ 231.718065]
Dec 16 22:10:48 syntropy kernel: [ 231.718066] which lock already
depends on the new lock.
Dec 16 22:10:48 syntropy kernel: [ 231.718067]
Dec 16 22:10:48 syntropy kernel: [ 231.718068]
Dec 16 22:10:48 syntropy kernel: [ 231.718069] the existing
dependency chain (in reverse order) is:
Dec 16 22:10:48 syntropy kernel: [ 231.718071]
Dec 16 22:10:48 syntropy kernel: [ 231.718072] -> #1 (&priv->mutex){--..}:
Dec 16 22:10:48 syntropy kernel: [ 231.718075]
[__lock_acquire+0xa17/0xbf4] __lock_acquire+0xa17/0xbf4
Dec 16 22:10:48 syntropy kernel: [ 231.718083]
[mac80211:lock_acquire+0x76/0x1d8] lock_acquire+0x76/0x9d
Dec 16 22:10:48 syntropy kernel: [ 231.718088]
[pcmcia:mutex_lock_nested+0xf7/0xd7d] mutex_lock_nested+0xf7/0x294
Dec 16 22:10:48 syntropy kernel: [ 231.718096] [<f8a56ff7>]
iwl3945_bg_init_alive_start+0x2d/0x1d7 [iwl3945]
Dec 16 22:10:48 syntropy kernel: [ 231.718109]
[run_workqueue+0xbb/0x18b] run_workqueue+0xbb/0x18b
Dec 16 22:10:48 syntropy kernel: [ 231.718115]
[worker_thread+0xbe/0xcd] worker_thread+0xbe/0xcd
Dec 16 22:10:48 syntropy kernel: [ 231.718121]
[kthread+0x3b/0x61] kthread+0x3b/0x61
Dec 16 22:10:48 syntropy kernel: [ 231.718126]
[kernel_thread_helper+0x7/0x10] kernel_thread_helper+0x7/0x10
Dec 16 22:10:48 syntropy kernel: [ 231.718133] [<ffffffff>] 0xffffffff
Dec 16 22:10:48 syntropy kernel: [ 231.718145]
Dec 16 22:10:48 syntropy kernel: [ 231.718146] -> #0
(&(&priv->init_alive_start)->work){--..}:
Dec 16 22:10:48 syntropy kernel: [ 231.718149]
[__lock_acquire+0x93e/0xbf4] __lock_acquire+0x93e/0xbf4
Dec 16 22:10:48 syntropy kernel: [ 231.718155]
[mac80211:lock_acquire+0x76/0x1d8] lock_acquire+0x76/0x9d
Dec 16 22:10:48 syntropy kernel: [ 231.718161]
[__cancel_work_timer+0xb3/0x17f] __cancel_work_timer+0xb3/0x17f
Dec 16 22:10:48 syntropy kernel: [ 231.718167]
[iwl3945:cancel_delayed_work_sync+0xb/0x0d]
cancel_delayed_work_sync+0xb/0xd
Dec 16 22:10:48 syntropy kernel: [ 231.718173] [<f8a542cb>]
__iwl3945_down+0x51/0x310 [iwl3945]
Dec 16 22:10:48 syntropy kernel: [ 231.718184] [<f8a587f7>]
iwl3945_pci_suspend+0x2d/0x65 [iwl3945]
Dec 16 22:10:48 syntropy kernel: [ 231.718196]
[pci_device_suspend+0x1b/0x4b] pci_device_suspend+0x1b/0x4b
Dec 16 22:10:48 syntropy kernel: [ 231.718203]
[device_suspend+0x17e/0x259] device_suspend+0x17e/0x259
Dec 16 22:10:48 syntropy kernel: [ 231.718210]
[suspend_devices_and_enter+0x3d/0x138]
suspend_devices_and_enter+0x3d/0x138
Dec 16 22:10:48 syntropy kernel: [ 231.718217]
[enter_state+0x121/0x17d] enter_state+0x121/0x17d
Dec 16 22:10:48 syntropy kernel: [ 231.718222]
[state_store+0x96/0xac] state_store+0x96/0xac
Dec 16 22:10:48 syntropy kernel: [ 231.718228]
[kobj_attr_store+0x1a/0x22] kobj_attr_store+0x1a/0x22
Dec 16 22:10:48 syntropy kernel: [ 231.718234]
[sysfs_write_file+0xb8/0xe3] sysfs_write_file+0xb8/0xe3
Dec 16 22:10:48 syntropy kernel: [ 231.718242]
[vfs_write+0xa4/0x120] vfs_write+0xa4/0x120
Dec 16 22:10:48 syntropy kernel: [ 231.718248]
[sys_write+0x3b/0x60] sys_write+0x3b/0x60
Dec 16 22:10:48 syntropy kernel: [ 231.718254]
[sysenter_past_esp+0x6b/0xc1] sysenter_past_esp+0x6b/0xc1
Dec 16 22:10:48 syntropy kernel: [ 231.718259] [<ffffffff>] 0xffffffff
Dec 16 22:10:48 syntropy kernel: [ 231.718269]
Dec 16 22:10:48 syntropy kernel: [ 231.718269] other info that might
help us debug this:
Dec 16 22:10:48 syntropy kernel: [ 231.718271]
Dec 16 22:10:48 syntropy kernel: [ 231.718272] 4 locks held by pm-suspend/5800:
Dec 16 22:10:48 syntropy kernel: [ 231.718274] #0:
(&buffer->mutex){--..}, at: [sysfs_write_file+0x25/0xe3]
sysfs_write_file+0x25/0xe3
Dec 16 22:10:48 syntropy kernel: [ 231.718280] #1:
(pm_mutex){--..}, at: [enter_state+0x166/0x17d]
enter_state+0x166/0x17d
Dec 16 22:10:48 syntropy kernel: [ 231.718286] #2: (dpm_mtx){--..},
at: [device_suspend+0x2b/0x259] device_suspend+0x2b/0x259
Dec 16 22:10:48 syntropy kernel: [ 231.718291] #3:
(&priv->mutex){--..}, at: [<f8a587e7>] iwl3945_pci_suspend+0x1d/0x65
[iwl3945]
Dec 16 22:10:48 syntropy kernel: [ 231.718304]
Dec 16 22:10:48 syntropy kernel: [ 231.718304] stack backtrace:
Dec 16 22:10:48 syntropy kernel: [ 231.718307] Pid: 5800, comm:
pm-suspend Not tainted 2.6.24-rc5-mm1 #7
Dec 16 22:10:48 syntropy kernel: [ 231.718310]
[show_trace_log_lvl+0x12/0x25] show_trace_log_lvl+0x12/0x25
Dec 16 22:10:48 syntropy kernel: [ 231.718314] [show_trace+0xd/0x10]
show_trace+0xd/0x10
Dec 16 22:10:48 syntropy kernel: [ 231.718318]
[sbp2:dump_stack+0x57/0x1734] dump_stack+0x57/0x5f
Dec 16 22:10:48 syntropy kernel: [ 231.718321]
[print_circular_bug_tail+0x5b/0x66] print_circular_bug_tail+0x5b/0x66
Dec 16 22:10:48 syntropy kernel: [ 231.718325]
[__lock_acquire+0x93e/0xbf4] __lock_acquire+0x93e/0xbf4
Dec 16 22:10:48 syntropy kernel: [ 231.718329]
[mac80211:lock_acquire+0x76/0x1d8] lock_acquire+0x76/0x9d
Dec 16 22:10:48 syntropy kernel: [ 231.718332]
[__cancel_work_timer+0xb3/0x17f] __cancel_work_timer+0xb3/0x17f
Dec 16 22:10:48 syntropy kernel: [ 231.718336]
[iwl3945:cancel_delayed_work_sync+0xb/0x0d]
cancel_delayed_work_sync+0xb/0xd
Dec 16 22:10:48 syntropy kernel: [ 231.718340] [<f8a542cb>]
__iwl3945_down+0x51/0x310 [iwl3945]
Dec 16 22:10:48 syntropy kernel: [ 231.718350] [<f8a587f7>]
iwl3945_pci_suspend+0x2d/0x65 [iwl3945]
Dec 16 22:10:48 syntropy kernel: [ 231.718360]
[pci_device_suspend+0x1b/0x4b] pci_device_suspend+0x1b/0x4b
Dec 16 22:10:48 syntropy kernel: [ 231.718363]
[device_suspend+0x17e/0x259] device_suspend+0x17e/0x259
Dec 16 22:10:48 syntropy kernel: [ 231.718367]
[suspend_devices_and_enter+0x3d/0x138]
suspend_devices_and_enter+0x3d/0x138
Dec 16 22:10:48 syntropy kernel: [ 231.718371]
[enter_state+0x121/0x17d] enter_state+0x121/0x17d
Dec 16 22:10:48 syntropy kernel: [ 231.718374]
[state_store+0x96/0xac] state_store+0x96/0xac
Dec 16 22:10:48 syntropy kernel: [ 231.718378]
[kobj_attr_store+0x1a/0x22] kobj_attr_store+0x1a/0x22
Dec 16 22:10:48 syntropy kernel: [ 231.718381]
[sysfs_write_file+0xb8/0xe3] sysfs_write_file+0xb8/0xe3
Dec 16 22:10:48 syntropy kernel: [ 231.718385]
[vfs_write+0xa4/0x120] vfs_write+0xa4/0x120
Dec 16 22:10:48 syntropy kernel: [ 231.718388] [sys_write+0x3b/0x60]
sys_write+0x3b/0x60
Dec 16 22:10:48 syntropy kernel: [ 231.718391]
[sysenter_past_esp+0x6b/0xc1] sysenter_past_esp+0x6b/0xc1
Dec 16 22:10:48 syntropy kernel: [ 231.718395] =======================


2007-12-18 14:08:53

by Johannes Berg

[permalink] [raw]
Subject: Re: 2.6.24-rc5-mm1 -- INFO: possible circular locking dependency detected -- pm-suspend/5800 is trying to acquire lock


On Tue, 2007-12-18 at 09:03 -0500, Miles Lane wrote:
> I have only seen this happen once, and cannot reproduce it. I'll keep
> trying, though.
>
> Dec 16 22:10:48 syntropy kernel: [ 231.718023]
> =======================================================

Do you have a version that isn't line-wrapped before I try to unwrap it?

Thanks,
johannes


Attachments:
signature.asc (828.00 B)
This is a digitally signed message part

2007-12-18 14:41:53

by Miles Lane

[permalink] [raw]
Subject: Re: 2.6.24-rc5-mm1 -- INFO: possible circular locking dependency detected -- pm-suspend/5800 is trying to acquire lock

Sorry. GMail doesn't support sending unwrapped text, as far as I can
tell. I will send the log segment to you as an attachment. Also,
when I sent my .config inline to Andrew recently, it tripped his spam
filter. I'll attach it as well.

I just managed to reproduce this with a suspend-to-disk (my apologies
for CCing the suspend-to-ram folks).

On Dec 18, 2007 9:08 AM, Johannes Berg <[email protected]> wrote:
>
> On Tue, 2007-12-18 at 09:03 -0500, Miles Lane wrote:
> > I have only seen this happen once, and cannot reproduce it. I'll keep
> > trying, though.
> >
> > Dec 16 22:10:48 syntropy kernel: [ 231.718023]
> > =======================================================
>
> Do you have a version that isn't line-wrapped before I try to unwrap it?
>
> Thanks,
> johannes
>


Attachments:
(No filename) (793.00 B)
Locking-problem.txt (10.88 kB)
.config (61.21 kB)
Download all attachments

2007-12-18 14:57:23

by Johannes Berg

[permalink] [raw]
Subject: Re: 2.6.24-rc5-mm1 -- INFO: possible circular locking dependency detected -- pm-suspend/5800 is trying to acquire lock


> Sorry. GMail doesn't support sending unwrapped text, as far as I can
> tell. I will send the log segment to you as an attachment. Also,
> when I sent my .config inline to Andrew recently, it tripped his spam
> filter. I'll attach it as well.

Thanks. This is a bug in iwlwifi.

The problem is actually another case where my workqueue debugging with
lockdep is triggering a warning :))

Here's the thing:

iwl3945_cancel_deferred_work does

cancel_delayed_work_sync(&priv->init_alive_start);

(which is the "(&(&priv->init_alive_start)->work)" lock)

but it is called from within a locked section of
mutex_lock(&priv->mutex); (locked from iwl3945_pci_suspend)

On the other hand, the task that runs from the init_alive_start
workqueue is iwl3945_bg_init_alive_start() which will lock the same
mutex.

So the deadlock condition is that you can be in
cancel_delayed_work_sync() above while the mutex is locked, and be
waiting for iwl_3945_bg_init_alive_start() which tries to lock the
mutex.

johannes


Attachments:
signature.asc (828.00 B)
This is a digitally signed message part

2007-12-19 02:58:22

by Zhu Yi

[permalink] [raw]
Subject: Re: [ipw3945-devel] 2.6.24-rc5-mm1 -- INFO: possible circular locking dependency detected -- pm-suspend/5800 is trying to acquire lock


On Tue, 2007-12-18 at 15:57 +0100, Johannes Berg wrote:
> Thanks. This is a bug in iwlwifi.
>
> The problem is actually another case where my workqueue debugging with
> lockdep is triggering a warning :))
>
> Here's the thing:
>
> iwl3945_cancel_deferred_work does
>
> cancel_delayed_work_sync(&priv->init_alive_start);
>
> (which is the "(&(&priv->init_alive_start)->work)" lock)
>
> but it is called from within a locked section of
> mutex_lock(&priv->mutex); (locked from iwl3945_pci_suspend)
>
> On the other hand, the task that runs from the init_alive_start
> workqueue is iwl3945_bg_init_alive_start() which will lock the same
> mutex.
>
> So the deadlock condition is that you can be in
> cancel_delayed_work_sync() above while the mutex is locked, and be
> waiting for iwl_3945_bg_init_alive_start() which tries to lock the
> mutex.

Thanks for the analysis.

Miles, please try the attached patch. I'll send a patch for both 3945
and 4965 to linux-wireless later.

Thanks,
-yi


Attachments:
fix-3945-deadlock-susp.patch (2.15 kB)

2007-12-19 13:11:10

by Miles Lane

[permalink] [raw]
Subject: Re: [ipw3945-devel] 2.6.24-rc5-mm1 -- INFO: possible circular locking dependency detected -- pm-suspend/5800 is trying to acquire lock

On Dec 18, 2007 9:58 PM, Zhu Yi <[email protected]> wrote:
>
> On Tue, 2007-12-18 at 15:57 +0100, Johannes Berg wrote:
> > Thanks. This is a bug in iwlwifi.
> >
> > The problem is actually another case where my workqueue debugging with
> > lockdep is triggering a warning :))
> >
> > Here's the thing:
> >
> > iwl3945_cancel_deferred_work does
> >
> > cancel_delayed_work_sync(&priv->init_alive_start);
> >
> > (which is the "(&(&priv->init_alive_start)->work)" lock)
> >
> > but it is called from within a locked section of
> > mutex_lock(&priv->mutex); (locked from iwl3945_pci_suspend)
> >
> > On the other hand, the task that runs from the init_alive_start
> > workqueue is iwl3945_bg_init_alive_start() which will lock the same
> > mutex.
> >
> > So the deadlock condition is that you can be in
> > cancel_delayed_work_sync() above while the mutex is locked, and be
> > waiting for iwl_3945_bg_init_alive_start() which tries to lock the
> > mutex.
>
> Thanks for the analysis.
>
> Miles, please try the attached patch. I'll send a patch for both 3945
> and 4965 to linux-wireless later.

I tested it and it looks good here. Thanks!

Miles