2012-02-13 11:17:31

by Markus

Subject: Deadlock?

Hi!

I noticed some kind of deadlock: I was not able to write to the raid6,
while writing to each disk would still work.
This caused many processes to "wait" in D state, making it impossible to
unmount, cleanly reboot, sync, ...

So I enabled hung-task detection and the lockdep option.
After running 3.2.2 for about 5 days:
http://pastebin.com/gy1kaYmS

I don't know if it's a bug or nothing, or if it has anything to do with my
problem, as no hung task was detected and the raid still seems to work.

Can anybody enlighten me?! ;)

Markus
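
For readers trying to confirm the same symptom, a minimal Python sketch of listing the processes currently in D state (uninterruptible sleep) by reading /proc/<pid>/stat. The function names are hypothetical; the only thing assumed about the kernel is the stat layout described in proc(5): pid, command name in parentheses, then the state letter. It only looks at thread-group leaders, not individual threads.

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_idx = line.index("(")
    close_idx = line.rindex(")")
    pid = int(line[:open_idx].strip())
    comm = line[open_idx + 1:close_idx]
    state = line[close_idx + 1:].split()[0]
    return pid, comm, state


def list_d_state_tasks(proc_root="/proc"):
    """Collect (pid, comm) for every process in uninterruptible sleep."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat_line(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited, or the line was not parseable
        if state == "D":
            blocked.append((pid, comm))
    return blocked


if __name__ == "__main__":
    for pid, comm in list_d_state_tasks():
        print(pid, comm)
```

Running it repeatedly and seeing the same PIDs stay in the list is a hint that they are stuck rather than just briefly waiting on I/O.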


2012-02-13 17:37:41

by Jack Stone

Subject: Re: Deadlock?

Adding CCs

On 02/13/2012 11:17 AM, Markus wrote:
> Hi!
>
> I noticed some kind of deadlock: I was not able to write to the raid6,
> while writing to each disk would still work.
> This caused many processes to "wait" in D state, making it impossible to
> unmount, cleanly reboot, sync, ...
>
> So I enabled hung-task detection and the lockdep option.
> After running 3.2.2 for about 5 days:
> http://pastebin.com/gy1kaYmS
>
> I don't know if it's a bug or nothing, or if it has anything to do with my
> problem, as no hung task was detected and the raid still seems to work.
>
> Can anybody enlighten me?! ;)
>
> Markus


[493061.240015] [ INFO: RECLAIM_FS-safe -> RECLAIM_FS-unsafe lock order detected ]
[493061.240015] 3.2.2 #1
[493061.240015] ------------------------------------------------------
[493061.240015] linuxdcpp/11649 [HC0[0]:SC0[0]:HE1:SE1] is trying to acquire:
[493061.240015] (&tty->atomic_write_lock){+.+.+.}, at: [<ffffffff811bf40a>] tty_write_message+0x2a/0x80
[493061.240015]
[493061.240015] and this task is already holding:
[493061.240015] (&s->s_dquot.dqptr_sem){++++-.}, at: [<ffffffff8111888d>] __dquot_alloc_space+0x9d/0x230
[493061.240015] which would create a new lock dependency:
[493061.240015] (&s->s_dquot.dqptr_sem){++++-.} -> (&tty->atomic_write_lock){+.+.+.}
[493061.240015]
[493061.240015] but this new dependency connects a RECLAIM_FS-irq-safe lock:
[493061.240015] (&s->s_dquot.dqptr_sem){++++-.}
[493061.240015] ... which became RECLAIM_FS-irq-safe at:
[493061.240015] [<ffffffff81069914>] __lock_acquire+0x6e4/0x1eb0
[493061.240015] [<ffffffff8106b6ad>] lock_acquire+0x8d/0xb0
[493061.240015] [<ffffffff81375e4c>] down_write+0x2c/0x50
[493061.240015] [<ffffffff81119c97>] __dquot_drop+0x27/0x80
[493061.240015] [<ffffffff81119d15>] dquot_drop+0x25/0x30
[493061.240015] [<ffffffff81138278>] ext3_evict_inode+0x158/0x260
[493061.240015] [<ffffffff810e0167>] evict+0xa7/0x1b0
[493061.240015] [<ffffffff810e0807>] dispose_list+0x47/0x60
[493061.240015] [<ffffffff810e1222>] prune_icache_sb+0x192/0x370
[493061.240015] [<ffffffff810c9e03>] prune_super+0x153/0x1b0
[493061.240015] [<ffffffff8109c355>] shrink_slab+0x135/0x1f0
[493061.240015] [<ffffffff8109e932>] kswapd+0x632/0x8e0
[493061.240015] [<ffffffff81055ce6>] kthread+0x96/0xa0
[493061.240015] [<ffffffff81378eb4>] kernel_thread_helper+0x4/0x10
[493061.240015]
[493061.240015] to a RECLAIM_FS-irq-unsafe lock:
[493061.240015] (&tty->atomic_write_lock){+.+.+.}
[493061.240015] ... which became RECLAIM_FS-irq-unsafe at:
[493061.240015] ... [<ffffffff810681f6>] mark_held_locks+0x76/0x150
[493061.240015] [<ffffffff81068ab0>] lockdep_trace_alloc+0xb0/0xe0
[493061.240015] [<ffffffff810c2a78>] __kmalloc+0x78/0x160
[493061.240015] [<ffffffff811bf126>] tty_write+0x136/0x270
[493061.240015] [<ffffffff811bf30d>] redirected_tty_write+0xad/0xb0
[493061.240015] [<ffffffff810c6b76>] vfs_write+0xc6/0x180
[493061.240015] [<ffffffff810c6e8c>] sys_write+0x4c/0x90
[493061.240015] [<ffffffff81377b7b>] system_call_fastpath+0x16/0x1b
[493061.240015]
[493061.240015] other info that might help us debug this:
[493061.240015]
[493061.240015] Possible interrupt unsafe locking scenario:
[493061.240015]
[493061.240015]        CPU0                    CPU1
[493061.240015]        ----                    ----
[493061.240015]   lock(&tty->atomic_write_lock);
[493061.240015]                                local_irq_disable();
[493061.240015]                                lock(&s->s_dquot.dqptr_sem);
[493061.240015]                                lock(&tty->atomic_write_lock);
[493061.240015]   <Interrupt>
[493061.240015]     lock(&s->s_dquot.dqptr_sem);
[493061.240015]
[493061.240015] *** DEADLOCK ***
[493061.240015]
[493061.240015] 4 locks held by linuxdcpp/11649:
[493063.862237] #0: (&sb->s_type->i_mutex_key#3){+.+.+.}, at: [<ffffffff81090947>] generic_file_aio_write+0x57/0xe0
[493063.862237] #1: (jbd_handle){+.+.-.}, at: [<ffffffff81152098>] start_this_handle+0x358/0x450
[493063.862237] #2: (&ei->truncate_mutex){+.+...}, at: [<ffffffff811367c5>] ext3_get_blocks_handle+0xe5/0xbf0
[493063.862237] #3: (&s->s_dquot.dqptr_sem){++++-.}, at: [<ffffffff8111888d>] __dquot_alloc_space+0x9d/0x230
[493063.862237]
[493063.862237] the dependencies between RECLAIM_FS-irq-safe lock and the holding lock:
[493063.862237] -> (&s->s_dquot.dqptr_sem){++++-.} ops: 187112656 {
[493063.862237] HARDIRQ-ON-W at:
[493063.862237] [<ffffffff810699c5>] __lock_acquire+0x795/0x1eb0
[493063.862237] [<ffffffff8106b6ad>] lock_acquire+0x8d/0xb0
[493063.862237] [<ffffffff81375e4c>] down_write+0x2c/0x50
[493063.862237] [<ffffffff81119c97>] __dquot_drop+0x27/0x80
[493063.862237] [<ffffffff8111ac23>] vfs_load_quota_inode+0x4c3/0x510
[493063.862237] [<ffffffff8111aed0>] dquot_quota_on+0x70/0x80
[493063.862237] [<ffffffff8113dadb>] ext3_quota_on+0xfb/0x130
[493063.862237] [<ffffffff8111c726>] do_quotactl+0x536/0x560
[493063.862237] [<ffffffff8111c81e>] sys_quotactl+0xce/0x1a0
[493063.862237] [<ffffffff81377b7b>] system_call_fastpath+0x16/0x1b
[493063.862237] HARDIRQ-ON-R at:
[493063.862237] [<ffffffff8106988d>] __lock_acquire+0x65d/0x1eb0
[493063.862237] [<ffffffff8106b6ad>] lock_acquire+0x8d/0xb0
[493063.862237] [<ffffffff81375e9f>] down_read+0x2f/0x50
[493063.862237] [<ffffffff8111bac3>] dquot_alloc_inode+0x43/0x140
[493063.862237] [<ffffffff8113354b>] ext3_new_inode+0x89b/0x980
[493063.862237] [<ffffffff8113b3fd>] ext3_create+0x8d/0x110
[493063.862237] [<ffffffff810d26fc>] vfs_create+0xac/0xe0
[493063.862237] [<ffffffff810d4149>] do_last.clone.30+0x469/0x7d0
[493063.862237] [<ffffffff810d5b50>] path_openat+0xd0/0x410
[493063.862237] [<ffffffff810d5fa4>] do_filp_open+0x44/0xa0
[493063.862237] [<ffffffff810c5f9c>] do_sys_open+0xfc/0x1e0
[493063.862237] [<ffffffff810c609b>] sys_open+0x1b/0x20
[493063.862237] [<ffffffff81377b7b>] system_call_fastpath+0x16/0x1b
[493063.862237] SOFTIRQ-ON-W at:
[493063.862237] [<ffffffff810699ff>] __lock_acquire+0x7cf/0x1eb0
[493063.862237] [<ffffffff8106b6ad>] lock_acquire+0x8d/0xb0
[493063.862237] [<ffffffff81375e4c>] down_write+0x2c/0x50
[493063.862237] [<ffffffff81119c97>] __dquot_drop+0x27/0x80
[493063.862237] [<ffffffff8111ac23>] vfs_load_quota_inode+0x4c3/0x510
[493063.862237] [<ffffffff8111aed0>] dquot_quota_on+0x70/0x80
[493063.862237] [<ffffffff8113dadb>] ext3_quota_on+0xfb/0x130
[493063.862237] [<ffffffff8111c726>] do_quotactl+0x536/0x560
[493063.862237] [<ffffffff8111c81e>] sys_quotactl+0xce/0x1a0
[493063.862237] [<ffffffff81377b7b>] system_call_fastpath+0x16/0x1b
[493063.862237] SOFTIRQ-ON-R at:
[493063.862237] [<ffffffff810699ff>] __lock_acquire+0x7cf/0x1eb0
[493063.862237] [<ffffffff8106b6ad>] lock_acquire+0x8d/0xb0
[493063.862237] [<ffffffff81375e9f>] down_read+0x2f/0x50
[493063.862237] [<ffffffff8111bac3>] dquot_alloc_inode+0x43/0x140
[493063.862237] [<ffffffff8113354b>] ext3_new_inode+0x89b/0x980
[493063.862237] [<ffffffff8113b3fd>] ext3_create+0x8d/0x110
[493063.862237] [<ffffffff810d26fc>] vfs_create+0xac/0xe0
[493063.862237] [<ffffffff810d4149>] do_last.clone.30+0x469/0x7d0
[493063.862237] [<ffffffff810d5b50>] path_openat+0xd0/0x410
[493063.862237] [<ffffffff810d5fa4>] do_filp_open+0x44/0xa0
[493063.862237] [<ffffffff810c5f9c>] do_sys_open+0xfc/0x1e0
[493063.862237] [<ffffffff810c609b>] sys_open+0x1b/0x20
[493063.862237] [<ffffffff81377b7b>] system_call_fastpath+0x16/0x1b
[493063.862237] IN-RECLAIM_FS-W at:
[493063.862237] [<ffffffff81069914>] __lock_acquire+0x6e4/0x1eb0
[493063.862237] [<ffffffff8106b6ad>] lock_acquire+0x8d/0xb0
[493063.862237] [<ffffffff81375e4c>] down_write+0x2c/0x50
[493063.862237] [<ffffffff81119c97>] __dquot_drop+0x27/0x80
[493063.862237] [<ffffffff81119d15>] dquot_drop+0x25/0x30
[493063.862237] [<ffffffff81138278>] ext3_evict_inode+0x158/0x260
[493063.862237] [<ffffffff810e0167>] evict+0xa7/0x1b0
[493063.862237] [<ffffffff810e0807>] dispose_list+0x47/0x60
[493063.862237] [<ffffffff810e1222>] prune_icache_sb+0x192/0x370
[493063.862237] [<ffffffff810c9e03>] prune_super+0x153/0x1b0
[493063.862237] [<ffffffff8109c355>] shrink_slab+0x135/0x1f0
[493063.862237] [<ffffffff8109e932>] kswapd+0x632/0x8e0
[493063.862237] [<ffffffff81055ce6>] kthread+0x96/0xa0
[493063.862237] [<ffffffff81378eb4>] kernel_thread_helper+0x4/0x10
[493063.862237] INITIAL USE at:
[493063.862237] [<ffffffff81069619>] __lock_acquire+0x3e9/0x1eb0
[493063.862237] [<ffffffff8106b6ad>] lock_acquire+0x8d/0xb0
[493063.862237] [<ffffffff81375e4c>] down_write+0x2c/0x50
[493063.862237] [<ffffffff81119c97>] __dquot_drop+0x27/0x80
[493063.862237] [<ffffffff8111ac23>] vfs_load_quota_inode+0x4c3/0x510
[493063.862237] [<ffffffff8111aed0>] dquot_quota_on+0x70/0x80
[493063.862237] [<ffffffff8113dadb>] ext3_quota_on+0xfb/0x130
[493063.862237] [<ffffffff8111c726>] do_quotactl+0x536/0x560
[493063.862237] [<ffffffff8111c81e>] sys_quotactl+0xce/0x1a0
[493063.862237] [<ffffffff81377b7b>] system_call_fastpath+0x16/0x1b
[493063.862237] }
[493063.862237] ... key at: [<ffffffff81f48938>] __key.28495+0x0/0x8
[493063.862237] ... acquired at:
[493063.862237] [<ffffffff810677b0>] check_irq_usage+0x60/0xf0
[493063.862237] [<ffffffff8106a244>] __lock_acquire+0x1014/0x1eb0
[493063.862237] [<ffffffff8106b6ad>] lock_acquire+0x8d/0xb0
[493063.862237] [<ffffffff81374d6b>] mutex_lock_nested+0x3b/0x300
[493063.862237] [<ffffffff811bf40a>] tty_write_message+0x2a/0x80
[493063.862237] [<ffffffff81118377>] flush_warnings+0xe7/0x200
[493063.862237] [<ffffffff8111898a>] __dquot_alloc_space+0x19a/0x230
[493063.862237] [<ffffffff81130f4c>] ext3_new_blocks+0x6c/0x680
[493063.862237] [<ffffffff811369d5>] ext3_get_blocks_handle+0x2f5/0xbf0
[493063.862237] [<ffffffff8113738f>] ext3_get_block+0xbf/0x120
[493063.862237] [<ffffffff810f44bb>] __block_write_begin+0x1db/0x540
[493063.862237] [<ffffffff8113637f>] ext3_write_begin+0xaf/0x200
[493063.862237] [<ffffffff8108ea40>] generic_file_buffered_write+0x110/0x280
[493063.862237] [<ffffffff810906e1>] __generic_file_aio_write+0x221/0x430
[493063.862237] [<ffffffff81090963>] generic_file_aio_write+0x73/0xe0
[493063.862237] [<ffffffff810c62ca>] do_sync_write+0xda/0x120
[493063.862237] [<ffffffff810c6b76>] vfs_write+0xc6/0x180
[493063.862237] [<ffffffff810c6e8c>] sys_write+0x4c/0x90
[493063.862237] [<ffffffff81377b7b>] system_call_fastpath+0x16/0x1b
[493063.862237]
[493063.862237]
[493063.862237] the dependencies between the lock to be acquired and RECLAIM_FS-irq-unsafe lock:
[493063.862237] -> (&tty->atomic_write_lock){+.+.+.} ops: 270462830 {
[493063.862237] HARDIRQ-ON-W at:
[493063.862237] [<ffffffff810681f6>] mark_held_locks+0x76/0x150
[493063.862237] [<ffffffff8106837d>] trace_hardirqs_on_caller+0xad/0x1e0
[493063.862237] [<ffffffff810684bd>] trace_hardirqs_on+0xd/0x10
[493063.862237] [<ffffffff81374cbd>] mutex_trylock+0xfd/0x170
[493063.862237] [<ffffffff811befb3>] tty_write_lock+0x23/0x60
[493063.862237] [<ffffffff811bf0c3>] tty_write+0xd3/0x270
[493063.862237] [<ffffffff811bf30d>] redirected_tty_write+0xad/0xb0
[493063.862237] [<ffffffff810c6b76>] vfs_write+0xc6/0x180
[493063.862237] [<ffffffff810c6e8c>] sys_write+0x4c/0x90
[493063.862237] [<ffffffff81377b7b>] system_call_fastpath+0x16/0x1b
[493063.862237] SOFTIRQ-ON-W at:
[493063.862237] [<ffffffff810681f6>] mark_held_locks+0x76/0x150
[493063.862237] [<ffffffff810683ed>] trace_hardirqs_on_caller+0x11d/0x1e0
[493063.862237] [<ffffffff810684bd>] trace_hardirqs_on+0xd/0x10
[493063.862237] [<ffffffff81374cbd>] mutex_trylock+0xfd/0x170
[493063.862237] [<ffffffff811befb3>] tty_write_lock+0x23/0x60
[493063.862237] [<ffffffff811bf0c3>] tty_write+0xd3/0x270
[493063.862237] [<ffffffff811bf30d>] redirected_tty_write+0xad/0xb0
[493063.862237] [<ffffffff810c6b76>] vfs_write+0xc6/0x180
[493063.862237] [<ffffffff810c6e8c>] sys_write+0x4c/0x90
[493063.862237] [<ffffffff81377b7b>] system_call_fastpath+0x16/0x1b
[493063.862237] RECLAIM_FS-ON-W at:
[493063.862237] [<ffffffff810681f6>] mark_held_locks+0x76/0x150
[493063.862237] [<ffffffff81068ab0>] lockdep_trace_alloc+0xb0/0xe0
[493063.862237] [<ffffffff810c2a78>] __kmalloc+0x78/0x160
[493063.862237] [<ffffffff811bf126>] tty_write+0x136/0x270
[493063.862237] [<ffffffff811bf30d>] redirected_tty_write+0xad/0xb0
[493063.862237] [<ffffffff810c6b76>] vfs_write+0xc6/0x180
[493063.862237] [<ffffffff810c6e8c>] sys_write+0x4c/0x90
[493063.862237] [<ffffffff81377b7b>] system_call_fastpath+0x16/0x1b
[493063.862237] INITIAL USE at:
[493063.862237] [<ffffffff81069619>] __lock_acquire+0x3e9/0x1eb0
[493063.862237] [<ffffffff8106b6ad>] lock_acquire+0x8d/0xb0
[493063.862237] [<ffffffff81374c6d>] mutex_trylock+0xad/0x170
[493063.862237] [<ffffffff811befb3>] tty_write_lock+0x23/0x60
[493063.862237] [<ffffffff811bf0c3>] tty_write+0xd3/0x270
[493063.862237] [<ffffffff811bf30d>] redirected_tty_write+0xad/0xb0
[493063.862237] [<ffffffff810c6b76>] vfs_write+0xc6/0x180
[493063.862237] [<ffffffff810c6e8c>] sys_write+0x4c/0x90
[493063.862237] [<ffffffff81377b7b>] system_call_fastpath+0x16/0x1b
[493063.862237] }
[493063.862237] ... key at: [<ffffffff81f549b8>] __key.27415+0x0/0x8
[493063.862237] ... acquired at:
[493063.862237] [<ffffffff810677b0>] check_irq_usage+0x60/0xf0
[493063.862237] [<ffffffff8106a244>] __lock_acquire+0x1014/0x1eb0
[493063.862237] [<ffffffff8106b6ad>] lock_acquire+0x8d/0xb0
[493063.862237] [<ffffffff81374d6b>] mutex_lock_nested+0x3b/0x300
[493063.862237] [<ffffffff811bf40a>] tty_write_message+0x2a/0x80
[493063.862237] [<ffffffff81118377>] flush_warnings+0xe7/0x200
[493063.862237] [<ffffffff8111898a>] __dquot_alloc_space+0x19a/0x230
[493063.862237] [<ffffffff81130f4c>] ext3_new_blocks+0x6c/0x680
[493063.862237] [<ffffffff811369d5>] ext3_get_blocks_handle+0x2f5/0xbf0
[493063.862237] [<ffffffff8113738f>] ext3_get_block+0xbf/0x120
[493063.862237] [<ffffffff810f44bb>] __block_write_begin+0x1db/0x540
[493063.862237] [<ffffffff8113637f>] ext3_write_begin+0xaf/0x200
[493063.862237] [<ffffffff8108ea40>] generic_file_buffered_write+0x110/0x280
[493063.862237] [<ffffffff810906e1>] __generic_file_aio_write+0x221/0x430
[493063.862237] [<ffffffff81090963>] generic_file_aio_write+0x73/0xe0
[493063.862237] [<ffffffff810c62ca>] do_sync_write+0xda/0x120
[493063.862237] [<ffffffff810c6b76>] vfs_write+0xc6/0x180
[493063.862237] [<ffffffff810c6e8c>] sys_write+0x4c/0x90
[493063.862237] [<ffffffff81377b7b>] system_call_fastpath+0x16/0x1b
[493063.862237]
[493063.862237]
[493063.862237] stack backtrace:
[493063.862237] Pid: 11649, comm: linuxdcpp Not tainted 3.2.2 #1
[493063.862237] Call Trace:
[493063.862237] [<ffffffff8106774a>] check_usage+0x4aa/0x4b0
[493063.862237] [<ffffffff810050f4>] ? print_context_stack+0x74/0xd0
[493063.862237] [<ffffffff810677b0>] check_irq_usage+0x60/0xf0
[493063.862237] [<ffffffff8106a244>] __lock_acquire+0x1014/0x1eb0
[493063.862237] [<ffffffff8100e6fa>] ? save_stack_trace+0x2a/0x50
[493063.862237] [<ffffffff8106b6ad>] lock_acquire+0x8d/0xb0
[493063.862237] [<ffffffff811bf40a>] ? tty_write_message+0x2a/0x80
[493063.862237] [<ffffffff81374d6b>] mutex_lock_nested+0x3b/0x300
[493063.862237] [<ffffffff811bf40a>] ? tty_write_message+0x2a/0x80
[493063.862237] [<ffffffff811bf40a>] tty_write_message+0x2a/0x80
[493063.862237] [<ffffffff81118377>] flush_warnings+0xe7/0x200
[493063.862237] [<ffffffff8111898a>] __dquot_alloc_space+0x19a/0x230
[493063.862237] [<ffffffff81130f4c>] ext3_new_blocks+0x6c/0x680
[493063.862237] [<ffffffff81374f8f>] ? mutex_lock_nested+0x25f/0x300
[493063.862237] [<ffffffff81374f7b>] ? mutex_lock_nested+0x24b/0x300
[493063.862237] [<ffffffff811369d5>] ext3_get_blocks_handle+0x2f5/0xbf0
[493063.862237] [<ffffffff8106973a>] ? __lock_acquire+0x50a/0x1eb0
[493063.862237] [<ffffffff8106973a>] ? __lock_acquire+0x50a/0x1eb0
[493063.862237] [<ffffffff810f2c2b>] ? create_empty_buffers+0x4b/0xd0
[493063.862237] [<ffffffff8113738f>] ext3_get_block+0xbf/0x120
[493063.862237] [<ffffffff810f44bb>] __block_write_begin+0x1db/0x540
[493063.862237] [<ffffffff811372d0>] ? ext3_get_blocks_handle+0xbf0/0xbf0
[493063.862237] [<ffffffff8113637f>] ext3_write_begin+0xaf/0x200
[493063.862237] [<ffffffff8108ea40>] generic_file_buffered_write+0x110/0x280
[493063.862237] [<ffffffff8103f221>] ? current_fs_time+0x11/0x50
[493063.862237] [<ffffffff810906e1>] __generic_file_aio_write+0x221/0x430
[493063.862237] [<ffffffff81090963>] generic_file_aio_write+0x73/0xe0
[493063.862237] [<ffffffff810c62ca>] do_sync_write+0xda/0x120
[493063.862237] [<ffffffff810a850b>] ? might_fault+0x3b/0x90
[493063.862237] [<ffffffff81167287>] ? security_file_permission+0x27/0xb0
[493063.862237] [<ffffffff810c6b76>] vfs_write+0xc6/0x180
[493063.862237] [<ffffffff810c6e8c>] sys_write+0x4c/0x90
[493063.862237] [<ffffffff81377b7b>] system_call_fastpath+0x16/0x1b
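
One way to read the report above is as a dependency graph: tty_write() allocates memory while holding atomic_write_lock (and that allocation may enter filesystem reclaim), kswapd takes dqptr_sem during reclaim via __dquot_drop(), and the quota warning path (flush_warnings -> tty_write_message) takes atomic_write_lock while holding dqptr_sem. A toy Python model of that reasoning follows. It is not how lockdep is implemented; the edge list is a simplification of the traces, and "fs_reclaim" is a made-up pseudo-lock standing for "may enter filesystem reclaim".

```python
def find_cycle(edges):
    """Return one cycle in the directed graph as a list of nodes, or None."""
    graph = {}
    for held, wanted in edges:
        graph.setdefault(held, []).append(wanted)
        graph.setdefault(wanted, [])

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    colour = {node: WHITE for node in graph}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for nxt in graph[node]:
            if colour[nxt] == GREY:
                # Back edge: nxt is already on the current path.
                return path[path.index(nxt):] + [nxt]
            if colour[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        colour[node] = BLACK
        path.pop()
        return None

    for node in graph:
        if colour[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# (held, wanted) pairs, simplified from the lockdep traces in the report.
EDGES = [
    ("tty->atomic_write_lock", "fs_reclaim"),      # kmalloc in tty_write
    ("fs_reclaim", "s_dquot.dqptr_sem"),           # kswapd -> __dquot_drop
    ("s_dquot.dqptr_sem", "tty->atomic_write_lock"),  # flush_warnings
]

if __name__ == "__main__":
    print(find_cycle(EDGES))
```

Without the third edge there is no cycle, which matches the report's wording that it is the new dqptr_sem -> atomic_write_lock dependency that lockdep objects to.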

2012-02-14 03:16:52

by Paul Hartman

Subject: Re: Deadlock?

On Mon, Feb 13, 2012 at 5:17 AM, Markus wrote:
> Hi!
>
> I noticed some kind of deadlock: I was not able to write to the raid6,
> while writing to each disk would still work.
> This caused many processes to "wait" in D state, making it impossible to
> unmount, cleanly reboot, sync, ...

I'm using RAID5 and kernel 3.2.2 and encountered a hang when running
e4defrag on some large files today. When I tried to browse to that
directory in Midnight Commander, it also hung in D state. After a
magic SysRq REISUB, the raid is resyncing.

I don't know if it's related, or useful, but here's my dmesg output:

[1555304.508443] INFO: task e4defrag:9929 blocked for more than 120 seconds.
[1555304.508446] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1555304.508449] e4defrag D 000000015ccdc212 0 9929 21774 0x00000000
[1555304.508454] ffff880309b93d38 0000000000000082 ffff880309b93cc8 ffffffff00000000
[1555304.508458] ffff8801009d0000 0000000000010980 ffff880309b93fd8 ffff880309b92000
[1555304.508463] 0000000000010980 0000000000004000 ffff880309b93fd8 0000000000010980
[1555304.508467] Call Trace:
[1555304.508474] [<ffffffff8103a74d>] ? sub_preempt_count+0x9d/0xd0
[1555304.508480] [<ffffffff81494f02>] ? _raw_spin_unlock_irqrestore+0x12/0x40
[1555304.508484] [<ffffffff810319ee>] ? __wake_up+0x4e/0x70
[1555304.508487] [<ffffffff8103a671>] ? get_parent_ip+0x11/0x50
[1555304.508490] [<ffffffff8103a671>] ? get_parent_ip+0x11/0x50
[1555304.508493] [<ffffffff8103a74d>] ? sub_preempt_count+0x9d/0xd0
[1555304.508496] [<ffffffff8103a671>] ? get_parent_ip+0x11/0x50
[1555304.508500] [<ffffffff8103a671>] ? get_parent_ip+0x11/0x50
[1555304.508504] [<ffffffff810a77d0>] ? __lock_page+0x70/0x70
[1555304.508507] [<ffffffff81492d4a>] schedule+0x3a/0x50
[1555304.508510] [<ffffffff81492de7>] io_schedule+0x87/0xd0
[1555304.508513] [<ffffffff810a77d9>] sleep_on_page+0x9/0x10
[1555304.508516] [<ffffffff814934d7>] __wait_on_bit+0x57/0x80
[1555304.508520] [<ffffffff810a79fe>] wait_on_page_bit+0x6e/0x80
[1555304.508524] [<ffffffff8105fb80>] ? autoremove_wake_function+0x40/0x40
[1555304.508528] [<ffffffff810b2310>] ? pagevec_lookup_tag+0x20/0x30
[1555304.508531] [<ffffffff810a80ca>] filemap_fdatawait_range+0xfa/0x190
[1555304.508537] [<ffffffff81116e76>] sys_sync_file_range+0x176/0x180
[1555304.508541] [<ffffffff81495b7b>] system_call_fastpath+0x16/0x1b
[1555424.340835] INFO: task e4defrag:9929 blocked for more than 120 seconds.
[1555424.340838] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1555424.340841] e4defrag D 000000015ccdc212 0 9929 21774 0x00000000
[1555424.340845] ffff880309b93d38 0000000000000082 ffff880309b93cc8 ffffffff00000000
[1555424.340850] ffff8801009d0000 0000000000010980 ffff880309b93fd8 ffff880309b92000
[1555424.340854] 0000000000010980 0000000000004000 ffff880309b93fd8 0000000000010980
[1555424.340859] Call Trace:
[1555424.340866] [<ffffffff8103a74d>] ? sub_preempt_count+0x9d/0xd0
[1555424.340872] [<ffffffff81494f02>] ? _raw_spin_unlock_irqrestore+0x12/0x40
[1555424.340875] [<ffffffff810319ee>] ? __wake_up+0x4e/0x70
[1555424.340878] [<ffffffff8103a671>] ? get_parent_ip+0x11/0x50
[1555424.340881] [<ffffffff8103a671>] ? get_parent_ip+0x11/0x50
[1555424.340885] [<ffffffff8103a74d>] ? sub_preempt_count+0x9d/0xd0
[1555424.340888] [<ffffffff8103a671>] ? get_parent_ip+0x11/0x50
[1555424.340891] [<ffffffff8103a671>] ? get_parent_ip+0x11/0x50
[1555424.340895] [<ffffffff810a77d0>] ? __lock_page+0x70/0x70
[1555424.340899] [<ffffffff81492d4a>] schedule+0x3a/0x50
[1555424.340902] [<ffffffff81492de7>] io_schedule+0x87/0xd0
[1555424.340905] [<ffffffff810a77d9>] sleep_on_page+0x9/0x10
[1555424.340908] [<ffffffff814934d7>] __wait_on_bit+0x57/0x80
[1555424.340911] [<ffffffff810a79fe>] wait_on_page_bit+0x6e/0x80
[1555424.340916] [<ffffffff8105fb80>] ? autoremove_wake_function+0x40/0x40
[1555424.340919] [<ffffffff810b2310>] ? pagevec_lookup_tag+0x20/0x30
[1555424.340923] [<ffffffff810a80ca>] filemap_fdatawait_range+0xfa/0x190
[1555424.340928] [<ffffffff81116e76>] sys_sync_file_range+0x176/0x180
[1555424.340932] [<ffffffff81495b7b>] system_call_fastpath+0x16/0x1b
[1555544.173120] INFO: task jbd2/md2-8:2842 blocked for more than 120 seconds.
[1555544.173123] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1555544.173126] jbd2/md2-8 D 000000015cd20110 0 2842 2 0x00000000
[1555544.173131] ffff88030fba1ce0 0000000000000046 0000000000000002 0000000100000000
[1555544.173136] ffff88030f9c1650 0000000000010980 ffff88030fba1fd8 ffff88030fba0000
[1555544.173140] 0000000000010980 0000000000004000 ffff88030fba1fd8 0000000000010980
[1555544.173144] Call Trace:
[1555544.173152] [<ffffffff8100a1b8>] ? native_sched_clock+0x28/0x90
[1555544.173158] [<ffffffff81492d4a>] schedule+0x3a/0x50
[1555544.173163] [<ffffffff811932ac>] jbd2_journal_commit_transaction+0x17c/0x13a0
[1555544.173166] [<ffffffff8149259e>] ? __schedule+0x46e/0xa90
[1555544.173171] [<ffffffff8103a671>] ? get_parent_ip+0x11/0x50
[1555544.173175] [<ffffffff8103a74d>] ? sub_preempt_count+0x9d/0xd0
[1555544.173179] [<ffffffff8105fb40>] ? wake_up_bit+0x40/0x40
[1555544.173183] [<ffffffff81197472>] kjournald2+0xb2/0x210
[1555544.173187] [<ffffffff8105fb40>] ? wake_up_bit+0x40/0x40
[1555544.173190] [<ffffffff811973c0>] ? commit_timeout+0x10/0x10
[1555544.173194] [<ffffffff8105f666>] kthread+0x96/0xa0
[1555544.173198] [<ffffffff81496f74>] kernel_thread_helper+0x4/0x10
[1555544.173204] [<ffffffff8105f5d0>] ? kthread_worker_fn+0x190/0x190
[1555544.173207] [<ffffffff81496f70>] ? gs_change+0xb/0xb
[1555544.173231] INFO: task e4defrag:9929 blocked for more than 120 seconds.
[1555544.173233] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1555544.173235] e4defrag D 000000015ccdc212 0 9929 21774 0x00000000
[1555544.173238] ffff880309b93d38 0000000000000082 ffff880309b93cc8 ffffffff00000000
[1555544.173242] ffff8801009d0000 0000000000010980 ffff880309b93fd8 ffff880309b92000
[1555544.173245] 0000000000010980 0000000000004000 ffff880309b93fd8 0000000000010980
[1555544.173249] Call Trace:
[1555544.173252] [<ffffffff8103a74d>] ? sub_preempt_count+0x9d/0xd0
[1555544.173256] [<ffffffff81494f02>] ? _raw_spin_unlock_irqrestore+0x12/0x40
[1555544.173259] [<ffffffff810319ee>] ? __wake_up+0x4e/0x70
[1555544.173261] [<ffffffff8103a671>] ? get_parent_ip+0x11/0x50
[1555544.173264] [<ffffffff8103a671>] ? get_parent_ip+0x11/0x50
[1555544.173267] [<ffffffff8103a74d>] ? sub_preempt_count+0x9d/0xd0
[1555544.173270] [<ffffffff8103a671>] ? get_parent_ip+0x11/0x50
[1555544.173272] [<ffffffff8103a671>] ? get_parent_ip+0x11/0x50
[1555544.173276] [<ffffffff810a77d0>] ? __lock_page+0x70/0x70
[1555544.173279] [<ffffffff81492d4a>] schedule+0x3a/0x50
[1555544.173281] [<ffffffff81492de7>] io_schedule+0x87/0xd0
[1555544.173284] [<ffffffff810a77d9>] sleep_on_page+0x9/0x10
[1555544.173287] [<ffffffff814934d7>] __wait_on_bit+0x57/0x80
[1555544.173290] [<ffffffff810a79fe>] wait_on_page_bit+0x6e/0x80
[1555544.173293] [<ffffffff8105fb80>] ? autoremove_wake_function+0x40/0x40
[1555544.173296] [<ffffffff810b2310>] ? pagevec_lookup_tag+0x20/0x30
[1555544.173299] [<ffffffff810a80ca>] filemap_fdatawait_range+0xfa/0x190
[1555544.173303] [<ffffffff81116e76>] sys_sync_file_range+0x176/0x180
[1555544.173307] [<ffffffff81495b7b>] system_call_fastpath+0x16/0x1b
[1555544.173310] INFO: task mc:22249 blocked for more than 120 seconds.
[1555544.173312] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1555544.173313] mc D 000000015cd1eecf 0 22249 9847 0x00000000
[1555544.173317] ffff8802f7ea9b48 0000000000000082 ffff8802f7ea9a28 0000000000000000
[1555544.173321] ffff8802fc390000 0000000000010980 ffff8802f7ea9fd8 ffff8802f7ea8000
[1555544.173324] 0000000000010980 0000000000004000 ffff8802f7ea9fd8 0000000000010980
[1555544.173327] Call Trace:
[1555544.173331] [<ffffffff810b3242>] ? __lru_cache_add+0x72/0xd0
[1555544.173334] [<ffffffff8103a671>] ? get_parent_ip+0x11/0x50
[1555544.173337] [<ffffffff8103a74d>] ? sub_preempt_count+0x9d/0xd0
[1555544.173339] [<ffffffff8103a671>] ? get_parent_ip+0x11/0x50
[1555544.173342] [<ffffffff8103a671>] ? get_parent_ip+0x11/0x50
[1555544.173346] [<ffffffff811194d0>] ? unmap_underlying_metadata+0x50/0x50
[1555544.173349] [<ffffffff81492d4a>] schedule+0x3a/0x50
[1555544.173351] [<ffffffff81492de7>] io_schedule+0x87/0xd0
[1555544.173354] [<ffffffff811194d9>] sleep_on_buffer+0x9/0x10
[1555544.173357] [<ffffffff81493392>] __wait_on_bit_lock+0x52/0xb0
[1555544.173360] [<ffffffff8103a74d>] ? sub_preempt_count+0x9d/0xd0
[1555544.173363] [<ffffffff811194d0>] ? unmap_underlying_metadata+0x50/0x50
[1555544.173366] [<ffffffff81493463>] out_of_line_wait_on_bit_lock+0x73/0x90
[1555544.173369] [<ffffffff8105fb80>] ? autoremove_wake_function+0x40/0x40
[1555544.173372] [<ffffffff8111a6be>] __lock_buffer+0x2e/0x30
[1555544.173375] [<ffffffff811924e9>] do_get_write_access+0x5a9/0x6c0
[1555544.173379] [<ffffffff8103a74d>] ? sub_preempt_count+0x9d/0xd0
[1555544.173382] [<ffffffff81198dc6>] ? jbd2_journal_add_journal_head+0xd6/0x1f0
[1555544.173385] [<ffffffff811927dc>] jbd2_journal_get_write_access+0x2c/0x50
[1555544.173389] [<ffffffff8117e3b9>] __ext4_journal_get_write_access+0x39/0x80
[1555544.173392] [<ffffffff81164f08>] ext4_reserve_inode_write+0x88/0xb0
[1555544.173395] [<ffffffff81164f69>] ext4_mark_inode_dirty+0x39/0x1e0
[1555544.173399] [<ffffffff810ff8a0>] ? filldir64+0xd0/0xd0
[1555544.173402] [<ffffffff81167148>] ext4_dirty_inode+0x38/0x60
[1555544.173404] [<ffffffff8111246b>] __mark_inode_dirty+0x3b/0x220
[1555544.173407] [<ffffffff81106445>] touch_atime+0x115/0x160
[1555544.173410] [<ffffffff810ffae6>] vfs_readdir+0xb6/0xc0
[1555544.173413] [<ffffffff810ffbd0>] sys_getdents+0x80/0xe0
[1555544.173416] [<ffffffff81495b7b>] system_call_fastpath+0x16/0x1b
[1555664.005410] INFO: task jbd2/md2-8:2842 blocked for more than 120 seconds.
[1555664.005413] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1555664.005416] jbd2/md2-8 D 000000015cd20110 0 2842 2 0x00000000
[1555664.005421] ffff88030fba1ce0 0000000000000046 0000000000000002 0000000100000000
[1555664.005426] ffff88030f9c1650 0000000000010980 ffff88030fba1fd8 ffff88030fba0000
[1555664.005430] 0000000000010980 0000000000004000 ffff88030fba1fd8 0000000000010980
[1555664.005434] Call Trace:
[1555664.005442] [<ffffffff8100a1b8>] ? native_sched_clock+0x28/0x90
[1555664.005448] [<ffffffff81492d4a>] schedule+0x3a/0x50
[1555664.005453] [<ffffffff811932ac>] jbd2_journal_commit_transaction+0x17c/0x13a0
[1555664.005456] [<ffffffff8149259e>] ? __schedule+0x46e/0xa90
[1555664.005462] [<ffffffff8103a671>] ? get_parent_ip+0x11/0x50
[1555664.005465] [<ffffffff8103a74d>] ? sub_preempt_count+0x9d/0xd0
[1555664.005469] [<ffffffff8105fb40>] ? wake_up_bit+0x40/0x40
[1555664.005473] [<ffffffff81197472>] kjournald2+0xb2/0x210
[1555664.005477] [<ffffffff8105fb40>] ? wake_up_bit+0x40/0x40
[1555664.005480] [<ffffffff811973c0>] ? commit_timeout+0x10/0x10
[1555664.005483] [<ffffffff8105f666>] kthread+0x96/0xa0
[1555664.005488] [<ffffffff81496f74>] kernel_thread_helper+0x4/0x10
[1555664.005492] [<ffffffff8105f5d0>] ? kthread_worker_fn+0x190/0x190
[1555664.005495] [<ffffffff81496f70>] ? gs_change+0xb/0xb
[1555664.005521] INFO: task e4defrag:9929 blocked for more than 120 seconds.
[1555664.005523] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1555664.005525] e4defrag D 000000015ccdc212 0 9929 21774 0x00000000
[1555664.005529] ffff880309b93d38 0000000000000082 ffff880309b93cc8 ffffffff00000000
[1555664.005533] ffff8801009d0000 0000000000010980 ffff880309b93fd8 ffff880309b92000
[1555664.005537] 0000000000010980 0000000000004000 ffff880309b93fd8 0000000000010980
[1555664.005542] Call Trace:
[1555664.005545] [<ffffffff8103a74d>] ? sub_preempt_count+0x9d/0xd0
[1555664.005549] [<ffffffff81494f02>] ? _raw_spin_unlock_irqrestore+0x12/0x40
[1555664.005553] [<ffffffff810319ee>] ? __wake_up+0x4e/0x70
[1555664.005556] [<ffffffff8103a671>] ? get_parent_ip+0x11/0x50
[1555664.005559] [<ffffffff8103a671>] ? get_parent_ip+0x11/0x50
[1555664.005562] [<ffffffff8103a74d>] ? sub_preempt_count+0x9d/0xd0
[1555664.005566] [<ffffffff8103a671>] ? get_parent_ip+0x11/0x50
[1555664.005569] [<ffffffff8103a671>] ? get_parent_ip+0x11/0x50
[1555664.005573] [<ffffffff810a77d0>] ? __lock_page+0x70/0x70
[1555664.005576] [<ffffffff81492d4a>] schedule+0x3a/0x50
[1555664.005579] [<ffffffff81492de7>] io_schedule+0x87/0xd0
[1555664.005582] [<ffffffff810a77d9>] sleep_on_page+0x9/0x10
[1555664.005585] [<ffffffff814934d7>] __wait_on_bit+0x57/0x80
[1555664.005589] [<ffffffff810a79fe>] wait_on_page_bit+0x6e/0x80
[1555664.005592] [<ffffffff8105fb80>] ? autoremove_wake_function+0x40/0x40
[1555664.005596] [<ffffffff810b2310>] ? pagevec_lookup_tag+0x20/0x30
[1555664.005599] [<ffffffff810a80ca>] filemap_fdatawait_range+0xfa/0x190
[1555664.005604] [<ffffffff81116e76>] sys_sync_file_range+0x176/0x180
[1555664.005608] [<ffffffff81495b7b>] system_call_fastpath+0x16/0x1b
[1555664.005612] INFO: task mc:22249 blocked for more than 120 seconds.
[1555664.005614] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1555664.005616] mc D 000000015cd1eecf 0 22249 9847 0x00000000
[1555664.005620] ffff8802f7ea9b48 0000000000000082 ffff8802f7ea9a28 0000000000000000
[1555664.005624] ffff8802fc390000 0000000000010980 ffff8802f7ea9fd8 ffff8802f7ea8000
[1555664.005628] 0000000000010980 0000000000004000 ffff8802f7ea9fd8 0000000000010980
[1555664.005632] Call Trace:
[1555664.005636] [<ffffffff810b3242>] ? __lru_cache_add+0x72/0xd0
[1555664.005639] [<ffffffff8103a671>] ? get_parent_ip+0x11/0x50
[1555664.005642] [<ffffffff8103a74d>] ? sub_preempt_count+0x9d/0xd0
[1555664.005646] [<ffffffff8103a671>] ? get_parent_ip+0x11/0x50
[1555664.005649] [<ffffffff8103a671>] ? get_parent_ip+0x11/0x50
[1555664.005653] [<ffffffff811194d0>] ? unmap_underlying_metadata+0x50/0x50
[1555664.005656] [<ffffffff81492d4a>] schedule+0x3a/0x50
[1555664.005659] [<ffffffff81492de7>] io_schedule+0x87/0xd0
[1555664.005663] [<ffffffff811194d9>] sleep_on_buffer+0x9/0x10
[1555664.005666] [<ffffffff81493392>] __wait_on_bit_lock+0x52/0xb0
[1555664.005669] [<ffffffff8103a74d>] ? sub_preempt_count+0x9d/0xd0
[1555664.005673] [<ffffffff811194d0>] ? unmap_underlying_metadata+0x50/0x50
[1555664.005676] [<ffffffff81493463>] out_of_line_wait_on_bit_lock+0x73/0x90
[1555664.005680] [<ffffffff8105fb80>] ? autoremove_wake_function+0x40/0x40
[1555664.005684] [<ffffffff8111a6be>] __lock_buffer+0x2e/0x30
[1555664.005687] [<ffffffff811924e9>] do_get_write_access+0x5a9/0x6c0
[1555664.005691] [<ffffffff8103a74d>] ? sub_preempt_count+0x9d/0xd0
[1555664.005695] [<ffffffff81198dc6>] ? jbd2_journal_add_journal_head+0xd6/0x1f0
[1555664.005698] [<ffffffff811927dc>] jbd2_journal_get_write_access+0x2c/0x50
[1555664.005703] [<ffffffff8117e3b9>] __ext4_journal_get_write_access+0x39/0x80
[1555664.005707] [<ffffffff81164f08>] ext4_reserve_inode_write+0x88/0xb0
[1555664.005710] [<ffffffff81164f69>] ext4_mark_inode_dirty+0x39/0x1e0
[1555664.005714] [<ffffffff810ff8a0>] ? filldir64+0xd0/0xd0
[1555664.005717] [<ffffffff81167148>] ext4_dirty_inode+0x38/0x60
[1555664.005721] [<ffffffff8111246b>] __mark_inode_dirty+0x3b/0x220
[1555664.005724] [<ffffffff81106445>] touch_atime+0x115/0x160
[1555664.005728] [<ffffffff810ffae6>] vfs_readdir+0xb6/0xc0
[1555664.005731] [<ffffffff810ffbd0>] sys_getdents+0x80/0xe0
[1555664.005734] [<ffffffff81495b7b>] system_call_fastpath+0x16/0x1b
[1555783.837700] INFO: task jbd2/md2-8:2842 blocked for more than 120 seconds.
[1555783.837703] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1555783.837706] jbd2/md2-8 D 000000015cd20110 0 2842 2 0x00000000
[1555783.837711] ffff88030fba1ce0 0000000000000046 0000000000000002 0000000100000000
[1555783.837715] ffff88030f9c1650 0000000000010980 ffff88030fba1fd8 ffff88030fba0000
[1555783.837719] 0000000000010980 0000000000004000 ffff88030fba1fd8 0000000000010980
[1555783.837724] Call Trace:
[1555783.837731] [<ffffffff8100a1b8>] ? native_sched_clock+0x28/0x90
[1555783.837737] [<ffffffff81492d4a>] schedule+0x3a/0x50
[1555783.837742] [<ffffffff811932ac>] jbd2_journal_commit_transaction+0x17c/0x13a0
[1555783.837746] [<ffffffff8149259e>] ? __schedule+0x46e/0xa90
[1555783.837751] [<ffffffff8103a671>] ? get_parent_ip+0x11/0x50
[1555783.837754] [<ffffffff8103a74d>] ? sub_preempt_count+0x9d/0xd0
[1555783.837759] [<ffffffff8105fb40>] ? wake_up_bit+0x40/0x40
[1555783.837763] [<ffffffff81197472>] kjournald2+0xb2/0x210
[1555783.837766] [<ffffffff8105fb40>] ? wake_up_bit+0x40/0x40
[1555783.837770] [<ffffffff811973c0>] ? commit_timeout+0x10/0x10
[1555783.837773] [<ffffffff8105f666>] kthread+0x96/0xa0
[1555783.837777] [<ffffffff81496f74>] kernel_thread_helper+0x4/0x10
[1555783.837781] [<ffffffff8105f5d0>] ? kthread_worker_fn+0x190/0x190
[1555783.837784] [<ffffffff81496f70>] ? gs_change+0xb/0xb

2012-02-14 12:47:42

by Jan Kara

Subject: Re: Deadlock?

On Mon 13-02-12 17:37:34, Jack Stone wrote:
> Adding CCs
>
> On 02/13/2012 11:17 AM, Markus wrote:
> > I noted some kind of deadlock, where I was not able to write to the raid6,
> > while writing to each disk would still work.
> > This caused many processes to "wait" in d-state, thus making it impossible to
> > unmount, cleanly reboot, sync, ...
Thanks for the report. Could you please provide the output of
'echo w >/proc/sysrq-trigger'
in dmesg? That would help us narrow down the real cause of the current
deadlock.
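
A minimal sketch of how that output could be requested from a script, assuming
the kernel was built with CONFIG_MAGIC_SYSRQ (otherwise the trigger file is
absent) and that the script runs as root. The helper name and its path
parameter are illustrative only; they are not part of any kernel tooling.

```python
import os

# Present only when the kernel is built with CONFIG_MAGIC_SYSRQ=y.
SYSRQ_TRIGGER = "/proc/sysrq-trigger"


def dump_blocked_tasks(trigger_path=SYSRQ_TRIGGER):
    """Ask the kernel to log all blocked (D-state) tasks via SysRq 'w'.

    Returns False if the trigger file does not exist, True once 'w' has
    been written. Writing to the real file requires root; the traces then
    show up in the kernel log (dmesg).
    """
    if not os.path.exists(trigger_path):
        return False
    with open(trigger_path, "w") as trigger:
        trigger.write("w")
    return True

# Hypothetical usage as root, followed by reading dmesg by hand:
#   dump_blocked_tasks()
```

A False return means SysRq support is missing from the running kernel, so it
has to be enabled in the kernel configuration before the traces can be
collected.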

> > So I enabled the detection of hung tasks and this deplock option.
> > After running a 3.2.2 for about 5 days:
> > http://pastebin.com/gy1kaYmS
> >
> > I dont know if its a bug or nothing. Or if it has anything to do with my
> > problem as there was no hungtask detected and the raid still seems to work.
It's the lock debugging code complaining about a possible problem it found.
Actually, the problem looks real (although hard to hit AFAICS) and the attached
patch should fix it. Does the patch fix things for you? If not, please
provide the output I described above.

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR


Attachments:
(No filename) (1.12 kB)
0001-quota-Make-quota-code-not-call-tty-layer-with-dqptr_.patch (16.23 kB)

2012-02-15 22:19:43

by Markus

Subject: Re: Deadlock?

Hi!

/proc/sysrq-trigger does not exist, so I will apply the patch and enable sysrq.
But it may take some days to get a result.

Thanks,
Markus

Jan Kara wrote on 14.02.2012:
> On Mon 13-02-12 17:37:34, Jack Stone wrote:
> > Adding CCs
> >
> > On 02/13/2012 11:17 AM, Markus wrote:
> > > I noted some kind of deadlock, where I was not able to write to the raid6,
> > > while writing to each disk would still work.
> > > This caused many processes to "wait" in d-state, thus making it impossible to
> > > unmount, cleanly reboot, sync, ...
> Thanks for the report. Could you please provide the output of
> 'echo w >/proc/sysrq-trigger'
> in dmesg? That would help us narrow down the real cause of the current
> deadlock.
>
> > > So I enabled the detection of hung tasks and this deplock option.
> > > After running a 3.2.2 for about 5 days:
> > > http://pastebin.com/gy1kaYmS
> > >
> > > I dont know if its a bug or nothing. Or if it has anything to do with my
> > > problem as there was no hungtask detected and the raid still seems to work.
> It's the lock debugging code complaining about a possible problem it found.
> Actually, the problem looks real (although hard to hit AFAICS) and the attached
> patch should fix it. Does the patch fix things for you? If not, please
> provide the output I described above.
>
> Honza
> --
> Jan Kara <[email protected]>
> SUSE Labs, CR