2012-10-29 20:03:37

by Torsten Kaiser

Subject: Hang in XFS reclaim on 3.7.0-rc3

After experiencing a hang of all IO yesterday
(http://marc.info/?l=linux-kernel&m=135142236520624&w=2), I turned on
LOCKDEP after upgrading to -rc3.

I then tried to replicate the load that hung yesterday and got the
following lockdep report, implicating XFS rather than the stacking of
swap onto dm-crypt and md.

Oct 29 20:27:11 thoregon kernel: [ 2675.571958] usb 7-2: USB disconnect, device number 2
Oct 29 20:30:01 thoregon kernel: [ 2844.971913]
Oct 29 20:30:01 thoregon kernel: [ 2844.971920] =================================
Oct 29 20:30:01 thoregon kernel: [ 2844.971921] [ INFO: inconsistent lock state ]
Oct 29 20:30:01 thoregon kernel: [ 2844.971924] 3.7.0-rc3 #1 Not tainted
Oct 29 20:30:01 thoregon kernel: [ 2844.971925] ---------------------------------
Oct 29 20:30:01 thoregon kernel: [ 2844.971927] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
Oct 29 20:30:01 thoregon kernel: [ 2844.971929] kswapd0/725 [HC0[0]:SC0[0]:HE1:SE1] takes:
Oct 29 20:30:01 thoregon kernel: [ 2844.971931] (&(&ip->i_lock)->mr_lock){++++?.}, at: [<ffffffff811e7ef4>] xfs_ilock+0x84/0xb0
Oct 29 20:30:01 thoregon kernel: [ 2844.971941] {RECLAIM_FS-ON-W} state was registered at:
Oct 29 20:30:01 thoregon kernel: [ 2844.971942] [<ffffffff8108137e>] mark_held_locks+0x7e/0x130
Oct 29 20:30:01 thoregon kernel: [ 2844.971947] [<ffffffff81081a63>] lockdep_trace_alloc+0x63/0xc0
Oct 29 20:30:01 thoregon kernel: [ 2844.971949] [<ffffffff810e9dd5>] kmem_cache_alloc+0x35/0xe0
Oct 29 20:30:01 thoregon kernel: [ 2844.971952] [<ffffffff810dba31>] vm_map_ram+0x271/0x770
Oct 29 20:30:01 thoregon kernel: [ 2844.971955] [<ffffffff811e10a6>] _xfs_buf_map_pages+0x46/0xe0
Oct 29 20:30:01 thoregon kernel: [ 2844.971959] [<ffffffff811e1fba>] xfs_buf_get_map+0x8a/0x130
Oct 29 20:30:01 thoregon kernel: [ 2844.971961] [<ffffffff81233849>] xfs_trans_get_buf_map+0xa9/0xd0
Oct 29 20:30:01 thoregon kernel: [ 2844.971964] [<ffffffff8121e339>] xfs_ifree_cluster+0x129/0x670
Oct 29 20:30:01 thoregon kernel: [ 2844.971967] [<ffffffff8121f959>] xfs_ifree+0xe9/0xf0
Oct 29 20:30:01 thoregon kernel: [ 2844.971969] [<ffffffff811f4abf>] xfs_inactive+0x2af/0x480
Oct 29 20:30:01 thoregon kernel: [ 2844.971972] [<ffffffff811efb90>] xfs_fs_evict_inode+0x70/0x80
Oct 29 20:30:01 thoregon kernel: [ 2844.971974] [<ffffffff8110cb8f>] evict+0xaf/0x1b0
Oct 29 20:30:01 thoregon kernel: [ 2844.971977] [<ffffffff8110cd95>] iput+0x105/0x210
Oct 29 20:30:01 thoregon kernel: [ 2844.971979] [<ffffffff811070d0>] dentry_iput+0xa0/0xe0
Oct 29 20:30:01 thoregon kernel: [ 2844.971981] [<ffffffff81108310>] dput+0x150/0x280
Oct 29 20:30:01 thoregon kernel: [ 2844.971983] [<ffffffff811020fb>] sys_renameat+0x21b/0x290
Oct 29 20:30:01 thoregon kernel: [ 2844.971986] [<ffffffff81102186>] sys_rename+0x16/0x20
Oct 29 20:30:01 thoregon kernel: [ 2844.971988] [<ffffffff816b2292>] system_call_fastpath+0x16/0x1b
Oct 29 20:30:01 thoregon kernel: [ 2844.971992] irq event stamp: 155377
Oct 29 20:30:01 thoregon kernel: [ 2844.971993] hardirqs last enabled at (155377): [<ffffffff816ae1ed>] mutex_trylock+0xfd/0x170
Oct 29 20:30:01 thoregon kernel: [ 2844.971997] hardirqs last disabled at (155376): [<ffffffff816ae12e>] mutex_trylock+0x3e/0x170
Oct 29 20:30:01 thoregon kernel: [ 2844.971999] softirqs last enabled at (155368): [<ffffffff81042fb1>] __do_softirq+0x111/0x170
Oct 29 20:30:01 thoregon kernel: [ 2844.972002] softirqs last disabled at (155353): [<ffffffff816b33bc>] call_softirq+0x1c/0x30
Oct 29 20:30:01 thoregon kernel: [ 2844.972004]
Oct 29 20:30:01 thoregon kernel: [ 2844.972004] other info that might help us debug this:
Oct 29 20:30:01 thoregon kernel: [ 2844.972006] Possible unsafe locking scenario:
Oct 29 20:30:01 thoregon kernel: [ 2844.972006]
Oct 29 20:30:01 thoregon kernel: [ 2844.972007]        CPU0
Oct 29 20:30:01 thoregon kernel: [ 2844.972007]        ----
Oct 29 20:30:01 thoregon kernel: [ 2844.972008]   lock(&(&ip->i_lock)->mr_lock);
Oct 29 20:30:01 thoregon kernel: [ 2844.972009]   <Interrupt>
Oct 29 20:30:01 thoregon kernel: [ 2844.972010]     lock(&(&ip->i_lock)->mr_lock);
Oct 29 20:30:01 thoregon kernel: [ 2844.972012]
Oct 29 20:30:01 thoregon kernel: [ 2844.972012]  *** DEADLOCK ***
Oct 29 20:30:01 thoregon kernel: [ 2844.972012]
Oct 29 20:30:01 thoregon kernel: [ 2844.972013] 3 locks held by kswapd0/725:
Oct 29 20:30:01 thoregon kernel: [ 2844.972014] #0: (shrinker_rwsem){++++..}, at: [<ffffffff810bbd22>] shrink_slab+0x32/0x1f0
Oct 29 20:30:01 thoregon kernel: [ 2844.972020] #1: (&type->s_umount_key#20){++++.+}, at: [<ffffffff810f5a8e>] grab_super_passive+0x3e/0x90
Oct 29 20:30:01 thoregon kernel: [ 2844.972024] #2: (&pag->pag_ici_reclaim_lock){+.+...}, at: [<ffffffff811f263c>] xfs_reclaim_inodes_ag+0xbc/0x4f0
Oct 29 20:30:01 thoregon kernel: [ 2844.972027]
Oct 29 20:30:01 thoregon kernel: [ 2844.972027] stack backtrace:
Oct 29 20:30:01 thoregon kernel: [ 2844.972029] Pid: 725, comm: kswapd0 Not tainted 3.7.0-rc3 #1
Oct 29 20:30:01 thoregon kernel: [ 2844.972031] Call Trace:
Oct 29 20:30:01 thoregon kernel: [ 2844.972035] [<ffffffff816a782c>] print_usage_bug+0x1f5/0x206
Oct 29 20:30:01 thoregon kernel: [ 2844.972039] [<ffffffff8100ed8a>] ? save_stack_trace+0x2a/0x50
Oct 29 20:30:01 thoregon kernel: [ 2844.972042] [<ffffffff8107e9fd>] mark_lock+0x28d/0x2f0
Oct 29 20:30:01 thoregon kernel: [ 2844.972044] [<ffffffff8107de30>] ? print_irq_inversion_bug.part.37+0x1f0/0x1f0
Oct 29 20:30:01 thoregon kernel: [ 2844.972047] [<ffffffff8107efdf>] __lock_acquire+0x57f/0x1c00
Oct 29 20:30:01 thoregon kernel: [ 2844.972049] [<ffffffff8107c899>] ? __lock_is_held+0x59/0x70
Oct 29 20:30:01 thoregon kernel: [ 2844.972051] [<ffffffff81080b55>] lock_acquire+0x55/0x70
Oct 29 20:30:01 thoregon kernel: [ 2844.972053] [<ffffffff811e7ef4>] ? xfs_ilock+0x84/0xb0
Oct 29 20:30:01 thoregon kernel: [ 2844.972056] [<ffffffff8106126a>] down_write_nested+0x4a/0x70
Oct 29 20:30:01 thoregon kernel: [ 2844.972058] [<ffffffff811e7ef4>] ? xfs_ilock+0x84/0xb0
Oct 29 20:30:01 thoregon kernel: [ 2844.972060] [<ffffffff811e7ef4>] xfs_ilock+0x84/0xb0
Oct 29 20:30:01 thoregon kernel: [ 2844.972063] [<ffffffff811f1a76>] xfs_reclaim_inode+0x136/0x340
Oct 29 20:30:01 thoregon kernel: [ 2844.972065] [<ffffffff811f283f>] xfs_reclaim_inodes_ag+0x2bf/0x4f0
Oct 29 20:30:01 thoregon kernel: [ 2844.972066] [<ffffffff811f2660>] ? xfs_reclaim_inodes_ag+0xe0/0x4f0
Oct 29 20:30:01 thoregon kernel: [ 2844.972069] [<ffffffff811f2bae>] xfs_reclaim_inodes_nr+0x2e/0x40
Oct 29 20:30:01 thoregon kernel: [ 2844.972071] [<ffffffff811ef480>] xfs_fs_free_cached_objects+0x10/0x20
Oct 29 20:30:01 thoregon kernel: [ 2844.972073] [<ffffffff810f5bf3>] prune_super+0x113/0x1a0
Oct 29 20:30:01 thoregon kernel: [ 2844.972075] [<ffffffff810bbe0e>] shrink_slab+0x11e/0x1f0
Oct 29 20:30:01 thoregon kernel: [ 2844.972077] [<ffffffff810be400>] kswapd+0x690/0xa10
Oct 29 20:30:01 thoregon kernel: [ 2844.972080] [<ffffffff8105ca30>] ? __init_waitqueue_head+0x60/0x60
Oct 29 20:30:01 thoregon kernel: [ 2844.972082] [<ffffffff810bdd70>] ? shrink_lruvec+0x540/0x540
Oct 29 20:30:01 thoregon kernel: [ 2844.972084] [<ffffffff8105c246>] kthread+0xd6/0xe0
Oct 29 20:30:01 thoregon kernel: [ 2844.972087] [<ffffffff816b148b>] ? _raw_spin_unlock_irq+0x2b/0x50
Oct 29 20:30:01 thoregon kernel: [ 2844.972089] [<ffffffff8105c170>] ? flush_kthread_worker+0xe0/0xe0
Oct 29 20:30:01 thoregon kernel: [ 2844.972091] [<ffffffff816b21ec>] ret_from_fork+0x7c/0xb0
Oct 29 20:30:01 thoregon kernel: [ 2844.972093] [<ffffffff8105c170>] ? flush_kthread_worker+0xe0/0xe0
Oct 29 20:30:01 thoregon cron[24374]: (root) CMD (test -x /usr/sbin/run-crons && /usr/sbin/run-crons)

As kswapd got stuck yesterday, I think LOCKDEP found a real problem.

If you need more information, please ask. I will try to provide it.

Thanks,

Torsten


2012-10-29 22:26:19

by Dave Chinner

Subject: Re: Hang in XFS reclaim on 3.7.0-rc3

On Mon, Oct 29, 2012 at 09:03:15PM +0100, Torsten Kaiser wrote:
> After experiencing a hang of all IO yesterday (
> http://marc.info/?l=linux-kernel&m=135142236520624&w=2 ), I turned on
> LOCKDEP after upgrading to -rc3.
>
> I then tried to replicate the load that hung yesterday and got the
> following lockdep report, implicating XFS rather than the stacking of
> swap onto dm-crypt and md.
>
> [ 2844.971913]
> [ 2844.971920] =================================
> [ 2844.971921] [ INFO: inconsistent lock state ]
> [ 2844.971924] 3.7.0-rc3 #1 Not tainted
> [ 2844.971925] ---------------------------------
> [ 2844.971927] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
> [ 2844.971929] kswapd0/725 [HC0[0]:SC0[0]:HE1:SE1] takes:
> [ 2844.971931] (&(&ip->i_lock)->mr_lock){++++?.}, at: [<ffffffff811e7ef4>] xfs_ilock+0x84/0xb0
> [ 2844.971941] {RECLAIM_FS-ON-W} state was registered at:
> [ 2844.971942] [<ffffffff8108137e>] mark_held_locks+0x7e/0x130
> [ 2844.971947] [<ffffffff81081a63>] lockdep_trace_alloc+0x63/0xc0
> [ 2844.971949] [<ffffffff810e9dd5>] kmem_cache_alloc+0x35/0xe0
> [ 2844.971952] [<ffffffff810dba31>] vm_map_ram+0x271/0x770
> [ 2844.971955] [<ffffffff811e10a6>] _xfs_buf_map_pages+0x46/0xe0
> [ 2844.971959] [<ffffffff811e1fba>] xfs_buf_get_map+0x8a/0x130
> [ 2844.971961] [<ffffffff81233849>] xfs_trans_get_buf_map+0xa9/0xd0
> [ 2844.971964] [<ffffffff8121e339>] xfs_ifree_cluster+0x129/0x670
> [ 2844.971967] [<ffffffff8121f959>] xfs_ifree+0xe9/0xf0
> [ 2844.971969] [<ffffffff811f4abf>] xfs_inactive+0x2af/0x480
> [ 2844.971972] [<ffffffff811efb90>] xfs_fs_evict_inode+0x70/0x80
> [ 2844.971974] [<ffffffff8110cb8f>] evict+0xaf/0x1b0
> [ 2844.971977] [<ffffffff8110cd95>] iput+0x105/0x210
> [ 2844.971979] [<ffffffff811070d0>] dentry_iput+0xa0/0xe0
> [ 2844.971981] [<ffffffff81108310>] dput+0x150/0x280
> [ 2844.971983] [<ffffffff811020fb>] sys_renameat+0x21b/0x290
> [ 2844.971986] [<ffffffff81102186>] sys_rename+0x16/0x20
> [ 2844.971988] [<ffffffff816b2292>] system_call_fastpath+0x16/0x1b

We shouldn't be mapping pages there. See if the patch below fixes
it.

Fundamentally, though, the lockdep warning has come about because
vm_map_ram is doing a GFP_KERNEL allocation when we need it to be
doing GFP_NOFS - we are within a transaction here, so memory reclaim
is not allowed to recurse back into the filesystem.
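
For context, this is the mechanism that normally keeps XFS allocations
reclaim-safe inside a transaction; a simplified sketch, paraphrased from
fs/xfs/kmem.h of this era (not verbatim kernel source):

/*
 * xfs_trans_alloc() sets PF_FSTRANS on the task for the life of the
 * transaction, and XFS's allocation wrapper strips __GFP_FS while it
 * is set, so direct reclaim cannot recurse into the filesystem.
 * vm_map_ram() bypasses this entirely because it hardcodes GFP_KERNEL
 * for its internal allocations.
 */
static inline gfp_t kmem_flags_convert(unsigned int flags)
{
	gfp_t lflags = GFP_KERNEL | __GFP_NOWARN;

	if ((current->flags & PF_FSTRANS) || (flags & KM_NOFS))
		lflags &= ~__GFP_FS;	/* GFP_NOFS semantics */
	return lflags;
}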

mm-folk: can we please get this vmalloc/gfp_flags passing API
fixed once and for all? This is the fourth time in the last month or
so that I've seen XFS bug reports with silent hangs and associated
lockdep output that implicate GFP_KERNEL allocations from vm_map_ram
in GFP_NOFS conditions as the potential cause....
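
To make that concrete, one hypothetical shape the fixed API could take.
vm_map_ram_gfp() is an illustrative name and signature only; no such
interface exists as of 3.7:

/* Hypothetical: let the caller pass a gfp mask down so that the
 * internal vmap allocations honour GFP_NOFS. */
void *vm_map_ram_gfp(struct page **pages, unsigned int count,
		     int node, pgprot_t prot, gfp_t gfp_mask);

/* An in-transaction caller could then safely do: */
bp->b_addr = vm_map_ram_gfp(bp->b_pages, bp->b_page_count, -1,
			    PAGE_KERNEL, GFP_NOFS);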

Cheers,

Dave.
--
Dave Chinner
[email protected]

xfs: don't vmap inode cluster buffers during free

From: Dave Chinner <[email protected]>

Signed-off-by: Dave Chinner <[email protected]>
---
fs/xfs/xfs_inode.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index c4add46..82f6e5d 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1781,7 +1781,8 @@ xfs_ifree_cluster(
 		 * to mark all the active inodes on the buffer stale.
 		 */
 		bp = xfs_trans_get_buf(tp, mp->m_ddev_targp, blkno,
-					mp->m_bsize * blks_per_cluster, 0);
+					mp->m_bsize * blks_per_cluster,
+					XBF_UNMAPPED);
 
 		if (!bp)
 			return ENOMEM;
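
The reason XBF_UNMAPPED sidesteps the problem: _xfs_buf_map_pages()
only calls vm_map_ram() when neither the single-page shortcut nor the
unmapped flag applies. A trimmed sketch of that logic, simplified from
fs/xfs/xfs_buf.c of this era:

if (bp->b_page_count == 1) {
	/* A single page buffer is always mappable. */
	bp->b_addr = page_address(bp->b_pages[0]) + bp->b_offset;
} else if (flags & XBF_UNMAPPED) {
	bp->b_addr = NULL;	/* no vm_map_ram(), no allocation */
} else {
	bp->b_addr = vm_map_ram(bp->b_pages, bp->b_page_count,
				-1, PAGE_KERNEL);
	if (!bp->b_addr)
		return -ENOMEM;
	bp->b_addr += bp->b_offset;
}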

2012-10-29 22:41:31

by Dave Chinner

Subject: Re: Hang in XFS reclaim on 3.7.0-rc3

[add the linux-mm cc I forgot to add before sending]

On Tue, Oct 30, 2012 at 09:26:13AM +1100, Dave Chinner wrote:
> On Mon, Oct 29, 2012 at 09:03:15PM +0100, Torsten Kaiser wrote:
> > After experiencing a hang of all IO yesterday (
> > http://marc.info/?l=linux-kernel&m=135142236520624&w=2 ), I turned on
> > LOCKDEP after upgrading to -rc3.
> >
> > I then tried to replicate the load that hung yesterday and got the
> > following lockdep report, implicating XFS rather than the stacking of
> > swap onto dm-crypt and md.
> >
> > [ 2844.971913]
> > [ 2844.971920] =================================
> > [ 2844.971921] [ INFO: inconsistent lock state ]
> > [ 2844.971924] 3.7.0-rc3 #1 Not tainted
> > [ 2844.971925] ---------------------------------
> > [ 2844.971927] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
> > [ 2844.971929] kswapd0/725 [HC0[0]:SC0[0]:HE1:SE1] takes:
> > [ 2844.971931] (&(&ip->i_lock)->mr_lock){++++?.}, at: [<ffffffff811e7ef4>] xfs_ilock+0x84/0xb0
> > [ 2844.971941] {RECLAIM_FS-ON-W} state was registered at:
> > [ 2844.971942] [<ffffffff8108137e>] mark_held_locks+0x7e/0x130
> > [ 2844.971947] [<ffffffff81081a63>] lockdep_trace_alloc+0x63/0xc0
> > [ 2844.971949] [<ffffffff810e9dd5>] kmem_cache_alloc+0x35/0xe0
> > [ 2844.971952] [<ffffffff810dba31>] vm_map_ram+0x271/0x770
> > [ 2844.971955] [<ffffffff811e10a6>] _xfs_buf_map_pages+0x46/0xe0
> > [ 2844.971959] [<ffffffff811e1fba>] xfs_buf_get_map+0x8a/0x130
> > [ 2844.971961] [<ffffffff81233849>] xfs_trans_get_buf_map+0xa9/0xd0
> > [ 2844.971964] [<ffffffff8121e339>] xfs_ifree_cluster+0x129/0x670
> > [ 2844.971967] [<ffffffff8121f959>] xfs_ifree+0xe9/0xf0
> > [ 2844.971969] [<ffffffff811f4abf>] xfs_inactive+0x2af/0x480
> > [ 2844.971972] [<ffffffff811efb90>] xfs_fs_evict_inode+0x70/0x80
> > [ 2844.971974] [<ffffffff8110cb8f>] evict+0xaf/0x1b0
> > [ 2844.971977] [<ffffffff8110cd95>] iput+0x105/0x210
> > [ 2844.971979] [<ffffffff811070d0>] dentry_iput+0xa0/0xe0
> > [ 2844.971981] [<ffffffff81108310>] dput+0x150/0x280
> > [ 2844.971983] [<ffffffff811020fb>] sys_renameat+0x21b/0x290
> > [ 2844.971986] [<ffffffff81102186>] sys_rename+0x16/0x20
> > [ 2844.971988] [<ffffffff816b2292>] system_call_fastpath+0x16/0x1b
>
> We shouldn't be mapping pages there. See if the patch below fixes
> it.
>
> Fundamentally, though, the lockdep warning has come about because
> vm_map_ram is doing a GFP_KERNEL allocation when we need it to be
> doing GFP_NOFS - we are within a transaction here, so memory reclaim
> is not allowed to recurse back into the filesystem.
>
> mm-folk: can we please get this vmalloc/gfp_flags passing API
> fixed once and for all? This is the fourth time in the last month or
> so that I've seen XFS bug reports with silent hangs and associated
> lockdep output that implicate GFP_KERNEL allocations from vm_map_ram
> in GFP_NOFS conditions as the potential cause....
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> [email protected]
>
> xfs: don't vmap inode cluster buffers during free
>
> From: Dave Chinner <[email protected]>
>
> Signed-off-by: Dave Chinner <[email protected]>
> ---
> fs/xfs/xfs_inode.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index c4add46..82f6e5d 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -1781,7 +1781,8 @@ xfs_ifree_cluster(
>  		 * to mark all the active inodes on the buffer stale.
>  		 */
>  		bp = xfs_trans_get_buf(tp, mp->m_ddev_targp, blkno,
> -					mp->m_bsize * blks_per_cluster, 0);
> +					mp->m_bsize * blks_per_cluster,
> +					XBF_UNMAPPED);
> 
>  		if (!bp)
>  			return ENOMEM;

--
Dave Chinner
[email protected]

2012-10-30 20:37:11

by Torsten Kaiser

Subject: Re: Hang in XFS reclaim on 3.7.0-rc3

On Mon, Oct 29, 2012 at 11:26 PM, Dave Chinner <[email protected]> wrote:
> On Mon, Oct 29, 2012 at 09:03:15PM +0100, Torsten Kaiser wrote:
>> After experiencing a hang of all IO yesterday (
>> http://marc.info/?l=linux-kernel&m=135142236520624&w=2 ), I turned on
>> LOCKDEP after upgrading to -rc3.
>>
>> I then tried to replicate the load that hung yesterday and got the
>> following lockdep report, implicating XFS rather than the stacking of
>> swap onto dm-crypt and md.
>>
>> [ 2844.971913]
>> [ 2844.971920] =================================
>> [ 2844.971921] [ INFO: inconsistent lock state ]
>> [ 2844.971924] 3.7.0-rc3 #1 Not tainted
>> [ 2844.971925] ---------------------------------
>> [ 2844.971927] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
>> [ 2844.971929] kswapd0/725 [HC0[0]:SC0[0]:HE1:SE1] takes:
>> [ 2844.971931] (&(&ip->i_lock)->mr_lock){++++?.}, at: [<ffffffff811e7ef4>] xfs_ilock+0x84/0xb0
>> [ 2844.971941] {RECLAIM_FS-ON-W} state was registered at:
>> [ 2844.971942] [<ffffffff8108137e>] mark_held_locks+0x7e/0x130
>> [ 2844.971947] [<ffffffff81081a63>] lockdep_trace_alloc+0x63/0xc0
>> [ 2844.971949] [<ffffffff810e9dd5>] kmem_cache_alloc+0x35/0xe0
>> [ 2844.971952] [<ffffffff810dba31>] vm_map_ram+0x271/0x770
>> [ 2844.971955] [<ffffffff811e10a6>] _xfs_buf_map_pages+0x46/0xe0
>> [ 2844.971959] [<ffffffff811e1fba>] xfs_buf_get_map+0x8a/0x130
>> [ 2844.971961] [<ffffffff81233849>] xfs_trans_get_buf_map+0xa9/0xd0
>> [ 2844.971964] [<ffffffff8121e339>] xfs_ifree_cluster+0x129/0x670
>> [ 2844.971967] [<ffffffff8121f959>] xfs_ifree+0xe9/0xf0
>> [ 2844.971969] [<ffffffff811f4abf>] xfs_inactive+0x2af/0x480
>> [ 2844.971972] [<ffffffff811efb90>] xfs_fs_evict_inode+0x70/0x80
>> [ 2844.971974] [<ffffffff8110cb8f>] evict+0xaf/0x1b0
>> [ 2844.971977] [<ffffffff8110cd95>] iput+0x105/0x210
>> [ 2844.971979] [<ffffffff811070d0>] dentry_iput+0xa0/0xe0
>> [ 2844.971981] [<ffffffff81108310>] dput+0x150/0x280
>> [ 2844.971983] [<ffffffff811020fb>] sys_renameat+0x21b/0x290
>> [ 2844.971986] [<ffffffff81102186>] sys_rename+0x16/0x20
>> [ 2844.971988] [<ffffffff816b2292>] system_call_fastpath+0x16/0x1b
>
> We shouldn't be mapping pages there. See if the patch below fixes
> it.

Applying your fix and rerunning my test workload did not trigger this
or any other LOCKDEP report.
While I'm not 100% sure that my test case always hits this, your
description makes me quite confident that it really fixed the issue.

I will keep LOCKDEP enabled on that system, and if there really is
another splat, I will report back here. But I rather doubt that this
will be needed.

Thanks for the very quick fix!

Torsten

> Fundamentally, though, the lockdep warning has come about because
> vm_map_ram is doing a GFP_KERNEL allocation when we need it to be
> doing GFP_NOFS - we are within a transaction here, so memory reclaim
> is not allowed to recurse back into the filesystem.
>
> mm-folk: can we please get this vmalloc/gfp_flags passing API
> fixed once and for all? This is the fourth time in the last month or
> so that I've seen XFS bug reports with silent hangs and associated
> lockdep output that implicate GFP_KERNEL allocations from vm_map_ram
> in GFP_NOFS conditions as the potential cause....
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> [email protected]
>
> xfs: don't vmap inode cluster buffers during free
>
> From: Dave Chinner <[email protected]>
>
> Signed-off-by: Dave Chinner <[email protected]>
> ---
> fs/xfs/xfs_inode.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index c4add46..82f6e5d 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -1781,7 +1781,8 @@ xfs_ifree_cluster(
>  		 * to mark all the active inodes on the buffer stale.
>  		 */
>  		bp = xfs_trans_get_buf(tp, mp->m_ddev_targp, blkno,
> -					mp->m_bsize * blks_per_cluster, 0);
> +					mp->m_bsize * blks_per_cluster,
> +					XBF_UNMAPPED);
> 
>  		if (!bp)
>  			return ENOMEM;

2012-10-30 20:46:36

by Christoph Hellwig

Subject: Re: Hang in XFS reclaim on 3.7.0-rc3

On Tue, Oct 30, 2012 at 09:37:06PM +0100, Torsten Kaiser wrote:
> On Mon, Oct 29, 2012 at 11:26 PM, Dave Chinner <[email protected]> wrote:

For some reason I only managed to get the two mails from Torsten into my
xfs list folder, but the one from Dave is missing. I did see Dave's
mail in my linux-mm folder, though.

> > From: Dave Chinner <[email protected]>
> >
> > Signed-off-by: Dave Chinner <[email protected]>

Looks good,

Reviewed-by: Christoph Hellwig <[email protected]>

And I agree that vmap needs to be fixed to pass through the gfp flags
ASAP.

2012-11-01 21:30:21

by Ben Myers

Subject: Re: Hang in XFS reclaim on 3.7.0-rc3

Hi Dave,

On Tue, Oct 30, 2012 at 09:26:13AM +1100, Dave Chinner wrote:
> On Mon, Oct 29, 2012 at 09:03:15PM +0100, Torsten Kaiser wrote:
> > After experiencing a hang of all IO yesterday (
> > http://marc.info/?l=linux-kernel&m=135142236520624&w=2 ), I turned on
> > LOCKDEP after upgrading to -rc3.
> >
> > I then tried to replicate the load that hung yesterday and got the
> > following lockdep report, implicating XFS rather than the stacking of
> > swap onto dm-crypt and md.
> >
> > [ 2844.971913]
> > [ 2844.971920] =================================
> > [ 2844.971921] [ INFO: inconsistent lock state ]
> > [ 2844.971924] 3.7.0-rc3 #1 Not tainted
> > [ 2844.971925] ---------------------------------
> > [ 2844.971927] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
> > [ 2844.971929] kswapd0/725 [HC0[0]:SC0[0]:HE1:SE1] takes:
> > [ 2844.971931] (&(&ip->i_lock)->mr_lock){++++?.}, at: [<ffffffff811e7ef4>] xfs_ilock+0x84/0xb0
> > [ 2844.971941] {RECLAIM_FS-ON-W} state was registered at:
> > [ 2844.971942] [<ffffffff8108137e>] mark_held_locks+0x7e/0x130
> > [ 2844.971947] [<ffffffff81081a63>] lockdep_trace_alloc+0x63/0xc0
> > [ 2844.971949] [<ffffffff810e9dd5>] kmem_cache_alloc+0x35/0xe0
> > [ 2844.971952] [<ffffffff810dba31>] vm_map_ram+0x271/0x770
> > [ 2844.971955] [<ffffffff811e10a6>] _xfs_buf_map_pages+0x46/0xe0
> > [ 2844.971959] [<ffffffff811e1fba>] xfs_buf_get_map+0x8a/0x130
> > [ 2844.971961] [<ffffffff81233849>] xfs_trans_get_buf_map+0xa9/0xd0
> > [ 2844.971964] [<ffffffff8121e339>] xfs_ifree_cluster+0x129/0x670
> > [ 2844.971967] [<ffffffff8121f959>] xfs_ifree+0xe9/0xf0
> > [ 2844.971969] [<ffffffff811f4abf>] xfs_inactive+0x2af/0x480
> > [ 2844.971972] [<ffffffff811efb90>] xfs_fs_evict_inode+0x70/0x80
> > [ 2844.971974] [<ffffffff8110cb8f>] evict+0xaf/0x1b0
> > [ 2844.971977] [<ffffffff8110cd95>] iput+0x105/0x210
> > [ 2844.971979] [<ffffffff811070d0>] dentry_iput+0xa0/0xe0
> > [ 2844.971981] [<ffffffff81108310>] dput+0x150/0x280
> > [ 2844.971983] [<ffffffff811020fb>] sys_renameat+0x21b/0x290
> > [ 2844.971986] [<ffffffff81102186>] sys_rename+0x16/0x20
> > [ 2844.971988] [<ffffffff816b2292>] system_call_fastpath+0x16/0x1b
>
> We shouldn't be mapping pages there. See if the patch below fixes
> it.
>
> Fundamentally, though, the lockdep warning has come about because
> vm_map_ram is doing a GFP_KERNEL allocation when we need it to be
> doing GFP_NOFS - we are within a transaction here, so memory reclaim
> is not allowed to recurse back into the filesystem.
>
> mm-folk: can we please get this vmalloc/gfp_flags passing API
> fixed once and for all? This is the fourth time in the last month or
> so that I've seen XFS bug reports with silent hangs and associated
> lockdep output that implicate GFP_KERNEL allocations from vm_map_ram
> in GFP_NOFS conditions as the potential cause....
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> [email protected]
>
> xfs: don't vmap inode cluster buffers during free

Could you write up a little more background for the commit message?

Regards,
Ben

2012-11-01 22:32:38

by Dave Chinner

Subject: Re: Hang in XFS reclaim on 3.7.0-rc3

On Thu, Nov 01, 2012 at 04:30:10PM -0500, Ben Myers wrote:
> Hi Dave,
>
> On Tue, Oct 30, 2012 at 09:26:13AM +1100, Dave Chinner wrote:
> > On Mon, Oct 29, 2012 at 09:03:15PM +0100, Torsten Kaiser wrote:
> > > After experiencing a hang of all IO yesterday (
> > > http://marc.info/?l=linux-kernel&m=135142236520624&w=2 ), I turned on
> > > LOCKDEP after upgrading to -rc3.
> > >
> > > I then tried to replicate the load that hung yesterday and got the
> > > following lockdep report, implicating XFS rather than the stacking of
> > > swap onto dm-crypt and md.
> > >
> > > [ 2844.971913]
> > > [ 2844.971920] =================================
> > > [ 2844.971921] [ INFO: inconsistent lock state ]
> > > [ 2844.971924] 3.7.0-rc3 #1 Not tainted
> > > [ 2844.971925] ---------------------------------
> > > [ 2844.971927] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
> > > [ 2844.971929] kswapd0/725 [HC0[0]:SC0[0]:HE1:SE1] takes:
> > > [ 2844.971931] (&(&ip->i_lock)->mr_lock){++++?.}, at: [<ffffffff811e7ef4>] xfs_ilock+0x84/0xb0
> > > [ 2844.971941] {RECLAIM_FS-ON-W} state was registered at:
> > > [ 2844.971942] [<ffffffff8108137e>] mark_held_locks+0x7e/0x130
> > > [ 2844.971947] [<ffffffff81081a63>] lockdep_trace_alloc+0x63/0xc0
> > > [ 2844.971949] [<ffffffff810e9dd5>] kmem_cache_alloc+0x35/0xe0
> > > [ 2844.971952] [<ffffffff810dba31>] vm_map_ram+0x271/0x770
> > > [ 2844.971955] [<ffffffff811e10a6>] _xfs_buf_map_pages+0x46/0xe0
.....
> > We shouldn't be mapping pages there. See if the patch below fixes
> > it.
> >
> > Fundamentally, though, the lockdep warning has come about because
> > vm_map_ram is doing a GFP_KERNEL allocation when we need it to be
> > doing GFP_NOFS - we are within a transaction here, so memory reclaim
> > is not allowed to recurse back into the filesystem.
> >
> > mm-folk: can we please get this vmalloc/gfp_flags passing API
> > fixed once and for all? This is the fourth time in the last month or
> > so that I've seen XFS bug reports with silent hangs and associated
> > lockdep output that implicate GFP_KERNEL allocations from vm_map_ram
> > in GFP_NOFS conditions as the potential cause....
> >
> > xfs: don't vmap inode cluster buffers during free
>
> Could you write up a little more background for the commit message?

Sure, that was just a test patch and often I don't bother putting a
detailed description in them until I know they fix the problem. My
current tree has:

xfs: don't vmap inode cluster buffers during free

Inode buffers do not need to be mapped as inodes are read or written
directly from/to the pages underlying the buffer. This fixes a
regression introduced by commit 611c994 ("xfs: make XBF_MAPPED the
default behaviour").

Cheers,

Dave.
--
Dave Chinner
[email protected]

2012-11-18 10:24:43

by Torsten Kaiser

Subject: Re: Hang in XFS reclaim on 3.7.0-rc3

On Tue, Oct 30, 2012 at 9:37 PM, Torsten Kaiser
<[email protected]> wrote:
> I will keep LOCKDEP enabled on that system, and if there really is
> another splat, I will report back here. But I rather doubt that this
> will be needed.

After the patch, I did not see this problem again, but today I found
another LOCKDEP report that also looks XFS-related.
I found it twice in the logs, and as the two reports were slightly
different, I will attach both versions.


Nov 6 21:57:09 thoregon kernel: [ 9941.104345]
Nov 6 21:57:09 thoregon kernel: [ 9941.104350] =================================
Nov 6 21:57:09 thoregon kernel: [ 9941.104351] [ INFO: inconsistent lock state ]
Nov 6 21:57:09 thoregon kernel: [ 9941.104353] 3.7.0-rc4 #1 Not tainted
Nov 6 21:57:09 thoregon kernel: [ 9941.104354] ---------------------------------
Nov 6 21:57:09 thoregon kernel: [ 9941.104355] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
Nov 6 21:57:09 thoregon kernel: [ 9941.104357] kswapd0/725 [HC0[0]:SC0[0]:HE1:SE1] takes:
Nov 6 21:57:09 thoregon kernel: [ 9941.104359] (&(&ip->i_lock)->mr_lock){++++?.}, at: [<ffffffff811e8164>] xfs_ilock+0x84/0xb0
Nov 6 21:57:09 thoregon kernel: [ 9941.104366] {RECLAIM_FS-ON-W} state was registered at:
Nov 6 21:57:09 thoregon kernel: [ 9941.104367] [<ffffffff8108137e>] mark_held_locks+0x7e/0x130
Nov 6 21:57:09 thoregon kernel: [ 9941.104371] [<ffffffff81081a63>] lockdep_trace_alloc+0x63/0xc0
Nov 6 21:57:09 thoregon kernel: [ 9941.104373] [<ffffffff810b5a55>] __alloc_pages_nodemask+0x75/0x800
Nov 6 21:57:09 thoregon kernel: [ 9941.104375] [<ffffffff810b6262>] __get_free_pages+0x12/0x40
Nov 6 21:57:09 thoregon kernel: [ 9941.104377] [<ffffffff8102d7f0>] pte_alloc_one_kernel+0x10/0x20
Nov 6 21:57:09 thoregon kernel: [ 9941.104380] [<ffffffff810cc3e6>] __pte_alloc_kernel+0x16/0x90
Nov 6 21:57:09 thoregon kernel: [ 9941.104382] [<ffffffff810d9f37>] vmap_page_range_noflush+0x287/0x320
Nov 6 21:57:09 thoregon kernel: [ 9941.104385] [<ffffffff810dbe54>] vm_map_ram+0x694/0x770
Nov 6 21:57:09 thoregon kernel: [ 9941.104386] [<ffffffff811e1316>] _xfs_buf_map_pages+0x46/0xe0
Nov 6 21:57:09 thoregon kernel: [ 9941.104389] [<ffffffff811e222a>] xfs_buf_get_map+0x8a/0x130
Nov 6 21:57:09 thoregon kernel: [ 9941.104391] [<ffffffff81233ab9>] xfs_trans_get_buf_map+0xa9/0xd0
Nov 6 21:57:09 thoregon kernel: [ 9941.104393] [<ffffffff8121e5a9>] xfs_ifree_cluster+0x129/0x670
Nov 6 21:57:09 thoregon kernel: [ 9941.104396] [<ffffffff8121fbc9>] xfs_ifree+0xe9/0xf0
Nov 6 21:57:09 thoregon kernel: [ 9941.104398] [<ffffffff811f4d2f>] xfs_inactive+0x2af/0x480
Nov 6 21:57:09 thoregon kernel: [ 9941.104400] [<ffffffff811efe00>] xfs_fs_evict_inode+0x70/0x80
Nov 6 21:57:09 thoregon kernel: [ 9941.104402] [<ffffffff8110cb8f>] evict+0xaf/0x1b0
Nov 6 21:57:09 thoregon kernel: [ 9941.104405] [<ffffffff8110cd95>] iput+0x105/0x210
Nov 6 21:57:09 thoregon kernel: [ 9941.104406] [<ffffffff81107ba0>] d_delete+0x150/0x190
Nov 6 21:57:09 thoregon kernel: [ 9941.104408] [<ffffffff810ff8a7>] vfs_rmdir+0x107/0x120
Nov 6 21:57:09 thoregon kernel: [ 9941.104411] [<ffffffff810ff9a4>] do_rmdir+0xe4/0x130
Nov 6 21:57:09 thoregon kernel: [ 9941.104413] [<ffffffff81101c01>] sys_rmdir+0x11/0x20
Nov 6 21:57:09 thoregon kernel: [ 9941.104415] [<ffffffff816b2d12>] system_call_fastpath+0x16/0x1b
Nov 6 21:57:09 thoregon kernel: [ 9941.104417] irq event stamp: 18505
Nov 6 21:57:09 thoregon kernel: [ 9941.104418] hardirqs last enabled at (18505): [<ffffffff816aec5d>] mutex_trylock+0xfd/0x170
Nov 6 21:57:09 thoregon kernel: [ 9941.104421] hardirqs last disabled at (18504): [<ffffffff816aeb9e>] mutex_trylock+0x3e/0x170
Nov 6 21:57:09 thoregon kernel: [ 9941.104423] softirqs last enabled at (18492): [<ffffffff81042fb1>] __do_softirq+0x111/0x170
Nov 6 21:57:09 thoregon kernel: [ 9941.104426] softirqs last disabled at (18477): [<ffffffff816b3e3c>] call_softirq+0x1c/0x30
Nov 6 21:57:09 thoregon kernel: [ 9941.104428]
Nov 6 21:57:09 thoregon kernel: [ 9941.104428] other info that might help us debug this:
Nov 6 21:57:09 thoregon kernel: [ 9941.104429] Possible unsafe locking scenario:
Nov 6 21:57:09 thoregon kernel: [ 9941.104429]
Nov 6 21:57:09 thoregon kernel: [ 9941.104430]        CPU0
Nov 6 21:57:09 thoregon kernel: [ 9941.104431]        ----
Nov 6 21:57:09 thoregon kernel: [ 9941.104432]   lock(&(&ip->i_lock)->mr_lock);
Nov 6 21:57:09 thoregon kernel: [ 9941.104433]   <Interrupt>
Nov 6 21:57:09 thoregon kernel: [ 9941.104434]     lock(&(&ip->i_lock)->mr_lock);
Nov 6 21:57:09 thoregon kernel: [ 9941.104435]
Nov 6 21:57:09 thoregon kernel: [ 9941.104435]  *** DEADLOCK ***
Nov 6 21:57:09 thoregon kernel: [ 9941.104435]
Nov 6 21:57:09 thoregon kernel: [ 9941.104437] 3 locks held by kswapd0/725:
Nov 6 21:57:09 thoregon kernel: [ 9941.104438] #0: (shrinker_rwsem){++++..}, at: [<ffffffff810bbd22>] shrink_slab+0x32/0x1f0
Nov 6 21:57:09 thoregon kernel: [ 9941.104442] #1: (&type->s_umount_key#20){++++.+}, at: [<ffffffff810f5a8e>] grab_super_passive+0x3e/0x90
Nov 6 21:57:09 thoregon kernel: [ 9941.104446] #2: (&pag->pag_ici_reclaim_lock){+.+...}, at: [<ffffffff811f28ac>] xfs_reclaim_inodes_ag+0xbc/0x4f0
Nov 6 21:57:09 thoregon kernel: [ 9941.104449]
Nov 6 21:57:09 thoregon kernel: [ 9941.104449] stack backtrace:
Nov 6 21:57:09 thoregon kernel: [ 9941.104451] Pid: 725, comm: kswapd0 Not tainted 3.7.0-rc4 #1
Nov 6 21:57:09 thoregon kernel: [ 9941.104452] Call Trace:
Nov 6 21:57:09 thoregon kernel: [ 9941.104456] [<ffffffff816a829c>] print_usage_bug+0x1f5/0x206
Nov 6 21:57:09 thoregon kernel: [ 9941.104460] [<ffffffff8100ed8a>] ? save_stack_trace+0x2a/0x50
Nov 6 21:57:09 thoregon kernel: [ 9941.104462] [<ffffffff8107e9fd>] mark_lock+0x28d/0x2f0
Nov 6 21:57:09 thoregon kernel: [ 9941.104464] [<ffffffff8107de30>] ? print_irq_inversion_bug.part.37+0x1f0/0x1f0
Nov 6 21:57:09 thoregon kernel: [ 9941.104466] [<ffffffff8107efdf>] __lock_acquire+0x57f/0x1c00
Nov 6 21:57:09 thoregon kernel: [ 9941.104468] [<ffffffff8107c899>] ? __lock_is_held+0x59/0x70
Nov 6 21:57:09 thoregon kernel: [ 9941.104470] [<ffffffff81080b55>] lock_acquire+0x55/0x70
Nov 6 21:57:09 thoregon kernel: [ 9941.104472] [<ffffffff811e8164>] ? xfs_ilock+0x84/0xb0
Nov 6 21:57:09 thoregon kernel: [ 9941.104476] [<ffffffff8106126a>] down_write_nested+0x4a/0x70
Nov 6 21:57:09 thoregon kernel: [ 9941.104477] [<ffffffff811e8164>] ? xfs_ilock+0x84/0xb0
Nov 6 21:57:09 thoregon kernel: [ 9941.104479] [<ffffffff811e8164>] xfs_ilock+0x84/0xb0
Nov 6 21:57:09 thoregon kernel: [ 9941.104481] [<ffffffff811f1ce6>] xfs_reclaim_inode+0x136/0x340
Nov 6 21:57:09 thoregon kernel: [ 9941.104483] [<ffffffff811f2aaf>] xfs_reclaim_inodes_ag+0x2bf/0x4f0
Nov 6 21:57:09 thoregon kernel: [ 9941.104485] [<ffffffff811f28d0>] ? xfs_reclaim_inodes_ag+0xe0/0x4f0
Nov 6 21:57:09 thoregon kernel: [ 9941.104487] [<ffffffff811f2e1e>] xfs_reclaim_inodes_nr+0x2e/0x40
Nov 6 21:57:09 thoregon kernel: [ 9941.104489] [<ffffffff811ef6f0>] xfs_fs_free_cached_objects+0x10/0x20
Nov 6 21:57:09 thoregon kernel: [ 9941.104491] [<ffffffff810f5bf3>] prune_super+0x113/0x1a0
Nov 6 21:57:09 thoregon kernel: [ 9941.104493] [<ffffffff810bbe0e>] shrink_slab+0x11e/0x1f0
Nov 6 21:57:09 thoregon kernel: [ 9941.104496] [<ffffffff810be400>] kswapd+0x690/0xa10
Nov 6 21:57:09 thoregon kernel: [ 9941.104498] [<ffffffff8105ca30>] ? __init_waitqueue_head+0x60/0x60
Nov 6 21:57:09 thoregon kernel: [ 9941.104500] [<ffffffff810bdd70>] ? shrink_lruvec+0x540/0x540
Nov 6 21:57:09 thoregon kernel: [ 9941.104502] [<ffffffff8105c246>] kthread+0xd6/0xe0
Nov 6 21:57:09 thoregon kernel: [ 9941.104504] [<ffffffff816b1efb>] ? _raw_spin_unlock_irq+0x2b/0x50
Nov 6 21:57:09 thoregon kernel: [ 9941.104506] [<ffffffff8105c170>] ? flush_kthread_worker+0xe0/0xe0
Nov 6 21:57:09 thoregon kernel: [ 9941.104508] [<ffffffff816b2c6c>] ret_from_fork+0x7c/0xb0
Nov 6 21:57:09 thoregon kernel: [ 9941.104510] [<ffffffff8105c170>] ? flush_kthread_worker+0xe0/0xe0

Nov 17 14:07:38 thoregon kernel: [66571.610863]
Nov 17 14:07:38 thoregon kernel: [66571.610869] =========================================================
Nov 17 14:07:38 thoregon kernel: [66571.610870] [ INFO: possible irq lock inversion dependency detected ]
Nov 17 14:07:38 thoregon kernel: [66571.610873] 3.7.0-rc5 #1 Not tainted
Nov 17 14:07:38 thoregon kernel: [66571.610874] ---------------------------------------------------------
Nov 17 14:07:38 thoregon kernel: [66571.610875] cc1/21330 just changed the state of lock:
Nov 17 14:07:38 thoregon kernel: [66571.610877] (sb_internal){.+.+.?}, at: [<ffffffff8122b138>] xfs_trans_alloc+0x28/0x50
Nov 17 14:07:38 thoregon kernel: [66571.610885] but this lock took another, RECLAIM_FS-unsafe lock in the past:
Nov 17 14:07:38 thoregon kernel: [66571.610886] (&(&ip->i_lock)->mr_lock/1){+.+.+.}
Nov 17 14:07:38 thoregon kernel: [66571.610886]
Nov 17 14:07:39 thoregon kernel: [66571.610886] and interrupts could create inverse lock ordering between them.
Nov 17 14:07:39 thoregon kernel: [66571.610886]
Nov 17 14:07:39 thoregon kernel: [66571.610890]
Nov 17 14:07:39 thoregon kernel: [66571.610890] other info that might help us debug this:
Nov 17 14:07:39 thoregon kernel: [66571.610891] Possible interrupt unsafe locking scenario:
Nov 17 14:07:39 thoregon kernel: [66571.610891]
Nov 17 14:07:39 thoregon kernel: [66571.610892]        CPU0                    CPU1
Nov 17 14:07:39 thoregon kernel: [66571.610893]        ----                    ----
Nov 17 14:07:39 thoregon kernel: [66571.610894]   lock(&(&ip->i_lock)->mr_lock/1);
Nov 17 14:07:39 thoregon kernel: [66571.610896]                                local_irq_disable();
Nov 17 14:07:39 thoregon kernel: [66571.610897]                                lock(sb_internal);
Nov 17 14:07:39 thoregon kernel: [66571.610898]                                lock(&(&ip->i_lock)->mr_lock/1);
Nov 17 14:07:39 thoregon kernel: [66571.610900]   <Interrupt>
Nov 17 14:07:39 thoregon kernel: [66571.610901]     lock(sb_internal);
Nov 17 14:07:39 thoregon kernel: [66571.610902]
Nov 17 14:07:39 thoregon kernel: [66571.610902]  *** DEADLOCK ***
Nov 17 14:07:39 thoregon kernel: [66571.610902]
Nov 17 14:07:39 thoregon kernel: [66571.610904] 3 locks held by cc1/21330:
Nov 17 14:07:39 thoregon kernel: [66571.610905] #0: (&mm->mmap_sem){++++++}, at: [<ffffffff81029d8b>] __do_page_fault+0xfb/0x480
Nov 17 14:07:39 thoregon kernel: [66571.610910] #1: (shrinker_rwsem){++++..}, at: [<ffffffff810bbd02>] shrink_slab+0x32/0x1f0
Nov 17 14:07:39 thoregon kernel: [66571.610915] #2: (&type->s_umount_key#20){++++.+}, at: [<ffffffff810f5a7e>] grab_super_passive+0x3e/0x90
Nov 17 14:07:39 thoregon kernel: [66571.610921]
Nov 17 14:07:39 thoregon kernel: [66571.610921] the shortest dependencies between 2nd lock and 1st lock:
Nov 17 14:07:39 thoregon kernel: [66571.610927] -> (&(&ip->i_lock)->mr_lock/1){+.+.+.} ops: 169649 {
Nov 17 14:07:39 thoregon kernel: [66571.610931] HARDIRQ-ON-W at:
Nov 17 14:07:39 thoregon kernel: [66571.610932] [<ffffffff8107f091>] __lock_acquire+0x631/0x1c00
Nov 17 14:07:39 thoregon kernel: [66571.610935] [<ffffffff81080b55>] lock_acquire+0x55/0x70
Nov 17 14:07:39 thoregon kernel: [66571.610937] [<ffffffff8106126a>] down_write_nested+0x4a/0x70
Nov 17 14:07:39 thoregon kernel: [66571.610941] [<ffffffff811e8164>] xfs_ilock+0x84/0xb0
Nov 17 14:07:39 thoregon kernel: [66571.610944] [<ffffffff811f51a4>] xfs_create+0x1d4/0x5a0
Nov 17 14:07:39 thoregon kernel: [66571.610946] [<ffffffff811eca2a>] xfs_vn_mknod+0x8a/0x1b0
Nov 17 14:07:39 thoregon kernel: [66571.610948] [<ffffffff811ecb7e>] xfs_vn_create+0xe/0x10
Nov 17 14:07:39 thoregon kernel: [66571.610950] [<ffffffff81100322>] vfs_create+0x72/0xc0
Nov 17 14:07:39 thoregon kernel: [66571.610953] [<ffffffff81100b7e>] do_last.isra.69+0x80e/0xc80
Nov 17 14:07:39 thoregon kernel: [66571.610955] [<ffffffff8110109b>] path_openat.isra.70+0xab/0x490
Nov 17 14:07:39 thoregon kernel: [66571.610957] [<ffffffff8110183d>] do_filp_open+0x3d/0xa0
Nov 17 14:07:39 thoregon kernel: [66571.610959] [<ffffffff810f2129>] do_sys_open+0xf9/0x1e0
Nov 17 14:07:39 thoregon kernel: [66571.610962] [<ffffffff810f222c>] sys_open+0x1c/0x20
Nov 17 14:07:39 thoregon kernel: [66571.610964] [<ffffffff816b2e52>] system_call_fastpath+0x16/0x1b
Nov 17 14:07:39 thoregon kernel: [66571.610967] SOFTIRQ-ON-W at:
Nov 17 14:07:39 thoregon kernel: [66571.610968] [<ffffffff8107f0c7>] __lock_acquire+0x667/0x1c00
Nov 17 14:07:39 thoregon kernel: [66571.610970] [<ffffffff81080b55>] lock_acquire+0x55/0x70
Nov 17 14:07:39 thoregon kernel: [66571.610972] [<ffffffff8106126a>] down_write_nested+0x4a/0x70
Nov 17 14:07:39 thoregon kernel: [66571.610974] [<ffffffff811e8164>] xfs_ilock+0x84/0xb0
Nov 17 14:07:39 thoregon kernel: [66571.610976] [<ffffffff811f51a4>] xfs_create+0x1d4/0x5a0
Nov 17 14:07:39 thoregon kernel: [66571.610977] [<ffffffff811eca2a>] xfs_vn_mknod+0x8a/0x1b0
Nov 17 14:07:39 thoregon kernel: [66571.610979] [<ffffffff811ecb7e>] xfs_vn_create+0xe/0x10
Nov 17 14:07:39 thoregon kernel: [66571.610981] [<ffffffff81100322>] vfs_create+0x72/0xc0
Nov 17 14:07:39 thoregon kernel: [66571.610983] [<ffffffff81100b7e>] do_last.isra.69+0x80e/0xc80
Nov 17 14:07:39 thoregon kernel: [66571.610985] [<ffffffff8110109b>] path_openat.isra.70+0xab/0x490
Nov 17 14:07:39 thoregon kernel: [66571.610987] [<ffffffff8110183d>] do_filp_open+0x3d/0xa0
Nov 17 14:07:39 thoregon kernel: [66571.610989] [<ffffffff810f2129>] do_sys_open+0xf9/0x1e0
Nov 17 14:07:39 thoregon kernel: [66571.610991] [<ffffffff810f222c>] sys_open+0x1c/0x20
Nov 17 14:07:39 thoregon kernel: [66571.610993] [<ffffffff816b2e52>] system_call_fastpath+0x16/0x1b
Nov 17 14:07:39 thoregon kernel: [66571.610995] RECLAIM_FS-ON-W at:
Nov 17 14:07:39 thoregon kernel: [66571.610996] [<ffffffff8108137e>] mark_held_locks+0x7e/0x130
Nov 17 14:07:39 thoregon kernel: [66571.610998] [<ffffffff81081a63>] lockdep_trace_alloc+0x63/0xc0
Nov 17 14:07:39 thoregon kernel: [66571.610999] [<ffffffff810e9dc5>] kmem_cache_alloc+0x35/0xe0
Nov 17 14:07:39 thoregon kernel: [66571.611002] [<ffffffff810dba21>] vm_map_ram+0x271/0x770
Nov 17 14:07:39 thoregon kernel: [66571.611004] [<ffffffff811e12b6>] _xfs_buf_map_pages+0x46/0xe0
Nov 17 14:07:39 thoregon kernel: [66571.611008] [<ffffffff811e21ca>] xfs_buf_get_map+0x8a/0x130
Nov 17 14:07:39 thoregon kernel: [66571.611009] [<ffffffff81233989>] xfs_trans_get_buf_map+0xa9/0xd0
Nov 17 14:07:39 thoregon kernel: [66571.611011] [<ffffffff8121bc2d>] xfs_ialloc_inode_init+0xcd/0x1d0
Nov 17 14:07:39 thoregon kernel: [66571.611015] [<ffffffff8121c16f>] xfs_ialloc_ag_alloc+0x18f/0x500
Nov 17 14:07:39 thoregon kernel: [66571.611017] [<ffffffff8121d955>] xfs_dialloc+0x185/0x2a0
Nov 17 14:07:39 thoregon kernel: [66571.611019] [<ffffffff8121f068>] xfs_ialloc+0x58/0x650
Nov 17 14:07:39 thoregon kernel: [66571.611021] [<ffffffff811f3995>] xfs_dir_ialloc+0x65/0x270
Nov 17 14:07:39 thoregon kernel: [66571.611023] [<ffffffff811f537c>] xfs_create+0x3ac/0x5a0
Nov 17 14:07:39 thoregon kernel: [66571.611024] [<ffffffff811eca2a>] xfs_vn_mknod+0x8a/0x1b0
Nov 17 14:07:39 thoregon kernel: [66571.611026] [<ffffffff811ecb61>] xfs_vn_mkdir+0x11/0x20
Nov 17 14:07:39 thoregon kernel: [66571.611028] [<ffffffff8110016f>] vfs_mkdir+0x7f/0xd0
Nov 17 14:07:39 thoregon kernel: [66571.611030] [<ffffffff81101b83>] sys_mkdirat+0x43/0x80
Nov 17 14:07:39 thoregon kernel: [66571.611032] [<ffffffff81101bd4>] sys_mkdir+0x14/0x20
Nov 17 14:07:39 thoregon kernel: [66571.611034] [<ffffffff816b2e52>] system_call_fastpath+0x16/0x1b
Nov 17 14:07:39 thoregon kernel: [66571.611036] INITIAL USE at:
Nov 17 14:07:39 thoregon kernel: [66571.611037] [<ffffffff8107ed49>] __lock_acquire+0x2e9/0x1c00
Nov 17 14:07:39 thoregon kernel: [66571.611038] [<ffffffff81080b55>] lock_acquire+0x55/0x70
Nov 17 14:07:39 thoregon kernel: [66571.611040] [<ffffffff8106126a>] down_write_nested+0x4a/0x70
Nov 17 14:07:39 thoregon kernel: [66571.611042] [<ffffffff811e8164>] xfs_ilock+0x84/0xb0
Nov 17 14:07:39 thoregon kernel: [66571.611044] [<ffffffff811f51a4>] xfs_create+0x1d4/0x5a0
Nov 17 14:07:39 thoregon kernel: [66571.611046] [<ffffffff811eca2a>] xfs_vn_mknod+0x8a/0x1b0
Nov 17 14:07:39 thoregon kernel: [66571.611047] [<ffffffff811ecb7e>] xfs_vn_create+0xe/0x10
Nov 17 14:07:39 thoregon kernel: [66571.611049] [<ffffffff81100322>] vfs_create+0x72/0xc0
Nov 17 14:07:39 thoregon kernel: [66571.611051] [<ffffffff81100b7e>] do_last.isra.69+0x80e/0xc80
Nov 17 14:07:39 thoregon kernel: [66571.611053] [<ffffffff8110109b>] path_openat.isra.70+0xab/0x490
Nov 17 14:07:39 thoregon kernel: [66571.611055] [<ffffffff8110183d>] do_filp_open+0x3d/0xa0
Nov 17 14:07:39 thoregon kernel: [66571.611057] [<ffffffff810f2129>] do_sys_open+0xf9/0x1e0
Nov 17 14:07:39 thoregon kernel: [66571.611059] [<ffffffff810f222c>] sys_open+0x1c/0x20
Nov 17 14:07:39 thoregon kernel: [66571.611061] [<ffffffff816b2e52>] system_call_fastpath+0x16/0x1b
Nov 17 14:07:39 thoregon kernel: [66571.611063] }
Nov 17 14:07:39 thoregon kernel: [66571.611064] ... key at: [<ffffffff825b4b81>] __key.41357+0x1/0x8
Nov 17 14:07:39 thoregon kernel: [66571.611066] ... acquired at:
Nov 17 14:07:39 thoregon kernel: [66571.611067] [<ffffffff81080b55>] lock_acquire+0x55/0x70
Nov 17 14:07:39 thoregon kernel: [66571.611069] [<ffffffff8106126a>] down_write_nested+0x4a/0x70
Nov 17 14:07:39 thoregon kernel: [66571.611071] [<ffffffff811e8164>] xfs_ilock+0x84/0xb0
Nov 17 14:07:39 thoregon kernel: [66571.611073] [<ffffffff811f51a4>] xfs_create+0x1d4/0x5a0
Nov 17 14:07:39 thoregon kernel: [66571.611074] [<ffffffff811eca2a>] xfs_vn_mknod+0x8a/0x1b0
Nov 17 14:07:39 thoregon kernel: [66571.611076] [<ffffffff811ecb7e>] xfs_vn_create+0xe/0x10
Nov 17 14:07:39 thoregon kernel: [66571.611078] [<ffffffff81100322>] vfs_create+0x72/0xc0
Nov 17 14:07:39 thoregon kernel: [66571.611080] [<ffffffff81100b7e>] do_last.isra.69+0x80e/0xc80
Nov 17 14:07:39 thoregon kernel: [66571.611082] [<ffffffff8110109b>] path_openat.isra.70+0xab/0x490
Nov 17 14:07:39 thoregon kernel: [66571.611084] [<ffffffff8110183d>] do_filp_open+0x3d/0xa0
Nov 17 14:07:39 thoregon kernel: [66571.611086] [<ffffffff810f2129>] do_sys_open+0xf9/0x1e0
Nov 17 14:07:39 thoregon kernel: [66571.611088] [<ffffffff810f222c>] sys_open+0x1c/0x20
Nov 17 14:07:39 thoregon kernel: [66571.611090] [<ffffffff816b2e52>] system_call_fastpath+0x16/0x1b
Nov 17 14:07:39 thoregon kernel: [66571.611091]
Nov 17 14:07:39 thoregon kernel: [66571.611092] -> (sb_internal){.+.+.?} ops: 1341531 {
Nov 17 14:07:39 thoregon kernel: [66571.611095] HARDIRQ-ON-R at:
Nov 17 14:07:39 thoregon kernel: [66571.611096] [<ffffffff8107ef6a>] __lock_acquire+0x50a/0x1c00
Nov 17 14:07:39 thoregon kernel: [66571.611098] [<ffffffff81080b55>] lock_acquire+0x55/0x70
Nov 17 14:07:39 thoregon kernel: [66571.611100] [<ffffffff810f451b>] __sb_start_write+0xab/0x190
Nov 17 14:07:39 thoregon kernel: [66571.611102] [<ffffffff8122b138>] xfs_trans_alloc+0x28/0x50
Nov 17 14:07:39 thoregon kernel: [66571.611104] [<ffffffff811f5157>] xfs_create+0x187/0x5a0
Nov 17 14:07:39 thoregon kernel: [66571.611105] [<ffffffff811eca2a>] xfs_vn_mknod+0x8a/0x1b0
Nov 17 14:07:39 thoregon kernel: [66571.611107] [<ffffffff811ecb7e>] xfs_vn_create+0xe/0x10
Nov 17 14:07:39 thoregon kernel: [66571.611109] [<ffffffff81100322>] vfs_create+0x72/0xc0
Nov 17 14:07:39 thoregon kernel: [66571.611111] [<ffffffff81100b7e>] do_last.isra.69+0x80e/0xc80
Nov 17 14:07:39 thoregon kernel: [66571.611113] [<ffffffff8110109b>] path_openat.isra.70+0xab/0x490
Nov 17 14:07:39 thoregon kernel: [66571.611115] [<ffffffff8110183d>] do_filp_open+0x3d/0xa0
Nov 17 14:07:39 thoregon kernel: [66571.611117] [<ffffffff810f2129>] do_sys_open+0xf9/0x1e0
Nov 17 14:07:39 thoregon kernel: [66571.611119] [<ffffffff810f222c>] sys_open+0x1c/0x20
Nov 17 14:07:39 thoregon kernel: [66571.611121] [<ffffffff816b2e52>] system_call_fastpath+0x16/0x1b
Nov 17 14:07:39 thoregon kernel: [66571.611123] SOFTIRQ-ON-R at:
Nov 17 14:07:39 thoregon kernel: [66571.611124] [<ffffffff8107f0c7>] __lock_acquire+0x667/0x1c00
Nov 17 14:07:39 thoregon kernel: [66571.611126] [<ffffffff81080b55>] lock_acquire+0x55/0x70
Nov 17 14:07:39 thoregon kernel: [66571.611128] [<ffffffff810f451b>] __sb_start_write+0xab/0x190
Nov 17 14:07:39 thoregon kernel: [66571.611130] [<ffffffff8122b138>] xfs_trans_alloc+0x28/0x50
Nov 17 14:07:39 thoregon kernel: [66571.611132] [<ffffffff811f5157>] xfs_create+0x187/0x5a0
Nov 17 14:07:39 thoregon kernel: [66571.611133] [<ffffffff811eca2a>] xfs_vn_mknod+0x8a/0x1b0
Nov 17 14:07:39 thoregon kernel: [66571.611135] [<ffffffff811ecb7e>] xfs_vn_create+0xe/0x10
Nov 17 14:07:39 thoregon kernel: [66571.611137] [<ffffffff81100322>] vfs_create+0x72/0xc0
Nov 17 14:07:39 thoregon kernel: [66571.611139] [<ffffffff81100b7e>] do_last.isra.69+0x80e/0xc80
Nov 17 14:07:39 thoregon kernel: [66571.611141] [<ffffffff8110109b>] path_openat.isra.70+0xab/0x490
Nov 17 14:07:39 thoregon kernel: [66571.611143] [<ffffffff8110183d>] do_filp_open+0x3d/0xa0
Nov 17 14:07:39 thoregon kernel: [66571.611145] [<ffffffff810f2129>] do_sys_open+0xf9/0x1e0
Nov 17 14:07:39 thoregon kernel: [66571.611147] [<ffffffff810f222c>] sys_open+0x1c/0x20
Nov 17 14:07:39 thoregon kernel: [66571.611149] [<ffffffff816b2e52>] system_call_fastpath+0x16/0x1b
Nov 17 14:07:39 thoregon kernel: [66571.611151] IN-RECLAIM_FS-R at:
Nov 17 14:07:39 thoregon kernel: [66571.611152] [<ffffffff8107efdf>] __lock_acquire+0x57f/0x1c00
Nov 17 14:07:39 thoregon kernel: [66571.611154] [<ffffffff81080b55>] lock_acquire+0x55/0x70
Nov 17 14:07:39 thoregon kernel: [66571.611156] [<ffffffff810f451b>] __sb_start_write+0xab/0x190
Nov 17 14:07:39 thoregon kernel: [66571.611158] [<ffffffff8122b138>] xfs_trans_alloc+0x28/0x50
Nov 17 14:07:39 thoregon kernel: [66571.611159] [<ffffffff811f3e64>] xfs_free_eofblocks+0x104/0x250
Nov 17 14:07:39 thoregon kernel: [66571.611161] [<ffffffff811f4b39>] xfs_inactive+0xa9/0x480
Nov 17 14:07:39 thoregon kernel: [66571.611163] [<ffffffff811efe10>] xfs_fs_evict_inode+0x70/0x80
Nov 17 14:07:39 thoregon kernel: [66571.611165] [<ffffffff8110cb7f>] evict+0xaf/0x1b0
Nov 17 14:07:39 thoregon kernel: [66571.611168] [<ffffffff8110d209>] dispose_list+0x39/0x50
Nov 17 14:07:39 thoregon kernel: [66571.611170] [<ffffffff8110dc63>] prune_icache_sb+0x183/0x340
Nov 17 14:07:39 thoregon kernel: [66571.611172] [<ffffffff810f5bc3>] prune_super+0xf3/0x1a0
Nov 17 14:07:39 thoregon kernel: [66571.611173] [<ffffffff810bbdee>] shrink_slab+0x11e/0x1f0
Nov 17 14:07:39 thoregon kernel: [66571.611175] [<ffffffff810be98f>] try_to_free_pages+0x21f/0x4e0
Nov 17 14:07:39 thoregon kernel: [66571.611177] [<ffffffff810b5ec6>] __alloc_pages_nodemask+0x506/0x800
Nov 17 14:07:39 thoregon kernel: [66571.611179] [<ffffffff810ce56e>] handle_pte_fault+0x5ae/0x7a0
Nov 17 14:07:39 thoregon kernel: [66571.611182] [<ffffffff810cf769>] handle_mm_fault+0x1f9/0x2a0
Nov 17 14:07:39 thoregon kernel: [66571.611184] [<ffffffff81029dfc>] __do_page_fault+0x16c/0x480
Nov 17 14:07:39 thoregon kernel: [66571.611186] [<ffffffff8102a139>] do_page_fault+0x9/0x10
Nov 17 14:07:39 thoregon kernel: [66571.611188] [<ffffffff816b287f>] page_fault+0x1f/0x30
Nov 17 14:07:39 thoregon kernel: [66571.611190] RECLAIM_FS-ON-R at:
Nov 17 14:07:39 thoregon kernel: [66571.611191] [<ffffffff8108137e>] mark_held_locks+0x7e/0x130
Nov 17 14:07:39 thoregon kernel: [66571.611193] [<ffffffff81081a63>] lockdep_trace_alloc+0x63/0xc0
Nov 17 14:07:39 thoregon kernel: [66571.611195] [<ffffffff810e9dc5>] kmem_cache_alloc+0x35/0xe0
Nov 17 14:07:39 thoregon kernel: [66571.611197] [<ffffffff811f6d4f>] kmem_zone_alloc+0x5f/0xe0
Nov 17 14:07:39 thoregon kernel: [66571.611198] [<ffffffff811f6de8>] kmem_zone_zalloc+0x18/0x50
Nov 17 14:07:39 thoregon kernel: [66571.611200] [<ffffffff8122b0b2>] _xfs_trans_alloc+0x32/0x90
Nov 17 14:07:39 thoregon kernel: [66571.611202] [<ffffffff8122b148>] xfs_trans_alloc+0x38/0x50
Nov 17 14:07:39 thoregon kernel: [66571.611204] [<ffffffff811f5157>] xfs_create+0x187/0x5a0
Nov 17 14:07:39 thoregon kernel: [66571.611205] [<ffffffff811eca2a>] xfs_vn_mknod+0x8a/0x1b0
Nov 17 14:07:39 thoregon kernel: [66571.611207] [<ffffffff811ecb7e>] xfs_vn_create+0xe/0x10
Nov 17 14:07:39 thoregon kernel: [66571.611209] [<ffffffff81100322>] vfs_create+0x72/0xc0
Nov 17 14:07:39 thoregon kernel: [66571.611211] [<ffffffff81100b7e>] do_last.isra.69+0x80e/0xc80
Nov 17 14:07:39 thoregon kernel: [66571.611213] [<ffffffff8110109b>] path_openat.isra.70+0xab/0x490
Nov 17 14:07:39 thoregon kernel: [66571.611215] [<ffffffff8110183d>] do_filp_open+0x3d/0xa0
Nov 17 14:07:39 thoregon kernel: [66571.611217] [<ffffffff810f2129>] do_sys_open+0xf9/0x1e0
Nov 17 14:07:39 thoregon kernel: [66571.611219] [<ffffffff810f222c>] sys_open+0x1c/0x20
Nov 17 14:07:39 thoregon kernel: [66571.611221] [<ffffffff816b2e52>] system_call_fastpath+0x16/0x1b
Nov 17 14:07:39 thoregon kernel: [66571.611223] INITIAL USE at:
Nov 17 14:07:39 thoregon kernel: [66571.611224] [<ffffffff8107ed49>] __lock_acquire+0x2e9/0x1c00
Nov 17 14:07:39 thoregon kernel: [66571.611225] [<ffffffff81080b55>] lock_acquire+0x55/0x70
Nov 17 14:07:39 thoregon kernel: [66571.611227] [<ffffffff810f451b>] __sb_start_write+0xab/0x190
Nov 17 14:07:39 thoregon kernel: [66571.611229] [<ffffffff8122b138>] xfs_trans_alloc+0x28/0x50
Nov 17 14:07:39 thoregon kernel: [66571.611231] [<ffffffff811f5157>] xfs_create+0x187/0x5a0
Nov 17 14:07:39 thoregon kernel: [66571.611232] [<ffffffff811eca2a>] xfs_vn_mknod+0x8a/0x1b0
Nov 17 14:07:39 thoregon kernel: [66571.611234] [<ffffffff811ecb7e>] xfs_vn_create+0xe/0x10
Nov 17 14:07:39 thoregon kernel: [66571.611236] [<ffffffff81100322>] vfs_create+0x72/0xc0
Nov 17 14:07:39 thoregon kernel: [66571.611238] [<ffffffff81100b7e>] do_last.isra.69+0x80e/0xc80
Nov 17 14:07:39 thoregon kernel: [66571.611240] [<ffffffff8110109b>] path_openat.isra.70+0xab/0x490
Nov 17 14:07:39 thoregon kernel: [66571.611242] [<ffffffff8110183d>] do_filp_open+0x3d/0xa0
Nov 17 14:07:39 thoregon kernel: [66571.611244] [<ffffffff810f2129>] do_sys_open+0xf9/0x1e0
Nov 17 14:07:39 thoregon kernel: [66571.611246] [<ffffffff810f222c>] sys_open+0x1c/0x20
Nov 17 14:07:39 thoregon kernel: [66571.611248] [<ffffffff816b2e52>] system_call_fastpath+0x16/0x1b
Nov 17 14:07:39 thoregon kernel: [66571.611250] }
Nov 17 14:07:39 thoregon kernel: [66571.611251] ... key at: [<ffffffff81c34e40>] xfs_fs_type+0x60/0x80
Nov 17 14:07:39 thoregon kernel: [66571.611254] ... acquired at:
Nov 17 14:07:39 thoregon kernel: [66571.611254] [<ffffffff8107df3b>] check_usage_forwards+0x10b/0x140
Nov 17 14:07:39 thoregon kernel: [66571.611256] [<ffffffff8107e900>] mark_lock+0x190/0x2f0
Nov 17 14:07:39 thoregon kernel: [66571.611258] [<ffffffff8107efdf>] __lock_acquire+0x57f/0x1c00
Nov 17 14:07:39 thoregon kernel: [66571.611260] [<ffffffff81080b55>] lock_acquire+0x55/0x70
Nov 17 14:07:39 thoregon kernel: [66571.611261] [<ffffffff810f451b>] __sb_start_write+0xab/0x190
Nov 17 14:07:39 thoregon kernel: [66571.611263] [<ffffffff8122b138>] xfs_trans_alloc+0x28/0x50
Nov 17 14:07:39 thoregon kernel: [66571.611265] [<ffffffff811f3e64>] xfs_free_eofblocks+0x104/0x250
Nov 17 14:07:39 thoregon kernel: [66571.611266] [<ffffffff811f4b39>] xfs_inactive+0xa9/0x480
Nov 17 14:07:39 thoregon kernel: [66571.611268] [<ffffffff811efe10>] xfs_fs_evict_inode+0x70/0x80
Nov 17 14:07:39 thoregon kernel: [66571.611270] [<ffffffff8110cb7f>] evict+0xaf/0x1b0
Nov 17 14:07:39 thoregon kernel: [66571.611271] [<ffffffff8110d209>] dispose_list+0x39/0x50
Nov 17 14:07:39 thoregon kernel: [66571.611273] [<ffffffff8110dc63>] prune_icache_sb+0x183/0x340
Nov 17 14:07:39 thoregon kernel: [66571.611275] [<ffffffff810f5bc3>] prune_super+0xf3/0x1a0
Nov 17 14:07:39 thoregon kernel: [66571.611277] [<ffffffff810bbdee>] shrink_slab+0x11e/0x1f0
Nov 17 14:07:39 thoregon kernel: [66571.611278] [<ffffffff810be98f>] try_to_free_pages+0x21f/0x4e0
Nov 17 14:07:39 thoregon kernel: [66571.611280] [<ffffffff810b5ec6>] __alloc_pages_nodemask+0x506/0x800
Nov 17 14:07:39 thoregon kernel: [66571.611282] [<ffffffff810ce56e>] handle_pte_fault+0x5ae/0x7a0
Nov 17 14:07:39 thoregon kernel: [66571.611284] [<ffffffff810cf769>] handle_mm_fault+0x1f9/0x2a0
Nov 17 14:07:39 thoregon kernel: [66571.611286] [<ffffffff81029dfc>] __do_page_fault+0x16c/0x480
Nov 17 14:07:39 thoregon kernel: [66571.611287] [<ffffffff8102a139>] do_page_fault+0x9/0x10
Nov 17 14:07:39 thoregon kernel: [66571.611289] [<ffffffff816b287f>] page_fault+0x1f/0x30
Nov 17 14:07:39 thoregon kernel: [66571.611291]
Nov 17 14:07:39 thoregon kernel: [66571.611292]
Nov 17 14:07:39 thoregon kernel: [66571.611292] stack backtrace:
Nov 17 14:07:39 thoregon kernel: [66571.611294] Pid: 21330, comm: cc1 Not tainted 3.7.0-rc5 #1
Nov 17 14:07:39 thoregon kernel: [66571.611295] Call Trace:
Nov 17 14:07:39 thoregon kernel: [66571.611298] [<ffffffff8107de28>] print_irq_inversion_bug.part.37+0x1e8/0x1f0
Nov 17 14:07:39 thoregon kernel: [66571.611300] [<ffffffff8107df3b>] check_usage_forwards+0x10b/0x140
Nov 17 14:07:39 thoregon kernel: [66571.611303] [<ffffffff8107e900>] mark_lock+0x190/0x2f0
Nov 17 14:07:39 thoregon kernel: [66571.611306] [<ffffffff8150406e>] ? dm_request+0x2e/0x2a0
Nov 17 14:07:39 thoregon kernel: [66571.611308] [<ffffffff8107de30>] ? print_irq_inversion_bug.part.37+0x1f0/0x1f0
Nov 17 14:07:39 thoregon kernel: [66571.611310] [<ffffffff8107efdf>] __lock_acquire+0x57f/0x1c00
Nov 17 14:07:39 thoregon kernel: [66571.611313] [<ffffffff812202f4>] ? xfs_iext_bno_to_ext+0x84/0x160
Nov 17 14:07:39 thoregon kernel: [66571.611316] [<ffffffff8120a023>] ? xfs_bmbt_get_all+0x13/0x20
Nov 17 14:07:39 thoregon kernel: [66571.611318] [<ffffffff81200fb4>] ? xfs_bmap_search_multi_extents+0xa4/0x110
Nov 17 14:07:39 thoregon kernel: [66571.611320] [<ffffffff81080b55>] lock_acquire+0x55/0x70
Nov 17 14:07:39 thoregon kernel: [66571.611322] [<ffffffff8122b138>] ? xfs_trans_alloc+0x28/0x50
Nov 17 14:07:39 thoregon kernel: [66571.611324] [<ffffffff810f451b>] __sb_start_write+0xab/0x190
Nov 17 14:07:39 thoregon kernel: [66571.611326] [<ffffffff8122b138>] ? xfs_trans_alloc+0x28/0x50
Nov 17 14:07:39 thoregon kernel: [66571.611328] [<ffffffff8122b138>] ? xfs_trans_alloc+0x28/0x50
Nov 17 14:07:39 thoregon kernel: [66571.611330] [<ffffffff8122b138>] xfs_trans_alloc+0x28/0x50
Nov 17 14:07:39 thoregon kernel: [66571.611332] [<ffffffff811f3e64>] xfs_free_eofblocks+0x104/0x250
Nov 17 14:07:39 thoregon kernel: [66571.611334] [<ffffffff816b204b>] ? _raw_spin_unlock_irq+0x2b/0x50
Nov 17 14:07:39 thoregon kernel: [66571.611336] [<ffffffff811f4b39>] xfs_inactive+0xa9/0x480
Nov 17 14:07:39 thoregon kernel: [66571.611337] [<ffffffff816b204b>] ? _raw_spin_unlock_irq+0x2b/0x50
Nov 17 14:07:39 thoregon kernel: [66571.611340] [<ffffffff811efe10>] xfs_fs_evict_inode+0x70/0x80
Nov 17 14:07:39 thoregon kernel: [66571.611342] [<ffffffff8110cb7f>] evict+0xaf/0x1b0
Nov 17 14:07:39 thoregon kernel: [66571.611344] [<ffffffff8110d209>] dispose_list+0x39/0x50
Nov 17 14:07:39 thoregon kernel: [66571.611346] [<ffffffff8110dc63>] prune_icache_sb+0x183/0x340
Nov 17 14:07:39 thoregon kernel: [66571.611347] [<ffffffff810f5bc3>] prune_super+0xf3/0x1a0
Nov 17 14:07:39 thoregon kernel: [66571.611349] [<ffffffff810bbdee>] shrink_slab+0x11e/0x1f0
Nov 17 14:07:39 thoregon kernel: [66571.611352] [<ffffffff810be98f>] try_to_free_pages+0x21f/0x4e0
Nov 17 14:07:39 thoregon kernel: [66571.611354] [<ffffffff810b5ec6>] __alloc_pages_nodemask+0x506/0x800
Nov 17 14:07:39 thoregon kernel: [66571.611356] [<ffffffff810b9e40>] ? lru_deactivate_fn+0x1c0/0x1c0
Nov 17 14:07:39 thoregon kernel: [66571.611358] [<ffffffff810ce56e>] handle_pte_fault+0x5ae/0x7a0
Nov 17 14:07:39 thoregon kernel: [66571.611360] [<ffffffff810cf769>] handle_mm_fault+0x1f9/0x2a0
Nov 17 14:07:39 thoregon kernel: [66571.611363] [<ffffffff81029dfc>] __do_page_fault+0x16c/0x480
Nov 17 14:07:39 thoregon kernel: [66571.611366] [<ffffffff8129c7ad>] ? trace_hardirqs_off_thunk+0x3a/0x3c
Nov 17 14:07:39 thoregon kernel: [66571.611368] [<ffffffff8102a139>] do_page_fault+0x9/0x10
Nov 17 14:07:39 thoregon kernel: [66571.611370] [<ffffffff816b287f>] page_fault+0x1f/0x30

2012-11-18 15:29:25

by Torsten Kaiser

[permalink] [raw]
Subject: Re: Hang in XFS reclaim on 3.7.0-rc3

On Sun, Nov 18, 2012 at 11:24 AM, Torsten Kaiser
<[email protected]> wrote:
> On Tue, Oct 30, 2012 at 9:37 PM, Torsten Kaiser
> <[email protected]> wrote:
>> I will keep LOCKDEP enabled on that system, and if there really is
>> another splat, I will report back here. But I rather doubt that this
>> will be needed.
>
> After the patch, I did not see this problem again, but today I found
> another LOCKDEP report that also looks XFS related.
> I found it twice in the logs, and as both were slightly different, I
> will attach both versions.

> Nov 6 21:57:09 thoregon kernel: [ 9941.104353] 3.7.0-rc4 #1 Not tainted
> Nov 6 21:57:09 thoregon kernel: [ 9941.104355] inconsistent
> {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
> Nov 6 21:57:09 thoregon kernel: [ 9941.104430] CPU0
> Nov 6 21:57:09 thoregon kernel: [ 9941.104431] ----
> Nov 6 21:57:09 thoregon kernel: [ 9941.104432] lock(&(&ip->i_lock)->mr_lock);
> Nov 6 21:57:09 thoregon kernel: [ 9941.104433] <Interrupt>
> Nov 6 21:57:09 thoregon kernel: [ 9941.104434]
> lock(&(&ip->i_lock)->mr_lock);
> Nov 6 21:57:09 thoregon kernel: [ 9941.104435]
> Nov 6 21:57:09 thoregon kernel: [ 9941.104435] *** DEADLOCK ***

Sorry! Copied the wrong report. Your fix only landed in -rc5, so my
vanilla -rc4 did (also) report the old problem again.
And I copy&pasted that report instead of the second appearance of the
new problem.

Here is the correct second report of the sb_internal vs
ip->i_lock->mr_lock problem:
[110926.972477]
[110926.972482] =========================================================
[110926.972484] [ INFO: possible irq lock inversion dependency detected ]
[110926.972486] 3.7.0-rc4 #1 Not tainted
[110926.972487] ---------------------------------------------------------
[110926.972489] kswapd0/725 just changed the state of lock:
[110926.972490] (sb_internal){.+.+.?}, at: [<ffffffff8122b268>]
xfs_trans_alloc+0x28/0x50
[110926.972499] but this lock took another, RECLAIM_FS-unsafe lock in the past:
[110926.972500] (&(&ip->i_lock)->mr_lock/1){+.+.+.}
[110926.972500]
[110926.972500] and interrupts could create inverse lock ordering between them.
[110926.972500]
[110926.972503]
[110926.972503] other info that might help us debug this:
[110926.972504] Possible interrupt unsafe locking scenario:
[110926.972504]
[110926.972505]        CPU0                    CPU1
[110926.972506]        ----                    ----
[110926.972507]   lock(&(&ip->i_lock)->mr_lock/1);
[110926.972509]                                local_irq_disable();
[110926.972509]                                lock(sb_internal);
[110926.972511]                                lock(&(&ip->i_lock)->mr_lock/1);
[110926.972512]   <Interrupt>
[110926.972513]     lock(sb_internal);
[110926.972514]
[110926.972514] *** DEADLOCK ***
[110926.972514]
[110926.972516] 2 locks held by kswapd0/725:
[110926.972517] #0: (shrinker_rwsem){++++..}, at:
[<ffffffff810bbd22>] shrink_slab+0x32/0x1f0
[110926.972522] #1: (&type->s_umount_key#20){++++.+}, at:
[<ffffffff810f5a8e>] grab_super_passive+0x3e/0x90
[110926.972527]
[110926.972527] the shortest dependencies between 2nd lock and 1st lock:
[110926.972533] -> (&(&ip->i_lock)->mr_lock/1){+.+.+.} ops: 58117 {
[110926.972536] HARDIRQ-ON-W at:
[110926.972537] [<ffffffff8107f091>]
__lock_acquire+0x631/0x1c00
[110926.972540] [<ffffffff81080b55>]
lock_acquire+0x55/0x70
[110926.972542] [<ffffffff8106126a>]
down_write_nested+0x4a/0x70
[110926.972545] [<ffffffff811e8164>] xfs_ilock+0x84/0xb0
[110926.972548] [<ffffffff811f5194>]
xfs_create+0x1d4/0x5a0
[110926.972550] [<ffffffff811eca1a>]
xfs_vn_mknod+0x8a/0x1b0
[110926.972552] [<ffffffff811ecb6e>]
xfs_vn_create+0xe/0x10
[110926.972554] [<ffffffff81100332>] vfs_create+0x72/0xc0
[110926.972556] [<ffffffff81100b8e>]
do_last.isra.69+0x80e/0xc80
[110926.972558] [<ffffffff811010ab>]
path_openat.isra.70+0xab/0x490
[110926.972560] [<ffffffff8110184d>]
do_filp_open+0x3d/0xa0
[110926.972562] [<ffffffff810f2139>]
do_sys_open+0xf9/0x1e0
[110926.972565] [<ffffffff810f223c>] sys_open+0x1c/0x20
[110926.972567] [<ffffffff816b2d12>]
system_call_fastpath+0x16/0x1b
[110926.972570] SOFTIRQ-ON-W at:
[110926.972571] [<ffffffff8107f0c7>]
__lock_acquire+0x667/0x1c00
[110926.972573] [<ffffffff81080b55>]
lock_acquire+0x55/0x70
[110926.972574] [<ffffffff8106126a>]
down_write_nested+0x4a/0x70
[110926.972576] [<ffffffff811e8164>] xfs_ilock+0x84/0xb0
[110926.972578] [<ffffffff811f5194>]
xfs_create+0x1d4/0x5a0
[110926.972580] [<ffffffff811eca1a>]
xfs_vn_mknod+0x8a/0x1b0
[110926.972581] [<ffffffff811ecb6e>]
xfs_vn_create+0xe/0x10
[110926.972583] [<ffffffff81100332>] vfs_create+0x72/0xc0
[110926.972585] [<ffffffff81100b8e>]
do_last.isra.69+0x80e/0xc80
[110926.972587] [<ffffffff811010ab>]
path_openat.isra.70+0xab/0x490
[110926.972589] [<ffffffff8110184d>]
do_filp_open+0x3d/0xa0
[110926.972591] [<ffffffff810f2139>]
do_sys_open+0xf9/0x1e0
[110926.972593] [<ffffffff810f223c>] sys_open+0x1c/0x20
[110926.972595] [<ffffffff816b2d12>]
system_call_fastpath+0x16/0x1b
[110926.972597] RECLAIM_FS-ON-W at:
[110926.972598] [<ffffffff8108137e>]
mark_held_locks+0x7e/0x130
[110926.972600] [<ffffffff81081a63>]
lockdep_trace_alloc+0x63/0xc0
[110926.972601] [<ffffffff810e9dd5>]
kmem_cache_alloc+0x35/0xe0
[110926.972603] [<ffffffff810dba31>]
vm_map_ram+0x271/0x770
[110926.972606] [<ffffffff811e1316>]
_xfs_buf_map_pages+0x46/0xe0
[110926.972609] [<ffffffff811e222a>]
xfs_buf_get_map+0x8a/0x130
[110926.972610] [<ffffffff81233ab9>]
xfs_trans_get_buf_map+0xa9/0xd0
[110926.972613] [<ffffffff8121bced>]
xfs_ialloc_inode_init+0xcd/0x1d0
[110926.972616] [<ffffffff8121c25e>]
xfs_ialloc_ag_alloc+0x1be/0x560
[110926.972618] [<ffffffff8121da65>]
xfs_dialloc+0x185/0x2a0
[110926.972619] [<ffffffff8121f198>]
xfs_ialloc+0x58/0x650
[110926.972621] [<ffffffff811f3985>]
xfs_dir_ialloc+0x65/0x270
[110926.972623] [<ffffffff811f536c>]
xfs_create+0x3ac/0x5a0
[110926.972624] [<ffffffff811eca1a>]
xfs_vn_mknod+0x8a/0x1b0
[110926.972626] [<ffffffff811ecb6e>]
xfs_vn_create+0xe/0x10
[110926.972628] [<ffffffff81100332>]
vfs_create+0x72/0xc0
[110926.972630] [<ffffffff81100b8e>]
do_last.isra.69+0x80e/0xc80
[110926.972632] [<ffffffff811010ab>]
path_openat.isra.70+0xab/0x490
[110926.972634] [<ffffffff8110184d>]
do_filp_open+0x3d/0xa0
[110926.972636] [<ffffffff810f2139>]
do_sys_open+0xf9/0x1e0
[110926.972638] [<ffffffff810f223c>] sys_open+0x1c/0x20
[110926.972640] [<ffffffff816b2d12>]
system_call_fastpath+0x16/0x1b
[110926.972642] INITIAL USE at:
[110926.972642] [<ffffffff8107ed49>]
__lock_acquire+0x2e9/0x1c00
[110926.972644] [<ffffffff81080b55>] lock_acquire+0x55/0x70
[110926.972646] [<ffffffff8106126a>]
down_write_nested+0x4a/0x70
[110926.972648] [<ffffffff811e8164>] xfs_ilock+0x84/0xb0
[110926.972650] [<ffffffff811f5194>] xfs_create+0x1d4/0x5a0
[110926.972651] [<ffffffff811eca1a>]
xfs_vn_mknod+0x8a/0x1b0
[110926.972653] [<ffffffff811ecb6e>] xfs_vn_create+0xe/0x10
[110926.972655] [<ffffffff81100332>] vfs_create+0x72/0xc0
[110926.972657] [<ffffffff81100b8e>]
do_last.isra.69+0x80e/0xc80
[110926.972659] [<ffffffff811010ab>]
path_openat.isra.70+0xab/0x490
[110926.972661] [<ffffffff8110184d>] do_filp_open+0x3d/0xa0
[110926.972663] [<ffffffff810f2139>] do_sys_open+0xf9/0x1e0
[110926.972664] [<ffffffff810f223c>] sys_open+0x1c/0x20
[110926.972666] [<ffffffff816b2d12>]
system_call_fastpath+0x16/0x1b
[110926.972668] }
[110926.972669] ... key at: [<ffffffff825b4b81>] __key.41355+0x1/0x8
[110926.972672] ... acquired at:
[110926.972672] [<ffffffff81080b55>] lock_acquire+0x55/0x70
[110926.972674] [<ffffffff8106126a>] down_write_nested+0x4a/0x70
[110926.972676] [<ffffffff811e8164>] xfs_ilock+0x84/0xb0
[110926.972678] [<ffffffff811f5194>] xfs_create+0x1d4/0x5a0
[110926.972679] [<ffffffff811eca1a>] xfs_vn_mknod+0x8a/0x1b0
[110926.972681] [<ffffffff811ecb6e>] xfs_vn_create+0xe/0x10
[110926.972683] [<ffffffff81100332>] vfs_create+0x72/0xc0
[110926.972684] [<ffffffff81100b8e>] do_last.isra.69+0x80e/0xc80
[110926.972686] [<ffffffff811010ab>] path_openat.isra.70+0xab/0x490
[110926.972688] [<ffffffff8110184d>] do_filp_open+0x3d/0xa0
[110926.972690] [<ffffffff810f2139>] do_sys_open+0xf9/0x1e0
[110926.972692] [<ffffffff810f223c>] sys_open+0x1c/0x20
[110926.972694] [<ffffffff816b2d12>] system_call_fastpath+0x16/0x1b
[110926.972696]
[110926.972696] -> (sb_internal){.+.+.?} ops: 1710064 {
[110926.972699] HARDIRQ-ON-R at:
[110926.972700] [<ffffffff8107ef6a>]
__lock_acquire+0x50a/0x1c00
[110926.972702] [<ffffffff81080b55>] lock_acquire+0x55/0x70
[110926.972704] [<ffffffff810f452b>]
__sb_start_write+0xab/0x190
[110926.972705] [<ffffffff8122b268>]
xfs_trans_alloc+0x28/0x50
[110926.972707] [<ffffffff811f5147>] xfs_create+0x187/0x5a0
[110926.972709] [<ffffffff811eca1a>] xfs_vn_mknod+0x8a/0x1b0
[110926.972711] [<ffffffff811ecb6e>] xfs_vn_create+0xe/0x10
[110926.972712] [<ffffffff81100332>] vfs_create+0x72/0xc0
[110926.972714] [<ffffffff81100b8e>]
do_last.isra.69+0x80e/0xc80
[110926.972716] [<ffffffff811010ab>]
path_openat.isra.70+0xab/0x490
[110926.972718] [<ffffffff8110184d>] do_filp_open+0x3d/0xa0
[110926.972720] [<ffffffff810f2139>] do_sys_open+0xf9/0x1e0
[110926.972722] [<ffffffff810f223c>] sys_open+0x1c/0x20
[110926.972724] [<ffffffff816b2d12>]
system_call_fastpath+0x16/0x1b
[110926.972726] SOFTIRQ-ON-R at:
[110926.972727] [<ffffffff8107f0c7>]
__lock_acquire+0x667/0x1c00
[110926.972728] [<ffffffff81080b55>] lock_acquire+0x55/0x70
[110926.972730] [<ffffffff810f452b>]
__sb_start_write+0xab/0x190
[110926.972732] [<ffffffff8122b268>]
xfs_trans_alloc+0x28/0x50
[110926.972734] [<ffffffff811f5147>] xfs_create+0x187/0x5a0
[110926.972735] [<ffffffff811eca1a>] xfs_vn_mknod+0x8a/0x1b0
[110926.972737] [<ffffffff811ecb6e>] xfs_vn_create+0xe/0x10
[110926.972739] [<ffffffff81100332>] vfs_create+0x72/0xc0
[110926.972741] [<ffffffff81100b8e>]
do_last.isra.69+0x80e/0xc80
[110926.972743] [<ffffffff811010ab>]
path_openat.isra.70+0xab/0x490
[110926.972745] [<ffffffff8110184d>] do_filp_open+0x3d/0xa0
[110926.972747] [<ffffffff810f2139>] do_sys_open+0xf9/0x1e0
[110926.972749] [<ffffffff810f223c>] sys_open+0x1c/0x20
[110926.972750] [<ffffffff816b2d12>]
system_call_fastpath+0x16/0x1b
[110926.972752] IN-RECLAIM_FS-R at:
[110926.972753] [<ffffffff8107efdf>]
__lock_acquire+0x57f/0x1c00
[110926.972755] [<ffffffff81080b55>]
lock_acquire+0x55/0x70
[110926.972757] [<ffffffff810f452b>]
__sb_start_write+0xab/0x190
[110926.972758] [<ffffffff8122b268>]
xfs_trans_alloc+0x28/0x50
[110926.972760] [<ffffffff811f3e54>]
xfs_free_eofblocks+0x104/0x250
[110926.972762] [<ffffffff811f4b29>]
xfs_inactive+0xa9/0x480
[110926.972763] [<ffffffff811efe00>]
xfs_fs_evict_inode+0x70/0x80
[110926.972765] [<ffffffff8110cb8f>] evict+0xaf/0x1b0
[110926.972768] [<ffffffff8110d219>]
dispose_list+0x39/0x50
[110926.972770] [<ffffffff8110dc73>]
prune_icache_sb+0x183/0x340
[110926.972772] [<ffffffff810f5bd3>]
prune_super+0xf3/0x1a0
[110926.972773] [<ffffffff810bbe0e>]
shrink_slab+0x11e/0x1f0
[110926.972775] [<ffffffff810be400>] kswapd+0x690/0xa10
[110926.972777] [<ffffffff8105c246>] kthread+0xd6/0xe0
[110926.972779] [<ffffffff816b2c6c>]
ret_from_fork+0x7c/0xb0
[110926.972781] RECLAIM_FS-ON-R at:
[110926.972782] [<ffffffff8108137e>]
mark_held_locks+0x7e/0x130
[110926.972784] [<ffffffff81081a63>]
lockdep_trace_alloc+0x63/0xc0
[110926.972785] [<ffffffff810e9dd5>]
kmem_cache_alloc+0x35/0xe0
[110926.972787] [<ffffffff811f6d3f>]
kmem_zone_alloc+0x5f/0xe0
[110926.972789] [<ffffffff811f6dd8>]
kmem_zone_zalloc+0x18/0x50
[110926.972790] [<ffffffff8122b1e2>]
_xfs_trans_alloc+0x32/0x90
[110926.972792] [<ffffffff8122b278>]
xfs_trans_alloc+0x38/0x50
[110926.972794] [<ffffffff811f5147>]
xfs_create+0x187/0x5a0
[110926.972796] [<ffffffff811eca1a>]
xfs_vn_mknod+0x8a/0x1b0
[110926.972797] [<ffffffff811ecb6e>]
xfs_vn_create+0xe/0x10
[110926.972799] [<ffffffff81100332>] vfs_create+0x72/0xc0
[110926.972801] [<ffffffff81100b8e>]
do_last.isra.69+0x80e/0xc80
[110926.972803] [<ffffffff811010ab>]
path_openat.isra.70+0xab/0x490
[110926.972805] [<ffffffff8110184d>]
do_filp_open+0x3d/0xa0
[110926.972807] [<ffffffff810f2139>]
do_sys_open+0xf9/0x1e0
[110926.972809] [<ffffffff810f223c>] sys_open+0x1c/0x20
[110926.972811] [<ffffffff816b2d12>]
system_call_fastpath+0x16/0x1b
[110926.972813] INITIAL USE at:
[110926.972814] [<ffffffff8107ed49>]
__lock_acquire+0x2e9/0x1c00
[110926.972815] [<ffffffff81080b55>] lock_acquire+0x55/0x70
[110926.972817] [<ffffffff810f452b>]
__sb_start_write+0xab/0x190
[110926.972819] [<ffffffff8122b268>]
xfs_trans_alloc+0x28/0x50
[110926.972820] [<ffffffff811f5147>] xfs_create+0x187/0x5a0
[110926.972822] [<ffffffff811eca1a>] xfs_vn_mknod+0x8a/0x1b0
[110926.972824] [<ffffffff811ecb6e>] xfs_vn_create+0xe/0x10
[110926.972826] [<ffffffff81100332>] vfs_create+0x72/0xc0
[110926.972827] [<ffffffff81100b8e>]
do_last.isra.69+0x80e/0xc80
[110926.972829] [<ffffffff811010ab>]
path_openat.isra.70+0xab/0x490
[110926.972831] [<ffffffff8110184d>] do_filp_open+0x3d/0xa0
[110926.972833] [<ffffffff810f2139>] do_sys_open+0xf9/0x1e0
[110926.972835] [<ffffffff810f223c>] sys_open+0x1c/0x20
[110926.972837] [<ffffffff816b2d12>]
system_call_fastpath+0x16/0x1b
[110926.972839] }
[110926.972840] ... key at: [<ffffffff81c34e40>] xfs_fs_type+0x60/0x80
[110926.972842] ... acquired at:
[110926.972843] [<ffffffff8107df3b>] check_usage_forwards+0x10b/0x140
[110926.972845] [<ffffffff8107e900>] mark_lock+0x190/0x2f0
[110926.972846] [<ffffffff8107efdf>] __lock_acquire+0x57f/0x1c00
[110926.972848] [<ffffffff81080b55>] lock_acquire+0x55/0x70
[110926.972850] [<ffffffff810f452b>] __sb_start_write+0xab/0x190
[110926.972851] [<ffffffff8122b268>] xfs_trans_alloc+0x28/0x50
[110926.972853] [<ffffffff811f3e54>] xfs_free_eofblocks+0x104/0x250
[110926.972855] [<ffffffff811f4b29>] xfs_inactive+0xa9/0x480
[110926.972856] [<ffffffff811efe00>] xfs_fs_evict_inode+0x70/0x80
[110926.972858] [<ffffffff8110cb8f>] evict+0xaf/0x1b0
[110926.972860] [<ffffffff8110d219>] dispose_list+0x39/0x50
[110926.972861] [<ffffffff8110dc73>] prune_icache_sb+0x183/0x340
[110926.972863] [<ffffffff810f5bd3>] prune_super+0xf3/0x1a0
[110926.972865] [<ffffffff810bbe0e>] shrink_slab+0x11e/0x1f0
[110926.972866] [<ffffffff810be400>] kswapd+0x690/0xa10
[110926.972868] [<ffffffff8105c246>] kthread+0xd6/0xe0
[110926.972870] [<ffffffff816b2c6c>] ret_from_fork+0x7c/0xb0
[110926.972871]
[110926.972872]
[110926.972872] stack backtrace:
[110926.972874] Pid: 725, comm: kswapd0 Not tainted 3.7.0-rc4 #1
[110926.972875] Call Trace:
[110926.972878] [<ffffffff8107de28>]
print_irq_inversion_bug.part.37+0x1e8/0x1f0
[110926.972880] [<ffffffff8107df3b>] check_usage_forwards+0x10b/0x140
[110926.972883] [<ffffffff8107e900>] mark_lock+0x190/0x2f0
[110926.972885] [<ffffffff8107de30>] ?
print_irq_inversion_bug.part.37+0x1f0/0x1f0
[110926.972887] [<ffffffff8107efdf>] __lock_acquire+0x57f/0x1c00
[110926.972889] [<ffffffff81220424>] ? xfs_iext_bno_to_ext+0x84/0x160
[110926.972892] [<ffffffff8120a0e3>] ? xfs_bmbt_get_all+0x13/0x20
[110926.972895] [<ffffffff81201104>] ? xfs_bmap_search_multi_extents+0xa4/0x110
[110926.972897] [<ffffffff81080b55>] lock_acquire+0x55/0x70
[110926.972899] [<ffffffff8122b268>] ? xfs_trans_alloc+0x28/0x50
[110926.972901] [<ffffffff810f452b>] __sb_start_write+0xab/0x190
[110926.972903] [<ffffffff8122b268>] ? xfs_trans_alloc+0x28/0x50
[110926.972905] [<ffffffff8122b268>] ? xfs_trans_alloc+0x28/0x50
[110926.972907] [<ffffffff8122b268>] xfs_trans_alloc+0x28/0x50
[110926.972908] [<ffffffff811f3e54>] xfs_free_eofblocks+0x104/0x250
[110926.972910] [<ffffffff816b1efb>] ? _raw_spin_unlock_irq+0x2b/0x50
[110926.972912] [<ffffffff811f4b29>] xfs_inactive+0xa9/0x480
[110926.972914] [<ffffffff816b1efb>] ? _raw_spin_unlock_irq+0x2b/0x50
[110926.972916] [<ffffffff811efe00>] xfs_fs_evict_inode+0x70/0x80
[110926.972918] [<ffffffff8110cb8f>] evict+0xaf/0x1b0
[110926.972920] [<ffffffff8110d219>] dispose_list+0x39/0x50
[110926.972922] [<ffffffff8110dc73>] prune_icache_sb+0x183/0x340
[110926.972924] [<ffffffff810f5bd3>] prune_super+0xf3/0x1a0
[110926.972926] [<ffffffff810bbe0e>] shrink_slab+0x11e/0x1f0
[110926.972928] [<ffffffff810be400>] kswapd+0x690/0xa10
[110926.972930] [<ffffffff8105ca30>] ? __init_waitqueue_head+0x60/0x60
[110926.972932] [<ffffffff810bdd70>] ? shrink_lruvec+0x540/0x540
[110926.972934] [<ffffffff8105c246>] kthread+0xd6/0xe0
[110926.972936] [<ffffffff816b1efb>] ? _raw_spin_unlock_irq+0x2b/0x50
[110926.972938] [<ffffffff8105c170>] ? flush_kthread_worker+0xe0/0xe0
[110926.972940] [<ffffffff816b2c6c>] ret_from_fork+0x7c/0xb0
[110926.972942] [<ffffffff8105c170>] ? flush_kthread_worker+0xe0/0xe0

2012-11-18 23:51:10

by Dave Chinner

[permalink] [raw]
Subject: Re: Hang in XFS reclaim on 3.7.0-rc3

On Sun, Nov 18, 2012 at 04:29:22PM +0100, Torsten Kaiser wrote:
> On Sun, Nov 18, 2012 at 11:24 AM, Torsten Kaiser
> <[email protected]> wrote:
> > On Tue, Oct 30, 2012 at 9:37 PM, Torsten Kaiser
> > <[email protected]> wrote:
> >> I will keep LOCKDEP enabled on that system, and if there really is
> >> another splat, I will report back here. But I rather doubt that this
> >> will be needed.
> >
> > After the patch, I did not see this problem again, but today I found
> > another LOCKDEP report that also looks XFS related.
> > I found it twice in the logs, and as both were slightly different, I
> > will attach both versions.
>
> > Nov 6 21:57:09 thoregon kernel: [ 9941.104353] 3.7.0-rc4 #1 Not tainted
> > Nov 6 21:57:09 thoregon kernel: [ 9941.104355] inconsistent
> > {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
> > Nov 6 21:57:09 thoregon kernel: [ 9941.104430] CPU0
> > Nov 6 21:57:09 thoregon kernel: [ 9941.104431] ----
> > Nov 6 21:57:09 thoregon kernel: [ 9941.104432] lock(&(&ip->i_lock)->mr_lock);
> > Nov 6 21:57:09 thoregon kernel: [ 9941.104433] <Interrupt>
> > Nov 6 21:57:09 thoregon kernel: [ 9941.104434]
> > lock(&(&ip->i_lock)->mr_lock);
> > Nov 6 21:57:09 thoregon kernel: [ 9941.104435]
> > Nov 6 21:57:09 thoregon kernel: [ 9941.104435] *** DEADLOCK ***
>
> Sorry! Copied the wrong report. Your fix only landed in -rc5, so my
> vanilla -rc4 did (also) report the old problem again.
> And I copy&pasted that report instead of the second appearance of the
> new problem.

Can you repost it with line wrapping turned off? The output simply
becomes unreadable when it wraps....

Yeah, I know I can put it back together, but I've got better things
to do with my time than stitch a couple of hundred lines of debug
back into a readable format....

Cheers,

Dave.
--
Dave Chinner
[email protected]

2012-11-19 06:50:10

by Torsten Kaiser

[permalink] [raw]
Subject: Re: Hang in XFS reclaim on 3.7.0-rc3

On Mon, Nov 19, 2012 at 12:51 AM, Dave Chinner <[email protected]> wrote:
> On Sun, Nov 18, 2012 at 04:29:22PM +0100, Torsten Kaiser wrote:
>> On Sun, Nov 18, 2012 at 11:24 AM, Torsten Kaiser
>> <[email protected]> wrote:
>> > On Tue, Oct 30, 2012 at 9:37 PM, Torsten Kaiser
>> > <[email protected]> wrote:
>> >> I will keep LOCKDEP enabled on that system, and if there really is
>> >> another splat, I will report back here. But I rather doubt that this
>> >> will be needed.
>> >
>> > After the patch, I did not see this problem again, but today I found
>> > another LOCKDEP report that also looks XFS related.
>> > I found it twice in the logs, and as both were slightly different, I
>> > will attach both versions.
>>
>> > Nov 6 21:57:09 thoregon kernel: [ 9941.104353] 3.7.0-rc4 #1 Not tainted
>> > Nov 6 21:57:09 thoregon kernel: [ 9941.104355] inconsistent
>> > {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
>> > Nov 6 21:57:09 thoregon kernel: [ 9941.104430] CPU0
>> > Nov 6 21:57:09 thoregon kernel: [ 9941.104431] ----
>> > Nov 6 21:57:09 thoregon kernel: [ 9941.104432] lock(&(&ip->i_lock)->mr_lock);
>> > Nov 6 21:57:09 thoregon kernel: [ 9941.104433] <Interrupt>
>> > Nov 6 21:57:09 thoregon kernel: [ 9941.104434]
>> > lock(&(&ip->i_lock)->mr_lock);
>> > Nov 6 21:57:09 thoregon kernel: [ 9941.104435]
>> > Nov 6 21:57:09 thoregon kernel: [ 9941.104435] *** DEADLOCK ***
>>
>> Sorry! Copied the wrong report. Your fix only landed in -rc5, so my
>> vanilla -rc4 did (also) report the old problem again.
>> And I copy&pasted that report instead of the second appearance of the
>> new problem.
>
> Can you repost it with line wrapping turned off? The output simply
> becomes unreadable when it wraps....
>
> Yeah, I know I can put it back together, but I've got better things
> to do with my time than stitch a couple of hundred lines of debug
> back into a readable format....

Sorry about that, but I can't find any option to turn that off in Gmail.

I have added the reports as an attachment, I hope that's OK for you.

Thanks for looking into this.

Torsten


Attachments:
lockdep-reports.txt (36.00 kB)

2012-11-19 23:53:14

by Dave Chinner

[permalink] [raw]
Subject: Re: Hang in XFS reclaim on 3.7.0-rc3

On Mon, Nov 19, 2012 at 07:50:06AM +0100, Torsten Kaiser wrote:
> On Mon, Nov 19, 2012 at 12:51 AM, Dave Chinner <[email protected]> wrote:
> > On Sun, Nov 18, 2012 at 04:29:22PM +0100, Torsten Kaiser wrote:
> >> On Sun, Nov 18, 2012 at 11:24 AM, Torsten Kaiser
> >> <[email protected]> wrote:
> >> > On Tue, Oct 30, 2012 at 9:37 PM, Torsten Kaiser
> >> > <[email protected]> wrote:
> >> >> I will keep LOCKDEP enabled on that system, and if there really is
> >> >> another splat, I will report back here. But I rather doubt that this
> >> >> will be needed.
> >> >
> >> > After the patch, I did not see this problem again, but today I found
> >> > another LOCKDEP report that also looks XFS related.
> >> > I found it twice in the logs, and as both were slightly different, I
> >> > will attach both versions.
> >>
> >> > Nov 6 21:57:09 thoregon kernel: [ 9941.104353] 3.7.0-rc4 #1 Not tainted
> >> > Nov 6 21:57:09 thoregon kernel: [ 9941.104355] inconsistent
> >> > {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
> >> > Nov 6 21:57:09 thoregon kernel: [ 9941.104430] CPU0
> >> > Nov 6 21:57:09 thoregon kernel: [ 9941.104431] ----
> >> > Nov 6 21:57:09 thoregon kernel: [ 9941.104432] lock(&(&ip->i_lock)->mr_lock);
> >> > Nov 6 21:57:09 thoregon kernel: [ 9941.104433] <Interrupt>
> >> > Nov 6 21:57:09 thoregon kernel: [ 9941.104434]
> >> > lock(&(&ip->i_lock)->mr_lock);
> >> > Nov 6 21:57:09 thoregon kernel: [ 9941.104435]
> >> > Nov 6 21:57:09 thoregon kernel: [ 9941.104435] *** DEADLOCK ***
> >>
> >> Sorry! Copied the wrong report. Your fix only landed in -rc5, so my
> >> vanilla -rc4 did (also) report the old problem again.
> >> And I copy&pasted that report instead of the second appearance of the
> >> new problem.
> >
> > Can you repost it with line wrapping turned off? The output simply
> > becomes unreadable when it wraps....
> >
> > Yeah, I know I can put it back together, but I've got better things
> > to do with my time than stitch a couple of hundred lines of debug
> > back into a readable format....
>
> Sorry about that, but I can't find any option to turn that off in Gmail.

Seems like you can't, as per Documentation/email-clients.txt

> I have added the reports as an attachment, I hope that's OK for you.

Encoded as text, so that works.

So, both lockdep thingies are the same:

> [110926.972482] =========================================================
> [110926.972484] [ INFO: possible irq lock inversion dependency detected ]
> [110926.972486] 3.7.0-rc4 #1 Not tainted
> [110926.972487] ---------------------------------------------------------
> [110926.972489] kswapd0/725 just changed the state of lock:
> [110926.972490] (sb_internal){.+.+.?}, at: [<ffffffff8122b268>] xfs_trans_alloc+0x28/0x50
> [110926.972499] but this lock took another, RECLAIM_FS-unsafe lock in the past:
> [110926.972500] (&(&ip->i_lock)->mr_lock/1){+.+.+.}

Ah, what? Since when has the ilock been reclaim unsafe?

> [110926.972500] and interrupts could create inverse lock ordering between them.
> [110926.972500]
> [110926.972503]
> [110926.972503] other info that might help us debug this:
> [110926.972504] Possible interrupt unsafe locking scenario:
> [110926.972504]
> [110926.972505]        CPU0                    CPU1
> [110926.972506]        ----                    ----
> [110926.972507]   lock(&(&ip->i_lock)->mr_lock/1);
> [110926.972509]                                local_irq_disable();
> [110926.972509]                                lock(sb_internal);
> [110926.972511]                                lock(&(&ip->i_lock)->mr_lock/1);
> [110926.972512]   <Interrupt>
> [110926.972513]     lock(sb_internal);

Um, that's just bizarre. No XFS code runs with interrupts disabled,
so I cannot see how this is possible.

.....


[<ffffffff8108137e>] mark_held_locks+0x7e/0x130
[<ffffffff81081a63>] lockdep_trace_alloc+0x63/0xc0
[<ffffffff810e9dd5>] kmem_cache_alloc+0x35/0xe0
[<ffffffff810dba31>] vm_map_ram+0x271/0x770
[<ffffffff811e1316>] _xfs_buf_map_pages+0x46/0xe0
[<ffffffff811e222a>] xfs_buf_get_map+0x8a/0x130
[<ffffffff81233ab9>] xfs_trans_get_buf_map+0xa9/0xd0
[<ffffffff8121bced>] xfs_ialloc_inode_init+0xcd/0x1d0

We shouldn't be mapping buffers there; there's a patch below to fix
this. It's probably the source of this report, even though I can't be
certain: lockdep seems to be off with the fairies...
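
To illustrate, paraphrasing the 3.7 buffer code from memory, so treat
this as a sketch rather than the exact source: _xfs_buf_map_pages()
only calls vm_map_ram() when the buffer needs a virtual mapping, so
XBF_UNMAPPED avoids the allocation entirely:

/* abridged sketch of the relevant branch in _xfs_buf_map_pages();
 * the real function has more cases and a retry loop */
if (flags & XBF_UNMAPPED) {
	bp->b_addr = NULL;	/* callers work on bp->b_pages directly */
} else {
	/* vm_map_ram() allocates internally - this is the allocation
	 * that lockdep flags as RECLAIM_FS-ON-* in the trace above */
	bp->b_addr = vm_map_ram(bp->b_pages, bp->b_page_count,
				-1, PAGE_KERNEL);
	if (!bp->b_addr)
		return -ENOMEM;
	bp->b_addr += bp->b_offset;
}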

Cheers,

Dave.
--
Dave Chinner
[email protected]

xfs: inode allocation should use unmapped buffers.

From: Dave Chinner <[email protected]>

Inode buffers do not need to be mapped as inodes are read or written
directly from/to the pages underlying the buffer. This fixes a
regression introduced by commit 611c994 ("xfs: make XBF_MAPPED the
default behaviour").

Signed-off-by: Dave Chinner <[email protected]>
---
fs/xfs/xfs_ialloc.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_ialloc.c b/fs/xfs/xfs_ialloc.c
index 2d6495e..a815412 100644
--- a/fs/xfs/xfs_ialloc.c
+++ b/fs/xfs/xfs_ialloc.c
@@ -200,7 +200,8 @@ xfs_ialloc_inode_init(
 		 */
 		d = XFS_AGB_TO_DADDR(mp, agno, agbno + (j * blks_per_cluster));
 		fbuf = xfs_trans_get_buf(tp, mp->m_ddev_targp, d,
-					 mp->m_bsize * blks_per_cluster, 0);
+					 mp->m_bsize * blks_per_cluster,
+					 XBF_UNMAPPED);
 		if (!fbuf)
 			return ENOMEM;
 		/*

2012-11-20 07:09:09

by Torsten Kaiser

[permalink] [raw]
Subject: Re: Hang in XFS reclaim on 3.7.0-rc3

On Tue, Nov 20, 2012 at 12:53 AM, Dave Chinner <[email protected]> wrote:
> On Mon, Nov 19, 2012 at 07:50:06AM +0100, Torsten Kaiser wrote:
> So, both lockdep thingies are the same:

I suspected this, but as the reports were slightly different I
attached both of them, as I couldn't decide which one was the
better/simpler report to debug this.

>> [110926.972482] =========================================================
>> [110926.972484] [ INFO: possible irq lock inversion dependency detected ]
>> [110926.972486] 3.7.0-rc4 #1 Not tainted
>> [110926.972487] ---------------------------------------------------------
>> [110926.972489] kswapd0/725 just changed the state of lock:
>> [110926.972490] (sb_internal){.+.+.?}, at: [<ffffffff8122b268>] xfs_trans_alloc+0x28/0x50
>> [110926.972499] but this lock took another, RECLAIM_FS-unsafe lock in the past:
>> [110926.972500] (&(&ip->i_lock)->mr_lock/1){+.+.+.}
>
> Ah, what? Since when has the ilock been reclaim unsafe?
>
>> [110926.972500] and interrupts could create inverse lock ordering between them.
>> [110926.972500]
>> [110926.972503]
>> [110926.972503] other info that might help us debug this:
>> [110926.972504] Possible interrupt unsafe locking scenario:
>> [110926.972504]
>> [110926.972505]        CPU0                    CPU1
>> [110926.972506]        ----                    ----
>> [110926.972507]   lock(&(&ip->i_lock)->mr_lock/1);
>> [110926.972509]                                local_irq_disable();
>> [110926.972509]                                lock(sb_internal);
>> [110926.972511]                                lock(&(&ip->i_lock)->mr_lock/1);
>> [110926.972512]   <Interrupt>
>> [110926.972513]     lock(sb_internal);
>
> Um, that's just bizarre. No XFS code runs with interrupts disabled,
> so I cannot see how this is possible.
>
> .....
>
>
> [<ffffffff8108137e>] mark_held_locks+0x7e/0x130
> [<ffffffff81081a63>] lockdep_trace_alloc+0x63/0xc0
> [<ffffffff810e9dd5>] kmem_cache_alloc+0x35/0xe0
> [<ffffffff810dba31>] vm_map_ram+0x271/0x770
> [<ffffffff811e1316>] _xfs_buf_map_pages+0x46/0xe0
> [<ffffffff811e222a>] xfs_buf_get_map+0x8a/0x130
> [<ffffffff81233ab9>] xfs_trans_get_buf_map+0xa9/0xd0
> [<ffffffff8121bced>] xfs_ialloc_inode_init+0xcd/0x1d0
>
> We shouldn't be mapping buffers there; there's a patch below to fix
> this. It's probably the source of this report, even though I can't be
> certain: lockdep seems to be off with the fairies...

I also tried to understand what lockdep was saying, but
Documentation/lockdep-design.txt is not too helpful.
I think 'CLASS'-ON-R / -ON-W means that this lock was 'ON', i.e. held,
while 'CLASS' (HARDIRQ, SOFTIRQ, RECLAIM_FS) happened, and that makes
this lock unsafe for these contexts. IN-'CLASS'-R / -W seems to mean
'lock taken in context CLASS'.
A note in there that 'CLASS'-ON-? means 'CLASS'-unsafe would be helpful to me...
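
To check my understanding, this is how I would picture the two states
with a made-up lock (my_rwsem is purely illustrative, not a lock from
the reports):

static DECLARE_RWSEM(my_rwsem);		/* made-up lock, for illustration */

static void makes_lock_reclaim_unsafe(void)
{
	void *p;

	down_write(&my_rwsem);
	p = kmalloc(64, GFP_KERNEL);	/* may enter reclaim while the lock
					 * is held -> lockdep records
					 * RECLAIM_FS-ON-W, i.e. the lock is
					 * reclaim-unsafe */
	kfree(p);
	up_write(&my_rwsem);
}

static void takes_lock_inside_reclaim(void)
{
	/* if this runs from a shrinker, i.e. in reclaim context ... */
	down_write(&my_rwsem);		/* -> lockdep records IN-RECLAIM_FS-W;
					 * the combination of both states is
					 * what it complains about */
	up_write(&my_rwsem);
}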

Wrt. the above interrupt output: I think lockdep doesn't really know about
RECLAIM_FS and treats it as another interrupt. I think that output
should have been something like this:
       CPU0                    CPU1
       ----                    ----
  lock(&(&ip->i_lock)->mr_lock/1);
                               <Allocation enters reclaim>
                               lock(sb_internal);
                               lock(&(&ip->i_lock)->mr_lock/1);
  <Allocation enters reclaim>
  lock(sb_internal);

Entering reclaim on CPU1 would mean that CPU1 would not enter reclaim
again, so the reclaim-'interrupt' would be disabled.
And instead of an interrupt disrupting the normal code flow on CPU0,
CPU0 would be 'interrupted' when, instead of completing a normal
allocation, it has to 'interrupt' the allocation to reclaim memory first.
print_irq_lock_scenario() would need to be taught to print a slightly
different message for reclaim-'interrupts'.
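
If I read mm/vmscan.c correctly, that is also more or less how lockdep
models it internally: the reclaim paths flag themselves as the
'interrupt' context, roughly like this (abridged sketch from my reading
of the code, not a verbatim quote):

/* kswapd marks itself as running in the reclaim-'interrupt', so every
 * lock taken below (sb_internal, mr_lock, ...) gets IN-RECLAIM_FS-*;
 * direct reclaim does the same around try_to_free_pages() */
static int kswapd(void *p)
{
	/* ... setup elided ... */
	lockdep_set_current_reclaim_state(GFP_KERNEL);

	/* balance_pgdat() -> shrink_slab() -> prune_super() etc. all
	 * run with the reclaim state set */

	lockdep_clear_current_reclaim_state();
	return 0;
}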

I will try your patch, but as I do not have a reliable reproducer to
create this lockdep report, I can't really verify if this fixes it.
But I will definitely mail you if it happens again with this patch.

Thanks, Torsten


2012-11-20 19:45:08

by Torsten Kaiser

[permalink] [raw]
Subject: Re: Hang in XFS reclaim on 3.7.0-rc3

On Tue, Nov 20, 2012 at 12:53 AM, Dave Chinner <[email protected]> wrote:
> [<ffffffff8108137e>] mark_held_locks+0x7e/0x130
> [<ffffffff81081a63>] lockdep_trace_alloc+0x63/0xc0
> [<ffffffff810e9dd5>] kmem_cache_alloc+0x35/0xe0
> [<ffffffff810dba31>] vm_map_ram+0x271/0x770
> [<ffffffff811e1316>] _xfs_buf_map_pages+0x46/0xe0
> [<ffffffff811e222a>] xfs_buf_get_map+0x8a/0x130
> [<ffffffff81233ab9>] xfs_trans_get_buf_map+0xa9/0xd0
> [<ffffffff8121bced>] xfs_ialloc_inode_init+0xcd/0x1d0
>
> We shouldn't be mapping buffers there, there's a patch below to fix
> this. It's probably the source of this report, even though I cannot
> say for certain; lockdep seems to be off with the fairies...

That patch seems to break my system.
After it started to swap, because I was compiling seamonkey (firefox
turned into the full navigator suite) on a tmpfs, several processes
got stuck and triggered the hung-task check.
Since kswapd, xfsaild/md4 and flush-9:4 also got stuck, not even a
shutdown worked.

The attached log first contains the hung-task-notices, then the output
from SysRq+W.

After the shutdown got stuck trying to turn off swap, I first tried
SysRq+S (sync), but did not get a 'Done', and on SysRq+U (remount
read-only) lockdep complained about a lock imbalance wrt. sb_writer.
SysRq+O (power off) also no longer worked, only SysRq+B (reboot).

I don't know which one got stuck first, but I'm somewhat suspicious of
the plasma-desktop and sshd processes that SysRq+W reported as stuck
in XFS reclaim, even though those processes never triggered the hung
task check.

Torsten

> Cheers,
>
> Dave.
> --
> Dave Chinner
> [email protected]
>
> xfs: inode allocation should use unmapped buffers.
>
> From: Dave Chinner <[email protected]>
>
> Inode buffers do not need to be mapped as inodes are read or written
> directly from/to the pages underlying the buffer. This fixes a
> regression introduced by commit 611c994 ("xfs: make XBF_MAPPED the
> default behaviour").
>
> Signed-off-by: Dave Chinner <[email protected]>
> ---
> fs/xfs/xfs_ialloc.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/xfs/xfs_ialloc.c b/fs/xfs/xfs_ialloc.c
> index 2d6495e..a815412 100644
> --- a/fs/xfs/xfs_ialloc.c
> +++ b/fs/xfs/xfs_ialloc.c
> @@ -200,7 +200,8 @@ xfs_ialloc_inode_init(
> */
> d = XFS_AGB_TO_DADDR(mp, agno, agbno + (j * blks_per_cluster));
> fbuf = xfs_trans_get_buf(tp, mp->m_ddev_targp, d,
> - mp->m_bsize * blks_per_cluster, 0);
> + mp->m_bsize * blks_per_cluster,
> + XBF_UNMAPPED);
> if (!fbuf)
> return ENOMEM;
> /*


Attachments:
xfs-reclaim-hang-messages.txt (77.93 kB)

2012-11-20 20:27:49

by Dave Chinner

[permalink] [raw]
Subject: Re: Hang in XFS reclaim on 3.7.0-rc3

On Tue, Nov 20, 2012 at 08:45:03PM +0100, Torsten Kaiser wrote:
> On Tue, Nov 20, 2012 at 12:53 AM, Dave Chinner <[email protected]> wrote:
> > [<ffffffff8108137e>] mark_held_locks+0x7e/0x130
> > [<ffffffff81081a63>] lockdep_trace_alloc+0x63/0xc0
> > [<ffffffff810e9dd5>] kmem_cache_alloc+0x35/0xe0
> > [<ffffffff810dba31>] vm_map_ram+0x271/0x770
> > [<ffffffff811e1316>] _xfs_buf_map_pages+0x46/0xe0
> > [<ffffffff811e222a>] xfs_buf_get_map+0x8a/0x130
> > [<ffffffff81233ab9>] xfs_trans_get_buf_map+0xa9/0xd0
> > [<ffffffff8121bced>] xfs_ialloc_inode_init+0xcd/0x1d0
> >
> > We shouldn't be mapping buffers there, there's a patch below to fix
> > this. It's probably the source of this report, even though I cannot
> > say for certain; lockdep seems to be off with the fairies...
>
> That patch seems to break my system.

You've got an IO problem, not an XFS problem. Everything is hung up
on MD.

INFO: task kswapd0:725 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kswapd0 D 0000000000000001 0 725 2 0x00000000
ffff8803280d13f8 0000000000000046 ffff880329a0ab80 ffff8803280d1fd8
ffff8803280d1fd8 ffff8803280d1fd8 ffff880046b7c880 ffff880329a0ab80
ffff8803280d1408 ffff8803278dbbd0 ffff8803278db800 00000000ffffffff
Call Trace:
[<ffffffff816b1224>] schedule+0x24/0x60
[<ffffffff814f9dad>] md_super_wait+0x4d/0x80
[<ffffffff81500753>] bitmap_unplug+0x173/0x180
[<ffffffff814e8eb8>] raid1_unplug+0x98/0x110
[<ffffffff81278a6d>] blk_flush_plug_list+0xad/0x240
[<ffffffff816b15c3>] io_schedule_timeout+0x83/0xf0
[<ffffffff810b0e1d>] mempool_alloc+0x12d/0x160
[<ffffffff811263da>] bvec_alloc_bs+0xda/0x100
[<ffffffff811264ea>] bio_alloc_bioset+0xea/0x110
[<ffffffff81126656>] bio_clone_bioset+0x16/0x40
[<ffffffff814f471a>] bio_clone_mddev+0x1a/0x30
[<ffffffff814edbb1>] make_request+0x551/0xde0
[<ffffffff814f80bb>] md_make_request+0x21b/0x4d0
[<ffffffff81276e52>] generic_make_request+0xc2/0x100
[<ffffffff81276ef5>] submit_bio+0x65/0x110
[<ffffffff811e07bf>] xfs_submit_ioend_bio.isra.21+0x2f/0x40
[<ffffffff811e088e>] xfs_submit_ioend+0xbe/0x110
[<ffffffff811e0c91>] xfs_vm_writepage+0x3b1/0x540
[<ffffffff810bcd84>] shrink_page_list+0x564/0x890
[<ffffffff810bd637>] shrink_inactive_list+0x1d7/0x310
[<ffffffff810bdb9d>] shrink_lruvec+0x42d/0x530
[<ffffffff810be323>] kswapd+0x683/0xa20
[<ffffffff8105c246>] kthread+0xd6/0xe0
[<ffffffff816b31ac>] ret_from_fork+0x7c/0xb0
no locks held by kswapd0/725.

So kswapd is trying to clean pages, but it's blocked in an unplug
during IO submission. Probably one to report to the linux-raid list.
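
For reference, the plugging pattern involved looks roughly like the
sketch below. It uses the real block-plug API, but the surrounding
logic is simplified; it is not the actual XFS or MD code:

  #include <linux/blkdev.h>
  #include <linux/bio.h>
  #include <linux/fs.h>

  static void writeback_under_plug(struct bio *bio)
  {
          struct blk_plug plug;

          blk_start_plug(&plug);
          submit_bio(WRITE, bio); /* queued on the plug, not issued yet */
          /*
           * If the task now sleeps for IO (here: an empty bio mempool),
           * io_schedule() flushes the plug list, which in this trace
           * runs raid1_unplug() -> bitmap_unplug() -> md_super_wait().
           */
          blk_finish_plug(&plug);
  }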

INFO: task xfsaild/md4:1742 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
xfsaild/md4 D 0000000000000003 0 1742 2 0x00000000
ffff88032438bb68 0000000000000046 ffff880329965700 ffff88032438bfd8
ffff88032438bfd8 ffff88032438bfd8 ffff88032827e580 ffff880329965700
ffff88032438bb78 ffff8803278dbbd0 ffff8803278db800 00000000ffffffff
Call Trace:
[<ffffffff816b1224>] schedule+0x24/0x60
[<ffffffff814f9dad>] md_super_wait+0x4d/0x80
[<ffffffff8105ca30>] ? __init_waitqueue_head+0x60/0x60
[<ffffffff81500753>] bitmap_unplug+0x173/0x180
[<ffffffff81278c13>] ? blk_finish_plug+0x13/0x50
[<ffffffff814e8eb8>] raid1_unplug+0x98/0x110
[<ffffffff81278a6d>] blk_flush_plug_list+0xad/0x240
[<ffffffff81278c13>] blk_finish_plug+0x13/0x50
[<ffffffff811e296a>] __xfs_buf_delwri_submit+0x1ca/0x1e0
[<ffffffff811e2ffb>] xfs_buf_delwri_submit_nowait+0x1b/0x20
[<ffffffff81233066>] xfsaild+0x226/0x4c0
[<ffffffff81065dfa>] ? finish_task_switch+0x3a/0x100
[<ffffffff81232e40>] ? xfs_trans_ail_cursor_first+0xa0/0xa0
[<ffffffff8105c246>] kthread+0xd6/0xe0
[<ffffffff816b246b>] ? _raw_spin_unlock_irq+0x2b/0x50
[<ffffffff8105c170>] ? flush_kthread_worker+0xe0/0xe0
[<ffffffff816b31ac>] ret_from_fork+0x7c/0xb0
[<ffffffff8105c170>] ? flush_kthread_worker+0xe0/0xe0
no locks held by xfsaild/md4/1742.

Same here - metadata writes are backed up waiting for MD to submit
IO. Everything else is stuck on these, or on MD, too...

Cheers,

Dave.
--
Dave Chinner
[email protected]