Christoph Hellwig wrote:
> On Fri, Feb 20, 2009 at 08:52:59PM +0300, Alexander Beregalov wrote:
>> Hi
>>
>> [ INFO: inconsistent lock state ]
>> 2.6.29-rc5-next-20090220 #2
>> ---------------------------------
>> inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-R} usage.
>> kswapd0/324 [HC0[0]:SC0[0]:HE1:SE1] takes:
>> (&(&ip->i_lock)->mr_lock){+++++?}, at: [<ffffffff803ca60a>]
>> xfs_ilock+0xaa/0x120
>> {RECLAIM_FS-ON-W} state was registered at:
>
> That's a false positive. While the ilock can be taken in reclaim the
> allocation here is done before the inode is added to the inode cache.
>
> The patch below should help avoiding the warning:
Seems ok to me. I hate to see the BUG() added but I guess in this case
something truly bizarre would have to happen for the ilock to fail on
this inode.
on irc you sugggested ASSERT(0); instead of BUG(); I might prefer that
but either way:
Reviewed-by: Eric Sandeen <[email protected]>
>
> Index: xfs/fs/xfs/xfs_iget.c
> ===================================================================
> --- xfs.orig/fs/xfs/xfs_iget.c 2009-02-24 20:56:00.716027739 +0100
> +++ xfs/fs/xfs/xfs_iget.c 2009-02-24 20:56:46.089031360 +0100
> @@ -246,9 +246,6 @@ xfs_iget_cache_miss(
> goto out_destroy;
> }
>
> - if (lock_flags)
> - xfs_ilock(ip, lock_flags);
> -
> /*
> * Preload the radix tree so we can insert safely under the
> * write spinlock. Note that we cannot sleep inside the preload
> @@ -259,6 +256,15 @@ xfs_iget_cache_miss(
> goto out_unlock;
> }
>
> + /*
> + * Because the inode hasn't been added to the radix-tree yet it can't
> + * be found by another thread, so we can do the non-sleeping lock here.
> + */
> + if (lock_flags) {
> + if (!xfs_ilock_nowait(ip, lock_flags))
> + BUG();
> + }
> +
> mask = ~(((XFS_INODE_CLUSTER_SIZE(mp) >> mp->m_sb.sb_inodelog)) - 1);
> first_index = agino & mask;
> write_lock(&pag->pag_ici_lock);
>
> _______________________________________________
> xfs mailing list
> [email protected]
> http://oss.sgi.com/mailman/listinfo/xfs
>
On Mar 3, 2009, at 10:00 AM, Eric Sandeen wrote:
> Christoph Hellwig wrote:
>> On Fri, Feb 20, 2009 at 08:52:59PM +0300, Alexander Beregalov wrote:
>>> Hi
>>>
>>> [ INFO: inconsistent lock state ]
>>> 2.6.29-rc5-next-20090220 #2
>>> ---------------------------------
>>> inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-R} usage.
>>> kswapd0/324 [HC0[0]:SC0[0]:HE1:SE1] takes:
>>> (&(&ip->i_lock)->mr_lock){+++++?}, at: [<ffffffff803ca60a>]
>>> xfs_ilock+0xaa/0x120
>>> {RECLAIM_FS-ON-W} state was registered at:
>>
>> That's a false positive. While the ilock can be taken in reclaim the
>> allocation here is done before the inode is added to the inode cache.
>>
>> The patch below should help avoiding the warning:
>
> Seems ok to me. I hate to see the BUG() added but I guess in this
> case
> something truly bizarre would have to happen for the ilock to fail on
> this inode.
>
> on irc you sugggested ASSERT(0); instead of BUG();
That would mean that instead of bombing out here, we do it
in xfs debug kernels only, which is a good thing. However,
do we just silently ignore it in non debug kernels, and
later try to unlock without locking first?
Maybe the following be better:
if (lock_flags) {
if (!xfs_ilock_nowait(ip, lock_flags)) {
ASSERT(0);
error = EAGAIN;
goto out_destroy;
}
}
Or just keep the BUG(); , as it shouldn't happen (we hope).
Reviewed-by: Felix Blyakher <[email protected]>
> I might prefer that
> but either way:
>
> Reviewed-by: Eric Sandeen <[email protected]>
>
>>
>> Index: xfs/fs/xfs/xfs_iget.c
>> ===================================================================
>> --- xfs.orig/fs/xfs/xfs_iget.c 2009-02-24 20:56:00.716027739 +0100
>> +++ xfs/fs/xfs/xfs_iget.c 2009-02-24 20:56:46.089031360 +0100
>> @@ -246,9 +246,6 @@ xfs_iget_cache_miss(
>> goto out_destroy;
>> }
>>
>> - if (lock_flags)
>> - xfs_ilock(ip, lock_flags);
>> -
>> /*
>> * Preload the radix tree so we can insert safely under the
>> * write spinlock. Note that we cannot sleep inside the preload
>> @@ -259,6 +256,15 @@ xfs_iget_cache_miss(
>> goto out_unlock;
>> }
>>
>> + /*
>> + * Because the inode hasn't been added to the radix-tree yet it
>> can't
>> + * be found by another thread, so we can do the non-sleeping lock
>> here.
>> + */
>> + if (lock_flags) {
>> + if (!xfs_ilock_nowait(ip, lock_flags))
>> + BUG();
>> + }
>> +
>> mask = ~(((XFS_INODE_CLUSTER_SIZE(mp) >> mp->m_sb.sb_inodelog)) -
>> 1);
>> first_index = agino & mask;
>> write_lock(&pag->pag_ici_lock);
>>
>> _______________________________________________
>> xfs mailing list
>> [email protected]
>> http://oss.sgi.com/mailman/listinfo/xfs
>>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-
> kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
On Tue, Mar 03, 2009 at 10:57:07AM -0600, Felix Blyakher wrote:
> if (lock_flags) {
> if (!xfs_ilock_nowait(ip, lock_flags)) {
> ASSERT(0);
> error = EAGAIN;
> goto out_destroy;
> }
> }
>
> Or just keep the BUG(); , as it shouldn't happen (we hope).
Ok, let's keep the BUG for now and I'll throw in your error undwinding
fix. Will resend the series for 2.6.29 patches after QAing them.
Hi
Is it the same issue?
[ INFO: inconsistent lock state ]
2.6.29-rc7-next-20090305 #8
---------------------------------
inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-R} usage.
kswapd0/318 [HC0[0]:SC0[0]:HE1:SE1] takes:
(&(&ip->i_lock)->mr_lock){+++++?}, at: [<ffffffff803cbc4a>]
xfs_ilock+0xaa/0x120
{RECLAIM_FS-ON-W} state was registered at:
[<ffffffff8026db49>] mark_held_locks+0x69/0x90
[<ffffffff8026dbb1>] lockdep_trace_alloc+0x41/0xb0
[<ffffffff802c6cbd>] kmem_cache_alloc+0x2d/0x100
[<ffffffff803f37e7>] kmem_zone_alloc+0x97/0xe0
[<ffffffff803f3849>] kmem_zone_zalloc+0x19/0x50
[<ffffffff803b4955>] xfs_da_state_alloc+0x15/0x20
[<ffffffff803c0387>] xfs_dir2_node_lookup+0x17/0x110
[<ffffffff803b9688>] xfs_dir_lookup+0x1c8/0x1e0
[<ffffffff803f08af>] xfs_lookup+0x4f/0xe0
[<ffffffff803fcb99>] xfs_vn_lookup+0x49/0x90
[<ffffffff802d5556>] do_lookup+0x1b6/0x250
[<ffffffff802d5885>] __link_path_walk+0x295/0xec0
[<ffffffff802d66ee>] path_walk+0x6e/0xe0
[<ffffffff802d6866>] do_path_lookup+0xa6/0x1d0
[<ffffffff802d6a65>] path_lookup_open+0x65/0xd0
[<ffffffff802d796a>] do_filp_open+0xaa/0x8f0
[<ffffffff802c8a48>] do_sys_open+0x78/0x110
[<ffffffff802c8b0b>] sys_open+0x1b/0x20
[<ffffffff8020bbdb>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
irq event stamp: 531011
hardirqs last enabled at (531011): [<ffffffff80284194>]
__rcu_read_unlock+0xa4/0xc0
hardirqs last disabled at (531010): [<ffffffff80284149>]
__rcu_read_unlock+0x59/0xc0
softirqs last enabled at (524334): [<ffffffff80249719>]
__do_softirq+0x139/0x150
softirqs last disabled at (524329): [<ffffffff8020cd5c>] call_softirq+0x1c/0x30
other info that might help us debug this:
2 locks held by kswapd0/318:
#0: (shrinker_rwsem){++++..}, at: [<ffffffff802a4c12>] shrink_slab+0x32/0x1c0
#1: (iprune_mutex){+.+.-.}, at: [<ffffffff802dfbc4>]
shrink_icache_memory+0x84/0x2a0
stack backtrace:
Pid: 318, comm: kswapd0 Not tainted 2.6.29-rc7-next-20090305 #8
Call Trace:
[<ffffffff8026ca5d>] print_usage_bug+0x17d/0x190
[<ffffffff8026d9ad>] mark_lock+0x31d/0x450
[<ffffffff8026cbd0>] ? check_usage_forwards+0x0/0xc0
[<ffffffff8026ebcd>] __lock_acquire+0x40d/0x12a0
[<ffffffff8026faf1>] lock_acquire+0x91/0xc0
[<ffffffff803cbc4a>] ? xfs_ilock+0xaa/0x120
[<ffffffff8025f7d0>] down_read_nested+0x50/0x90
[<ffffffff803cbc4a>] ? xfs_ilock+0xaa/0x120
[<ffffffff803cbc4a>] xfs_ilock+0xaa/0x120
[<ffffffff803f09c4>] xfs_free_eofblocks+0x84/0x280
[<ffffffff8026ea8c>] ? __lock_acquire+0x2cc/0x12a0
[<ffffffff803f144e>] xfs_inactive+0xee/0x540
[<ffffffff803fea57>] xfs_fs_clear_inode+0x67/0x70
[<ffffffff802df7fa>] clear_inode+0x9a/0x120
[<ffffffff802dfa60>] dispose_list+0x30/0x110
[<ffffffff802dfd88>] shrink_icache_memory+0x248/0x2a0
[<ffffffff802a4d3c>] shrink_slab+0x15c/0x1c0
[<ffffffff802a6cba>] kswapd+0x56a/0x6b0
[<ffffffff802374b6>] ? finish_task_switch+0x46/0x110
[<ffffffff802a4120>] ? isolate_pages_global+0x0/0x270
[<ffffffff8025b450>] ? autoremove_wake_function+0x0/0x40
[<ffffffff802a6750>] ? kswapd+0x0/0x6b0
[<ffffffff8025af66>] kthread+0x56/0x90
[<ffffffff8020cc5a>] child_rip+0xa/0x20
[<ffffffff802374f9>] ? finish_task_switch+0x89/0x110
[<ffffffff8063da36>] ? _spin_unlock_irq+0x36/0x60
[<ffffffff8020c640>] ? restore_args+0x0/0x30
[<ffffffff8025af10>] ? kthread+0x0/0x90
[<ffffffff8020cc50>] ? child_rip+0x0/0x20