Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752496Ab2KTHJJ (ORCPT ); Tue, 20 Nov 2012 02:09:09 -0500 Received: from mail-ee0-f46.google.com ([74.125.83.46]:62702 "EHLO mail-ee0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752051Ab2KTHJE (ORCPT ); Tue, 20 Nov 2012 02:09:04 -0500 MIME-Version: 1.0 In-Reply-To: <20121119235306.GX14281@dastard> References: <20121029222613.GU29378@dastard> <20121118235105.GT14281@dastard> <20121119235306.GX14281@dastard> Date: Tue, 20 Nov 2012 08:09:02 +0100 Message-ID: Subject: Re: Hang in XFS reclaim on 3.7.0-rc3 From: Torsten Kaiser To: Dave Chinner Cc: xfs@oss.sgi.com, Linux Kernel Content-Type: text/plain; charset=ISO-8859-1 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5529 Lines: 127 On Tue, Nov 20, 2012 at 12:53 AM, Dave Chinner wrote: > On Mon, Nov 19, 2012 at 07:50:06AM +0100, Torsten Kaiser wrote: > So, both lockdep thingy's are the same: I suspected this, but as the reports where slightly different I attached bith of them, as I couldn't decide witch one was the better/simpler report to debug this. >> [110926.972482] ========================================================= >> [110926.972484] [ INFO: possible irq lock inversion dependency detected ] >> [110926.972486] 3.7.0-rc4 #1 Not tainted >> [110926.972487] --------------------------------------------------------- >> [110926.972489] kswapd0/725 just changed the state of lock: >> [110926.972490] (sb_internal){.+.+.?}, at: [] xfs_trans_alloc+0x28/0x50 >> [110926.972499] but this lock took another, RECLAIM_FS-unsafe lock in the past: >> [110926.972500] (&(&ip->i_lock)->mr_lock/1){+.+.+.} > > Ah, what? Since when has the ilock been reclaim unsafe? > >> [110926.972500] and interrupts could create inverse lock ordering between them. >> [110926.972500] >> [110926.972503] >> [110926.972503] other info that might help us debug this: >> [110926.972504] Possible interrupt unsafe locking scenario: >> [110926.972504] >> [110926.972505] CPU0 CPU1 >> [110926.972506] ---- ---- >> [110926.972507] lock(&(&ip->i_lock)->mr_lock/1); >> [110926.972509] local_irq_disable(); >> [110926.972509] lock(sb_internal); >> [110926.972511] lock(&(&ip->i_lock)->mr_lock/1); >> [110926.972512] >> [110926.972513] lock(sb_internal); > > Um, that's just bizzare. No XFS code runs with interrupts disabled, > so I cannot see how this possible. > > ..... > > > [] mark_held_locks+0x7e/0x130 > [] lockdep_trace_alloc+0x63/0xc0 > [] kmem_cache_alloc+0x35/0xe0 > [] vm_map_ram+0x271/0x770 > [] _xfs_buf_map_pages+0x46/0xe0 > [] xfs_buf_get_map+0x8a/0x130 > [] xfs_trans_get_buf_map+0xa9/0xd0 > [] xfs_ialloc_inode_init+0xcd/0x1d0 > > We shouldn't be mapping buffers there, there's a patch below to fix > this. It's probably the source of this report, even though I cannot > lockdep seems to be off with the fairies... I also tried to understand what lockdep was saying, but Documentation/lockdep-design.txt is not too helpful. I think 'CLASS'-ON-R / -ON-W means that this lock was 'ON' / held while 'CLASS' (HARDIRQ, SOFTIRQ, RECLAIM_FS) happend and that makes this lock unsafe for these contexts. IN-'CLASS'-R / -W seems to be 'lock taken in context 'CLASS'. A note that 'CLASS'-ON-? means 'CLASS'-unsafe in there would be helpful to me... Wrt. above interrupt output: I think lockdep doesn't really know about RECLAIM_FS and threats it as another interrupt. I think that output should have been something like this: CPU0 CPU1 ---- ---- lock(&(&ip->i_lock)->mr_lock/1); lock(sb_internal); lock(&(&ip->i_lock)->mr_lock/1); lock(sb_internal); Entering reclaim on CPU1 would mean that CPU1 would not enter reclaim again, so the reclaim-'interrupt' would be disabled. And instead of interrupts disrupting the normal codeflow on CPU0 it would be 'interrupted' be instead of doing a normal allocation, it would 'interrupt' the allocation to reclaim memory. print_irq_lock_scenario() would need to be taught to print a slightly different message for reclaim-'interrupts'. I will try your patch, but as I do not have a reliable reproducer to create this lockdep report, I can't really verify if this fixes it. But I will definitely mail you, if it happens again with this patch. Thanks, Torsten > Cheers, > > Dave. > -- > Dave Chinner > david@fromorbit.com > > xfs: inode allocation should use unmapped buffers. > > From: Dave Chinner > > Inode buffers do not need to be mapped as inodes are read or written > directly from/to the pages underlying the buffer. This fixes a > regression introduced by commit 611c994 ("xfs: make XBF_MAPPED the > default behaviour"). > > Signed-off-by: Dave Chinner > --- > fs/xfs/xfs_ialloc.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/fs/xfs/xfs_ialloc.c b/fs/xfs/xfs_ialloc.c > index 2d6495e..a815412 100644 > --- a/fs/xfs/xfs_ialloc.c > +++ b/fs/xfs/xfs_ialloc.c > @@ -200,7 +200,8 @@ xfs_ialloc_inode_init( > */ > d = XFS_AGB_TO_DADDR(mp, agno, agbno + (j * blks_per_cluster)); > fbuf = xfs_trans_get_buf(tp, mp->m_ddev_targp, d, > - mp->m_bsize * blks_per_cluster, 0); > + mp->m_bsize * blks_per_cluster, > + XBF_UNMAPPED); > if (!fbuf) > return ENOMEM; > /* -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/