Message-ID: <487C07A4.70202@sgi.com>
Date: Tue, 15 Jul 2008 12:12:52 +1000
From: Lachlan McIlroy
Reply-To: lachlan@sgi.com
To: Lachlan McIlroy, Mikael Abrahamsson, linux-kernel@vger.kernel.org, xfs@oss.sgi.com, linux-mm@kvack.org
Subject: Re: xfs bug in 2.6.26-rc9
References: <20080711084248.GU29319@disturbed> <487B019B.9090401@sgi.com> <20080714121332.GX29319@disturbed>
In-Reply-To: <20080714121332.GX29319@disturbed>

Dave Chinner wrote:
> On Mon, Jul 14, 2008 at 05:34:51PM +1000, Lachlan McIlroy wrote:
>> Mikael Abrahamsson wrote:
>>> On Fri, 11 Jul 2008, Dave Chinner wrote:
>>>
>>>> That aside, what was the assert failure reported prior to the oops?
>>>> i.e. paste the lines in the log before the ---[ cut here ]--- line?
>>>> One of them will start with 'Assertion failed:', I think....
>>> These ones?
>>>
>>> Jul 8 04:44:56 via kernel: [554197.888008] Assertion failed: whichfork == XFS_ATTR_FORK || ip->i_delayed_blks == 0, file: fs/xfs/xfs_bmap.c, line: 5879
>>> Jul 9 03:25:21 via kernel: [42940.748007] Assertion failed: whichfork == XFS_ATTR_FORK || ip->i_delayed_blks == 0, file: fs/xfs/xfs_bmap.c, line: 5879
>>
>>     xfs_ilock(ip, XFS_IOLOCK_SHARED);
>>
>>     if (whichfork == XFS_DATA_FORK &&
>>         (ip->i_delayed_blks || ip->i_size > ip->i_d.di_size)) {
>>             /* xfs_fsize_t last_byte = xfs_file_last_byte(ip); */
>>             error = xfs_flush_pages(ip, (xfs_off_t)0,
>>                                     -1, 0, FI_REMAPF);
>>             if (error) {
>>                     xfs_iunlock(ip, XFS_IOLOCK_SHARED);
>>                     return error;
>>             }
>>     }
>>
>>     ASSERT(whichfork == XFS_ATTR_FORK || ip->i_delayed_blks == 0);
>>
>> This is a race between xfs_fsr and a mmap write. xfs_fsr acquires the
>> iolock and then flushes the file, and because it holds the iolock it
>> doesn't expect any new delayed allocations to occur. A mmap write can
>> create delayed allocations without acquiring the iolock, so it is able
>> to get in after the flush but before the ASSERT.
>
> Christoph and I were contemplating this problem with ->page_mkwrite
> recently. The problem is that we can't, right now, return an
> EAGAIN-like error to ->page_mkwrite() and have it retry the
> page fault. Other parts of the page faulting code can do this,
> so it seems like a solvable problem.
>
> The basic concept is that if we can return an EAGAIN result we can
> try-lock the inode and hold the locks necessary to avoid this race
> or prevent the page fault from dirtying the page until the
> filesystem is unfrozen.

Why do we need to try-lock the inode? Will we have an ABBA deadlock
if we block on the iolock in ->page_mkwrite()?
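
For illustration only, here is a user-space sketch of the try-lock-and-retry idea described above. It is not kernel code and not a proposed patch: a pthread mutex stands in for the iolock, and fake_inode, fake_page_mkwrite() and fake_fault() are hypothetical names made up for this example. It only shows the shape of "try-lock, return -EAGAIN if busy, let the caller retry"; it says nothing about whether blocking on the real iolock would deadlock.

```c
/*
 * User-space sketch only. A pthread mutex stands in for the XFS iolock;
 * all names here are hypothetical and do not exist in the kernel.
 */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

struct fake_inode {
	pthread_mutex_t iolock;	/* stands in for the iolock */
	int delayed_blks;	/* stands in for ip->i_delayed_blks */
};

/*
 * Models a ->page_mkwrite() that never blocks on the iolock.
 * Returns 0 on success, or -EAGAIN if the lock is currently held.
 */
static int fake_page_mkwrite(struct fake_inode *ip)
{
	if (pthread_mutex_trylock(&ip->iolock) != 0)
		return -EAGAIN;	/* lock busy: ask the caller to retry */

	ip->delayed_blks++;	/* "delayed allocation", done under the lock */
	pthread_mutex_unlock(&ip->iolock);
	return 0;
}

/*
 * Models a fault path that retries on -EAGAIN.
 * Returns the number of attempts used, or -EAGAIN if every attempt
 * found the lock busy.
 */
static int fake_fault(struct fake_inode *ip, int max_tries)
{
	assert(max_tries > 0);

	for (int attempt = 1; attempt <= max_tries; attempt++) {
		int error = fake_page_mkwrite(ip);

		if (error == 0)
			return attempt;
		/* A real fault path would drop its own locks before retrying. */
	}
	return -EAGAIN;
}
```

With the mutex held (as xfs_fsr would hold the iolock across the flush and the ASSERT), fake_page_mkwrite() returns -EAGAIN and delayed_blks is left untouched; once the mutex is released, fake_fault() succeeds on its first attempt.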