Date: Mon, 28 Jun 2010 18:58:23 -0700
From: Joel Becker
To: Linus Torvalds
Cc: Dave Chinner, Linux Kernel, ocfs2-devel@oss.oracle.com, Tao Ma,
    Dave Chinner, Christoph Hellwig, Mark Fasheh
Subject: Re: [Ocfs2-devel] [PATCH] Revert "writeback: limit
    write_cache_pages integrity scanning to current EOF"
Message-ID: <20100629015822.GD24343@mail.oracle.com>
References: <20100628173529.GA10573@mail.oracle.com>
    <20100629002421.GY6590@dastard>
    <20100629005403.GC24343@mail.oracle.com>

On Mon, Jun 28, 2010 at 06:12:35PM -0700, Linus Torvalds wrote:
> On Mon, Jun 28, 2010 at 5:54 PM, Joel Becker wrote:
> >         Your contention is that we've never gotten those tail blocks to
> > disk.  Instead, our code either handles the future extensions of i_size
> > or we've just gotten lucky with our testing.  Our current BUG trigger is
> > because we have a new check that catches this case.  Does that summarize
> > your position correctly?
>
> Maybe Dave has some more exhaustive answer, but his point that
> block_write_full_page() already just drops the page does seem to be
> very valid. Which makes me suspect that it would be better to remove
> the ocfs2 BUG_ON() as a stop-gap measure, rather than reverting the
> commit. It seems to be true that the "don't bother flushing past EOF"
> commit really just uncovered an older bug.

Well, shit. Something has changed in here, or we're really really
(un)lucky. We visited this code a year ago or so when we had serious
zeroing problems, and we tested the hell out of it. Now it is broken
again. And it sure looks like that block_write_full_page() check has
been there since before git.

> So maybe ocfs2 should just replace the bug-on with invalidating the
> page (perhaps with a WARN_ONCE() to make sure the problem doesn't get
> forgotten about?)

Oh, no, that's not it at all. This is a disaster. I can't see for
the life of me why we haven't had 100,000 bug reports.

You're going to have an ocfs2 patch by the end of the week. It will
be ugly, I'm sure of it, but it has to be done. For every extend, we're
going to have to zero and potentially CoW around old_i_size if the old
allocation isn't within the bounds of the current write.

Joel

-- 

"In a crisis, don't hide behind anything or anybody. They're going
 to find you anyway."
        - Paul "Bear" Bryant

Joel Becker
Consulting Software Developer
Oracle
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127