Date: Fri, 4 Jun 2010 12:02:43 +1000
From: Dave Chinner
To: Chris Mason, Nick Piggin, "Martin K. Petersen", James Bottomley,
    Matthew Wilcox, Christof Schmitt, Boaz Harrosh,
    linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-fsdevel@vger.kernel.org
Subject: Re: Wrong DIF guard tag on ext2 write
Message-ID: <20100604020243.GE19651@dastard>

On Thu, Jun 03, 2010 at 11:46:34AM -0400, Chris Mason wrote:
> On Wed, Jun 02, 2010 at 11:41:21PM +1000, Nick Piggin wrote:
> > Closing the while it is dirty, while it is being written back window
> > still leaves a pretty big window. Also, how do you handle mmap writes?
> > Write protect and checksum the destination page after every store? Or
> > leave some window between when the pagecache is dirtied and when it is
> > written back? So I don't know whether it's worth putting a lot of effort
> > into this case.
>
> So, changing gears to how do we protect filesystem page cache pages
> instead of the generic idea of dif/dix, btrfs crcs just before writing,
> which does leave a pretty big window for the page to get corrupted.
> The storage layer shouldn't care or know about that though, we hand it a
> crc and it makes sure data matching that crc goes to the media.

I think the only way to get accurate CRCs is to stop modifications
from occurring while the page is under writeback. i.e. when a page
transitions from dirty to writeback we need to unmap any writable
mappings on the page, and then any new modifications (either by the
write() path or through ->fault) need to block waiting for page
writeback to complete before they can proceed...

If we can lock out modification during writeback, we can calculate
CRCs safely at any point in time the page is not mapped. e.g. we
could do the CRC calculation at copy-in time and store it on new
pages. During writeback, if the page has not been mapped then the
stored CRC can be used. If it has been mapped (say writeable mappings
clear the stored CRC during ->fault) then we can recalculate the CRC
once we've transitioned the page to being under writeback...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
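
The page life cycle proposed above (CRC stored at copy-in time, cleared
on a writable ->fault, recalculated if missing once the page is under
writeback, with modifications locked out meanwhile) can be sketched as a
toy user-space model. This is only an illustration under stated
assumptions, not kernel code: Page, PageBusy, copy_in, fault_writable,
mmap_store, start_writeback and end_writeback are hypothetical names, a
real kernel would block rather than raise, and zlib.crc32 merely stands
in for whatever guard tag or filesystem checksum is actually used.

```python
import zlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    """Toy stand-in for a page cache page (hypothetical, not a kernel struct)."""
    data: bytearray
    dirty: bool = False
    writeback: bool = False
    mapped_writable: bool = False
    stored_crc: Optional[int] = None


class PageBusy(Exception):
    """Raised where a real kernel would block until writeback completes."""


def copy_in(page: Page, offset: int, buf: bytes) -> None:
    """write() path: copy data into the page and cache its CRC."""
    if page.writeback:
        raise PageBusy("write() must wait for writeback to complete")
    if offset < 0 or offset + len(buf) > len(page.data):
        raise ValueError("write does not fit in the page")
    page.data[offset:offset + len(buf)] = buf
    page.dirty = True
    # The CRC is only trustworthy while nobody can store through a mapping.
    page.stored_crc = None if page.mapped_writable else zlib.crc32(page.data)


def fault_writable(page: Page) -> None:
    """->fault path: a writable mapping invalidates the stored CRC."""
    if page.writeback:
        raise PageBusy("fault must wait for writeback to complete")
    page.mapped_writable = True
    page.dirty = True
    page.stored_crc = None


def mmap_store(page: Page, offset: int, value: int) -> None:
    """A store through the mapping; only legal after a writable fault."""
    if not page.mapped_writable:
        raise PermissionError("no writable mapping; a fault is needed first")
    page.data[offset] = value


def start_writeback(page: Page) -> int:
    """dirty -> writeback: unmap writers, then return the CRC for the I/O."""
    page.mapped_writable = False  # later stores must fault and will block
    page.dirty = False
    page.writeback = True
    if page.stored_crc is None:
        # The page was mapped at some point, so recalculate now that the
        # contents can no longer change.
        page.stored_crc = zlib.crc32(page.data)
    return page.stored_crc


def end_writeback(page: Page) -> None:
    """I/O completion: modifications may proceed again."""
    page.writeback = False
```

In this model a page that was only ever touched by copy_in() goes to
writeback with the CRC computed at copy-in time, while a page that took
a writable fault gets its CRC recalculated in start_writeback(). Either
way the value handed to the storage layer matches the data for the whole
time the page is under writeback.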