Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756621Ab1DLArA (ORCPT ); Mon, 11 Apr 2011 20:47:00 -0400 Received: from e36.co.us.ibm.com ([32.97.110.154]:40962 "EHLO e36.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756467Ab1DLAq6 (ORCPT ); Mon, 11 Apr 2011 20:46:58 -0400 Subject: Re: [RFC] block integrity: Fix write after checksum calculation problem From: Mingming Cao To: Chris Mason Cc: Jeff Layton , djwong , Jan Kara , Dave Chinner , Joel Becker , "Martin K. Petersen" , Jens Axboe , linux-kernel , linux-fsdevel , Mingming Cao , linux-scsi In-Reply-To: <1302543595-sup-4352@think> References: <20110224182732.GV27190@tux1.beaverton.ibm.com> <1298897186-sup-9394@think> <20110304210724.GF27190@tux1.beaverton.ibm.com> <20110308045626.GD1956@dastard> <20110319000755.GD1110@tux1.beaverton.ibm.com> <20110321140451.GA7153@quack.suse.cz> <1300716666-sup-2087@think> <20110321164305.GC7153@quack.suse.cz> <20110406232938.GF1110@tux1.beaverton.ibm.com> <20110407165700.GB7363@quack.suse.cz> <20110408203135.GH1110@tux1.beaverton.ibm.com> <20110411124229.47bc28f6@corrin.poochiereds.net> <1302543595-sup-4352@think> Content-Type: text/plain; charset="UTF-8" Date: Mon, 11 Apr 2011 17:46:52 -0700 Message-ID: <1302569212.2580.13.camel@mingming-laptop> Mime-Version: 1.0 X-Mailer: Evolution 2.28.1 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 1932 Lines: 49 On Mon, 2011-04-11 at 13:41 -0400, Chris Mason wrote: > Excerpts from Jeff Layton's message of 2011-04-11 12:42:29 -0400: > > > @@ -5839,6 +5844,15 @@ int ext4_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf) > > > if (ret < 0) > > > goto out_unlock; > > > ret = 0; > > > + > > > + /* > > > + * write_begin/end might have created a dirty page and someone > > > + * could wander in and start the IO. Make sure that hasn't > > > + * happened. > > > + */ > > > + lock_page(page); > > > + wait_on_page_writeback(page); > > > + unlock_page(page); > > I am little puzzled here. if someone wander in and start the IO, the page is up-to-date (dirtied by this page_mkwrite). We shouldn't see the checksum inconsistancy, right? > > nit: > > > > The callers of page_mkwrite always lock the page afterward if you > > return from page_mkwrite with it unlocked. If you plan to take page > > lock anyway, it's probably slightly more efficient not to unlock it and > > instead return VM_FAULT_LOCKED. > > > > Actually this isn't a nit. Keeping the page locked closes an important > hole where it can become writeback again. It might fix the last > remaining problem. > Oh, right. Currently ext4_page_mkwrite drops the page lock before calling it's dirty the page (by write_begin() and write_end(). I suspect regrab the lock() after write_end() (with your proposed change) and returning with locked still leave the dirty by ext4_page_mkwrite unlocked. We probably should to keep the page locked the page during the entire ext4_page_mkwrite() call. Any reason to drop the page lock() before calling aops->write_begin()? Mingming -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/