Date: Thu, 25 Nov 2010 21:06:03 +1100
From: Nick Piggin
To: Boaz Harrosh
Cc: Nick Piggin, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-ext4@vger.kernel.org, Roman Zippel, "Tigran A. Aivazian",
    OGAWA Hirofumi, Dave Kleikamp, Bob Copeland, reiserfs-devel@vger.kernel.org,
    Christoph Hellwig, Evgeniy Dushistov, Jan Kara
Subject: Re: [RFC][PATCH] Possible data integrity problems in lots of filesystems?
Message-ID: <20101125100603.GA3164@amd>
References: <20101125074909.GA4160@amd> <4CEE2C2E.4010003@panasas.com>
In-Reply-To: <4CEE2C2E.4010003@panasas.com>

On Thu, Nov 25, 2010 at 11:28:14AM +0200, Boaz Harrosh wrote:
> Hi Nick.
> Thanks for digging into this issue, I bet it's causing pain. Which
> I totally missed in my tests. I wish I had a better xsync+reboot
> tests for all this.

That's no problem, thanks for looking.

> So in that previous patch you had:
> > Index: linux-2.6/fs/exofs/file.c
> > ===================================================================
> > --- linux-2.6.orig/fs/exofs/file.c	2010-11-19 16:50:00.000000000 +1100
> > +++ linux-2.6/fs/exofs/file.c	2010-11-19 16:50:07.000000000 +1100
> > @@ -48,11 +48,6 @@ static int exofs_file_fsync(struct file
> >  	struct inode *inode = filp->f_mapping->host;
> >  	struct super_block *sb;
> >
> > -	if (!(inode->i_state & I_DIRTY))
> > -		return 0;
> > -	if (datasync && !(inode->i_state & I_DIRTY_DATASYNC))
> > -		return 0;
> > -
> >  	ret = sync_inode_metadata(inode, 1);
> >
> >  	/* This is a good place to write the sb */
> >
>
> Is that a good enough fix for the issue in your opinion?
> Or is there more involved?

For the inode dirty bit race problem, yes it should fix it.
sync_inode_metadata basically makes the same checks without races
(in a subsequent patch I re-introduced the datasync optimisation).

> In exofs there is nothing special to do other than VFS
> management and the final call, by vfs, to .write_inode.
>
> I wish we had a simple_file_fsync() from VFS that does
> what the VFS expects us to do. So when code evolves it
> does not need to change all FSs. This is the third time
> I'm fixing this code trying to second guess the VFS.

Well in your fsync, you need to wait for inode writeback that
might have been started by an asynchronous write_inode.

Also, with your sync_inode_metadata call, you shouldn't need the
sync_inode call by the looks.

> Actually the only other thing I need to do in file_fsync
> today is sb_sync. But this is a stupidity (and a bug) that
> I'm fixing soon. So that theoretical simple_file_fsync()
> would be all I need.
>
> Please advise?
> BTW: Do you want that I take the changes through my tree?

At this point I'd just like some review and feedback, we might
get some other opinions on how to fix it, so don't take the changes
quite yet. I'll cc you again with a broken out patch.

Thanks,
Nick
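
For readers following the thread, the dirty-bit race discussed above can be
illustrated with a small userspace model. This is only a sketch under stated
assumptions: every name in it (toy_inode, TOY_DIRTY, TOY_SYNC, the toy_*
functions) is hypothetical and is not kernel API. It assumes, as the reply
implies, that an asynchronous writeback may clear the dirty bit before the
metadata has actually reached disk, so an fsync that returns early on a clear
dirty bit can report success too soon.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for inode state bits (not the kernel's flags). */
enum {
    TOY_DIRTY = 1 << 0, /* metadata changed since the last writeback began */
    TOY_SYNC  = 1 << 1  /* an asynchronous writeback is still in flight */
};

struct toy_inode {
    unsigned state;
    bool on_disk; /* true once the latest metadata is on stable storage */
};

static void toy_mark_dirty(struct toy_inode *ino)
{
    ino->state |= TOY_DIRTY;
    ino->on_disk = false;
}

/* Assumption of the model: the dirty bit is cleared when writeback starts. */
static void toy_writeback_start(struct toy_inode *ino)
{
    ino->state &= ~TOY_DIRTY;
    ino->state |= TOY_SYNC;
}

/* Writeback completes; durable only if nothing re-dirtied the inode. */
static void toy_writeback_finish(struct toy_inode *ino)
{
    ino->state &= ~TOY_SYNC;
    ino->on_disk = !(ino->state & TOY_DIRTY);
}

/* Racy variant: trusts the dirty bit alone, like the removed early returns. */
static int toy_fsync_racy(struct toy_inode *ino)
{
    if (!(ino->state & TOY_DIRTY))
        return 0; /* may return while writeback is still in flight */
    ino->state &= ~TOY_DIRTY;
    ino->on_disk = true; /* synchronous write in the model */
    return 0;
}

/* Safer variant: first waits for in-flight writeback, then writes if dirty. */
static int toy_fsync_safe(struct toy_inode *ino)
{
    if (ino->state & TOY_SYNC)
        toy_writeback_finish(ino); /* "waiting" modelled as completion */
    if (ino->state & TOY_DIRTY) {
        ino->state &= ~TOY_DIRTY;
        ino->on_disk = true;
    }
    return 0;
}
```

In this model, dirtying an inode, starting writeback, and then calling
toy_fsync_racy() returns 0 while on_disk is still false; toy_fsync_safe() in
the same situation returns only after on_disk is true. The real fix in the
thread relies on sync_inode_metadata() rather than anything shown here.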