Date: Thu, 25 Nov 2010 07:01:33 -0500
From: Christoph Hellwig
To: Nick Piggin
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-ext4@vger.kernel.org, Roman Zippel, "Tigran A. Aivazian",
    Boaz Harrosh, OGAWA Hirofumi, Dave Kleikamp, Bob Copeland,
    reiserfs-devel@vger.kernel.org, Christoph Hellwig,
    Evgeniy Dushistov, Jan Kara
Subject: Re: [RFC][PATCH] Possible data integrity problems in lots of filesystems?
Message-ID: <20101125120133.GA22222@infradead.org>
In-Reply-To: <20101125115457.GB3643@amd>

On Thu, Nov 25, 2010 at 10:54:57PM +1100, Nick Piggin wrote:
> On Thu, Nov 25, 2010 at 06:49:09PM +1100, Nick Piggin wrote:
> > Second is confusing sync and async inode metadata writeout
> > Core code clears I_DIRTY_SYNC and I_DIRTY_DATASYNC before calling
> > ->write_inode *regardless* of whether it is a for-integrity call or
> > not. This means background writeback can clear it, and subsequent
> > sync_inode_metadata or sync(2) call will skip the next ->write_inode
> > completely.
>
> Hmm, this also means that write_inode_now(sync=1) is buggy. It
> needs to in fact call ->fsync -- which is a file operation
> unfortunately, Christoph didn't you have some patches to move it
> into an inode operation?

No, it doesn't really make much sense either.

But what I've slowly started doing is to phase out write_inode_now. For
the cases where we really only want to write the inode we should use
sync_inode_metadata. That leaves only two other callers:

 - iput_final for a filesystem during unmount. This should be caught
   by the "need to call ->sync_fs" rule you mentioned above, but needs
   a closer audit.
 - nfsd. Any filesystem that cares should just use the commit_metadata
   export operation, which is a subset of ->fsync as it only needs to
   guarantee that metadata is on disk, but not actually any file
   data - so no cache flush mess as in a real fsync implementation.
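
For illustration only, the flag-clearing problem quoted above can be
modelled in userspace C. This is a toy sketch, not kernel code:
struct toy_inode, toy_write_inode(), toy_writeback() and the TOY_* flags
are made-up names, and the rule that only a sync call makes the metadata
durable is a simplifying assumption about how an affected filesystem
might behave.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for I_DIRTY_SYNC / I_DIRTY_DATASYNC. */
enum { TOY_DIRTY_SYNC = 1 << 0, TOY_DIRTY_DATASYNC = 1 << 1 };

/* Toy inode: dirty flags plus a marker for "metadata is durable". */
struct toy_inode {
    unsigned state;
    bool on_disk;
    int write_inode_calls;
};

/* Assumed filesystem hook: only a for-integrity (sync) call is durable. */
void toy_write_inode(struct toy_inode *inode, bool sync)
{
    assert(inode != NULL);
    inode->write_inode_calls++;
    if (sync)
        inode->on_disk = true;
}

/*
 * Core writeback as described in the quoted text: the dirty flags are
 * cleared before the hook runs, whether or not the call is synchronous.
 */
void toy_writeback(struct toy_inode *inode, bool sync)
{
    assert(inode != NULL);
    unsigned dirty = inode->state & (TOY_DIRTY_SYNC | TOY_DIRTY_DATASYNC);

    if (!dirty)
        return;                 /* looks clean, hook is skipped */
    inode->state &= ~dirty;     /* cleared regardless of sync */
    toy_write_inode(inode, sync);
}

/* Background pass first, then an integrity pass: is the inode durable? */
bool toy_background_then_sync(void)
{
    struct toy_inode inode = { .state = TOY_DIRTY_SYNC };

    toy_writeback(&inode, false);   /* async pass clears the flag */
    toy_writeback(&inode, true);    /* sync pass sees a clean inode */
    return inode.on_disk;
}

/* Integrity pass alone, for comparison. */
bool toy_sync_only(void)
{
    struct toy_inode inode = { .state = TOY_DIRTY_SYNC };

    toy_writeback(&inode, true);
    return inode.on_disk;
}
```

In this model toy_background_then_sync() reports the inode as not
durable, because the second pass never reaches toy_write_inode(), while
toy_sync_only() reports it as durable.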