Date: Mon, 7 May 2007 19:36:22 -0400
From: Theodore Tso
To: Jeff Garzik
Cc: Andrew Morton, "Amit K. Arora", linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org,
    xfs@oss.sgi.com, suparna@in.ibm.com, cmm@us.ibm.com
Subject: Re: [PATCH 4/5] ext4: fallocate support in ext4
Message-ID: <20070507233622.GB29907@thunk.org>
In-Reply-To: <463FB008.3080706@garzik.org>

On Mon, May 07, 2007 at 07:02:32PM -0400, Jeff Garzik wrote:
> Andreas Dilger wrote:
> >On May 07, 2007 13:58 -0700, Andrew Morton wrote:
> >>Final point: it's fairly disappointing that the present implementation is
> >>ext4-only, and extent-only.  I do think we should be aiming at an ext4
> >>bitmap-based implementation and an ext3 implementation.
> >
> >Actually, this is a non-issue.  The reason that it is handled for
> >extent-only is that this is the only way to allocate space in the
> >filesystem without doing the explicit zeroing.  For other filesystems
> >(including ext3 and
>
> Precisely /how/ do you avoid the zeroing issue, for extents?
>
> If I posix_fallocate() 20GB on ext4, it damn well better be zeroed,
> otherwise the implementation is broken.

There is a bit in the extent structure which indicates that the extent
has not been initialized.  When reading from a block where the extent is
marked as uninitialized, ext4 returns zeros, to avoid returning the
uninitialized contents of the disk, which might contain someone else's
love letters, p0rn, or other information which we shouldn't leak out.
When writing to an extent which is uninitialized, we may have to split
it into as many as three extents in the worst case: the still
uninitialized blocks before the write, the newly written blocks, and the
still uninitialized blocks after it.  My understanding is that XFS uses
a similar implementation; it's a pretty obvious and standard way to
implement allocated-but-not-initialized extents.

We thought about supporting persistent preallocation for inodes using
indirect blocks, but it would require stealing a bit from each entry in
the indirect block, reducing the maximum size of the filesystem by a
factor of two (i.e., to 2**31 blocks).  It was decided it wasn't worth
the complexity, given the tradeoffs.

						- Ted
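
To illustrate the uninitialized-extent scheme described in the message, here is a minimal Python sketch. It is a toy model and not ext4 code: the `Extent` class, the `read_block` and `write_blocks` helpers, the 4096-byte block size, and the dict standing in for the disk are all assumptions made for the example, and logical block numbers are treated as physical ones to keep it short.

```python
from dataclasses import dataclass

BLOCK_SIZE = 4096  # assumed block size, for illustration only


@dataclass
class Extent:
    start: int         # first block covered by the extent
    length: int        # number of blocks covered
    initialized: bool  # False = allocated but never written


def read_block(extents, disk, block):
    """Return one block, hiding stale data behind uninitialized extents."""
    for ext in extents:
        if ext.start <= block < ext.start + ext.length:
            if not ext.initialized:
                # Never expose whatever was previously on the disk.
                return bytes(BLOCK_SIZE)
            return disk[block]
    raise ValueError("block is not allocated")


def write_blocks(extents, disk, start, data_blocks):
    """Write inside a single extent, splitting it if it was uninitialized."""
    if not data_blocks:
        return
    end = start + len(data_blocks)
    for i, ext in enumerate(extents):
        ext_end = ext.start + ext.length
        if ext.start <= start and end <= ext_end:
            break
    else:
        raise ValueError("write must fall inside a single extent")

    for offset, data in enumerate(data_blocks):
        disk[start + offset] = data

    if ext.initialized:
        return  # nothing to split

    # Worst case: untouched head, written middle, untouched tail.
    pieces = []
    if ext.start < start:
        pieces.append(Extent(ext.start, start - ext.start, False))
    pieces.append(Extent(start, end - start, True))
    if end < ext_end:
        pieces.append(Extent(end, ext_end - end, False))
    extents[i:i + 1] = pieces


if __name__ == "__main__":
    # Eight preallocated blocks sitting on top of stale disk contents.
    disk = {n: b"\xAA" * BLOCK_SIZE for n in range(8)}
    extents = [Extent(0, 8, False)]
    write_blocks(extents, disk, 3, [b"\x01" * BLOCK_SIZE])
    for ext in extents:
        print(ext)
```

Writing one block into the middle of the eight-block uninitialized extent leaves three extents behind (blocks 0-2 uninitialized, block 3 initialized, blocks 4-7 uninitialized), which is the worst case mentioned in the message. A write that starts or ends on an extent boundary produces only two pieces, and a write covering the whole extent simply flips the flag.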