Date: Tue, 25 Mar 2008 12:47:50 -0400
From: Theodore Tso
To: Ric Wheeler
Cc: Matthew Wilcox, Mark Lord, Linus Torvalds, Jens Axboe, Jeff Garzik,
    Tejun Heo, Greg KH, Andrew Morton, Linux Kernel,
    IDE/ATA development list, linux-scsi
Subject: Re: What to do about the 2TB limit on HDIO_GETGEO ?
Message-ID: <20080325164750.GG16358@mit.edu>
In-Reply-To: <47E91EE2.9080801@emc.com>

On Tue, Mar 25, 2008 at 11:48:50AM -0400, Ric Wheeler wrote:
>> Don't those devices run into trouble with fsck? The amount of memory
>> you need to fsck a device is obviously going to depend on the filesystem,
>> but it has to grow with device size, and I'm not sure that 4GB is enough
>> virtual address space to fsck 2TB.

Well 2TB, assuming a 4k blocksize, means 512M blocks, so each block
bitmap is 512 megabits (64 MB).
So at least for ext3, 4GB should be just enough, unless you hit
certain really nasty, complicated corruptions (i.e. a large number of
blocks claimed by more than one inode, which can happen if an inode
table is written to the wrong location on disk, on top of some other
portion of the inode table), or if the filesystem has a large number
of files with hard links (as is the case with certain backup
programs).

The plan is to implement some kind of run-length encoding to compress
the in-memory requirements for storing the bitmaps, but that hasn't
been coded yet. If someone who is a staff programmer for one of these
bookshelf NAS manufacturers is interested in implementing such a
beast, they should talk to me; I've thought quite a bit about the
design, and I just need a minion to implement it. :-)

> Absolutely - they more or less hit a stonewall once the disk has any
> trouble and you need to fsck. On the other hand, this might be merciful
> since on 64 bit boxes, we will let you run the fsck and watch it run for a
> week or so before you despair ;-)
>
> On a serious note, fsck time tends to track more the number of active
> inodes, so you can fsck a large file system if you use it to store large
> files (especially if you use a file system with dynamic inode creation or
> something like the uninitialized ext4 inodes).

And ext4 extents will help because they reduce the number of indirect
blocks you have to read, which will significantly reduce the fsck
time. So there are improvements on the horizon.

						- Ted
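
For anyone who wants to check the bitmap arithmetic from the thread, here
is a rough sketch. It is not taken from e2fsck; the helper names are
hypothetical, binary units are assumed, and it counts a single bitmap with
one bit per block. The run-length helper only illustrates the general idea
of storing runs instead of individual bits, not the design mentioned
in the message.

```python
# Back-of-the-envelope sizing for one in-memory block bitmap, plus a toy
# run-length encoder. Hypothetical helpers, not e2fsck code.
from itertools import groupby

TIB = 1 << 40
KIB = 1 << 10
MIB = 1 << 20


def block_bitmap_bytes(device_bytes: int, block_size: int) -> int:
    """Bytes needed for a bitmap holding one bit per filesystem block."""
    blocks = device_bytes // block_size
    return (blocks + 7) // 8  # round up to whole bytes


def run_lengths(bits):
    """Collapse a sequence of 0/1 values into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(bits)]


if __name__ == "__main__":
    blocks = (2 * TIB) // (4 * KIB)
    size = block_bitmap_bytes(2 * TIB, 4 * KIB)
    print(f"{blocks // (1 << 20)}M blocks, {size // MIB} MiB per bitmap")
    # A mostly-contiguous allocation pattern collapses to a few runs.
    print(run_lengths([1, 1, 1, 0, 0, 1]))
```

A bitmap whose set bits cluster into long runs shrinks to a handful of
pairs under this scheme, while a heavily fragmented one could end up
larger than the raw bits.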