Date: Thu, 18 Nov 2010 20:33:01 -0500
From: "Ted Ts'o"
To: Mark Lord
Cc: James Bottomley, Greg Freemyer, Jeff Moyer, Christoph Hellwig,
    Matthew Wilcox, Josef Bacik, Lukas Czerner,
    linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, sandeen@redhat.com
Subject: Re: [PATCH 1/2] fs: Do not dispatch FITRIM through separate super_operation
Message-ID: <20101119013301.GU3290@thunk.org>
References: <20101118134804.GN5618@dhcp231-156.rdu.redhat.com>
    <20101118141957.GK6178@parisc-linux.org>
    <20101118142918.GA18510@infradead.org>
    <1290100750.3041.72.camel@mulgrave.site>
    <1290102098.3041.77.camel@mulgrave.site>
    <4CE59E57.2090009@teksavvy.com>
    <1290117009.11007.42.camel@mulgrave.site>
    <4CE5A386.7000105@teksavvy.com>
In-Reply-To: <4CE5A386.7000105@teksavvy.com>

> >Before we go gung ho on this, there's no evidence that N discontiguous
> >ranges in one command are any better than the ranges sent N times ...
> >the same amount of erase overhead gets sent on SSDs.
>
> No, we do have evidence: execution time of the TRIM commands on the SSD.
>
> The one-range-at-a-time is incredibly slow compared to multiple
> ranges at a time.  That slowness comes from somewhere, with about
> 99.9% certainty that it is due to the drive performing slow flash
> erase cycles.

Mark, I think you are over-generalizing here.  You have observed with
some number of flash drives (maybe only one, but I don't know that for
sure) that TRIM is slow.  You conclude that this is because the drive
is doing slow flash erase cycles.  I don't completely accept that: I
haven't seen your measurements, and since we know that any kind of
command that requires a queue drain/flush before it can execute is
going to be slow, I don't know what kind of _slow_ you are observing.

But even if we *do* grant that you've seen one disk, or even a lot of
disks, doing something stupid, that just means that their
manufacturers have some idiotic engineers.  It does not follow that
all SSDs, thin-provisioned drives, or other devices implementing the
ATA TRIM command will do so in an incompetent way.

If you look at the T13 definition of TRIM, it is just a hint that the
contents of the block range do not _have_ to be preserved.  It does
not say that they *must* be erased.  This is not a security erase
command.  In fact, it is perfectly reasonable for a drive to keep the
TRIM state in volatile storage, so that the information about which
blocks have been TRIMmed gets discarded on a power failure.

So if SSDs are doing a full flash erase cycle for each TRIM, that may
not necessarily be a good idea.

I accept that there may be some incompetent implementations out there.
But I don't think this means we should assume that _all_
implementations are incompetent.  It does mean, though, that we can't
turn any of these features on by default.  But that's something we
know already.
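
To make the "TRIM is only a hint" point concrete, here is a toy model
of a drive that records trimmed ranges in volatile state and reclaims
them lazily.  This is purely a sketch: ToyDrive and its methods are
hypothetical names made up for illustration, not any real firmware or
kernel interface, and real drives may behave quite differently.

```python
class ToyDrive:
    """Hypothetical toy model; not how any particular drive works."""

    def __init__(self):
        self.data = {}        # persistent: LBA -> contents
        self.trimmed = set()  # volatile: LBAs the host no longer needs

    def write(self, lba, payload):
        self.data[lba] = payload
        self.trimmed.discard(lba)  # a fresh write cancels the hint

    def trim(self, start, count):
        # Only record the hint; no flash erase happens here.
        self.trimmed.update(range(start, start + count))

    def read(self, lba):
        # Contents of a trimmed block need not be preserved, but
        # nothing forces them to be gone yet either.
        return self.data.get(lba)

    def garbage_collect(self):
        # Reclaim hinted blocks whenever it is convenient.
        reclaimed = len(self.trimmed)
        for lba in self.trimmed:
            self.data.pop(lba, None)
        self.trimmed.clear()
        return reclaimed

    def power_cycle(self):
        # Volatile hints are lost; the old data simply stays around.
        self.trimmed.clear()
```

In this model a TRIM costs no erase cycle at all, and losing the hints
on power failure only means some blocks are reclaimed later than they
could have been.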
						- Ted