Date: Tue, 15 Mar 2016 17:51:17 -0700
From: Chris Mason
CC: Linus Torvalds, Dave Chinner, "Theodore Ts'o", Ric Wheeler,
    Andy Lutomirski, One Thousand Gnomes, Gregory Farnum,
    "Martin K. Petersen", Christoph Hellwig, "Darrick J. Wong",
    Jens Axboe, Andrew Morton, Linux API, Linux Kernel Mailing List,
    Bruce Fields, linux-fsdevel, Jeff Layton
Subject: Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks
Message-ID: <20160316005117.GA34410@clm-mbp.thefacebook.com>
In-Reply-To: <56E8A916.8050702@redhat.com>
References: <20160313233049.GA30721@dastard> <56E69398.7030508@redhat.com>
    <20160314144603.GO29218@thunk.org> <20160315201431.GG30721@dastard>
    <20160315223313.GH30721@dastard> <20160315235216.GI30721@dastard>
    <56E8A916.8050702@redhat.com>
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Mar 15, 2016 at 07:30:14PM -0500, Eric Sandeen wrote:
> On 3/15/16 7:06 PM, Linus Torvalds wrote:
> > On Tue, Mar 15, 2016 at 4:52 PM, Dave Chinner wrote:
> >> >
> >> > It is pretty clear that the onus is on the patch submitter to
> >> > provide justification for inclusion, not for the reviewer/Maintainer
> >> > to have to prove that the solution is unworkable.
> > I agree, but quite frankly, performance is a good justification.
> >
> > So if Ted can give performance numbers, that's justification enough.
> > We've certainly taken changes with less.
>
> I've been away from ext4 for a while, so I'm really not on top of the
> mechanics of the underlying problem at the moment.
>
> But I would say that in addition to numbers showing that ext4 has trouble
> with unwritten extent conversion, we should have an explanation of
> why it can't be solved in a way that doesn't open up these concerns.
>
> XFS certainly has different mechanisms, but is the demonstrated workload
> problematic on XFS (or btrfs) as well?
> If not, can ext4 adopt any of the
> solutions that make the workload perform better on other filesystems?

When I've benchmarked this in the past, doing small random buffered
writes into a preallocated extent was dramatically (3x or more) slower
on xfs than doing them into a fully written extent.  That was two years
ago, but I can redo it.

On a fio card, the job below gets 16,000 iops on a preallocated extent
and 40,000 iops if you run it a second time, against the now fully
written file.  It's not random writes, but the fsync probably means the
preallocated conversion is more expensive.  That's on a 4.0 kernel, but
I'll rerun it on nvme on newer kernels.

fio --name=fsync --rw=write --fsync=1 --bs=4k --filename=/xfs/fio_4096 --size=4g --overwrite=0

I'm happy to run variations on things, just let me know what workloads
are interesting.

-chris
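
For anyone who wants to try the same comparison without fio, here is a
rough Python sketch of the access pattern described above: preallocate
the file, write it sequentially in 4k blocks with an fsync after each
write, then repeat the pass over the now-written extents.  The function
names, the default size and the command-line path are illustrative
assumptions and not part of the fio job; buffered pwrite() plus fsync()
only approximates what fio does, so the absolute numbers will differ.
Note also that where the filesystem has no native fallocate support,
posix_fallocate() may fall back to writing zeros, in which case the
first pass no longer hits unwritten extents.

```python
import os
import sys
import time


def sequential_fsync_iops(path, size, block_size=4096):
    """Write `size` bytes sequentially in `block_size` chunks, calling
    fsync after every write, and return the achieved writes per second."""
    buf = b"\xa5" * block_size
    fd = os.open(path, os.O_WRONLY)
    try:
        start = time.perf_counter()
        for offset in range(0, size, block_size):
            os.pwrite(fd, buf, offset)
            os.fsync(fd)
        # Guard against a zero interval on very small runs.
        elapsed = max(time.perf_counter() - start, 1e-9)
    finally:
        os.close(fd)
    return (size // block_size) / elapsed


def compare(path, size=64 * 1024 * 1024, block_size=4096):
    """Return (first_pass_iops, second_pass_iops) for a freshly
    preallocated file: the first pass writes into preallocated space,
    the second pass overwrites blocks that have already been written."""
    fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o644)
    try:
        # Reserve the space up front (assumption: the filesystem turns
        # this into unwritten extents rather than zero-filled blocks).
        os.posix_fallocate(fd, 0, size)
    finally:
        os.close(fd)
    first = sequential_fsync_iops(path, size, block_size)
    second = sequential_fsync_iops(path, size, block_size)
    return first, second


if __name__ == "__main__":
    # Usage: python3 prealloc_fsync.py /path/on/filesystem/under/test
    first, second = compare(sys.argv[1])
    print(f"preallocated pass: {first:.0f} iops")
    print(f"overwrite pass:    {second:.0f} iops")
```

The default size is kept far smaller than the 4g used in the fio job so
a quick run finishes in reasonable time; raise it for anything meant to
be compared against real numbers.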