Date: Thu, 17 Mar 2016 11:35:12 -0700
From: Chris Mason
To: Linus Torvalds
CC: Gregory Farnum, Eric Sandeen, "Theodore Ts'o", Andreas Dilger,
    "Darrick J. Wong", Dave Chinner, Ric Wheeler, Andy Lutomirski,
    One Thousand Gnomes, Martin Petersen, Christoph Hellwig, Jens Axboe,
    Andrew Morton, Linux API, Linux Kernel Mailing List,
    shane.seymour@hpe.com, Bruce Fields, linux-fsdevel, Jeff Layton
Subject: Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks
Message-ID: <20160317183512.GA76233@clm-mbp.thefacebook.com>

On Thu, Mar 17, 2016 at 10:47:29AM -0700, Linus Torvalds wrote:
> On Wed, Mar 16, 2016 at 10:18 PM, Gregory Farnum wrote:
> >
> > So we've not asked for NO_HIDE_STALE on the mailing lists, but I think
> > it was one of the problems Sage had using xfs in his BlueStore
> > implementation and was a big part of why it moved to pure userspace.
> > FileStore might use NO_HIDE_STALE in some places but it would be
> > pretty limited. When it came up at Linux FAST we were discussing how
> > it and similar things had been problems for us in the past and it
> > would've been nice if they were upstream.
>
> Hmm.
>
> So to me it really sounds like somebody should cook up a patch, but we
> shouldn't put it in the upstream kernel until we get numbers and
> actual "yes, we'd use this" from outside of google.

We haven't had internal tiers yelling at us for fallocate performance,
so I'm unlikely to suggest it, just because it's a potential privacy leak
we'd have to educate people about.

What I'd be more likely to use is code inside the filesystem like this:

somefs_fallocate()
{
	if (trim_can_really_zero(my_device)) {
		trim
		allocate a regular extent
		return
	} else {
		do normal fallocate
	}
}

Then the out of tree patch (for google or whoever) becomes a hack to flip
trim_can_really_zero on a given block device. The rest of us can use
explicit interfaces from the hardware when deciding what we want
preallocation to mean.

It gets messy for crcs in btrfs, so we'd need the old-fashioned
preallocation anyway. But the database workloads where this matters
aren't our target right now, so it's more an ext4/xfs thing anyway.

-chris
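
For anyone who wants to poke at the somefs_fallocate() idea outside the kernel, here is a rough userspace toy model of the dispatch in Python. It is only a sketch under stated assumptions: BlockDevice, ToyFS, the trim_zeroes flag, and the logged operation names are all made up for illustration and do not correspond to any kernel interface. "Do normal fallocate" is modeled as allocating an unwritten extent, which is how ext4 and xfs preallocate today.

```python
from dataclasses import dataclass, field


@dataclass
class BlockDevice:
    """Hypothetical stand-in for a block device (not a kernel structure)."""
    name: str
    # Assumed per-device flag: True if TRIM is trusted to return zeroes.
    # In the email this is what an out-of-tree patch would flip.
    trim_zeroes: bool = False


@dataclass
class ToyFS:
    """Toy filesystem that only records which operations it would issue."""
    device: BlockDevice
    log: list = field(default_factory=list)

    def trim_can_really_zero(self) -> bool:
        return self.device.trim_zeroes

    def fallocate(self, offset: int, length: int) -> str:
        if self.trim_can_really_zero():
            # The device zeroes on TRIM, so a regular (written) extent
            # cannot expose stale data.
            self.log.append(("trim", offset, length))
            self.log.append(("alloc_regular", offset, length))
            return "regular"
        # Normal fallocate: an unwritten extent that reads back as zeroes
        # until it is actually written.
        self.log.append(("alloc_unwritten", offset, length))
        return "unwritten"
```

With trim_zeroes=True, fallocate() logs a trim followed by a regular allocation; with the default it logs a single unwritten allocation. The policy lives in one per-device predicate, so the filesystem code path stays the same for everyone.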