Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756565AbcCBSwG (ORCPT ); Wed, 2 Mar 2016 13:52:06 -0500 Received: from mail-ig0-f195.google.com ([209.85.213.195]:32983 "EHLO mail-ig0-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752839AbcCBSwD (ORCPT ); Wed, 2 Mar 2016 13:52:03 -0500 MIME-Version: 1.0 In-Reply-To: <20160302040947.16685.42926.stgit@birch.djwong.org> References: <20160302040932.16685.62789.stgit@birch.djwong.org> <20160302040947.16685.42926.stgit@birch.djwong.org> Date: Wed, 2 Mar 2016 10:52:01 -0800 X-Google-Sender-Auth: PbmZ7fE04_abmsLz8IWzBeC4GJQ Message-ID: Subject: Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks From: Linus Torvalds To: "Darrick J. Wong" Cc: Jens Axboe , Christoph Hellwig , Andrew Morton , "Martin K. Petersen" , Linux API , Linux Kernel Mailing List , shane.seymour@hpe.com, Bruce Fields , linux-fsdevel , Jeff Layton Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2638 Lines: 59 On Tue, Mar 1, 2016 at 8:09 PM, Darrick J. Wong wrote: > Create a new ioctl to expose the block layer's newfound ability to > issue either a zeroing discard, a WRITE SAME with a zero page, or a > regular write with the zero page. This BLKZEROOUT2 ioctl takes > {start, length, flags} as parameters. So far, the only flag available > is to enable the zeroing discard part -- without it, the call invokes > the old BLKZEROOUT behavior. start and length have the same meaning > as in BLKZEROOUT. NAK, just based on annoyance with the randomness of this interface: - without describing what the point of the new flag and lots of extra expansion room is, this should never be merged. We don't add padding just randomly. - Somewhat related to that: the flags are checked for zero, but the random expansion room isn't. So not only are the random expansion fields not explained, they will contain random garbage in the future. - why is that "FMODE_WRITE" check not in the common code, but duplicated? it all seems very ad-hoc. It makes a big deal about that idiotic "discard" behavior, which is entirely pointless. Who cares about that discard behavior anyway? Why is it set to "false" in the current version of BLKZEROOUT in the first place? Why do we do that WRITE_SAME without questioning it, but "discard" is somehow so special that it has a flag, and it's turned off by default? If this is some "security issue" where somebody believes that discard is not secure, then those people are full of shit. Discard and write-same have exactly the same semantics - they may just unmap the target range with the guarantee that you'll get zeroes on read. So quite frankly, right now it seems that (a) the *only* excuse for this patch is that people want to use "discard" (b) the reason we already don't use "discard" for the old BLKZEROOUT is very questionable (c) any future possible use of flags is not described and is questionable You'll find people who think that "write-same with some non-zero pattern" would be a real over-write and good for security. Those people will then argue that a sane extension would be to make that pattern part of the future expansion of BLKZEROOUT2. And those people are full of shit. Write-same with a non-zero pattern may well be just a discard with the pattern set in another table. So the whole patch looks pointless. Why isn't the patch just "change false to true in blk_ioctl_zeroout() when it calls blkdev_issue_zeroout()". No new interface, no new random padding, just a simple "it makes absolutely no sense to not allow discard". Linus