Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755633AbcCNKeP (ORCPT ); Mon, 14 Mar 2016 06:34:15 -0400
Received: from mx1.redhat.com ([209.132.183.28]:44489 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755604AbcCNKeF (ORCPT ); Mon, 14 Mar 2016 06:34:05 -0400
Subject: Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks
To: Dave Chinner, Linus Torvalds
References: <56E18B9B.5070503@gmail.com> <56E24CA5.3030702@redhat.com> <20160311135952.57a44931@lxorguk.ukuu.org.uk> <20160311223047.GZ30721@dastard> <20160312003556.GF32214@thunk.org> <20160313233049.GA30721@dastard>
Cc: "Theodore Ts'o", Andy Lutomirski, One Thousand Gnomes, Gregory Farnum, "Martin K. Petersen", Christoph Hellwig, "Darrick J. Wong", Jens Axboe, Andrew Morton, Linux API, Linux Kernel Mailing List, shane.seymour@hpe.com, Bruce Fields, linux-fsdevel, Jeff Layton, Eric Sandeen
From: Ric Wheeler
Message-ID: <56E69398.7030508@redhat.com>
Date: Mon, 14 Mar 2016 06:34:00 -0400
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0
MIME-Version: 1.0
In-Reply-To: <20160313233049.GA30721@dastard>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3024
Lines: 63

On 03/13/2016 07:30 PM, Dave Chinner wrote:
> On Fri, Mar 11, 2016 at 04:44:16PM -0800, Linus Torvalds wrote:
>> On Fri, Mar 11, 2016 at 4:35 PM, Theodore Ts'o wrote:
>>> At the end of the day it's about whether you trust the userspace
>>> program or not.
>> There's a big difference between "give the user rope", and "tie the
>> rope in a noose and put a banana peel so that the user might stumble
>> into the rope and hang himself", though.
>>
>> So I do think that Dave is right that we should also strive to make
>> sure that our interfaces are not just secure in theory, but that they
>> are also good interfaces to make mistakes less likely.
> At which point I have to ask: how do we safely allow filesystems to
> expose stale data in files? There's a big "we need to trust
> userspace" component in every proposal that has been made so far -
> that's the part I have extreme trouble with.
>
> For example, what happens when a backup process running as root reads a
> file that has exposed stale data? Yes, we could set the "NODUMP"
> flag on the inode to tell backup programs to skip backing up such
> files, but we're now trusting some random userspace application
> (e.g. tar, rsync, etc) not to do something we don't want it to do
> with the data in that file.
>
> AFAICT, we can't stop root from copying files that have exposed
> stale data or changing their ownership without some kind of special
> handling of "contains stale data" files within the kernel. At this
> point we are back to needing persistent tracking of the "exposed
> stale data" state in the inode as the only safe way to allow us to
> expose stale data. That's fairly ironic given that the stated
> purpose of exposing stale data through fallocate is to avoid the
> overhead of the existing mechanisms we use to track extents
> containing stale data....

I think that once we enter this mode, the local file system has effectively ceded its role of preventing stale data exposure to the upper layer. In effect, this ceases to be a normal file system for any enabled process if we control this through fallocate(), or for all processes if we do the brute force mount option that would be file system wide.

That means we would not need to track this. Extents would be marked as if they always have had valid data (no more allocated but unwritten state).

In the end, that is the actual goal - move this enforcement up a layer for overlay/user space file systems that are then responsible for policing this kind of thing.

Regards,

Ric

>
>> I think we _should_ give users rope, but maybe we should also make
>> sure that there isn't some hidden rapidly spinning saw-blade right
>> next to the rope that the user doesn't even think about.
> IMO we already have a good, safe interface that provides the rope
> without the saw blades. I'm happy to be proven wrong, but IMO I
> don't see that we can provide stale data exposure in a safe,
> non-saw-bladey way without any kernel/filesystem side overhead.....
>
> Cheers,
>
> Dave.
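
For readers less familiar with the "allocated but unwritten" state mentioned in this thread, here is a minimal Python sketch (not taken from the thread) of what the existing interface guarantees: space reserved with fallocate reads back as zeros until it is written, so whatever was previously stored in those blocks is never visible. The helper name `preallocate_and_read` and the 64 KiB size are illustrative assumptions, and the fallback to `ftruncate` is only there for file systems where preallocation is not supported.

```python
import os
import tempfile


def preallocate_and_read(size: int = 1 << 16) -> bytes:
    """Reserve `size` bytes in a temporary file and read them back unwritten.

    Hypothetical helper for illustration only.
    """
    fd, path = tempfile.mkstemp()
    try:
        try:
            # Ask the file system to allocate blocks for [0, size).
            os.posix_fallocate(fd, 0, size)
        except OSError:
            # Assumption: if preallocation is unsupported here, extend the
            # file sparsely instead; reads of the hole are zeros as well.
            os.ftruncate(fd, size)
        # Nothing has been written, so this reads the untouched range.
        return os.pread(fd, size, 0)
    finally:
        os.close(fd)
        os.unlink(path)


data = preallocate_and_read()
# The range is fully readable and contains only zero bytes.
print(len(data), data == bytes(len(data)))
```

The proposals debated above would remove exactly this zero-fill guarantee for the processes or mounts that opt in, which is why the question of who polices stale data afterwards matters.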