From: Allison Henderson Subject: Re: [PATCH 1/2 v3] EXT4: Secure Delete: Zero out file data Date: Thu, 07 Jul 2011 18:55:38 -0700 Message-ID: <4E16639A.3060504@linux.vnet.ibm.com> References: <1309468923-5677-1-git-send-email-achender@linux.vnet.ibm.com> <1309468923-5677-2-git-send-email-achender@linux.vnet.ibm.com> <4E14CE15.90404@linux.vnet.ibm.com> <2DE49B61-CC67-4613-99EB-88601D6EC564@dilger.ca> <4E1614C1.1050209@linux.vnet.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Andreas Dilger , linux-ext4@vger.kernel.org To: Amir Goldstein Return-path: Received: from e38.co.us.ibm.com ([32.97.110.159]:41143 "EHLO e38.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752658Ab1GHBzm (ORCPT ); Thu, 7 Jul 2011 21:55:42 -0400 Received: from d03relay04.boulder.ibm.com (d03relay04.boulder.ibm.com [9.17.195.106]) by e38.co.us.ibm.com (8.14.4/8.13.1) with ESMTP id p681l7Ai026976 for ; Thu, 7 Jul 2011 19:47:07 -0600 Received: from d03av04.boulder.ibm.com (d03av04.boulder.ibm.com [9.17.195.170]) by d03relay04.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id p681tf1g113034 for ; Thu, 7 Jul 2011 19:55:41 -0600 Received: from d03av04.boulder.ibm.com (loopback [127.0.0.1]) by d03av04.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id p67JtfYT016502 for ; Thu, 7 Jul 2011 13:55:41 -0600 In-Reply-To: Sender: linux-ext4-owner@vger.kernel.org List-ID: On 07/07/2011 05:09 PM, Amir Goldstein wrote: > On Thu, Jul 7, 2011 at 11:19 PM, Allison Henderson > wrote: >> On 07/07/2011 12:52 PM, Andreas Dilger wrote: >>> >>> On 2011-07-07, at 1:05 AM, Amir Goldstein wrote: >>>> >>>> On Thu, Jul 7, 2011 at 12:05 AM, Allison Henderson >>>> wrote: >>>>> >>>>> On 07/02/2011 02:33 AM, Amir Goldstein wrote: >>>>>> >>>>>> On Fri, Jul 1, 2011 at 12:22 AM, Allison Henderson >>>>>> wrote: >>>>>>> >>>>>>> @@ -4485,6 +4485,14 @@ void ext4_free_blocks(handle_t *handle, struct >>>>>>> inode *inode, >>>>>>> ext4_debug("freeing block %llu\n", block); >>>>>>> trace_ext4_free_blocks(inode, block, count, flags); >>>>>>> >>>>>>> + if (flags& EXT4_FREE_BLOCKS_ZERO) { >>>>>>> + err = sb_issue_zeroout(inode->i_sb, block, count, >>>>>>> GFP_NOFS); >>>>>> >>>>>> But the delete of these blocks in not yet committed, >>>>>> so after reboot, you can end up with a non-deleted but zeroed file >>>>>> data. >>>>>> Is that acceptable? I should think not. >>>>>> >>>>>> One way around this is a 2-phase unlink/truncate. >>>>>> Phase 1: add to orphan list and register a callback on commit >>>>>> Phase 2: issue zeroout and free the blocks >>>>>> >>>>>> This won't work for punch hole, but then again, for punch hole >>>>>> it's probably OK to end up with zeroed data, but non-deleted blocks. >>>>>> Right? >>>>> >>>>> Hi, I had a quick question about the orphan list. I notice that >>>>> ext4_ext_truncate and also ext4_ext_punch_hole already have a call to >>>>> ext4_orphan_add that happens really early before any calls to free >>>>> blocks. >>>>> Does this address your earlier concerns, or is there another reason I >>>>> missed? Thx! >>>> >>>> It doesn't address the concerns of getting a non-deleted file with zeroed >>>> data >>>> after crash, because the existence of the inode on the orphan list after >>>> crash >>>> depends on the transaction that added it to the list being committed. >>>> And your patch zeroes the blocks before that transaction is committed. >>>> >>>> However, the orphan list gives you a very good framework to implement >>>> deferred delete (by a kernel thread) as Andreas suggested. >>>> Unlink should be simple, because freeing unlinked inode blocks it is >>>> anyway >>>> deferred till the inode refcount drops to zero. >>> >>> Right. The patch that I referenced moved all of the blocks from unlink >>> and truncate-to-zero from the current inode to a new temporary inode on >>> the >>> orphan list (simply copying the i_blocks field + i_block and i_size, IIRC, >>> and zeroing them on the original inode). >>> >>>> Truncate is more tricky, because of the truncate shrink/extend >>>> requirement >>>> (that all data is zeroes after extending the inode's size via truncate >>>> system call), so a shrinking-deferred truncate would have to mark all the >>>> to-be-deleted extents uninitialized. >>> >>> It would be possible to do this for partial truncate/punch as well, to >>> move whole blocks over to a new inode on the orphan list and zeroing only >>> the 1 or 2 partial blocks inline. >>> >>> It should even be possible to leverage the "block migrate" facility used >>> by defrag, so that we don't duplicate this code. That would mean just >>> allocating a temp "unlink" inode in the kernel and putting it on the >>> orphan >>> list (like an open-unlinked file), migrate the selected range of blocks, >>> and then zeroing the blocks in the background before unlinking the inode. >>> >>> I don't think that just marking the deleted extents as uninitialized is >>> enough, since it would still leave "private" data on disk that could be >>> read afterward. This would also only work for extent-mapped filesystems. >>> >>> There may need to be some work to enable the migrate code on block-mapped >>> files, if you want to allow secure-delete on those files, but that is good >>> IMHO since it also means that we could defrag block-mapped files. >>> >>> Cheers, Andreas >>> >> >> Ah, ok then. Yes, part of the requirements was to make secure delete work >> for partial truncates, punch hole, and also indexed files. So that will >> save me some time if I can get the migrate routines work. Thx for the >> pointers all! >> > > I realized that there is a basic flaw in the concept of deferred-secure-delete. > From a security point of view, after a crash during a secure-delete, > if the file is not there, all its data should have been wiped. > Orphan cleanup on the next mount may be done on a system that > doesn't respect secure delete. > So for real security, the unlink/truncate command cannot return before > all data is wiped. > The unlink/truncate metadata changes must not even be committed > before all data is wiped (or at least part of the data with partial truncate). > > Amir. I see, so then it sounds like using a background thread to do the zeroing would not help us if we have to wait for it complete anyway. Going back to the 2 phase approach, this means that we need to do the zero out and then the free before we do the orphan list and commit? Just trying to make sure I understand things correctly :) Allison Henderson