Date: Thu, 8 Apr 2021 15:21:55 +1000
From: Dave Chinner
To: Theodore Ts'o
Cc: Eric Biggers, Leah Rumancik, linux-ext4@vger.kernel.org
Subject: Re: [PATCH v2 1/2] ext4: wipe filename upon file deletion
Message-ID: <20210408052155.GK1990290@dread.disaster.area>
References: <20210407154202.1527941-1-leah.rumancik@gmail.com> <20210407154202.1527941-2-leah.rumancik@gmail.com>
X-Mailing-List: linux-ext4@vger.kernel.org

On Wed, Apr 07, 2021
at 11:48:40PM -0400, Theodore Ts'o wrote:
> On Wed, Apr 07, 2021 at 02:33:15PM -0700, Eric Biggers wrote:
> > On Wed, Apr 07, 2021 at 03:42:01PM +0000, Leah Rumancik wrote:
> > > Zero out filename and file type fields when file is deleted.
> >
> > Why?
>
> Eric is right that we need to have a better explanation in the commit
> description.
>
> In answer to Eric's question, the problem that is trying to be solved
> here is that if a customer happens to be storing PII in filenames

"if"

Is this purely a hypothetical "if", or is it "we have a customer
that actually does this"? Because if this is just hypothetical, then
future customers should already be advised and know not to store PII
in clear text *anywhere* in their systems.

> (e-mail addresses, SSN's, etc.) that they might want to have a
> guarantee that if a file is deleted, the filename and the file's
> contents can be considered as *gone* after some wipeout time period
> has elapsed. So the use case is every N hours, some system daemon
> will execute FITRIM and FS_IOC_CHKPT_JRNL with the CHKPT_JRNL_DISCARD
> flag set, in order to meet this particular guarantee.

This seems like a better fit for FITRIM than anything else. Ooohh.
We sure do suck at APIs, don't we? FITRIM has no flags field, so we
can't extend that.

But it still makes more sense to me to have something like:

	int fstrim(int fd, struct fstrim_range *r, int flags)

syscall where the flags field can indicate that the journal should
be trimmed. At that point, the "journal checkpoint and flush" is
implied by the fact userspace is asking for the journal to be
discarded....

> P.S. By the way, this is a guarantee that we're going to eventually
> want to care about for XFS as well, since as of COS-85
> (Container-Optimized OS), XFS is supported in Preview Mode. This
> means that eventually we're going to want to submit patches so as to be
> able to support the CHKPT_JRNL_DISCARD flag for FS_IOC_CHKPT_JRNL in
> XFS as well.

Oh, that won't be fun.
XFS places a whiteout over the dirent to indicate that it has been
freed, and it does not actually log anything other than the 4 byte
whiteout at the start of the dirent and the 2 byte
XFS_DIR2_DATA_FREE_TAG tag at the end of the dirent. So zeroing
dirents is going to require changing the size and shape of dirent
logging during unlinks...

This will have to be done correctly for all the node merge, split
and compaction cases, too, not just the "remove name" code.

> P.P.S. We'll also want to have a mount option which suppresses file
> names (for example, from ext4_error() messages) from showing up in
> kernel logs, to ease potential privacy concerns with respect to serial
> console and kernel logs. But that's for another patch set....

This sounds more and more like "Don't encode PII in clear text
anywhere" is a best practice that should be enforced with a big
hammer.

Filenames get everywhere and there's no easy way to prevent that
because path lookups can be done by anyone in the kernel. This sounds
very much like you're starting a game of whack-a-mole that can
never be won.

From a security perspective, this is just bad design. Storing PII in
clear text filenames pretty much guarantees that the PII will leak
because it can't be scrubbed/contained within application controlled
boundaries. Trying to contain the spread of filenames within random
kernel subsystems sounds like a fool's errand to me, especially
given how few kernel developers will even know that filenames are
considered sensitive information from a security perspective...

Fundamentally, applications should *never* place PII in clear text
in potentially leaky environments. The environment for storing PII
should be designed to be secure and free of data leaks from the
ground up. And ext4 has already got this with fscrypt support.....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com