From: Michael Halcrow
Subject: Re: [PATCH] fscrypto: make XTS tweak initialization endian-independent
Date: Wed, 5 Oct 2016 14:11:57 -0700
Message-ID: <20161005211157.GB1164@google.com>
References: <20161005170659.GA110549@google.com> <20161005182307.GA1164@google.com> <5c01fd8e-95e6-669c-9f9d-30ab5a7af9fd@nod.at>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: David Gstir, linux-fsdevel, linux-ext4@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net, Theodore Ts'o, jaegeuk@kernel.org, Eric Biggers, Anand Jain, Tyler Hicks
To: Richard Weinberger
Return-path:
Content-Disposition: inline
In-Reply-To: <5c01fd8e-95e6-669c-9f9d-30ab5a7af9fd@nod.at>
Sender: linux-fsdevel-owner@vger.kernel.org
List-Id: linux-ext4.vger.kernel.org

On Wed, Oct 05, 2016 at 08:44:09PM +0200, Richard Weinberger wrote:
> Michael,
>
> On 05.10.2016 20:23, Michael Halcrow wrote:
> >> Eric,
> >>
> >>> On 04.10.2016, at 18:38, Eric Biggers wrote:
> >>>
> >>> On Tue, Oct 04, 2016 at 10:46:54AM +0200, Richard Weinberger wrote:
> >>>>> Also, currently this code *is* only supposed to be used for XTS.
> >>>>> There's a bug where a specially crafted filesystem can cause
> >>>>> this code path to be entered with CTS, but I have a patch
> >>>>> pending in the ext4 tree to fix that.
> >>>>
> >>>> David and I are currently working on UBIFS encryption and we have
> >>>> to support other cipher modes than XTS. So, keeping fscrypto as
> >>>> generic as possible would be nice. :-)
> >>>>
> >>>
> >>> The problem was that the kernel supported reading a file whose
> >>> contents was encrypted with CTS, which is only supposed to be used
> >>> for filenames. This was inconsistent with
> >>> FS_IOC_SET_ENCRYPTION_POLICY which currently only allows XTS for
> >>> contents and CTS for filenames. So in other words I wanted to
> >>> eliminate a strange scenario that was not intended to happen and
> >>> was almost certainly never tested.
> >>>
> >>> Either way, new modes can still be added if there is a good reason
> >>> to do so. What new encryption modes are you thinking of adding,
> >>> would they be for contents or for filenames, and are you thinking
> >>> they would be offered by all filesystems (ext4 and f2fs too)?
> >>
> >> We currently have one case where our embedded platform is only able
> >> to do AES-CBC in hardware, not AES-XTS. So switching to AES-CBC for
> >> file contents would yield far better performance while still being
> >> "secure enough".
> >
> > Great to see more interest in file system encryption. A few thoughts.
> >
> > I'm concerned about the proliferation of storage encryption code in
> > the kernel. Of course, I'm perhaps the worst instigator. However
> > what's happening now is that we have several file systems that are
> > proposing their own encryption, as well as several attempts at support
> > for hardware encryption.
> >
> > High-performance random access read/write block storage encryption
> > with authentication is hard to get right. The way I see it, the ideal
> > solution would have these properties:
> >
> >  * The actual cryptographic transform happens in as few places as
> >    possible -- preferably one place in software, with a sensible
> >    vendor-neutral API for deferring to hardware.
> >
> >  * All blocks in the file system, including both file contents and
> >    file system metadata, are cryptographically protected.
> >
> >  * Encryption is authenticated and has versioning support to enforce
> >    consistency and defend against rollback.
> >
> >  * File systems can select which keys protect which blocks.
> >
> >  * Authentication of all storage chains back to Secure Boot.
> >
> > To solve all of these simultaneously, it looks like we'll want to
> > consider changes to the kernel block API:
>
> Not all filesystems use the block layer, hint: UBIFS.
>
> > From here, we can delegate to dm-crypt to perform the block
> > transformation using the key in the bio.
> > Or we can defer to the block
> > storage driver to provision the key into the hardware encryption
> > element and tag requests to use that key.
> >
> > This promises to get a big chunk of the file contents encryption logic
> > out of the file system layer.
> >
> > If the file system doesn't provide a bi_crypt_ctx, then dm-crypt can
> > use the default key, which would be shared among all tenants of the
> > system. That shared key can potentially be further protected by the
> > distro by leveraging a secure element like a TPM.
>
> No dm-crypt available in MTD land.
>
> > For user-specific file contents -- say, what's in the user's home
> > directory -- then that can be protected with a key that's only made
> > available after the user logs in (providing their credentials). Other
> > tenants on the same device who can get at the shared key might still
> > get information like how many files other users have or what the
> > directory structure is, but at least they can't read the contents of
> > other users' files. Meanwhile, the volume is comprehensively
> > protected against the "left in a taxi" scenario.
> >
> >> Generally speaking though, it would be great to have encryption
> >> _and_ authentication for file contents.
> >
> > Not good enough for me. I want authenticated encryption for
> > everything, contents or metadata.
>
> Well, let's focus first on file contents.
> We already have the fscrypto framework.
>
> What you suggest is completely different from what we have now.
>
> >> AEAD modes like GCM or future finalists of the CAESAR competition
> >> come to mind.
> >
> > GCM is problematic for block storage, primarily because it's
> > catastrophic to reuse a key/IV pair.
> >
> > If you naively use the same key when writing more than 2^32 blocks
> > with a random IV, you've just stepped into the collision "danger
> > zone" (per NIST SP 800-38D).
> > We have a design that involves frequent
> > encryption key derivation in order to address the collision space
> > problem. But that's just one piece of the solution to the whole
> > problem.
> >
> >> IIRC the ext4 encryption design document mentions this, but it's
> >> unclear to me why AES-GCM wasn't used for file contents from the
> >> beginning. I'd guess it has to do with where to store the
> >> authentication tag and performance.
> >
> > Comparatively, that's the easy part. The hard part is ensuring
> > *consistency* between the ciphertext and the cryptographic metadata.
> > If you write out the ciphertext and don't get the IV you used for it
> > out to storage simultaneously, you've just lost the block. And
> > vice-versa.
> >
> > Then there's the problem of internal consistency. Supposing you do
> > manage to get the blocks and their crypto metadata out together,
> > what's to stop an attacker from punching holes (for example)? You
> > need an authenticated dictionary structure at that point, such as a
> > Merkle tree or an authenticated skiplist.
> >
> > Now you have an additional data structure to maintain. And you're
> > rebalancing a Merkle tree in the midst of modifications, or you're
> > producing an implementation of ASL in the Linux kernel (which, BTW, my
> > team does have a prototype for right now).
> >
> > Once we have a root of an authenticated dictionary, we can look to a
> > high-performance secure hardware element to sign that root against a
> > monotonic counter to get rollback protection.
> >
> > To protect the entire block device, we need the authentication data to
> > be consistent with the ciphertext at the block level. So that means
> > something like copy-on-write or log-structured volume at the
> > dm-layer. Right now the best shortcut I've been able to come up with
> > starts with a loopback mount on btrfs.
> >
> >> Does anybody have details on that?
> >
> > Hopefully I've been able to shine some light on the reasons why
> > high-performance random access read/write block storage encryption
> > with authentication is a harder problem than it looks on the surface.
> >
> > In the meantime, to address the CBC thing, I'd want to understand what
> > the hardware is doing exactly. I wouldn't want the existence of code
> > that supports CBC in fs/crypto to be interpreted as some sort of
> > endorsement for using it rather than XTS (when unauthenticated
> > encryption is for some reason the only viable option) for new storage
> > encryption applications.
>
> The hardware offers AES-CBC, accessible via the kernel crypto API.

I presume your goal is to usually package up relatively large segments
of data you'd like to chain together under one key/IV? Else, for
random-access block storage, I would like to get an idea of what the
latency/throughput/power impact would be vs. just doing AES-XTS on the
CPU.

Regardless, if you need IV generation in fs/crypto, you can use ESSIV
from eCryptfs as an example. Except you'll probably want to use
SHA-256 instead of MD5, if only for the sake of hygiene.
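
For reference, the 2^32-block GCM "danger zone" quoted earlier in the
thread can be sanity-checked with a birthday-bound estimate. The sketch
below is illustrative only and is not part of any patch. It assumes
uniformly random 96-bit IVs under a single key (the IV size is an
assumption here; the thread does not state one), and the function names
are hypothetical.

```python
import math


def iv_collision_probability(num_blocks: int, iv_bits: int = 96) -> float:
    """Approximate chance that at least two of num_blocks random IVs collide.

    Uses the standard birthday approximation
        p ~= 1 - exp(-n * (n - 1) / 2^(iv_bits + 1))
    ASSUMPTION: IVs are independent and uniform over 2^iv_bits values.
    """
    exponent = -num_blocks * (num_blocks - 1) / 2.0 ** (iv_bits + 1)
    # expm1 keeps precision when the probability is tiny.
    return -math.expm1(exponent)


def log2_collision_probability(num_blocks: int, iv_bits: int = 96) -> float:
    """Same estimate, expressed as a power of two for easier reading."""
    return math.log2(iv_collision_probability(num_blocks, iv_bits))


if __name__ == "__main__":
    for exp in (24, 32, 33, 40):
        p_log2 = log2_collision_probability(2 ** exp)
        print(f"2^{exp} blocks written: collision probability ~ 2^{p_log2:.1f}")
```

Under these assumptions the estimate stays just below 2^-32 at 2^32
blocks and crosses it once the block count doubles, which is consistent
with the limit cited above from NIST SP 800-38D. Deriving fresh keys
frequently, as described earlier in the thread, keeps the per-key block
count well under that threshold.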