2014-07-23 21:23:23

by Michael Halcrow

Subject: [PATCH 0/5] ext4: RFC: Encryption

This patchset proposes a method for encrypting data in the EXT4 read
and write paths. It's a proof-of-concept/prototype only right
now. Outstanding issues:

* While it seems to work well with complex tasks like a parallel
kernel build, fsx is pretty good at reliably breaking it in its
current form. I think it's trying to decrypt a page of all zeros
when doing an mmap'd write after an falloc. I want to get feedback
on the overall approach before I spend too much time bug-hunting.

* It has not undergone a security audit/review. It isn't IND-CCA2
secure, and that's the eventual goal; getting there requires a way to
store (at least) page-granular integrity metadata.

* Only the file data is encrypted. I'd like to look into also
encrypting the file system metadata with a mount-wide key. That's
for another phase of development.

* The key management isn't fleshed out. I've hacked in some eCryptfs
stuff because it was the fastest way for me to stand up the
prototype with real crypto keys. Use ecryptfs-add-passphrase to add
a key to the keyring, and then pass the hex sig as the
encrypt_key_sig mount option:

# apt-get install ecryptfs-utils
# echo -n "hunter2" | ecryptfs-add-passphrase
Passphrase:
Inserted auth tok with sig [4cb927ea0c564410] into the user session keyring
# mount -o encrypt_key_sig=4cb927ea0c564410 /dev/sdb1 /mnt/ext4crypt

* The EXT4 block size must be the same as the page size. I'm not yet
sure whether I will want to support block-granular or page-granular
encryption. There are implications for how much space the integrity
data occupies relative to the encrypted data.

Mimi, maybe an approach like this one will work out for IMA. We've
just got to figure out where to store the block- or page-granular
integrity data.

I've broken up the patches so that not only can each one build after
application, but discrete steps of functionality can be tested one
patch at a time.

A couple of other thoughts:

* Maybe the write submit path can complete on the encryption
callback. Not sure what that might buy us.

* Maybe a key with a specific descriptor in each user's keyring
(e.g. "EXT4_DEFAULT_KEY") can be used when creating new files so
that each user can use his own key in a common EXT4 mount. Or maybe
we can specify an encryption context in the parent directory xattr.
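
As a rough sketch of the keyring idea, file creation could look up a
per-user default key something like the following. This is
illustrative only; the "EXT4_DEFAULT_KEY" descriptor and the helper
name are placeholders, not part of this patchset:

static struct key *ext4_request_default_key(void)
{
	/* Look up a well-known descriptor in the calling user's
	 * session keyring; returns an error pointer if absent. */
	return request_key(&key_type_user, "EXT4_DEFAULT_KEY", NULL);
}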

Michael Halcrow (5):
ext4: Adds callback support for bio read completion
ext4: Adds EXT4 encryption facilities
ext4: Implements the EXT4 encryption write path
ext4: Adds EXT4 encryption read callback support
ext4: Implements real encryption in the EXT4 write and read paths

fs/buffer.c | 46 +++-
fs/ext4/Makefile | 9 +-
fs/ext4/crypto.c | 629 ++++++++++++++++++++++++++++++++++++++++++++
fs/ext4/ext4.h | 50 ++++
fs/ext4/file.c | 9 +-
fs/ext4/inode.c | 196 +++++++++++++-
fs/ext4/namei.c | 3 +
fs/ext4/page-io.c | 182 ++++++++++---
fs/ext4/super.c | 34 ++-
fs/ext4/xattr.h | 1 +
include/linux/bio.h | 3 +
include/linux/blk_types.h | 4 +
include/linux/buffer_head.h | 8 +
13 files changed, 1118 insertions(+), 56 deletions(-)
create mode 100644 fs/ext4/crypto.c

--
2.0.0.526.g5318336



2014-07-23 21:23:25

by Michael Halcrow

Subject: [PATCH 2/5] ext4: Adds EXT4 encryption facilities

Adds EXT4 encryption facilities.

On encrypt, we re-assign the buffer_heads to point to a bounce page
rather than the control_page (the original page to be written, which
contains the plaintext). The block I/O occurs against the bounce
page. On write completion, we re-assign the buffer_heads back to the
original plaintext page.

On decrypt, we attach a read completion callback to the bio
struct. This callback decrypts the read contents in-place before
setting the page up-to-date.
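
For orientation, the write path (added in patch 3/5 and wired to real
crypto in patch 5/5) uses these facilities roughly as follows. This is
a simplified sketch with error handling omitted, not the exact call
sequence:

	ext4_crypto_ctx_t *ctx =
		ext4_get_crypto_ctx(true, EXT4_I(inode)->i_crypto_key);
	struct page *bounce_page = ext4_encrypt(ctx, plaintext_page);
	/* The buffer_heads now point at bounce_page, and block I/O is
	 * submitted against it. On write completion, set_bh_to_page()
	 * points the buffer_heads back at plaintext_page, and
	 * ext4_release_crypto_ctx(ctx) frees the bounce page and
	 * releases the context. */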

The current encryption mode, AES-256-XTS, represents the first of 5
encryption modes on the roadmap. Future in-plan modes are
HMAC-SHA1+RANDOM_NONCE (integrity only), AES-256-XTS+HMAC-SHA1,
AES-256-XTS+RANDOM_TWEAK+HMAC-SHA1, and AES-256-GCM. These all depend
on a future per-block metadata feature in EXT4.

Signed-off-by: Michael Halcrow <[email protected]>
---
fs/ext4/Makefile | 9 +-
fs/ext4/crypto.c | 624 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
fs/ext4/ext4.h | 49 +++++
fs/ext4/super.c | 34 ++-
fs/ext4/xattr.h | 1 +
5 files changed, 711 insertions(+), 6 deletions(-)
create mode 100644 fs/ext4/crypto.c

diff --git a/fs/ext4/Makefile b/fs/ext4/Makefile
index 0310fec..de4de1c 100644
--- a/fs/ext4/Makefile
+++ b/fs/ext4/Makefile
@@ -4,10 +4,11 @@

obj-$(CONFIG_EXT4_FS) += ext4.o

-ext4-y := balloc.o bitmap.o dir.o file.o fsync.o ialloc.o inode.o page-io.o \
- ioctl.o namei.o super.o symlink.o hash.o resize.o extents.o \
- ext4_jbd2.o migrate.o mballoc.o block_validity.o move_extent.o \
- mmp.o indirect.o extents_status.o xattr.o xattr_user.o \
+ext4-y := balloc.o bitmap.o crypto.o dir.o file.o fsync.o ialloc.o \
+ inode.o page-io.o ioctl.o namei.o super.o symlink.o \
+ hash.o resize.o extents.o ext4_jbd2.o migrate.o \
+ mballoc.o block_validity.o move_extent.o mmp.o \
+ indirect.o extents_status.o xattr.o xattr_user.o \
xattr_trusted.o inline.o

ext4-$(CONFIG_EXT4_FS_POSIX_ACL) += acl.o
diff --git a/fs/ext4/crypto.c b/fs/ext4/crypto.c
new file mode 100644
index 0000000..6fbb4fa
--- /dev/null
+++ b/fs/ext4/crypto.c
@@ -0,0 +1,624 @@
+/*
+ * linux/fs/ext4/crypto.c
+ *
+ * This contains encryption functions for ext4
+ *
+ * Written by Michael Halcrow, 2014.
+ *
+ * This has not yet undergone a rigorous security audit. The usage of
+ * AES-XTS should conform to recommendations in NIST Special
+ * Publication 800-38E under the stated adversarial model.
+ *
+ * This intends to protect only file data content confidentiality
+ * against a single point-in-time permanent offline compromise of
+ * block device. If the adversary can access the changing ciphertext
+ * at various points in time, this is susceptible to attacks.
+ *
+ * The roadmap includes adding support for encryption modes with
+ * integrity in order to achieve IND-CCA2 security.
+ *
+ * The key management is a minimally functional placeholder for a more
+ * sophisticated mechanism down the road.
+ */
+
+#include <keys/user-type.h>
+#include <keys/encrypted-type.h>
+#include <linux/gfp.h>
+#include <linux/kernel.h>
+#include <linux/key.h>
+#include <linux/list.h>
+#include <linux/mempool.h>
+#include <linux/random.h>
+#include <linux/scatterlist.h>
+#include <linux/spinlock_types.h>
+
+#include "ext4.h"
+#include "xattr.h"
+
+/* Encryption added and removed here! (L: */
+
+mempool_t *ext4_bounce_page_pool = NULL;
+
+LIST_HEAD(ext4_free_crypto_ctxs);
+DEFINE_SPINLOCK(ext4_crypto_ctx_lock);
+
+/* TODO(mhalcrow): Remove for release */
+atomic_t ext4_dbg_pages = ATOMIC_INIT(0);
+atomic_t ext4_dbg_ctxs = ATOMIC_INIT(0);
+
+/**
+ * ext4_release_crypto_ctx() - Releases an encryption context
+ * @ctx: The encryption context to release.
+ *
+ * If the encryption context was allocated from the pre-allocated
+ * pool, returns it to that pool. Else, frees it.
+ *
+ * If there's a bounce page in the context, frees that.
+ */
+void ext4_release_crypto_ctx(ext4_crypto_ctx_t *ctx)
+{
+ unsigned long flags;
+ if (ctx->bounce_page) {
+ if (ctx->flags & EXT4_BOUNCE_PAGE_REQUIRES_FREE_ENCRYPT_FL) {
+ __free_page(ctx->bounce_page);
+ atomic_dec(&ext4_dbg_pages);
+ } else {
+ mempool_free(ctx->bounce_page, ext4_bounce_page_pool);
+ }
+ ctx->bounce_page = NULL;
+ }
+ if (ctx->flags & EXT4_CTX_REQUIRES_FREE_ENCRYPT_FL) {
+ if (ctx->tfm)
+ crypto_free_ablkcipher(ctx->tfm);
+ kfree(ctx);
+ atomic_dec(&ext4_dbg_ctxs);
+ } else {
+ spin_lock_irqsave(&ext4_crypto_ctx_lock, flags);
+ list_add(&ctx->free_list, &ext4_free_crypto_ctxs);
+ spin_unlock_irqrestore(&ext4_crypto_ctx_lock, flags);
+ }
+}
+
+/**
+ * __alloc_and_init_crypto_ctx() - Allocates/initializes an encryption context
+ * @mask: The allocation mask.
+ *
+ * Return: An allocated and initialized encryption context on
+ * success. An error value or NULL otherwise.
+ */
+static ext4_crypto_ctx_t *__alloc_and_init_crypto_ctx(u32 mask)
+{
+ ext4_crypto_ctx_t *ctx = kzalloc(sizeof(ext4_crypto_ctx_t), mask);
+ if (!ctx)
+ return ERR_PTR(-ENOMEM);
+ atomic_inc(&ext4_dbg_ctxs);
+ return ctx;
+}
+
+/**
+ * ext4_get_crypto_ctx() - Gets an encryption context
+ * @with_page: If true, allocates and attaches a bounce page
+ * @aes_256_xts_key: The 64-byte encryption key for AES-XTS.
+ *
+ * Allocates and initializes an encryption context.
+ *
+ * Return: An allocated and initialized encryption context on success;
+ * error value or NULL otherwise.
+ */
+ext4_crypto_ctx_t *ext4_get_crypto_ctx(
+ bool with_page, u8 aes_256_xts_key[EXT4_AES_256_XTS_KEY_SIZE])
+{
+ ext4_crypto_ctx_t *ctx = NULL;
+ int res = 0;
+ unsigned long flags;
+
+ /* We first try getting the ctx from a free list because in
+ * the common case the ctx will have an allocated and
+ * initialized crypto ablkcipher, so it's probably a
+ * worthwhile optimization. For the bounce page, we first try
+ * getting it from the kernel allocator because that's just
+ * about as fast as getting it from a list and because a cache
+ * of free pages should generally be a "last resort" option
+ * for a filesystem to be able to do its job. */
+ spin_lock_irqsave(&ext4_crypto_ctx_lock, flags);
+ ctx = list_first_entry_or_null(&ext4_free_crypto_ctxs,
+ ext4_crypto_ctx_t, free_list);
+ if (ctx)
+ list_del(&ctx->free_list);
+ spin_unlock_irqrestore(&ext4_crypto_ctx_lock, flags);
+ if (!ctx) {
+ ctx = __alloc_and_init_crypto_ctx(GFP_NOFS);
+ if (IS_ERR(ctx)) {
+ res = PTR_ERR(ctx);
+ goto out;
+ }
+ ctx->flags |= EXT4_CTX_REQUIRES_FREE_ENCRYPT_FL;
+ } else {
+ ctx->flags &= ~EXT4_CTX_REQUIRES_FREE_ENCRYPT_FL;
+ }
+
+ /* Allocate a new Crypto API context if we don't already have
+ * one. */
+ if (!ctx->tfm) {
+ ctx->tfm = crypto_alloc_ablkcipher("xts(aes)", 0, 0);
+ if (IS_ERR(ctx->tfm)) {
+ res = PTR_ERR(ctx->tfm);
+ ctx->tfm = NULL;
+ goto out;
+ }
+ }
+
+ /* Initialize the encryption engine with the secret symmetric
+ * key. */
+ crypto_ablkcipher_set_flags(ctx->tfm, CRYPTO_TFM_REQ_WEAK_KEY);
+ res = crypto_ablkcipher_setkey(ctx->tfm, aes_256_xts_key,
+ EXT4_AES_256_XTS_KEY_SIZE);
+ if (res)
+ goto out;
+
+ /* There shouldn't be a bounce page attached to the crypto
+ * context at this point. */
+ BUG_ON(ctx->bounce_page);
+ if (!with_page)
+ goto out;
+
+ /* The encryption operation will require a bounce page. */
+ ctx->bounce_page = alloc_page(GFP_NOFS);
+ if (!ctx->bounce_page) {
+ /* This is a potential bottleneck, but at least we'll
+ * have forward progress. */
+ ctx->bounce_page = mempool_alloc(ext4_bounce_page_pool,
+ GFP_NOFS);
+ if (WARN_ON_ONCE(!ctx->bounce_page)) {
+ ctx->bounce_page = mempool_alloc(ext4_bounce_page_pool,
+ GFP_NOFS | __GFP_WAIT);
+ }
+ ctx->flags &= ~EXT4_BOUNCE_PAGE_REQUIRES_FREE_ENCRYPT_FL;
+ } else {
+ atomic_inc(&ext4_dbg_pages);
+ ctx->flags |= EXT4_BOUNCE_PAGE_REQUIRES_FREE_ENCRYPT_FL;
+ }
+out:
+ if (res) {
+ if (!IS_ERR_OR_NULL(ctx))
+ ext4_release_crypto_ctx(ctx);
+ ctx = ERR_PTR(res);
+ }
+ return ctx;
+}
+
+struct workqueue_struct *mpage_read_workqueue;
+
+/**
+ * ext4_delete_crypto_ctxs() - Deletes/frees all encryption contexts
+ */
+static void ext4_delete_crypto_ctxs(void)
+{
+ ext4_crypto_ctx_t *pos, *n;
+ list_for_each_entry_safe(pos, n, &ext4_free_crypto_ctxs, free_list) {
+ if (pos->bounce_page) {
+ if (pos->flags &
+ EXT4_BOUNCE_PAGE_REQUIRES_FREE_ENCRYPT_FL) {
+ __free_page(pos->bounce_page);
+ } else {
+ mempool_free(pos->bounce_page,
+ ext4_bounce_page_pool);
+ }
+ }
+ if (pos->tfm)
+ crypto_free_ablkcipher(pos->tfm);
+ kfree(pos);
+ }
+}
+
+/**
+ * ext4_allocate_crypto_ctxs() - Allocates a pool of encryption contexts
+ * @num_to_allocate: The number of encryption contexts to allocate.
+ *
+ * Return: Zero on success, non-zero otherwise.
+ */
+static int __init ext4_allocate_crypto_ctxs(size_t num_to_allocate)
+{
+ ext4_crypto_ctx_t *ctx = NULL;
+
+ while (num_to_allocate > 0) {
+ ctx = __alloc_and_init_crypto_ctx(GFP_KERNEL);
+ if (IS_ERR(ctx))
+ break;
+ list_add(&ctx->free_list, &ext4_free_crypto_ctxs);
+ num_to_allocate--;
+ }
+ if (IS_ERR(ctx))
+ ext4_delete_crypto_ctxs();
+ return PTR_ERR_OR_ZERO(ctx);
+}
+
+/**
+ * ext4_delete_crypto() - Frees all allocated encryption objects
+ */
+void ext4_delete_crypto(void)
+{
+ ext4_delete_crypto_ctxs();
+ mempool_destroy(ext4_bounce_page_pool);
+ destroy_workqueue(mpage_read_workqueue);
+}
+
+/**
+ * ext4_allocate_crypto() - Allocates encryption objects for later use
+ * @num_crypto_pages: The number of bounce pages to allocate for encryption.
+ * @num_crypto_ctxs: The number of encryption contexts to allocate.
+ *
+ * Return: Zero on success, non-zero otherwise.
+ */
+int __init ext4_allocate_crypto(size_t num_crypto_pages, size_t num_crypto_ctxs)
+{
+ int res = 0;
+ mpage_read_workqueue = alloc_workqueue("ext4_crypto", WQ_HIGHPRI, 0);
+ if (!mpage_read_workqueue) {
+ res = -ENOMEM;
+ goto fail;
+ }
+ res = ext4_allocate_crypto_ctxs(num_crypto_ctxs);
+ if (res)
+ goto fail;
+ ext4_bounce_page_pool = mempool_create_page_pool(num_crypto_pages, 0);
+ if (!ext4_bounce_page_pool)
+ goto fail;
+ return 0;
+fail:
+ ext4_delete_crypto();
+ return res;
+}
+
+/**
+ * ext4_xts_tweak_for_page() - Generates an XTS tweak for a page
+ * @xts_tweak: Buffer into which this writes the XTS tweak.
+ * @page: The page for which this generates a tweak.
+ *
+ * Generates an XTS tweak value for the given page.
+ */
+static void ext4_xts_tweak_for_page(u8 xts_tweak[EXT4_XTS_TWEAK_SIZE],
+ struct page *page)
+{
+ /* Only do this for XTS tweak values. For other modes (CBC,
+ * GCM, etc.), you most likely will need to do something
+ * different. */
+ BUILD_BUG_ON(EXT4_XTS_TWEAK_SIZE < sizeof(page->index));
+ memcpy(xts_tweak, &page->index, sizeof(page->index));
+ memset(&xts_tweak[sizeof(page->index)], 0,
+ EXT4_XTS_TWEAK_SIZE - sizeof(page->index));
+}
+
+/**
+ * set_bh_to_page() - Re-assigns the pages for a set of buffer heads
+ * @head: The head of the buffer list to reassign.
+ * @page: The page to which to re-assign the buffer heads.
+ */
+void set_bh_to_page(struct buffer_head *head, struct page *page)
+{
+ struct buffer_head *bh = head;
+ do {
+ set_bh_page(bh, page, bh_offset(bh));
+ if (PageDirty(page))
+ set_buffer_dirty(bh);
+ if (!bh->b_this_page)
+ bh->b_this_page = head;
+ } while ((bh = bh->b_this_page) != head);
+}
+
+typedef struct ext4_crypt_result {
+ struct completion completion;
+ int res;
+} ext4_crypt_result_t;
+
+static void ext4_crypt_complete(struct crypto_async_request *req, int res)
+{
+ ext4_crypt_result_t *ecr = req->data;
+ if (res == -EINPROGRESS)
+ return;
+ ecr->res = res;
+ complete(&ecr->completion);
+}
+
+/**
+ * ext4_encrypt() - Encrypts a page
+ * @ctx: The encryption context.
+ * @plaintext_page: The page to encrypt. Must be locked.
+ *
+ * Allocates a ciphertext page and encrypts plaintext_page into it
+ * using the ctx encryption context.
+ *
+ * Called on the page write path.
+ *
+ * Return: An allocated page with the encrypted content on
+ * success. Else, an error value or NULL.
+ */
+struct page *ext4_encrypt(ext4_crypto_ctx_t *ctx, struct page *plaintext_page)
+{
+ struct page *ciphertext_page = ctx->bounce_page;
+ u8 xts_tweak[EXT4_XTS_TWEAK_SIZE];
+ struct ablkcipher_request *req = NULL;
+ struct ext4_crypt_result ecr;
+ struct scatterlist dst, src;
+ int res = 0;
+ BUG_ON(!ciphertext_page);
+ req = ablkcipher_request_alloc(ctx->tfm, GFP_NOFS);
+ if (!req) {
+ printk_ratelimited(KERN_ERR
+ "%s: crypto_request_alloc() failed\n",
+ __func__);
+ ciphertext_page = ERR_PTR(-ENOMEM);
+ goto out;
+ }
+ ablkcipher_request_set_callback(req,
+ CRYPTO_TFM_REQ_MAY_BACKLOG | CRYPTO_TFM_REQ_MAY_SLEEP,
+ ext4_crypt_complete, &ecr);
+ ext4_xts_tweak_for_page(xts_tweak, plaintext_page);
+ sg_init_table(&dst, 1);
+ sg_init_table(&src, 1);
+ sg_set_page(&dst, ciphertext_page, PAGE_CACHE_SIZE, 0);
+ sg_set_page(&src, plaintext_page, PAGE_CACHE_SIZE, 0);
+ ablkcipher_request_set_crypt(req, &src, &dst, PAGE_CACHE_SIZE,
+ xts_tweak);
+ res = crypto_ablkcipher_encrypt(req);
+ if (res == -EINPROGRESS || res == -EBUSY) {
+ BUG_ON(req->base.data != &ecr);
+ wait_for_completion(&ecr.completion);
+ res = ecr.res;
+ reinit_completion(&ecr.completion);
+ }
+ ablkcipher_request_free(req);
+ if (res) {
+ printk_ratelimited(KERN_ERR "%s: crypto_ablkcipher_encrypt() "
+ "returned %d\n", __func__, res);
+ ciphertext_page = ERR_PTR(res);
+ goto out;
+ }
+ SetPageDirty(ciphertext_page);
+ SetPagePrivate(ciphertext_page);
+ ctx->control_page = plaintext_page;
+ set_page_private(ciphertext_page, (unsigned long)ctx);
+ set_bh_to_page(page_buffers(plaintext_page), ciphertext_page);
+out:
+ return ciphertext_page;
+}
+
+/**
+ * ext4_decrypt() - Decrypts a page in-place
+ * @ctx: The encryption context.
+ * @page: The page to decrypt. Must be locked.
+ *
+ * Decrypts page in-place using the ctx encryption context.
+ *
+ * Called from the read completion callback.
+ *
+ * Return: Zero on success, non-zero otherwise.
+ */
+int ext4_decrypt(ext4_crypto_ctx_t *ctx, struct page* page)
+{
+ u8 xts_tweak[EXT4_XTS_TWEAK_SIZE];
+ struct ablkcipher_request *req = NULL;
+ struct ext4_crypt_result ecr;
+ struct scatterlist dst, src;
+ int res = 0;
+ BUG_ON(!ctx->tfm);
+ req = ablkcipher_request_alloc(ctx->tfm, GFP_NOFS);
+ if (!req) {
+ res = -ENOMEM;
+ goto out;
+ }
+ ablkcipher_request_set_callback(req,
+ CRYPTO_TFM_REQ_MAY_BACKLOG | CRYPTO_TFM_REQ_MAY_SLEEP,
+ ext4_crypt_complete, &ecr);
+ ext4_xts_tweak_for_page(xts_tweak, page);
+ sg_init_table(&dst, 1);
+ sg_init_table(&src, 1);
+ sg_set_page(&dst, page, PAGE_CACHE_SIZE, 0);
+ sg_set_page(&src, page, PAGE_CACHE_SIZE, 0);
+ ablkcipher_request_set_crypt(req, &src, &dst, PAGE_CACHE_SIZE,
+ xts_tweak);
+ res = crypto_ablkcipher_decrypt(req);
+ if (res == -EINPROGRESS || res == -EBUSY) {
+ BUG_ON(req->base.data != &ecr);
+ wait_for_completion(&ecr.completion);
+ res = ecr.res;
+ reinit_completion(&ecr.completion);
+ }
+ ablkcipher_request_free(req);
+out:
+ if (res)
+ printk_ratelimited(KERN_ERR "%s: res = [%d]\n", __func__, res);
+ return res;
+}
+
+/**
+ * __get_wrapping_key() - Gets the wrapping key from the user session keyring
+ * @wrapping_key: Buffer into which this writes the wrapping key.
+ * @sbi: The EXT4 superblock info struct.
+ *
+ * Return: Zero on success, non-zero otherwise.
+ */
+static int __get_wrapping_key(char wrapping_key[EXT4_DEFAULT_WRAPPING_KEY_SIZE],
+ struct ext4_sb_info *sbi)
+{
+ struct key *create_key;
+ struct encrypted_key_payload *payload;
+ struct ecryptfs_auth_tok *auth_tok;
+ create_key = request_key(&key_type_user, sbi->s_crypto_key_sig, NULL);
+ if (WARN_ON_ONCE(IS_ERR(create_key)))
+ return -ENOENT;
+ payload = (struct encrypted_key_payload *)create_key->payload.data;
+ if (WARN_ON_ONCE(create_key->datalen !=
+ sizeof(struct ecryptfs_auth_tok))) {
+ return -EINVAL;
+ }
+ auth_tok = (struct ecryptfs_auth_tok *)(&(payload)->payload_data);
+ if (WARN_ON_ONCE(!(auth_tok->token.password.flags &
+ ECRYPTFS_SESSION_KEY_ENCRYPTION_KEY_SET))) {
+ return -EINVAL;
+ }
+ memcpy(wrapping_key,
+ auth_tok->token.password.session_key_encryption_key,
+ EXT4_DEFAULT_WRAPPING_KEY_SIZE);
+ return 0;
+}
+
+/**
+ * ext4_unwrap_key() - Unwraps the encryption key for the inode.
+ * @crypto_key: The buffer into which this writes the unwrapped key.
+ * @wrapped_crypto_key: The wrapped encryption key.
+ * @inode: The inode for the encryption key.
+ *
+ * Return: Zero on success, non-zero otherwise.
+ */
+static int ext4_unwrap_key(char *crypto_key, char *wrapped_crypto_key,
+ struct inode *inode)
+{
+ struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
+ struct scatterlist dst, src;
+ struct blkcipher_desc desc = {
+ .flags = CRYPTO_TFM_REQ_MAY_SLEEP
+ };
+ char wrapping_key[EXT4_DEFAULT_WRAPPING_KEY_SIZE];
+ int res = 0;
+ desc.tfm = crypto_alloc_blkcipher("ecb(aes)", 0, CRYPTO_ALG_ASYNC);
+ if (IS_ERR(desc.tfm))
+ return PTR_ERR(desc.tfm);
+ if (!desc.tfm)
+ return -ENOMEM;
+ crypto_blkcipher_set_flags(desc.tfm, CRYPTO_TFM_REQ_WEAK_KEY);
+ res = __get_wrapping_key(wrapping_key, sbi);
+ if (res)
+ goto out;
+ res = crypto_blkcipher_setkey(desc.tfm, wrapping_key,
+ EXT4_DEFAULT_WRAPPING_KEY_SIZE);
+ memset(wrapping_key, 0, EXT4_DEFAULT_WRAPPING_KEY_SIZE);
+ if (res)
+ goto out;
+ sg_init_table(&dst, 1);
+ sg_init_table(&src, 1);
+ sg_set_buf(&dst, crypto_key, EXT4_NOAUTH_DATA_KEY_SIZE);
+ sg_set_buf(&src, wrapped_crypto_key, EXT4_NOAUTH_DATA_KEY_SIZE);
+ res = crypto_blkcipher_decrypt(&desc, &dst, &src,
+ EXT4_NOAUTH_DATA_KEY_SIZE);
+out:
+ crypto_free_blkcipher(desc.tfm);
+ return res;
+}
+
+/**
+ * ext4_wrap_key() - Wraps the encryption key for the inode.
+ * @wrapped_crypto_key: The buffer into which this writes the wrapped key.
+ * @crypto_key: The encryption key.
+ * @inode: The inode for the encryption key.
+ *
+ * Return: Zero on success, non-zero otherwise.
+ */
+static int ext4_wrap_key(char *wrapped_crypto_key, char *crypto_key,
+ struct inode *inode)
+{
+ struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
+ struct scatterlist dst, src;
+ struct blkcipher_desc desc = {
+ .flags = CRYPTO_TFM_REQ_MAY_SLEEP
+ };
+ char wrapping_key[EXT4_DEFAULT_WRAPPING_KEY_SIZE];
+ int res = 0;
+ desc.tfm = crypto_alloc_blkcipher("ecb(aes)", 0, CRYPTO_ALG_ASYNC);
+ if (IS_ERR(desc.tfm))
+ return PTR_ERR(desc.tfm);
+ if (!desc.tfm)
+ return -ENOMEM;
+ crypto_blkcipher_set_flags(desc.tfm, CRYPTO_TFM_REQ_WEAK_KEY);
+ res = __get_wrapping_key(wrapping_key, sbi);
+ if (res)
+ goto out;
+ res = crypto_blkcipher_setkey(desc.tfm, wrapping_key,
+ EXT4_DEFAULT_WRAPPING_KEY_SIZE);
+ memset(wrapping_key, 0, EXT4_DEFAULT_WRAPPING_KEY_SIZE);
+ if (res)
+ goto out;
+ sg_init_table(&dst, 1);
+ sg_init_table(&src, 1);
+ sg_set_buf(&dst, wrapped_crypto_key, EXT4_NOAUTH_DATA_KEY_SIZE);
+ sg_set_buf(&src, crypto_key, EXT4_NOAUTH_DATA_KEY_SIZE);
+ res = crypto_blkcipher_encrypt(&desc, &dst, &src,
+ EXT4_NOAUTH_DATA_KEY_SIZE);
+out:
+ crypto_free_blkcipher(desc.tfm);
+ return res;
+}
+
+/**
+ * ext4_set_crypto_key() - Generates and sets the encryption key for the inode
+ * @inode: The inode for the encryption key.
+ *
+ * Return: Zero on success, non-zero otherwise.
+ */
+int ext4_set_crypto_key(struct inode *inode)
+{
+ /* TODO(mhalcrow): Prerelease protector set. A real in-plan
+ * one should be in what gets merged into mainline. */
+ char protector_set[EXT4_PRERELEASE_PROTECTOR_SET_SIZE];
+ char *wrapped_crypto_key =
+ &protector_set[EXT4_PROTECTOR_SET_VERSION_SIZE];
+ struct ext4_inode_info *ei = EXT4_I(inode);
+ int res = 0;
+
+ get_random_bytes(ei->i_crypto_key, EXT4_NOAUTH_DATA_KEY_SIZE);
+ res = ext4_wrap_key(wrapped_crypto_key, ei->i_crypto_key, inode);
+ if (res)
+ goto out;
+ ei->i_encrypt = true;
+ protector_set[0] = EXT4_PRERELEASE_PROTECTOR_SET_VERSION;
+ res = ext4_xattr_set(inode, EXT4_XATTR_INDEX_CRYPTO_PROTECTORS, "",
+ protector_set, sizeof(protector_set), 0);
+out:
+ if (res)
+ printk_ratelimited(KERN_ERR "%s: res = [%d]\n", __func__, res);
+ return res;
+}
+
+/**
+ * ext4_get_crypto_key() - Gets the encryption key for the inode.
+ * @inode: The inode for the encryption key.
+ *
+ * Return: Zero on success, non-zero otherwise.
+ */
+int ext4_get_crypto_key(struct inode *inode)
+{
+ /* TODO(mhalcrow): Prerelease protector set. A real in-plan
+ * one should be in what gets merged into mainline. */
+ char protector_set[EXT4_PRERELEASE_PROTECTOR_SET_SIZE];
+ char *wrapped_crypto_key =
+ &protector_set[EXT4_PROTECTOR_SET_VERSION_SIZE];
+ struct ext4_inode_info *ei = EXT4_I(inode);
+ int res;
+
+ res = ext4_xattr_get(inode, EXT4_XATTR_INDEX_CRYPTO_PROTECTORS, "",
+ NULL, 0);
+ if (res != sizeof(protector_set)) {
+ res = -ENODATA;
+ goto out;
+ }
+ res = ext4_xattr_get(inode, EXT4_XATTR_INDEX_CRYPTO_PROTECTORS, "",
+ protector_set, res);
+ if (res != sizeof(protector_set)) {
+ res = -EINVAL;
+ goto out;
+ }
+ if (protector_set[0] != EXT4_PRERELEASE_PROTECTOR_SET_VERSION) {
+ printk_ratelimited(KERN_ERR "%s: Expected protector set "
+ "version [%d]; got [%d]\n",
+ __func__,
+ EXT4_PRERELEASE_PROTECTOR_SET_VERSION,
+ protector_set[0]);
+ res = -EINVAL;
+ goto out;
+ }
+ res = ext4_unwrap_key(ei->i_crypto_key, wrapped_crypto_key, inode);
+ if (!res)
+ ei->i_encrypt = true;
+out:
+ return res;
+}
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 321760d..7508261 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -32,6 +32,7 @@
#include <linux/ratelimit.h>
#include <crypto/hash.h>
#include <linux/falloc.h>
+#include <linux/ecryptfs.h>
#ifdef __KERNEL__
#include <linux/compat.h>
#endif
@@ -808,6 +809,19 @@ do { \

#endif /* defined(__KERNEL__) || defined(__linux__) */

+/* Encryption parameters */
+#define EXT4_AES_256_ECB_KEY_SIZE 32
+#define EXT4_DEFAULT_WRAPPING_KEY_SIZE EXT4_AES_256_ECB_KEY_SIZE
+#define EXT4_AES_256_XTS_KEY_SIZE 64
+#define EXT4_XTS_TWEAK_SIZE 16
+#define EXT4_NOAUTH_DATA_KEY_SIZE EXT4_AES_256_XTS_KEY_SIZE
+/* TODO(mhalcrow): The key management code isn't what's in plan at the
+ * moment. */
+#define EXT4_PRERELEASE_PROTECTOR_SET_VERSION (char)0xFF
+#define EXT4_PROTECTOR_SET_VERSION_SIZE 1
+#define EXT4_PRERELEASE_PROTECTOR_SET_SIZE (EXT4_PROTECTOR_SET_VERSION_SIZE + \
+ EXT4_NOAUTH_DATA_KEY_SIZE)
+
#include "extents_status.h"

/*
@@ -942,6 +956,10 @@ struct ext4_inode_info {

/* Precomputed uuid+inum+igen checksum for seeding inode checksums */
__u32 i_csum_seed;
+
+ /* Encryption params */
+ bool i_encrypt;
+ char i_crypto_key[EXT4_NOAUTH_DATA_KEY_SIZE];
};

/*
@@ -1339,6 +1357,10 @@ struct ext4_sb_info {
struct ratelimit_state s_err_ratelimit_state;
struct ratelimit_state s_warning_ratelimit_state;
struct ratelimit_state s_msg_ratelimit_state;
+
+ /* Encryption */
+ bool s_encrypt;
+ char s_crypto_key_sig[ECRYPTFS_SIG_SIZE_HEX + 1];
};

static inline struct ext4_sb_info *EXT4_SB(struct super_block *sb)
@@ -2787,6 +2809,33 @@ static inline void set_bitmap_uptodate(struct buffer_head *bh)
set_bit(BH_BITMAP_UPTODATE, &(bh)->b_state);
}

+/* crypto.c */
+#define EXT4_CTX_REQUIRES_FREE_ENCRYPT_FL 0x00000001
+#define EXT4_BOUNCE_PAGE_REQUIRES_FREE_ENCRYPT_FL 0x00000002
+
+typedef struct ext4_crypto_ctx {
+ struct crypto_ablkcipher *tfm; /* Crypto API context */
+ struct page *bounce_page; /* Ciphertext page on write path */
+ struct page *control_page; /* Original page on write path */
+ struct bio *bio; /* The bio for this context */
+ struct work_struct work; /* Work queue for read complete path */
+ struct list_head free_list; /* Free list */
+ int flags; /* Flags */
+} ext4_crypto_ctx_t;
+extern struct workqueue_struct *mpage_read_workqueue;
+int ext4_allocate_crypto(size_t num_crypto_pages, size_t num_crypto_ctxs);
+void ext4_delete_crypto(void);
+ext4_crypto_ctx_t *ext4_get_crypto_ctx(
+ bool with_page, u8 aes_256_xts_key[EXT4_AES_256_XTS_KEY_SIZE]);
+void ext4_release_crypto_ctx(ext4_crypto_ctx_t *ctx);
+void set_bh_to_page(struct buffer_head *head, struct page *page);
+struct page *ext4_encrypt(ext4_crypto_ctx_t *ctx, struct page* plaintext_page);
+int ext4_decrypt(ext4_crypto_ctx_t *ctx, struct page* page);
+int ext4_get_crypto_key(struct inode *inode);
+int ext4_set_crypto_key(struct inode *inode);
+extern atomic_t ext4_dbg_pages; /* TODO(mhalcrow): Remove for release */
+extern atomic_t ext4_dbg_ctxs; /* TODO(mhalcrow): Remove for release */
+
/*
* Disable DIO read nolock optimization, so new dioreaders will be forced
* to grab i_mutex
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 32b43ad..e818e23 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -904,6 +904,8 @@ static struct inode *ext4_alloc_inode(struct super_block *sb)
atomic_set(&ei->i_ioend_count, 0);
atomic_set(&ei->i_unwritten, 0);
INIT_WORK(&ei->i_rsv_conversion_work, ext4_end_io_rsv_work);
+ ei->i_encrypt = false;
+ memset(ei->i_crypto_key, 0, EXT4_NOAUTH_DATA_KEY_SIZE);

return &ei->vfs_inode;
}
@@ -1168,7 +1170,7 @@ enum {
Opt_inode_readahead_blks, Opt_journal_ioprio,
Opt_dioread_nolock, Opt_dioread_lock,
Opt_discard, Opt_nodiscard, Opt_init_itable, Opt_noinit_itable,
- Opt_max_dir_size_kb,
+ Opt_max_dir_size_kb, Opt_encrypt_key_sig,
};

static const match_table_t tokens = {
@@ -1244,6 +1246,7 @@ static const match_table_t tokens = {
{Opt_init_itable, "init_itable"},
{Opt_noinit_itable, "noinit_itable"},
{Opt_max_dir_size_kb, "max_dir_size_kb=%u"},
+ {Opt_encrypt_key_sig, "encrypt_key_sig=%s"},
{Opt_removed, "check=none"}, /* mount option from ext2/3 */
{Opt_removed, "nocheck"}, /* mount option from ext2/3 */
{Opt_removed, "reservation"}, /* mount option from ext2/3 */
@@ -1442,6 +1445,7 @@ static const struct mount_opts {
{Opt_jqfmt_vfsv0, QFMT_VFS_V0, MOPT_QFMT},
{Opt_jqfmt_vfsv1, QFMT_VFS_V1, MOPT_QFMT},
{Opt_max_dir_size_kb, 0, MOPT_GTE0},
+ {Opt_encrypt_key_sig, 0, MOPT_STRING},
{Opt_err, 0, 0}
};

@@ -1543,6 +1547,23 @@ static int handle_mount_opt(struct super_block *sb, char *opt, int token,
sbi->s_li_wait_mult = arg;
} else if (token == Opt_max_dir_size_kb) {
sbi->s_max_dir_size_kb = arg;
+ } else if (token == Opt_encrypt_key_sig) {
+ char *encrypt_key_sig;
+ encrypt_key_sig = match_strdup(&args[0]);
+ if (!encrypt_key_sig) {
+ ext4_msg(sb, KERN_ERR, "error: could not dup "
+ "encryption key sig string");
+ return -1;
+ }
+ if (strlen(encrypt_key_sig) != ECRYPTFS_SIG_SIZE_HEX) {
+ ext4_msg(sb, KERN_ERR, "error: encryption key sig "
+ "string must be length %d",
+ ECRYPTFS_SIG_SIZE_HEX);
+ return -1;
+ }
+ memcpy(sbi->s_crypto_key_sig, encrypt_key_sig,
+ ECRYPTFS_SIG_SIZE_HEX);
+ sbi->s_encrypt = true;
} else if (token == Opt_stripe) {
sbi->s_stripe = arg;
} else if (token == Opt_resuid) {
@@ -5507,6 +5528,8 @@ struct mutex ext4__aio_mutex[EXT4_WQ_HASH_SZ];
static int __init ext4_init_fs(void)
{
int i, err;
+ static size_t num_prealloc_crypto_pages = 32;
+ static size_t num_prealloc_crypto_ctxs = 128;

ext4_li_info = NULL;
mutex_init(&ext4_li_mtx);
@@ -5519,10 +5542,15 @@ static int __init ext4_init_fs(void)
init_waitqueue_head(&ext4__ioend_wq[i]);
}

- err = ext4_init_es();
+ err = ext4_allocate_crypto(num_prealloc_crypto_pages,
+ num_prealloc_crypto_ctxs);
if (err)
return err;

+ err = ext4_init_es();
+ if (err)
+ goto out8;
+
err = ext4_init_pageio();
if (err)
goto out7;
@@ -5575,6 +5603,8 @@ out6:
ext4_exit_pageio();
out7:
ext4_exit_es();
+out8:
+ ext4_delete_crypto();

return err;
}
diff --git a/fs/ext4/xattr.h b/fs/ext4/xattr.h
index 29bedf5..fcbe815 100644
--- a/fs/ext4/xattr.h
+++ b/fs/ext4/xattr.h
@@ -23,6 +23,7 @@
#define EXT4_XATTR_INDEX_SECURITY 6
#define EXT4_XATTR_INDEX_SYSTEM 7
#define EXT4_XATTR_INDEX_RICHACL 8
+#define EXT4_XATTR_INDEX_CRYPTO_PROTECTORS 9

struct ext4_xattr_header {
__le32 h_magic; /* magic number for identification */
--
2.0.0.526.g5318336


2014-07-23 21:23:26

by Michael Halcrow

Subject: [PATCH 3/5] ext4: Implements the EXT4 encryption write path

Implements the EXT4 encryption write path.

Real encryption is temporarily replaced with a memcpy of the page so
that this patch can be tested independently; patch 5/5 restores the
real encryption.

Signed-off-by: Michael Halcrow <[email protected]>
---
fs/ext4/crypto.c | 24 +++----
fs/ext4/inode.c | 5 +-
fs/ext4/namei.c | 3 +
fs/ext4/page-io.c | 182 +++++++++++++++++++++++++++++++++++++++++++-----------
4 files changed, 164 insertions(+), 50 deletions(-)

diff --git a/fs/ext4/crypto.c b/fs/ext4/crypto.c
index 6fbb4fa..3c9e9f4 100644
--- a/fs/ext4/crypto.c
+++ b/fs/ext4/crypto.c
@@ -360,20 +360,20 @@ struct page *ext4_encrypt(ext4_crypto_ctx_t *ctx, struct page *plaintext_page)
sg_set_page(&src, plaintext_page, PAGE_CACHE_SIZE, 0);
ablkcipher_request_set_crypt(req, &src, &dst, PAGE_CACHE_SIZE,
xts_tweak);
- res = crypto_ablkcipher_encrypt(req);
- if (res == -EINPROGRESS || res == -EBUSY) {
- BUG_ON(req->base.data != &ecr);
- wait_for_completion(&ecr.completion);
- res = ecr.res;
- reinit_completion(&ecr.completion);
- }
ablkcipher_request_free(req);
- if (res) {
- printk_ratelimited(KERN_ERR "%s: crypto_ablkcipher_encrypt() "
- "returned %d\n", __func__, res);
- ciphertext_page = ERR_PTR(res);
- goto out;
+/* =======
+ * TODO(mhalcrow): Removed real crypto so intermediate patch
+ * for write path is still fully functional. */
+ {
+ /* TODO(mhalcrow): Temporary for testing */
+ char *ciphertext_virt, *plaintext_virt;
+ ciphertext_virt = kmap(ciphertext_page);
+ plaintext_virt = kmap(plaintext_page);
+ memcpy(ciphertext_virt, plaintext_virt, PAGE_CACHE_SIZE);
+ kunmap(plaintext_page);
+ kunmap(ciphertext_page);
}
+/* ======= */
SetPageDirty(ciphertext_page);
SetPagePrivate(ciphertext_page);
ctx->control_page = plaintext_page;
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 367a60c..4d37a12 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2314,6 +2314,7 @@ static int ext4_writepages(struct address_space *mapping,
handle_t *handle = NULL;
struct mpage_da_data mpd;
struct inode *inode = mapping->host;
+ struct ext4_inode_info *ei = EXT4_I(inode);
int needed_blocks, rsv_blocks = 0, ret = 0;
struct ext4_sb_info *sbi = EXT4_SB(mapping->host->i_sb);
bool done;
@@ -2330,7 +2331,7 @@ static int ext4_writepages(struct address_space *mapping,
if (!mapping->nrpages || !mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
goto out_writepages;

- if (ext4_should_journal_data(inode)) {
+ if (ext4_should_journal_data(inode) || ei->i_encrypt) {
struct blk_plug plug;

blk_start_plug(&plug);
@@ -2979,6 +2980,7 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
{
struct file *file = iocb->ki_filp;
struct inode *inode = file->f_mapping->host;
+ struct ext4_inode_info *ei = EXT4_I(inode);
ssize_t ret;
size_t count = iov_iter_count(iter);
int overwrite = 0;
@@ -3055,6 +3057,7 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
get_block_func = ext4_get_block_write;
dio_flags = DIO_LOCKING;
}
+ BUG_ON(ei->i_encrypt);
ret = __blockdev_direct_IO(rw, iocb, inode,
inode->i_sb->s_bdev, iter,
offset,
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 3520ab8..de5623a 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -2238,6 +2238,7 @@ static int ext4_create(struct inode *dir, struct dentry *dentry, umode_t mode,
{
handle_t *handle;
struct inode *inode;
+ struct ext4_sb_info *sbi = EXT4_SB(dir->i_sb);
int err, credits, retries = 0;

dquot_initialize(dir);
@@ -2253,6 +2254,8 @@ retry:
inode->i_op = &ext4_file_inode_operations;
inode->i_fop = &ext4_file_operations;
ext4_set_aops(inode);
+ if (sbi->s_encrypt)
+ ext4_set_crypto_key(inode);
err = ext4_add_nondir(handle, dentry, inode);
if (!err && IS_DIRSYNC(dir))
ext4_handle_sync(handle);
diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
index b24a254..47e8e90 100644
--- a/fs/ext4/page-io.c
+++ b/fs/ext4/page-io.c
@@ -61,6 +61,24 @@ static void buffer_io_error(struct buffer_head *bh)
(unsigned long long)bh->b_blocknr);
}

+static void ext4_restore_control_page(struct page *data_page)
+{
+ struct page *control_page = NULL;
+ ext4_crypto_ctx_t *ctx = NULL;
+ BUG_ON(!PagePrivate(data_page));
+ ctx = (ext4_crypto_ctx_t *)page_private(data_page);
+ BUG_ON(!ctx);
+ control_page = ctx->control_page;
+ BUG_ON(!control_page);
+ BUG_ON(!page_buffers(control_page));
+ set_bh_to_page(page_buffers(control_page), control_page);
+ set_page_private(data_page, (unsigned long)NULL);
+ ClearPagePrivate(data_page);
+ BUG_ON(!PageLocked(data_page));
+ unlock_page(data_page);
+ ext4_release_crypto_ctx(ctx);
+}
+
static void ext4_finish_bio(struct bio *bio)
{
int i;
@@ -69,6 +87,8 @@ static void ext4_finish_bio(struct bio *bio)

bio_for_each_segment_all(bvec, bio, i) {
struct page *page = bvec->bv_page;
+ struct page *data_page = NULL;
+ ext4_crypto_ctx_t *ctx = NULL;
struct buffer_head *bh, *head;
unsigned bio_start = bvec->bv_offset;
unsigned bio_end = bio_start + bvec->bv_len;
@@ -78,6 +98,21 @@ static void ext4_finish_bio(struct bio *bio)
if (!page)
continue;

+ if (!page->mapping) {
+ /* The bounce data pages are unmapped. */
+ data_page = page;
+ BUG_ON(!PagePrivate(data_page));
+ ctx = (ext4_crypto_ctx_t *)page_private(data_page);
+ BUG_ON(!ctx);
+ page = ctx->control_page;
+ BUG_ON(!page);
+ } else {
+ /* TODO(mhalcrow): Remove this else{} for release */
+ struct inode *inode = page->mapping->host;
+ struct ext4_inode_info *ei = EXT4_I(inode);
+ BUG_ON(ei->i_encrypt);
+ }
+
if (error) {
SetPageError(page);
set_bit(AS_EIO, &page->mapping->flags);
@@ -102,8 +137,11 @@ static void ext4_finish_bio(struct bio *bio)
} while ((bh = bh->b_this_page) != head);
bit_spin_unlock(BH_Uptodate_Lock, &head->b_state);
local_irq_restore(flags);
- if (!under_io)
+ if (!under_io) {
+ if (ctx)
+ ext4_restore_control_page(data_page);
end_page_writeback(page);
+ }
}
}

@@ -398,40 +436,29 @@ submit_and_retry:
return 0;
}

-int ext4_bio_write_page(struct ext4_io_submit *io,
- struct page *page,
- int len,
- struct writeback_control *wbc,
- bool keep_towrite)
-{
+static void ext4_abort_bio_write(struct page *page,
+ struct writeback_control *wbc) {
+ struct buffer_head *bh, *head;
+ printk(KERN_ERR "%s: called\n", __func__);
+ redirty_page_for_writepage(wbc, page);
+ bh = head = page_buffers(page);
+ do {
+ clear_buffer_async_write(bh);
+ bh = bh->b_this_page;
+ } while (bh != head);
+}
+
+static int ext4_bio_write_buffers(struct ext4_io_submit *io,
+ struct page *page,
+ struct page *data_page,
+ int len,
+ struct writeback_control *wbc) {
struct inode *inode = page->mapping->host;
- unsigned block_start, blocksize;
+ unsigned block_start;
struct buffer_head *bh, *head;
int ret = 0;
int nr_submitted = 0;

- blocksize = 1 << inode->i_blkbits;
-
- BUG_ON(!PageLocked(page));
- BUG_ON(PageWriteback(page));
-
- if (keep_towrite)
- set_page_writeback_keepwrite(page);
- else
- set_page_writeback(page);
- ClearPageError(page);
-
- /*
- * Comments copied from block_write_full_page:
- *
- * The page straddles i_size. It must be zeroed out on each and every
- * writepage invocation because it may be mmapped. "A file is mapped
- * in multiples of the page size. For a file that is not a multiple of
- * the page size, the remaining memory is zeroed when mapped, and
- * writes to that region are not written out to the file."
- */
- if (len < PAGE_CACHE_SIZE)
- zero_user_segment(page, len, PAGE_CACHE_SIZE);
/*
* In the first loop we prepare and mark buffers to submit. We have to
* mark all buffers in the page before submitting so that
@@ -449,7 +476,12 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
}
if (!buffer_dirty(bh) || buffer_delay(bh) ||
!buffer_mapped(bh) || buffer_unwritten(bh)) {
- /* A hole? We can safely clear the dirty bit */
+ /* A hole? We can safely clear the dirty bit,
+ * so long as we're not encrypting */
+ if (data_page) {
+ BUG_ON(!buffer_dirty(bh));
+ BUG_ON(!buffer_mapped(bh));
+ }
if (!buffer_mapped(bh))
clear_buffer_dirty(bh);
if (io->io_bio)
@@ -475,7 +507,6 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
* we can do but mark the page as dirty, and
* better luck next time.
*/
- redirty_page_for_writepage(wbc, page);
break;
}
nr_submitted++;
@@ -484,14 +515,91 @@ int ext4_bio_write_page(struct ext4_io_submit *io,

/* Error stopped previous loop? Clean up buffers... */
if (ret) {
- do {
- clear_buffer_async_write(bh);
- bh = bh->b_this_page;
- } while (bh != head);
+ printk_ratelimited(KERN_ERR "%s: ret = [%d]\n", __func__, ret);
+ ext4_abort_bio_write(page, wbc);
}
unlock_page(page);
/* Nothing submitted - we have to end page writeback */
- if (!nr_submitted)
+ if (!nr_submitted) {
+ if (data_page)
+ ext4_restore_control_page(data_page);
end_page_writeback(page);
+ }
+ return ret;
+}
+
+static int ext4_bio_encrypt_and_write(struct ext4_io_submit *io,
+ struct page *control_page,
+ struct writeback_control *wbc) {
+ struct page *data_page = NULL;
+ ext4_crypto_ctx_t *ctx = NULL;
+ struct inode *inode = control_page->mapping->host;
+ struct ext4_inode_info *ei = EXT4_I(inode);
+ int res = 0;
+ if (!ei->i_encrypt) {
+ res = ext4_set_crypto_key(inode);
+ if (res)
+ goto fail;
+ }
+ BUG_ON(!ei->i_encrypt);
+ ctx = ext4_get_crypto_ctx(true, ei->i_crypto_key);
+ if (IS_ERR(ctx)) {
+ res = PTR_ERR(ctx);
+ goto fail;
+ }
+ data_page = ext4_encrypt(ctx, control_page);
+ if (IS_ERR(data_page)) {
+ res = PTR_ERR(data_page);
+ printk_ratelimited(KERN_ERR "%s: ext4_encrypt() returned "
+ "%d\n", __func__, res);
+ goto free_ctx_and_fail;
+ }
+ BUG_ON(PageLocked(data_page));
+ lock_page(data_page);
+ return ext4_bio_write_buffers(io, control_page, data_page,
+ PAGE_CACHE_SIZE, wbc);
+free_ctx_and_fail:
+ ext4_release_crypto_ctx(ctx);
+fail:
+ ext4_abort_bio_write(control_page, wbc);
+ end_page_writeback(control_page);
+ return res;
+}
+
+int ext4_bio_write_page(struct ext4_io_submit *io,
+ struct page *page,
+ int len,
+ struct writeback_control *wbc,
+ bool keep_towrite)
+{
+ struct ext4_inode_info *ei = EXT4_I(page->mapping->host);
+ int ret = 0;
+
+ BUG_ON(!PageLocked(page));
+ BUG_ON(PageWriteback(page));
+ if (keep_towrite)
+ set_page_writeback_keepwrite(page);
+ else
+ set_page_writeback(page);
+ ClearPageError(page);
+
+ /*
+ * Comments copied from block_write_full_page_endio:
+ *
+ * The page straddles i_size. It must be zeroed out on each and every
+ * writepage invocation because it may be mmapped. "A file is mapped
+ * in multiples of the page size. For a file that is not a multiple of
+ * the page size, the remaining memory is zeroed when mapped, and
+ * writes to that region are not written out to the file."
+ */
+ if (len < PAGE_CACHE_SIZE)
+ zero_user_segment(page, len, PAGE_CACHE_SIZE);
+
+ if (ei->i_encrypt) {
+ ret = ext4_bio_encrypt_and_write(io, page, wbc);
+ } else {
+ ret = ext4_bio_write_buffers(io, page, NULL, len, wbc);
+ }
+ unlock_page(page);
return ret;
}
--
2.0.0.526.g5318336


2014-07-23 21:23:28

by Michael Halcrow

Subject: [PATCH 5/5] ext4: Implements real encryption in the EXT4 write and read paths

Implements real encryption in the EXT4 write and read paths.

Signed-off-by: Michael Halcrow <[email protected]>
---
fs/ext4/crypto.c | 65 +++++++++++++++++++++++---------------------------------
fs/ext4/inode.c | 9 +++++++-
2 files changed, 34 insertions(+), 40 deletions(-)

diff --git a/fs/ext4/crypto.c b/fs/ext4/crypto.c
index 435f33f..a17b23b 100644
--- a/fs/ext4/crypto.c
+++ b/fs/ext4/crypto.c
@@ -353,9 +353,10 @@ struct page *ext4_encrypt(ext4_crypto_ctx_t *ctx, struct page *plaintext_page)
ciphertext_page = ERR_PTR(-ENOMEM);
goto out;
}
- ablkcipher_request_set_callback(req,
- CRYPTO_TFM_REQ_MAY_BACKLOG | CRYPTO_TFM_REQ_MAY_SLEEP,
- ext4_crypt_complete, &ecr);
+ ablkcipher_request_set_callback(
+ req,
+ CRYPTO_TFM_REQ_MAY_BACKLOG | CRYPTO_TFM_REQ_MAY_SLEEP,
+ ext4_crypt_complete, &ecr);
ext4_xts_tweak_for_page(xts_tweak, plaintext_page);
sg_init_table(&dst, 1);
sg_init_table(&src, 1);
@@ -363,20 +364,20 @@ struct page *ext4_encrypt(ext4_crypto_ctx_t *ctx, struct page *plaintext_page)
sg_set_page(&src, plaintext_page, PAGE_CACHE_SIZE, 0);
ablkcipher_request_set_crypt(req, &src, &dst, PAGE_CACHE_SIZE,
xts_tweak);
+ res = crypto_ablkcipher_encrypt(req);
+ if (res == -EINPROGRESS || res == -EBUSY) {
+ BUG_ON(req->base.data != &ecr);
+ wait_for_completion(&ecr.completion);
+ res = ecr.res;
+ reinit_completion(&ecr.completion);
+ }
ablkcipher_request_free(req);
-/* =======
- * TODO(mhalcrow): Removed real crypto so intermediate patch
- * for write path is still fully functional. */
- {
- /* TODO(mhalcrow): Temporary for testing */
- char *ciphertext_virt, *plaintext_virt;
- ciphertext_virt = kmap(ciphertext_page);
- plaintext_virt = kmap(plaintext_page);
- memcpy(ciphertext_virt, plaintext_virt, PAGE_CACHE_SIZE);
- kunmap(plaintext_page);
- kunmap(ciphertext_page);
+ if (res) {
+ printk_ratelimited(KERN_ERR "%s: crypto_ablkcipher_encrypt() "
+ "returned %d\n", __func__, res);
+ ciphertext_page = ERR_PTR(res);
+ goto out;
}
-/* ======= */
SetPageDirty(ciphertext_page);
SetPagePrivate(ciphertext_page);
ctx->control_page = plaintext_page;
@@ -410,9 +411,10 @@ int ext4_decrypt(ext4_crypto_ctx_t *ctx, struct page* page)
res = -ENOMEM;
goto out;
}
- ablkcipher_request_set_callback(req,
- CRYPTO_TFM_REQ_MAY_BACKLOG | CRYPTO_TFM_REQ_MAY_SLEEP,
- ext4_crypt_complete, &ecr);
+ ablkcipher_request_set_callback(
+ req,
+ CRYPTO_TFM_REQ_MAY_BACKLOG | CRYPTO_TFM_REQ_MAY_SLEEP,
+ ext4_crypt_complete, &ecr);
ext4_xts_tweak_for_page(xts_tweak, page);
sg_init_table(&dst, 1);
sg_init_table(&src, 1);
@@ -420,28 +422,13 @@ int ext4_decrypt(ext4_crypto_ctx_t *ctx, struct page* page)
sg_set_page(&src, page, PAGE_CACHE_SIZE, 0);
ablkcipher_request_set_crypt(req, &src, &dst, PAGE_CACHE_SIZE,
xts_tweak);
-/* =======
- * TODO(mhalcrow): Removed real crypto so intermediate patch for read
- * path is still fully functional. For now just doing something that
- * might expose a race condition. */
- {
- char *page_virt;
- char tmp;
- int i;
- page_virt = kmap(page);
- for (i = 0; i < PAGE_CACHE_SIZE / 2; ++i) {
- tmp = page_virt[i];
- page_virt[i] = page_virt[PAGE_CACHE_SIZE - i - 1];
- page_virt[PAGE_CACHE_SIZE - i - 1] = tmp;
- }
- for (i = 0; i < PAGE_CACHE_SIZE / 2; ++i) {
- tmp = page_virt[i];
- page_virt[i] = page_virt[PAGE_CACHE_SIZE - i - 1];
- page_virt[PAGE_CACHE_SIZE - i - 1] = tmp;
- }
- kunmap(page);
+ res = crypto_ablkcipher_decrypt(req);
+ if (res == -EINPROGRESS || res == -EBUSY) {
+ BUG_ON(req->base.data != &ecr);
+ wait_for_completion(&ecr.completion);
+ res = ecr.res;
+ reinit_completion(&ecr.completion);
}
-/* ======= */
ablkcipher_request_free(req);
out:
if (res)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 6bf57d3..a0e80b7 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2848,6 +2848,8 @@ static void ext4_completion_work(struct work_struct *work)
ext4_crypto_ctx_t *ctx = container_of(work, ext4_crypto_ctx_t, work);
struct page *page = ctx->control_page;
WARN_ON_ONCE(ext4_decrypt(ctx, page));
+ atomic_dec(&ctx->dbg_refcnt);
+ BUG_ON(atomic_read(&ctx->dbg_refcnt) != 0);
ext4_release_crypto_ctx(ctx);
SetPageUptodate(page);
unlock_page(page);
@@ -2859,6 +2861,8 @@ static int ext4_complete_cb(struct bio *bio, int res)
struct page *page = ctx->control_page;
BUG_ON(atomic_read(&ctx->dbg_refcnt) != 1);
if (res) {
+ atomic_dec(&ctx->dbg_refcnt);
+ BUG_ON(atomic_read(&ctx->dbg_refcnt) != 0);
ext4_release_crypto_ctx(ctx);
unlock_page(page);
return res;
@@ -2962,8 +2966,11 @@ static int ext4_read_full_page(struct page *page)
BUG_ON(ctx->control_page);
ctx->control_page = page;
BUG_ON(atomic_read(&ctx->dbg_refcnt) != 1);
- if (submit_bh_cb(READ, bh, ext4_complete_cb, ctx))
+ if (submit_bh_cb(READ, bh, ext4_complete_cb, ctx)) {
+ atomic_dec(&ctx->dbg_refcnt);
+ BUG_ON(atomic_read(&ctx->dbg_refcnt) != 0);
ext4_release_crypto_ctx(ctx);
+ }
}
}
return 0;
--
2.0.0.526.g5318336


2014-07-23 21:23:27

by Michael Halcrow

Subject: [PATCH 4/5] ext4: Adds EXT4 encryption read callback support

Adds EXT4 encryption read callback support.

Copies block_read_full_page() to ext4_read_full_page() and adds some
callback stuff near the end. I couldn't think of an elegant way to
modify block_read_full_page() to accept the callback context without
unnecessarily repeatedly allocating it and immediately deallocating it
for sparse pages.
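
For each buffer that actually needs I/O, the tail of
ext4_read_full_page() boils down to roughly the following (simplified
from the diff below, with the debug counters and error handling
trimmed):

	ctx = ext4_get_crypto_ctx(false, ei->i_crypto_key);
	ctx->control_page = page;
	if (submit_bh_cb(READ, bh, ext4_complete_cb, ctx))
		ext4_release_crypto_ctx(ctx);

ext4_complete_cb() then queues ext4_completion_work(), which decrypts
the page in place, marks it up-to-date, and unlocks it.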

Mimi, for IMA, maybe you'll want to do something like what's happening
in ext4_completion_work().

Signed-off-by: Michael Halcrow <[email protected]>
---
fs/ext4/crypto.c | 30 +++++++--
fs/ext4/ext4.h | 1 +
fs/ext4/file.c | 9 ++-
fs/ext4/inode.c | 184 ++++++++++++++++++++++++++++++++++++++++++++++++++--
include/linux/bio.h | 3 +
5 files changed, 215 insertions(+), 12 deletions(-)

diff --git a/fs/ext4/crypto.c b/fs/ext4/crypto.c
index 3c9e9f4..435f33f 100644
--- a/fs/ext4/crypto.c
+++ b/fs/ext4/crypto.c
@@ -58,6 +58,7 @@ atomic_t ext4_dbg_ctxs = ATOMIC_INIT(0);
void ext4_release_crypto_ctx(ext4_crypto_ctx_t *ctx)
{
unsigned long flags;
+ atomic_dec(&ctx->dbg_refcnt);
if (ctx->bounce_page) {
if (ctx->flags & EXT4_BOUNCE_PAGE_REQUIRES_FREE_ENCRYPT_FL) {
__free_page(ctx->bounce_page);
@@ -67,6 +68,7 @@ void ext4_release_crypto_ctx(ext4_crypto_ctx_t *ctx)
}
ctx->bounce_page = NULL;
}
+ ctx->control_page = NULL;
if (ctx->flags & EXT4_CTX_REQUIRES_FREE_ENCRYPT_FL) {
if (ctx->tfm)
crypto_free_ablkcipher(ctx->tfm);
@@ -136,6 +138,7 @@ ext4_crypto_ctx_t *ext4_get_crypto_ctx(
} else {
ctx->flags &= ~EXT4_CTX_REQUIRES_FREE_ENCRYPT_FL;
}
+ atomic_set(&ctx->dbg_refcnt, 0);

/* Allocate a new Crypto API context if we don't already have
* one. */
@@ -417,13 +420,28 @@ int ext4_decrypt(ext4_crypto_ctx_t *ctx, struct page* page)
sg_set_page(&src, page, PAGE_CACHE_SIZE, 0);
ablkcipher_request_set_crypt(req, &src, &dst, PAGE_CACHE_SIZE,
xts_tweak);
- res = crypto_ablkcipher_decrypt(req);
- if (res == -EINPROGRESS || res == -EBUSY) {
- BUG_ON(req->base.data != &ecr);
- wait_for_completion(&ecr.completion);
- res = ecr.res;
- reinit_completion(&ecr.completion);
+/* =======
+ * TODO(mhalcrow): Removed real crypto so intermediate patch for read
+ * path is still fully functional. For now just doing something that
+ * might expose a race condition. */
+ {
+ char *page_virt;
+ char tmp;
+ int i;
+ page_virt = kmap(page);
+ for (i = 0; i < PAGE_CACHE_SIZE / 2; ++i) {
+ tmp = page_virt[i];
+ page_virt[i] = page_virt[PAGE_CACHE_SIZE - i - 1];
+ page_virt[PAGE_CACHE_SIZE - i - 1] = tmp;
+ }
+ for (i = 0; i < PAGE_CACHE_SIZE / 2; ++i) {
+ tmp = page_virt[i];
+ page_virt[i] = page_virt[PAGE_CACHE_SIZE - i - 1];
+ page_virt[PAGE_CACHE_SIZE - i - 1] = tmp;
+ }
+ kunmap(page);
}
+/* ======= */
ablkcipher_request_free(req);
out:
if (res)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 7508261..1118bb0 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2821,6 +2821,7 @@ typedef struct ext4_crypto_ctx {
struct work_struct work; /* Work queue for read complete path */
struct list_head free_list; /* Free list */
int flags; /* Flags */
+ atomic_t dbg_refcnt; /* TODO(mhalcrow): Remove for release */
} ext4_crypto_ctx_t;
extern struct workqueue_struct *mpage_read_workqueue;
int ext4_allocate_crypto(size_t num_crypto_pages, size_t num_crypto_ctxs);
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index aca7b24..9b8478c 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -202,6 +202,7 @@ static int ext4_file_mmap(struct file *file, struct vm_area_struct *vma)
{
file_accessed(file);
vma->vm_ops = &ext4_file_vm_ops;
+ ext4_get_crypto_key(file->f_mapping->host);
return 0;
}

@@ -212,6 +213,7 @@ static int ext4_file_open(struct inode * inode, struct file * filp)
struct vfsmount *mnt = filp->f_path.mnt;
struct path path;
char buf[64], *cp;
+ int ret;

if (unlikely(!(sbi->s_mount_flags & EXT4_MF_MNTDIR_SAMPLED) &&
!(sb->s_flags & MS_RDONLY))) {
@@ -250,11 +252,14 @@ static int ext4_file_open(struct inode * inode, struct file * filp)
* writing and the journal is present
*/
if (filp->f_mode & FMODE_WRITE) {
- int ret = ext4_inode_attach_jinode(inode);
+ ret = ext4_inode_attach_jinode(inode);
if (ret < 0)
return ret;
}
- return dquot_file_open(inode, filp);
+ ret = dquot_file_open(inode, filp);
+ if (!ret)
+ ext4_get_crypto_key(inode);
+ return ret;
}

/*
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 4d37a12..6bf57d3 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -800,6 +800,8 @@ struct buffer_head *ext4_bread(handle_t *handle, struct inode *inode,
ext4_lblk_t block, int create, int *err)
{
struct buffer_head *bh;
+ struct ext4_inode_info *ei = EXT4_I(inode);
+ ext4_crypto_ctx_t *ctx;

bh = ext4_getblk(handle, inode, block, create, err);
if (!bh)
@@ -808,8 +810,16 @@ struct buffer_head *ext4_bread(handle_t *handle, struct inode *inode,
return bh;
ll_rw_block(READ | REQ_META | REQ_PRIO, 1, &bh);
wait_on_buffer(bh);
- if (buffer_uptodate(bh))
+ if (buffer_uptodate(bh)) {
+ if (ei->i_encrypt) {
+ BUG_ON(!bh->b_page);
+ BUG_ON(bh->b_size != PAGE_CACHE_SIZE);
+ ctx = ext4_get_crypto_ctx(false, ei->i_crypto_key);
+ WARN_ON_ONCE(ext4_decrypt(ctx, bh->b_page));
+ ext4_release_crypto_ctx(ctx);
+ }
return bh;
+ }
put_bh(bh);
*err = -EIO;
return NULL;
@@ -2833,20 +2843,151 @@ static sector_t ext4_bmap(struct address_space *mapping, sector_t block)
return generic_block_bmap(mapping, block, ext4_get_block);
}

+static void ext4_completion_work(struct work_struct *work)
+{
+ ext4_crypto_ctx_t *ctx = container_of(work, ext4_crypto_ctx_t, work);
+ struct page *page = ctx->control_page;
+ WARN_ON_ONCE(ext4_decrypt(ctx, page));
+ ext4_release_crypto_ctx(ctx);
+ SetPageUptodate(page);
+ unlock_page(page);
+}
+
+static int ext4_complete_cb(struct bio *bio, int res)
+{
+ ext4_crypto_ctx_t *ctx = bio->bi_cb_ctx;
+ struct page *page = ctx->control_page;
+ BUG_ON(atomic_read(&ctx->dbg_refcnt) != 1);
+ if (res) {
+ ext4_release_crypto_ctx(ctx);
+ unlock_page(page);
+ return res;
+ }
+ INIT_WORK(&ctx->work, ext4_completion_work);
+ queue_work(mpage_read_workqueue, &ctx->work);
+ return 0;
+}
+
+static int ext4_read_full_page(struct page *page)
+{
+ struct inode *inode = page->mapping->host;
+ sector_t iblock, lblock;
+ struct buffer_head *bh, *head, *arr[MAX_BUF_PER_PAGE];
+ unsigned int blocksize, bbits;
+ int nr, i;
+ int fully_mapped = 1;
+
+ head = create_page_buffers(page, inode, 0);
+ blocksize = head->b_size;
+ bbits = ilog2(blocksize);
+
+ iblock = (sector_t)page->index << (PAGE_CACHE_SHIFT - bbits);
+ lblock = (i_size_read(inode)+blocksize-1) >> bbits;
+ bh = head;
+ nr = 0;
+ i = 0;
+
+ do {
+ if (buffer_uptodate(bh))
+ continue;
+
+ if (!buffer_mapped(bh)) {
+ int err = 0;
+
+ fully_mapped = 0;
+ if (iblock < lblock) {
+ WARN_ON(bh->b_size != blocksize);
+ err = ext4_get_block(inode, iblock, bh, 0);
+ if (err)
+ SetPageError(page);
+ }
+ if (!buffer_mapped(bh)) {
+ zero_user(page, i * blocksize, blocksize);
+ if (!err)
+ set_buffer_uptodate(bh);
+ continue;
+ }
+ /*
+ * get_block() might have updated the buffer
+ * synchronously
+ */
+ if (buffer_uptodate(bh))
+ continue;
+ }
+ arr[nr++] = bh;
+ } while (i++, iblock++, (bh = bh->b_this_page) != head);
+
+ if (fully_mapped)
+ SetPageMappedToDisk(page);
+
+ if (!nr) {
+ /*
+ * All buffers are uptodate - we can set the page uptodate
+ * as well. But not if get_block() returned an error.
+ */
+ if (!PageError(page))
+ SetPageUptodate(page);
+ unlock_page(page);
+ return 0;
+ }
+
+ /* TODO(mhalcrow): For the development phase, encryption
+ * requires that the block size be equal to the page size. To
+ * make this the case for release (if we go that route), we'll
+ * need a super.c change to verify. */
+ BUG_ON(nr != 1);
+
+ /* Stage two: lock the buffers */
+ for (i = 0; i < nr; i++) {
+ bh = arr[i];
+ lock_buffer(bh);
+ mark_buffer_async_read(bh);
+ }
+
+ /*
+ * Stage 3: start the IO. Check for uptodateness
+ * inside the buffer lock in case another process reading
+ * the underlying blockdev brought it uptodate (the sct fix).
+ */
+ for (i = 0; i < nr; i++) {
+ bh = arr[i];
+ if (buffer_uptodate(bh))
+ end_buffer_async_read(bh, 1);
+ else {
+ struct ext4_inode_info *ei = EXT4_I(inode);
+ ext4_crypto_ctx_t *ctx = ext4_get_crypto_ctx(
+ false, ei->i_crypto_key);
+ BUG_ON(atomic_read(&ctx->dbg_refcnt) != 0);
+ atomic_inc(&ctx->dbg_refcnt);
+ BUG_ON(ctx->control_page);
+ ctx->control_page = page;
+ BUG_ON(atomic_read(&ctx->dbg_refcnt) != 1);
+ if (submit_bh_cb(READ, bh, ext4_complete_cb, ctx))
+ ext4_release_crypto_ctx(ctx);
+ }
+ }
+ return 0;
+}
+
static int ext4_readpage(struct file *file, struct page *page)
{
int ret = -EAGAIN;
struct inode *inode = page->mapping->host;
+ struct ext4_inode_info *ei = EXT4_I(inode);

trace_ext4_readpage(page);

if (ext4_has_inline_data(inode))
ret = ext4_readpage_inline(inode, page);

- if (ret == -EAGAIN)
+ if (ei->i_encrypt) {
+ BUG_ON(ret != -EAGAIN);
+ ext4_read_full_page(page);
+ } else if (ret == -EAGAIN) {
return mpage_readpage(page, ext4_get_block);
+ }

- return ret;
+ return 0;
}

static int
@@ -2854,12 +2995,35 @@ ext4_readpages(struct file *file, struct address_space *mapping,
struct list_head *pages, unsigned nr_pages)
{
struct inode *inode = mapping->host;
+ struct ext4_inode_info *ei = EXT4_I(inode);
+ struct page *page = NULL;
+ unsigned page_idx;

/* If the file has inline data, no need to do readpages. */
if (ext4_has_inline_data(inode))
return 0;

- return mpage_readpages(mapping, pages, nr_pages, ext4_get_block);
+ if (ei->i_encrypt) {
+ for (page_idx = 0; page_idx < nr_pages; page_idx++) {
+ page = list_entry(pages->prev, struct page, lru);
+ prefetchw(&page->flags);
+ list_del(&page->lru);
+ if (!add_to_page_cache_lru(page, mapping, page->index,
+ GFP_KERNEL)) {
+ if (!PageUptodate(page)) {
+ ext4_read_full_page(page);
+ } else {
+ unlock_page(page);
+ }
+ }
+ page_cache_release(page);
+ }
+ BUG_ON(!list_empty(pages));
+ return 0;
+ } else {
+ return mpage_readpages(mapping, pages, nr_pages,
+ ext4_get_block);
+ }
}

static void ext4_invalidatepage(struct page *page, unsigned int offset,
@@ -3118,9 +3282,13 @@ static ssize_t ext4_direct_IO(int rw, struct kiocb *iocb,
{
struct file *file = iocb->ki_filp;
struct inode *inode = file->f_mapping->host;
+ struct ext4_inode_info *ei = EXT4_I(inode);
size_t count = iov_iter_count(iter);
ssize_t ret;

+ if (ei->i_encrypt)
+ return 0;
+
/*
* If we are doing data journalling we don't support O_DIRECT
*/
@@ -3243,8 +3411,10 @@ static int ext4_block_zero_page_range(handle_t *handle,
unsigned blocksize, max, pos;
ext4_lblk_t iblock;
struct inode *inode = mapping->host;
+ struct ext4_inode_info *ei = EXT4_I(inode);
struct buffer_head *bh;
struct page *page;
+ ext4_crypto_ctx_t *ctx;
int err = 0;

page = find_or_create_page(mapping, from >> PAGE_CACHE_SHIFT,
@@ -3300,6 +3470,12 @@ static int ext4_block_zero_page_range(handle_t *handle,
/* Uhhuh. Read error. Complain and punt. */
if (!buffer_uptodate(bh))
goto unlock;
+ if (ei->i_encrypt) {
+ BUG_ON(blocksize != PAGE_CACHE_SIZE);
+ ctx = ext4_get_crypto_ctx(false, ei->i_crypto_key);
+ WARN_ON_ONCE(ext4_decrypt(ctx, page));
+ ext4_release_crypto_ctx(ctx);
+ }
}
if (ext4_should_journal_data(inode)) {
BUFFER_TRACE(bh, "get write access");
diff --git a/include/linux/bio.h b/include/linux/bio.h
index d2633ee..6ec3bee 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -375,6 +375,9 @@ static inline struct bio *bio_clone_kmalloc(struct bio *bio, gfp_t gfp_mask)

}

+/* TODO(mhalcrow): Only here for test; remove before release */
+extern atomic_t global_bio_count;
+
extern void bio_endio(struct bio *, int);
extern void bio_endio_nodec(struct bio *, int);
struct request_queue;
--
2.0.0.526.g5318336


2014-07-23 21:28:59

by Michael Halcrow

[permalink] [raw]
Subject: [PATCH 1/5] ext4: Adds callback support for bio read completion

Adds callback support for bio read completion. This
supports data transformation such as encryption.
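
For illustration, here is a minimal sketch (not part of this patch) of
how a filesystem might consume the new hook. The names my_ctx,
my_read_cb, my_post_read_work, and my_submit_read are invented for the
example; it only assumes the convention introduced below, where a
callback returning 0 takes ownership of page completion and a non-zero
return falls back to the default end_buffer_async_read() handling.

#include <linux/bio.h>
#include <linux/buffer_head.h>
#include <linux/pagemap.h>
#include <linux/slab.h>
#include <linux/workqueue.h>

/* Hypothetical per-read context; illustrative only. */
struct my_ctx {
	struct work_struct work;
	struct page *page;
};

/* Deferred work: transform the page (e.g. decrypt), then finish it. */
static void my_post_read_work(struct work_struct *work)
{
	struct my_ctx *ctx = container_of(work, struct my_ctx, work);

	/* ... transform ctx->page contents here ... */
	SetPageUptodate(ctx->page);
	unlock_page(ctx->page);
	kfree(ctx);
}

/* Runs from end_buffer_async_read(); return 0 to own page completion,
 * non-zero to let the default completion path run. */
static int my_read_cb(struct bio *bio, int err)
{
	struct my_ctx *ctx = bio->bi_cb_ctx;

	if (err) {
		kfree(ctx);
		return err;
	}
	queue_work(system_wq, &ctx->work);
	return 0;
}

static int my_submit_read(struct buffer_head *bh)
{
	struct my_ctx *ctx = kzalloc(sizeof(*ctx), GFP_NOFS);

	if (!ctx)
		return -ENOMEM;
	ctx->page = bh->b_page;
	INIT_WORK(&ctx->work, my_post_read_work);
	return submit_bh_cb(READ, bh, my_read_cb, ctx);
}

The caller is expected to have locked the page and marked the buffer
async-read before submission, as the ext4 read path does later in this
series.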

Signed-off-by: Michael Halcrow <[email protected]>
---
fs/buffer.c | 46 +++++++++++++++++++++++++++++++++++++++------
include/linux/blk_types.h | 4 ++++
include/linux/buffer_head.h | 8 ++++++++
3 files changed, 52 insertions(+), 6 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index eba6e4f..a5527c5 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -296,7 +296,7 @@ static void free_more_memory(void)
* I/O completion handler for block_read_full_page() - pages
* which come unlocked at the end of I/O.
*/
-static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
+void end_buffer_async_read(struct buffer_head *bh, int uptodate)
{
unsigned long flags;
struct buffer_head *first;
@@ -339,6 +339,13 @@ static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
bit_spin_unlock(BH_Uptodate_Lock, &first->b_state);
local_irq_restore(flags);

+ if (bh->b_private) {
+ struct bio *bio = (struct bio *)bh->b_private;
+ BUG_ON(!bio->bi_cb);
+ if (!bio->bi_cb(bio, !(page_uptodate && !PageError(page))))
+ goto out;
+ }
+
/*
* If none of the buffers had errors and they are all
* uptodate then we can set the page uptodate.
@@ -346,6 +353,7 @@ static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
if (page_uptodate && !PageError(page))
SetPageUptodate(page);
unlock_page(page);
+out:
return;

still_busy:
@@ -353,6 +361,7 @@ still_busy:
local_irq_restore(flags);
return;
}
+EXPORT_SYMBOL_GPL(end_buffer_async_read);

/*
* Completion handler for block_write_full_page() - pages which are unlocked
@@ -431,11 +440,12 @@ EXPORT_SYMBOL(end_buffer_async_write);
* PageLocked prevents anyone from starting writeback of a page which is
* under read I/O (PageWriteback is only ever set against a locked page).
*/
-static void mark_buffer_async_read(struct buffer_head *bh)
+void mark_buffer_async_read(struct buffer_head *bh)
{
bh->b_end_io = end_buffer_async_read;
set_buffer_async_read(bh);
}
+EXPORT_SYMBOL_GPL(mark_buffer_async_read);

static void mark_buffer_async_write_endio(struct buffer_head *bh,
bh_end_io_t *handler)
@@ -1654,14 +1664,17 @@ static inline int block_size_bits(unsigned int blocksize)
return ilog2(blocksize);
}

-static struct buffer_head *create_page_buffers(struct page *page, struct inode *inode, unsigned int b_state)
+struct buffer_head *create_page_buffers(struct page *page, struct inode *inode,
+ unsigned int b_state)
{
BUG_ON(!PageLocked(page));

if (!page_has_buffers(page))
- create_empty_buffers(page, 1 << ACCESS_ONCE(inode->i_blkbits), b_state);
+ create_empty_buffers(page, 1 << ACCESS_ONCE(inode->i_blkbits),
+ b_state);
return page_buffers(page);
}
+EXPORT_SYMBOL_GPL(create_page_buffers);

/*
* NOTE! All mapped/uptodate combinations are valid:
@@ -3009,7 +3022,8 @@ static void guard_bh_eod(int rw, struct bio *bio, struct buffer_head *bh)
}
}

-int _submit_bh(int rw, struct buffer_head *bh, unsigned long bio_flags)
+int _submit_bh_cb(int rw, struct buffer_head *bh, unsigned long bio_flags,
+ bio_completion_cb_t *cb, void *cb_ctx)
{
struct bio *bio;
int ret = 0;
@@ -3043,6 +3057,8 @@ int _submit_bh(int rw, struct buffer_head *bh, unsigned long bio_flags)

bio->bi_end_io = end_bio_bh_io_sync;
bio->bi_private = bh;
+ bio->bi_cb = cb;
+ bio->bi_cb_ctx = cb_ctx;
bio->bi_flags |= bio_flags;

/* Take care of bh's that straddle the end of the device */
@@ -3054,6 +3070,12 @@ int _submit_bh(int rw, struct buffer_head *bh, unsigned long bio_flags)
rw |= REQ_PRIO;

bio_get(bio);
+
+ if (bio->bi_cb) {
+ BUG_ON(bh->b_private);
+ bh->b_private = bio;
+ }
+
submit_bio(rw, bio);

if (bio_flagged(bio, BIO_EOPNOTSUPP))
@@ -3062,14 +3084,26 @@ int _submit_bh(int rw, struct buffer_head *bh, unsigned long bio_flags)
bio_put(bio);
return ret;
}
+
+int _submit_bh(int rw, struct buffer_head *bh, unsigned long bio_flags)
+{
+ return _submit_bh_cb(rw, bh, bio_flags, NULL, NULL);
+}
EXPORT_SYMBOL_GPL(_submit_bh);

int submit_bh(int rw, struct buffer_head *bh)
{
- return _submit_bh(rw, bh, 0);
+ return submit_bh_cb(rw, bh, NULL, NULL);
}
EXPORT_SYMBOL(submit_bh);

+int submit_bh_cb(int rw, struct buffer_head *bh, bio_completion_cb_t *cb,
+ void *cb_ctx)
+{
+ return _submit_bh_cb(rw, bh, 0, cb, cb_ctx);
+}
+EXPORT_SYMBOL_GPL(submit_bh_cb);
+
/**
* ll_rw_block: low-level access to block devices (DEPRECATED)
* @rw: whether to %READ or %WRITE or maybe %READA (readahead)
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 66c2167..06102df 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -16,6 +16,7 @@ struct io_context;
struct cgroup_subsys_state;
typedef void (bio_end_io_t) (struct bio *, int);
typedef void (bio_destructor_t) (struct bio *);
+typedef int (bio_completion_cb_t) (struct bio *, int);

/*
* was unsigned short, but we might as well be ready for > 64kB I/O pages
@@ -96,6 +97,9 @@ struct bio {

struct bio_set *bi_pool;

+ bio_completion_cb_t *bi_cb; /* completion callback */
+ void *bi_cb_ctx; /* callback context */
+
/*
* We can inline a number of vecs at the end of the bio, to avoid
* double allocations for a small number of bio_vecs. This member
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index 324329c..24ea03a 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -160,7 +160,9 @@ void create_empty_buffers(struct page *, unsigned long,
unsigned long b_state);
void end_buffer_read_sync(struct buffer_head *bh, int uptodate);
void end_buffer_write_sync(struct buffer_head *bh, int uptodate);
+void end_buffer_async_read(struct buffer_head *bh, int uptodate);
void end_buffer_async_write(struct buffer_head *bh, int uptodate);
+void mark_buffer_async_read(struct buffer_head *bh);

/* Things to do with buffers at mapping->private_list */
void mark_buffer_dirty_inode(struct buffer_head *bh, struct inode *inode);
@@ -169,6 +171,8 @@ void invalidate_inode_buffers(struct inode *);
int remove_inode_buffers(struct inode *inode);
int sync_mapping_buffers(struct address_space *mapping);
void unmap_underlying_metadata(struct block_device *bdev, sector_t block);
+struct buffer_head *create_page_buffers(struct page *page, struct inode *inode,
+ unsigned int b_state);

void mark_buffer_async_write(struct buffer_head *bh);
void __wait_on_buffer(struct buffer_head *);
@@ -191,7 +195,11 @@ int sync_dirty_buffer(struct buffer_head *bh);
int __sync_dirty_buffer(struct buffer_head *bh, int rw);
void write_dirty_buffer(struct buffer_head *bh, int rw);
int _submit_bh(int rw, struct buffer_head *bh, unsigned long bio_flags);
+int _submit_bh_cb(int rw, struct buffer_head *bh, unsigned long bio_flags,
+ bio_completion_cb_t *cb, void *cb_ctx);
int submit_bh(int, struct buffer_head *);
+int submit_bh_cb(int rw, struct buffer_head *bh, bio_completion_cb_t *cb,
+ void *cb_ctx);
void write_boundary_block(struct block_device *bdev,
sector_t bblock, unsigned blocksize);
int bh_uptodate_or_lock(struct buffer_head *bh);
--
2.0.0.526.g5318336


2014-07-23 22:25:09

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH 0/5] ext4: RFC: Encryption

Hi!
> This patchset proposes a method for encrypting in EXT4 data read and
> write paths. It's a proof-of-concept/prototype only right
> now. Outstanding issues:
>
> * While it seems to work well with complex tasks like a parallel
> kernel build, fsx is pretty good at reliably breaking it in its
> current form. I think it's trying to decrypt a page of all zeros
> when doing a mmap'd write after an falloc. I want to get feedback
> on the overall approach before I spend too much time bug-hunting.
>
> * It has not undergone a security audit/review. It isn't IND-CCA2
> secure, and that's the goal. We need a way to store (at least)
> page-granular metadata.

http://en.wikipedia.org/wiki/Ciphertext_indistinguishability#Indistinguishability_under_chosen_ciphertext_attack.2Fadaptive_chosen_ciphertext_attack_.28IND-CCA1.2C_IND-CCA2.29

So... you are trying to say that if I offer Disney the ability to decrypt
their chosen data, Disney may be able to prove I have their film
encrypted elsewhere on the disk?

Is it supposed to be IND-CPA secure? I.e. can Disney prove I have
their film on my disk if I don't help them? IND-CCA1?

Can I keep just a subtree (/home/pavel/.ssh) encrypted?

Hmm, I might actually want to try this.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2014-07-23 22:34:59

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH 0/5] ext4: RFC: Encryption

On Thu 2014-07-24 00:25:06, Pavel Machek wrote:
> Hi!
> > This patchset proposes a method for encrypting in EXT4 data read and
> > write paths. It's a proof-of-concept/prototype only right
> > now. Outstanding issues:
> >
> > * While it seems to work well with complex tasks like a parallel
> > kernel build, fsx is pretty good at reliably breaking it in its
> > current form. I think it's trying to decrypt a page of all zeros
> > when doing a mmap'd write after an falloc. I want to get feedback
> > on the overall approach before I spend too much time bug-hunting.

> Can I keep just a subtree (/home/pavel/.ssh) encrypted?

Ok, as far as I can tell no, this is whole filesystem encryption for
now. I guess encrypting based on some attribute is planned...?

Best regards,
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2014-07-23 22:39:21

by Michael Halcrow

[permalink] [raw]
Subject: Re: [PATCH 0/5] ext4: RFC: Encryption

On Wed, Jul 23, 2014 at 3:34 PM, Pavel Machek <[email protected]> wrote:
> On Thu 2014-07-24 00:25:06, Pavel Machek wrote:
>> Hi!
>> > This patchset proposes a method for encrypting in EXT4 data read and
>> > write paths. It's a proof-of-concept/prototype only right
>> > now. Outstanding issues:
>> >
>> > * While it seems to work well with complex tasks like a parallel
>> > kernel build, fsx is pretty good at reliably breaking it in its
>> > current form. I think it's trying to decrypt a page of all zeros
>> > when doing a mmap'd write after an falloc. I want to get feedback
>> > on the overall approach before I spend too much time bug-hunting.
>
>> Can I keep just a subtree (/home/pavel/.ssh) encrypted?
>
> Ok, as far as I can tell no, this is whole filesystem encryption for
> now. I guess encrypting based on some attribute is planned...?

Correct; that's TBD as part of the LSS discussion next month. You can see
it wouldn't be that far-fetched to add an xattr to the parent directory that
specifies the key sig to use. It's just that unexpected things can happen
with hard links.
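
To make that concrete, here is a rough userspace sketch of the idea.
The xattr name "user.ext4_crypt_sig" and the inheritance policy are
purely illustrative assumptions; nothing in this patchset implements
them.

/* Hypothetical: print the key signature a directory would hand to new
 * files, were such an xattr-based policy implemented. */
#include <stdio.h>
#include <sys/types.h>
#include <sys/xattr.h>

int main(int argc, char **argv)
{
	char sig[64];
	ssize_t n;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <directory>\n", argv[0]);
		return 1;
	}
	n = getxattr(argv[1], "user.ext4_crypt_sig", sig, sizeof(sig) - 1);
	if (n < 0) {
		perror("getxattr");
		return 1;
	}
	sig[n] = '\0';
	printf("new files under %s would use key sig %s\n", argv[1], sig);
	return 0;
}

The hard-link wrinkle shows up immediately: an inode linked into two
directories carrying different values for such an xattr has no single
obvious answer.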

2014-07-23 22:39:56

by Andreas Dilger

[permalink] [raw]
Subject: Re: [PATCH 0/5] ext4: RFC: Encryption

On Jul 23, 2014, at 3:23 PM, Michael Halcrow <[email protected]> wrote:
> This patchset proposes a method for encrypting in EXT4 data read and
> write paths. It's a proof-of-concept/prototype only right now.

Maybe it is worthwhile to take a step back and explain what your overall
goal is? What is the benefit of implementing crypto at the filesystem
level over at the block device level? Are you targeting per-user crypto
keys? Fast secure deletion of files by having per-inode keys that are
encrypted by the filesystem/user key and then clobbered at deletion time?

What is the threat model? Without knowing that, there isn't any point
in designing or implementing anything.

Hopefully you are already aware of the ext4 metadata checksum feature that
is in newer kernels? That might be useful for storing your strong crypto
integrity hashes for filesystem metadata.

We've also previously discussed storing file-data checksums in some form.
One of the leading candidates being either a per-block table of checksums
that are statically mapped either for every block in the filesystem, or
only to the "data" blocks of the filesystem (i.e. those that don't contain
fixed metadata that already has its own checksums such as inode tables,
bitmaps, and backup group descriptors/superblocks). The other possibility
is storing checksums with each extent, with the option to make the extents
as small or large as needed. See thread starting at:
http://www.spinics.net/lists/linux-ext4/msg42620.html

Once we understand what the actual security goals and threat model are,
then it will be easier to determine the best way to implement this.

Cheers, Andreas


2014-07-23 23:02:00

by Michael Halcrow

[permalink] [raw]
Subject: Re: [PATCH 0/5] ext4: RFC: Encryption

(Reposting in plain text; the previous cut-and-paste resulted in LKML-hostile
message format.)

On Wed, Jul 23, 2014 at 3:39 PM, Andreas Dilger <[email protected]> wrote:
> On Jul 23, 2014, at 3:23 PM, Michael Halcrow <[email protected]> wrote:
>> This patchset proposes a method for encrypting in EXT4 data read and
>> write paths. It's a proof-of-concept/prototype only right now.
>
> Maybe it is worthwhile to take a step back and explain what your overall
> goal is? What is the benefit of implementing crypto at the filesystem
> level over at the block device level? Are you targeting per-user crypto
> keys? Fast secure deletion of files by having per-inode keys that are
> encrypted by the filesystem/user key and then clobbered at deletion time?
>
> What is the threat model? Without knowing that, there isn't any point
> in designing or implementing anything.

My apologies for leaving those details sparse. I have a fairly large
design document that I need to prune for publication, but I can
copy-and-paste the adversarial models section here. The current
patchset targets Phase 1. The primary use case at the moment
is the Chromium OS user cache.

I don't want to get bogged down in the details around the later phases
at the moment though, because there is related work with IMA that
needs to be taken into consideration.

===
Adversarial Models

The EXT4 encryption effort will have multiple phases of development
with various features. The first version will focus exclusively on
attacks against file content (not metadata) confidentiality under a
single point-in-time permanent offline compromise of the block device
content. Later features will add resiliency in the face of an
adversary who is able to manipulate the offline block device content
prior to the authorized user later performing EXT4 file system I/O on
said content.

We are not currently planning on attempting any mitigations against
timing attacks. We recognize that these are important to address, but
we consider that to be primarily a Linux kernel Crypto API
issue. Addressing timing attacks against users of the Crypto API is
out of scope for this document.

Phase 1 Model

With the initial set of features, we will target the narrowly-scoped
threat of a single point-in-time permanent offline compromise of the
block device content, where loss of confidentiality of file metadata,
including the file sizes, names, and permissions, is tolerable.

An example scenario is the Chromium OS user cache, where the file
names can be tokenized at the application layer.

Phase 2 Model

We will additionally protect the confidentiality of file sizes, names,
permissions, etc. in the face of a single point-in-time permanent
offline compromise of the block device content. A mount-wide protector
will be shared among all users of the file system, and so one user on
a system will be able to manipulate the metadata of other users’
files.

This behavior is necessary for the Chromium OS user cache scenario,
where one user must be able to delete cache content for another user.

Phase 3 Model

We will add per-block authentication tags to the data pages to protect
file content integrity. At this point, the threat scope increases to
include occasional temporary offline compromise of the block device
content. "Occasional" means that an observer will be able to read
and/or manipulate the offline ciphertext and/or authentication tags on
the order of dozens of times in the lifetime of the file system. File
metadata will still be subject to undetected corruption, particularly
by other users on the system who gain access to the block device
content. However, some types of metadata manipulation, such as file
size or block mapping corruption, can be detected when validating the
integrity of the file contents.

Phase 4 Model

We will add in-place cryptographic context conversion to facilitate
transparent live encryption and key rotation. This will address the
threat of a key being compromised due to over-exposure (e.g., amount of
data encrypted with the same key, age of key, amount of time the key
has been resident in memory under various run-time circumstances,
etc.).

Phase 5 Model

We will add TPM protectors to require boot sequence integrity to
release the encryption keys. This will address the threat of an
attacker replacing measurable components in the boot sequence up to
and including the kernel.

Phase 6 Model

We will add versioning support to mitigate rollback attacks. This will
address the threat of an attacker snapshotting a previous portion of
the block device content and restoring that portion at a later time.
===

>
> Hopefully you are already aware of the ext4 metadata checksum feature that
> is in newer kernels? That might be useful for storing your strong crypto
> integrity hashes for filesystem metadata.

I'm aware of that work and am evaluating it as a potential vehicle for
storing the metadata.

> We've also previously discussed storing file-data checksums in some form.
> One of the leading candidates being either a per-block table of checksums
> that are statically mapped either for every block in the filesystem, or
> only to the "data" blocks of the filesystem (i.e. those that don't contain
> fixed metadata that already has its own checksums such as inode tables,
> bitmaps, and backup group descriptors/superblocks). The other possibility
> is storing checksums with each extent, with the option to make the extents
> as small or large as needed. See thread starting at:
> http://www.spinics.net/lists/linux-ext4/msg42620.html
>
> Once we understand what the actual security goals and threat model are,
> then it will be easier to determine the best way to implement this.
>
> Cheers, Andreas

2014-07-24 12:27:21

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH 0/5] ext4: RFC: Encryption

On Wed, Jul 23, 2014 at 04:39:43PM -0600, Andreas Dilger wrote:
>
> Maybe it is worthwhile to take a step back and explain what your overall
> goal is? What is the benefit of implementing crypto at the filesystem
> level over at the block device level? Are you targeting per-user crypto
> keys? Fast secure deletion of files by having per-inode keys that are
> encrypted by the filesystem/user key and then clobbered at deletion time?

One particular use case would involve per-user crypto keys, where
space for things like browser cache files can be efficiently shared
across multiple users, and where root or some other privileged user
can selectively delete cache files belonging to another user in order
to free up space, although without access to the keys, root wouldn't
be able to gain access to the data files.

The thinking was that each file would be encrypted using a per-file
key, and then the per-file key could be encrypted by one or more user
keys, and stored in the extended attributes.
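
To visualize that, here is a purely hypothetical sketch of what such an
xattr payload might look like; none of these field names or sizes come
from the posted patches, they are assumptions for illustration only.

#include <stdint.h>
#include <stdio.h>

#define WRAP_KEY_SIG_LEN	8	/* e.g. a keyring signature */
#define WRAPPED_KEY_LEN		64	/* per-file key plus wrapping overhead */

/* One entry per user key that is allowed to unwrap the per-file key. */
struct wrapped_file_key {
	uint8_t wrap_key_sig[WRAP_KEY_SIG_LEN];	/* which user key wrapped it */
	uint8_t wrapped_key[WRAPPED_KEY_LEN];	/* the per-file key, encrypted */
};

/* The xattr value: the same per-file key wrapped once per authorized user. */
struct crypt_xattr {
	uint8_t version;
	uint8_t nr_wrapped_keys;
	struct wrapped_file_key keys[2];	/* example: two users */
};

int main(void)
{
	printf("example xattr payload: %zu bytes\n", sizeof(struct crypt_xattr));
	return 0;
}

Deleting the file or running e2fsck never needs to unwrap anything in a
layout like this, which is the point of the paragraph below.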

(Hence, it's important for this design that the file metadata remains
unencrypted; that way it's possible to delete an encrypted file
without having access to the keys, and so e2fsck can check a file
system without having access to the keys. The flip side to this is
that we are potentially leaking more information due to the metadata
being encrypted. So for many use cases, using a block device dm-crypt
is going to be the better choice. But in this particular case, we do
need some of the attributes of ecryptfs, but we want something which
is more efficient and more stable/less bug-prone that ecryptfs.
Michael acknowledged all of this in his presentation at the ext4
workshop in February, and apologized for inflicting ecryptfs on the
world; you can think of this as atonement, by coming up with something
better as a replacement.) :-)

Anyway, the question of the how to drive the policy of deciding which
key or keys should be used to encrypt the per-file key is something
for which we can add additional flexibility and power later; none of
this is in the patch series.

So the main thing that's up for review is the changes to the read and
write paths. One of the things that I'm especially looking for input
is the changes to fs/buffer.c. We've tried to keep the changes
minimal and general, but something that I've considered is to simply
stop using fs/mpage.c and fs/buffer.c for the read path and instead
extend fs/ext4/page_io.c so it is used for both ext4's read and write
paths.

There are some advantages to doing this, in that it would reduce CPU
overhead on the read path (since right now we end up calling
ext4_get_block, and hence ext4_map_blocks for every single block), and
would allow us to avoid needing to attach buffer heads to pages except
for the data=journal mode. So this is something I'll want to do at
some point.

But, it would be simpler for review purposes to keep the modification
of ext4 to use page_io.c for reads separate from the encryption
changes, and other file systems that are still using fs/buffer.c and
fs/mpage.c might find these changes to be useful. So there are
arguments both ways.

> We've also previously discussed storing file-data checksums in some form.
> One of the leading candidates being either a per-block table of checksums
> that are statically mapped either for every block in the filesystem, or
> only to the "data" blocks of the filesystem (i.e. those that don't contain
> fixed metadata that already has its own checksums such as inode tables,
> bitmaps, and backup group descriptors/superblocks). The other possibility
> is storing checksums with each extent, with the option to make the extents
> as small or large as needed. See thread starting at:
> http://www.spinics.net/lists/linux-ext4/msg42620.html

Yes, this was discussed at the ext4 workshop in February, when Michael
presented his initial plans. The hope was that we could reuse the
file-data checksum infrastructure for the data integrity checksums.
The other potential thought was that some of the infrastructure Mingming
has been thinking about for reflink support could also be used for
both normal checksums and data integrity checksums. We'll have to see
what Lukas and Mingming come up with....

Cheers,

- Ted

2014-08-05 23:06:57

by Mimi Zohar

[permalink] [raw]
Subject: Re: [PATCH 4/5] ext4: Adds EXT4 encryption read callback support

On Wed, 2014-07-23 at 14:23 -0700, Michael Halcrow wrote:
> Adds EXT4 encryption read callback support.
>
> Copies block_read_full_page() to ext4_read_full_page() and adds some
> callback stuff near the end. I couldn't think of an elegant way to
> modify block_read_full_page() to accept the callback context without
> unnecessarily repeatedly allocating it and immediately deallocating it
> for sparse pages.
>
> Mimi, for IMA, maybe you'll want to do something like what's happening
> in ext4_completion_work().
>
> Signed-off-by: Michael Halcrow <[email protected]>
> ---
> fs/ext4/crypto.c | 30 +++++++--
> fs/ext4/ext4.h | 1 +
> fs/ext4/file.c | 9 ++-
> fs/ext4/inode.c | 184 ++++++++++++++++++++++++++++++++++++++++++++++++++--
> include/linux/bio.h | 3 +
> 5 files changed, 215 insertions(+), 12 deletions(-)
>
> diff --git a/fs/ext4/crypto.c b/fs/ext4/crypto.c
> index 3c9e9f4..435f33f 100644
> --- a/fs/ext4/crypto.c
> +++ b/fs/ext4/crypto.c
> @@ -58,6 +58,7 @@ atomic_t ext4_dbg_ctxs = ATOMIC_INIT(0);
> void ext4_release_crypto_ctx(ext4_crypto_ctx_t *ctx)
> {
> unsigned long flags;
> + atomic_dec(&ctx->dbg_refcnt);
> if (ctx->bounce_page) {
> if (ctx->flags & EXT4_BOUNCE_PAGE_REQUIRES_FREE_ENCRYPT_FL) {
> __free_page(ctx->bounce_page);
> @@ -67,6 +68,7 @@ void ext4_release_crypto_ctx(ext4_crypto_ctx_t *ctx)
> }
> ctx->bounce_page = NULL;
> }
> + ctx->control_page = NULL;
> if (ctx->flags & EXT4_CTX_REQUIRES_FREE_ENCRYPT_FL) {
> if (ctx->tfm)
> crypto_free_ablkcipher(ctx->tfm);
> @@ -136,6 +138,7 @@ ext4_crypto_ctx_t *ext4_get_crypto_ctx(
> } else {
> ctx->flags &= ~EXT4_CTX_REQUIRES_FREE_ENCRYPT_FL;
> }
> + atomic_set(&ctx->dbg_refcnt, 0);
>
> /* Allocate a new Crypto API context if we don't already have
> * one. */
> @@ -417,13 +420,28 @@ int ext4_decrypt(ext4_crypto_ctx_t *ctx, struct page* page)
> sg_set_page(&src, page, PAGE_CACHE_SIZE, 0);
> ablkcipher_request_set_crypt(req, &src, &dst, PAGE_CACHE_SIZE,
> xts_tweak);
> - res = crypto_ablkcipher_decrypt(req);
> - if (res == -EINPROGRESS || res == -EBUSY) {
> - BUG_ON(req->base.data != &ecr);
> - wait_for_completion(&ecr.completion);
> - res = ecr.res;
> - reinit_completion(&ecr.completion);
> +/* =======
> + * TODO(mhalcrow): Removed real crypto so intermediate patch for read
> + * path is still fully functional. For now just doing something that
> + * might expose a race condition. */
> + {
> + char *page_virt;
> + char tmp;
> + int i;
> + page_virt = kmap(page);
> + for (i = 0; i < PAGE_CACHE_SIZE / 2; ++i) {
> + tmp = page_virt[i];
> + page_virt[i] = page_virt[PAGE_CACHE_SIZE - i - 1];
> + page_virt[PAGE_CACHE_SIZE - i - 1] = tmp;
> + }
> + for (i = 0; i < PAGE_CACHE_SIZE / 2; ++i) {
> + tmp = page_virt[i];
> + page_virt[i] = page_virt[PAGE_CACHE_SIZE - i - 1];
> + page_virt[PAGE_CACHE_SIZE - i - 1] = tmp;
> + }
> + kunmap(page);
> }
> +/* ======= */
> ablkcipher_request_free(req);
> out:
> if (res)
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 7508261..1118bb0 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -2821,6 +2821,7 @@ typedef struct ext4_crypto_ctx {
> struct work_struct work; /* Work queue for read complete path */
> struct list_head free_list; /* Free list */
> int flags; /* Flags */
> + atomic_t dbg_refcnt; /* TODO(mhalcrow): Remove for release */
> } ext4_crypto_ctx_t;
> extern struct workqueue_struct *mpage_read_workqueue;
> int ext4_allocate_crypto(size_t num_crypto_pages, size_t num_crypto_ctxs);
> diff --git a/fs/ext4/file.c b/fs/ext4/file.c
> index aca7b24..9b8478c 100644
> --- a/fs/ext4/file.c
> +++ b/fs/ext4/file.c
> @@ -202,6 +202,7 @@ static int ext4_file_mmap(struct file *file, struct vm_area_struct *vma)
> {
> file_accessed(file);
> vma->vm_ops = &ext4_file_vm_ops;
> + ext4_get_crypto_key(file->f_mapping->host);
> return 0;
> }
>
> @@ -212,6 +213,7 @@ static int ext4_file_open(struct inode * inode, struct file * filp)
> struct vfsmount *mnt = filp->f_path.mnt;
> struct path path;
> char buf[64], *cp;
> + int ret;
>
> if (unlikely(!(sbi->s_mount_flags & EXT4_MF_MNTDIR_SAMPLED) &&
> !(sb->s_flags & MS_RDONLY))) {
> @@ -250,11 +252,14 @@ static int ext4_file_open(struct inode * inode, struct file * filp)
> * writing and the journal is present
> */
> if (filp->f_mode & FMODE_WRITE) {
> - int ret = ext4_inode_attach_jinode(inode);
> + ret = ext4_inode_attach_jinode(inode);
> if (ret < 0)
> return ret;
> }
> - return dquot_file_open(inode, filp);
> + ret = dquot_file_open(inode, filp);
> + if (!ret)
> + ext4_get_crypto_key(inode);
> + return ret;
> }
>
> /*
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 4d37a12..6bf57d3 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -800,6 +800,8 @@ struct buffer_head *ext4_bread(handle_t *handle, struct inode *inode,
> ext4_lblk_t block, int create, int *err)
> {
> struct buffer_head *bh;
> + struct ext4_inode_info *ei = EXT4_I(inode);
> + ext4_crypto_ctx_t *ctx;
>
> bh = ext4_getblk(handle, inode, block, create, err);
> if (!bh)
> @@ -808,8 +810,16 @@ struct buffer_head *ext4_bread(handle_t *handle, struct inode *inode,
> return bh;
> ll_rw_block(READ | REQ_META | REQ_PRIO, 1, &bh);
> wait_on_buffer(bh);
> - if (buffer_uptodate(bh))
> + if (buffer_uptodate(bh)) {
> + if (ei->i_encrypt) {
> + BUG_ON(!bh->b_page);
> + BUG_ON(bh->b_size != PAGE_CACHE_SIZE);
> + ctx = ext4_get_crypto_ctx(false, ei->i_crypto_key);
> + WARN_ON_ONCE(ext4_decrypt(ctx, bh->b_page));
> + ext4_release_crypto_ctx(ctx);
> + }
> return bh;
> + }
> put_bh(bh);
> *err = -EIO;
> return NULL;
> @@ -2833,20 +2843,151 @@ static sector_t ext4_bmap(struct address_space *mapping, sector_t block)
> return generic_block_bmap(mapping, block, ext4_get_block);
> }
>
> +static void ext4_completion_work(struct work_struct *work)
> +{
> + ext4_crypto_ctx_t *ctx = container_of(work, ext4_crypto_ctx_t, work);
> + struct page *page = ctx->control_page;
> + WARN_ON_ONCE(ext4_decrypt(ctx, page));
> + ext4_release_crypto_ctx(ctx);
> + SetPageUptodate(page);
> + unlock_page(page);
> +}
> +

This completion work is on a per-block/page basis, not a per-file basis. How
is this going to help?

Mimi

> +static int ext4_complete_cb(struct bio *bio, int res)
> +{
> + ext4_crypto_ctx_t *ctx = bio->bi_cb_ctx;
> + struct page *page = ctx->control_page;
> + BUG_ON(atomic_read(&ctx->dbg_refcnt) != 1);
> + if (res) {
> + ext4_release_crypto_ctx(ctx);
> + unlock_page(page);
> + return res;
> + }
> + INIT_WORK(&ctx->work, ext4_completion_work);
> + queue_work(mpage_read_workqueue, &ctx->work);
> + return 0;
> +}
> +
> +static int ext4_read_full_page(struct page *page)
> +{
> + struct inode *inode = page->mapping->host;
> + sector_t iblock, lblock;
> + struct buffer_head *bh, *head, *arr[MAX_BUF_PER_PAGE];
> + unsigned int blocksize, bbits;
> + int nr, i;
> + int fully_mapped = 1;
> +
> + head = create_page_buffers(page, inode, 0);
> + blocksize = head->b_size;
> + bbits = ilog2(blocksize);
> +
> + iblock = (sector_t)page->index << (PAGE_CACHE_SHIFT - bbits);
> + lblock = (i_size_read(inode)+blocksize-1) >> bbits;
> + bh = head;
> + nr = 0;
> + i = 0;
> +
> + do {
> + if (buffer_uptodate(bh))
> + continue;
> +
> + if (!buffer_mapped(bh)) {
> + int err = 0;
> +
> + fully_mapped = 0;
> + if (iblock < lblock) {
> + WARN_ON(bh->b_size != blocksize);
> + err = ext4_get_block(inode, iblock, bh, 0);
> + if (err)
> + SetPageError(page);
> + }
> + if (!buffer_mapped(bh)) {
> + zero_user(page, i * blocksize, blocksize);
> + if (!err)
> + set_buffer_uptodate(bh);
> + continue;
> + }
> + /*
> + * get_block() might have updated the buffer
> + * synchronously
> + */
> + if (buffer_uptodate(bh))
> + continue;
> + }
> + arr[nr++] = bh;
> + } while (i++, iblock++, (bh = bh->b_this_page) != head);
> +
> + if (fully_mapped)
> + SetPageMappedToDisk(page);
> +
> + if (!nr) {
> + /*
> + * All buffers are uptodate - we can set the page uptodate
> + * as well. But not if get_block() returned an error.
> + */
> + if (!PageError(page))
> + SetPageUptodate(page);
> + unlock_page(page);
> + return 0;
> + }
> +
> + /* TODO(mhalcrow): For the development phase, encryption
> + * requires that the block size be equal to the page size. To
> + * make this the case for release (if we go that route), we'll
> + * need a super.c change to verify. */
> + BUG_ON(nr != 1);
> +
> + /* Stage two: lock the buffers */
> + for (i = 0; i < nr; i++) {
> + bh = arr[i];
> + lock_buffer(bh);
> + mark_buffer_async_read(bh);
> + }
> +
> + /*
> + * Stage 3: start the IO. Check for uptodateness
> + * inside the buffer lock in case another process reading
> + * the underlying blockdev brought it uptodate (the sct fix).
> + */
> + for (i = 0; i < nr; i++) {
> + bh = arr[i];
> + if (buffer_uptodate(bh))
> + end_buffer_async_read(bh, 1);
> + else {
> + struct ext4_inode_info *ei = EXT4_I(inode);
> + ext4_crypto_ctx_t *ctx = ext4_get_crypto_ctx(
> + false, ei->i_crypto_key);
> + BUG_ON(atomic_read(&ctx->dbg_refcnt) != 0);
> + atomic_inc(&ctx->dbg_refcnt);
> + BUG_ON(ctx->control_page);
> + ctx->control_page = page;
> + BUG_ON(atomic_read(&ctx->dbg_refcnt) != 1);
> + if (submit_bh_cb(READ, bh, ext4_complete_cb, ctx))
> + ext4_release_crypto_ctx(ctx);
> + }
> + }
> + return 0;
> +}
> +
> static int ext4_readpage(struct file *file, struct page *page)
> {
> int ret = -EAGAIN;
> struct inode *inode = page->mapping->host;
> + struct ext4_inode_info *ei = EXT4_I(inode);
>
> trace_ext4_readpage(page);
>
> if (ext4_has_inline_data(inode))
> ret = ext4_readpage_inline(inode, page);
>
> - if (ret == -EAGAIN)
> + if (ei->i_encrypt) {
> + BUG_ON(ret != -EAGAIN);
> + ext4_read_full_page(page);
> + } else if (ret == -EAGAIN) {
> return mpage_readpage(page, ext4_get_block);
> + }
>
> - return ret;
> + return 0;
> }
>
> static int
> @@ -2854,12 +2995,35 @@ ext4_readpages(struct file *file, struct address_space *mapping,
> struct list_head *pages, unsigned nr_pages)
> {
> struct inode *inode = mapping->host;
> + struct ext4_inode_info *ei = EXT4_I(inode);
> + struct page *page = NULL;
> + unsigned page_idx;
>
> /* If the file has inline data, no need to do readpages. */
> if (ext4_has_inline_data(inode))
> return 0;
>
> - return mpage_readpages(mapping, pages, nr_pages, ext4_get_block);
> + if (ei->i_encrypt) {
> + for (page_idx = 0; page_idx < nr_pages; page_idx++) {
> + page = list_entry(pages->prev, struct page, lru);
> + prefetchw(&page->flags);
> + list_del(&page->lru);
> + if (!add_to_page_cache_lru(page, mapping, page->index,
> + GFP_KERNEL)) {
> + if (!PageUptodate(page)) {
> + ext4_read_full_page(page);
> + } else {
> + unlock_page(page);
> + }
> + }
> + page_cache_release(page);
> + }
> + BUG_ON(!list_empty(pages));
> + return 0;
> + } else {
> + return mpage_readpages(mapping, pages, nr_pages,
> + ext4_get_block);
> + }
> }
>
> static void ext4_invalidatepage(struct page *page, unsigned int offset,
> @@ -3118,9 +3282,13 @@ static ssize_t ext4_direct_IO(int rw, struct kiocb *iocb,
> {
> struct file *file = iocb->ki_filp;
> struct inode *inode = file->f_mapping->host;
> + struct ext4_inode_info *ei = EXT4_I(inode);
> size_t count = iov_iter_count(iter);
> ssize_t ret;
>
> + if (ei->i_encrypt)
> + return 0;
> +
> /*
> * If we are doing data journalling we don't support O_DIRECT
> */
> @@ -3243,8 +3411,10 @@ static int ext4_block_zero_page_range(handle_t *handle,
> unsigned blocksize, max, pos;
> ext4_lblk_t iblock;
> struct inode *inode = mapping->host;
> + struct ext4_inode_info *ei = EXT4_I(inode);
> struct buffer_head *bh;
> struct page *page;
> + ext4_crypto_ctx_t *ctx;
> int err = 0;
>
> page = find_or_create_page(mapping, from >> PAGE_CACHE_SHIFT,
> @@ -3300,6 +3470,12 @@ static int ext4_block_zero_page_range(handle_t *handle,
> /* Uhhuh. Read error. Complain and punt. */
> if (!buffer_uptodate(bh))
> goto unlock;
> + if (ei->i_encrypt) {
> + BUG_ON(blocksize != PAGE_CACHE_SIZE);
> + ctx = ext4_get_crypto_ctx(false, ei->i_crypto_key);
> + WARN_ON_ONCE(ext4_decrypt(ctx, page));
> + ext4_release_crypto_ctx(ctx);
> + }
> }
> if (ext4_should_journal_data(inode)) {
> BUFFER_TRACE(bh, "get write access");
> diff --git a/include/linux/bio.h b/include/linux/bio.h
> index d2633ee..6ec3bee 100644
> --- a/include/linux/bio.h
> +++ b/include/linux/bio.h
> @@ -375,6 +375,9 @@ static inline struct bio *bio_clone_kmalloc(struct bio *bio, gfp_t gfp_mask)
>
> }
>
> +/* TODO(mhalcrow): Only here for test; remove before release */
> +extern atomic_t global_bio_count;
> +
> extern void bio_endio(struct bio *, int);
> extern void bio_endio_nodec(struct bio *, int);
> struct request_queue;