This patchset moves the "FS read I/O callbacks" code into a file of its
own (i.e. fs/read_callbacks.c) and modifies the generic
do_mpage_readpage() to make use of the functionality provided.
"FS read I/O callbacks" code implements the state machine that needs
to be executed after reading data from files that are encrypted and/or
have verity metadata associated with them.
With these changes in place, the patchset changes Ext4 to use
mpage_readpage[s] instead of its own custom ext4_readpage[s]()
functions. This is done to reduce code duplication across
filesystems. Also, the "FS read I/O callbacks" source files will be
built only if at least one of CONFIG_FS_ENCRYPTION and CONFIG_FS_VERITY
is enabled.
The patchset also modifies fs/buffer.c and fscrypt functionality to
get file encryption/decryption to work with subpage-sized blocks.
The following fixes from Eric Biggers are prerequisites for this
patchset:
fscrypt: fix race where ->lookup() marks plaintext dentry as ciphertext
fscrypt: only set dentry_operations on ciphertext dentries
fscrypt: clear DCACHE_ENCRYPTED_NAME when unaliasing directory
fscrypt: fix race allowing rename() and link() of ciphertext dentries
fscrypt: clean up and improve dentry revalidation
The patches can also be obtained from:
"https://github.com/chandanr/linux.git subpage-encryption-v2"
Changelog:
V1 -> V2:
1. Removed the phrase "post_read_process" from file names and
   functions; the phrase "read_callbacks" is now used in its place.
2. When performing the changes associated with (1), the changes made by
   the patch "Remove the term 'bio' from post read processing" were
   folded into the earlier patch "Consolidate 'read callbacks' into a
   new file". Hence the patch "Remove the term 'bio' from post read
   processing" has been dropped from the patchset.
RFC V2 -> V1:
1. Tested and verified the FS_CFLG_OWN_PAGES subset of the
   fscrypt_encrypt_page() code by executing fstests on UBIFS.
2. Implemented an F2FS callback to check if the contents of a page
   holding a verity file's data need to be verified.
RFC V1 -> RFC V2:
1. Describe the purpose of "Post processing code" in the cover letter.
2. Fix build errors when CONFIG_FS_VERITY is enabled.
Chandan Rajendra (13):
ext4: Clear BH_Uptodate flag on decryption error
Consolidate "read callbacks" into a new file
  fsverity: Add callback to decide if verity check has to be performed
  fsverity: Add callback to determine readpage limit
fs/mpage.c: Integrate read callbacks
ext4: Wire up ext4_readpage[s] to use mpage_readpage[s]
Add decryption support for sub-pagesized blocks
ext4: Decrypt all boundary blocks when doing buffered write
ext4: Decrypt the block that needs to be partially zeroed
fscrypt_encrypt_page: Loop across all blocks mapped by a page range
ext4: Compute logical block and the page range to be encrypted
fscrypt_zeroout_range: Encrypt all zeroed out blocks of a page
ext4: Enable encryption for subpage-sized blocks
Documentation/filesystems/fscrypt.rst | 4 +-
fs/Kconfig | 4 +
fs/Makefile | 4 +
fs/buffer.c | 83 +++--
fs/crypto/Kconfig | 1 +
fs/crypto/bio.c | 111 ++++---
fs/crypto/crypto.c | 73 +++--
fs/crypto/fscrypt_private.h | 3 +
fs/ext4/Makefile | 2 +-
fs/ext4/ext4.h | 2 -
fs/ext4/inode.c | 47 ++-
fs/ext4/page-io.c | 9 +-
fs/ext4/readpage.c | 445 --------------------------
fs/ext4/super.c | 39 ++-
fs/f2fs/data.c | 148 ++-------
fs/f2fs/super.c | 15 +-
fs/mpage.c | 51 ++-
fs/read_callbacks.c | 155 +++++++++
fs/verity/Kconfig | 1 +
fs/verity/verify.c | 12 +
include/linux/buffer_head.h | 1 +
include/linux/fscrypt.h | 20 +-
include/linux/fsverity.h | 2 +
include/linux/read_callbacks.h | 22 ++
24 files changed, 522 insertions(+), 732 deletions(-)
delete mode 100644 fs/ext4/readpage.c
create mode 100644 fs/read_callbacks.c
create mode 100644 include/linux/read_callbacks.h
--
2.19.1
On an error return from fscrypt_decrypt_page(), ext4_block_write_begin()
can return with the page's buffer_head marked with the BH_Uptodate
flag. This commit clears the BH_Uptodate flag in such cases.
Signed-off-by: Chandan Rajendra <[email protected]>
---
fs/ext4/inode.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 3c2e7f5a6c84..05b258db8673 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1225,11 +1225,15 @@ static int ext4_block_write_begin(struct page *page, loff_t pos, unsigned len,
if (!buffer_uptodate(*wait_bh))
err = -EIO;
}
- if (unlikely(err))
+ if (unlikely(err)) {
page_zero_new_buffers(page, from, to);
- else if (decrypt)
+ } else if (decrypt) {
err = fscrypt_decrypt_page(page->mapping->host, page,
PAGE_SIZE, 0, page->index);
+ if (err)
+ clear_buffer_uptodate(*wait_bh);
+ }
+
return err;
}
#endif
--
2.19.1
The "read callbacks" code is used by both Ext4 and F2FS. Hence to
remove duplicity, this commit moves the code into
include/linux/read_callbacks.h and fs/read_callbacks.c.
The corresponding decrypt and verity "work" functions have been moved
into the fscrypt and fsverity sources. With these in place, the read
callbacks code now only has to invoke the enqueue functions provided by
fscrypt and fsverity.
Signed-off-by: Chandan Rajendra <[email protected]>
---
fs/Kconfig | 4 +
fs/Makefile | 4 +
fs/crypto/Kconfig | 1 +
fs/crypto/bio.c | 23 ++---
fs/crypto/crypto.c | 17 +--
fs/crypto/fscrypt_private.h | 3 +
fs/ext4/ext4.h | 2 -
fs/ext4/readpage.c | 183 +++++----------------------------
fs/ext4/super.c | 9 +-
fs/f2fs/data.c | 148 ++++----------------------
fs/f2fs/super.c | 9 +-
fs/read_callbacks.c | 136 ++++++++++++++++++++++++
fs/verity/Kconfig | 1 +
fs/verity/verify.c | 12 +++
include/linux/fscrypt.h | 20 +---
include/linux/read_callbacks.h | 21 ++++
16 files changed, 251 insertions(+), 342 deletions(-)
create mode 100644 fs/read_callbacks.c
create mode 100644 include/linux/read_callbacks.h
diff --git a/fs/Kconfig b/fs/Kconfig
index 97f9eb8df713..03084f2dbeaf 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -308,6 +308,10 @@ config NFS_COMMON
depends on NFSD || NFS_FS || LOCKD
default y
+config FS_READ_CALLBACKS
+ bool
+ default n
+
source "net/sunrpc/Kconfig"
source "fs/ceph/Kconfig"
source "fs/cifs/Kconfig"
diff --git a/fs/Makefile b/fs/Makefile
index 9dd2186e74b5..e0c0fce8cf40 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -21,6 +21,10 @@ else
obj-y += no-block.o
endif
+ifeq ($(CONFIG_FS_READ_CALLBACKS),y)
+obj-y += read_callbacks.o
+endif
+
obj-$(CONFIG_PROC_FS) += proc_namespace.o
obj-y += notify/
diff --git a/fs/crypto/Kconfig b/fs/crypto/Kconfig
index f0de238000c0..163c328bcbd4 100644
--- a/fs/crypto/Kconfig
+++ b/fs/crypto/Kconfig
@@ -8,6 +8,7 @@ config FS_ENCRYPTION
select CRYPTO_CTS
select CRYPTO_SHA256
select KEYS
+ select FS_READ_CALLBACKS
help
Enable encryption of files and directories. This
feature is similar to ecryptfs, but it is more memory
diff --git a/fs/crypto/bio.c b/fs/crypto/bio.c
index 5759bcd018cd..27f5618174f2 100644
--- a/fs/crypto/bio.c
+++ b/fs/crypto/bio.c
@@ -24,6 +24,8 @@
#include <linux/module.h>
#include <linux/bio.h>
#include <linux/namei.h>
+#include <linux/read_callbacks.h>
+
#include "fscrypt_private.h"
static void __fscrypt_decrypt_bio(struct bio *bio, bool done)
@@ -54,24 +56,15 @@ void fscrypt_decrypt_bio(struct bio *bio)
}
EXPORT_SYMBOL(fscrypt_decrypt_bio);
-static void completion_pages(struct work_struct *work)
+void fscrypt_decrypt_work(struct work_struct *work)
{
- struct fscrypt_ctx *ctx =
- container_of(work, struct fscrypt_ctx, r.work);
- struct bio *bio = ctx->r.bio;
+ struct read_callbacks_ctx *ctx =
+ container_of(work, struct read_callbacks_ctx, work);
- __fscrypt_decrypt_bio(bio, true);
- fscrypt_release_ctx(ctx);
- bio_put(bio);
-}
+ fscrypt_decrypt_bio(ctx->bio);
-void fscrypt_enqueue_decrypt_bio(struct fscrypt_ctx *ctx, struct bio *bio)
-{
- INIT_WORK(&ctx->r.work, completion_pages);
- ctx->r.bio = bio;
- fscrypt_enqueue_decrypt_work(&ctx->r.work);
+ read_callbacks(ctx);
}
-EXPORT_SYMBOL(fscrypt_enqueue_decrypt_bio);
void fscrypt_pullback_bio_page(struct page **page, bool restore)
{
@@ -87,7 +80,7 @@ void fscrypt_pullback_bio_page(struct page **page, bool restore)
ctx = (struct fscrypt_ctx *)page_private(bounce_page);
/* restore control page */
- *page = ctx->w.control_page;
+ *page = ctx->control_page;
if (restore)
fscrypt_restore_control_page(bounce_page);
diff --git a/fs/crypto/crypto.c b/fs/crypto/crypto.c
index 3fc84bf2b1e5..ffa9302a7351 100644
--- a/fs/crypto/crypto.c
+++ b/fs/crypto/crypto.c
@@ -53,6 +53,7 @@ struct kmem_cache *fscrypt_info_cachep;
void fscrypt_enqueue_decrypt_work(struct work_struct *work)
{
+ INIT_WORK(work, fscrypt_decrypt_work);
queue_work(fscrypt_read_workqueue, work);
}
EXPORT_SYMBOL(fscrypt_enqueue_decrypt_work);
@@ -70,11 +71,11 @@ void fscrypt_release_ctx(struct fscrypt_ctx *ctx)
{
unsigned long flags;
- if (ctx->flags & FS_CTX_HAS_BOUNCE_BUFFER_FL && ctx->w.bounce_page) {
- mempool_free(ctx->w.bounce_page, fscrypt_bounce_page_pool);
- ctx->w.bounce_page = NULL;
+ if (ctx->flags & FS_CTX_HAS_BOUNCE_BUFFER_FL && ctx->bounce_page) {
+ mempool_free(ctx->bounce_page, fscrypt_bounce_page_pool);
+ ctx->bounce_page = NULL;
}
- ctx->w.control_page = NULL;
+ ctx->control_page = NULL;
if (ctx->flags & FS_CTX_REQUIRES_FREE_ENCRYPT_FL) {
kmem_cache_free(fscrypt_ctx_cachep, ctx);
} else {
@@ -194,11 +195,11 @@ int fscrypt_do_page_crypto(const struct inode *inode, fscrypt_direction_t rw,
struct page *fscrypt_alloc_bounce_page(struct fscrypt_ctx *ctx,
gfp_t gfp_flags)
{
- ctx->w.bounce_page = mempool_alloc(fscrypt_bounce_page_pool, gfp_flags);
- if (ctx->w.bounce_page == NULL)
+ ctx->bounce_page = mempool_alloc(fscrypt_bounce_page_pool, gfp_flags);
+ if (ctx->bounce_page == NULL)
return ERR_PTR(-ENOMEM);
ctx->flags |= FS_CTX_HAS_BOUNCE_BUFFER_FL;
- return ctx->w.bounce_page;
+ return ctx->bounce_page;
}
/**
@@ -267,7 +268,7 @@ struct page *fscrypt_encrypt_page(const struct inode *inode,
if (IS_ERR(ciphertext_page))
goto errout;
- ctx->w.control_page = page;
+ ctx->control_page = page;
err = fscrypt_do_page_crypto(inode, FS_ENCRYPT, lblk_num,
page, ciphertext_page, len, offs,
gfp_flags);
diff --git a/fs/crypto/fscrypt_private.h b/fs/crypto/fscrypt_private.h
index 7da276159593..412a3bcf9efd 100644
--- a/fs/crypto/fscrypt_private.h
+++ b/fs/crypto/fscrypt_private.h
@@ -114,6 +114,9 @@ static inline bool fscrypt_valid_enc_modes(u32 contents_mode,
return false;
}
+/* bio.c */
+void fscrypt_decrypt_work(struct work_struct *work);
+
/* crypto.c */
extern struct kmem_cache *fscrypt_info_cachep;
extern int fscrypt_initialize(unsigned int cop_flags);
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index f2b0e628ff7b..23f8568c9b53 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3127,8 +3127,6 @@ static inline void ext4_set_de_type(struct super_block *sb,
extern int ext4_mpage_readpages(struct address_space *mapping,
struct list_head *pages, struct page *page,
unsigned nr_pages, bool is_readahead);
-extern int __init ext4_init_post_read_processing(void);
-extern void ext4_exit_post_read_processing(void);
/* symlink.c */
extern const struct inode_operations ext4_encrypted_symlink_inode_operations;
diff --git a/fs/ext4/readpage.c b/fs/ext4/readpage.c
index 0169e3809da3..e363dededc21 100644
--- a/fs/ext4/readpage.c
+++ b/fs/ext4/readpage.c
@@ -44,14 +44,10 @@
#include <linux/backing-dev.h>
#include <linux/pagevec.h>
#include <linux/cleancache.h>
+#include <linux/read_callbacks.h>
#include "ext4.h"
-#define NUM_PREALLOC_POST_READ_CTXS 128
-
-static struct kmem_cache *bio_post_read_ctx_cache;
-static mempool_t *bio_post_read_ctx_pool;
-
static inline bool ext4_bio_encrypted(struct bio *bio)
{
#ifdef CONFIG_FS_ENCRYPTION
@@ -61,125 +57,6 @@ static inline bool ext4_bio_encrypted(struct bio *bio)
#endif
}
-/* postprocessing steps for read bios */
-enum bio_post_read_step {
- STEP_INITIAL = 0,
- STEP_DECRYPT,
- STEP_VERITY,
-};
-
-struct bio_post_read_ctx {
- struct bio *bio;
- struct work_struct work;
- unsigned int cur_step;
- unsigned int enabled_steps;
-};
-
-static void __read_end_io(struct bio *bio)
-{
- struct page *page;
- struct bio_vec *bv;
- int i;
- struct bvec_iter_all iter_all;
-
- bio_for_each_segment_all(bv, bio, i, iter_all) {
- page = bv->bv_page;
-
- /* PG_error was set if any post_read step failed */
- if (bio->bi_status || PageError(page)) {
- ClearPageUptodate(page);
- SetPageError(page);
- } else {
- SetPageUptodate(page);
- }
- unlock_page(page);
- }
- if (bio->bi_private)
- mempool_free(bio->bi_private, bio_post_read_ctx_pool);
- bio_put(bio);
-}
-
-static void bio_post_read_processing(struct bio_post_read_ctx *ctx);
-
-static void decrypt_work(struct work_struct *work)
-{
- struct bio_post_read_ctx *ctx =
- container_of(work, struct bio_post_read_ctx, work);
-
- fscrypt_decrypt_bio(ctx->bio);
-
- bio_post_read_processing(ctx);
-}
-
-static void verity_work(struct work_struct *work)
-{
- struct bio_post_read_ctx *ctx =
- container_of(work, struct bio_post_read_ctx, work);
-
- fsverity_verify_bio(ctx->bio);
-
- bio_post_read_processing(ctx);
-}
-
-static void bio_post_read_processing(struct bio_post_read_ctx *ctx)
-{
- /*
- * We use different work queues for decryption and for verity because
- * verity may require reading metadata pages that need decryption, and
- * we shouldn't recurse to the same workqueue.
- */
- switch (++ctx->cur_step) {
- case STEP_DECRYPT:
- if (ctx->enabled_steps & (1 << STEP_DECRYPT)) {
- INIT_WORK(&ctx->work, decrypt_work);
- fscrypt_enqueue_decrypt_work(&ctx->work);
- return;
- }
- ctx->cur_step++;
- /* fall-through */
- case STEP_VERITY:
- if (ctx->enabled_steps & (1 << STEP_VERITY)) {
- INIT_WORK(&ctx->work, verity_work);
- fsverity_enqueue_verify_work(&ctx->work);
- return;
- }
- ctx->cur_step++;
- /* fall-through */
- default:
- __read_end_io(ctx->bio);
- }
-}
-
-static struct bio_post_read_ctx *get_bio_post_read_ctx(struct inode *inode,
- struct bio *bio,
- pgoff_t index)
-{
- unsigned int post_read_steps = 0;
- struct bio_post_read_ctx *ctx = NULL;
-
- if (IS_ENCRYPTED(inode) && S_ISREG(inode->i_mode))
- post_read_steps |= 1 << STEP_DECRYPT;
-#ifdef CONFIG_FS_VERITY
- if (inode->i_verity_info != NULL &&
- (index < ((i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT)))
- post_read_steps |= 1 << STEP_VERITY;
-#endif
- if (post_read_steps) {
- ctx = mempool_alloc(bio_post_read_ctx_pool, GFP_NOFS);
- if (!ctx)
- return ERR_PTR(-ENOMEM);
- ctx->bio = bio;
- ctx->enabled_steps = post_read_steps;
- bio->bi_private = ctx;
- }
- return ctx;
-}
-
-static bool bio_post_read_required(struct bio *bio)
-{
- return bio->bi_private && !bio->bi_status;
-}
-
/*
* I/O completion handler for multipage BIOs.
*
@@ -194,14 +71,30 @@ static bool bio_post_read_required(struct bio *bio)
*/
static void mpage_end_io(struct bio *bio)
{
- if (bio_post_read_required(bio)) {
- struct bio_post_read_ctx *ctx = bio->bi_private;
+ struct bio_vec *bv;
+ int i;
+ struct bvec_iter_all iter_all;
+#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
+ if (read_callbacks_required(bio)) {
+ struct read_callbacks_ctx *ctx = bio->bi_private;
- ctx->cur_step = STEP_INITIAL;
- bio_post_read_processing(ctx);
+ read_callbacks(ctx);
return;
}
- __read_end_io(bio);
+#endif
+ bio_for_each_segment_all(bv, bio, i, iter_all) {
+ struct page *page = bv->bv_page;
+
+ if (!bio->bi_status) {
+ SetPageUptodate(page);
+ } else {
+ ClearPageUptodate(page);
+ SetPageError(page);
+ }
+ unlock_page(page);
+ }
+
+ bio_put(bio);
}
static inline loff_t ext4_readpage_limit(struct inode *inode)
@@ -368,17 +261,19 @@ int ext4_mpage_readpages(struct address_space *mapping,
bio = NULL;
}
if (bio == NULL) {
- struct bio_post_read_ctx *ctx;
+ struct read_callbacks_ctx *ctx = NULL;
bio = bio_alloc(GFP_KERNEL,
min_t(int, nr_pages, BIO_MAX_PAGES));
if (!bio)
goto set_error_page;
- ctx = get_bio_post_read_ctx(inode, bio, page->index);
+#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
+ ctx = get_read_callbacks_ctx(inode, bio, page->index);
if (IS_ERR(ctx)) {
bio_put(bio);
goto set_error_page;
}
+#endif
bio_set_dev(bio, bdev);
bio->bi_iter.bi_sector = blocks[0] << (blkbits - 9);
bio->bi_end_io = mpage_end_io;
@@ -417,29 +312,3 @@ int ext4_mpage_readpages(struct address_space *mapping,
submit_bio(bio);
return 0;
}
-
-int __init ext4_init_post_read_processing(void)
-{
- bio_post_read_ctx_cache =
- kmem_cache_create("ext4_bio_post_read_ctx",
- sizeof(struct bio_post_read_ctx), 0, 0, NULL);
- if (!bio_post_read_ctx_cache)
- goto fail;
- bio_post_read_ctx_pool =
- mempool_create_slab_pool(NUM_PREALLOC_POST_READ_CTXS,
- bio_post_read_ctx_cache);
- if (!bio_post_read_ctx_pool)
- goto fail_free_cache;
- return 0;
-
-fail_free_cache:
- kmem_cache_destroy(bio_post_read_ctx_cache);
-fail:
- return -ENOMEM;
-}
-
-void ext4_exit_post_read_processing(void)
-{
- mempool_destroy(bio_post_read_ctx_pool);
- kmem_cache_destroy(bio_post_read_ctx_cache);
-}
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 4ae6f5849caa..aba724f82cc3 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -6101,10 +6101,6 @@ static int __init ext4_init_fs(void)
return err;
err = ext4_init_pending();
- if (err)
- goto out7;
-
- err = ext4_init_post_read_processing();
if (err)
goto out6;
@@ -6146,10 +6142,8 @@ static int __init ext4_init_fs(void)
out4:
ext4_exit_pageio();
out5:
- ext4_exit_post_read_processing();
-out6:
ext4_exit_pending();
-out7:
+out6:
ext4_exit_es();
return err;
@@ -6166,7 +6160,6 @@ static void __exit ext4_exit_fs(void)
ext4_exit_sysfs();
ext4_exit_system_zone();
ext4_exit_pageio();
- ext4_exit_post_read_processing();
ext4_exit_es();
ext4_exit_pending();
}
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 038b958d0fa9..05430d3650ab 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -18,6 +18,7 @@
#include <linux/uio.h>
#include <linux/cleancache.h>
#include <linux/sched/signal.h>
+#include <linux/read_callbacks.h>
#include "f2fs.h"
#include "node.h"
@@ -25,11 +26,6 @@
#include "trace.h"
#include <trace/events/f2fs.h>
-#define NUM_PREALLOC_POST_READ_CTXS 128
-
-static struct kmem_cache *bio_post_read_ctx_cache;
-static mempool_t *bio_post_read_ctx_pool;
-
static bool __is_cp_guaranteed(struct page *page)
{
struct address_space *mapping = page->mapping;
@@ -69,20 +65,6 @@ static enum count_type __read_io_type(struct page *page)
return F2FS_RD_DATA;
}
-/* postprocessing steps for read bios */
-enum bio_post_read_step {
- STEP_INITIAL = 0,
- STEP_DECRYPT,
- STEP_VERITY,
-};
-
-struct bio_post_read_ctx {
- struct bio *bio;
- struct work_struct work;
- unsigned int cur_step;
- unsigned int enabled_steps;
-};
-
static void __read_end_io(struct bio *bio)
{
struct page *page;
@@ -104,65 +86,16 @@ static void __read_end_io(struct bio *bio)
dec_page_count(F2FS_P_SB(page), __read_io_type(page));
unlock_page(page);
}
- if (bio->bi_private)
- mempool_free(bio->bi_private, bio_post_read_ctx_pool);
- bio_put(bio);
-}
-
-static void bio_post_read_processing(struct bio_post_read_ctx *ctx);
-static void decrypt_work(struct work_struct *work)
-{
- struct bio_post_read_ctx *ctx =
- container_of(work, struct bio_post_read_ctx, work);
-
- fscrypt_decrypt_bio(ctx->bio);
+#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
+ if (bio->bi_private) {
+ struct read_callbacks_ctx *ctx;
- bio_post_read_processing(ctx);
-}
-
-static void verity_work(struct work_struct *work)
-{
- struct bio_post_read_ctx *ctx =
- container_of(work, struct bio_post_read_ctx, work);
-
- fsverity_verify_bio(ctx->bio);
-
- bio_post_read_processing(ctx);
-}
-
-static void bio_post_read_processing(struct bio_post_read_ctx *ctx)
-{
- /*
- * We use different work queues for decryption and for verity because
- * verity may require reading metadata pages that need decryption, and
- * we shouldn't recurse to the same workqueue.
- */
- switch (++ctx->cur_step) {
- case STEP_DECRYPT:
- if (ctx->enabled_steps & (1 << STEP_DECRYPT)) {
- INIT_WORK(&ctx->work, decrypt_work);
- fscrypt_enqueue_decrypt_work(&ctx->work);
- return;
- }
- ctx->cur_step++;
- /* fall-through */
- case STEP_VERITY:
- if (ctx->enabled_steps & (1 << STEP_VERITY)) {
- INIT_WORK(&ctx->work, verity_work);
- fsverity_enqueue_verify_work(&ctx->work);
- return;
- }
- ctx->cur_step++;
- /* fall-through */
- default:
- __read_end_io(ctx->bio);
+ ctx = bio->bi_private;
+ put_read_callbacks_ctx(ctx);
}
-}
-
-static bool f2fs_bio_post_read_required(struct bio *bio)
-{
- return bio->bi_private && !bio->bi_status;
+#endif
+ bio_put(bio);
}
static void f2fs_read_end_io(struct bio *bio)
@@ -173,14 +106,12 @@ static void f2fs_read_end_io(struct bio *bio)
bio->bi_status = BLK_STS_IOERR;
}
- if (f2fs_bio_post_read_required(bio)) {
- struct bio_post_read_ctx *ctx = bio->bi_private;
-
- ctx->cur_step = STEP_INITIAL;
- bio_post_read_processing(ctx);
+#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
+ if (!bio->bi_status && bio->bi_private) {
+ read_callbacks((struct read_callbacks_ctx *)(bio->bi_private));
return;
}
-
+#endif
__read_end_io(bio);
}
@@ -582,9 +513,9 @@ static struct bio *f2fs_grab_read_bio(struct inode *inode, block_t blkaddr,
{
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
struct bio *bio;
- struct bio_post_read_ctx *ctx;
- unsigned int post_read_steps = 0;
-
+#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
+ struct read_callbacks_ctx *ctx;
+#endif
if (!f2fs_is_valid_blkaddr(sbi, blkaddr, DATA_GENERIC))
return ERR_PTR(-EFAULT);
@@ -595,24 +526,13 @@ static struct bio *f2fs_grab_read_bio(struct inode *inode, block_t blkaddr,
bio->bi_end_io = f2fs_read_end_io;
bio_set_op_attrs(bio, REQ_OP_READ, op_flag);
- if (f2fs_encrypted_file(inode))
- post_read_steps |= 1 << STEP_DECRYPT;
-#ifdef CONFIG_FS_VERITY
- if (inode->i_verity_info != NULL &&
- (first_idx < ((i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT)))
- post_read_steps |= 1 << STEP_VERITY;
-#endif
- if (post_read_steps) {
- ctx = mempool_alloc(bio_post_read_ctx_pool, GFP_NOFS);
- if (!ctx) {
- bio_put(bio);
- return ERR_PTR(-ENOMEM);
- }
- ctx->bio = bio;
- ctx->enabled_steps = post_read_steps;
- bio->bi_private = ctx;
+#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
+ ctx = get_read_callbacks_ctx(inode, bio, first_idx);
+ if (IS_ERR(ctx)) {
+ bio_put(bio);
+ return (struct bio *)ctx;
}
-
+#endif
return bio;
}
@@ -2894,29 +2814,3 @@ void f2fs_clear_page_cache_dirty_tag(struct page *page)
PAGECACHE_TAG_DIRTY);
xa_unlock_irqrestore(&mapping->i_pages, flags);
}
-
-int __init f2fs_init_post_read_processing(void)
-{
- bio_post_read_ctx_cache =
- kmem_cache_create("f2fs_bio_post_read_ctx",
- sizeof(struct bio_post_read_ctx), 0, 0, NULL);
- if (!bio_post_read_ctx_cache)
- goto fail;
- bio_post_read_ctx_pool =
- mempool_create_slab_pool(NUM_PREALLOC_POST_READ_CTXS,
- bio_post_read_ctx_cache);
- if (!bio_post_read_ctx_pool)
- goto fail_free_cache;
- return 0;
-
-fail_free_cache:
- kmem_cache_destroy(bio_post_read_ctx_cache);
-fail:
- return -ENOMEM;
-}
-
-void __exit f2fs_destroy_post_read_processing(void)
-{
- mempool_destroy(bio_post_read_ctx_pool);
- kmem_cache_destroy(bio_post_read_ctx_cache);
-}
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 0e187f67b206..2f75f06c784a 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -3633,15 +3633,11 @@ static int __init init_f2fs_fs(void)
err = register_filesystem(&f2fs_fs_type);
if (err)
goto free_shrinker;
+
f2fs_create_root_stats();
- err = f2fs_init_post_read_processing();
- if (err)
- goto free_root_stats;
+
return 0;
-free_root_stats:
- f2fs_destroy_root_stats();
- unregister_filesystem(&f2fs_fs_type);
free_shrinker:
unregister_shrinker(&f2fs_shrinker_info);
free_sysfs:
@@ -3662,7 +3658,6 @@ static int __init init_f2fs_fs(void)
static void __exit exit_f2fs_fs(void)
{
- f2fs_destroy_post_read_processing();
f2fs_destroy_root_stats();
unregister_filesystem(&f2fs_fs_type);
unregister_shrinker(&f2fs_shrinker_info);
diff --git a/fs/read_callbacks.c b/fs/read_callbacks.c
new file mode 100644
index 000000000000..b6d5b95e67d7
--- /dev/null
+++ b/fs/read_callbacks.c
@@ -0,0 +1,136 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * This file tracks the state machine that needs to be executed after reading
+ * data from files that are encrypted and/or have verity metadata associated
+ * with them.
+ */
+#include <linux/module.h>
+#include <linux/mm.h>
+#include <linux/pagemap.h>
+#include <linux/bio.h>
+#include <linux/fscrypt.h>
+#include <linux/fsverity.h>
+#include <linux/read_callbacks.h>
+
+#define NUM_PREALLOC_POST_READ_CTXS 128
+
+static struct kmem_cache *read_callbacks_ctx_cache;
+static mempool_t *read_callbacks_ctx_pool;
+
+/* Read callback state machine steps */
+enum read_callbacks_step {
+ STEP_INITIAL = 0,
+ STEP_DECRYPT,
+ STEP_VERITY,
+};
+
+void end_read_callbacks(struct bio *bio)
+{
+ struct page *page;
+ struct bio_vec *bv;
+ int i;
+ struct bvec_iter_all iter_all;
+
+ bio_for_each_segment_all(bv, bio, i, iter_all) {
+ page = bv->bv_page;
+
+ BUG_ON(bio->bi_status);
+
+ if (!PageError(page))
+ SetPageUptodate(page);
+
+ unlock_page(page);
+ }
+ if (bio->bi_private)
+ put_read_callbacks_ctx(bio->bi_private);
+ bio_put(bio);
+}
+EXPORT_SYMBOL(end_read_callbacks);
+
+void read_callbacks(struct read_callbacks_ctx *ctx)
+{
+ /*
+ * We use different work queues for decryption and for verity because
+ * verity may require reading metadata pages that need decryption, and
+ * we shouldn't recurse to the same workqueue.
+ */
+ switch (++ctx->cur_step) {
+ case STEP_DECRYPT:
+ if (ctx->enabled_steps & (1 << STEP_DECRYPT)) {
+ fscrypt_enqueue_decrypt_work(&ctx->work);
+ return;
+ }
+ ctx->cur_step++;
+ /* fall-through */
+ case STEP_VERITY:
+ if (ctx->enabled_steps & (1 << STEP_VERITY)) {
+ fsverity_enqueue_verify_work(&ctx->work);
+ return;
+ }
+ ctx->cur_step++;
+ /* fall-through */
+ default:
+ end_read_callbacks(ctx->bio);
+ }
+}
+EXPORT_SYMBOL(read_callbacks);
+
+struct read_callbacks_ctx *get_read_callbacks_ctx(struct inode *inode,
+ struct bio *bio,
+ pgoff_t index)
+{
+ unsigned int read_callbacks_steps = 0;
+ struct read_callbacks_ctx *ctx = NULL;
+
+ if (IS_ENCRYPTED(inode) && S_ISREG(inode->i_mode))
+ read_callbacks_steps |= 1 << STEP_DECRYPT;
+#ifdef CONFIG_FS_VERITY
+ if (inode->i_verity_info != NULL &&
+ (index < ((i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT)))
+ read_callbacks_steps |= 1 << STEP_VERITY;
+#endif
+ if (read_callbacks_steps) {
+ ctx = mempool_alloc(read_callbacks_ctx_pool, GFP_NOFS);
+ if (!ctx)
+ return ERR_PTR(-ENOMEM);
+ ctx->bio = bio;
+ ctx->inode = inode;
+ ctx->enabled_steps = read_callbacks_steps;
+ ctx->cur_step = STEP_INITIAL;
+ bio->bi_private = ctx;
+ }
+ return ctx;
+}
+EXPORT_SYMBOL(get_read_callbacks_ctx);
+
+void put_read_callbacks_ctx(struct read_callbacks_ctx *ctx)
+{
+ mempool_free(ctx, read_callbacks_ctx_pool);
+}
+EXPORT_SYMBOL(put_read_callbacks_ctx);
+
+bool read_callbacks_required(struct bio *bio)
+{
+ return bio->bi_private && !bio->bi_status;
+}
+EXPORT_SYMBOL(read_callbacks_required);
+
+static int __init init_read_callbacks(void)
+{
+ read_callbacks_ctx_cache = KMEM_CACHE(read_callbacks_ctx, 0);
+ if (!read_callbacks_ctx_cache)
+ goto fail;
+ read_callbacks_ctx_pool =
+ mempool_create_slab_pool(NUM_PREALLOC_POST_READ_CTXS,
+ read_callbacks_ctx_cache);
+ if (!read_callbacks_ctx_pool)
+ goto fail_free_cache;
+ return 0;
+
+fail_free_cache:
+ kmem_cache_destroy(read_callbacks_ctx_cache);
+fail:
+ return -ENOMEM;
+}
+
+fs_initcall(init_read_callbacks);
diff --git a/fs/verity/Kconfig b/fs/verity/Kconfig
index 563593fb42db..44784e57c99d 100644
--- a/fs/verity/Kconfig
+++ b/fs/verity/Kconfig
@@ -4,6 +4,7 @@ config FS_VERITY
# SHA-256 is selected as it's intended to be the default hash algorithm.
# To avoid bloat, other wanted algorithms must be selected explicitly.
select CRYPTO_SHA256
+ select FS_READ_CALLBACKS
help
This option enables fs-verity. fs-verity is the dm-verity
mechanism implemented at the file level. On supported
diff --git a/fs/verity/verify.c b/fs/verity/verify.c
index 5732453a81e7..f93bee33872d 100644
--- a/fs/verity/verify.c
+++ b/fs/verity/verify.c
@@ -13,6 +13,7 @@
#include <linux/pagemap.h>
#include <linux/ratelimit.h>
#include <linux/scatterlist.h>
+#include <linux/read_callbacks.h>
struct workqueue_struct *fsverity_read_workqueue;
@@ -284,6 +285,16 @@ void fsverity_verify_bio(struct bio *bio)
EXPORT_SYMBOL_GPL(fsverity_verify_bio);
#endif /* CONFIG_BLOCK */
+static void fsverity_verify_work(struct work_struct *work)
+{
+ struct read_callbacks_ctx *ctx =
+ container_of(work, struct read_callbacks_ctx, work);
+
+ fsverity_verify_bio(ctx->bio);
+
+ read_callbacks(ctx);
+}
+
/**
* fsverity_enqueue_verify_work - enqueue work on the fs-verity workqueue
*
@@ -291,6 +302,7 @@ EXPORT_SYMBOL_GPL(fsverity_verify_bio);
*/
void fsverity_enqueue_verify_work(struct work_struct *work)
{
+ INIT_WORK(work, fsverity_verify_work);
queue_work(fsverity_read_workqueue, work);
}
EXPORT_SYMBOL_GPL(fsverity_enqueue_verify_work);
diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h
index c00b764d6b8c..a760b7bd81d4 100644
--- a/include/linux/fscrypt.h
+++ b/include/linux/fscrypt.h
@@ -68,11 +68,7 @@ struct fscrypt_ctx {
struct {
struct page *bounce_page; /* Ciphertext page */
struct page *control_page; /* Original page */
- } w;
- struct {
- struct bio *bio;
- struct work_struct work;
- } r;
+ };
struct list_head free_list; /* Free list */
};
u8 flags; /* Flags */
@@ -113,7 +109,7 @@ extern int fscrypt_decrypt_page(const struct inode *, struct page *, unsigned in
static inline struct page *fscrypt_control_page(struct page *page)
{
- return ((struct fscrypt_ctx *)page_private(page))->w.control_page;
+ return ((struct fscrypt_ctx *)page_private(page))->control_page;
}
extern void fscrypt_restore_control_page(struct page *);
@@ -218,9 +214,6 @@ static inline bool fscrypt_match_name(const struct fscrypt_name *fname,
}
/* bio.c */
-extern void fscrypt_decrypt_bio(struct bio *);
-extern void fscrypt_enqueue_decrypt_bio(struct fscrypt_ctx *ctx,
- struct bio *bio);
extern void fscrypt_pullback_bio_page(struct page **, bool);
extern int fscrypt_zeroout_range(const struct inode *, pgoff_t, sector_t,
unsigned int);
@@ -390,15 +383,6 @@ static inline bool fscrypt_match_name(const struct fscrypt_name *fname,
return !memcmp(de_name, fname->disk_name.name, fname->disk_name.len);
}
-/* bio.c */
-static inline void fscrypt_decrypt_bio(struct bio *bio)
-{
-}
-
-static inline void fscrypt_enqueue_decrypt_bio(struct fscrypt_ctx *ctx,
- struct bio *bio)
-{
-}
static inline void fscrypt_pullback_bio_page(struct page **page, bool restore)
{
diff --git a/include/linux/read_callbacks.h b/include/linux/read_callbacks.h
new file mode 100644
index 000000000000..c501cdf83a5b
--- /dev/null
+++ b/include/linux/read_callbacks.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _READ_CALLBACKS_H
+#define _READ_CALLBACKS_H
+
+struct read_callbacks_ctx {
+ struct bio *bio;
+ struct inode *inode;
+ struct work_struct work;
+ unsigned int cur_step;
+ unsigned int enabled_steps;
+};
+
+void end_read_callbacks(struct bio *bio);
+void read_callbacks(struct read_callbacks_ctx *ctx);
+struct read_callbacks_ctx *get_read_callbacks_ctx(struct inode *inode,
+ struct bio *bio,
+ pgoff_t index);
+void put_read_callbacks_ctx(struct read_callbacks_ctx *ctx);
+bool read_callbacks_required(struct bio *bio);
+
+#endif /* _READ_CALLBACKS_H */
--
2.19.1
This commit makes do_mpage_readpage() "read callbacks" aware i.e. for
files requiring decryption/verification, do_mpage_readpage() now
allocates a context structure and assigns the corresponding pointer to
bio->bi_private. At endio time, a non-NULL bio->bi_private indicates
that after the read operation is performed, the bio's payload needs to
be processed further before the data is handed over to user space.
The context structure is used for tracking the state machine associated
with post read processing.
Signed-off-by: Chandan Rajendra <[email protected]>
---
fs/mpage.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 48 insertions(+), 3 deletions(-)
diff --git a/fs/mpage.c b/fs/mpage.c
index 3f19da75178b..e342b859ee44 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -30,6 +30,10 @@
#include <linux/backing-dev.h>
#include <linux/pagevec.h>
#include <linux/cleancache.h>
+#include <linux/fsverity.h>
+#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
+#include <linux/read_callbacks.h>
+#endif
#include "internal.h"
/*
@@ -50,6 +54,20 @@ static void mpage_end_io(struct bio *bio)
int i;
struct bvec_iter_all iter_all;
+#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
+ if (!bio->bi_status && bio->bi_private) {
+ struct read_callbacks_ctx *ctx;
+
+ ctx = bio->bi_private;
+
+ read_callbacks(ctx);
+ return;
+ }
+
+ if (bio->bi_private)
+ put_read_callbacks_ctx((struct read_callbacks_ctx *)(bio->bi_private));
+#endif
+
bio_for_each_segment_all(bv, bio, i, iter_all) {
struct page *page = bv->bv_page;
page_endio(page, bio_op(bio),
@@ -189,7 +207,13 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args)
block_in_file = (sector_t)page->index << (PAGE_SHIFT - blkbits);
last_block = block_in_file + args->nr_pages * blocks_per_page;
- last_block_in_file = (i_size_read(inode) + blocksize - 1) >> blkbits;
+#ifdef CONFIG_FS_VERITY
+ if (IS_VERITY(inode) && inode->i_sb->s_vop->readpage_limit)
+ last_block_in_file = inode->i_sb->s_vop->readpage_limit(inode);
+ else
+#endif
+ last_block_in_file = (i_size_read(inode) + blocksize - 1)
+ >> blkbits;
if (last_block > last_block_in_file)
last_block = last_block_in_file;
page_block = 0;
@@ -277,6 +301,14 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args)
if (first_hole != blocks_per_page) {
zero_user_segment(page, first_hole << blkbits, PAGE_SIZE);
if (first_hole == 0) {
+#ifdef CONFIG_FS_VERITY
+ if (IS_VERITY(inode)) {
+ if (!fsverity_check_hole(inode, page)) {
+ SetPageError(page);
+ goto confused;
+ }
+ }
+#endif
SetPageUptodate(page);
unlock_page(page);
goto out;
@@ -299,7 +331,11 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args)
alloc_new:
if (args->bio == NULL) {
- if (first_hole == blocks_per_page) {
+#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
+ struct read_callbacks_ctx *ctx;
+#endif
+ if (first_hole == blocks_per_page
+ && !(IS_ENCRYPTED(inode) || IS_VERITY(inode))) {
if (!bdev_read_page(bdev, blocks[0] << (blkbits - 9),
page))
goto out;
@@ -310,6 +346,15 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args)
gfp);
if (args->bio == NULL)
goto confused;
+
+#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
+ ctx = get_read_callbacks_ctx(inode, args->bio, page->index);
+ if (IS_ERR(ctx)) {
+ bio_put(args->bio);
+ args->bio = NULL;
+ goto confused;
+ }
+#endif
}
length = first_hole << blkbits;
@@ -331,7 +376,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args)
confused:
if (args->bio)
args->bio = mpage_bio_submit(REQ_OP_READ, op_flags, args->bio);
- if (!PageUptodate(page))
+ if (!PageUptodate(page) && !PageError(page))
block_read_full_page(page, args->get_block);
else
unlock_page(page);
--
2.19.1
Ext4 and F2FS store verity metadata beyond i_size. This commit adds a
callback pointer to "struct fsverity_operations" which helps in
determining the real file size limit up to which data can be read from
the file.
This callback is required in order to get do_mpage_readpage() to read
files having verity metadata appended beyond i_size.
Signed-off-by: Chandan Rajendra <[email protected]>
---
fs/ext4/super.c | 17 +++++++++++++++++
include/linux/fsverity.h | 1 +
2 files changed, 18 insertions(+)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 63d73b360f1d..8e483afbaa2e 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1428,6 +1428,22 @@ static struct page *ext4_read_verity_metadata_page(struct inode *inode,
return read_mapping_page(inode->i_mapping, index, NULL);
}
+static loff_t ext4_readpage_limit(struct inode *inode)
+{
+ if (IS_VERITY(inode)) {
+ if (inode->i_verity_info)
+ /* limit to end of metadata region */
+ return fsverity_full_i_size(inode);
+ /*
+ * fsverity_info is currently being set up and no user reads are
+ * allowed yet. It's easiest to just not enforce a limit yet.
+ */
+ return inode->i_sb->s_maxbytes;
+ }
+
+ return i_size_read(inode);
+}
+
static bool ext4_verity_required(struct inode *inode, pgoff_t index)
{
return index < (i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT;
@@ -1438,6 +1454,7 @@ static const struct fsverity_operations ext4_verityops = {
.get_metadata_end = ext4_get_verity_metadata_end,
.read_metadata_page = ext4_read_verity_metadata_page,
.verity_required = ext4_verity_required,
+ .readpage_limit = ext4_readpage_limit,
};
#endif /* CONFIG_FS_VERITY */
diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
index b83712d6c79a..fc8113acbbfe 100644
--- a/include/linux/fsverity.h
+++ b/include/linux/fsverity.h
@@ -19,6 +19,7 @@ struct fsverity_operations {
int (*get_metadata_end)(struct inode *inode, loff_t *metadata_end_ret);
struct page *(*read_metadata_page)(struct inode *inode, pgoff_t index);
bool (*verity_required)(struct inode *inode, pgoff_t index);
+ loff_t (*readpage_limit)(struct inode *inode);
};
#ifdef CONFIG_FS_VERITY
--
2.19.1
Now that do_mpage_readpage() is "read callbacks" aware, this commit
gets ext4_readpage[s]() to use mpage_readpage[s]() and deletes ext4's
readpage.c, since the associated functionality is no longer required.
Signed-off-by: Chandan Rajendra <[email protected]>
---
fs/ext4/Makefile | 2 +-
fs/ext4/inode.c | 5 +-
fs/ext4/readpage.c | 314 ---------------------------------------------
3 files changed, 3 insertions(+), 318 deletions(-)
delete mode 100644 fs/ext4/readpage.c
diff --git a/fs/ext4/Makefile b/fs/ext4/Makefile
index 8fdfcd3c3e04..7c38803a808d 100644
--- a/fs/ext4/Makefile
+++ b/fs/ext4/Makefile
@@ -8,7 +8,7 @@ obj-$(CONFIG_EXT4_FS) += ext4.o
ext4-y := balloc.o bitmap.o block_validity.o dir.o ext4_jbd2.o extents.o \
extents_status.o file.o fsmap.o fsync.o hash.o ialloc.o \
indirect.o inline.o inode.o ioctl.o mballoc.o migrate.o \
- mmp.o move_extent.o namei.o page-io.o readpage.o resize.o \
+ mmp.o move_extent.o namei.o page-io.o resize.o \
super.o symlink.o sysfs.o xattr.o xattr_trusted.o xattr_user.o
ext4-$(CONFIG_EXT4_FS_POSIX_ACL) += acl.o
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 05b258db8673..1327e04334df 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3353,8 +3353,7 @@ static int ext4_readpage(struct file *file, struct page *page)
ret = ext4_readpage_inline(inode, page);
if (ret == -EAGAIN)
- return ext4_mpage_readpages(page->mapping, NULL, page, 1,
- false);
+ return mpage_readpage(page, ext4_get_block);
return ret;
}
@@ -3369,7 +3368,7 @@ ext4_readpages(struct file *file, struct address_space *mapping,
if (ext4_has_inline_data(inode))
return 0;
- return ext4_mpage_readpages(mapping, pages, NULL, nr_pages, true);
+ return mpage_readpages(mapping, pages, nr_pages, ext4_get_block);
}
static void ext4_invalidatepage(struct page *page, unsigned int offset,
diff --git a/fs/ext4/readpage.c b/fs/ext4/readpage.c
deleted file mode 100644
index e363dededc21..000000000000
--- a/fs/ext4/readpage.c
+++ /dev/null
@@ -1,314 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * linux/fs/ext4/readpage.c
- *
- * Copyright (C) 2002, Linus Torvalds.
- * Copyright (C) 2015, Google, Inc.
- *
- * This was originally taken from fs/mpage.c
- *
- * The intent is the ext4_mpage_readpages() function here is intended
- * to replace mpage_readpages() in the general case, not just for
- * encrypted files. It has some limitations (see below), where it
- * will fall back to read_block_full_page(), but these limitations
- * should only be hit when page_size != block_size.
- *
- * This will allow us to attach a callback function to support ext4
- * encryption.
- *
- * If anything unusual happens, such as:
- *
- * - encountering a page which has buffers
- * - encountering a page which has a non-hole after a hole
- * - encountering a page with non-contiguous blocks
- *
- * then this code just gives up and calls the buffer_head-based read function.
- * It does handle a page which has holes at the end - that is a common case:
- * the end-of-file on blocksize < PAGE_SIZE setups.
- *
- */
-
-#include <linux/kernel.h>
-#include <linux/export.h>
-#include <linux/mm.h>
-#include <linux/kdev_t.h>
-#include <linux/gfp.h>
-#include <linux/bio.h>
-#include <linux/fs.h>
-#include <linux/buffer_head.h>
-#include <linux/blkdev.h>
-#include <linux/highmem.h>
-#include <linux/prefetch.h>
-#include <linux/mpage.h>
-#include <linux/writeback.h>
-#include <linux/backing-dev.h>
-#include <linux/pagevec.h>
-#include <linux/cleancache.h>
-#include <linux/read_callbacks.h>
-
-#include "ext4.h"
-
-static inline bool ext4_bio_encrypted(struct bio *bio)
-{
-#ifdef CONFIG_FS_ENCRYPTION
- return unlikely(bio->bi_private != NULL);
-#else
- return false;
-#endif
-}
-
-/*
- * I/O completion handler for multipage BIOs.
- *
- * The mpage code never puts partial pages into a BIO (except for end-of-file).
- * If a page does not map to a contiguous run of blocks then it simply falls
- * back to block_read_full_page().
- *
- * Why is this? If a page's completion depends on a number of different BIOs
- * which can complete in any order (or at the same time) then determining the
- * status of that page is hard. See end_buffer_async_read() for the details.
- * There is no point in duplicating all that complexity.
- */
-static void mpage_end_io(struct bio *bio)
-{
- struct bio_vec *bv;
- int i;
- struct bvec_iter_all iter_all;
-#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
- if (read_callbacks_required(bio)) {
- struct read_callbacks_ctx *ctx = bio->bi_private;
-
- read_callbacks(ctx);
- return;
- }
-#endif
- bio_for_each_segment_all(bv, bio, i, iter_all) {
- struct page *page = bv->bv_page;
-
- if (!bio->bi_status) {
- SetPageUptodate(page);
- } else {
- ClearPageUptodate(page);
- SetPageError(page);
- }
- unlock_page(page);
- }
-
- bio_put(bio);
-}
-
-static inline loff_t ext4_readpage_limit(struct inode *inode)
-{
-#ifdef CONFIG_FS_VERITY
- if (IS_VERITY(inode)) {
- if (inode->i_verity_info)
- /* limit to end of metadata region */
- return fsverity_full_i_size(inode);
- /*
- * fsverity_info is currently being set up and no user reads are
- * allowed yet. It's easiest to just not enforce a limit yet.
- */
- return inode->i_sb->s_maxbytes;
- }
-#endif
- return i_size_read(inode);
-}
-
-int ext4_mpage_readpages(struct address_space *mapping,
- struct list_head *pages, struct page *page,
- unsigned nr_pages, bool is_readahead)
-{
- struct bio *bio = NULL;
- sector_t last_block_in_bio = 0;
-
- struct inode *inode = mapping->host;
- const unsigned blkbits = inode->i_blkbits;
- const unsigned blocks_per_page = PAGE_SIZE >> blkbits;
- const unsigned blocksize = 1 << blkbits;
- sector_t block_in_file;
- sector_t last_block;
- sector_t last_block_in_file;
- sector_t blocks[MAX_BUF_PER_PAGE];
- unsigned page_block;
- struct block_device *bdev = inode->i_sb->s_bdev;
- int length;
- unsigned relative_block = 0;
- struct ext4_map_blocks map;
-
- map.m_pblk = 0;
- map.m_lblk = 0;
- map.m_len = 0;
- map.m_flags = 0;
-
- for (; nr_pages; nr_pages--) {
- int fully_mapped = 1;
- unsigned first_hole = blocks_per_page;
-
- prefetchw(&page->flags);
- if (pages) {
- page = lru_to_page(pages);
- list_del(&page->lru);
- if (add_to_page_cache_lru(page, mapping, page->index,
- readahead_gfp_mask(mapping)))
- goto next_page;
- }
-
- if (page_has_buffers(page))
- goto confused;
-
- block_in_file = (sector_t)page->index << (PAGE_SHIFT - blkbits);
- last_block = block_in_file + nr_pages * blocks_per_page;
- last_block_in_file = (ext4_readpage_limit(inode) +
- blocksize - 1) >> blkbits;
- if (last_block > last_block_in_file)
- last_block = last_block_in_file;
- page_block = 0;
-
- /*
- * Map blocks using the previous result first.
- */
- if ((map.m_flags & EXT4_MAP_MAPPED) &&
- block_in_file > map.m_lblk &&
- block_in_file < (map.m_lblk + map.m_len)) {
- unsigned map_offset = block_in_file - map.m_lblk;
- unsigned last = map.m_len - map_offset;
-
- for (relative_block = 0; ; relative_block++) {
- if (relative_block == last) {
- /* needed? */
- map.m_flags &= ~EXT4_MAP_MAPPED;
- break;
- }
- if (page_block == blocks_per_page)
- break;
- blocks[page_block] = map.m_pblk + map_offset +
- relative_block;
- page_block++;
- block_in_file++;
- }
- }
-
- /*
- * Then do more ext4_map_blocks() calls until we are
- * done with this page.
- */
- while (page_block < blocks_per_page) {
- if (block_in_file < last_block) {
- map.m_lblk = block_in_file;
- map.m_len = last_block - block_in_file;
-
- if (ext4_map_blocks(NULL, inode, &map, 0) < 0) {
- set_error_page:
- SetPageError(page);
- zero_user_segment(page, 0,
- PAGE_SIZE);
- unlock_page(page);
- goto next_page;
- }
- }
- if ((map.m_flags & EXT4_MAP_MAPPED) == 0) {
- fully_mapped = 0;
- if (first_hole == blocks_per_page)
- first_hole = page_block;
- page_block++;
- block_in_file++;
- continue;
- }
- if (first_hole != blocks_per_page)
- goto confused; /* hole -> non-hole */
-
- /* Contiguous blocks? */
- if (page_block && blocks[page_block-1] != map.m_pblk-1)
- goto confused;
- for (relative_block = 0; ; relative_block++) {
- if (relative_block == map.m_len) {
- /* needed? */
- map.m_flags &= ~EXT4_MAP_MAPPED;
- break;
- } else if (page_block == blocks_per_page)
- break;
- blocks[page_block] = map.m_pblk+relative_block;
- page_block++;
- block_in_file++;
- }
- }
- if (first_hole != blocks_per_page) {
- zero_user_segment(page, first_hole << blkbits,
- PAGE_SIZE);
- if (first_hole == 0) {
- if (!fsverity_check_hole(inode, page))
- goto set_error_page;
- SetPageUptodate(page);
- unlock_page(page);
- goto next_page;
- }
- } else if (fully_mapped) {
- SetPageMappedToDisk(page);
- }
- if (fully_mapped && blocks_per_page == 1 &&
- !PageUptodate(page) && cleancache_get_page(page) == 0) {
- SetPageUptodate(page);
- goto confused;
- }
-
- /*
- * This page will go to BIO. Do we need to send this
- * BIO off first?
- */
- if (bio && (last_block_in_bio != blocks[0] - 1)) {
- submit_and_realloc:
- submit_bio(bio);
- bio = NULL;
- }
- if (bio == NULL) {
- struct read_callbacks_ctx *ctx = NULL;
-
- bio = bio_alloc(GFP_KERNEL,
- min_t(int, nr_pages, BIO_MAX_PAGES));
- if (!bio)
- goto set_error_page;
-#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
- ctx = get_read_callbacks_ctx(inode, bio, page->index);
- if (IS_ERR(ctx)) {
- bio_put(bio);
- goto set_error_page;
- }
-#endif
- bio_set_dev(bio, bdev);
- bio->bi_iter.bi_sector = blocks[0] << (blkbits - 9);
- bio->bi_end_io = mpage_end_io;
- bio->bi_private = ctx;
- bio_set_op_attrs(bio, REQ_OP_READ,
- is_readahead ? REQ_RAHEAD : 0);
- }
-
- length = first_hole << blkbits;
- if (bio_add_page(bio, page, length, 0) < length)
- goto submit_and_realloc;
-
- if (((map.m_flags & EXT4_MAP_BOUNDARY) &&
- (relative_block == map.m_len)) ||
- (first_hole != blocks_per_page)) {
- submit_bio(bio);
- bio = NULL;
- } else
- last_block_in_bio = blocks[blocks_per_page - 1];
- goto next_page;
- confused:
- if (bio) {
- submit_bio(bio);
- bio = NULL;
- }
- if (!PageUptodate(page))
- block_read_full_page(page, ext4_get_block);
- else
- unlock_page(page);
- next_page:
- if (pages)
- put_page(page);
- }
- BUG_ON(pages && !list_empty(pages));
- if (bio)
- submit_bio(bio);
- return 0;
-}
--
2.19.1
With subpage-sized blocks, ext4_block_write_begin() can have up to two
blocks to decrypt. Hence this commit invokes fscrypt_decrypt_page() once
for each of those blocks.
Signed-off-by: Chandan Rajendra <[email protected]>
---
fs/ext4/inode.c | 33 +++++++++++++++++++++++----------
1 file changed, 23 insertions(+), 10 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 1327e04334df..51744a3c3964 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1156,12 +1156,14 @@ static int ext4_block_write_begin(struct page *page, loff_t pos, unsigned len,
unsigned to = from + len;
struct inode *inode = page->mapping->host;
unsigned block_start, block_end;
- sector_t block;
+ sector_t block, page_blk_nr;
int err = 0;
unsigned blocksize = inode->i_sb->s_blocksize;
unsigned bbits;
- struct buffer_head *bh, *head, *wait[2], **wait_bh = wait;
+ struct buffer_head *bh, *head, *wait[2];
+ int nr_wait = 0;
bool decrypt = false;
+ int i;
BUG_ON(!PageLocked(page));
BUG_ON(from > PAGE_SIZE);
@@ -1213,25 +1215,36 @@ static int ext4_block_write_begin(struct page *page, loff_t pos, unsigned len,
!buffer_unwritten(bh) &&
(block_start < from || block_end > to)) {
ll_rw_block(REQ_OP_READ, 0, 1, &bh);
- *wait_bh++ = bh;
+ wait[nr_wait++] = bh;
decrypt = IS_ENCRYPTED(inode) && S_ISREG(inode->i_mode);
}
}
/*
* If we issued read requests, let them complete.
*/
- while (wait_bh > wait) {
- wait_on_buffer(*--wait_bh);
- if (!buffer_uptodate(*wait_bh))
+ for (i = 0; i < nr_wait; i++) {
+ wait_on_buffer(wait[i]);
+ if (!buffer_uptodate(wait[i]))
err = -EIO;
}
if (unlikely(err)) {
page_zero_new_buffers(page, from, to);
} else if (decrypt) {
- err = fscrypt_decrypt_page(page->mapping->host, page,
- PAGE_SIZE, 0, page->index);
- if (err)
- clear_buffer_uptodate(*wait_bh);
+ page_blk_nr = (sector_t)page->index << (PAGE_SHIFT - bbits);
+
+ for (i = 0; i < nr_wait; i++) {
+ int err2;
+
+ block = page_blk_nr + (bh_offset(wait[i]) >> bbits);
+ err2 = fscrypt_decrypt_page(page->mapping->host, page,
+ wait[i]->b_size,
+ bh_offset(wait[i]),
+ block);
+ if (err2) {
+ clear_buffer_uptodate(wait[i]);
+ err = err2;
+ }
+ }
}
return err;
--
2.19.1
__ext4_block_zero_page_range() currently decrypts the entire page. This
commit decrypts only the block to be partially zeroed, instead of the
whole page.
Signed-off-by: Chandan Rajendra <[email protected]>
---
fs/ext4/inode.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 51744a3c3964..ade1816697a8 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4080,9 +4080,10 @@ static int __ext4_block_zero_page_range(handle_t *handle,
if (S_ISREG(inode->i_mode) && IS_ENCRYPTED(inode)) {
/* We expect the key to be set. */
BUG_ON(!fscrypt_has_encryption_key(inode));
- BUG_ON(blocksize != PAGE_SIZE);
WARN_ON_ONCE(fscrypt_decrypt_page(page->mapping->host,
- page, PAGE_SIZE, 0, page->index));
+ page, blocksize,
+ round_down(offset, blocksize),
+ iblock));
}
}
if (ext4_should_journal_data(inode)) {
--
2.19.1
For subpage-sized blocks, this commit now encrypts all blocks mapped by
a page range.
Signed-off-by: Chandan Rajendra <[email protected]>
---
fs/crypto/crypto.c | 37 +++++++++++++++++++++++++------------
1 file changed, 25 insertions(+), 12 deletions(-)
diff --git a/fs/crypto/crypto.c b/fs/crypto/crypto.c
index 4f0d832cae71..2d65b431563f 100644
--- a/fs/crypto/crypto.c
+++ b/fs/crypto/crypto.c
@@ -242,18 +242,26 @@ struct page *fscrypt_encrypt_page(const struct inode *inode,
{
struct fscrypt_ctx *ctx;
struct page *ciphertext_page = page;
+ int i, page_nr_blks;
int err;
BUG_ON(len % FS_CRYPTO_BLOCK_SIZE != 0);
+ page_nr_blks = len >> inode->i_blkbits;
+
if (inode->i_sb->s_cop->flags & FS_CFLG_OWN_PAGES) {
/* with inplace-encryption we just encrypt the page */
- err = fscrypt_do_page_crypto(inode, FS_ENCRYPT, lblk_num, page,
- ciphertext_page, len, offs,
- gfp_flags);
- if (err)
- return ERR_PTR(err);
-
+ for (i = 0; i < page_nr_blks; i++) {
+ err = fscrypt_do_page_crypto(inode, FS_ENCRYPT,
+ lblk_num, page,
+ ciphertext_page,
+ i_blocksize(inode), offs,
+ gfp_flags);
+ if (err)
+ return ERR_PTR(err);
+ ++lblk_num;
+ offs += i_blocksize(inode);
+ }
return ciphertext_page;
}
@@ -269,12 +277,17 @@ struct page *fscrypt_encrypt_page(const struct inode *inode,
goto errout;
ctx->control_page = page;
- err = fscrypt_do_page_crypto(inode, FS_ENCRYPT, lblk_num,
- page, ciphertext_page, len, offs,
- gfp_flags);
- if (err) {
- ciphertext_page = ERR_PTR(err);
- goto errout;
+
+ for (i = 0; i < page_nr_blks; i++) {
+ err = fscrypt_do_page_crypto(inode, FS_ENCRYPT, lblk_num,
+ page, ciphertext_page,
+ i_blocksize(inode), offs, gfp_flags);
+ if (err) {
+ ciphertext_page = ERR_PTR(err);
+ goto errout;
+ }
+ ++lblk_num;
+ offs += i_blocksize(inode);
}
SetPagePrivate(ciphertext_page);
set_page_private(ciphertext_page, (unsigned long)ctx);
--
2.19.1
Now that we have the code to support encryption for subpage-sized
blocks, this commit removes the conditional check in filesystem mount
code.
The commit also changes the support statement in
Documentation/filesystems/fscrypt.rst to reflect the fact that
encryption of filesystems with blocksize less than page size now works.
Signed-off-by: Chandan Rajendra <[email protected]>
---
Documentation/filesystems/fscrypt.rst | 4 ++--
fs/ext4/super.c | 7 -------
2 files changed, 2 insertions(+), 9 deletions(-)
diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst
index 08c23b60e016..ff2fea121da9 100644
--- a/Documentation/filesystems/fscrypt.rst
+++ b/Documentation/filesystems/fscrypt.rst
@@ -213,8 +213,8 @@ Contents encryption
-------------------
For file contents, each filesystem block is encrypted independently.
-Currently, only the case where the filesystem block size is equal to
-the system's page size (usually 4096 bytes) is supported.
+Starting from Linux kernel 5.3, encryption of filesystems with block
+size less than the system's page size is supported.
Each block's IV is set to the logical block number within the file as
a little endian number, except that:
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 8e483afbaa2e..4acfefa98ec5 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -4432,13 +4432,6 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
}
}
- if ((DUMMY_ENCRYPTION_ENABLED(sbi) || ext4_has_feature_encrypt(sb)) &&
- (blocksize != PAGE_SIZE)) {
- ext4_msg(sb, KERN_ERR,
- "Unsupported blocksize for fs encryption");
- goto failed_mount_wq;
- }
-
if (DUMMY_ENCRYPTION_ENABLED(sbi) && !sb_rdonly(sb) &&
!ext4_has_feature_encrypt(sb)) {
ext4_set_feature_encrypt(sb);
--
2.19.1
For subpage-sized blocks, this commit adds code to encrypt all zeroed
out blocks mapped by a page.
Signed-off-by: Chandan Rajendra <[email protected]>
---
fs/crypto/bio.c | 40 ++++++++++++++++++----------------------
1 file changed, 18 insertions(+), 22 deletions(-)
diff --git a/fs/crypto/bio.c b/fs/crypto/bio.c
index 856f4694902d..46dd2ec50c7d 100644
--- a/fs/crypto/bio.c
+++ b/fs/crypto/bio.c
@@ -108,29 +108,23 @@ EXPORT_SYMBOL(fscrypt_pullback_bio_page);
int fscrypt_zeroout_range(const struct inode *inode, pgoff_t lblk,
sector_t pblk, unsigned int len)
{
- struct fscrypt_ctx *ctx;
struct page *ciphertext_page = NULL;
struct bio *bio;
+ u64 total_bytes, page_bytes;
int ret, err = 0;
- BUG_ON(inode->i_sb->s_blocksize != PAGE_SIZE);
-
- ctx = fscrypt_get_ctx(inode, GFP_NOFS);
- if (IS_ERR(ctx))
- return PTR_ERR(ctx);
+ total_bytes = len << inode->i_blkbits;
- ciphertext_page = fscrypt_alloc_bounce_page(ctx, GFP_NOWAIT);
- if (IS_ERR(ciphertext_page)) {
- err = PTR_ERR(ciphertext_page);
- goto errout;
- }
+ while (total_bytes) {
+ page_bytes = min_t(u64, total_bytes, PAGE_SIZE);
- while (len--) {
- err = fscrypt_do_page_crypto(inode, FS_ENCRYPT, lblk,
- ZERO_PAGE(0), ciphertext_page,
- PAGE_SIZE, 0, GFP_NOFS);
- if (err)
+ ciphertext_page = fscrypt_encrypt_page(inode, ZERO_PAGE(0),
+ page_bytes, 0, lblk, GFP_NOFS);
+ if (IS_ERR(ciphertext_page)) {
+ err = PTR_ERR(ciphertext_page);
+ ciphertext_page = NULL;
goto errout;
+ }
bio = bio_alloc(GFP_NOWAIT, 1);
if (!bio) {
@@ -141,9 +135,8 @@ int fscrypt_zeroout_range(const struct inode *inode, pgoff_t lblk,
bio->bi_iter.bi_sector =
pblk << (inode->i_sb->s_blocksize_bits - 9);
bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
- ret = bio_add_page(bio, ciphertext_page,
- inode->i_sb->s_blocksize, 0);
- if (ret != inode->i_sb->s_blocksize) {
+ ret = bio_add_page(bio, ciphertext_page, page_bytes, 0);
+ if (ret != page_bytes) {
/* should never happen! */
WARN_ON(1);
bio_put(bio);
@@ -156,12 +149,15 @@ int fscrypt_zeroout_range(const struct inode *inode, pgoff_t lblk,
bio_put(bio);
if (err)
goto errout;
- lblk++;
- pblk++;
+
+ lblk += page_bytes >> inode->i_blkbits;
+ pblk += page_bytes >> inode->i_blkbits;
+ total_bytes -= page_bytes;
}
err = 0;
errout:
- fscrypt_release_ctx(ctx);
+ if (!IS_ERR_OR_NULL(ciphertext_page))
+ fscrypt_restore_control_page(ciphertext_page);
return err;
}
EXPORT_SYMBOL(fscrypt_zeroout_range);
--
2.19.1
For subpage-sized blocks, the initial logical block number mapped by a
page can be different from page->index. Hence this commit adds code to
compute the first logical block mapped by the page and also the page
range to be encrypted.
Signed-off-by: Chandan Rajendra <[email protected]>
---
fs/ext4/page-io.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
index 3e9298e6a705..75485ee9e800 100644
--- a/fs/ext4/page-io.c
+++ b/fs/ext4/page-io.c
@@ -418,6 +418,7 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
{
struct page *data_page = NULL;
struct inode *inode = page->mapping->host;
+ u64 page_blk;
unsigned block_start;
struct buffer_head *bh, *head;
int ret = 0;
@@ -478,10 +479,14 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
if (IS_ENCRYPTED(inode) && S_ISREG(inode->i_mode) && nr_to_submit) {
gfp_t gfp_flags = GFP_NOFS;
+ unsigned int page_bytes;
+
+ page_bytes = round_up(len, i_blocksize(inode));
+ page_blk = (u64)page->index << (PAGE_SHIFT - inode->i_blkbits);
retry_encrypt:
- data_page = fscrypt_encrypt_page(inode, page, PAGE_SIZE, 0,
- page->index, gfp_flags);
+ data_page = fscrypt_encrypt_page(inode, page, page_bytes, 0,
+ page_blk, gfp_flags);
if (IS_ERR(data_page)) {
ret = PTR_ERR(data_page);
if (ret == -ENOMEM && wbc->sync_mode == WB_SYNC_ALL) {
--
2.19.1
Ext4 and F2FS store verity metadata in data extents (beyond
inode->i_size) associated with a file. But other filesystems might
choose alternative means to store verity metadata. Hence this commit
adds a callback function pointer to 'struct fsverity_operations' to help
in deciding whether a verity operation needs to be performed against a
page-cache page holding file data.
Signed-off-by: Chandan Rajendra <[email protected]>
---
fs/ext4/super.c | 6 ++++++
fs/f2fs/super.c | 6 ++++++
fs/read_callbacks.c | 4 +++-
include/linux/fsverity.h | 1 +
4 files changed, 16 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index aba724f82cc3..63d73b360f1d 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1428,10 +1428,16 @@ static struct page *ext4_read_verity_metadata_page(struct inode *inode,
return read_mapping_page(inode->i_mapping, index, NULL);
}
+static bool ext4_verity_required(struct inode *inode, pgoff_t index)
+{
+ return index < (i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT;
+}
+
static const struct fsverity_operations ext4_verityops = {
.set_verity = ext4_set_verity,
.get_metadata_end = ext4_get_verity_metadata_end,
.read_metadata_page = ext4_read_verity_metadata_page,
+ .verity_required = ext4_verity_required,
};
#endif /* CONFIG_FS_VERITY */
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 2f75f06c784a..cd1299e1f92d 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -2257,10 +2257,16 @@ static struct page *f2fs_read_verity_metadata_page(struct inode *inode,
return read_mapping_page(inode->i_mapping, index, NULL);
}
+static bool f2fs_verity_required(struct inode *inode, pgoff_t index)
+{
+ return index < (i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT;
+}
+
static const struct fsverity_operations f2fs_verityops = {
.set_verity = f2fs_set_verity,
.get_metadata_end = f2fs_get_verity_metadata_end,
.read_metadata_page = f2fs_read_verity_metadata_page,
+ .verity_required = f2fs_verity_required,
};
#endif /* CONFIG_FS_VERITY */
diff --git a/fs/read_callbacks.c b/fs/read_callbacks.c
index b6d5b95e67d7..6dea54b0baa9 100644
--- a/fs/read_callbacks.c
+++ b/fs/read_callbacks.c
@@ -86,7 +86,9 @@ struct read_callbacks_ctx *get_read_callbacks_ctx(struct inode *inode,
read_callbacks_steps |= 1 << STEP_DECRYPT;
#ifdef CONFIG_FS_VERITY
if (inode->i_verity_info != NULL &&
- (index < ((i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT)))
+ ((inode->i_sb->s_vop->verity_required
+ && inode->i_sb->s_vop->verity_required(inode, index))
+ || (inode->i_sb->s_vop->verity_required == NULL)))
read_callbacks_steps |= 1 << STEP_VERITY;
#endif
if (read_callbacks_steps) {
diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
index 7c33b42abf1b..b83712d6c79a 100644
--- a/include/linux/fsverity.h
+++ b/include/linux/fsverity.h
@@ -18,6 +18,7 @@ struct fsverity_operations {
int (*set_verity)(struct inode *inode, loff_t data_i_size);
int (*get_metadata_end)(struct inode *inode, loff_t *metadata_end_ret);
struct page *(*read_metadata_page)(struct inode *inode, pgoff_t index);
+ bool (*verity_required)(struct inode *inode, pgoff_t index);
};
#ifdef CONFIG_FS_VERITY
--
2.19.1
To support decryption of subpage-sized blocks, this commit adds code to:
1. Track buffer head in "struct read_callbacks_ctx".
2. Pass buffer head argument to all read callbacks.
3. In the corresponding endio, loop across all the blocks mapped by the
page, decrypting each block in turn.
Signed-off-by: Chandan Rajendra <[email protected]>
---
fs/buffer.c | 83 +++++++++++++++++++++++++---------
fs/crypto/bio.c | 50 +++++++++++++-------
fs/crypto/crypto.c | 19 +++++++-
fs/f2fs/data.c | 2 +-
fs/mpage.c | 2 +-
fs/read_callbacks.c | 53 ++++++++++++++--------
include/linux/buffer_head.h | 1 +
include/linux/read_callbacks.h | 5 +-
8 files changed, 154 insertions(+), 61 deletions(-)
diff --git a/fs/buffer.c b/fs/buffer.c
index ce357602f471..f324727e24bb 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -45,6 +45,7 @@
#include <linux/bit_spinlock.h>
#include <linux/pagevec.h>
#include <linux/sched/mm.h>
+#include <linux/read_callbacks.h>
#include <trace/events/block.h>
static int fsync_buffers_list(spinlock_t *lock, struct list_head *list);
@@ -245,11 +246,7 @@ __find_get_block_slow(struct block_device *bdev, sector_t block)
return ret;
}
-/*
- * I/O completion handler for block_read_full_page() - pages
- * which come unlocked at the end of I/O.
- */
-static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
+void end_buffer_page_read(struct buffer_head *bh)
{
unsigned long flags;
struct buffer_head *first;
@@ -257,17 +254,7 @@ static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
struct page *page;
int page_uptodate = 1;
- BUG_ON(!buffer_async_read(bh));
-
page = bh->b_page;
- if (uptodate) {
- set_buffer_uptodate(bh);
- } else {
- clear_buffer_uptodate(bh);
- buffer_io_error(bh, ", async page read");
- SetPageError(page);
- }
-
/*
* Be _very_ careful from here on. Bad things can happen if
* two buffer heads end IO at almost the same time and both
@@ -305,6 +292,44 @@ static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
local_irq_restore(flags);
return;
}
+EXPORT_SYMBOL(end_buffer_page_read);
+
+/*
+ * I/O completion handler for block_read_full_page() - pages
+ * which come unlocked at the end of I/O.
+ */
+static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
+{
+ struct page *page;
+
+ BUG_ON(!buffer_async_read(bh));
+
+#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
+ if (uptodate && bh->b_private) {
+ struct read_callbacks_ctx *ctx = bh->b_private;
+
+ read_callbacks(ctx);
+ return;
+ }
+
+ if (bh->b_private) {
+ struct read_callbacks_ctx *ctx = bh->b_private;
+
+ WARN_ON(uptodate);
+ put_read_callbacks_ctx(ctx);
+ }
+#endif
+ page = bh->b_page;
+ if (uptodate) {
+ set_buffer_uptodate(bh);
+ } else {
+ clear_buffer_uptodate(bh);
+ buffer_io_error(bh, ", async page read");
+ SetPageError(page);
+ }
+
+ end_buffer_page_read(bh);
+}
/*
* Completion handler for block_write_full_page() - pages which are unlocked
@@ -2220,7 +2245,11 @@ int block_read_full_page(struct page *page, get_block_t *get_block)
{
struct inode *inode = page->mapping->host;
sector_t iblock, lblock;
- struct buffer_head *bh, *head, *arr[MAX_BUF_PER_PAGE];
+ struct buffer_head *bh, *head;
+ struct {
+ sector_t blk_nr;
+ struct buffer_head *bh;
+ } arr[MAX_BUF_PER_PAGE];
unsigned int blocksize, bbits;
int nr, i;
int fully_mapped = 1;
@@ -2262,7 +2291,9 @@ int block_read_full_page(struct page *page, get_block_t *get_block)
if (buffer_uptodate(bh))
continue;
}
- arr[nr++] = bh;
+ arr[nr].blk_nr = iblock;
+ arr[nr].bh = bh;
+ ++nr;
} while (i++, iblock++, (bh = bh->b_this_page) != head);
if (fully_mapped)
@@ -2281,7 +2312,7 @@ int block_read_full_page(struct page *page, get_block_t *get_block)
/* Stage two: lock the buffers */
for (i = 0; i < nr; i++) {
- bh = arr[i];
+ bh = arr[i].bh;
lock_buffer(bh);
mark_buffer_async_read(bh);
}
@@ -2292,11 +2323,21 @@ int block_read_full_page(struct page *page, get_block_t *get_block)
* the underlying blockdev brought it uptodate (the sct fix).
*/
for (i = 0; i < nr; i++) {
- bh = arr[i];
- if (buffer_uptodate(bh))
+ bh = arr[i].bh;
+ if (buffer_uptodate(bh)) {
end_buffer_async_read(bh, 1);
- else
+ } else {
+#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
+ struct read_callbacks_ctx *ctx;
+
+ ctx = get_read_callbacks_ctx(inode, NULL, bh, arr[i].blk_nr);
+ if (WARN_ON(IS_ERR(ctx))) {
+ end_buffer_async_read(bh, 0);
+ continue;
+ }
+#endif
submit_bh(REQ_OP_READ, 0, bh);
+ }
}
return 0;
}
diff --git a/fs/crypto/bio.c b/fs/crypto/bio.c
index 27f5618174f2..856f4694902d 100644
--- a/fs/crypto/bio.c
+++ b/fs/crypto/bio.c
@@ -24,44 +24,62 @@
#include <linux/module.h>
#include <linux/bio.h>
#include <linux/namei.h>
+#include <linux/buffer_head.h>
#include <linux/read_callbacks.h>
#include "fscrypt_private.h"
-static void __fscrypt_decrypt_bio(struct bio *bio, bool done)
+static void fscrypt_decrypt(struct bio *bio, struct buffer_head *bh)
{
+ struct inode *inode;
+ struct page *page;
struct bio_vec *bv;
+ sector_t blk_nr;
+ int ret;
int i;
struct bvec_iter_all iter_all;
- bio_for_each_segment_all(bv, bio, i, iter_all) {
- struct page *page = bv->bv_page;
- int ret = fscrypt_decrypt_page(page->mapping->host, page,
- PAGE_SIZE, 0, page->index);
+ WARN_ON(!bh && !bio);
+ if (bh) {
+ page = bh->b_page;
+ inode = page->mapping->host;
+
+ blk_nr = page->index << (PAGE_SHIFT - inode->i_blkbits);
+ blk_nr += (bh_offset(bh) >> inode->i_blkbits);
+
+ ret = fscrypt_decrypt_page(inode, page, i_blocksize(inode),
+ bh_offset(bh), blk_nr);
if (ret) {
WARN_ON_ONCE(1);
SetPageError(page);
- } else if (done) {
- SetPageUptodate(page);
}
- if (done)
- unlock_page(page);
+ } else if (bio) {
+ bio_for_each_segment_all(bv, bio, i, iter_all) {
+ unsigned int blkbits;
+
+ page = bv->bv_page;
+ inode = page->mapping->host;
+ blkbits = inode->i_blkbits;
+ blk_nr = page->index << (PAGE_SHIFT - blkbits);
+ blk_nr += (bv->bv_offset >> blkbits);
+ ret = fscrypt_decrypt_page(page->mapping->host,
+ page, bv->bv_len,
+ bv->bv_offset, blk_nr);
+ if (ret) {
+ WARN_ON_ONCE(1);
+ SetPageError(page);
+ }
+ }
}
}
-void fscrypt_decrypt_bio(struct bio *bio)
-{
- __fscrypt_decrypt_bio(bio, false);
-}
-EXPORT_SYMBOL(fscrypt_decrypt_bio);
-
void fscrypt_decrypt_work(struct work_struct *work)
{
struct read_callbacks_ctx *ctx =
container_of(work, struct read_callbacks_ctx, work);
- fscrypt_decrypt_bio(ctx->bio);
+ fscrypt_decrypt(ctx->bio, ctx->bh);
read_callbacks(ctx);
}
diff --git a/fs/crypto/crypto.c b/fs/crypto/crypto.c
index ffa9302a7351..4f0d832cae71 100644
--- a/fs/crypto/crypto.c
+++ b/fs/crypto/crypto.c
@@ -305,11 +305,26 @@ EXPORT_SYMBOL(fscrypt_encrypt_page);
int fscrypt_decrypt_page(const struct inode *inode, struct page *page,
unsigned int len, unsigned int offs, u64 lblk_num)
{
+ int i, page_nr_blks;
+ int err = 0;
+
if (!(inode->i_sb->s_cop->flags & FS_CFLG_OWN_PAGES))
BUG_ON(!PageLocked(page));
- return fscrypt_do_page_crypto(inode, FS_DECRYPT, lblk_num, page, page,
- len, offs, GFP_NOFS);
+ page_nr_blks = len >> inode->i_blkbits;
+
+ for (i = 0; i < page_nr_blks; i++) {
+ err = fscrypt_do_page_crypto(inode, FS_DECRYPT, lblk_num,
+ page, page, i_blocksize(inode), offs,
+ GFP_NOFS);
+ if (err)
+ break;
+
+ ++lblk_num;
+ offs += i_blocksize(inode);
+ }
+
+ return err;
}
EXPORT_SYMBOL(fscrypt_decrypt_page);
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 05430d3650ab..ba437a2085e7 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -527,7 +527,7 @@ static struct bio *f2fs_grab_read_bio(struct inode *inode, block_t blkaddr,
bio_set_op_attrs(bio, REQ_OP_READ, op_flag);
#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
- ctx = get_read_callbacks_ctx(inode, bio, first_idx);
+ ctx = get_read_callbacks_ctx(inode, bio, NULL, first_idx);
if (IS_ERR(ctx)) {
bio_put(bio);
return (struct bio *)ctx;
diff --git a/fs/mpage.c b/fs/mpage.c
index e342b859ee44..0557479fdca4 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -348,7 +348,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args)
goto confused;
#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
- ctx = get_read_callbacks_ctx(inode, args->bio, page->index);
+ ctx = get_read_callbacks_ctx(inode, args->bio, NULL, page->index);
if (IS_ERR(ctx)) {
bio_put(args->bio);
args->bio = NULL;
diff --git a/fs/read_callbacks.c b/fs/read_callbacks.c
index 6dea54b0baa9..b3881c525720 100644
--- a/fs/read_callbacks.c
+++ b/fs/read_callbacks.c
@@ -8,6 +8,7 @@
#include <linux/mm.h>
#include <linux/pagemap.h>
#include <linux/bio.h>
+#include <linux/buffer_head.h>
#include <linux/fscrypt.h>
#include <linux/fsverity.h>
#include <linux/read_callbacks.h>
@@ -24,26 +25,41 @@ enum read_callbacks_step {
STEP_VERITY,
};
-void end_read_callbacks(struct bio *bio)
+void end_read_callbacks(struct bio *bio, struct buffer_head *bh)
{
+ struct read_callbacks_ctx *ctx;
struct page *page;
struct bio_vec *bv;
int i;
struct bvec_iter_all iter_all;
- bio_for_each_segment_all(bv, bio, i, iter_all) {
- page = bv->bv_page;
+ if (bh) {
+ if (!PageError(bh->b_page))
+ set_buffer_uptodate(bh);
- BUG_ON(bio->bi_status);
+ ctx = bh->b_private;
- if (!PageError(page))
- SetPageUptodate(page);
+ end_buffer_page_read(bh);
- unlock_page(page);
+ put_read_callbacks_ctx(ctx);
+ } else if (bio) {
+ bio_for_each_segment_all(bv, bio, i, iter_all) {
+ page = bv->bv_page;
+
+ WARN_ON(bio->bi_status);
+
+ if (!PageError(page))
+ SetPageUptodate(page);
+
+ unlock_page(page);
+ }
+ WARN_ON(!bio->bi_private);
+
+ ctx = bio->bi_private;
+ put_read_callbacks_ctx(ctx);
+
+ bio_put(bio);
}
- if (bio->bi_private)
- put_read_callbacks_ctx(bio->bi_private);
- bio_put(bio);
}
EXPORT_SYMBOL(end_read_callbacks);
@@ -70,18 +86,21 @@ void read_callbacks(struct read_callbacks_ctx *ctx)
ctx->cur_step++;
/* fall-through */
default:
- end_read_callbacks(ctx->bio);
+ end_read_callbacks(ctx->bio, ctx->bh);
}
}
EXPORT_SYMBOL(read_callbacks);
struct read_callbacks_ctx *get_read_callbacks_ctx(struct inode *inode,
struct bio *bio,
+ struct buffer_head *bh,
pgoff_t index)
{
unsigned int read_callbacks_steps = 0;
struct read_callbacks_ctx *ctx = NULL;
+ WARN_ON(!bh && !bio);
+
if (IS_ENCRYPTED(inode) && S_ISREG(inode->i_mode))
read_callbacks_steps |= 1 << STEP_DECRYPT;
#ifdef CONFIG_FS_VERITY
@@ -95,11 +114,15 @@ struct read_callbacks_ctx *get_read_callbacks_ctx(struct inode *inode,
ctx = mempool_alloc(read_callbacks_ctx_pool, GFP_NOFS);
if (!ctx)
return ERR_PTR(-ENOMEM);
+ ctx->bh = bh;
ctx->bio = bio;
ctx->inode = inode;
ctx->enabled_steps = read_callbacks_steps;
ctx->cur_step = STEP_INITIAL;
- bio->bi_private = ctx;
+ if (bio)
+ bio->bi_private = ctx;
+ else if (bh)
+ bh->b_private = ctx;
}
return ctx;
}
@@ -111,12 +134,6 @@ void put_read_callbacks_ctx(struct read_callbacks_ctx *ctx)
}
EXPORT_SYMBOL(put_read_callbacks_ctx);
-bool read_callbacks_required(struct bio *bio)
-{
- return bio->bi_private && !bio->bi_status;
-}
-EXPORT_SYMBOL(read_callbacks_required);
-
static int __init init_read_callbacks(void)
{
read_callbacks_ctx_cache = KMEM_CACHE(read_callbacks_ctx, 0);
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index 7b73ef7f902d..782ed6350dfc 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -165,6 +165,7 @@ void create_empty_buffers(struct page *, unsigned long,
void end_buffer_read_sync(struct buffer_head *bh, int uptodate);
void end_buffer_write_sync(struct buffer_head *bh, int uptodate);
void end_buffer_async_write(struct buffer_head *bh, int uptodate);
+void end_buffer_page_read(struct buffer_head *bh);
/* Things to do with buffers at mapping->private_list */
void mark_buffer_dirty_inode(struct buffer_head *bh, struct inode *inode);
diff --git a/include/linux/read_callbacks.h b/include/linux/read_callbacks.h
index c501cdf83a5b..ae32dc4efa6d 100644
--- a/include/linux/read_callbacks.h
+++ b/include/linux/read_callbacks.h
@@ -3,6 +3,7 @@
#define _READ_CALLBACKS_H
struct read_callbacks_ctx {
+ struct buffer_head *bh;
struct bio *bio;
struct inode *inode;
struct work_struct work;
@@ -10,12 +11,12 @@ struct read_callbacks_ctx {
unsigned int enabled_steps;
};
-void end_read_callbacks(struct bio *bio);
+void end_read_callbacks(struct bio *bio, struct buffer_head *bh);
void read_callbacks(struct read_callbacks_ctx *ctx);
struct read_callbacks_ctx *get_read_callbacks_ctx(struct inode *inode,
struct bio *bio,
+ struct buffer_head *bh,
pgoff_t index);
void put_read_callbacks_ctx(struct read_callbacks_ctx *ctx);
-bool read_callbacks_required(struct bio *bio);
#endif /* _READ_CALLBACKS_H */
--
2.19.1
Hi Chandan,
On Sun, Apr 28, 2019 at 10:01:10AM +0530, Chandan Rajendra wrote:
> The "read callbacks" code is used by both Ext4 and F2FS. Hence to
> remove duplicity, this commit moves the code into
> include/linux/read_callbacks.h and fs/read_callbacks.c.
>
> The corresponding decrypt and verity "work" functions have been moved
> inside fscrypt and fsverity sources. With these in place, the read
> callbacks code now has to just invoke enqueue functions provided by
> fscrypt and fsverity.
>
> Signed-off-by: Chandan Rajendra <[email protected]>
> ---
> fs/Kconfig | 4 +
> fs/Makefile | 4 +
> fs/crypto/Kconfig | 1 +
> fs/crypto/bio.c | 23 ++---
> fs/crypto/crypto.c | 17 +--
> fs/crypto/fscrypt_private.h | 3 +
> fs/ext4/ext4.h | 2 -
> fs/ext4/readpage.c | 183 +++++----------------------------
> fs/ext4/super.c | 9 +-
> fs/f2fs/data.c | 148 ++++----------------------
> fs/f2fs/super.c | 9 +-
> fs/read_callbacks.c | 136 ++++++++++++++++++++++++
> fs/verity/Kconfig | 1 +
> fs/verity/verify.c | 12 +++
> include/linux/fscrypt.h | 20 +---
> include/linux/read_callbacks.h | 21 ++++
> 16 files changed, 251 insertions(+), 342 deletions(-)
> create mode 100644 fs/read_callbacks.c
> create mode 100644 include/linux/read_callbacks.h
>
For easier review, can you split this into multiple patches? Ideally the ext4
and f2fs patches would be separate, but if that's truly not possible due to
interdependencies it seems you could at least do:
1. Introduce the read_callbacks.
2. Convert encryption to use the read_callbacks.
3. Remove union from struct fscrypt_context.
Also: just FYI, fs-verity isn't upstream yet, and in the past few months I
haven't had much time to work on it. So you might consider arranging your
series so that initially just fscrypt is supported. That will be useful on its
own, for block_size < PAGE_SIZE support. Then fsverity can be added later.
> diff --git a/fs/Kconfig b/fs/Kconfig
> index 97f9eb8df713..03084f2dbeaf 100644
> --- a/fs/Kconfig
> +++ b/fs/Kconfig
> @@ -308,6 +308,10 @@ config NFS_COMMON
> depends on NFSD || NFS_FS || LOCKD
> default y
>
> +config FS_READ_CALLBACKS
> + bool
> + default n
'default n' is unnecessary, since 'n' is already the default.
> +
> source "net/sunrpc/Kconfig"
> source "fs/ceph/Kconfig"
> source "fs/cifs/Kconfig"
> diff --git a/fs/Makefile b/fs/Makefile
> index 9dd2186e74b5..e0c0fce8cf40 100644
> --- a/fs/Makefile
> +++ b/fs/Makefile
> @@ -21,6 +21,10 @@ else
> obj-y += no-block.o
> endif
>
> +ifeq ($(CONFIG_FS_READ_CALLBACKS),y)
> +obj-y += read_callbacks.o
> +endif
> +
This can be simplified to:
obj-$(CONFIG_FS_READ_CALLBACKS) += read_callbacks.o
> diff --git a/fs/read_callbacks.c b/fs/read_callbacks.c
> new file mode 100644
> index 000000000000..b6d5b95e67d7
> --- /dev/null
> +++ b/fs/read_callbacks.c
> @@ -0,0 +1,136 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * This file tracks the state machine that needs to be executed after reading
> + * data from files that are encrypted and/or have verity metadata associated
> + * with them.
> + */
> +#include <linux/module.h>
> +#include <linux/mm.h>
> +#include <linux/pagemap.h>
> +#include <linux/bio.h>
> +#include <linux/fscrypt.h>
> +#include <linux/fsverity.h>
> +#include <linux/read_callbacks.h>
> +
> +#define NUM_PREALLOC_POST_READ_CTXS 128
> +
> +static struct kmem_cache *read_callbacks_ctx_cache;
> +static mempool_t *read_callbacks_ctx_pool;
> +
> +/* Read callback state machine steps */
> +enum read_callbacks_step {
> + STEP_INITIAL = 0,
> + STEP_DECRYPT,
> + STEP_VERITY,
> +};
> +
> +void end_read_callbacks(struct bio *bio)
> +{
> + struct page *page;
> + struct bio_vec *bv;
> + int i;
> + struct bvec_iter_all iter_all;
> +
> + bio_for_each_segment_all(bv, bio, i, iter_all) {
> + page = bv->bv_page;
> +
> + BUG_ON(bio->bi_status);
> +
> + if (!PageError(page))
> + SetPageUptodate(page);
> +
> + unlock_page(page);
> + }
> + if (bio->bi_private)
> + put_read_callbacks_ctx(bio->bi_private);
> + bio_put(bio);
> +}
> +EXPORT_SYMBOL(end_read_callbacks);
end_read_callbacks() is only called by read_callbacks() just below, so it should
be 'static'.
> +
> +struct read_callbacks_ctx *get_read_callbacks_ctx(struct inode *inode,
> + struct bio *bio,
> + pgoff_t index)
> +{
> + unsigned int read_callbacks_steps = 0;
Rename 'read_callbacks_steps' => 'enabled_steps', since it's clear from context.
> + struct read_callbacks_ctx *ctx = NULL;
> +
> + if (IS_ENCRYPTED(inode) && S_ISREG(inode->i_mode))
> + read_callbacks_steps |= 1 << STEP_DECRYPT;
> +#ifdef CONFIG_FS_VERITY
> + if (inode->i_verity_info != NULL &&
> + (index < ((i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT)))
> + read_callbacks_steps |= 1 << STEP_VERITY;
> +#endif
To avoid the #ifdef, this should probably be made a function in fsverity.h.
> + if (read_callbacks_steps) {
> + ctx = mempool_alloc(read_callbacks_ctx_pool, GFP_NOFS);
> + if (!ctx)
> + return ERR_PTR(-ENOMEM);
> + ctx->bio = bio;
> + ctx->inode = inode;
> + ctx->enabled_steps = read_callbacks_steps;
> + ctx->cur_step = STEP_INITIAL;
> + bio->bi_private = ctx;
> + }
> + return ctx;
> +}
> +EXPORT_SYMBOL(get_read_callbacks_ctx);
The callers don't actually use the returned read_callbacks_ctx. Instead, they
rely on this function storing it in ->bi_private. So, this function should just
return an error code, and it should be renamed. Perhaps:
int read_callbacks_setup_bio(struct inode *inode, struct bio *bio,
pgoff_t first_pgoff);
Please rename 'index' to 'first_pgoff' to make it clearer what it is, given that
a bio can contain many pages.
Please add kerneldoc for this function.
- Eric
On Sun, Apr 28, 2019 at 10:01:08AM +0530, Chandan Rajendra wrote:
> With these changes in place, the patchset changes Ext4 to use
> mpage_readpage[s] instead of its own custom ext4_readpage[s]()
> functions. This is done to reduce duplicity of code across
FYI, "duplicity" means "lying". You meant "duplication".
On Sun, Apr 28, 2019 at 10:01:15AM +0530, Chandan Rajendra wrote:
> To support decryption of sub-pagesized blocks this commit adds code to,
> 1. Track buffer head in "struct read_callbacks_ctx".
> 2. Pass buffer head argument to all read callbacks.
> 3. In the corresponding endio, loop across all the blocks mapped by the
> page, decrypting each block in turn.
>
> Signed-off-by: Chandan Rajendra <[email protected]>
> ---
> fs/buffer.c | 83 +++++++++++++++++++++++++---------
> fs/crypto/bio.c | 50 +++++++++++++-------
> fs/crypto/crypto.c | 19 +++++++-
> fs/f2fs/data.c | 2 +-
> fs/mpage.c | 2 +-
> fs/read_callbacks.c | 53 ++++++++++++++--------
> include/linux/buffer_head.h | 1 +
> include/linux/read_callbacks.h | 5 +-
> 8 files changed, 154 insertions(+), 61 deletions(-)
>
> diff --git a/fs/buffer.c b/fs/buffer.c
> index ce357602f471..f324727e24bb 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -45,6 +45,7 @@
> #include <linux/bit_spinlock.h>
> #include <linux/pagevec.h>
> #include <linux/sched/mm.h>
> +#include <linux/read_callbacks.h>
> #include <trace/events/block.h>
>
> static int fsync_buffers_list(spinlock_t *lock, struct list_head *list);
> @@ -245,11 +246,7 @@ __find_get_block_slow(struct block_device *bdev, sector_t block)
> return ret;
> }
>
> -/*
> - * I/O completion handler for block_read_full_page() - pages
> - * which come unlocked at the end of I/O.
> - */
> -static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
> +void end_buffer_page_read(struct buffer_head *bh)
I think __end_buffer_async_read() would be a better name, since the *page* isn't
necessarily done yet.
> {
> unsigned long flags;
> struct buffer_head *first;
> @@ -257,17 +254,7 @@ static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
> struct page *page;
> int page_uptodate = 1;
>
> - BUG_ON(!buffer_async_read(bh));
> -
> page = bh->b_page;
> - if (uptodate) {
> - set_buffer_uptodate(bh);
> - } else {
> - clear_buffer_uptodate(bh);
> - buffer_io_error(bh, ", async page read");
> - SetPageError(page);
> - }
> -
> /*
> * Be _very_ careful from here on. Bad things can happen if
> * two buffer heads end IO at almost the same time and both
> @@ -305,6 +292,44 @@ static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
> local_irq_restore(flags);
> return;
> }
> +EXPORT_SYMBOL(end_buffer_page_read);
No need for EXPORT_SYMBOL() here, as this is only called by built-in code.
> +
> +/*
> + * I/O completion handler for block_read_full_page() - pages
> + * which come unlocked at the end of I/O.
> + */
This comment is no longer correct. Change to something like:
/*
* I/O completion handler for block_read_full_page(). Pages are unlocked after
* the I/O completes and the read callbacks (if any) have executed.
*/
> +static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
> +{
> + struct page *page;
> +
> + BUG_ON(!buffer_async_read(bh));
> +
> +#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
> + if (uptodate && bh->b_private) {
> + struct read_callbacks_ctx *ctx = bh->b_private;
> +
> + read_callbacks(ctx);
> + return;
> + }
> +
> + if (bh->b_private) {
> + struct read_callbacks_ctx *ctx = bh->b_private;
> +
> + WARN_ON(uptodate);
> + put_read_callbacks_ctx(ctx);
> + }
> +#endif
These details should be handled in read_callbacks code, not here. AFAICS, all
you need is a function read_callbacks_end_bh() that returns a bool indicating
whether it handled the buffer_head or not:
static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
{
BUG_ON(!buffer_async_read(bh));
if (read_callbacks_end_bh(bh, uptodate))
return;
page = bh->b_page;
...
}
Then read_callbacks_end_bh() would check ->b_private and uptodate, and call
read_callbacks() or put_read_callbacks_ctx() as appropriate. When
CONFIG_FS_READ_CALLBACKS=n it would be a stub that always returns false.
> + page = bh->b_page;
[...]
> }
> @@ -2292,11 +2323,21 @@ int block_read_full_page(struct page *page, get_block_t *get_block)
> * the underlying blockdev brought it uptodate (the sct fix).
> */
> for (i = 0; i < nr; i++) {
> - bh = arr[i];
> - if (buffer_uptodate(bh))
> + bh = arr[i].bh;
> + if (buffer_uptodate(bh)) {
> end_buffer_async_read(bh, 1);
> - else
> + } else {
> +#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
> + struct read_callbacks_ctx *ctx;
> +
> + ctx = get_read_callbacks_ctx(inode, NULL, bh, arr[i].blk_nr);
> + if (WARN_ON(IS_ERR(ctx))) {
> + end_buffer_async_read(bh, 0);
> + continue;
> + }
> +#endif
> submit_bh(REQ_OP_READ, 0, bh);
> + }
> }
> return 0;
Similarly here. This level of detail doesn't need to be exposed outside of the
read_callbacks code. Just call read_callbacks_setup_bh() or something, make it
return an 'err' rather than the read_callbacks_ctx, and make read_callbacks.h
stub it out when !CONFIG_FS_READ_CALLBACKS. There should be no #ifdef here.
> diff --git a/fs/crypto/bio.c b/fs/crypto/bio.c
> index 27f5618174f2..856f4694902d 100644
> --- a/fs/crypto/bio.c
> +++ b/fs/crypto/bio.c
> @@ -24,44 +24,62 @@
> #include <linux/module.h>
> #include <linux/bio.h>
> #include <linux/namei.h>
> +#include <linux/buffer_head.h>
> #include <linux/read_callbacks.h>
>
> #include "fscrypt_private.h"
>
> -static void __fscrypt_decrypt_bio(struct bio *bio, bool done)
> +static void fscrypt_decrypt(struct bio *bio, struct buffer_head *bh)
> {
> + struct inode *inode;
> + struct page *page;
> struct bio_vec *bv;
> + sector_t blk_nr;
> + int ret;
> int i;
> struct bvec_iter_all iter_all;
>
> - bio_for_each_segment_all(bv, bio, i, iter_all) {
> - struct page *page = bv->bv_page;
> - int ret = fscrypt_decrypt_page(page->mapping->host, page,
> - PAGE_SIZE, 0, page->index);
> + WARN_ON(!bh && !bio);
>
> + if (bh) {
> + page = bh->b_page;
> + inode = page->mapping->host;
> +
> + blk_nr = page->index << (PAGE_SHIFT - inode->i_blkbits);
> + blk_nr += (bh_offset(bh) >> inode->i_blkbits);
> +
> + ret = fscrypt_decrypt_page(inode, page, i_blocksize(inode),
> + bh_offset(bh), blk_nr);
> if (ret) {
> WARN_ON_ONCE(1);
> SetPageError(page);
> - } else if (done) {
> - SetPageUptodate(page);
> }
> - if (done)
> - unlock_page(page);
> + } else if (bio) {
> + bio_for_each_segment_all(bv, bio, i, iter_all) {
> + unsigned int blkbits;
> +
> + page = bv->bv_page;
> + inode = page->mapping->host;
> + blkbits = inode->i_blkbits;
> + blk_nr = page->index << (PAGE_SHIFT - blkbits);
> + blk_nr += (bv->bv_offset >> blkbits);
> + ret = fscrypt_decrypt_page(page->mapping->host,
> + page, bv->bv_len,
> + bv->bv_offset, blk_nr);
> + if (ret) {
> + WARN_ON_ONCE(1);
> + SetPageError(page);
> + }
> + }
> }
> }
For clarity, can you make these two different functions?
fscrypt_decrypt_bio() and fscrypt_decrypt_bh().
FYI, the WARN_ON_ONCE() here was removed in the latest fscrypt tree.
>
> -void fscrypt_decrypt_bio(struct bio *bio)
> -{
> - __fscrypt_decrypt_bio(bio, false);
> -}
> -EXPORT_SYMBOL(fscrypt_decrypt_bio);
> -
> void fscrypt_decrypt_work(struct work_struct *work)
> {
> struct read_callbacks_ctx *ctx =
> container_of(work, struct read_callbacks_ctx, work);
>
> - fscrypt_decrypt_bio(ctx->bio);
> + fscrypt_decrypt(ctx->bio, ctx->bh);
>
> read_callbacks(ctx);
> }
> diff --git a/fs/crypto/crypto.c b/fs/crypto/crypto.c
> index ffa9302a7351..4f0d832cae71 100644
> --- a/fs/crypto/crypto.c
> +++ b/fs/crypto/crypto.c
> @@ -305,11 +305,26 @@ EXPORT_SYMBOL(fscrypt_encrypt_page);
> int fscrypt_decrypt_page(const struct inode *inode, struct page *page,
> unsigned int len, unsigned int offs, u64 lblk_num)
> {
> + int i, page_nr_blks;
> + int err = 0;
> +
> if (!(inode->i_sb->s_cop->flags & FS_CFLG_OWN_PAGES))
> BUG_ON(!PageLocked(page));
>
> - return fscrypt_do_page_crypto(inode, FS_DECRYPT, lblk_num, page, page,
> - len, offs, GFP_NOFS);
> + page_nr_blks = len >> inode->i_blkbits;
> +
> + for (i = 0; i < page_nr_blks; i++) {
> + err = fscrypt_do_page_crypto(inode, FS_DECRYPT, lblk_num,
> + page, page, i_blocksize(inode), offs,
> + GFP_NOFS);
> + if (err)
> + break;
> +
> + ++lblk_num;
> + offs += i_blocksize(inode);
> + }
> +
> + return err;
> }
> EXPORT_SYMBOL(fscrypt_decrypt_page);
I was confused by the code calling this until I saw you updated it to handle
multiple blocks. Can you please rename it to fscrypt_decrypt_blocks()? The
function comment also needs to be updated to clarify what it does now (decrypt a
contiguous sequence of one or more filesystem blocks in the page). Also,
'lblk_num' should be renamed to 'starting_lblk_num' or similar.
Please also rename fscrypt_do_page_crypto() to fscrypt_crypt_block().
Also, there should be a check that the len and offset are block-aligned:
const unsigned int blocksize = i_blocksize(inode);
if (!IS_ALIGNED(len | offs, blocksize))
return -EINVAL;
>
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index 05430d3650ab..ba437a2085e7 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -527,7 +527,7 @@ static struct bio *f2fs_grab_read_bio(struct inode *inode, block_t blkaddr,
> bio_set_op_attrs(bio, REQ_OP_READ, op_flag);
>
> #if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
> - ctx = get_read_callbacks_ctx(inode, bio, first_idx);
> + ctx = get_read_callbacks_ctx(inode, bio, NULL, first_idx);
> if (IS_ERR(ctx)) {
> bio_put(bio);
> return (struct bio *)ctx;
> diff --git a/fs/mpage.c b/fs/mpage.c
> index e342b859ee44..0557479fdca4 100644
> --- a/fs/mpage.c
> +++ b/fs/mpage.c
> @@ -348,7 +348,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args)
> goto confused;
>
> #if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
> - ctx = get_read_callbacks_ctx(inode, args->bio, page->index);
> + ctx = get_read_callbacks_ctx(inode, args->bio, NULL, page->index);
> if (IS_ERR(ctx)) {
> bio_put(args->bio);
> args->bio = NULL;
> diff --git a/fs/read_callbacks.c b/fs/read_callbacks.c
> index 6dea54b0baa9..b3881c525720 100644
> --- a/fs/read_callbacks.c
> +++ b/fs/read_callbacks.c
> @@ -8,6 +8,7 @@
> #include <linux/mm.h>
> #include <linux/pagemap.h>
> #include <linux/bio.h>
> +#include <linux/buffer_head.h>
> #include <linux/fscrypt.h>
> #include <linux/fsverity.h>
> #include <linux/read_callbacks.h>
> @@ -24,26 +25,41 @@ enum read_callbacks_step {
> STEP_VERITY,
> };
>
> -void end_read_callbacks(struct bio *bio)
> +void end_read_callbacks(struct bio *bio, struct buffer_head *bh)
> {
> + struct read_callbacks_ctx *ctx;
> struct page *page;
> struct bio_vec *bv;
> int i;
> struct bvec_iter_all iter_all;
>
> - bio_for_each_segment_all(bv, bio, i, iter_all) {
> - page = bv->bv_page;
> + if (bh) {
> + if (!PageError(bh->b_page))
> + set_buffer_uptodate(bh);
>
> - BUG_ON(bio->bi_status);
> + ctx = bh->b_private;
>
> - if (!PageError(page))
> - SetPageUptodate(page);
> + end_buffer_page_read(bh);
>
> - unlock_page(page);
> + put_read_callbacks_ctx(ctx);
> + } else if (bio) {
> + bio_for_each_segment_all(bv, bio, i, iter_all) {
> + page = bv->bv_page;
> +
> + WARN_ON(bio->bi_status);
> +
> + if (!PageError(page))
> + SetPageUptodate(page);
> +
> + unlock_page(page);
> + }
> + WARN_ON(!bio->bi_private);
> +
> + ctx = bio->bi_private;
> + put_read_callbacks_ctx(ctx);
> +
> + bio_put(bio);
> }
> - if (bio->bi_private)
> - put_read_callbacks_ctx(bio->bi_private);
> - bio_put(bio);
> }
> EXPORT_SYMBOL(end_read_callbacks);
To make this easier to read, can you split this into end_read_callbacks_bio()
and end_read_callbacks_bh()?
>
> @@ -70,18 +86,21 @@ void read_callbacks(struct read_callbacks_ctx *ctx)
> ctx->cur_step++;
> /* fall-through */
> default:
> - end_read_callbacks(ctx->bio);
> + end_read_callbacks(ctx->bio, ctx->bh);
> }
> }
> EXPORT_SYMBOL(read_callbacks);
>
> struct read_callbacks_ctx *get_read_callbacks_ctx(struct inode *inode,
> struct bio *bio,
> + struct buffer_head *bh,
> pgoff_t index)
> {
> unsigned int read_callbacks_steps = 0;
> struct read_callbacks_ctx *ctx = NULL;
>
> + WARN_ON(!bh && !bio);
> +
If this condition is true, return an error code; don't continue on.
> if (IS_ENCRYPTED(inode) && S_ISREG(inode->i_mode))
> read_callbacks_steps |= 1 << STEP_DECRYPT;
> #ifdef CONFIG_FS_VERITY
> @@ -95,11 +114,15 @@ struct read_callbacks_ctx *get_read_callbacks_ctx(struct inode *inode,
> ctx = mempool_alloc(read_callbacks_ctx_pool, GFP_NOFS);
> if (!ctx)
> return ERR_PTR(-ENOMEM);
> + ctx->bh = bh;
> ctx->bio = bio;
> ctx->inode = inode;
> ctx->enabled_steps = read_callbacks_steps;
> ctx->cur_step = STEP_INITIAL;
> - bio->bi_private = ctx;
> + if (bio)
> + bio->bi_private = ctx;
> + else if (bh)
> + bh->b_private = ctx;
... and if doing that, then you don't need to check 'else if (bh)' here.
> }
> return ctx;
> }
> @@ -111,12 +134,6 @@ void put_read_callbacks_ctx(struct read_callbacks_ctx *ctx)
> }
> EXPORT_SYMBOL(put_read_callbacks_ctx);
>
> -bool read_callbacks_required(struct bio *bio)
> -{
> - return bio->bi_private && !bio->bi_status;
> -}
> -EXPORT_SYMBOL(read_callbacks_required);
> -
It's unexpected that the patch series introduces this function,
only to delete it later.
- Eric
On 2019/4/28 12:31, Chandan Rajendra wrote:
> The "read callbacks" code is used by both Ext4 and F2FS. Hence to
> remove duplicity, this commit moves the code into
> include/linux/read_callbacks.h and fs/read_callbacks.c.
>
> The corresponding decrypt and verity "work" functions have been moved
> inside fscrypt and fsverity sources. With these in place, the read
> callbacks code now has to just invoke enqueue functions provided by
> fscrypt and fsverity.
>
> Signed-off-by: Chandan Rajendra <[email protected]>
> ---
> fs/Kconfig | 4 +
> fs/Makefile | 4 +
> fs/crypto/Kconfig | 1 +
> fs/crypto/bio.c | 23 ++---
> fs/crypto/crypto.c | 17 +--
> fs/crypto/fscrypt_private.h | 3 +
> fs/ext4/ext4.h | 2 -
> fs/ext4/readpage.c | 183 +++++----------------------------
> fs/ext4/super.c | 9 +-
> fs/f2fs/data.c | 148 ++++----------------------
> fs/f2fs/super.c | 9 +-
> fs/read_callbacks.c | 136 ++++++++++++++++++++++++
> fs/verity/Kconfig | 1 +
> fs/verity/verify.c | 12 +++
> include/linux/fscrypt.h | 20 +---
> include/linux/read_callbacks.h | 21 ++++
> 16 files changed, 251 insertions(+), 342 deletions(-)
> create mode 100644 fs/read_callbacks.c
> create mode 100644 include/linux/read_callbacks.h
>
> diff --git a/fs/Kconfig b/fs/Kconfig
> index 97f9eb8df713..03084f2dbeaf 100644
> --- a/fs/Kconfig
> +++ b/fs/Kconfig
> @@ -308,6 +308,10 @@ config NFS_COMMON
> depends on NFSD || NFS_FS || LOCKD
> default y
>
> +config FS_READ_CALLBACKS
> + bool
> + default n
> +
> source "net/sunrpc/Kconfig"
> source "fs/ceph/Kconfig"
> source "fs/cifs/Kconfig"
> diff --git a/fs/Makefile b/fs/Makefile
> index 9dd2186e74b5..e0c0fce8cf40 100644
> --- a/fs/Makefile
> +++ b/fs/Makefile
> @@ -21,6 +21,10 @@ else
> obj-y += no-block.o
> endif
>
> +ifeq ($(CONFIG_FS_READ_CALLBACKS),y)
> +obj-y += read_callbacks.o
> +endif
> +
> obj-$(CONFIG_PROC_FS) += proc_namespace.o
>
> obj-y += notify/
> diff --git a/fs/crypto/Kconfig b/fs/crypto/Kconfig
> index f0de238000c0..163c328bcbd4 100644
> --- a/fs/crypto/Kconfig
> +++ b/fs/crypto/Kconfig
> @@ -8,6 +8,7 @@ config FS_ENCRYPTION
> select CRYPTO_CTS
> select CRYPTO_SHA256
> select KEYS
> + select FS_READ_CALLBACKS
> help
> Enable encryption of files and directories. This
> feature is similar to ecryptfs, but it is more memory
> diff --git a/fs/crypto/bio.c b/fs/crypto/bio.c
> index 5759bcd018cd..27f5618174f2 100644
> --- a/fs/crypto/bio.c
> +++ b/fs/crypto/bio.c
> @@ -24,6 +24,8 @@
> #include <linux/module.h>
> #include <linux/bio.h>
> #include <linux/namei.h>
> +#include <linux/read_callbacks.h>
> +
> #include "fscrypt_private.h"
>
> static void __fscrypt_decrypt_bio(struct bio *bio, bool done)
> @@ -54,24 +56,15 @@ void fscrypt_decrypt_bio(struct bio *bio)
> }
> EXPORT_SYMBOL(fscrypt_decrypt_bio);
>
> -static void completion_pages(struct work_struct *work)
> +void fscrypt_decrypt_work(struct work_struct *work)
> {
> - struct fscrypt_ctx *ctx =
> - container_of(work, struct fscrypt_ctx, r.work);
> - struct bio *bio = ctx->r.bio;
> + struct read_callbacks_ctx *ctx =
> + container_of(work, struct read_callbacks_ctx, work);
>
> - __fscrypt_decrypt_bio(bio, true);
> - fscrypt_release_ctx(ctx);
> - bio_put(bio);
> -}
> + fscrypt_decrypt_bio(ctx->bio);
>
> -void fscrypt_enqueue_decrypt_bio(struct fscrypt_ctx *ctx, struct bio *bio)
> -{
> - INIT_WORK(&ctx->r.work, completion_pages);
> - ctx->r.bio = bio;
> - fscrypt_enqueue_decrypt_work(&ctx->r.work);
> + read_callbacks(ctx);
> }
> -EXPORT_SYMBOL(fscrypt_enqueue_decrypt_bio);
>
> void fscrypt_pullback_bio_page(struct page **page, bool restore)
> {
> @@ -87,7 +80,7 @@ void fscrypt_pullback_bio_page(struct page **page, bool restore)
> ctx = (struct fscrypt_ctx *)page_private(bounce_page);
>
> /* restore control page */
> - *page = ctx->w.control_page;
> + *page = ctx->control_page;
>
> if (restore)
> fscrypt_restore_control_page(bounce_page);
> diff --git a/fs/crypto/crypto.c b/fs/crypto/crypto.c
> index 3fc84bf2b1e5..ffa9302a7351 100644
> --- a/fs/crypto/crypto.c
> +++ b/fs/crypto/crypto.c
> @@ -53,6 +53,7 @@ struct kmem_cache *fscrypt_info_cachep;
>
> void fscrypt_enqueue_decrypt_work(struct work_struct *work)
> {
> + INIT_WORK(work, fscrypt_decrypt_work);
> queue_work(fscrypt_read_workqueue, work);
> }
> EXPORT_SYMBOL(fscrypt_enqueue_decrypt_work);
> @@ -70,11 +71,11 @@ void fscrypt_release_ctx(struct fscrypt_ctx *ctx)
> {
> unsigned long flags;
>
> - if (ctx->flags & FS_CTX_HAS_BOUNCE_BUFFER_FL && ctx->w.bounce_page) {
> - mempool_free(ctx->w.bounce_page, fscrypt_bounce_page_pool);
> - ctx->w.bounce_page = NULL;
> + if (ctx->flags & FS_CTX_HAS_BOUNCE_BUFFER_FL && ctx->bounce_page) {
> + mempool_free(ctx->bounce_page, fscrypt_bounce_page_pool);
> + ctx->bounce_page = NULL;
> }
> - ctx->w.control_page = NULL;
> + ctx->control_page = NULL;
> if (ctx->flags & FS_CTX_REQUIRES_FREE_ENCRYPT_FL) {
> kmem_cache_free(fscrypt_ctx_cachep, ctx);
> } else {
> @@ -194,11 +195,11 @@ int fscrypt_do_page_crypto(const struct inode *inode, fscrypt_direction_t rw,
> struct page *fscrypt_alloc_bounce_page(struct fscrypt_ctx *ctx,
> gfp_t gfp_flags)
> {
> - ctx->w.bounce_page = mempool_alloc(fscrypt_bounce_page_pool, gfp_flags);
> - if (ctx->w.bounce_page == NULL)
> + ctx->bounce_page = mempool_alloc(fscrypt_bounce_page_pool, gfp_flags);
> + if (ctx->bounce_page == NULL)
> return ERR_PTR(-ENOMEM);
> ctx->flags |= FS_CTX_HAS_BOUNCE_BUFFER_FL;
> - return ctx->w.bounce_page;
> + return ctx->bounce_page;
> }
>
> /**
> @@ -267,7 +268,7 @@ struct page *fscrypt_encrypt_page(const struct inode *inode,
> if (IS_ERR(ciphertext_page))
> goto errout;
>
> - ctx->w.control_page = page;
> + ctx->control_page = page;
> err = fscrypt_do_page_crypto(inode, FS_ENCRYPT, lblk_num,
> page, ciphertext_page, len, offs,
> gfp_flags);
> diff --git a/fs/crypto/fscrypt_private.h b/fs/crypto/fscrypt_private.h
> index 7da276159593..412a3bcf9efd 100644
> --- a/fs/crypto/fscrypt_private.h
> +++ b/fs/crypto/fscrypt_private.h
> @@ -114,6 +114,9 @@ static inline bool fscrypt_valid_enc_modes(u32 contents_mode,
> return false;
> }
>
> +/* bio.c */
> +void fscrypt_decrypt_work(struct work_struct *work);
> +
> /* crypto.c */
> extern struct kmem_cache *fscrypt_info_cachep;
> extern int fscrypt_initialize(unsigned int cop_flags);
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index f2b0e628ff7b..23f8568c9b53 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -3127,8 +3127,6 @@ static inline void ext4_set_de_type(struct super_block *sb,
> extern int ext4_mpage_readpages(struct address_space *mapping,
> struct list_head *pages, struct page *page,
> unsigned nr_pages, bool is_readahead);
> -extern int __init ext4_init_post_read_processing(void);
> -extern void ext4_exit_post_read_processing(void);
>
> /* symlink.c */
> extern const struct inode_operations ext4_encrypted_symlink_inode_operations;
> diff --git a/fs/ext4/readpage.c b/fs/ext4/readpage.c
> index 0169e3809da3..e363dededc21 100644
> --- a/fs/ext4/readpage.c
> +++ b/fs/ext4/readpage.c
> @@ -44,14 +44,10 @@
> #include <linux/backing-dev.h>
> #include <linux/pagevec.h>
> #include <linux/cleancache.h>
> +#include <linux/read_callbacks.h>
>
> #include "ext4.h"
>
> -#define NUM_PREALLOC_POST_READ_CTXS 128
> -
> -static struct kmem_cache *bio_post_read_ctx_cache;
> -static mempool_t *bio_post_read_ctx_pool;
> -
> static inline bool ext4_bio_encrypted(struct bio *bio)
> {
> #ifdef CONFIG_FS_ENCRYPTION
> @@ -61,125 +57,6 @@ static inline bool ext4_bio_encrypted(struct bio *bio)
> #endif
> }
>
> -/* postprocessing steps for read bios */
> -enum bio_post_read_step {
> - STEP_INITIAL = 0,
> - STEP_DECRYPT,
> - STEP_VERITY,
> -};
> -
> -struct bio_post_read_ctx {
> - struct bio *bio;
> - struct work_struct work;
> - unsigned int cur_step;
> - unsigned int enabled_steps;
> -};
> -
> -static void __read_end_io(struct bio *bio)
> -{
> - struct page *page;
> - struct bio_vec *bv;
> - int i;
> - struct bvec_iter_all iter_all;
> -
> - bio_for_each_segment_all(bv, bio, i, iter_all) {
> - page = bv->bv_page;
> -
> - /* PG_error was set if any post_read step failed */
> - if (bio->bi_status || PageError(page)) {
> - ClearPageUptodate(page);
> - SetPageError(page);
> - } else {
> - SetPageUptodate(page);
> - }
> - unlock_page(page);
> - }
> - if (bio->bi_private)
> - mempool_free(bio->bi_private, bio_post_read_ctx_pool);
> - bio_put(bio);
> -}
> -
> -static void bio_post_read_processing(struct bio_post_read_ctx *ctx);
> -
> -static void decrypt_work(struct work_struct *work)
> -{
> - struct bio_post_read_ctx *ctx =
> - container_of(work, struct bio_post_read_ctx, work);
> -
> - fscrypt_decrypt_bio(ctx->bio);
> -
> - bio_post_read_processing(ctx);
> -}
> -
> -static void verity_work(struct work_struct *work)
> -{
> - struct bio_post_read_ctx *ctx =
> - container_of(work, struct bio_post_read_ctx, work);
> -
> - fsverity_verify_bio(ctx->bio);
> -
> - bio_post_read_processing(ctx);
> -}
> -
> -static void bio_post_read_processing(struct bio_post_read_ctx *ctx)
> -{
> - /*
> - * We use different work queues for decryption and for verity because
> - * verity may require reading metadata pages that need decryption, and
> - * we shouldn't recurse to the same workqueue.
> - */
> - switch (++ctx->cur_step) {
> - case STEP_DECRYPT:
> - if (ctx->enabled_steps & (1 << STEP_DECRYPT)) {
> - INIT_WORK(&ctx->work, decrypt_work);
> - fscrypt_enqueue_decrypt_work(&ctx->work);
> - return;
> - }
> - ctx->cur_step++;
> - /* fall-through */
> - case STEP_VERITY:
> - if (ctx->enabled_steps & (1 << STEP_VERITY)) {
> - INIT_WORK(&ctx->work, verity_work);
> - fsverity_enqueue_verify_work(&ctx->work);
> - return;
> - }
> - ctx->cur_step++;
> - /* fall-through */
> - default:
> - __read_end_io(ctx->bio);
> - }
> -}
> -
> -static struct bio_post_read_ctx *get_bio_post_read_ctx(struct inode *inode,
> - struct bio *bio,
> - pgoff_t index)
> -{
> - unsigned int post_read_steps = 0;
> - struct bio_post_read_ctx *ctx = NULL;
> -
> - if (IS_ENCRYPTED(inode) && S_ISREG(inode->i_mode))
> - post_read_steps |= 1 << STEP_DECRYPT;
> -#ifdef CONFIG_FS_VERITY
> - if (inode->i_verity_info != NULL &&
> - (index < ((i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT)))
> - post_read_steps |= 1 << STEP_VERITY;
> -#endif
> - if (post_read_steps) {
> - ctx = mempool_alloc(bio_post_read_ctx_pool, GFP_NOFS);
> - if (!ctx)
> - return ERR_PTR(-ENOMEM);
> - ctx->bio = bio;
> - ctx->enabled_steps = post_read_steps;
> - bio->bi_private = ctx;
> - }
> - return ctx;
> -}
> -
> -static bool bio_post_read_required(struct bio *bio)
> -{
> - return bio->bi_private && !bio->bi_status;
> -}
> -
> /*
> * I/O completion handler for multipage BIOs.
> *
> @@ -194,14 +71,30 @@ static bool bio_post_read_required(struct bio *bio)
> */
> static void mpage_end_io(struct bio *bio)
> {
> - if (bio_post_read_required(bio)) {
> - struct bio_post_read_ctx *ctx = bio->bi_private;
> + struct bio_vec *bv;
> + int i;
> + struct bvec_iter_all iter_all;
> +#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
> + if (read_callbacks_required(bio)) {
> + struct read_callbacks_ctx *ctx = bio->bi_private;
>
> - ctx->cur_step = STEP_INITIAL;
> - bio_post_read_processing(ctx);
> + read_callbacks(ctx);
> return;
> }
> - __read_end_io(bio);
> +#endif
> + bio_for_each_segment_all(bv, bio, i, iter_all) {
> + struct page *page = bv->bv_page;
> +
> + if (!bio->bi_status) {
> + SetPageUptodate(page);
> + } else {
> + ClearPageUptodate(page);
> + SetPageError(page);
> + }
> + unlock_page(page);
> + }
> +
> + bio_put(bio);
> }
>
> static inline loff_t ext4_readpage_limit(struct inode *inode)
> @@ -368,17 +261,19 @@ int ext4_mpage_readpages(struct address_space *mapping,
> bio = NULL;
> }
> if (bio == NULL) {
> - struct bio_post_read_ctx *ctx;
> + struct read_callbacks_ctx *ctx = NULL;
>
> bio = bio_alloc(GFP_KERNEL,
> min_t(int, nr_pages, BIO_MAX_PAGES));
> if (!bio)
> goto set_error_page;
> - ctx = get_bio_post_read_ctx(inode, bio, page->index);
> +#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
> + ctx = get_read_callbacks_ctx(inode, bio, page->index);
> if (IS_ERR(ctx)) {
> bio_put(bio);
> goto set_error_page;
> }
> +#endif
> bio_set_dev(bio, bdev);
> bio->bi_iter.bi_sector = blocks[0] << (blkbits - 9);
> bio->bi_end_io = mpage_end_io;
> @@ -417,29 +312,3 @@ int ext4_mpage_readpages(struct address_space *mapping,
> submit_bio(bio);
> return 0;
> }
> -
> -int __init ext4_init_post_read_processing(void)
> -{
> - bio_post_read_ctx_cache =
> - kmem_cache_create("ext4_bio_post_read_ctx",
> - sizeof(struct bio_post_read_ctx), 0, 0, NULL);
> - if (!bio_post_read_ctx_cache)
> - goto fail;
> - bio_post_read_ctx_pool =
> - mempool_create_slab_pool(NUM_PREALLOC_POST_READ_CTXS,
> - bio_post_read_ctx_cache);
> - if (!bio_post_read_ctx_pool)
> - goto fail_free_cache;
> - return 0;
> -
> -fail_free_cache:
> - kmem_cache_destroy(bio_post_read_ctx_cache);
> -fail:
> - return -ENOMEM;
> -}
> -
> -void ext4_exit_post_read_processing(void)
> -{
> - mempool_destroy(bio_post_read_ctx_pool);
> - kmem_cache_destroy(bio_post_read_ctx_cache);
> -}
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 4ae6f5849caa..aba724f82cc3 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -6101,10 +6101,6 @@ static int __init ext4_init_fs(void)
> return err;
>
> err = ext4_init_pending();
> - if (err)
> - goto out7;
> -
> - err = ext4_init_post_read_processing();
> if (err)
> goto out6;
>
> @@ -6146,10 +6142,8 @@ static int __init ext4_init_fs(void)
> out4:
> ext4_exit_pageio();
> out5:
> - ext4_exit_post_read_processing();
> -out6:
> ext4_exit_pending();
> -out7:
> +out6:
> ext4_exit_es();
>
> return err;
> @@ -6166,7 +6160,6 @@ static void __exit ext4_exit_fs(void)
> ext4_exit_sysfs();
> ext4_exit_system_zone();
> ext4_exit_pageio();
> - ext4_exit_post_read_processing();
> ext4_exit_es();
> ext4_exit_pending();
> }
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index 038b958d0fa9..05430d3650ab 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -18,6 +18,7 @@
> #include <linux/uio.h>
> #include <linux/cleancache.h>
> #include <linux/sched/signal.h>
> +#include <linux/read_callbacks.h>
>
> #include "f2fs.h"
> #include "node.h"
> @@ -25,11 +26,6 @@
> #include "trace.h"
> #include <trace/events/f2fs.h>
>
> -#define NUM_PREALLOC_POST_READ_CTXS 128
> -
> -static struct kmem_cache *bio_post_read_ctx_cache;
> -static mempool_t *bio_post_read_ctx_pool;
> -
> static bool __is_cp_guaranteed(struct page *page)
> {
> struct address_space *mapping = page->mapping;
> @@ -69,20 +65,6 @@ static enum count_type __read_io_type(struct page *page)
> return F2FS_RD_DATA;
> }
>
> -/* postprocessing steps for read bios */
> -enum bio_post_read_step {
> - STEP_INITIAL = 0,
> - STEP_DECRYPT,
> - STEP_VERITY,
> -};
> -
> -struct bio_post_read_ctx {
> - struct bio *bio;
> - struct work_struct work;
> - unsigned int cur_step;
> - unsigned int enabled_steps;
> -};
> -
> static void __read_end_io(struct bio *bio)
> {
> struct page *page;
> @@ -104,65 +86,16 @@ static void __read_end_io(struct bio *bio)
> dec_page_count(F2FS_P_SB(page), __read_io_type(page));
> unlock_page(page);
> }
> - if (bio->bi_private)
> - mempool_free(bio->bi_private, bio_post_read_ctx_pool);
> - bio_put(bio);
> -}
> -
> -static void bio_post_read_processing(struct bio_post_read_ctx *ctx);
>
> -static void decrypt_work(struct work_struct *work)
> -{
> - struct bio_post_read_ctx *ctx =
> - container_of(work, struct bio_post_read_ctx, work);
> -
> - fscrypt_decrypt_bio(ctx->bio);
> +#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
> + if (bio->bi_private) {
> + struct read_callbacks_ctx *ctx;
>
> - bio_post_read_processing(ctx);
> -}
> -
> -static void verity_work(struct work_struct *work)
> -{
> - struct bio_post_read_ctx *ctx =
> - container_of(work, struct bio_post_read_ctx, work);
> -
> - fsverity_verify_bio(ctx->bio);
> -
> - bio_post_read_processing(ctx);
> -}
> -
> -static void bio_post_read_processing(struct bio_post_read_ctx *ctx)
> -{
> - /*
> - * We use different work queues for decryption and for verity because
> - * verity may require reading metadata pages that need decryption, and
> - * we shouldn't recurse to the same workqueue.
> - */
> - switch (++ctx->cur_step) {
> - case STEP_DECRYPT:
> - if (ctx->enabled_steps & (1 << STEP_DECRYPT)) {
> - INIT_WORK(&ctx->work, decrypt_work);
> - fscrypt_enqueue_decrypt_work(&ctx->work);
> - return;
> - }
> - ctx->cur_step++;
> - /* fall-through */
> - case STEP_VERITY:
> - if (ctx->enabled_steps & (1 << STEP_VERITY)) {
> - INIT_WORK(&ctx->work, verity_work);
> - fsverity_enqueue_verify_work(&ctx->work);
> - return;
> - }
> - ctx->cur_step++;
> - /* fall-through */
> - default:
> - __read_end_io(ctx->bio);
> + ctx = bio->bi_private;
> + put_read_callbacks_ctx(ctx);
> }
> -}
> -
> -static bool f2fs_bio_post_read_required(struct bio *bio)
> -{
> - return bio->bi_private && !bio->bi_status;
> +#endif
> + bio_put(bio);
> }
>
> static void f2fs_read_end_io(struct bio *bio)
> @@ -173,14 +106,12 @@ static void f2fs_read_end_io(struct bio *bio)
> bio->bi_status = BLK_STS_IOERR;
> }
>
> - if (f2fs_bio_post_read_required(bio)) {
> - struct bio_post_read_ctx *ctx = bio->bi_private;
> -
> - ctx->cur_step = STEP_INITIAL;
> - bio_post_read_processing(ctx);
> +#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
> + if (!bio->bi_status && bio->bi_private) {
> + read_callbacks((struct read_callbacks_ctx *)(bio->bi_private));
> return;
> }
> -
> +#endif
> __read_end_io(bio);
> }

Previously, __read_end_io() decreased the in-flight read I/O count for each
page:

> @@ -104,65 +86,16 @@ static void __read_end_io(struct bio *bio)
> dec_page_count(F2FS_P_SB(page), __read_io_type(page));

but it looks like, in the case where fscrypt or fsverity is enabled and there
is no I/O error in end_io(), we now miss handling that count.

Thanks,
>
> @@ -582,9 +513,9 @@ static struct bio *f2fs_grab_read_bio(struct inode *inode, block_t blkaddr,
> {
> struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
> struct bio *bio;
> - struct bio_post_read_ctx *ctx;
> - unsigned int post_read_steps = 0;
> -
> +#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
> + struct read_callbacks_ctx *ctx;
> +#endif
> if (!f2fs_is_valid_blkaddr(sbi, blkaddr, DATA_GENERIC))
> return ERR_PTR(-EFAULT);
>
> @@ -595,24 +526,13 @@ static struct bio *f2fs_grab_read_bio(struct inode *inode, block_t blkaddr,
> bio->bi_end_io = f2fs_read_end_io;
> bio_set_op_attrs(bio, REQ_OP_READ, op_flag);
>
> - if (f2fs_encrypted_file(inode))
> - post_read_steps |= 1 << STEP_DECRYPT;
> -#ifdef CONFIG_FS_VERITY
> - if (inode->i_verity_info != NULL &&
> - (first_idx < ((i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT)))
> - post_read_steps |= 1 << STEP_VERITY;
> -#endif
> - if (post_read_steps) {
> - ctx = mempool_alloc(bio_post_read_ctx_pool, GFP_NOFS);
> - if (!ctx) {
> - bio_put(bio);
> - return ERR_PTR(-ENOMEM);
> - }
> - ctx->bio = bio;
> - ctx->enabled_steps = post_read_steps;
> - bio->bi_private = ctx;
> +#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
> + ctx = get_read_callbacks_ctx(inode, bio, first_idx);
> + if (IS_ERR(ctx)) {
> + bio_put(bio);
> + return (struct bio *)ctx;
> }
> -
> +#endif
> return bio;
> }
>
> @@ -2894,29 +2814,3 @@ void f2fs_clear_page_cache_dirty_tag(struct page *page)
> PAGECACHE_TAG_DIRTY);
> xa_unlock_irqrestore(&mapping->i_pages, flags);
> }
> -
> -int __init f2fs_init_post_read_processing(void)
> -{
> - bio_post_read_ctx_cache =
> - kmem_cache_create("f2fs_bio_post_read_ctx",
> - sizeof(struct bio_post_read_ctx), 0, 0, NULL);
> - if (!bio_post_read_ctx_cache)
> - goto fail;
> - bio_post_read_ctx_pool =
> - mempool_create_slab_pool(NUM_PREALLOC_POST_READ_CTXS,
> - bio_post_read_ctx_cache);
> - if (!bio_post_read_ctx_pool)
> - goto fail_free_cache;
> - return 0;
> -
> -fail_free_cache:
> - kmem_cache_destroy(bio_post_read_ctx_cache);
> -fail:
> - return -ENOMEM;
> -}
> -
> -void __exit f2fs_destroy_post_read_processing(void)
> -{
> - mempool_destroy(bio_post_read_ctx_pool);
> - kmem_cache_destroy(bio_post_read_ctx_cache);
> -}
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> index 0e187f67b206..2f75f06c784a 100644
> --- a/fs/f2fs/super.c
> +++ b/fs/f2fs/super.c
> @@ -3633,15 +3633,11 @@ static int __init init_f2fs_fs(void)
> err = register_filesystem(&f2fs_fs_type);
> if (err)
> goto free_shrinker;
> +
> f2fs_create_root_stats();
> - err = f2fs_init_post_read_processing();
> - if (err)
> - goto free_root_stats;
> +
> return 0;
>
> -free_root_stats:
> - f2fs_destroy_root_stats();
> - unregister_filesystem(&f2fs_fs_type);
> free_shrinker:
> unregister_shrinker(&f2fs_shrinker_info);
> free_sysfs:
> @@ -3662,7 +3658,6 @@ static int __init init_f2fs_fs(void)
>
> static void __exit exit_f2fs_fs(void)
> {
> - f2fs_destroy_post_read_processing();
> f2fs_destroy_root_stats();
> unregister_filesystem(&f2fs_fs_type);
> unregister_shrinker(&f2fs_shrinker_info);
> diff --git a/fs/read_callbacks.c b/fs/read_callbacks.c
> new file mode 100644
> index 000000000000..b6d5b95e67d7
> --- /dev/null
> +++ b/fs/read_callbacks.c
> @@ -0,0 +1,136 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * This file tracks the state machine that needs to be executed after reading
> + * data from files that are encrypted and/or have verity metadata associated
> + * with them.
> + */
> +#include <linux/module.h>
> +#include <linux/mm.h>
> +#include <linux/pagemap.h>
> +#include <linux/bio.h>
> +#include <linux/fscrypt.h>
> +#include <linux/fsverity.h>
> +#include <linux/read_callbacks.h>
> +
> +#define NUM_PREALLOC_POST_READ_CTXS 128
> +
> +static struct kmem_cache *read_callbacks_ctx_cache;
> +static mempool_t *read_callbacks_ctx_pool;
> +
> +/* Read callback state machine steps */
> +enum read_callbacks_step {
> + STEP_INITIAL = 0,
> + STEP_DECRYPT,
> + STEP_VERITY,
> +};
> +
> +void end_read_callbacks(struct bio *bio)
> +{
> + struct page *page;
> + struct bio_vec *bv;
> + int i;
> + struct bvec_iter_all iter_all;
> +
> + bio_for_each_segment_all(bv, bio, i, iter_all) {
> + page = bv->bv_page;
> +
> + BUG_ON(bio->bi_status);
> +
> + if (!PageError(page))
> + SetPageUptodate(page);
> +
> + unlock_page(page);
> + }
> + if (bio->bi_private)
> + put_read_callbacks_ctx(bio->bi_private);
> + bio_put(bio);
> +}
> +EXPORT_SYMBOL(end_read_callbacks);
> +
> +void read_callbacks(struct read_callbacks_ctx *ctx)
> +{
> + /*
> + * We use different work queues for decryption and for verity because
> + * verity may require reading metadata pages that need decryption, and
> + * we shouldn't recurse to the same workqueue.
> + */
> + switch (++ctx->cur_step) {
> + case STEP_DECRYPT:
> + if (ctx->enabled_steps & (1 << STEP_DECRYPT)) {
> + fscrypt_enqueue_decrypt_work(&ctx->work);
> + return;
> + }
> + ctx->cur_step++;
> + /* fall-through */
> + case STEP_VERITY:
> + if (ctx->enabled_steps & (1 << STEP_VERITY)) {
> + fsverity_enqueue_verify_work(&ctx->work);
> + return;
> + }
> + ctx->cur_step++;
> + /* fall-through */
> + default:
> + end_read_callbacks(ctx->bio);
> + }
> +}
> +EXPORT_SYMBOL(read_callbacks);
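To make the control flow above easier to follow, here is a hypothetical
stand-alone userspace model of the step machine (step numbers and the
enabled_steps bitmask mirror the kernel code, but the work-queue hand-off is
replaced by a plain return value — this is a sketch, not the kernel
implementation):

```c
#include <assert.h>

/* Userspace model of the read_callbacks state machine. */
enum step { STEP_INITIAL = 0, STEP_DECRYPT, STEP_VERITY, STEP_DONE };

struct ctx {
	unsigned int cur_step;
	unsigned int enabled_steps;	/* bitmask of enabled steps */
};

/* Advance to the next enabled step; STEP_DONE models the final
 * end_read_callbacks() call. */
static enum step next_step(struct ctx *ctx)
{
	switch (++ctx->cur_step) {
	case STEP_DECRYPT:
		if (ctx->enabled_steps & (1u << STEP_DECRYPT))
			return STEP_DECRYPT;
		ctx->cur_step++;
		/* fall through */
	case STEP_VERITY:
		if (ctx->enabled_steps & (1u << STEP_VERITY))
			return STEP_VERITY;
		ctx->cur_step++;
		/* fall through */
	default:
		return STEP_DONE;
	}
}
```

Each call models one pass through read_callbacks(): an enabled step is handed
off (here, returned), a disabled step is skipped, and the machine terminates
once both steps have been passed.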
> +
> +struct read_callbacks_ctx *get_read_callbacks_ctx(struct inode *inode,
> + struct bio *bio,
> + pgoff_t index)
> +{
> + unsigned int read_callbacks_steps = 0;
> + struct read_callbacks_ctx *ctx = NULL;
> +
> + if (IS_ENCRYPTED(inode) && S_ISREG(inode->i_mode))
> + read_callbacks_steps |= 1 << STEP_DECRYPT;
> +#ifdef CONFIG_FS_VERITY
> + if (inode->i_verity_info != NULL &&
> + (index < ((i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT)))
> + read_callbacks_steps |= 1 << STEP_VERITY;
> +#endif
> + if (read_callbacks_steps) {
> + ctx = mempool_alloc(read_callbacks_ctx_pool, GFP_NOFS);
> + if (!ctx)
> + return ERR_PTR(-ENOMEM);
> + ctx->bio = bio;
> + ctx->inode = inode;
> + ctx->enabled_steps = read_callbacks_steps;
> + ctx->cur_step = STEP_INITIAL;
> + bio->bi_private = ctx;
> + }
> + return ctx;
> +}
> +EXPORT_SYMBOL(get_read_callbacks_ctx);
> +
> +void put_read_callbacks_ctx(struct read_callbacks_ctx *ctx)
> +{
> + mempool_free(ctx, read_callbacks_ctx_pool);
> +}
> +EXPORT_SYMBOL(put_read_callbacks_ctx);
> +
> +bool read_callbacks_required(struct bio *bio)
> +{
> + return bio->bi_private && !bio->bi_status;
> +}
> +EXPORT_SYMBOL(read_callbacks_required);
> +
> +static int __init init_read_callbacks(void)
> +{
> + read_callbacks_ctx_cache = KMEM_CACHE(read_callbacks_ctx, 0);
> + if (!read_callbacks_ctx_cache)
> + goto fail;
> + read_callbacks_ctx_pool =
> + mempool_create_slab_pool(NUM_PREALLOC_POST_READ_CTXS,
> + read_callbacks_ctx_cache);
> + if (!read_callbacks_ctx_pool)
> + goto fail_free_cache;
> + return 0;
> +
> +fail_free_cache:
> + kmem_cache_destroy(read_callbacks_ctx_cache);
> +fail:
> + return -ENOMEM;
> +}
> +
> +fs_initcall(init_read_callbacks);
> diff --git a/fs/verity/Kconfig b/fs/verity/Kconfig
> index 563593fb42db..44784e57c99d 100644
> --- a/fs/verity/Kconfig
> +++ b/fs/verity/Kconfig
> @@ -4,6 +4,7 @@ config FS_VERITY
> # SHA-256 is selected as it's intended to be the default hash algorithm.
> # To avoid bloat, other wanted algorithms must be selected explicitly.
> select CRYPTO_SHA256
> + select FS_READ_CALLBACKS
> help
> This option enables fs-verity. fs-verity is the dm-verity
> mechanism implemented at the file level. On supported
> diff --git a/fs/verity/verify.c b/fs/verity/verify.c
> index 5732453a81e7..f93bee33872d 100644
> --- a/fs/verity/verify.c
> +++ b/fs/verity/verify.c
> @@ -13,6 +13,7 @@
> #include <linux/pagemap.h>
> #include <linux/ratelimit.h>
> #include <linux/scatterlist.h>
> +#include <linux/read_callbacks.h>
>
> struct workqueue_struct *fsverity_read_workqueue;
>
> @@ -284,6 +285,16 @@ void fsverity_verify_bio(struct bio *bio)
> EXPORT_SYMBOL_GPL(fsverity_verify_bio);
> #endif /* CONFIG_BLOCK */
>
> +static void fsverity_verify_work(struct work_struct *work)
> +{
> + struct read_callbacks_ctx *ctx =
> + container_of(work, struct read_callbacks_ctx, work);
> +
> + fsverity_verify_bio(ctx->bio);
> +
> + read_callbacks(ctx);
> +}
> +
> /**
> * fsverity_enqueue_verify_work - enqueue work on the fs-verity workqueue
> *
> @@ -291,6 +302,7 @@ EXPORT_SYMBOL_GPL(fsverity_verify_bio);
> */
> void fsverity_enqueue_verify_work(struct work_struct *work)
> {
> + INIT_WORK(work, fsverity_verify_work);
> queue_work(fsverity_read_workqueue, work);
> }
> EXPORT_SYMBOL_GPL(fsverity_enqueue_verify_work);
> diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h
> index c00b764d6b8c..a760b7bd81d4 100644
> --- a/include/linux/fscrypt.h
> +++ b/include/linux/fscrypt.h
> @@ -68,11 +68,7 @@ struct fscrypt_ctx {
> struct {
> struct page *bounce_page; /* Ciphertext page */
> struct page *control_page; /* Original page */
> - } w;
> - struct {
> - struct bio *bio;
> - struct work_struct work;
> - } r;
> + };
> struct list_head free_list; /* Free list */
> };
> u8 flags; /* Flags */
> @@ -113,7 +109,7 @@ extern int fscrypt_decrypt_page(const struct inode *, struct page *, unsigned in
>
> static inline struct page *fscrypt_control_page(struct page *page)
> {
> - return ((struct fscrypt_ctx *)page_private(page))->w.control_page;
> + return ((struct fscrypt_ctx *)page_private(page))->control_page;
> }
>
> extern void fscrypt_restore_control_page(struct page *);
> @@ -218,9 +214,6 @@ static inline bool fscrypt_match_name(const struct fscrypt_name *fname,
> }
>
> /* bio.c */
> -extern void fscrypt_decrypt_bio(struct bio *);
> -extern void fscrypt_enqueue_decrypt_bio(struct fscrypt_ctx *ctx,
> - struct bio *bio);
> extern void fscrypt_pullback_bio_page(struct page **, bool);
> extern int fscrypt_zeroout_range(const struct inode *, pgoff_t, sector_t,
> unsigned int);
> @@ -390,15 +383,6 @@ static inline bool fscrypt_match_name(const struct fscrypt_name *fname,
> return !memcmp(de_name, fname->disk_name.name, fname->disk_name.len);
> }
>
> -/* bio.c */
> -static inline void fscrypt_decrypt_bio(struct bio *bio)
> -{
> -}
> -
> -static inline void fscrypt_enqueue_decrypt_bio(struct fscrypt_ctx *ctx,
> - struct bio *bio)
> -{
> -}
>
> static inline void fscrypt_pullback_bio_page(struct page **page, bool restore)
> {
> diff --git a/include/linux/read_callbacks.h b/include/linux/read_callbacks.h
> new file mode 100644
> index 000000000000..c501cdf83a5b
> --- /dev/null
> +++ b/include/linux/read_callbacks.h
> @@ -0,0 +1,21 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _READ_CALLBACKS_H
> +#define _READ_CALLBACKS_H
> +
> +struct read_callbacks_ctx {
> + struct bio *bio;
> + struct inode *inode;
> + struct work_struct work;
> + unsigned int cur_step;
> + unsigned int enabled_steps;
> +};
> +
> +void end_read_callbacks(struct bio *bio);
> +void read_callbacks(struct read_callbacks_ctx *ctx);
> +struct read_callbacks_ctx *get_read_callbacks_ctx(struct inode *inode,
> + struct bio *bio,
> + pgoff_t index);
> +void put_read_callbacks_ctx(struct read_callbacks_ctx *ctx);
> +bool read_callbacks_required(struct bio *bio);
> +
> +#endif /* _READ_CALLBACKS_H */
>
On Sun, Apr 28, 2019 at 10:01:20AM +0530, Chandan Rajendra wrote:
> For subpage-sized blocks, this commit adds code to encrypt all zeroed-out
> blocks mapped by a page.
>
> Signed-off-by: Chandan Rajendra <[email protected]>
> ---
> fs/crypto/bio.c | 40 ++++++++++++++++++----------------------
> 1 file changed, 18 insertions(+), 22 deletions(-)
>
> diff --git a/fs/crypto/bio.c b/fs/crypto/bio.c
> index 856f4694902d..46dd2ec50c7d 100644
> --- a/fs/crypto/bio.c
> +++ b/fs/crypto/bio.c
> @@ -108,29 +108,23 @@ EXPORT_SYMBOL(fscrypt_pullback_bio_page);
> int fscrypt_zeroout_range(const struct inode *inode, pgoff_t lblk,
> sector_t pblk, unsigned int len)
> {
> - struct fscrypt_ctx *ctx;
> struct page *ciphertext_page = NULL;
> struct bio *bio;
> + u64 total_bytes, page_bytes;
page_bytes should be 'unsigned int', since it's <= PAGE_SIZE.
> int ret, err = 0;
>
> - BUG_ON(inode->i_sb->s_blocksize != PAGE_SIZE);
> -
> - ctx = fscrypt_get_ctx(inode, GFP_NOFS);
> - if (IS_ERR(ctx))
> - return PTR_ERR(ctx);
> + total_bytes = len << inode->i_blkbits;
Should cast len to 'u64' here, in case it's greater than UINT_MAX / blocksize.
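The truncation being pointed out can be demonstrated in a small stand-alone
sketch (userspace C; the helper names are hypothetical, not kernel APIs):

```c
#include <assert.h>
#include <stdint.h>

/* Without a cast, 'len << blkbits' is computed in the 32-bit width of
 * 'len' and only then widened, so large ranges silently wrap. */
static uint64_t total_bytes_buggy(unsigned int len, unsigned int blkbits)
{
	return len << blkbits;           /* 32-bit shift, may wrap */
}

static uint64_t total_bytes_fixed(unsigned int len, unsigned int blkbits)
{
	return (uint64_t)len << blkbits; /* widened before the shift */
}
```

With 2^24 blocks of 4K each, the unfixed version wraps to zero while the
cast version yields the intended 2^36 bytes.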
>
> - ciphertext_page = fscrypt_alloc_bounce_page(ctx, GFP_NOWAIT);
> - if (IS_ERR(ciphertext_page)) {
> - err = PTR_ERR(ciphertext_page);
> - goto errout;
> - }
> + while (total_bytes) {
> + page_bytes = min_t(u64, total_bytes, PAGE_SIZE);
>
> - while (len--) {
> - err = fscrypt_do_page_crypto(inode, FS_ENCRYPT, lblk,
> - ZERO_PAGE(0), ciphertext_page,
> - PAGE_SIZE, 0, GFP_NOFS);
> - if (err)
> + ciphertext_page = fscrypt_encrypt_page(inode, ZERO_PAGE(0),
> + page_bytes, 0, lblk, GFP_NOFS);
> + if (IS_ERR(ciphertext_page)) {
> + err = PTR_ERR(ciphertext_page);
> + ciphertext_page = NULL;
> goto errout;
> + }
'ciphertext_page' is leaked after each loop iteration. Did you mean to free it,
or did you mean to reuse it for subsequent iterations?
>
> bio = bio_alloc(GFP_NOWAIT, 1);
> if (!bio) {
> @@ -141,9 +135,8 @@ int fscrypt_zeroout_range(const struct inode *inode, pgoff_t lblk,
> bio->bi_iter.bi_sector =
> pblk << (inode->i_sb->s_blocksize_bits - 9);
This line uses ->s_blocksize_bits, but your new code uses ->i_blkbits. AFAIK
they'll always be the same, but please pick one or the other to use.
> bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
> - ret = bio_add_page(bio, ciphertext_page,
> - inode->i_sb->s_blocksize, 0);
> - if (ret != inode->i_sb->s_blocksize) {
> + ret = bio_add_page(bio, ciphertext_page, page_bytes, 0);
> + if (ret != page_bytes) {
> /* should never happen! */
> WARN_ON(1);
> bio_put(bio);
> @@ -156,12 +149,15 @@ int fscrypt_zeroout_range(const struct inode *inode, pgoff_t lblk,
> bio_put(bio);
> if (err)
> goto errout;
> - lblk++;
> - pblk++;
> +
> + lblk += page_bytes >> inode->i_blkbits;
> + pblk += page_bytes >> inode->i_blkbits;
> + total_bytes -= page_bytes;
> }
> err = 0;
> errout:
> - fscrypt_release_ctx(ctx);
> + if (!IS_ERR_OR_NULL(ciphertext_page))
> + fscrypt_restore_control_page(ciphertext_page);
> return err;
> }
> EXPORT_SYMBOL(fscrypt_zeroout_range);
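For reference, the page-sized chunking the new loop performs can be modeled in
a stand-alone sketch (hypothetical names; a fixed 4K page size is assumed, and
the bio submission is reduced to a counter):

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u

/* Models the chunking in the rewritten fscrypt_zeroout_range(): 'len'
 * blocks of (1 << blkbits) bytes are zeroed out in page-sized pieces,
 * with the logical block cursor advancing per piece.  Returns how many
 * bios the loop would submit. */
static unsigned int count_zeroout_bios(unsigned int len, unsigned int blkbits,
				       uint64_t *lblk)
{
	uint64_t total_bytes = (uint64_t)len << blkbits;
	unsigned int nr_bios = 0;

	while (total_bytes) {
		unsigned int page_bytes = total_bytes < SKETCH_PAGE_SIZE ?
				(unsigned int)total_bytes : SKETCH_PAGE_SIZE;

		nr_bios++;				/* one bio per piece */
		*lblk += page_bytes >> blkbits;		/* advance block cursor */
		total_bytes -= page_bytes;
	}
	return nr_bios;
}
```

E.g. nine 1K blocks are covered by two full pages plus a 1K tail, and the
block cursor ends exactly nine blocks past its start.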
> --
> 2.19.1
>
On Sun, Apr 28, 2019 at 10:01:19AM +0530, Chandan Rajendra wrote:
> For subpage-sized blocks, the initial logical block number mapped by a
> page can be different from page->index. Hence this commit adds code to
> compute the first logical block mapped by the page and also the page
> range to be encrypted.
>
> Signed-off-by: Chandan Rajendra <[email protected]>
> ---
> fs/ext4/page-io.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
> index 3e9298e6a705..75485ee9e800 100644
> --- a/fs/ext4/page-io.c
> +++ b/fs/ext4/page-io.c
> @@ -418,6 +418,7 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
> {
> struct page *data_page = NULL;
> struct inode *inode = page->mapping->host;
> + u64 page_blk;
> unsigned block_start;
> struct buffer_head *bh, *head;
> int ret = 0;
> @@ -478,10 +479,14 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
>
> if (IS_ENCRYPTED(inode) && S_ISREG(inode->i_mode) && nr_to_submit) {
> gfp_t gfp_flags = GFP_NOFS;
> + unsigned int page_bytes;
> +
page_blk should be declared here, just after page_bytes.
> + page_bytes = round_up(len, i_blocksize(inode));
> + page_blk = page->index << (PAGE_SHIFT - inode->i_blkbits);
Although block numbers are 32-bit in ext4, if you're going to make 'page_blk' a
u64 anyway, then for consistency page->index should be cast to u64 here.
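The computation under discussion can be checked in isolation (userspace
sketch with an assumed 4K page size; first_blk_of_page() is a hypothetical
name):

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u

/* First logical block mapped by page 'index' when the filesystem block
 * size is (1 << blkbits) <= PAGE_SIZE.  The cast keeps the shift from
 * being done in the 32-bit width of the index. */
static uint64_t first_blk_of_page(uint32_t index, unsigned int blkbits)
{
	return (uint64_t)index << (SKETCH_PAGE_SHIFT - blkbits);
}
```

For block size equal to page size the block number equals the page index; for
subpage blocks it is the index scaled by blocks-per-page, which is exactly why
page->index alone is no longer the right lblk to pass to the encryption code.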
>
> retry_encrypt:
> - data_page = fscrypt_encrypt_page(inode, page, PAGE_SIZE, 0,
> - page->index, gfp_flags);
> + data_page = fscrypt_encrypt_page(inode, page, page_bytes, 0,
> + page_blk, gfp_flags);
> if (IS_ERR(data_page)) {
> ret = PTR_ERR(data_page);
> if (ret == -ENOMEM && wbc->sync_mode == WB_SYNC_ALL) {
> --
> 2.19.1
>
On Sun, Apr 28, 2019 at 10:01:18AM +0530, Chandan Rajendra wrote:
> For subpage-sized blocks, this commit now encrypts all blocks mapped by
> a page range.
>
> Signed-off-by: Chandan Rajendra <[email protected]>
> ---
> fs/crypto/crypto.c | 37 +++++++++++++++++++++++++------------
> 1 file changed, 25 insertions(+), 12 deletions(-)
>
> diff --git a/fs/crypto/crypto.c b/fs/crypto/crypto.c
> index 4f0d832cae71..2d65b431563f 100644
> --- a/fs/crypto/crypto.c
> +++ b/fs/crypto/crypto.c
> @@ -242,18 +242,26 @@ struct page *fscrypt_encrypt_page(const struct inode *inode,
Need to update the function comment to clearly explain what this function
actually does now.
> {
> struct fscrypt_ctx *ctx;
> struct page *ciphertext_page = page;
> + int i, page_nr_blks;
> int err;
>
> BUG_ON(len % FS_CRYPTO_BLOCK_SIZE != 0);
>
Make a 'blocksize' variable so you don't have to keep calling i_blocksize().
Also, you need to check whether 'len' and 'offs' are filesystem-block-aligned,
since the code now assumes it.
const unsigned int blocksize = i_blocksize(inode);
if (!IS_ALIGNED(len | offs, blocksize))
return -EINVAL;
However, did you check whether that's always true for ubifs? It looks like it
may expect to encrypt a prefix of a block that is only padded to the next
16-byte boundary.
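The suggested check relies on a common kernel idiom; a stand-alone sketch
(the macro is re-defined here for illustration) shows why OR-ing the two
operands works when the block size is a power of two:

```c
#include <assert.h>

/* Power-of-two alignment test, as in include/linux/kernel.h. */
#define IS_ALIGNED(x, a) (((x) & ((a) - 1)) == 0)

/* One mask test rejects the case where either 'len' or 'offs' is not a
 * multiple of the block size: any stray low bit in either operand
 * survives the OR and trips the mask. */
static int crypt_args_ok(unsigned int len, unsigned int offs,
			 unsigned int blocksize)
{
	return IS_ALIGNED(len | offs, blocksize);
}
```
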
> + page_nr_blks = len >> inode->i_blkbits;
> +
> if (inode->i_sb->s_cop->flags & FS_CFLG_OWN_PAGES) {
> /* with inplace-encryption we just encrypt the page */
> - err = fscrypt_do_page_crypto(inode, FS_ENCRYPT, lblk_num, page,
> - ciphertext_page, len, offs,
> - gfp_flags);
> - if (err)
> - return ERR_PTR(err);
> -
> + for (i = 0; i < page_nr_blks; i++) {
> + err = fscrypt_do_page_crypto(inode, FS_ENCRYPT,
> + lblk_num, page,
> + ciphertext_page,
> + i_blocksize(inode), offs,
> + gfp_flags);
> + if (err)
> + return ERR_PTR(err);
> + ++lblk_num;
> + offs += i_blocksize(inode);
> + }
> return ciphertext_page;
> }
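The per-block walk added here can be modeled in a stand-alone sketch
(hypothetical names; it records the arguments each fscrypt_do_page_crypto()
call would receive rather than doing any crypto):

```c
#include <assert.h>
#include <stdint.h>

/* Models the loop the patch adds to fscrypt_encrypt_page(): a 'len'-byte
 * span starting at 'offs' is processed one filesystem block at a time,
 * with the logical block number advancing in step with the offset.
 * Returns the number of per-block calls. */
static int walk_blocks(uint64_t lblk, unsigned int len, unsigned int offs,
		       unsigned int blkbits,
		       uint64_t lblks[], unsigned int offsets[])
{
	unsigned int blocksize = 1u << blkbits;
	int i, page_nr_blks = len >> blkbits;

	for (i = 0; i < page_nr_blks; i++) {
		lblks[i] = lblk++;	/* block number for this call */
		offsets[i] = offs;	/* in-page offset for this call */
		offs += blocksize;
	}
	return page_nr_blks;
}
```

A 4K span of 1K blocks starting at lblk 40 yields four calls at offsets 0,
1024, 2048, 3072 with block numbers 40 through 43, matching what the loop
above feeds to the crypto helper.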
>
> @@ -269,12 +277,17 @@ struct page *fscrypt_encrypt_page(const struct inode *inode,
> goto errout;
>
> ctx->control_page = page;
> - err = fscrypt_do_page_crypto(inode, FS_ENCRYPT, lblk_num,
> - page, ciphertext_page, len, offs,
> - gfp_flags);
> - if (err) {
> - ciphertext_page = ERR_PTR(err);
> - goto errout;
> +
> + for (i = 0; i < page_nr_blks; i++) {
> + err = fscrypt_do_page_crypto(inode, FS_ENCRYPT, lblk_num,
> + page, ciphertext_page,
> + i_blocksize(inode), offs, gfp_flags);
As I mentioned elsewhere, renaming fscrypt_do_page_crypto() to
fscrypt_crypt_block() would make more sense now.
> + if (err) {
> + ciphertext_page = ERR_PTR(err);
> + goto errout;
> + }
> + ++lblk_num;
> + offs += i_blocksize(inode);
> }
> SetPagePrivate(ciphertext_page);
> set_page_private(ciphertext_page, (unsigned long)ctx);
> --
> 2.19.1
>
On Sun, Apr 28, 2019 at 10:01:10AM +0530, Chandan Rajendra wrote:
> The "read callbacks" code is used by both Ext4 and F2FS. Hence to
> remove duplication, this commit moves the code into
> include/linux/read_callbacks.h and fs/read_callbacks.c.
>
> The corresponding decrypt and verity "work" functions have been moved
> inside fscrypt and fsverity sources. With these in place, the read
> callbacks code now has to just invoke enqueue functions provided by
> fscrypt and fsverity.
>
> Signed-off-by: Chandan Rajendra <[email protected]>
> ---
> fs/Kconfig | 4 +
> fs/Makefile | 4 +
> fs/crypto/Kconfig | 1 +
> fs/crypto/bio.c | 23 ++---
> fs/crypto/crypto.c | 17 +--
> fs/crypto/fscrypt_private.h | 3 +
> fs/ext4/ext4.h | 2 -
> fs/ext4/readpage.c | 183 +++++----------------------------
> fs/ext4/super.c | 9 +-
> fs/f2fs/data.c | 148 ++++----------------------
> fs/f2fs/super.c | 9 +-
> fs/read_callbacks.c | 136 ++++++++++++++++++++++++
> fs/verity/Kconfig | 1 +
> fs/verity/verify.c | 12 +++
> include/linux/fscrypt.h | 20 +---
> include/linux/read_callbacks.h | 21 ++++
> 16 files changed, 251 insertions(+), 342 deletions(-)
> create mode 100644 fs/read_callbacks.c
> create mode 100644 include/linux/read_callbacks.h
>
> diff --git a/fs/Kconfig b/fs/Kconfig
> index 97f9eb8df713..03084f2dbeaf 100644
> --- a/fs/Kconfig
> +++ b/fs/Kconfig
> @@ -308,6 +308,10 @@ config NFS_COMMON
> depends on NFSD || NFS_FS || LOCKD
> default y
>
> +config FS_READ_CALLBACKS
> + bool
> + default n
> +
> source "net/sunrpc/Kconfig"
> source "fs/ceph/Kconfig"
> source "fs/cifs/Kconfig"
This shouldn't be under the 'if NETWORK_FILESYSTEMS' block, since it has nothing
to do with network filesystems. When trying to compile this I got:
WARNING: unmet direct dependencies detected for FS_READ_CALLBACKS
Depends on [n]: NETWORK_FILESYSTEMS [=n]
Selected by [y]:
- FS_ENCRYPTION [=y]
- FS_VERITY [=y]
Perhaps put it just below FS_IOMAP?
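i.e. something like the following in fs/Kconfig, outside the
'if NETWORK_FILESYSTEMS' block (a placement sketch only; the FS_IOMAP
neighbor is assumed from the tree at the time, and the 'default n' line
can simply be dropped since 'n' is already the default):

```kconfig
config FS_IOMAP
	bool

# Moved out of the 'if NETWORK_FILESYSTEMS' block:
config FS_READ_CALLBACKS
	bool
```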
> diff --git a/fs/Makefile b/fs/Makefile
> index 9dd2186e74b5..e0c0fce8cf40 100644
> --- a/fs/Makefile
> +++ b/fs/Makefile
> @@ -21,6 +21,10 @@ else
> obj-y += no-block.o
> endif
>
> +ifeq ($(CONFIG_FS_READ_CALLBACKS),y)
> +obj-y += read_callbacks.o
> +endif
> +
> obj-$(CONFIG_PROC_FS) += proc_namespace.o
>
> obj-y += notify/
> diff --git a/fs/crypto/Kconfig b/fs/crypto/Kconfig
> index f0de238000c0..163c328bcbd4 100644
> --- a/fs/crypto/Kconfig
> +++ b/fs/crypto/Kconfig
> @@ -8,6 +8,7 @@ config FS_ENCRYPTION
> select CRYPTO_CTS
> select CRYPTO_SHA256
> select KEYS
> + select FS_READ_CALLBACKS
> help
> Enable encryption of files and directories. This
> feature is similar to ecryptfs, but it is more memory
This selection needs to be conditional on BLOCK.
select FS_READ_CALLBACKS if BLOCK
Otherwise, building without BLOCK and with UBIFS encryption support fails.
fs/read_callbacks.c: In function ‘end_read_callbacks’:
fs/read_callbacks.c:34:23: error: storage size of ‘iter_all’ isn’t known
struct bvec_iter_all iter_all;
^~~~~~~~
fs/read_callbacks.c:37:20: error: dereferencing pointer to incomplete type ‘struct buffer_head’
if (!PageError(bh->b_page))
[...]
- Eric
On 2019-04-28, at 10:01:11 +0530, Chandan Rajendra wrote:
> Ext4 and F2FS store verity metadata in data extents (beyond
> inode->i_size) associated with a file. But other filesystems might
> choose alternative means to store verity metadata. Hence this commit
> adds a callback function pointer to 'struct fsverity_operations' to
> help in deciding if verity operation needs to performed against a
> page-cache page holding file data.
>
> Signed-off-by: Chandan Rajendra <[email protected]>
> ---
> fs/ext4/super.c | 6 ++++++
> fs/f2fs/super.c | 6 ++++++
> fs/read_callbacks.c | 4 +++-
> include/linux/fsverity.h | 1 +
> 4 files changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index aba724f82cc3..63d73b360f1d 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -1428,10 +1428,16 @@ static struct page *ext4_read_verity_metadata_page(struct inode *inode,
> return read_mapping_page(inode->i_mapping, index, NULL);
> }
>
> +static bool ext4_verity_required(struct inode *inode, pgoff_t index)
> +{
> + return index < (i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT;
> +}
> +
> static const struct fsverity_operations ext4_verityops = {
> .set_verity = ext4_set_verity,
> .get_metadata_end = ext4_get_verity_metadata_end,
> .read_metadata_page = ext4_read_verity_metadata_page,
> + .verity_required = ext4_verity_required,
> };
> #endif /* CONFIG_FS_VERITY */
>
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> index 2f75f06c784a..cd1299e1f92d 100644
> --- a/fs/f2fs/super.c
> +++ b/fs/f2fs/super.c
> @@ -2257,10 +2257,16 @@ static struct page *f2fs_read_verity_metadata_page(struct inode *inode,
> return read_mapping_page(inode->i_mapping, index, NULL);
> }
>
> +static bool f2fs_verity_required(struct inode *inode, pgoff_t index)
> +{
> + return index < (i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT;
> +}
> +
> static const struct fsverity_operations f2fs_verityops = {
> .set_verity = f2fs_set_verity,
> .get_metadata_end = f2fs_get_verity_metadata_end,
> .read_metadata_page = f2fs_read_verity_metadata_page,
> + .verity_required = f2fs_verity_required,
> };
> #endif /* CONFIG_FS_VERITY */
>
> diff --git a/fs/read_callbacks.c b/fs/read_callbacks.c
> index b6d5b95e67d7..6dea54b0baa9 100644
> --- a/fs/read_callbacks.c
> +++ b/fs/read_callbacks.c
> @@ -86,7 +86,9 @@ struct read_callbacks_ctx *get_read_callbacks_ctx(struct inode *inode,
> read_callbacks_steps |= 1 << STEP_DECRYPT;
> #ifdef CONFIG_FS_VERITY
> if (inode->i_verity_info != NULL &&
> - (index < ((i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT)))
> + ((inode->i_sb->s_vop->verity_required
> + && inode->i_sb->s_vop->verity_required(inode, index))
> + || (inode->i_sb->s_vop->verity_required == NULL)))
I think this is a bit easier to follow:
(inode->i_sb->s_vop->verity_required == NULL ||
inode->i_sb->s_vop->verity_required(inode, index)))
> read_callbacks_steps |= 1 << STEP_VERITY;
> #endif
> if (read_callbacks_steps) {
> diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
> index 7c33b42abf1b..b83712d6c79a 100644
> --- a/include/linux/fsverity.h
> +++ b/include/linux/fsverity.h
> @@ -18,6 +18,7 @@ struct fsverity_operations {
> int (*set_verity)(struct inode *inode, loff_t data_i_size);
> int (*get_metadata_end)(struct inode *inode, loff_t *metadata_end_ret);
> struct page *(*read_metadata_page)(struct inode *inode, pgoff_t index);
> + bool (*verity_required)(struct inode *inode, pgoff_t index);
> };
>
> #ifdef CONFIG_FS_VERITY
> --
> 2.19.1
>
>
J.
On Tue, Apr 30, 2019 at 10:11:35AM -0700, Eric Biggers wrote:
> On Sun, Apr 28, 2019 at 10:01:18AM +0530, Chandan Rajendra wrote:
> > For subpage-sized blocks, this commit now encrypts all blocks mapped by
> > a page range.
> >
> > Signed-off-by: Chandan Rajendra <[email protected]>
> > ---
> > fs/crypto/crypto.c | 37 +++++++++++++++++++++++++------------
> > 1 file changed, 25 insertions(+), 12 deletions(-)
> >
> > diff --git a/fs/crypto/crypto.c b/fs/crypto/crypto.c
> > index 4f0d832cae71..2d65b431563f 100644
> > --- a/fs/crypto/crypto.c
> > +++ b/fs/crypto/crypto.c
> > @@ -242,18 +242,26 @@ struct page *fscrypt_encrypt_page(const struct inode *inode,
>
> Need to update the function comment to clearly explain what this function
> actually does now.
>
> > {
> > struct fscrypt_ctx *ctx;
> > struct page *ciphertext_page = page;
> > + int i, page_nr_blks;
> > int err;
> >
> > BUG_ON(len % FS_CRYPTO_BLOCK_SIZE != 0);
> >
>
> Make a 'blocksize' variable so you don't have to keep calling i_blocksize().
>
> Also, you need to check whether 'len' and 'offs' are filesystem-block-aligned,
> since the code now assumes it.
>
> const unsigned int blocksize = i_blocksize(inode);
>
> if (!IS_ALIGNED(len | offs, blocksize))
> return -EINVAL;
>
> However, did you check whether that's always true for ubifs? It looks like it
> > may expect to encrypt a prefix of a block that is only padded to the next
> 16-byte boundary.
>
> > + page_nr_blks = len >> inode->i_blkbits;
> > +
> > if (inode->i_sb->s_cop->flags & FS_CFLG_OWN_PAGES) {
> > /* with inplace-encryption we just encrypt the page */
> > - err = fscrypt_do_page_crypto(inode, FS_ENCRYPT, lblk_num, page,
> > - ciphertext_page, len, offs,
> > - gfp_flags);
> > - if (err)
> > - return ERR_PTR(err);
> > -
> > + for (i = 0; i < page_nr_blks; i++) {
> > + err = fscrypt_do_page_crypto(inode, FS_ENCRYPT,
> > + lblk_num, page,
> > + ciphertext_page,
> > + i_blocksize(inode), offs,
> > + gfp_flags);
> > + if (err)
> > + return ERR_PTR(err);
Apparently ubifs does encrypt data shorter than the filesystem block size, so
this part is wrong.
I suggest we split this into two functions, fscrypt_encrypt_block_inplace() and
fscrypt_encrypt_blocks(), so that it's conceptually simpler what each function
does. Currently this works completely differently depending on whether the
filesystem set FS_CFLG_OWN_PAGES in its fscrypt_operations, which is weird.
I also noticed that using fscrypt_ctx for writes seems to be unnecessary.
AFAICS, page_private(bounce_page) could point directly to the pagecache page.
That would simplify things a lot, especially since then fscrypt_ctx could be
removed entirely after you convert reads to use read_callbacks_ctx.
IMO, these would be worthwhile cleanups for fscrypt by themselves, without
waiting for the read_callbacks stuff to be finalized. Finalizing the
read_callbacks stuff will probably require reaching a consensus about how they
should work with future filesystem features like fsverity and compression.
So to move things forward, I'm considering sending out a series with the above
cleanups for fscrypt, plus the equivalent of your patches:
"fscrypt_encrypt_page: Loop across all blocks mapped by a page range"
"fscrypt_zeroout_range: Encrypt all zeroed out blocks of a page"
"Add decryption support for sub-pagesized blocks" (fs/crypto/ part only)
Then hopefully we can get all that applied for 5.3 so that fs/crypto/ itself is
ready for blocksize != PAGE_SIZE; and get your changes to ext4_bio_write_page(),
__ext4_block_zero_page_range(), and ext4_block_write_begin() applied too, so
that ext4 is partially ready for encryption with blocksize != PAGE_SIZE.
Then only the read_callbacks stuff will remain, to get encryption support into
fs/mpage.c and fs/buffer.c. Do you think that's a good plan?
Thanks!
- Eric
On Tuesday, April 30, 2019 10:21:15 PM IST Eric Biggers wrote:
> On Sun, Apr 28, 2019 at 10:01:20AM +0530, Chandan Rajendra wrote:
> > For subpage-sized blocks, this commit adds code to encrypt all zeroed
> > out blocks mapped by a page.
> >
> > Signed-off-by: Chandan Rajendra <[email protected]>
> > ---
> > fs/crypto/bio.c | 40 ++++++++++++++++++----------------------
> > 1 file changed, 18 insertions(+), 22 deletions(-)
> >
> > diff --git a/fs/crypto/bio.c b/fs/crypto/bio.c
> > index 856f4694902d..46dd2ec50c7d 100644
> > --- a/fs/crypto/bio.c
> > +++ b/fs/crypto/bio.c
> > @@ -108,29 +108,23 @@ EXPORT_SYMBOL(fscrypt_pullback_bio_page);
> > int fscrypt_zeroout_range(const struct inode *inode, pgoff_t lblk,
> > sector_t pblk, unsigned int len)
> > {
> > - struct fscrypt_ctx *ctx;
> > struct page *ciphertext_page = NULL;
> > struct bio *bio;
> > + u64 total_bytes, page_bytes;
>
> page_bytes should be 'unsigned int', since it's <= PAGE_SIZE.
>
> > int ret, err = 0;
> >
> > - BUG_ON(inode->i_sb->s_blocksize != PAGE_SIZE);
> > -
> > - ctx = fscrypt_get_ctx(inode, GFP_NOFS);
> > - if (IS_ERR(ctx))
> > - return PTR_ERR(ctx);
> > + total_bytes = len << inode->i_blkbits;
>
> Should cast len to 'u64' here, in case it's greater than UINT_MAX / blocksize.
>
> >
> > - ciphertext_page = fscrypt_alloc_bounce_page(ctx, GFP_NOWAIT);
> > - if (IS_ERR(ciphertext_page)) {
> > - err = PTR_ERR(ciphertext_page);
> > - goto errout;
> > - }
> > + while (total_bytes) {
> > + page_bytes = min_t(u64, total_bytes, PAGE_SIZE);
> >
> > - while (len--) {
> > - err = fscrypt_do_page_crypto(inode, FS_ENCRYPT, lblk,
> > - ZERO_PAGE(0), ciphertext_page,
> > - PAGE_SIZE, 0, GFP_NOFS);
> > - if (err)
> > + ciphertext_page = fscrypt_encrypt_page(inode, ZERO_PAGE(0),
> > + page_bytes, 0, lblk, GFP_NOFS);
> > + if (IS_ERR(ciphertext_page)) {
> > + err = PTR_ERR(ciphertext_page);
> > + ciphertext_page = NULL;
> > goto errout;
> > + }
>
> 'ciphertext_page' is leaked after each loop iteration. Did you mean to free it,
> or did you mean to reuse it for subsequent iterations?
Thanks for pointing this out. I actually meant to free it. I will see if I can
reuse ciphertext_page in my next patchset rather than freeing and allocating
it each time the loop is executed.
>
> >
> > bio = bio_alloc(GFP_NOWAIT, 1);
> > if (!bio) {
> > @@ -141,9 +135,8 @@ int fscrypt_zeroout_range(const struct inode *inode, pgoff_t lblk,
> > bio->bi_iter.bi_sector =
> > pblk << (inode->i_sb->s_blocksize_bits - 9);
>
> This line uses ->s_blocksize_bits, but your new code uses ->i_blkbits. AFAIK
> they'll always be the same, but please pick one or the other to use.
I will fix this.
>
> > bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
> > - ret = bio_add_page(bio, ciphertext_page,
> > - inode->i_sb->s_blocksize, 0);
> > - if (ret != inode->i_sb->s_blocksize) {
> > + ret = bio_add_page(bio, ciphertext_page, page_bytes, 0);
> > + if (ret != page_bytes) {
> > /* should never happen! */
> > WARN_ON(1);
> > bio_put(bio);
> > @@ -156,12 +149,15 @@ int fscrypt_zeroout_range(const struct inode *inode, pgoff_t lblk,
> > bio_put(bio);
> > if (err)
> > goto errout;
> > - lblk++;
> > - pblk++;
> > +
> > + lblk += page_bytes >> inode->i_blkbits;
> > + pblk += page_bytes >> inode->i_blkbits;
> > + total_bytes -= page_bytes;
> > }
> > err = 0;
> > errout:
> > - fscrypt_release_ctx(ctx);
> > + if (!IS_ERR_OR_NULL(ciphertext_page))
> > + fscrypt_restore_control_page(ciphertext_page);
> > return err;
> > }
> > EXPORT_SYMBOL(fscrypt_zeroout_range);
>
>
--
chandan
On Tuesday, April 30, 2019 5:30:28 AM IST Eric Biggers wrote:
> Hi Chandan,
>
> On Sun, Apr 28, 2019 at 10:01:10AM +0530, Chandan Rajendra wrote:
> > The "read callbacks" code is used by both Ext4 and F2FS. Hence to
> > remove duplication, this commit moves the code into
> > include/linux/read_callbacks.h and fs/read_callbacks.c.
> >
> > The corresponding decrypt and verity "work" functions have been moved
> > inside fscrypt and fsverity sources. With these in place, the read
> > callbacks code now has to just invoke enqueue functions provided by
> > fscrypt and fsverity.
> >
> > Signed-off-by: Chandan Rajendra <[email protected]>
> > ---
> > fs/Kconfig | 4 +
> > fs/Makefile | 4 +
> > fs/crypto/Kconfig | 1 +
> > fs/crypto/bio.c | 23 ++---
> > fs/crypto/crypto.c | 17 +--
> > fs/crypto/fscrypt_private.h | 3 +
> > fs/ext4/ext4.h | 2 -
> > fs/ext4/readpage.c | 183 +++++----------------------------
> > fs/ext4/super.c | 9 +-
> > fs/f2fs/data.c | 148 ++++----------------------
> > fs/f2fs/super.c | 9 +-
> > fs/read_callbacks.c | 136 ++++++++++++++++++++++++
> > fs/verity/Kconfig | 1 +
> > fs/verity/verify.c | 12 +++
> > include/linux/fscrypt.h | 20 +---
> > include/linux/read_callbacks.h | 21 ++++
> > 16 files changed, 251 insertions(+), 342 deletions(-)
> > create mode 100644 fs/read_callbacks.c
> > create mode 100644 include/linux/read_callbacks.h
> >
>
> For easier review, can you split this into multiple patches? Ideally the ext4
> and f2fs patches would be separate, but if that's truly not possible due to
> interdependencies it seems you could at least do:
>
> 1. Introduce the read_callbacks.
> 2. Convert encryption to use the read_callbacks.
> 3. Remove union from struct fscrypt_context.
>
> Also: just FYI, fs-verity isn't upstream yet, and in the past few months I
> haven't had much time to work on it. So you might consider arranging your
> series so that initially just fscrypt is supported. That will be useful on its
> own, for block_size < PAGE_SIZE support. Then fsverity can be added later.
>
> > diff --git a/fs/Kconfig b/fs/Kconfig
> > index 97f9eb8df713..03084f2dbeaf 100644
> > --- a/fs/Kconfig
> > +++ b/fs/Kconfig
> > @@ -308,6 +308,10 @@ config NFS_COMMON
> > depends on NFSD || NFS_FS || LOCKD
> > default y
> >
> > +config FS_READ_CALLBACKS
> > + bool
> > + default n
>
> 'default n' is unnecessary, since 'n' is already the default.
>
> > +
> > source "net/sunrpc/Kconfig"
> > source "fs/ceph/Kconfig"
> > source "fs/cifs/Kconfig"
> > diff --git a/fs/Makefile b/fs/Makefile
> > index 9dd2186e74b5..e0c0fce8cf40 100644
> > --- a/fs/Makefile
> > +++ b/fs/Makefile
> > @@ -21,6 +21,10 @@ else
> > obj-y += no-block.o
> > endif
> >
> > +ifeq ($(CONFIG_FS_READ_CALLBACKS),y)
> > +obj-y += read_callbacks.o
> > +endif
> > +
>
> This can be simplified to:
>
> obj-$(CONFIG_FS_READ_CALLBACKS) += read_callbacks.o
>
> > diff --git a/fs/read_callbacks.c b/fs/read_callbacks.c
> > new file mode 100644
> > index 000000000000..b6d5b95e67d7
> > --- /dev/null
> > +++ b/fs/read_callbacks.c
> > @@ -0,0 +1,136 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * This file tracks the state machine that needs to be executed after reading
> > + * data from files that are encrypted and/or have verity metadata associated
> > + * with them.
> > + */
> > +#include <linux/module.h>
> > +#include <linux/mm.h>
> > +#include <linux/pagemap.h>
> > +#include <linux/bio.h>
> > +#include <linux/fscrypt.h>
> > +#include <linux/fsverity.h>
> > +#include <linux/read_callbacks.h>
> > +
> > +#define NUM_PREALLOC_POST_READ_CTXS 128
> > +
> > +static struct kmem_cache *read_callbacks_ctx_cache;
> > +static mempool_t *read_callbacks_ctx_pool;
> > +
> > +/* Read callback state machine steps */
> > +enum read_callbacks_step {
> > + STEP_INITIAL = 0,
> > + STEP_DECRYPT,
> > + STEP_VERITY,
> > +};
> > +
> > +void end_read_callbacks(struct bio *bio)
> > +{
> > + struct page *page;
> > + struct bio_vec *bv;
> > + int i;
> > + struct bvec_iter_all iter_all;
> > +
> > + bio_for_each_segment_all(bv, bio, i, iter_all) {
> > + page = bv->bv_page;
> > +
> > + BUG_ON(bio->bi_status);
> > +
> > + if (!PageError(page))
> > + SetPageUptodate(page);
> > +
> > + unlock_page(page);
> > + }
> > + if (bio->bi_private)
> > + put_read_callbacks_ctx(bio->bi_private);
> > + bio_put(bio);
> > +}
> > +EXPORT_SYMBOL(end_read_callbacks);
>
> end_read_callbacks() is only called by read_callbacks() just below, so it should
> be 'static'.
>
> > +
> > +struct read_callbacks_ctx *get_read_callbacks_ctx(struct inode *inode,
> > + struct bio *bio,
> > + pgoff_t index)
> > +{
> > + unsigned int read_callbacks_steps = 0;
>
> Rename 'read_callbacks_steps' => 'enabled_steps', since it's clear from context.
>
> > + struct read_callbacks_ctx *ctx = NULL;
> > +
> > + if (IS_ENCRYPTED(inode) && S_ISREG(inode->i_mode))
> > + read_callbacks_steps |= 1 << STEP_DECRYPT;
> > +#ifdef CONFIG_FS_VERITY
> > + if (inode->i_verity_info != NULL &&
> > + (index < ((i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT)))
> > + read_callbacks_steps |= 1 << STEP_VERITY;
> > +#endif
>
> To avoid the #ifdef, this should probably be made a function in fsverity.h.
>
> > + if (read_callbacks_steps) {
> > + ctx = mempool_alloc(read_callbacks_ctx_pool, GFP_NOFS);
> > + if (!ctx)
> > + return ERR_PTR(-ENOMEM);
> > + ctx->bio = bio;
> > + ctx->inode = inode;
> > + ctx->enabled_steps = read_callbacks_steps;
> > + ctx->cur_step = STEP_INITIAL;
> > + bio->bi_private = ctx;
> > + }
> > + return ctx;
> > +}
> > +EXPORT_SYMBOL(get_read_callbacks_ctx);
>
> The callers don't actually use the returned read_callbacks_ctx. Instead, they
> rely on this function storing it in ->bi_private. So, this function should just
> return an error code, and it should be renamed. Perhaps:
>
> int read_callbacks_setup_bio(struct inode *inode, struct bio *bio,
> pgoff_t first_pgoff);
>
> Please rename 'index' to 'first_pgoff' to make it clearer what it is, given that
> a bio can contain many pages.
>
> Please add kerneldoc for this function.
>
I will implement the changes suggested above.
--
chandan
On Tuesday, April 30, 2019 7:07:28 AM IST Chao Yu wrote:
> On 2019/4/28 12:31, Chandan Rajendra wrote:
> > The "read callbacks" code is used by both Ext4 and F2FS. Hence to
> > remove duplication, this commit moves the code into
> > include/linux/read_callbacks.h and fs/read_callbacks.c.
> >
> > The corresponding decrypt and verity "work" functions have been moved
> > inside fscrypt and fsverity sources. With these in place, the read
> > callbacks code now has to just invoke enqueue functions provided by
> > fscrypt and fsverity.
> >
> > Signed-off-by: Chandan Rajendra <[email protected]>
> > ---
> > fs/Kconfig | 4 +
> > fs/Makefile | 4 +
> > fs/crypto/Kconfig | 1 +
> > fs/crypto/bio.c | 23 ++---
> > fs/crypto/crypto.c | 17 +--
> > fs/crypto/fscrypt_private.h | 3 +
> > fs/ext4/ext4.h | 2 -
> > fs/ext4/readpage.c | 183 +++++----------------------------
> > fs/ext4/super.c | 9 +-
> > fs/f2fs/data.c | 148 ++++----------------------
> > fs/f2fs/super.c | 9 +-
> > fs/read_callbacks.c | 136 ++++++++++++++++++++++++
> > fs/verity/Kconfig | 1 +
> > fs/verity/verify.c | 12 +++
> > include/linux/fscrypt.h | 20 +---
> > include/linux/read_callbacks.h | 21 ++++
> > 16 files changed, 251 insertions(+), 342 deletions(-)
> > create mode 100644 fs/read_callbacks.c
> > create mode 100644 include/linux/read_callbacks.h
> >
> > diff --git a/fs/Kconfig b/fs/Kconfig
> > index 97f9eb8df713..03084f2dbeaf 100644
> > --- a/fs/Kconfig
> > +++ b/fs/Kconfig
> > @@ -308,6 +308,10 @@ config NFS_COMMON
> > depends on NFSD || NFS_FS || LOCKD
> > default y
> >
> > +config FS_READ_CALLBACKS
> > + bool
> > + default n
> > +
> > source "net/sunrpc/Kconfig"
> > source "fs/ceph/Kconfig"
> > source "fs/cifs/Kconfig"
> > diff --git a/fs/Makefile b/fs/Makefile
> > index 9dd2186e74b5..e0c0fce8cf40 100644
> > --- a/fs/Makefile
> > +++ b/fs/Makefile
> > @@ -21,6 +21,10 @@ else
> > obj-y += no-block.o
> > endif
> >
> > +ifeq ($(CONFIG_FS_READ_CALLBACKS),y)
> > +obj-y += read_callbacks.o
> > +endif
> > +
> > obj-$(CONFIG_PROC_FS) += proc_namespace.o
> >
> > obj-y += notify/
> > diff --git a/fs/crypto/Kconfig b/fs/crypto/Kconfig
> > index f0de238000c0..163c328bcbd4 100644
> > --- a/fs/crypto/Kconfig
> > +++ b/fs/crypto/Kconfig
> > @@ -8,6 +8,7 @@ config FS_ENCRYPTION
> > select CRYPTO_CTS
> > select CRYPTO_SHA256
> > select KEYS
> > + select FS_READ_CALLBACKS
> > help
> > Enable encryption of files and directories. This
> > feature is similar to ecryptfs, but it is more memory
> > diff --git a/fs/crypto/bio.c b/fs/crypto/bio.c
> > index 5759bcd018cd..27f5618174f2 100644
> > --- a/fs/crypto/bio.c
> > +++ b/fs/crypto/bio.c
> > @@ -24,6 +24,8 @@
> > #include <linux/module.h>
> > #include <linux/bio.h>
> > #include <linux/namei.h>
> > +#include <linux/read_callbacks.h>
> > +
> > #include "fscrypt_private.h"
> >
> > static void __fscrypt_decrypt_bio(struct bio *bio, bool done)
> > @@ -54,24 +56,15 @@ void fscrypt_decrypt_bio(struct bio *bio)
> > }
> > EXPORT_SYMBOL(fscrypt_decrypt_bio);
> >
> > -static void completion_pages(struct work_struct *work)
> > +void fscrypt_decrypt_work(struct work_struct *work)
> > {
> > - struct fscrypt_ctx *ctx =
> > - container_of(work, struct fscrypt_ctx, r.work);
> > - struct bio *bio = ctx->r.bio;
> > + struct read_callbacks_ctx *ctx =
> > + container_of(work, struct read_callbacks_ctx, work);
> >
> > - __fscrypt_decrypt_bio(bio, true);
> > - fscrypt_release_ctx(ctx);
> > - bio_put(bio);
> > -}
> > + fscrypt_decrypt_bio(ctx->bio);
> >
> > -void fscrypt_enqueue_decrypt_bio(struct fscrypt_ctx *ctx, struct bio *bio)
> > -{
> > - INIT_WORK(&ctx->r.work, completion_pages);
> > - ctx->r.bio = bio;
> > - fscrypt_enqueue_decrypt_work(&ctx->r.work);
> > + read_callbacks(ctx);
> > }
> > -EXPORT_SYMBOL(fscrypt_enqueue_decrypt_bio);
> >
> > void fscrypt_pullback_bio_page(struct page **page, bool restore)
> > {
> > @@ -87,7 +80,7 @@ void fscrypt_pullback_bio_page(struct page **page, bool restore)
> > ctx = (struct fscrypt_ctx *)page_private(bounce_page);
> >
> > /* restore control page */
> > - *page = ctx->w.control_page;
> > + *page = ctx->control_page;
> >
> > if (restore)
> > fscrypt_restore_control_page(bounce_page);
> > diff --git a/fs/crypto/crypto.c b/fs/crypto/crypto.c
> > index 3fc84bf2b1e5..ffa9302a7351 100644
> > --- a/fs/crypto/crypto.c
> > +++ b/fs/crypto/crypto.c
> > @@ -53,6 +53,7 @@ struct kmem_cache *fscrypt_info_cachep;
> >
> > void fscrypt_enqueue_decrypt_work(struct work_struct *work)
> > {
> > + INIT_WORK(work, fscrypt_decrypt_work);
> > queue_work(fscrypt_read_workqueue, work);
> > }
> > EXPORT_SYMBOL(fscrypt_enqueue_decrypt_work);
> > @@ -70,11 +71,11 @@ void fscrypt_release_ctx(struct fscrypt_ctx *ctx)
> > {
> > unsigned long flags;
> >
> > - if (ctx->flags & FS_CTX_HAS_BOUNCE_BUFFER_FL && ctx->w.bounce_page) {
> > - mempool_free(ctx->w.bounce_page, fscrypt_bounce_page_pool);
> > - ctx->w.bounce_page = NULL;
> > + if (ctx->flags & FS_CTX_HAS_BOUNCE_BUFFER_FL && ctx->bounce_page) {
> > + mempool_free(ctx->bounce_page, fscrypt_bounce_page_pool);
> > + ctx->bounce_page = NULL;
> > }
> > - ctx->w.control_page = NULL;
> > + ctx->control_page = NULL;
> > if (ctx->flags & FS_CTX_REQUIRES_FREE_ENCRYPT_FL) {
> > kmem_cache_free(fscrypt_ctx_cachep, ctx);
> > } else {
> > @@ -194,11 +195,11 @@ int fscrypt_do_page_crypto(const struct inode *inode, fscrypt_direction_t rw,
> > struct page *fscrypt_alloc_bounce_page(struct fscrypt_ctx *ctx,
> > gfp_t gfp_flags)
> > {
> > - ctx->w.bounce_page = mempool_alloc(fscrypt_bounce_page_pool, gfp_flags);
> > - if (ctx->w.bounce_page == NULL)
> > + ctx->bounce_page = mempool_alloc(fscrypt_bounce_page_pool, gfp_flags);
> > + if (ctx->bounce_page == NULL)
> > return ERR_PTR(-ENOMEM);
> > ctx->flags |= FS_CTX_HAS_BOUNCE_BUFFER_FL;
> > - return ctx->w.bounce_page;
> > + return ctx->bounce_page;
> > }
> >
> > /**
> > @@ -267,7 +268,7 @@ struct page *fscrypt_encrypt_page(const struct inode *inode,
> > if (IS_ERR(ciphertext_page))
> > goto errout;
> >
> > - ctx->w.control_page = page;
> > + ctx->control_page = page;
> > err = fscrypt_do_page_crypto(inode, FS_ENCRYPT, lblk_num,
> > page, ciphertext_page, len, offs,
> > gfp_flags);
> > diff --git a/fs/crypto/fscrypt_private.h b/fs/crypto/fscrypt_private.h
> > index 7da276159593..412a3bcf9efd 100644
> > --- a/fs/crypto/fscrypt_private.h
> > +++ b/fs/crypto/fscrypt_private.h
> > @@ -114,6 +114,9 @@ static inline bool fscrypt_valid_enc_modes(u32 contents_mode,
> > return false;
> > }
> >
> > +/* bio.c */
> > +void fscrypt_decrypt_work(struct work_struct *work);
> > +
> > /* crypto.c */
> > extern struct kmem_cache *fscrypt_info_cachep;
> > extern int fscrypt_initialize(unsigned int cop_flags);
> > diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> > index f2b0e628ff7b..23f8568c9b53 100644
> > --- a/fs/ext4/ext4.h
> > +++ b/fs/ext4/ext4.h
> > @@ -3127,8 +3127,6 @@ static inline void ext4_set_de_type(struct super_block *sb,
> > extern int ext4_mpage_readpages(struct address_space *mapping,
> > struct list_head *pages, struct page *page,
> > unsigned nr_pages, bool is_readahead);
> > -extern int __init ext4_init_post_read_processing(void);
> > -extern void ext4_exit_post_read_processing(void);
> >
> > /* symlink.c */
> > extern const struct inode_operations ext4_encrypted_symlink_inode_operations;
> > diff --git a/fs/ext4/readpage.c b/fs/ext4/readpage.c
> > index 0169e3809da3..e363dededc21 100644
> > --- a/fs/ext4/readpage.c
> > +++ b/fs/ext4/readpage.c
> > @@ -44,14 +44,10 @@
> > #include <linux/backing-dev.h>
> > #include <linux/pagevec.h>
> > #include <linux/cleancache.h>
> > +#include <linux/read_callbacks.h>
> >
> > #include "ext4.h"
> >
> > -#define NUM_PREALLOC_POST_READ_CTXS 128
> > -
> > -static struct kmem_cache *bio_post_read_ctx_cache;
> > -static mempool_t *bio_post_read_ctx_pool;
> > -
> > static inline bool ext4_bio_encrypted(struct bio *bio)
> > {
> > #ifdef CONFIG_FS_ENCRYPTION
> > @@ -61,125 +57,6 @@ static inline bool ext4_bio_encrypted(struct bio *bio)
> > #endif
> > }
> >
> > -/* postprocessing steps for read bios */
> > -enum bio_post_read_step {
> > - STEP_INITIAL = 0,
> > - STEP_DECRYPT,
> > - STEP_VERITY,
> > -};
> > -
> > -struct bio_post_read_ctx {
> > - struct bio *bio;
> > - struct work_struct work;
> > - unsigned int cur_step;
> > - unsigned int enabled_steps;
> > -};
> > -
> > -static void __read_end_io(struct bio *bio)
> > -{
> > - struct page *page;
> > - struct bio_vec *bv;
> > - int i;
> > - struct bvec_iter_all iter_all;
> > -
> > - bio_for_each_segment_all(bv, bio, i, iter_all) {
> > - page = bv->bv_page;
> > -
> > - /* PG_error was set if any post_read step failed */
> > - if (bio->bi_status || PageError(page)) {
> > - ClearPageUptodate(page);
> > - SetPageError(page);
> > - } else {
> > - SetPageUptodate(page);
> > - }
> > - unlock_page(page);
> > - }
> > - if (bio->bi_private)
> > - mempool_free(bio->bi_private, bio_post_read_ctx_pool);
> > - bio_put(bio);
> > -}
> > -
> > -static void bio_post_read_processing(struct bio_post_read_ctx *ctx);
> > -
> > -static void decrypt_work(struct work_struct *work)
> > -{
> > - struct bio_post_read_ctx *ctx =
> > - container_of(work, struct bio_post_read_ctx, work);
> > -
> > - fscrypt_decrypt_bio(ctx->bio);
> > -
> > - bio_post_read_processing(ctx);
> > -}
> > -
> > -static void verity_work(struct work_struct *work)
> > -{
> > - struct bio_post_read_ctx *ctx =
> > - container_of(work, struct bio_post_read_ctx, work);
> > -
> > - fsverity_verify_bio(ctx->bio);
> > -
> > - bio_post_read_processing(ctx);
> > -}
> > -
> > -static void bio_post_read_processing(struct bio_post_read_ctx *ctx)
> > -{
> > - /*
> > - * We use different work queues for decryption and for verity because
> > - * verity may require reading metadata pages that need decryption, and
> > - * we shouldn't recurse to the same workqueue.
> > - */
> > - switch (++ctx->cur_step) {
> > - case STEP_DECRYPT:
> > - if (ctx->enabled_steps & (1 << STEP_DECRYPT)) {
> > - INIT_WORK(&ctx->work, decrypt_work);
> > - fscrypt_enqueue_decrypt_work(&ctx->work);
> > - return;
> > - }
> > - ctx->cur_step++;
> > - /* fall-through */
> > - case STEP_VERITY:
> > - if (ctx->enabled_steps & (1 << STEP_VERITY)) {
> > - INIT_WORK(&ctx->work, verity_work);
> > - fsverity_enqueue_verify_work(&ctx->work);
> > - return;
> > - }
> > - ctx->cur_step++;
> > - /* fall-through */
> > - default:
> > - __read_end_io(ctx->bio);
> > - }
> > -}
> > -
> > -static struct bio_post_read_ctx *get_bio_post_read_ctx(struct inode *inode,
> > - struct bio *bio,
> > - pgoff_t index)
> > -{
> > - unsigned int post_read_steps = 0;
> > - struct bio_post_read_ctx *ctx = NULL;
> > -
> > - if (IS_ENCRYPTED(inode) && S_ISREG(inode->i_mode))
> > - post_read_steps |= 1 << STEP_DECRYPT;
> > -#ifdef CONFIG_FS_VERITY
> > - if (inode->i_verity_info != NULL &&
> > - (index < ((i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT)))
> > - post_read_steps |= 1 << STEP_VERITY;
> > -#endif
> > - if (post_read_steps) {
> > - ctx = mempool_alloc(bio_post_read_ctx_pool, GFP_NOFS);
> > - if (!ctx)
> > - return ERR_PTR(-ENOMEM);
> > - ctx->bio = bio;
> > - ctx->enabled_steps = post_read_steps;
> > - bio->bi_private = ctx;
> > - }
> > - return ctx;
> > -}
> > -
> > -static bool bio_post_read_required(struct bio *bio)
> > -{
> > - return bio->bi_private && !bio->bi_status;
> > -}
> > -
> > /*
> > * I/O completion handler for multipage BIOs.
> > *
> > @@ -194,14 +71,30 @@ static bool bio_post_read_required(struct bio *bio)
> > */
> > static void mpage_end_io(struct bio *bio)
> > {
> > - if (bio_post_read_required(bio)) {
> > - struct bio_post_read_ctx *ctx = bio->bi_private;
> > + struct bio_vec *bv;
> > + int i;
> > + struct bvec_iter_all iter_all;
> > +#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
> > + if (read_callbacks_required(bio)) {
> > + struct read_callbacks_ctx *ctx = bio->bi_private;
> >
> > - ctx->cur_step = STEP_INITIAL;
> > - bio_post_read_processing(ctx);
> > + read_callbacks(ctx);
> > return;
> > }
> > - __read_end_io(bio);
> > +#endif
> > + bio_for_each_segment_all(bv, bio, i, iter_all) {
> > + struct page *page = bv->bv_page;
> > +
> > + if (!bio->bi_status) {
> > + SetPageUptodate(page);
> > + } else {
> > + ClearPageUptodate(page);
> > + SetPageError(page);
> > + }
> > + unlock_page(page);
> > + }
> > +
> > + bio_put(bio);
> > }
> >
> > static inline loff_t ext4_readpage_limit(struct inode *inode)
> > @@ -368,17 +261,19 @@ int ext4_mpage_readpages(struct address_space *mapping,
> > bio = NULL;
> > }
> > if (bio == NULL) {
> > - struct bio_post_read_ctx *ctx;
> > + struct read_callbacks_ctx *ctx = NULL;
> >
> > bio = bio_alloc(GFP_KERNEL,
> > min_t(int, nr_pages, BIO_MAX_PAGES));
> > if (!bio)
> > goto set_error_page;
> > - ctx = get_bio_post_read_ctx(inode, bio, page->index);
> > +#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
> > + ctx = get_read_callbacks_ctx(inode, bio, page->index);
> > if (IS_ERR(ctx)) {
> > bio_put(bio);
> > goto set_error_page;
> > }
> > +#endif
> > bio_set_dev(bio, bdev);
> > bio->bi_iter.bi_sector = blocks[0] << (blkbits - 9);
> > bio->bi_end_io = mpage_end_io;
> > @@ -417,29 +312,3 @@ int ext4_mpage_readpages(struct address_space *mapping,
> > submit_bio(bio);
> > return 0;
> > }
> > -
> > -int __init ext4_init_post_read_processing(void)
> > -{
> > - bio_post_read_ctx_cache =
> > - kmem_cache_create("ext4_bio_post_read_ctx",
> > - sizeof(struct bio_post_read_ctx), 0, 0, NULL);
> > - if (!bio_post_read_ctx_cache)
> > - goto fail;
> > - bio_post_read_ctx_pool =
> > - mempool_create_slab_pool(NUM_PREALLOC_POST_READ_CTXS,
> > - bio_post_read_ctx_cache);
> > - if (!bio_post_read_ctx_pool)
> > - goto fail_free_cache;
> > - return 0;
> > -
> > -fail_free_cache:
> > - kmem_cache_destroy(bio_post_read_ctx_cache);
> > -fail:
> > - return -ENOMEM;
> > -}
> > -
> > -void ext4_exit_post_read_processing(void)
> > -{
> > - mempool_destroy(bio_post_read_ctx_pool);
> > - kmem_cache_destroy(bio_post_read_ctx_cache);
> > -}
> > diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> > index 4ae6f5849caa..aba724f82cc3 100644
> > --- a/fs/ext4/super.c
> > +++ b/fs/ext4/super.c
> > @@ -6101,10 +6101,6 @@ static int __init ext4_init_fs(void)
> > return err;
> >
> > err = ext4_init_pending();
> > - if (err)
> > - goto out7;
> > -
> > - err = ext4_init_post_read_processing();
> > if (err)
> > goto out6;
> >
> > @@ -6146,10 +6142,8 @@ static int __init ext4_init_fs(void)
> > out4:
> > ext4_exit_pageio();
> > out5:
> > - ext4_exit_post_read_processing();
> > -out6:
> > ext4_exit_pending();
> > -out7:
> > +out6:
> > ext4_exit_es();
> >
> > return err;
> > @@ -6166,7 +6160,6 @@ static void __exit ext4_exit_fs(void)
> > ext4_exit_sysfs();
> > ext4_exit_system_zone();
> > ext4_exit_pageio();
> > - ext4_exit_post_read_processing();
> > ext4_exit_es();
> > ext4_exit_pending();
> > }
> > diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> > index 038b958d0fa9..05430d3650ab 100644
> > --- a/fs/f2fs/data.c
> > +++ b/fs/f2fs/data.c
> > @@ -18,6 +18,7 @@
> > #include <linux/uio.h>
> > #include <linux/cleancache.h>
> > #include <linux/sched/signal.h>
> > +#include <linux/read_callbacks.h>
> >
> > #include "f2fs.h"
> > #include "node.h"
> > @@ -25,11 +26,6 @@
> > #include "trace.h"
> > #include <trace/events/f2fs.h>
> >
> > -#define NUM_PREALLOC_POST_READ_CTXS 128
> > -
> > -static struct kmem_cache *bio_post_read_ctx_cache;
> > -static mempool_t *bio_post_read_ctx_pool;
> > -
> > static bool __is_cp_guaranteed(struct page *page)
> > {
> > struct address_space *mapping = page->mapping;
> > @@ -69,20 +65,6 @@ static enum count_type __read_io_type(struct page *page)
> > return F2FS_RD_DATA;
> > }
> >
> > -/* postprocessing steps for read bios */
> > -enum bio_post_read_step {
> > - STEP_INITIAL = 0,
> > - STEP_DECRYPT,
> > - STEP_VERITY,
> > -};
> > -
> > -struct bio_post_read_ctx {
> > - struct bio *bio;
> > - struct work_struct work;
> > - unsigned int cur_step;
> > - unsigned int enabled_steps;
> > -};
> > -
> > static void __read_end_io(struct bio *bio)
> > {
> > struct page *page;
> > @@ -104,65 +86,16 @@ static void __read_end_io(struct bio *bio)
> > dec_page_count(F2FS_P_SB(page), __read_io_type(page));
> > unlock_page(page);
> > }
> > - if (bio->bi_private)
> > - mempool_free(bio->bi_private, bio_post_read_ctx_pool);
> > - bio_put(bio);
> > -}
> > -
> > -static void bio_post_read_processing(struct bio_post_read_ctx *ctx);
> >
> > -static void decrypt_work(struct work_struct *work)
> > -{
> > - struct bio_post_read_ctx *ctx =
> > - container_of(work, struct bio_post_read_ctx, work);
> > -
> > - fscrypt_decrypt_bio(ctx->bio);
> > +#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
> > + if (bio->bi_private) {
> > + struct read_callbacks_ctx *ctx;
> >
> > - bio_post_read_processing(ctx);
> > -}
> > -
> > -static void verity_work(struct work_struct *work)
> > -{
> > - struct bio_post_read_ctx *ctx =
> > - container_of(work, struct bio_post_read_ctx, work);
> > -
> > - fsverity_verify_bio(ctx->bio);
> > -
> > - bio_post_read_processing(ctx);
> > -}
> > -
> > -static void bio_post_read_processing(struct bio_post_read_ctx *ctx)
> > -{
> > - /*
> > - * We use different work queues for decryption and for verity because
> > - * verity may require reading metadata pages that need decryption, and
> > - * we shouldn't recurse to the same workqueue.
> > - */
> > - switch (++ctx->cur_step) {
> > - case STEP_DECRYPT:
> > - if (ctx->enabled_steps & (1 << STEP_DECRYPT)) {
> > - INIT_WORK(&ctx->work, decrypt_work);
> > - fscrypt_enqueue_decrypt_work(&ctx->work);
> > - return;
> > - }
> > - ctx->cur_step++;
> > - /* fall-through */
> > - case STEP_VERITY:
> > - if (ctx->enabled_steps & (1 << STEP_VERITY)) {
> > - INIT_WORK(&ctx->work, verity_work);
> > - fsverity_enqueue_verify_work(&ctx->work);
> > - return;
> > - }
> > - ctx->cur_step++;
> > - /* fall-through */
> > - default:
> > - __read_end_io(ctx->bio);
> > + ctx = bio->bi_private;
> > + put_read_callbacks_ctx(ctx);
> > }
> > -}
> > -
> > -static bool f2fs_bio_post_read_required(struct bio *bio)
> > -{
> > - return bio->bi_private && !bio->bi_status;
> > +#endif
> > + bio_put(bio);
> > }
> >
> > static void f2fs_read_end_io(struct bio *bio)
> > @@ -173,14 +106,12 @@ static void f2fs_read_end_io(struct bio *bio)
> > bio->bi_status = BLK_STS_IOERR;
> > }
> >
> > - if (f2fs_bio_post_read_required(bio)) {
> > - struct bio_post_read_ctx *ctx = bio->bi_private;
> > -
> > - ctx->cur_step = STEP_INITIAL;
> > - bio_post_read_processing(ctx);
> > +#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
> > + if (!bio->bi_status && bio->bi_private) {
> > + read_callbacks((struct read_callbacks_ctx *)(bio->bi_private));
> > return;
>
> Previously, in __read_end_io() we will decrease in-flight read IO count for each
> page, but it looks in the case that if fscrypto or fsverity is on and there is
> no IO error in end_io(), we will miss handling the count.
>
Thanks for pointing this out. I will fix this in the next version of this
patchset.
--
chandan
On Wednesday, May 1, 2019 2:40:38 AM IST Jeremy Sowden wrote:
> On 2019-04-28, at 10:01:11 +0530, Chandan Rajendra wrote:
> > Ext4 and F2FS store verity metadata in data extents (beyond
> > inode->i_size) associated with a file. But other filesystems might
> > choose alternative means to store verity metadata. Hence this commit
> > adds a callback function pointer to 'struct fsverity_operations' to
> > help in deciding if verity operation needs to performed against a
> > page-cache page holding file data.
> >
> > Signed-off-by: Chandan Rajendra <[email protected]>
> > ---
> > fs/ext4/super.c | 6 ++++++
> > fs/f2fs/super.c | 6 ++++++
> > fs/read_callbacks.c | 4 +++-
> > include/linux/fsverity.h | 1 +
> > 4 files changed, 16 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> > index aba724f82cc3..63d73b360f1d 100644
> > --- a/fs/ext4/super.c
> > +++ b/fs/ext4/super.c
> > @@ -1428,10 +1428,16 @@ static struct page *ext4_read_verity_metadata_page(struct inode *inode,
> > return read_mapping_page(inode->i_mapping, index, NULL);
> > }
> >
> > +static bool ext4_verity_required(struct inode *inode, pgoff_t index)
> > +{
> > + return index < (i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT;
> > +}
> > +
> > static const struct fsverity_operations ext4_verityops = {
> > .set_verity = ext4_set_verity,
> > .get_metadata_end = ext4_get_verity_metadata_end,
> > .read_metadata_page = ext4_read_verity_metadata_page,
> > + .verity_required = ext4_verity_required,
> > };
> > #endif /* CONFIG_FS_VERITY */
> >
> > diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> > index 2f75f06c784a..cd1299e1f92d 100644
> > --- a/fs/f2fs/super.c
> > +++ b/fs/f2fs/super.c
> > @@ -2257,10 +2257,16 @@ static struct page *f2fs_read_verity_metadata_page(struct inode *inode,
> > return read_mapping_page(inode->i_mapping, index, NULL);
> > }
> >
> > +static bool f2fs_verity_required(struct inode *inode, pgoff_t index)
> > +{
> > + return index < (i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT;
> > +}
> > +
> > static const struct fsverity_operations f2fs_verityops = {
> > .set_verity = f2fs_set_verity,
> > .get_metadata_end = f2fs_get_verity_metadata_end,
> > .read_metadata_page = f2fs_read_verity_metadata_page,
> > + .verity_required = f2fs_verity_required,
> > };
> > #endif /* CONFIG_FS_VERITY */
> >
> > diff --git a/fs/read_callbacks.c b/fs/read_callbacks.c
> > index b6d5b95e67d7..6dea54b0baa9 100644
> > --- a/fs/read_callbacks.c
> > +++ b/fs/read_callbacks.c
> > @@ -86,7 +86,9 @@ struct read_callbacks_ctx *get_read_callbacks_ctx(struct inode *inode,
> > read_callbacks_steps |= 1 << STEP_DECRYPT;
> > #ifdef CONFIG_FS_VERITY
> > if (inode->i_verity_info != NULL &&
> > - (index < ((i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT)))
> > + ((inode->i_sb->s_vop->verity_required
> > + && inode->i_sb->s_vop->verity_required(inode, index))
> > + || (inode->i_sb->s_vop->verity_required == NULL)))
>
> I think this is a bit easier to follow:
>
> (inode->i_sb->s_vop->verity_required == NULL ||
> inode->i_sb->s_vop->verity_required(inode, index)))
Yes, you are right. I will implement the changes you suggested.
--
chandan
On Tuesday, April 30, 2019 11:35:08 PM IST Eric Biggers wrote:
> On Sun, Apr 28, 2019 at 10:01:10AM +0530, Chandan Rajendra wrote:
> > The "read callbacks" code is used by both Ext4 and F2FS. Hence to
> > remove duplicity, this commit moves the code into
> > include/linux/read_callbacks.h and fs/read_callbacks.c.
> >
> > The corresponding decrypt and verity "work" functions have been moved
> > inside fscrypt and fsverity sources. With these in place, the read
> > callbacks code now has to just invoke enqueue functions provided by
> > fscrypt and fsverity.
> >
> > Signed-off-by: Chandan Rajendra <[email protected]>
> > ---
> > fs/Kconfig | 4 +
> > fs/Makefile | 4 +
> > fs/crypto/Kconfig | 1 +
> > fs/crypto/bio.c | 23 ++---
> > fs/crypto/crypto.c | 17 +--
> > fs/crypto/fscrypt_private.h | 3 +
> > fs/ext4/ext4.h | 2 -
> > fs/ext4/readpage.c | 183 +++++----------------------------
> > fs/ext4/super.c | 9 +-
> > fs/f2fs/data.c | 148 ++++----------------------
> > fs/f2fs/super.c | 9 +-
> > fs/read_callbacks.c | 136 ++++++++++++++++++++++++
> > fs/verity/Kconfig | 1 +
> > fs/verity/verify.c | 12 +++
> > include/linux/fscrypt.h | 20 +---
> > include/linux/read_callbacks.h | 21 ++++
> > 16 files changed, 251 insertions(+), 342 deletions(-)
> > create mode 100644 fs/read_callbacks.c
> > create mode 100644 include/linux/read_callbacks.h
> >
> > diff --git a/fs/Kconfig b/fs/Kconfig
> > index 97f9eb8df713..03084f2dbeaf 100644
> > --- a/fs/Kconfig
> > +++ b/fs/Kconfig
> > @@ -308,6 +308,10 @@ config NFS_COMMON
> > depends on NFSD || NFS_FS || LOCKD
> > default y
> >
> > +config FS_READ_CALLBACKS
> > + bool
> > + default n
> > +
> > source "net/sunrpc/Kconfig"
> > source "fs/ceph/Kconfig"
> > source "fs/cifs/Kconfig"
>
> This shouldn't be under the 'if NETWORK_FILESYSTEMS' block, since it has nothing
> to do with network filesystems. When trying to compile this I got:
>
> WARNING: unmet direct dependencies detected for FS_READ_CALLBACKS
> Depends on [n]: NETWORK_FILESYSTEMS [=n]
> Selected by [y]:
> - FS_ENCRYPTION [=y]
> - FS_VERITY [=y]
>
> Perhaps put it just below FS_IOMAP?
>
> > diff --git a/fs/Makefile b/fs/Makefile
> > index 9dd2186e74b5..e0c0fce8cf40 100644
> > --- a/fs/Makefile
> > +++ b/fs/Makefile
> > @@ -21,6 +21,10 @@ else
> > obj-y += no-block.o
> > endif
> >
> > +ifeq ($(CONFIG_FS_READ_CALLBACKS),y)
> > +obj-y += read_callbacks.o
> > +endif
> > +
> > obj-$(CONFIG_PROC_FS) += proc_namespace.o
> >
> > obj-y += notify/
> > diff --git a/fs/crypto/Kconfig b/fs/crypto/Kconfig
> > index f0de238000c0..163c328bcbd4 100644
> > --- a/fs/crypto/Kconfig
> > +++ b/fs/crypto/Kconfig
> > @@ -8,6 +8,7 @@ config FS_ENCRYPTION
> > select CRYPTO_CTS
> > select CRYPTO_SHA256
> > select KEYS
> > + select FS_READ_CALLBACKS
> > help
> > Enable encryption of files and directories. This
> > feature is similar to ecryptfs, but it is more memory
>
> This selection needs to be conditional on BLOCK.
>
> select FS_READ_CALLBACKS if BLOCK
>
> Otherwise, building without BLOCK and with UBIFS encryption support fails.
>
> fs/read_callbacks.c: In function ‘end_read_callbacks’:
> fs/read_callbacks.c:34:23: error: storage size of ‘iter_all’ isn’t known
> struct bvec_iter_all iter_all;
> ^~~~~~~~
> fs/read_callbacks.c:37:20: error: dereferencing pointer to incomplete type ‘struct buffer_head’
> if (!PageError(bh->b_page))
>
> [...]
>
I will fix this in the next version of this patchset.
--
chandan
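Combining the two suggestions above, the Kconfig changes would look roughly like this (an untested sketch; the placement next to FS_IOMAP follows Eric's suggestion). Note also that `default n` is redundant for a bool symbol and can be dropped:

```kconfig
# fs/Kconfig: define the symbol outside the 'if NETWORK_FILESYSTEMS'
# block, e.g. next to FS_IOMAP
config FS_READ_CALLBACKS
	bool

# fs/crypto/Kconfig: only pull in the bio-based read callbacks when the
# block layer is available, so no-BLOCK configs (e.g. UBIFS-only) still
# build
config FS_ENCRYPTION
	select FS_READ_CALLBACKS if BLOCK
```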
Hi Eric,
On Tuesday, April 30, 2019 6:08:18 AM IST Eric Biggers wrote:
> On Sun, Apr 28, 2019 at 10:01:15AM +0530, Chandan Rajendra wrote:
> > To support decryption of sub-pagesized blocks this commit adds code to,
> > 1. Track buffer head in "struct read_callbacks_ctx".
> > 2. Pass buffer head argument to all read callbacks.
> > 3. In the corresponding endio, loop across all the blocks mapped by the
> > page, decrypting each block in turn.
> >
> > Signed-off-by: Chandan Rajendra <[email protected]>
> > ---
> > fs/buffer.c | 83 +++++++++++++++++++++++++---------
> > fs/crypto/bio.c | 50 +++++++++++++-------
> > fs/crypto/crypto.c | 19 +++++++-
> > fs/f2fs/data.c | 2 +-
> > fs/mpage.c | 2 +-
> > fs/read_callbacks.c | 53 ++++++++++++++--------
> > include/linux/buffer_head.h | 1 +
> > include/linux/read_callbacks.h | 5 +-
> > 8 files changed, 154 insertions(+), 61 deletions(-)
> >
> > diff --git a/fs/buffer.c b/fs/buffer.c
> > index ce357602f471..f324727e24bb 100644
> > --- a/fs/buffer.c
> > +++ b/fs/buffer.c
> > @@ -45,6 +45,7 @@
> > #include <linux/bit_spinlock.h>
> > #include <linux/pagevec.h>
> > #include <linux/sched/mm.h>
> > +#include <linux/read_callbacks.h>
> > #include <trace/events/block.h>
> >
> > static int fsync_buffers_list(spinlock_t *lock, struct list_head *list);
> > @@ -245,11 +246,7 @@ __find_get_block_slow(struct block_device *bdev, sector_t block)
> > return ret;
> > }
> >
> > -/*
> > - * I/O completion handler for block_read_full_page() - pages
> > - * which come unlocked at the end of I/O.
> > - */
> > -static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
> > +void end_buffer_page_read(struct buffer_head *bh)
>
> I think __end_buffer_async_read() would be a better name, since the *page* isn't
> necessarily done yet.
>
> > {
> > unsigned long flags;
> > struct buffer_head *first;
> > @@ -257,17 +254,7 @@ static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
> > struct page *page;
> > int page_uptodate = 1;
> >
> > - BUG_ON(!buffer_async_read(bh));
> > -
> > page = bh->b_page;
> > - if (uptodate) {
> > - set_buffer_uptodate(bh);
> > - } else {
> > - clear_buffer_uptodate(bh);
> > - buffer_io_error(bh, ", async page read");
> > - SetPageError(page);
> > - }
> > -
> > /*
> > * Be _very_ careful from here on. Bad things can happen if
> > * two buffer heads end IO at almost the same time and both
> > @@ -305,6 +292,44 @@ static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
> > local_irq_restore(flags);
> > return;
> > }
> > +EXPORT_SYMBOL(end_buffer_page_read);
>
> No need for EXPORT_SYMBOL() here, as this is only called by built-in code.
>
> > +
> > +/*
> > + * I/O completion handler for block_read_full_page() - pages
> > + * which come unlocked at the end of I/O.
> > + */
>
> This comment is no longer correct. Change to something like:
>
> /*
> * I/O completion handler for block_read_full_page(). Pages are unlocked after
> * the I/O completes and the read callbacks (if any) have executed.
> */
>
> > +static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
> > +{
> > + struct page *page;
> > +
> > + BUG_ON(!buffer_async_read(bh));
> > +
> > +#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
> > + if (uptodate && bh->b_private) {
> > + struct read_callbacks_ctx *ctx = bh->b_private;
> > +
> > + read_callbacks(ctx);
> > + return;
> > + }
> > +
> > + if (bh->b_private) {
> > + struct read_callbacks_ctx *ctx = bh->b_private;
> > +
> > + WARN_ON(uptodate);
> > + put_read_callbacks_ctx(ctx);
> > + }
> > +#endif
>
> These details should be handled in read_callbacks code, not here. AFAICS, all
> you need is a function read_callbacks_end_bh() that returns a bool indicating
> whether it handled the buffer_head or not:
>
> static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
> {
> BUG_ON(!buffer_async_read(bh));
>
> if (read_callbacks_end_bh(bh, uptodate))
> return;
>
> page = bh->b_page;
> ...
> }
>
> Then read_callbacks_end_bh() would check ->b_private and uptodate, and call
> read_callbacks() or put_read_callbacks_ctx() as appropriate. When
> CONFIG_FS_READ_CALLBACKS=n it would be a stub that always returns false.
>
> > + page = bh->b_page;
> [...]
>
> > }
> > @@ -2292,11 +2323,21 @@ int block_read_full_page(struct page *page, get_block_t *get_block)
> > * the underlying blockdev brought it uptodate (the sct fix).
> > */
> > for (i = 0; i < nr; i++) {
> > - bh = arr[i];
> > - if (buffer_uptodate(bh))
> > + bh = arr[i].bh;
> > + if (buffer_uptodate(bh)) {
> > end_buffer_async_read(bh, 1);
> > - else
> > + } else {
> > +#if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
> > + struct read_callbacks_ctx *ctx;
> > +
> > + ctx = get_read_callbacks_ctx(inode, NULL, bh, arr[i].blk_nr);
> > + if (WARN_ON(IS_ERR(ctx))) {
> > + end_buffer_async_read(bh, 0);
> > + continue;
> > + }
> > +#endif
> > submit_bh(REQ_OP_READ, 0, bh);
> > + }
> > }
> > return 0;
>
> Similarly here. This level of detail doesn't need to be exposed outside of the
> read_callbacks code. Just call read_callbacks_setup_bh() or something, make it
> return an 'err' rather than the read_callbacks_ctx, and make read_callbacks.h
> stub it out when !CONFIG_FS_READ_CALLBACKS. There should be no #ifdef here.
>
> > diff --git a/fs/crypto/bio.c b/fs/crypto/bio.c
> > index 27f5618174f2..856f4694902d 100644
> > --- a/fs/crypto/bio.c
> > +++ b/fs/crypto/bio.c
> > @@ -24,44 +24,62 @@
> > #include <linux/module.h>
> > #include <linux/bio.h>
> > #include <linux/namei.h>
> > +#include <linux/buffer_head.h>
> > #include <linux/read_callbacks.h>
> >
> > #include "fscrypt_private.h"
> >
> > -static void __fscrypt_decrypt_bio(struct bio *bio, bool done)
> > +static void fscrypt_decrypt(struct bio *bio, struct buffer_head *bh)
> > {
> > + struct inode *inode;
> > + struct page *page;
> > struct bio_vec *bv;
> > + sector_t blk_nr;
> > + int ret;
> > int i;
> > struct bvec_iter_all iter_all;
> >
> > - bio_for_each_segment_all(bv, bio, i, iter_all) {
> > - struct page *page = bv->bv_page;
> > - int ret = fscrypt_decrypt_page(page->mapping->host, page,
> > - PAGE_SIZE, 0, page->index);
> > + WARN_ON(!bh && !bio);
> >
> > + if (bh) {
> > + page = bh->b_page;
> > + inode = page->mapping->host;
> > +
> > + blk_nr = page->index << (PAGE_SHIFT - inode->i_blkbits);
> > + blk_nr += (bh_offset(bh) >> inode->i_blkbits);
> > +
> > + ret = fscrypt_decrypt_page(inode, page, i_blocksize(inode),
> > + bh_offset(bh), blk_nr);
> > if (ret) {
> > WARN_ON_ONCE(1);
> > SetPageError(page);
> > - } else if (done) {
> > - SetPageUptodate(page);
> > }
> > - if (done)
> > - unlock_page(page);
> > + } else if (bio) {
> > + bio_for_each_segment_all(bv, bio, i, iter_all) {
> > + unsigned int blkbits;
> > +
> > + page = bv->bv_page;
> > + inode = page->mapping->host;
> > + blkbits = inode->i_blkbits;
> > + blk_nr = page->index << (PAGE_SHIFT - blkbits);
> > + blk_nr += (bv->bv_offset >> blkbits);
> > + ret = fscrypt_decrypt_page(page->mapping->host,
> > + page, bv->bv_len,
> > + bv->bv_offset, blk_nr);
> > + if (ret) {
> > + WARN_ON_ONCE(1);
> > + SetPageError(page);
> > + }
> > + }
> > }
> > }
>
> For clarity, can you make these two different functions?
> fscrypt_decrypt_bio() and fscrypt_decrypt_bh().
>
> FYI, the WARN_ON_ONCE() here was removed in the latest fscrypt tree.
>
> >
> > -void fscrypt_decrypt_bio(struct bio *bio)
> > -{
> > - __fscrypt_decrypt_bio(bio, false);
> > -}
> > -EXPORT_SYMBOL(fscrypt_decrypt_bio);
> > -
> > void fscrypt_decrypt_work(struct work_struct *work)
> > {
> > struct read_callbacks_ctx *ctx =
> > container_of(work, struct read_callbacks_ctx, work);
> >
> > - fscrypt_decrypt_bio(ctx->bio);
> > + fscrypt_decrypt(ctx->bio, ctx->bh);
> >
> > read_callbacks(ctx);
> > }
> > diff --git a/fs/crypto/crypto.c b/fs/crypto/crypto.c
> > index ffa9302a7351..4f0d832cae71 100644
> > --- a/fs/crypto/crypto.c
> > +++ b/fs/crypto/crypto.c
> > @@ -305,11 +305,26 @@ EXPORT_SYMBOL(fscrypt_encrypt_page);
> > int fscrypt_decrypt_page(const struct inode *inode, struct page *page,
> > unsigned int len, unsigned int offs, u64 lblk_num)
> > {
> > + int i, page_nr_blks;
> > + int err = 0;
> > +
> > if (!(inode->i_sb->s_cop->flags & FS_CFLG_OWN_PAGES))
> > BUG_ON(!PageLocked(page));
> >
> > - return fscrypt_do_page_crypto(inode, FS_DECRYPT, lblk_num, page, page,
> > - len, offs, GFP_NOFS);
> > + page_nr_blks = len >> inode->i_blkbits;
> > +
> > + for (i = 0; i < page_nr_blks; i++) {
> > + err = fscrypt_do_page_crypto(inode, FS_DECRYPT, lblk_num,
> > + page, page, i_blocksize(inode), offs,
> > + GFP_NOFS);
> > + if (err)
> > + break;
> > +
> > + ++lblk_num;
> > + offs += i_blocksize(inode);
> > + }
> > +
> > + return err;
> > }
> > EXPORT_SYMBOL(fscrypt_decrypt_page);
>
> I was confused by the code calling this until I saw you updated it to handle
> multiple blocks. Can you please rename it to fscrypt_decrypt_blocks()? The
> function comment also needs to be updated to clarify what it does now (decrypt a
> contiguous sequence of one or more filesystem blocks in the page). Also,
> 'lblk_num' should be renamed to 'starting_lblk_num' or similar.
>
fscrypt_decrypt_page() has the same semantics as fscrypt_encrypt_page(),
i.e. both operate on the contiguous blocks mapped by a page. That was my
reason for leaving the names unchanged. Please let me know if you still
think both functions should be renamed to
fscrypt_[decrypt|encrypt]_blocks().
> Please also rename fscrypt_do_page_crypto() to fscrypt_crypt_block().
Sure, I will make the change.
>
> Also, there should be a check that the len and offset are block-aligned:
>
> const unsigned int blocksize = i_blocksize(inode);
>
> if (!IS_ALIGNED(len | offs, blocksize))
> return -EINVAL;
>
> >
> > diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> > index 05430d3650ab..ba437a2085e7 100644
> > --- a/fs/f2fs/data.c
> > +++ b/fs/f2fs/data.c
> > @@ -527,7 +527,7 @@ static struct bio *f2fs_grab_read_bio(struct inode *inode, block_t blkaddr,
> > bio_set_op_attrs(bio, REQ_OP_READ, op_flag);
> >
> > #if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
> > - ctx = get_read_callbacks_ctx(inode, bio, first_idx);
> > + ctx = get_read_callbacks_ctx(inode, bio, NULL, first_idx);
> > if (IS_ERR(ctx)) {
> > bio_put(bio);
> > return (struct bio *)ctx;
> > diff --git a/fs/mpage.c b/fs/mpage.c
> > index e342b859ee44..0557479fdca4 100644
> > --- a/fs/mpage.c
> > +++ b/fs/mpage.c
> > @@ -348,7 +348,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args)
> > goto confused;
> >
> > #if defined(CONFIG_FS_ENCRYPTION) || defined(CONFIG_FS_VERITY)
> > - ctx = get_read_callbacks_ctx(inode, args->bio, page->index);
> > + ctx = get_read_callbacks_ctx(inode, args->bio, NULL, page->index);
> > if (IS_ERR(ctx)) {
> > bio_put(args->bio);
> > args->bio = NULL;
> > diff --git a/fs/read_callbacks.c b/fs/read_callbacks.c
> > index 6dea54b0baa9..b3881c525720 100644
> > --- a/fs/read_callbacks.c
> > +++ b/fs/read_callbacks.c
> > @@ -8,6 +8,7 @@
> > #include <linux/mm.h>
> > #include <linux/pagemap.h>
> > #include <linux/bio.h>
> > +#include <linux/buffer_head.h>
> > #include <linux/fscrypt.h>
> > #include <linux/fsverity.h>
> > #include <linux/read_callbacks.h>
> > @@ -24,26 +25,41 @@ enum read_callbacks_step {
> > STEP_VERITY,
> > };
> >
> > -void end_read_callbacks(struct bio *bio)
> > +void end_read_callbacks(struct bio *bio, struct buffer_head *bh)
> > {
> > + struct read_callbacks_ctx *ctx;
> > struct page *page;
> > struct bio_vec *bv;
> > int i;
> > struct bvec_iter_all iter_all;
> >
> > - bio_for_each_segment_all(bv, bio, i, iter_all) {
> > - page = bv->bv_page;
> > + if (bh) {
> > + if (!PageError(bh->b_page))
> > + set_buffer_uptodate(bh);
> >
> > - BUG_ON(bio->bi_status);
> > + ctx = bh->b_private;
> >
> > - if (!PageError(page))
> > - SetPageUptodate(page);
> > + end_buffer_page_read(bh);
> >
> > - unlock_page(page);
> > + put_read_callbacks_ctx(ctx);
> > + } else if (bio) {
> > + bio_for_each_segment_all(bv, bio, i, iter_all) {
> > + page = bv->bv_page;
> > +
> > + WARN_ON(bio->bi_status);
> > +
> > + if (!PageError(page))
> > + SetPageUptodate(page);
> > +
> > + unlock_page(page);
> > + }
> > + WARN_ON(!bio->bi_private);
> > +
> > + ctx = bio->bi_private;
> > + put_read_callbacks_ctx(ctx);
> > +
> > + bio_put(bio);
> > }
> > - if (bio->bi_private)
> > - put_read_callbacks_ctx(bio->bi_private);
> > - bio_put(bio);
> > }
> > EXPORT_SYMBOL(end_read_callbacks);
>
> To make this easier to read, can you split this into end_read_callbacks_bio()
> and end_read_callbacks_bh()?
Sure, I will make the change.
>
> >
> > @@ -70,18 +86,21 @@ void read_callbacks(struct read_callbacks_ctx *ctx)
> > ctx->cur_step++;
> > /* fall-through */
> > default:
> > - end_read_callbacks(ctx->bio);
> > + end_read_callbacks(ctx->bio, ctx->bh);
> > }
> > }
> > EXPORT_SYMBOL(read_callbacks);
> >
> > struct read_callbacks_ctx *get_read_callbacks_ctx(struct inode *inode,
> > struct bio *bio,
> > + struct buffer_head *bh,
> > pgoff_t index)
> > {
> > unsigned int read_callbacks_steps = 0;
> > struct read_callbacks_ctx *ctx = NULL;
> >
> > + WARN_ON(!bh && !bio);
> > +
>
> If this condition is true, return an error code; don't continue on.
>
> > if (IS_ENCRYPTED(inode) && S_ISREG(inode->i_mode))
> > read_callbacks_steps |= 1 << STEP_DECRYPT;
> > #ifdef CONFIG_FS_VERITY
> > @@ -95,11 +114,15 @@ struct read_callbacks_ctx *get_read_callbacks_ctx(struct inode *inode,
> > ctx = mempool_alloc(read_callbacks_ctx_pool, GFP_NOFS);
> > if (!ctx)
> > return ERR_PTR(-ENOMEM);
> > + ctx->bh = bh;
> > ctx->bio = bio;
> > ctx->inode = inode;
> > ctx->enabled_steps = read_callbacks_steps;
> > ctx->cur_step = STEP_INITIAL;
> > - bio->bi_private = ctx;
> > + if (bio)
> > + bio->bi_private = ctx;
> > + else if (bh)
> > + bh->b_private = ctx;
>
> ... and if doing that, then you don't need to check 'else if (bh)' here.
I agree.
>
> > }
> > return ctx;
> > }
> > @@ -111,12 +134,6 @@ void put_read_callbacks_ctx(struct read_callbacks_ctx *ctx)
> > }
> > EXPORT_SYMBOL(put_read_callbacks_ctx);
> >
> > -bool read_callbacks_required(struct bio *bio)
> > -{
> > - return bio->bi_private && !bio->bi_status;
> > -}
> > -EXPORT_SYMBOL(read_callbacks_required);
> > -
>
> It's unexpected that the patch series introduces this function,
> only to delete it later.
I had replaced bio_post_read_required() with read_callbacks_required(). I
will remove it, since the post-read check now needs to cover buffer heads
as well.
--
chandan
On Tuesday, April 30, 2019 10:31:51 PM IST Eric Biggers wrote:
> On Sun, Apr 28, 2019 at 10:01:19AM +0530, Chandan Rajendra wrote:
> > For subpage-sized blocks, the initial logical block number mapped by a
> > page can be different from page->index. Hence this commit adds code to
> > compute the first logical block mapped by the page and also the page
> > range to be encrypted.
> >
> > Signed-off-by: Chandan Rajendra <[email protected]>
> > ---
> > fs/ext4/page-io.c | 9 +++++++--
> > 1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
> > index 3e9298e6a705..75485ee9e800 100644
> > --- a/fs/ext4/page-io.c
> > +++ b/fs/ext4/page-io.c
> > @@ -418,6 +418,7 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
> > {
> > struct page *data_page = NULL;
> > struct inode *inode = page->mapping->host;
> > + u64 page_blk;
> > unsigned block_start;
> > struct buffer_head *bh, *head;
> > int ret = 0;
> > @@ -478,10 +479,14 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
> >
> > if (IS_ENCRYPTED(inode) && S_ISREG(inode->i_mode) && nr_to_submit) {
> > gfp_t gfp_flags = GFP_NOFS;
> > + unsigned int page_bytes;
> > +
>
> page_blk should be declared here, just after page_bytes.
>
> > + page_bytes = round_up(len, i_blocksize(inode));
> > + page_blk = page->index << (PAGE_SHIFT - inode->i_blkbits);
>
> Although block numbers are 32-bit in ext4, if you're going to make 'page_blk' a
> u64 anyway, then for consistency page->index should be cast to u64 here.
>
> >
> > retry_encrypt:
> > - data_page = fscrypt_encrypt_page(inode, page, PAGE_SIZE, 0,
> > - page->index, gfp_flags);
> > + data_page = fscrypt_encrypt_page(inode, page, page_bytes, 0,
> > + page_blk, gfp_flags);
> > if (IS_ERR(data_page)) {
> > ret = PTR_ERR(data_page);
> > if (ret == -ENOMEM && wbc->sync_mode == WB_SYNC_ALL) {
>
>
I will implement the changes suggested here.
--
chandan
On Wednesday, May 1, 2019 4:38:41 AM IST Eric Biggers wrote:
> On Tue, Apr 30, 2019 at 10:11:35AM -0700, Eric Biggers wrote:
> > On Sun, Apr 28, 2019 at 10:01:18AM +0530, Chandan Rajendra wrote:
> > > For subpage-sized blocks, this commit now encrypts all blocks mapped by
> > > a page range.
> > >
> > > Signed-off-by: Chandan Rajendra <[email protected]>
> > > ---
> > > fs/crypto/crypto.c | 37 +++++++++++++++++++++++++------------
> > > 1 file changed, 25 insertions(+), 12 deletions(-)
> > >
> > > diff --git a/fs/crypto/crypto.c b/fs/crypto/crypto.c
> > > index 4f0d832cae71..2d65b431563f 100644
> > > --- a/fs/crypto/crypto.c
> > > +++ b/fs/crypto/crypto.c
> > > @@ -242,18 +242,26 @@ struct page *fscrypt_encrypt_page(const struct inode *inode,
> >
> > Need to update the function comment to clearly explain what this function
> > actually does now.
> >
> > > {
> > > struct fscrypt_ctx *ctx;
> > > struct page *ciphertext_page = page;
> > > + int i, page_nr_blks;
> > > int err;
> > >
> > > BUG_ON(len % FS_CRYPTO_BLOCK_SIZE != 0);
> > >
> >
> > Make a 'blocksize' variable so you don't have to keep calling i_blocksize().
> >
> > Also, you need to check whether 'len' and 'offs' are filesystem-block-aligned,
> > since the code now assumes it.
> >
> > const unsigned int blocksize = i_blocksize(inode);
> >
> > if (!IS_ALIGNED(len | offs, blocksize))
> > return -EINVAL;
> >
> > However, did you check whether that's always true for ubifs? It looks like it
> > may expect to encrypt a prefix of a block, that is only padded to the next
> > 16-byte boundary.
> >
> > > + page_nr_blks = len >> inode->i_blkbits;
> > > +
> > > if (inode->i_sb->s_cop->flags & FS_CFLG_OWN_PAGES) {
> > > /* with inplace-encryption we just encrypt the page */
> > > - err = fscrypt_do_page_crypto(inode, FS_ENCRYPT, lblk_num, page,
> > > - ciphertext_page, len, offs,
> > > - gfp_flags);
> > > - if (err)
> > > - return ERR_PTR(err);
> > > -
> > > + for (i = 0; i < page_nr_blks; i++) {
> > > + err = fscrypt_do_page_crypto(inode, FS_ENCRYPT,
> > > + lblk_num, page,
> > > + ciphertext_page,
> > > + i_blocksize(inode), offs,
> > > + gfp_flags);
> > > + if (err)
> > > + return ERR_PTR(err);
>
> Apparently ubifs does encrypt data shorter than the filesystem block size, so
> this part is wrong.
>
> I suggest we split this into two functions, fscrypt_encrypt_block_inplace() and
> fscrypt_encrypt_blocks(), so that it's conceptually simpler what each function
> does. Currently this works completely differently depending on whether the
> filesystem set FS_CFLG_OWN_PAGES in its fscrypt_operations, which is weird.
>
> I also noticed that using fscrypt_ctx for writes seems to be unnecessary.
> AFAICS, page_private(bounce_page) could point directly to the pagecache page.
> That would simplify things a lot, especially since then fscrypt_ctx could be
> removed entirely after you convert reads to use read_callbacks_ctx.
>
> IMO, these would be worthwhile cleanups for fscrypt by themselves, without
> waiting for the read_callbacks stuff to be finalized. Finalizing the
> read_callbacks stuff will probably require reaching a consensus about how they
> should work with future filesystem features like fsverity and compression.
>
> So to move things forward, I'm considering sending out a series with the above
> cleanups for fscrypt, plus the equivalent of your patches:
>
> "fscrypt_encrypt_page: Loop across all blocks mapped by a page range"
> "fscrypt_zeroout_range: Encrypt all zeroed out blocks of a page"
> "Add decryption support for sub-pagesized blocks" (fs/crypto/ part only)
>
> Then hopefully we can get all that applied for 5.3 so that fs/crypto/ itself is
> ready for blocksize != PAGE_SIZE; and get your changes to ext4_bio_write_page(),
> __ext4_block_zero_page_range(), and ext4_block_write_begin() applied too, so
> that ext4 is partially ready for encryption with blocksize != PAGE_SIZE.
>
> Then only the read_callbacks stuff will remain, to get encryption support into
> fs/mpage.c and fs/buffer.c. Do you think that's a good plan?
Hi Eric,
IMO, I would prefer to continue posting the next version of the current
patchset; if there are no serious reservations from the FS maintainers, the
"read callbacks" patchset can be merged. In that case, since the cleanups
are not complicated, they can be merged afterwards.
--
chandan
Hi Chandan,
On Wed, May 01, 2019 at 08:19:35PM +0530, Chandan Rajendra wrote:
> On Wednesday, May 1, 2019 4:38:41 AM IST Eric Biggers wrote:
> > On Tue, Apr 30, 2019 at 10:11:35AM -0700, Eric Biggers wrote:
> > > On Sun, Apr 28, 2019 at 10:01:18AM +0530, Chandan Rajendra wrote:
> > > > For subpage-sized blocks, this commit now encrypts all blocks mapped by
> > > > a page range.
> > > >
> > > > Signed-off-by: Chandan Rajendra <[email protected]>
> > > > ---
> > > > fs/crypto/crypto.c | 37 +++++++++++++++++++++++++------------
> > > > 1 file changed, 25 insertions(+), 12 deletions(-)
> > > >
> > > > diff --git a/fs/crypto/crypto.c b/fs/crypto/crypto.c
> > > > index 4f0d832cae71..2d65b431563f 100644
> > > > --- a/fs/crypto/crypto.c
> > > > +++ b/fs/crypto/crypto.c
> > > > @@ -242,18 +242,26 @@ struct page *fscrypt_encrypt_page(const struct inode *inode,
> > >
> > > Need to update the function comment to clearly explain what this function
> > > actually does now.
> > >
> > > > {
> > > > struct fscrypt_ctx *ctx;
> > > > struct page *ciphertext_page = page;
> > > > + int i, page_nr_blks;
> > > > int err;
> > > >
> > > > BUG_ON(len % FS_CRYPTO_BLOCK_SIZE != 0);
> > > >
> > >
> > > Make a 'blocksize' variable so you don't have to keep calling i_blocksize().
> > >
> > > Also, you need to check whether 'len' and 'offs' are filesystem-block-aligned,
> > > since the code now assumes it.
> > >
> > > const unsigned int blocksize = i_blocksize(inode);
> > >
> > > if (!IS_ALIGNED(len | offs, blocksize))
> > > return -EINVAL;
> > >
> > > However, did you check whether that's always true for ubifs? It looks like it
> > > may expect to encrypt a prefix of a block, that is only padded to the next
> > > 16-byte boundary.
> > >
> > > > + page_nr_blks = len >> inode->i_blkbits;
> > > > +
> > > > if (inode->i_sb->s_cop->flags & FS_CFLG_OWN_PAGES) {
> > > > /* with inplace-encryption we just encrypt the page */
> > > > - err = fscrypt_do_page_crypto(inode, FS_ENCRYPT, lblk_num, page,
> > > > - ciphertext_page, len, offs,
> > > > - gfp_flags);
> > > > - if (err)
> > > > - return ERR_PTR(err);
> > > > -
> > > > + for (i = 0; i < page_nr_blks; i++) {
> > > > + err = fscrypt_do_page_crypto(inode, FS_ENCRYPT,
> > > > + lblk_num, page,
> > > > + ciphertext_page,
> > > > + i_blocksize(inode), offs,
> > > > + gfp_flags);
> > > > + if (err)
> > > > + return ERR_PTR(err);
> >
> > Apparently ubifs does encrypt data shorter than the filesystem block size, so
> > this part is wrong.
> >
> > I suggest we split this into two functions, fscrypt_encrypt_block_inplace() and
> > fscrypt_encrypt_blocks(), so that it's conceptually simpler what each function
> > does. Currently this works completely differently depending on whether the
> > filesystem set FS_CFLG_OWN_PAGES in its fscrypt_operations, which is weird.
> >
> > I also noticed that using fscrypt_ctx for writes seems to be unnecessary.
> > AFAICS, page_private(bounce_page) could point directly to the pagecache page.
> > That would simplify things a lot, especially since then fscrypt_ctx could be
> > removed entirely after you convert reads to use read_callbacks_ctx.
> >
> > IMO, these would be worthwhile cleanups for fscrypt by themselves, without
> > waiting for the read_callbacks stuff to be finalized. Finalizing the
> > read_callbacks stuff will probably require reaching a consensus about how they
> > should work with future filesystem features like fsverity and compression.
> >
> > So to move things forward, I'm considering sending out a series with the above
> > cleanups for fscrypt, plus the equivalent of your patches:
> >
> > "fscrypt_encrypt_page: Loop across all blocks mapped by a page range"
> > "fscrypt_zeroout_range: Encrypt all zeroed out blocks of a page"
> > "Add decryption support for sub-pagesized blocks" (fs/crypto/ part only)
> >
> > Then hopefully we can get all that applied for 5.3 so that fs/crypto/ itself is
> > ready for blocksize != PAGE_SIZE; and get your changes to ext4_bio_write_page(),
> > __ext4_block_zero_page_range(), and ext4_block_write_begin() applied too, so
> > that ext4 is partially ready for encryption with blocksize != PAGE_SIZE.
> >
> > Then only the read_callbacks stuff will remain, to get encryption support into
> > fs/mpage.c and fs/buffer.c. Do you think that's a good plan?
>
> Hi Eric,
>
> IMHO, I will continue posting the next version of the current patchset and if
> there are no serious reservations from FS maintainers the "read callbacks"
> patchset can be merged. In such a scenario, the cleanups being
> non-complicated, can be merged later.
>
> --
> chandan
>
Most of the patches I have in mind are actually things that are in your patchset
already, or have been requested, or will be requested eventually :-). I'm
concerned that people will keep going back and forth on this patchset for a lot
longer, arguing about fsverity, compression, details of the fs/crypto/ stuff,
etc. Moreover it's based on unmerged patches that add the fsverity feature, so
it can't be merged as-is anyway.
IMO, it's also difficult for people to review the read_callbacks stuff when it's
mixed in with lots of other fscrypt and ext4 changes for blocksize != PAGE_SIZE.
I actually have a patchset almost ready already, so I'm going to send it out and
see what you think. It *should* make things a lot easier for you, since then
you can base a much smaller read_callbacks patchset on top of it.
Thanks!
- Eric
On Thursday, May 2, 2019 3:59:01 AM IST Eric Biggers wrote:
> Hi Chandan,
>
> On Wed, May 01, 2019 at 08:19:35PM +0530, Chandan Rajendra wrote:
> > On Wednesday, May 1, 2019 4:38:41 AM IST Eric Biggers wrote:
> > > On Tue, Apr 30, 2019 at 10:11:35AM -0700, Eric Biggers wrote:
> > > > On Sun, Apr 28, 2019 at 10:01:18AM +0530, Chandan Rajendra wrote:
> > > > > For subpage-sized blocks, this commit now encrypts all blocks mapped by
> > > > > a page range.
> > > > >
> > > > > Signed-off-by: Chandan Rajendra <[email protected]>
> > > > > ---
> > > > > fs/crypto/crypto.c | 37 +++++++++++++++++++++++++------------
> > > > > 1 file changed, 25 insertions(+), 12 deletions(-)
> > > > >
> > > > > diff --git a/fs/crypto/crypto.c b/fs/crypto/crypto.c
> > > > > index 4f0d832cae71..2d65b431563f 100644
> > > > > --- a/fs/crypto/crypto.c
> > > > > +++ b/fs/crypto/crypto.c
> > > > > @@ -242,18 +242,26 @@ struct page *fscrypt_encrypt_page(const struct inode *inode,
> > > >
> > > > Need to update the function comment to clearly explain what this function
> > > > actually does now.
> > > >
> > > > > {
> > > > > struct fscrypt_ctx *ctx;
> > > > > struct page *ciphertext_page = page;
> > > > > + int i, page_nr_blks;
> > > > > int err;
> > > > >
> > > > > BUG_ON(len % FS_CRYPTO_BLOCK_SIZE != 0);
> > > > >
> > > >
> > > > Make a 'blocksize' variable so you don't have to keep calling i_blocksize().
> > > >
> > > > Also, you need to check whether 'len' and 'offs' are filesystem-block-aligned,
> > > > since the code now assumes it.
> > > >
> > > > const unsigned int blocksize = i_blocksize(inode);
> > > >
> > > > if (!IS_ALIGNED(len | offs, blocksize))
> > > > return -EINVAL;
> > > >
> > > > However, did you check whether that's always true for ubifs? It looks like it
> > > > may expect to encrypt a prefix of a block, that is only padded to the next
> > > > 16-byte boundary.
> > > >
> > > > > + page_nr_blks = len >> inode->i_blkbits;
> > > > > +
> > > > > if (inode->i_sb->s_cop->flags & FS_CFLG_OWN_PAGES) {
> > > > > /* with inplace-encryption we just encrypt the page */
> > > > > - err = fscrypt_do_page_crypto(inode, FS_ENCRYPT, lblk_num, page,
> > > > > - ciphertext_page, len, offs,
> > > > > - gfp_flags);
> > > > > - if (err)
> > > > > - return ERR_PTR(err);
> > > > > -
> > > > > + for (i = 0; i < page_nr_blks; i++) {
> > > > > + err = fscrypt_do_page_crypto(inode, FS_ENCRYPT,
> > > > > + lblk_num, page,
> > > > > + ciphertext_page,
> > > > > + i_blocksize(inode), offs,
> > > > > + gfp_flags);
> > > > > + if (err)
> > > > > + return ERR_PTR(err);
> > >
> > > Apparently ubifs does encrypt data shorter than the filesystem block size, so
> > > this part is wrong.
> > >
> > > I suggest we split this into two functions, fscrypt_encrypt_block_inplace() and
> > > fscrypt_encrypt_blocks(), so that it's conceptually simpler what each function
> > > does. Currently this works completely differently depending on whether the
> > > filesystem set FS_CFLG_OWN_PAGES in its fscrypt_operations, which is weird.
> > >
> > > I also noticed that using fscrypt_ctx for writes seems to be unnecessary.
> > > AFAICS, page_private(bounce_page) could point directly to the pagecache page.
> > > That would simplify things a lot, especially since then fscrypt_ctx could be
> > > removed entirely after you convert reads to use read_callbacks_ctx.
> > >
> > > IMO, these would be worthwhile cleanups for fscrypt by themselves, without
> > > waiting for the read_callbacks stuff to be finalized. Finalizing the
> > > read_callbacks stuff will probably require reaching a consensus about how they
> > > should work with future filesystem features like fsverity and compression.
> > >
> > > So to move things forward, I'm considering sending out a series with the above
> > > cleanups for fscrypt, plus the equivalent of your patches:
> > >
> > > "fscrypt_encrypt_page: Loop across all blocks mapped by a page range"
> > > "fscrypt_zeroout_range: Encrypt all zeroed out blocks of a page"
> > > "Add decryption support for sub-pagesized blocks" (fs/crypto/ part only)
> > >
> > > Then hopefully we can get all that applied for 5.3 so that fs/crypto/ itself is
> > > ready for blocksize != PAGE_SIZE; and get your changes to ext4_bio_write_page(),
> > > __ext4_block_zero_page_range(), and ext4_block_write_begin() applied too, so
> > > that ext4 is partially ready for encryption with blocksize != PAGE_SIZE.
> > >
> > > Then only the read_callbacks stuff will remain, to get encryption support into
> > > fs/mpage.c and fs/buffer.c. Do you think that's a good plan?
> >
> > Hi Eric,
> >
> > IMHO, I will continue posting the next version of the current patchset and if
> > there are no serious reservations from FS maintainers the "read callbacks"
> > patchset can be merged. In such a scenario, the cleanups being
> > non-complicated, can be merged later.
> >
>
> Most of the patches I have in mind are actually things that are in your patchset
> already, or have been requested, or will be requested eventually :-). I'm
> concerned that people will keep going back and forth on this patchset for a lot
> longer, arguing about fsverity, compression, details of the fs/crypto/ stuff,
> etc. Moreover it's based on unmerged patches that add the fsverity feature, so
> it can't be merged as-is anyway.
>
> IMO, it's also difficult for people to review the read_callbacks stuff when it's
> mixed in with lots of other fscrypt and ext4 changes for blocksize != PAGE_SIZE.
>
> I actually have a patchset almost ready already, so I'm going to send it out and
> see what you think. It *should* make things a lot easier for you, since then
> you can base a much smaller read_callbacks patchset on top of it.
One of my biggest concerns is that the longer we delay merging the
read_callbacks patchset, the greater the chance that filesystems will add
further operations that get executed after read I/O completes. These
implementations tend to involve filesystem-specific changes that would be
very difficult (perhaps impossible) to retrofit onto the read_callbacks
framework. So instead of making things easier, delaying the read_callbacks
patchset actually has the opposite effect.
With the read_callbacks patchset merged, FS feature developers will take the
read_callbacks framework into consideration before designing/implementing
new related features.
--
chandan
Hi Chandan,
On Thu, May 02, 2019 at 11:22:05AM +0530, Chandan Rajendra wrote:
> On Thursday, May 2, 2019 3:59:01 AM IST Eric Biggers wrote:
> > Hi Chandan,
> >
> > On Wed, May 01, 2019 at 08:19:35PM +0530, Chandan Rajendra wrote:
> > > On Wednesday, May 1, 2019 4:38:41 AM IST Eric Biggers wrote:
> > > > On Tue, Apr 30, 2019 at 10:11:35AM -0700, Eric Biggers wrote:
> > > > > On Sun, Apr 28, 2019 at 10:01:18AM +0530, Chandan Rajendra wrote:
> > > > > > For subpage-sized blocks, this commit now encrypts all blocks mapped by
> > > > > > a page range.
> > > > > >
> > > > > > Signed-off-by: Chandan Rajendra <[email protected]>
> > > > > > ---
> > > > > > fs/crypto/crypto.c | 37 +++++++++++++++++++++++++------------
> > > > > > 1 file changed, 25 insertions(+), 12 deletions(-)
> > > > > >
> > > > > > diff --git a/fs/crypto/crypto.c b/fs/crypto/crypto.c
> > > > > > index 4f0d832cae71..2d65b431563f 100644
> > > > > > --- a/fs/crypto/crypto.c
> > > > > > +++ b/fs/crypto/crypto.c
> > > > > > @@ -242,18 +242,26 @@ struct page *fscrypt_encrypt_page(const struct inode *inode,
> > > > >
> > > > > Need to update the function comment to clearly explain what this function
> > > > > actually does now.
> > > > >
> > > > > > {
> > > > > > struct fscrypt_ctx *ctx;
> > > > > > struct page *ciphertext_page = page;
> > > > > > + int i, page_nr_blks;
> > > > > > int err;
> > > > > >
> > > > > > BUG_ON(len % FS_CRYPTO_BLOCK_SIZE != 0);
> > > > > >
> > > > >
> > > > > Make a 'blocksize' variable so you don't have to keep calling i_blocksize().
> > > > >
> > > > > Also, you need to check whether 'len' and 'offs' are filesystem-block-aligned,
> > > > > since the code now assumes it.
> > > > >
> > > > > const unsigned int blocksize = i_blocksize(inode);
> > > > >
> > > > > if (!IS_ALIGNED(len | offs, blocksize))
> > > > > return -EINVAL;
> > > > >
> > > > > However, did you check whether that's always true for ubifs? It looks like it
> > > > > may expect to encrypt a prefix of a block, that is only padded to the next
> > > > > 16-byte boundary.
> > > > >
> > > > > > + page_nr_blks = len >> inode->i_blkbits;
> > > > > > +
> > > > > > if (inode->i_sb->s_cop->flags & FS_CFLG_OWN_PAGES) {
> > > > > > /* with inplace-encryption we just encrypt the page */
> > > > > > - err = fscrypt_do_page_crypto(inode, FS_ENCRYPT, lblk_num, page,
> > > > > > - ciphertext_page, len, offs,
> > > > > > - gfp_flags);
> > > > > > - if (err)
> > > > > > - return ERR_PTR(err);
> > > > > > -
> > > > > > + for (i = 0; i < page_nr_blks; i++) {
> > > > > > + err = fscrypt_do_page_crypto(inode, FS_ENCRYPT,
> > > > > > + lblk_num, page,
> > > > > > + ciphertext_page,
> > > > > > + i_blocksize(inode), offs,
> > > > > > + gfp_flags);
> > > > > > + if (err)
> > > > > > + return ERR_PTR(err);
> > > >
> > > > Apparently ubifs does encrypt data shorter than the filesystem block size, so
> > > > this part is wrong.
> > > >
> > > > I suggest we split this into two functions, fscrypt_encrypt_block_inplace() and
> > > > fscrypt_encrypt_blocks(), so that it's conceptually simpler what each function
> > > > does. Currently this works completely differently depending on whether the
> > > > filesystem set FS_CFLG_OWN_PAGES in its fscrypt_operations, which is weird.
> > > >
> > > > I also noticed that using fscrypt_ctx for writes seems to be unnecessary.
> > > > AFAICS, page_private(bounce_page) could point directly to the pagecache page.
> > > > That would simplify things a lot, especially since then fscrypt_ctx could be
> > > > removed entirely after you convert reads to use read_callbacks_ctx.
> > > >
> > > > IMO, these would be worthwhile cleanups for fscrypt by themselves, without
> > > > waiting for the read_callbacks stuff to be finalized. Finalizing the
> > > > read_callbacks stuff will probably require reaching a consensus about how they
> > > > should work with future filesystem features like fsverity and compression.
> > > >
> > > > So to move things forward, I'm considering sending out a series with the above
> > > > cleanups for fscrypt, plus the equivalent of your patches:
> > > >
> > > > "fscrypt_encrypt_page: Loop across all blocks mapped by a page range"
> > > > "fscrypt_zeroout_range: Encrypt all zeroed out blocks of a page"
> > > > "Add decryption support for sub-pagesized blocks" (fs/crypto/ part only)
> > > >
> > > > Then hopefully we can get all that applied for 5.3 so that fs/crypto/ itself is
> > > > ready for blocksize != PAGE_SIZE; and get your changes to ext4_bio_write_page(),
> > > > __ext4_block_zero_page_range(), and ext4_block_write_begin() applied too, so
> > > > that ext4 is partially ready for encryption with blocksize != PAGE_SIZE.
> > > >
> > > > Then only the read_callbacks stuff will remain, to get encryption support into
> > > > fs/mpage.c and fs/buffer.c. Do you think that's a good plan?
> > >
> > > Hi Eric,
> > >
> > > IMHO, I will continue posting the next version of the current patchset and if
> > > there are no serious reservations from FS maintainers the "read callbacks"
> > > patchset can be merged. In such a scenario, the cleanups being
> > > non-complicated, can be merged later.
> > >
> >
> > Most of the patches I have in mind are actually things that are in your patchset
> > already, or have been requested, or will be requested eventually :-). I'm
> > concerned that people will keep going back and forth on this patchset for a lot
> > longer, arguing about fsverity, compression, details of the fs/crypto/ stuff,
> > etc. Moreover it's based on unmerged patches that add the fsverity feature, so
> > it can't be merged as-is anyway.
> >
> > IMO, it's also difficult for people to review the read_callbacks stuff when it's
> > mixed in with lots of other fscrypt and ext4 changes for blocksize != PAGE_SIZE.
> >
> > I actually have a patchset almost ready already, so I'm going to send it out and
> > see what you think. It *should* make things a lot easier for you, since then
> > you can base a much smaller read_callbacks patchset on top of it.
>
> One of the things that I am concerned most about is the fact that the more we
> delay merging read_callbacks patchset, the more the chances of filesystems
> adding further operations that get executed after read I/O completes. Most of
> the time, these implementations tend to have filesystem specific changes which
> are going to be very difficult (impossible?) to make them work with
> read_callback patchset. So instead of making things easier, delaying merging
> the read_callback patchset ends up actually having the opposite effect.
>
> With the read_callback patchset merged, FS feature developers will take
> read_callback framework into consideration before designing/implementing new
> related features.
>
The main problems are that your patchset mixes up conceptually unrelated
changes, and is dependent on future filesystem features. See how it starts by
adding read_callbacks support for both fscrypt *and* fsverity (the latter of
which is not merged yet), then updates fs/crypto/ to support subpage blocks,
*then* goes back and finishes read_callbacks to support buffer_heads since that
depended on the fs/crypto/ changes. The ext4 changes for subpage blocks are
mixed in too throughout the patchset. So I don't think it can proceed in its
current form; it's too much for anyone to handle at once.
And I see your first patchset for ext4 encryption with subpage blocks was sent
almost a year and a half ago, so it's indeed been going in circles for a while.
But based on your work I've been able to get the fs/crypto/ and ext4
preparations for subpage blocks into a clean set of changes by themselves.
They are needed in any case, so IMO we should take them first in order to
unblock the rest.
I don't really understand your point about forcing filesystems to be compatible
with read_callbacks. The whole point of read_callbacks is that it's a common
support layer which makes it easier for filesystems to do the things they're
doing anyway, or will be doing. So it shouldn't affect filesystem designs.
Thanks!
- Eric