Hello,

This RFC patchset implements fs-verity for ext4 and f2fs. fs-verity is
similar to dm-verity, but implemented on a per-file basis: a Merkle tree
hidden past the end of the file is used to verify the file's data as it
is paged in. Most of the code is in fs/verity/, and not too many
filesystem-specific changes are needed. The Merkle tree is written by
userspace before calling an ioctl to mark the file as a verity file; the
file then becomes read-only and the tree is hidden from userspace.

Note: on Monday, Michael Halcrow and I will be giving a talk about
fs-verity at the Linux Security Summit. fs-verity was previously
discussed at LSFMM 2018 (see https://lwn.net/Articles/752614/) and on
linux-fsdevel:
https://www.spinics.net/lists/linux-fsdevel/msg121182.html

Since fs-verity provides the Merkle tree root hash in constant time and
verifies data blocks on-demand, it is useful for efficiently verifying
the authenticity of, or "appraising", large files of which only a small
portion may be accessed -- such as Android application (APK) files. It
can also be useful in "audit" use cases where file hashes are logged.
fs-verity also provides better protection against malicious disk
firmware than an ahead-of-time hash, since fs-verity re-verifies data
each time it's paged in.

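
To make the on-demand verification described above a bit more concrete,
here is a toy userspace sketch. It is not code from this patchset: the
16-byte blocks, the fixed two-level tree, the FNV-1a hash (which is not
cryptographic), and all toy_* names are assumptions chosen only to keep
the example short. It shows that checking one data block only requires
hashing the blocks on its path to the root.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 16			/* toy block size in bytes */
#define ARITY (BLOCK_SIZE / 8)		/* 8-byte hashes per hash block */
#define NUM_DATA_BLOCKS 4
#define DEPTH 2				/* 4 data blocks -> 2 -> 1 */

struct toy_file {
	uint8_t data[NUM_DATA_BLOCKS][BLOCK_SIZE];
	/* level[0] hashes the data; level[DEPTH - 1][0] is the root block */
	uint64_t level[DEPTH][NUM_DATA_BLOCKS / ARITY][ARITY];
};

/* FNV-1a: NOT cryptographic, used here only as a stand-in hash */
uint64_t toy_hash(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	uint64_t h = 0xcbf29ce484222325ULL;

	while (len--) {
		h ^= *p++;
		h *= 0x100000001b3ULL;
	}
	return h;
}

/* Build the tree bottom-up and return the root hash */
uint64_t toy_build(struct toy_file *f)
{
	size_t i;

	memset(f->level, 0, sizeof(f->level));
	for (i = 0; i < NUM_DATA_BLOCKS; i++)
		f->level[0][i / ARITY][i % ARITY] =
			toy_hash(f->data[i], BLOCK_SIZE);
	for (i = 0; i < NUM_DATA_BLOCKS / ARITY; i++)
		f->level[1][0][i] = toy_hash(f->level[0][i], BLOCK_SIZE);
	return toy_hash(f->level[1][0], BLOCK_SIZE);
}

/* Verify one data block by hashing only the blocks on its path to the root */
bool toy_verify_block(const struct toy_file *f, size_t idx, uint64_t root_hash)
{
	uint64_t want = toy_hash(f->data[idx], BLOCK_SIZE);
	int lvl;

	for (lvl = 0; lvl < DEPTH; lvl++) {
		const uint64_t *hblock = f->level[lvl][idx / ARITY];

		/* The parent block must contain the hash we just computed */
		if (hblock[idx % ARITY] != want)
			return false;
		want = toy_hash(hblock, BLOCK_SIZE);
		idx /= ARITY;
	}
	return want == root_hash;
}
```

After toy_build(), flipping a bit in one data block makes
toy_verify_block() fail for that block only; the other blocks still
verify against the same trusted root hash.
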
This patchset doesn't yet include IMA support for fs-verity file
measurements; this is planned and we'd like to collaborate with the IMA
maintainers. Although fs-verity can be used on its own without IMA,
fs-verity is primarily a lower level feature (think of it as a way of
hashing a file), so some users will probably still need IMA's policy
mechanism. The patchset *does* include an optional means of including a
signature in the fs-verity metadata and verifying it against the
certificates in an fs-verity keyring. This might need to be re-assessed
if it turns out IMA works just as well for that use case.

For now this patchset only supports the case where the fs-verity block
sizes are equal to PAGE_SIZE. However, the fs-verity block sizes can be
different from the filesystem's block size.

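
As a rough illustration of the block size restriction above, the
following userspace sketch mirrors the checks that the descriptor parser
in patch 1 applies. The function name verity_block_params_ok() and the
'page_shift' parameter (standing in for the kernel's PAGE_SHIFT) are made
up for this example.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Check a (block size, digest size) combination and compute
 * log2(hashes per hash block). Illustrative only.
 */
bool verity_block_params_ok(unsigned int log_data_blocksize,
			    unsigned int log_tree_blocksize,
			    unsigned int page_shift,
			    unsigned int digest_size,
			    unsigned int *log_arity)
{
	unsigned int hashes_per_block;
	unsigned int bits = 0;

	assert(digest_size > 0 && page_shift < 32);

	if (log_data_blocksize != page_shift)
		return false;	/* data blocks must be PAGE_SIZE */
	if (log_tree_blocksize != log_data_blocksize)
		return false;	/* tree blocks must match data blocks */

	hashes_per_block = (1u << log_data_blocksize) / digest_size;
	if (hashes_per_block == 0 ||
	    (hashes_per_block & (hashes_per_block - 1)) != 0)
		return false;	/* must be a power of 2 */

	while ((1u << bits) < hashes_per_block)
		bits++;
	*log_arity = bits;
	return true;
}
```

With 4K pages and SHA-256 (32-byte digests) this gives 128 hashes per
block, i.e. a log_arity of 7. A 1K block size on a 4K-page system is
rejected.
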
A documentation file in Documentation/filesystems/ is planned but not
yet included.

This patchset is based on Linux v4.18. It can also be found in git at
tag "fsverity_2018-08-24" of:
https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux.git

A userspace utility for fs-verity can be found at:
https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/fsverity-utils.git
See the README.md file in the userspace utility source tree for examples.

Tests for fs-verity can be found at branch "fsverity" of:
https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/xfstests-dev.git

On ext4 and f2fs, using fs-verity requires setting the verity feature
flag on your filesystem. The verity feature flag has been supported
since e2fsprogs 1.44.4-2 and f2fs-tools 1.11.0.

Warning: apart from the feature bit and inode flag, fs-verity's on-disk
format is not yet stable, i.e. it can still be changed. Please don't
use this patchset "in production" yet!

Feedback on the design and implementation is greatly appreciated.

Thanks!

Eric Biggers (8):
fs-verity: add setup code, UAPI, and Kconfig
fs-verity: add data verification hooks for ->readpages()
fs-verity: implement FS_IOC_ENABLE_VERITY ioctl
fs-verity: implement FS_IOC_MEASURE_VERITY ioctl
fs-verity: add SHA-512 support
fs-verity: add CRC-32C support
fs-verity: support builtin file signatures
f2fs: fs-verity support
Theodore Ts'o (2):
ext4: add basic fs-verity support
ext4: add fs-verity read support
fs/Kconfig | 2 +
fs/Makefile | 1 +
fs/ext4/Kconfig | 20 +
fs/ext4/ext4.h | 22 +-
fs/ext4/file.c | 6 +
fs/ext4/inode.c | 11 +
fs/ext4/ioctl.c | 12 +
fs/ext4/readpage.c | 207 ++++++--
fs/ext4/super.c | 87 ++++
fs/ext4/sysfs.c | 6 +
fs/f2fs/Kconfig | 20 +
fs/f2fs/data.c | 43 +-
fs/f2fs/f2fs.h | 17 +-
fs/f2fs/file.c | 58 +++
fs/f2fs/inode.c | 3 +-
fs/f2fs/super.c | 22 +
fs/f2fs/sysfs.c | 11 +
fs/verity/Kconfig | 53 ++
fs/verity/Makefile | 5 +
fs/verity/fsverity_private.h | 136 +++++
fs/verity/hash_algs.c | 115 +++++
fs/verity/ioctl.c | 170 +++++++
fs/verity/setup.c | 931 ++++++++++++++++++++++++++++++++++
fs/verity/signature.c | 187 +++++++
fs/verity/verify.c | 310 +++++++++++
include/linux/fs.h | 9 +
include/linux/fsverity.h | 102 ++++
include/uapi/linux/fsverity.h | 98 ++++
28 files changed, 2623 insertions(+), 41 deletions(-)
create mode 100644 fs/verity/Kconfig
create mode 100644 fs/verity/Makefile
create mode 100644 fs/verity/fsverity_private.h
create mode 100644 fs/verity/hash_algs.c
create mode 100644 fs/verity/ioctl.c
create mode 100644 fs/verity/setup.c
create mode 100644 fs/verity/signature.c
create mode 100644 fs/verity/verify.c
create mode 100644 include/linux/fsverity.h
create mode 100644 include/uapi/linux/fsverity.h
--
2.18.0
From: Eric Biggers <[email protected]>

fs-verity is a filesystem feature that provides efficient, transparent
integrity verification and authentication of read-only files. It uses a
dm-verity-like mechanism at the file level: a Merkle tree hidden past
the end of the file is used to verify any block in the file in
log(filesize) time. It is implemented mainly by helper functions in
fs/verity/ that will be shared by multiple filesystems.

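
The log(filesize) claim above can be checked with a small userspace
sketch of the tree depth calculation. This is an illustration, not the
kernel's code: merkle_tree_depth() is a made-up name, and unlike the
setup code in this patch it rounds up in a way that cannot overflow for
sizes near UINT64_MAX.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of Merkle tree levels needed to cover 'file_size' bytes.
 * block_bits = log2(block size); log_arity = log2(hashes per hash block).
 */
int merkle_tree_depth(uint64_t file_size, unsigned int block_bits,
		      unsigned int log_arity)
{
	uint64_t blocks;
	int depth = 0;

	assert(file_size > 0 && log_arity > 0 && block_bits < 64);

	/* Number of data blocks, rounded up */
	blocks = ((file_size - 1) >> block_bits) + 1;

	/* Each level needs one hash block per 2^log_arity blocks below it */
	while (blocks > 1) {
		blocks = ((blocks - 1) >> log_arity) + 1;
		depth++;
	}
	return depth;
}
```

With SHA-256 and 4K blocks (block_bits = 12, log_arity = 7), a 1 GiB file
needs 3 levels and a file of UINT64_MAX bytes needs 8, which agrees with
the comment above FS_VERITY_MAX_LEVELS in fsverity_private.h.
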
Essentially, fs-verity reports a file's hash in constant time, but reads
that would violate that hash fail at runtime. This is useful when only
a portion of the file is actually accessed, as only the accessed portion
has to be hashed, and the latency to the first read is much lower than
with a full file hash. On top of this hashing mechanism, auditing or
authentication policies can be implemented to log or verify file hashes.

Note that in general, fs-verity is *not* a replacement for IMA.
fs-verity is a lower-level feature, primarily a way to hash a file;
whereas IMA deals more with higher-level policy logic, like defining
which files are "measured" and what to do with those measurements. We
plan for IMA to support fs-verity measurements as an alternative to the
traditional full file hash. Still, some users find fs-verity useful by
itself, so it's also usable without IMA in simple cases, e.g. in cases
where just retrieving the file measurement via an ioctl is enough.

A structure containing the properties of the Merkle tree -- such as the
hash algorithm used, the block size, and the root hash -- is also stored
on-disk, following the Merkle tree. The actual file measurement hash
that fs-verity reports is the hash of this structure.

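
To illustrate the on-disk ordering (data blocks, then the Merkle tree,
then the descriptor structure), here is a userspace sketch modeled on
compute_tree_depth_and_offsets() in fs/verity/setup.c from this patch.
The function name merkle_level_offsets() and the 'tree_end' output are
assumptions for the example, and all offsets are in units of blocks.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_LEVELS 16	/* mirrors FS_VERITY_MAX_LEVELS in the patch */

/*
 * Fill level_start[] with the block index at which each tree level begins.
 * Level 0 hashes the data blocks and is stored last; level depth-1 is the
 * root and is stored first, in the first block after the data.
 * Returns the tree depth, or -1 if the tree would be too deep.
 */
int merkle_level_offsets(uint64_t data_blocks, unsigned int log_arity,
			 uint64_t level_start[MAX_LEVELS],
			 uint64_t *tree_end)
{
	uint64_t count[MAX_LEVELS];	/* hash blocks in each level */
	uint64_t blocks = data_blocks;
	uint64_t offset = data_blocks;
	int depth = 0;
	int i;

	assert(data_blocks > 0 && log_arity > 0);

	while (blocks > 1) {
		if (depth >= MAX_LEVELS)
			return -1;
		blocks = ((blocks - 1) >> log_arity) + 1;
		count[depth++] = blocks;
	}

	/* Lay the levels out root-first, directly after the data */
	for (i = depth - 1; i >= 0; i--) {
		level_start[i] = offset;
		offset += count[i];
	}
	*tree_end = offset;	/* first block after the tree */
	return depth;
}
```

For a 1 GiB file with 4K blocks and SHA-256 (262144 data blocks,
log_arity = 7), the root block lands at block 262144, level 1 at 262145,
and level 0 at 262161; the tree ends at block 264209, after which the
descriptor follows.
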
All fs-verity metadata is written by userspace; the kernel only reads
it. Extended attributes aren't used because the Merkle tree may be much
larger than XATTR_SIZE_MAX, we want the hash pages to be cached in the
page cache as usual, and in the case of fs-verity combined with fscrypt
we want the metadata to be encrypted to avoid leaking plaintext hashes.

The fs-verity metadata is hidden from userspace by overriding the i_size
of the in-memory VFS inode; ext4 additionally will override the on-disk
i_size in order to make verity a RO_COMPAT filesystem feature.

This initial patch only adds the fs-verity Kconfig option, UAPI, and
setup code, e.g. the ->open() hook that parses the fs-verity descriptor.
The actual ->readpages() data verification, the ioctls, ext4 and f2fs
support, and other functionality come in later patches.

Signed-off-by: Eric Biggers <[email protected]>
---
fs/Kconfig | 2 +
fs/Makefile | 1 +
fs/verity/Kconfig | 36 ++
fs/verity/Makefile | 3 +
fs/verity/fsverity_private.h | 99 ++++
fs/verity/hash_algs.c | 106 +++++
fs/verity/setup.c | 846 ++++++++++++++++++++++++++++++++++
include/linux/fs.h | 9 +
include/linux/fsverity.h | 62 +++
include/uapi/linux/fsverity.h | 86 ++++
10 files changed, 1250 insertions(+)
create mode 100644 fs/verity/Kconfig
create mode 100644 fs/verity/Makefile
create mode 100644 fs/verity/fsverity_private.h
create mode 100644 fs/verity/hash_algs.c
create mode 100644 fs/verity/setup.c
create mode 100644 include/linux/fsverity.h
create mode 100644 include/uapi/linux/fsverity.h
diff --git a/fs/Kconfig b/fs/Kconfig
index ac474a61be379..ddadc4e999429 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -105,6 +105,8 @@ config MANDATORY_FILE_LOCKING
source "fs/crypto/Kconfig"
+source "fs/verity/Kconfig"
+
source "fs/notify/Kconfig"
source "fs/quota/Kconfig"
diff --git a/fs/Makefile b/fs/Makefile
index 293733f61594b..10b37f651ffde 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -32,6 +32,7 @@ obj-$(CONFIG_USERFAULTFD) += userfaultfd.o
obj-$(CONFIG_AIO) += aio.o
obj-$(CONFIG_FS_DAX) += dax.o
obj-$(CONFIG_FS_ENCRYPTION) += crypto/
+obj-$(CONFIG_FS_VERITY) += verity/
obj-$(CONFIG_FILE_LOCKING) += locks.o
obj-$(CONFIG_COMPAT) += compat.o compat_ioctl.o
obj-$(CONFIG_BINFMT_AOUT) += binfmt_aout.o
diff --git a/fs/verity/Kconfig b/fs/verity/Kconfig
new file mode 100644
index 0000000000000..308d733a9401b
--- /dev/null
+++ b/fs/verity/Kconfig
@@ -0,0 +1,36 @@
+config FS_VERITY
+ tristate "FS Verity (file-based integrity/authentication)"
+ depends on BLOCK
+ select CRYPTO
+ # SHA-256 is selected as it's intended to be the default hash algorithm.
+ # To avoid bloat, other wanted algorithms must be selected explicitly.
+ select CRYPTO_SHA256
+ help
+ This option enables fs-verity. fs-verity is the dm-verity
+ mechanism implemented at the file level. On supported
+ filesystems, userspace can append a Merkle tree (hash tree) to
+ a file, then enable fs-verity on the file. The filesystem
+ will then transparently verify any data read from the file
+ against the Merkle tree. The file is also made read-only.
+
+ This serves as an integrity check, but the availability of the
+ Merkle tree root hash also allows efficiently supporting
+ various use cases where normally the whole file would need to
+ be hashed at once, such as: (a) auditing (logging the file's
+ hash), or (b) authenticity verification (comparing the hash
+ against a known good value, e.g. from a digital signature).
+
+ fs-verity is especially useful on large files where not all
+ the contents may actually be needed. Also, fs-verity verifies
+ data each time it is paged back in, which provides better
+ protection against malicious disks vs. an ahead-of-time hash.
+
+ If unsure, say N.
+
+config FS_VERITY_DEBUG
+ bool "FS Verity debugging"
+ depends on FS_VERITY
+ help
+ Enable debugging messages related to fs-verity by default.
+
+ Say N unless you are an fs-verity developer.
diff --git a/fs/verity/Makefile b/fs/verity/Makefile
new file mode 100644
index 0000000000000..39e123805c827
--- /dev/null
+++ b/fs/verity/Makefile
@@ -0,0 +1,3 @@
+obj-$(CONFIG_FS_VERITY) += fsverity.o
+
+fsverity-y := hash_algs.o setup.o
diff --git a/fs/verity/fsverity_private.h b/fs/verity/fsverity_private.h
new file mode 100644
index 0000000000000..a18ff645695f4
--- /dev/null
+++ b/fs/verity/fsverity_private.h
@@ -0,0 +1,99 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * fs-verity: read-only file-based integrity/authentication
+ *
+ * Copyright (C) 2018 Google LLC
+ */
+
+#ifndef _FSVERITY_PRIVATE_H
+#define _FSVERITY_PRIVATE_H
+
+#ifdef CONFIG_FS_VERITY_DEBUG
+#define DEBUG
+#endif
+
+#define pr_fmt(fmt) "fs-verity: " fmt
+
+#include <crypto/sha.h>
+#define __FS_HAS_VERITY 1
+#include <linux/fsverity.h>
+
+/*
+ * Maximum depth of the Merkle tree. Up to 64 levels are theoretically possible
+ * with a very small block size, but we'd like to limit stack usage during
+ * verification, and in practice this is plenty. E.g., with SHA-256 and 4K
+ * blocks, a file with size UINT64_MAX bytes needs just 8 levels.
+ */
+#define FS_VERITY_MAX_LEVELS 16
+
+/*
+ * Largest digest size among all hash algorithms supported by fs-verity. This
+ * can be increased if needed.
+ */
+#define FS_VERITY_MAX_DIGEST_SIZE SHA256_DIGEST_SIZE
+
+/* A hash algorithm supported by fs-verity */
+struct fsverity_hash_alg {
+ struct crypto_ahash *tfm; /* allocated on demand */
+ const char *name;
+ unsigned int digest_size;
+ bool cryptographic;
+};
+
+/**
+ * struct fsverity_info - cached verity metadata for an inode
+ *
+ * When a verity file is first opened, an instance of this struct is allocated
+ * and stored in ->i_verity_info. It caches various values from the verity
+ * metadata, such as the tree topology and the root hash, which are needed to
+ * efficiently verify data read from the file. Once created, it remains until
+ * the inode is evicted.
+ *
+ * (The tree pages themselves are not cached here, though they may be cached in
+ * the inode's page cache.)
+ */
+struct fsverity_info {
+ const struct fsverity_hash_alg *hash_alg; /* hash algorithm */
+ u8 block_bits; /* log2(block size) */
+ u8 log_arity; /* log2(hashes per hash block) */
+ u8 depth; /* depth of the Merkle tree */
+ u8 *hashstate; /* salted initial hash state */
+ u64 data_i_size; /* original file size */
+ u64 full_i_size; /* full file size including metadata */
+ u8 root_hash[FS_VERITY_MAX_DIGEST_SIZE]; /* Merkle tree root hash */
+ u8 measurement[FS_VERITY_MAX_DIGEST_SIZE]; /* file measurement */
+ bool have_root_hash; /* have root hash from disk? */
+
+ /* Starting blocks for each tree level. 'depth-1' is the root level. */
+ u64 hash_lvl_region_idx[FS_VERITY_MAX_LEVELS];
+};
+
+/* hash_algs.c */
+extern struct fsverity_hash_alg fsverity_hash_algs[];
+const struct fsverity_hash_alg *fsverity_get_hash_alg(unsigned int num);
+void __init fsverity_check_hash_algs(void);
+void __exit fsverity_exit_hash_algs(void);
+
+/* setup.c */
+struct fsverity_info *create_fsverity_info(struct inode *inode, bool enabling);
+void free_fsverity_info(struct fsverity_info *vi);
+
+static inline struct fsverity_info *get_fsverity_info(const struct inode *inode)
+{
+ /* pairs with cmpxchg_release() in set_fsverity_info() */
+ return smp_load_acquire(&inode->i_verity_info);
+}
+
+static inline bool set_fsverity_info(struct inode *inode,
+ struct fsverity_info *vi)
+{
+ /* pairs with smp_load_acquire() in get_fsverity_info() */
+ if (cmpxchg_release(&inode->i_verity_info, NULL, vi) != NULL)
+ return false;
+
+ /* Set the in-memory i_size to the data size */
+ i_size_write(inode, vi->data_i_size);
+ return true;
+}
+
+#endif /* _FSVERITY_PRIVATE_H */
diff --git a/fs/verity/hash_algs.c b/fs/verity/hash_algs.c
new file mode 100644
index 0000000000000..424a26ee2f3c2
--- /dev/null
+++ b/fs/verity/hash_algs.c
@@ -0,0 +1,106 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * fs/verity/hash_algs.c: fs-verity hash algorithm management
+ *
+ * Copyright (C) 2018 Google LLC
+ *
+ * Written by Eric Biggers.
+ */
+
+#include "fsverity_private.h"
+
+#include <crypto/hash.h>
+
+/* The list of hash algorithms supported by fs-verity */
+struct fsverity_hash_alg fsverity_hash_algs[] = {
+ [FS_VERITY_ALG_SHA256] = {
+ .name = "sha256",
+ .digest_size = 32,
+ .cryptographic = true,
+ },
+};
+
+/*
+ * Translate the given fs-verity hash algorithm number into a struct describing
+ * the algorithm, and ensure it has a hash transform ready to go. The hash
+ * transforms are allocated on-demand firstly to not waste resources when they
+ * aren't needed, and secondly because the fs-verity module may be loaded
+ * earlier than the needed crypto modules.
+ */
+const struct fsverity_hash_alg *fsverity_get_hash_alg(unsigned int num)
+{
+ struct fsverity_hash_alg *alg;
+ struct crypto_ahash *tfm;
+ int err;
+
+ if (num >= ARRAY_SIZE(fsverity_hash_algs) ||
+ !fsverity_hash_algs[num].digest_size) {
+ pr_warn("Unknown hash algorithm: %u\n", num);
+ return ERR_PTR(-EINVAL);
+ }
+ alg = &fsverity_hash_algs[num];
+retry:
+ /* pairs with cmpxchg_release() below */
+ tfm = smp_load_acquire(&alg->tfm);
+ if (tfm)
+ return alg;
+ /*
+ * Using the shash API would make things a bit simpler, but the ahash
+ * API is preferable as it allows the use of crypto accelerators.
+ */
+ tfm = crypto_alloc_ahash(alg->name, 0, 0);
+ if (IS_ERR(tfm)) {
+ if (PTR_ERR(tfm) == -ENOENT)
+ pr_warn("Algorithm %u (%s) is unavailable\n",
+ num, alg->name);
+ else
+ pr_warn("Error allocating algorithm %u (%s): %ld\n",
+ num, alg->name, PTR_ERR(tfm));
+ return ERR_CAST(tfm);
+ }
+
+ err = -EINVAL;
+ if (WARN_ON(alg->digest_size != crypto_ahash_digestsize(tfm)))
+ goto err_free_tfm;
+
+ pr_info("%s using implementation \"%s\"\n", alg->name,
+ crypto_hash_alg_common(tfm)->base.cra_driver_name);
+
+ /* pairs with smp_load_acquire() above */
+ if (cmpxchg_release(&alg->tfm, NULL, tfm) != NULL) {
+ crypto_free_ahash(tfm);
+ goto retry;
+ }
+
+ return alg;
+
+err_free_tfm:
+ crypto_free_ahash(tfm);
+ return ERR_PTR(err);
+}
+
+void __init fsverity_check_hash_algs(void)
+{
+ int i;
+
+ /*
+ * Sanity check the digest sizes (could be a build-time check, but
+ * they're in an array)
+ */
+ for (i = 0; i < ARRAY_SIZE(fsverity_hash_algs); i++) {
+ struct fsverity_hash_alg *alg = &fsverity_hash_algs[i];
+
+ if (!alg->digest_size)
+ continue;
+ BUG_ON(alg->digest_size > FS_VERITY_MAX_DIGEST_SIZE);
+ BUG_ON(!is_power_of_2(alg->digest_size));
+ }
+}
+
+void __exit fsverity_exit_hash_algs(void)
+{
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(fsverity_hash_algs); i++)
+ crypto_free_ahash(fsverity_hash_algs[i].tfm);
+}
diff --git a/fs/verity/setup.c b/fs/verity/setup.c
new file mode 100644
index 0000000000000..e675c52898d5b
--- /dev/null
+++ b/fs/verity/setup.c
@@ -0,0 +1,846 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * fs/verity/setup.c: fs-verity module initialization and descriptor parsing
+ *
+ * Copyright (C) 2018 Google LLC
+ *
+ * Originally written by Jaegeuk Kim and Michael Halcrow;
+ * heavily rewritten by Eric Biggers.
+ */
+
+#include "fsverity_private.h"
+
+#include <crypto/hash.h>
+#include <linux/highmem.h>
+#include <linux/list_sort.h>
+#include <linux/module.h>
+#include <linux/pagemap.h>
+#include <linux/scatterlist.h>
+#include <linux/vmalloc.h>
+
+static struct kmem_cache *fsverity_info_cachep;
+
+static void dump_fsverity_descriptor(const struct fsverity_descriptor *desc)
+{
+ pr_debug("magic = %.*s\n", (int)sizeof(desc->magic), desc->magic);
+ pr_debug("major_version = %u\n", desc->major_version);
+ pr_debug("minor_version = %u\n", desc->minor_version);
+ pr_debug("log_data_blocksize = %u\n", desc->log_data_blocksize);
+ pr_debug("log_tree_blocksize = %u\n", desc->log_tree_blocksize);
+ pr_debug("data_algorithm = %u\n", le16_to_cpu(desc->data_algorithm));
+ pr_debug("tree_algorithm = %u\n", le16_to_cpu(desc->tree_algorithm));
+ pr_debug("flags = %#x\n", le32_to_cpu(desc->flags));
+ pr_debug("orig_file_size = %llu\n", le64_to_cpu(desc->orig_file_size));
+ pr_debug("auth_ext_count = %u\n", le16_to_cpu(desc->auth_ext_count));
+}
+
+/* Precompute the salted initial hash state */
+static int set_salt(struct fsverity_info *vi, const u8 *salt, size_t saltlen)
+{
+ struct crypto_ahash *tfm = vi->hash_alg->tfm;
+ struct ahash_request *req;
+ unsigned int reqsize = sizeof(*req) + crypto_ahash_reqsize(tfm);
+ struct scatterlist sg;
+ DECLARE_CRYPTO_WAIT(wait);
+ u8 *saltbuf;
+ int err;
+
+ vi->hashstate = kmalloc(crypto_ahash_statesize(tfm), GFP_KERNEL);
+ if (!vi->hashstate)
+ return -ENOMEM;
+ /* On error, vi->hashstate is freed by free_fsverity_info() */
+
+ /*
+ * Allocate a hash request buffer. Also reserve space for a copy of
+ * the salt, since the given 'salt' may point into vmap'ed memory, so
+ * sg_init_one() may not work on it.
+ */
+ req = kmalloc(reqsize + saltlen, GFP_KERNEL);
+ if (!req)
+ return -ENOMEM;
+ saltbuf = (u8 *)req + reqsize;
+ memcpy(saltbuf, salt, saltlen);
+ sg_init_one(&sg, saltbuf, saltlen);
+
+ ahash_request_set_tfm(req, tfm);
+ ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP |
+ CRYPTO_TFM_REQ_MAY_BACKLOG,
+ crypto_req_done, &wait);
+ ahash_request_set_crypt(req, &sg, NULL, saltlen);
+
+ err = crypto_wait_req(crypto_ahash_init(req), &wait);
+ if (err)
+ goto out;
+ err = crypto_wait_req(crypto_ahash_update(req), &wait);
+ if (err)
+ goto out;
+ err = crypto_ahash_export(req, vi->hashstate);
+out:
+ kfree(req);
+ return err;
+}
+
+/*
+ * Copy in the root hash stored on disk.
+ *
+ * Note that the root hash could be computed by hashing the root block of the
+ * Merkle tree. But it works out a bit simpler to store the hash separately;
+ * then it gets included in the file measurement without special-casing it, and
+ * the root block gets verified on the ->readpages() path like the other blocks.
+ */
+static int parse_root_hash_extension(struct fsverity_info *vi,
+ const void *hash, size_t size)
+{
+ const struct fsverity_hash_alg *alg = vi->hash_alg;
+
+ if (vi->have_root_hash) {
+ pr_warn("Multiple root hashes were found!\n");
+ return -EINVAL;
+ }
+ if (size != alg->digest_size) {
+ pr_warn("Wrong root hash size; got %zu bytes, but expected %u for hash algorithm %s\n",
+ size, alg->digest_size, alg->name);
+ return -EINVAL;
+ }
+ memcpy(vi->root_hash, hash, size);
+ vi->have_root_hash = true;
+ pr_debug("Root hash: %s:%*phN\n", alg->name,
+ alg->digest_size, vi->root_hash);
+ return 0;
+}
+
+static int parse_salt_extension(struct fsverity_info *vi,
+ const void *salt, size_t saltlen)
+{
+ if (vi->hashstate) {
+ pr_warn("Multiple salts were found!\n");
+ return -EINVAL;
+ }
+ return set_salt(vi, salt, saltlen);
+}
+
+/* The available types of extensions (variable-length metadata items) */
+static const struct extension_type {
+ int (*parse)(struct fsverity_info *vi, const void *_ext,
+ size_t extra_len);
+ size_t base_len; /* length of fixed-size part of payload, if any */
+ bool unauthenticated; /* true if not included in file measurement */
+} extension_types[] = {
+ [FS_VERITY_EXT_ROOT_HASH] = {
+ .parse = parse_root_hash_extension,
+ },
+ [FS_VERITY_EXT_SALT] = {
+ .parse = parse_salt_extension,
+ },
+};
+
+static int do_parse_extensions(struct fsverity_info *vi,
+ const struct fsverity_extension **ext_hdr_p,
+ const void *end, int count, bool authenticated)
+{
+ const struct fsverity_extension *ext_hdr = *ext_hdr_p;
+ int i;
+ int err;
+
+ for (i = 0; i < count; i++) {
+ const struct extension_type *type;
+ u32 len, rounded_len;
+ u16 type_code;
+
+ if (end - (const void *)ext_hdr < sizeof(*ext_hdr)) {
+ pr_warn("Extension list overflows buffer\n");
+ return -EINVAL;
+ }
+ type_code = le16_to_cpu(ext_hdr->type);
+ if (type_code >= ARRAY_SIZE(extension_types) ||
+ !extension_types[type_code].parse) {
+ pr_warn("Unknown extension type: %u\n", type_code);
+ return -EINVAL;
+ }
+ type = &extension_types[type_code];
+ if (authenticated != !type->unauthenticated) {
+ pr_warn("Extension type %u must be %sauthenticated\n",
+ type_code, type->unauthenticated ? "un" : "");
+ return -EINVAL;
+ }
+ if (ext_hdr->reserved) {
+ pr_warn("Reserved bits set in extension header\n");
+ return -EINVAL;
+ }
+ len = le32_to_cpu(ext_hdr->length);
+ if (len < sizeof(*ext_hdr)) {
+ pr_warn("Invalid length in extension header\n");
+ return -EINVAL;
+ }
+ rounded_len = round_up(len, 8);
+ if (rounded_len == 0 ||
+ rounded_len > end - (const void *)ext_hdr) {
+ pr_warn("Extension item overflows buffer\n");
+ return -EINVAL;
+ }
+ if (len < sizeof(*ext_hdr) + type->base_len) {
+ pr_warn("Extension length too small for type\n");
+ return -EINVAL;
+ }
+ err = type->parse(vi, ext_hdr + 1,
+ len - sizeof(*ext_hdr) - type->base_len);
+ if (err)
+ return err;
+ ext_hdr = (const void *)ext_hdr + rounded_len;
+ }
+ *ext_hdr_p = ext_hdr;
+ return 0;
+}
+
+/*
+ * Parse the extension items following the fixed-size portion of the fs-verity
+ * descriptor. The fsverity_info is updated accordingly.
+ *
+ * Return: On success, the size of the authenticated portion of the descriptor
+ * (the fixed-size portion plus the authenticated extensions).
+ * Otherwise, a -errno value.
+ */
+static int parse_extensions(struct fsverity_info *vi,
+ const struct fsverity_descriptor *desc,
+ int desc_len)
+{
+ const struct fsverity_extension *ext_hdr = (const void *)(desc + 1);
+ const void *end = (const void *)desc + desc_len;
+ u16 auth_ext_count = le16_to_cpu(desc->auth_ext_count);
+ int auth_desc_len;
+ int err;
+
+ err = do_parse_extensions(vi, &ext_hdr, end, auth_ext_count, true);
+ if (err)
+ return err;
+ auth_desc_len = (void *)ext_hdr - (void *)desc;
+
+ /*
+ * Unauthenticated extensions (optional). Careful: an attacker able to
+ * corrupt the file can change these arbitrarily without being detected.
+ * Thus, only specific types of extensions are whitelisted here --
+ * namely, the ones containing a signature of the file measurement,
+ * which by definition can't be included in the file measurement itself.
+ */
+ if (end - (void *)ext_hdr >= 8) {
+ u16 unauth_ext_count = le16_to_cpup((__le16 *)ext_hdr);
+
+ ext_hdr = (void *)ext_hdr + 8;
+ err = do_parse_extensions(vi, &ext_hdr, end,
+ unauth_ext_count, false);
+ if (err)
+ return err;
+ }
+
+ return auth_desc_len;
+}
+
+/*
+ * Parse an fs-verity descriptor, loading information into the fsverity_info.
+ *
+ * Return: On success, the size of the authenticated portion of the descriptor
+ * (the fixed-size portion plus the authenticated extensions).
+ * Otherwise, a -errno value.
+ */
+static int parse_fsverity_descriptor(struct fsverity_info *vi,
+ const struct fsverity_descriptor *desc,
+ int desc_len, loff_t desc_start)
+{
+ unsigned int alg_num;
+ unsigned int hashes_per_block;
+ u64 orig_file_size;
+ int desc_auth_len;
+ int err;
+
+ BUILD_BUG_ON(sizeof(*desc) != 64);
+
+ /* magic */
+ if (memcmp(desc->magic, FS_VERITY_MAGIC, sizeof(desc->magic))) {
+ pr_warn("Wrong magic bytes\n");
+ return -EINVAL;
+ }
+
+ /* major_version */
+ if (desc->major_version != 1) {
+ pr_warn("Unsupported major version (%u)\n",
+ desc->major_version);
+ return -EINVAL;
+ }
+
+ /* minor_version */
+ if (desc->minor_version != 0) {
+ pr_warn("Unsupported minor version (%u)\n",
+ desc->minor_version);
+ return -EINVAL;
+ }
+
+ /* data_algorithm and tree_algorithm */
+ alg_num = le16_to_cpu(desc->data_algorithm);
+ if (alg_num != le16_to_cpu(desc->tree_algorithm)) {
+ pr_warn("Unimplemented case: data (%u) and tree (%u) hash algorithms differ\n",
+ alg_num, le16_to_cpu(desc->tree_algorithm));
+ return -EINVAL;
+ }
+ vi->hash_alg = fsverity_get_hash_alg(alg_num);
+ if (IS_ERR(vi->hash_alg))
+ return PTR_ERR(vi->hash_alg);
+
+ /* log_data_blocksize and log_tree_blocksize */
+ if (desc->log_data_blocksize != PAGE_SHIFT) {
+ pr_warn("Unsupported log_blocksize (%u). Need block_size == PAGE_SIZE.\n",
+ desc->log_data_blocksize);
+ return -EINVAL;
+ }
+ if (desc->log_tree_blocksize != desc->log_data_blocksize) {
+ pr_warn("Unimplemented case: data (%u) and tree (%u) block sizes differ\n",
+ desc->log_data_blocksize, desc->log_tree_blocksize);
+ return -EINVAL;
+ }
+ vi->block_bits = desc->log_data_blocksize;
+ hashes_per_block = (1 << vi->block_bits) / vi->hash_alg->digest_size;
+ if (!is_power_of_2(hashes_per_block)) {
+ pr_warn("Unimplemented case: hashes per block (%u) isn't a power of 2\n",
+ hashes_per_block);
+ return -EINVAL;
+ }
+ vi->log_arity = ilog2(hashes_per_block);
+
+ /* flags */
+ if (desc->flags) {
+ pr_warn("Unsupported flags (%#x)\n", le32_to_cpu(desc->flags));
+ return -EINVAL;
+ }
+
+ /* reserved fields */
+ if (desc->reserved1 ||
+ memchr_inv(desc->reserved2, 0, sizeof(desc->reserved2))) {
+ pr_warn("Reserved bits set in fsverity_descriptor\n");
+ return -EINVAL;
+ }
+
+ /*
+ * orig_file_size. For filesystems that set the on-disk i_size to
+ * data_i_size rather than to full_i_size, this field is redundant --
+ * though it still must be included in the file measurement! Make sure
+ * it's really the same.
+ */
+ orig_file_size = le64_to_cpu(desc->orig_file_size);
+ if (vi->data_i_size) {
+ if (orig_file_size != vi->data_i_size) {
+ pr_warn("fsverity_descriptor.orig_file_size (%llu) doesn't match i_size (%llu)!\n",
+ orig_file_size, vi->data_i_size);
+ return -EINVAL;
+ }
+ } else {
+ vi->data_i_size = orig_file_size;
+ }
+ if (vi->data_i_size == 0) {
+ pr_warn("Original file size is 0; this is not supported\n");
+ return -EINVAL;
+ }
+ if (vi->data_i_size > desc_start) {
+ pr_warn("Original file size is too large (%llu)\n",
+ vi->data_i_size);
+ return -EINVAL;
+ }
+
+ /* extensions */
+ desc_auth_len = parse_extensions(vi, desc, desc_len);
+ if (desc_auth_len < 0)
+ return desc_auth_len;
+
+ if (!vi->have_root_hash) {
+ pr_warn("Root hash wasn't found!\n");
+ return -EINVAL;
+ }
+
+ /* Use an empty salt if no salt was found in the extensions list */
+ if (!vi->hashstate) {
+ err = set_salt(vi, "", 0);
+ if (err)
+ return err;
+ }
+
+ return desc_auth_len;
+}
+
+/*
+ * Calculate the depth of the Merkle tree, then create a map from level to the
+ * block offset at which that level's hash blocks start. Level 'depth - 1' is
+ * the root and is stored first in the file, in the first block following the
+ * original data. Level 0 is the level directly "above" the data blocks and is
+ * stored last in the file, just before the fsverity_descriptor.
+ */
+static int compute_tree_depth_and_offsets(struct fsverity_info *vi)
+{
+ unsigned int hashes_per_block = 1 << vi->log_arity;
+ u64 blocks = (vi->data_i_size + (1 << vi->block_bits) - 1) >>
+ vi->block_bits;
+ u64 offset = blocks;
+ int depth = 0;
+ int i;
+
+ while (blocks > 1) {
+ if (depth >= FS_VERITY_MAX_LEVELS) {
+ pr_warn("Too many tree levels (max is %d)\n",
+ FS_VERITY_MAX_LEVELS);
+ return -EINVAL;
+ }
+ blocks = (blocks + hashes_per_block - 1) >> vi->log_arity;
+ vi->hash_lvl_region_idx[depth++] = blocks;
+ }
+ vi->depth = depth;
+
+ for (i = depth - 1; i >= 0; i--) {
+ u64 next_count = vi->hash_lvl_region_idx[i];
+
+ vi->hash_lvl_region_idx[i] = offset;
+ pr_debug("Level %d is [%llu..%llu] (%llu blocks)\n",
+ i, offset, offset + next_count - 1, next_count);
+ offset += next_count;
+ }
+ return 0;
+}
+
+/* Arbitrary limit, can be increased if needed */
+#define MAX_DESCRIPTOR_PAGES 16
+
+/*
+ * Compute the file's measurement by hashing the first 'desc_auth_len' bytes of
+ * the fs-verity descriptor (which includes the Merkle tree root hash as an
+ * authenticated extension item).
+ *
+ * Note: 'desc' may point into vmap'ed memory, so it can't be passed directly to
+ * sg_set_buf() for the ahash API. Instead, we pass the pages directly.
+ */
+static int compute_measurement(const struct fsverity_info *vi,
+ const struct fsverity_descriptor *desc,
+ int desc_auth_len,
+ struct page *desc_pages[MAX_DESCRIPTOR_PAGES],
+ int nr_desc_pages, u8 *measurement)
+{
+ struct ahash_request *req;
+ DECLARE_CRYPTO_WAIT(wait);
+ struct scatterlist sg[MAX_DESCRIPTOR_PAGES];
+ int offset, len, remaining;
+ int i;
+ int err;
+
+ req = ahash_request_alloc(vi->hash_alg->tfm, GFP_KERNEL);
+ if (!req)
+ return -ENOMEM;
+
+ sg_init_table(sg, nr_desc_pages);
+ offset = offset_in_page(desc);
+ remaining = desc_auth_len;
+ for (i = 0; i < nr_desc_pages && remaining; i++) {
+ len = min_t(int, PAGE_SIZE - offset, remaining);
+ sg_set_page(&sg[i], desc_pages[i], len, offset);
+ remaining -= len;
+ offset = 0;
+ }
+
+ ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP |
+ CRYPTO_TFM_REQ_MAY_BACKLOG,
+ crypto_req_done, &wait);
+ ahash_request_set_crypt(req, sg, measurement, desc_auth_len);
+ err = crypto_wait_req(crypto_ahash_digest(req), &wait);
+ ahash_request_free(req);
+ return err;
+}
+
+static struct fsverity_info *alloc_fsverity_info(void)
+{
+ return kmem_cache_zalloc(fsverity_info_cachep, GFP_NOFS);
+}
+
+void free_fsverity_info(struct fsverity_info *vi)
+{
+ if (!vi)
+ return;
+ kfree(vi->hashstate);
+ kmem_cache_free(fsverity_info_cachep, vi);
+}
+
+/**
+ * find_fsverity_footer - find the fsverity_footer in the last page of the file
+ *
+ * To find the fsverity_footer we have to scan backwards from the end, skipping
+ * zero bytes. This is needed because some filesystems (e.g. ext4) set the
+ * on-disk i_size to data_i_size rather than to full_i_size, and full_i_size is
+ * instead gotten indirectly via the end of the last extent. This causes
+ * full_i_size to be rounded up to the end of the filesystem block.
+ *
+ * Return: pointer to the footer if found, else NULL
+ */
+static const struct fsverity_footer *
+find_fsverity_footer(const u8 *last_virt, size_t last_validsize)
+{
+ const u8 *p = last_virt + last_validsize;
+ const struct fsverity_footer *ftr;
+
+ /* Find the last nonzero byte, which should be ftr->magic[7] */
+ do {
+ if (p <= last_virt)
+ return NULL;
+ } while (*--p == 0);
+
+ BUILD_BUG_ON(sizeof(ftr->magic) != 8);
+ BUILD_BUG_ON(offsetof(struct fsverity_footer, magic[8]) !=
+ sizeof(*ftr));
+ if (p - last_virt < offsetof(struct fsverity_footer, magic[7]))
+ return NULL;
+ ftr = container_of(p, struct fsverity_footer, magic[7]);
+ if (memcmp(ftr->magic, FS_VERITY_MAGIC, sizeof(ftr->magic)))
+ return NULL;
+ return ftr;
+}
+
+/**
+ * map_fsverity_descriptor - map an inode's fs-verity descriptor into memory
+ *
+ * If the descriptor fits in one page, we use kmap; otherwise we use vmap.
+ * unmap_fsverity_descriptor() must be called later to unmap it.
+ *
+ * It's assumed that the file contents cannot be modified concurrently.
+ * (This is guaranteed by either deny_write_access() or by the verity bit.)
+ *
+ * Return: the virtual address of the start of the descriptor, in virtually
+ * contiguous memory. Also fills in desc_pages and returns in *desc_len the
+ * length of the descriptor including all extensions, and in *desc_start the
+ * offset of the descriptor from the start of the file, in bytes.
+ */
+static const struct fsverity_descriptor *
+map_fsverity_descriptor(struct inode *inode, loff_t full_i_size,
+ struct page *desc_pages[MAX_DESCRIPTOR_PAGES],
+ int *nr_desc_pages, int *desc_len, loff_t *desc_start)
+{
+ const int last_validsize = ((full_i_size - 1) & ~PAGE_MASK) + 1;
+ const pgoff_t last_pgoff = (full_i_size - 1) >> PAGE_SHIFT;
+ struct page *last_page;
+ const void *last_virt;
+ const struct fsverity_footer *ftr;
+ pgoff_t first_pgoff;
+ u32 desc_reverse_offset;
+ pgoff_t pgoff;
+ const void *desc_virt;
+ int i;
+ int err;
+
+ *nr_desc_pages = 0;
+ *desc_len = 0;
+ *desc_start = 0;
+
+ if (full_i_size <= 0) {
+ pr_warn("File is empty!\n");
+ return ERR_PTR(-EINVAL);
+ }
+
+ last_page = read_mapping_page(inode->i_mapping, last_pgoff, NULL);
+ if (IS_ERR(last_page)) {
+ pr_warn("Error reading last page: %ld\n", PTR_ERR(last_page));
+ return ERR_CAST(last_page);
+ }
+ last_virt = kmap(last_page);
+
+ ftr = find_fsverity_footer(last_virt, last_validsize);
+ if (!ftr) {
+ pr_warn("No verity metadata found\n");
+ err = -EINVAL;
+ goto err_out;
+ }
+ full_i_size -= (last_virt + last_validsize - sizeof(*ftr)) -
+ (void *)ftr;
+
+ desc_reverse_offset = le32_to_cpu(ftr->desc_reverse_offset);
+ if (desc_reverse_offset <
+ sizeof(struct fsverity_descriptor) + sizeof(*ftr) ||
+ desc_reverse_offset > full_i_size) {
+ pr_warn("Unexpected desc_reverse_offset: %u\n",
+ desc_reverse_offset);
+ err = -EINVAL;
+ goto err_out;
+ }
+ *desc_start = full_i_size - desc_reverse_offset;
+ if (*desc_start & 7) {
+ pr_warn("fs-verity descriptor is misaligned (desc_start=%lld)\n",
+ *desc_start);
+ err = -EINVAL;
+ goto err_out;
+ }
+
+ first_pgoff = *desc_start >> PAGE_SHIFT;
+ if (last_pgoff - first_pgoff >= MAX_DESCRIPTOR_PAGES) {
+ pr_warn("fs-verity descriptor is too long (%lu pages)\n",
+ last_pgoff - first_pgoff + 1);
+ err = -EINVAL;
+ goto err_out;
+ }
+
+ *desc_len = desc_reverse_offset - sizeof(__le32);
+
+ if (first_pgoff == last_pgoff) {
+ /* Single-page descriptor; use the already-kmapped last page */
+ desc_pages[0] = last_page;
+ *nr_desc_pages = 1;
+ return last_virt + (*desc_start & ~PAGE_MASK);
+ }
+
+ /* Multi-page descriptor; map the additional pages into memory */
+
+ for (pgoff = first_pgoff; pgoff < last_pgoff; pgoff++) {
+ struct page *page;
+
+ page = read_mapping_page(inode->i_mapping, pgoff, NULL);
+ if (IS_ERR(page)) {
+ err = PTR_ERR(page);
+ pr_warn("Error reading descriptor page: %d\n", err);
+ goto err_out;
+ }
+ desc_pages[(*nr_desc_pages)++] = page;
+ }
+
+ desc_pages[(*nr_desc_pages)++] = last_page;
+ kunmap(last_page);
+ last_page = NULL;
+
+ desc_virt = vmap(desc_pages, *nr_desc_pages, VM_MAP, PAGE_KERNEL_RO);
+ if (!desc_virt) {
+ err = -ENOMEM;
+ goto err_out;
+ }
+
+ return desc_virt + (*desc_start & ~PAGE_MASK);
+
+err_out:
+ for (i = 0; i < *nr_desc_pages; i++)
+ put_page(desc_pages[i]);
+ if (last_page) {
+ kunmap(last_page);
+ put_page(last_page);
+ }
+ return ERR_PTR(err);
+}
+
+static void
+unmap_fsverity_descriptor(const struct fsverity_descriptor *desc,
+ struct page *desc_pages[MAX_DESCRIPTOR_PAGES],
+ int nr_desc_pages)
+{
+ int i;
+
+ if (is_vmalloc_addr(desc)) {
+ vunmap((void *)((unsigned long)desc & PAGE_MASK));
+ } else {
+ WARN_ON(nr_desc_pages != 1);
+ kunmap(desc_pages[0]);
+ }
+ for (i = 0; i < nr_desc_pages; i++)
+ put_page(desc_pages[i]);
+}
+
+/*
+ * Read the file's fs-verity descriptor and create an fsverity_info for it.
+ */
+struct fsverity_info *create_fsverity_info(struct inode *inode, bool enabling)
+{
+ loff_t full_i_size;
+ struct fsverity_info *vi;
+ const struct fsverity_descriptor *desc = NULL;
+ struct page *desc_pages[MAX_DESCRIPTOR_PAGES];
+ int nr_desc_pages;
+ int desc_len;
+ loff_t desc_start;
+ int desc_auth_len;
+ int err;
+
+ vi = alloc_fsverity_info();
+ if (!vi)
+ return ERR_PTR(-ENOMEM);
+
+ full_i_size = i_size_read(inode);
+
+ if (inode->i_sb->s_vop->get_full_i_size && !enabling) {
+ /*
+ * For filesystems that set the on-disk i_size to data_i_size
+ * rather than to full_i_size, we have to get full_i_size from
+ * somewhere else, e.g. the end of the last extent.
+ */
+ vi->data_i_size = full_i_size;
+ err = inode->i_sb->s_vop->get_full_i_size(inode, &full_i_size);
+ if (err)
+ goto out;
+ }
+ vi->full_i_size = full_i_size;
+ pr_debug("full_i_size=%lld\n", full_i_size);
+
+ desc = map_fsverity_descriptor(inode, full_i_size, desc_pages,
+ &nr_desc_pages, &desc_len, &desc_start);
+ if (IS_ERR(desc)) {
+ err = PTR_ERR(desc);
+ desc = NULL;
+ goto out;
+ }
+
+ dump_fsverity_descriptor(desc);
+ desc_auth_len = parse_fsverity_descriptor(vi, desc, desc_len,
+ desc_start);
+ if (desc_auth_len < 0) {
+ err = desc_auth_len;
+ goto out;
+ }
+
+ err = compute_tree_depth_and_offsets(vi);
+ if (err)
+ goto out;
+ err = compute_measurement(vi, desc, desc_auth_len, desc_pages,
+ nr_desc_pages, vi->measurement);
+out:
+ if (desc)
+ unmap_fsverity_descriptor(desc, desc_pages, nr_desc_pages);
+ if (err) {
+ free_fsverity_info(vi);
+ vi = ERR_PTR(err);
+ }
+ return vi;
+}
+
+/* Ensure the inode has an ->i_verity_info */
+static int setup_fsverity_info(struct inode *inode)
+{
+ struct fsverity_info *vi = get_fsverity_info(inode);
+
+ if (vi)
+ return 0;
+
+ vi = create_fsverity_info(inode, false);
+ if (IS_ERR(vi))
+ return PTR_ERR(vi);
+
+ if (!set_fsverity_info(inode, vi))
+ free_fsverity_info(vi);
+ return 0;
+}
+
+/**
+ * fsverity_file_open - prepare to open a verity file
+ * @inode: the inode being opened
+ * @filp: the struct file being set up
+ *
+ * When opening a verity file, deny the open if it is for writing. Otherwise,
+ * set up the inode's ->i_verity_info (if not already done) by parsing the
+ * verity metadata at the end of the file.
+ *
+ * When combined with fscrypt, this must be called after fscrypt_file_open().
+ * Otherwise, we won't have the key set up to decrypt the verity metadata.
+ *
+ * Return: 0 on success, -errno on failure
+ */
+int fsverity_file_open(struct inode *inode, struct file *filp)
+{
+ if (filp->f_mode & FMODE_WRITE) {
+ pr_debug("Denying opening verity file (ino %lu) for write\n",
+ inode->i_ino);
+ return -EPERM;
+ }
+
+ return setup_fsverity_info(inode);
+}
+EXPORT_SYMBOL_GPL(fsverity_file_open);
+
+/**
+ * fsverity_prepare_setattr - prepare to change a verity inode's attributes
+ * @dentry: dentry through which the inode is being changed
+ * @attr: attributes to change
+ *
+ * Verity files are immutable, so deny truncates. This isn't covered by the
+ * open-time check because sys_truncate() takes a path, not a file descriptor.
+ *
+ * Return: 0 on success, -errno on failure
+ */
+int fsverity_prepare_setattr(struct dentry *dentry, struct iattr *attr)
+{
+ if (attr->ia_valid & ATTR_SIZE) {
+ pr_debug("Denying truncate of verity file (ino %lu)\n",
+ d_inode(dentry)->i_ino);
+ return -EPERM;
+ }
+ return 0;
+}
+EXPORT_SYMBOL_GPL(fsverity_prepare_setattr);
+
+/**
+ * fsverity_prepare_getattr - prepare to get a verity inode's attributes
+ * @inode: the inode for which the attributes are being retrieved
+ *
+ * Some filesystems set the on-disk i_size to full_i_size rather than to
+ * data_i_size. For st_size to exclude the verity metadata even before the
+ * file has been opened for the first time, we need to grab the original data
+ * size from the fs-verity descriptor. Currently, we implement this by simply
+ * setting up ->i_verity_info, as in the ->open() hook.
+ *
+ * However, when combined with fscrypt, on an encrypted file this must only be
+ * called if the encryption key has been set up!
+ *
+ * Return: 0 on success, -errno on failure
+ */
+int fsverity_prepare_getattr(struct inode *inode)
+{
+ return setup_fsverity_info(inode);
+}
+EXPORT_SYMBOL_GPL(fsverity_prepare_getattr);
+
+/**
+ * fsverity_cleanup_inode - free the inode's verity info, if present
+ *
+ * Filesystems must call this on inode eviction to free ->i_verity_info.
+ */
+void fsverity_cleanup_inode(struct inode *inode)
+{
+ free_fsverity_info(inode->i_verity_info);
+ inode->i_verity_info = NULL;
+}
+EXPORT_SYMBOL_GPL(fsverity_cleanup_inode);
+
+/**
+ * fsverity_full_i_size - get the full (on-disk) file size
+ *
+ * If the inode has had its in-memory ->i_size overridden for fs-verity (to
+ * exclude the metadata at the end of the file), then return the full i_size
+ * which is stored on-disk. Otherwise, just return the in-memory ->i_size.
+ *
+ * Return: the full (on-disk) file size
+ */
+loff_t fsverity_full_i_size(const struct inode *inode)
+{
+ struct fsverity_info *vi = get_fsverity_info(inode);
+
+ if (vi)
+ return vi->full_i_size;
+
+ return i_size_read(inode);
+}
+EXPORT_SYMBOL_GPL(fsverity_full_i_size);
+
+static int __init fsverity_module_init(void)
+{
+ fsverity_info_cachep = KMEM_CACHE(fsverity_info, SLAB_RECLAIM_ACCOUNT);
+ if (!fsverity_info_cachep)
+ return -ENOMEM;
+
+ fsverity_check_hash_algs();
+
+ pr_debug("Initialized fs-verity\n");
+ return 0;
+}
+
+static void __exit fsverity_module_exit(void)
+{
+ kmem_cache_destroy(fsverity_info_cachep);
+ fsverity_exit_hash_algs();
+}
+
+module_init(fsverity_module_init);
+module_exit(fsverity_module_exit);
+MODULE_LICENSE("GPL v2");
+MODULE_DESCRIPTION("fs-verity: read-only file-based integrity/authentication");
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 805bf22898cf2..26764ebcb7724 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -61,6 +61,8 @@ struct workqueue_struct;
struct iov_iter;
struct fscrypt_info;
struct fscrypt_operations;
+struct fsverity_info;
+struct fsverity_operations;
extern void __init inode_init(void);
extern void __init inode_init_early(void);
@@ -671,6 +673,10 @@ struct inode {
struct fscrypt_info *i_crypt_info;
#endif
+#if IS_ENABLED(CONFIG_FS_VERITY)
+ struct fsverity_info *i_verity_info;
+#endif
+
void *i_private; /* fs or device private pointer */
} __randomize_layout;
@@ -1369,6 +1375,9 @@ struct super_block {
const struct xattr_handler **s_xattr;
#if IS_ENABLED(CONFIG_FS_ENCRYPTION)
const struct fscrypt_operations *s_cop;
+#endif
+#if IS_ENABLED(CONFIG_FS_VERITY)
+ const struct fsverity_operations *s_vop;
#endif
struct hlist_bl_head s_roots; /* alternate root dentries for NFS */
struct list_head s_mounts; /* list of mounts; _not_ for fs use */
diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
new file mode 100644
index 0000000000000..3af55241046aa
--- /dev/null
+++ b/include/linux/fsverity.h
@@ -0,0 +1,62 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * fs-verity: read-only file-based integrity/authentication
+ *
+ * Copyright (C) 2018 Google, Inc.
+ */
+
+#ifndef _LINUX_FSVERITY_H
+#define _LINUX_FSVERITY_H
+
+#include <linux/fs.h>
+#include <uapi/linux/fsverity.h>
+
+/*
+ * fs-verity operations for filesystems
+ */
+struct fsverity_operations {
+ int (*set_verity)(struct inode *inode, loff_t data_i_size);
+ int (*get_full_i_size)(struct inode *inode, loff_t *full_i_size_ret);
+};
+
+#if __FS_HAS_VERITY
+
+/* setup.c */
+extern int fsverity_file_open(struct inode *inode, struct file *filp);
+extern int fsverity_prepare_setattr(struct dentry *dentry, struct iattr *attr);
+extern int fsverity_prepare_getattr(struct inode *inode);
+extern void fsverity_cleanup_inode(struct inode *inode);
+extern loff_t fsverity_full_i_size(const struct inode *inode);
+
+#else /* !__FS_HAS_VERITY */
+
+/* setup.c */
+
+static inline int fsverity_file_open(struct inode *inode, struct file *filp)
+{
+ return -EOPNOTSUPP;
+}
+
+static inline int fsverity_prepare_setattr(struct dentry *dentry,
+ struct iattr *attr)
+{
+ return -EOPNOTSUPP;
+}
+
+static inline int fsverity_prepare_getattr(struct inode *inode)
+{
+ return -EOPNOTSUPP;
+}
+
+static inline void fsverity_cleanup_inode(struct inode *inode)
+{
+}
+
+static inline loff_t fsverity_full_i_size(const struct inode *inode)
+{
+ return i_size_read(inode);
+}
+
+#endif /* !__FS_HAS_VERITY */
+
+#endif /* _LINUX_FSVERITY_H */
diff --git a/include/uapi/linux/fsverity.h b/include/uapi/linux/fsverity.h
new file mode 100644
index 0000000000000..24ebb8b6ea0d4
--- /dev/null
+++ b/include/uapi/linux/fsverity.h
@@ -0,0 +1,86 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * fs-verity (file-based verity) support
+ *
+ * Copyright (C) 2018 Google LLC
+ */
+#ifndef _UAPI_LINUX_FSVERITY_H
+#define _UAPI_LINUX_FSVERITY_H
+
+#include <linux/limits.h>
+#include <linux/ioctl.h>
+#include <linux/types.h>
+
+/* ========== Ioctls ========== */
+
+struct fsverity_digest {
+ __u16 digest_algorithm;
+ __u16 digest_size; /* input/output */
+ __u8 digest[];
+};
+
+#define FS_IOC_ENABLE_VERITY _IO('f', 133)
+#define FS_IOC_MEASURE_VERITY _IOWR('f', 134, struct fsverity_digest)
+
+/* ========== On-disk format ========== */
+
+#define FS_VERITY_MAGIC "FSVerity"
+
+/* Supported hash algorithms */
+#define FS_VERITY_ALG_SHA256 1
+
+/* Metadata stored near the end of verity files, after the Merkle tree */
+/* This structure is 64 bytes long */
+struct fsverity_descriptor {
+ __u8 magic[8]; /* must be FS_VERITY_MAGIC */
+ __u8 major_version; /* must be 1 */
+ __u8 minor_version; /* must be 0 */
+ __u8 log_data_blocksize;/* log2(data-bytes-per-hash), e.g. 12 for 4KB */
+ __u8 log_tree_blocksize;/* log2(tree-bytes-per-hash), e.g. 12 for 4KB */
+ __le16 data_algorithm; /* hash algorithm for data blocks */
+ __le16 tree_algorithm; /* hash algorithm for tree blocks */
+ __le32 flags; /* flags */
+ __le32 reserved1; /* must be 0 */
+ __le64 orig_file_size; /* size of the original, unpadded data */
+ __le16 auth_ext_count; /* number of authenticated extensions */
+ __u8 reserved2[30]; /* must be 0 */
+};
+/* followed by list of 'auth_ext_count' authenticated extensions */
+/*
+ * then followed by '__le16 unauth_ext_count' padded to next 8-byte boundary,
+ * then a list of 'unauth_ext_count' (may be 0) unauthenticated extensions
+ */
+
+/* Extension types */
+#define FS_VERITY_EXT_ROOT_HASH 1
+#define FS_VERITY_EXT_SALT 2
+
+/* Header of each extension (variable-length metadata item) */
+struct fsverity_extension {
+ /*
+ * Length in bytes, including this header but excluding padding to next
+ * 8-byte boundary that is applied when advancing to the next extension.
+ */
+ __le32 length;
+ __le16 type; /* Type of this extension (see codes above) */
+ __le16 reserved; /* Reserved, must be 0 */
+};
+/* followed by the payload of 'length - 8' bytes */
+
+/* Extension payload formats */
+
+/*
+ * FS_VERITY_EXT_ROOT_HASH payload is just a byte array, with size equal to the
+ * digest size of the hash algorithm given in the fsverity_descriptor
+ */
+
+/* FS_VERITY_EXT_SALT payload is just a byte array, any size */
+
+
+/* Fields stored at the very end of the file */
+struct fsverity_footer {
+ __le32 desc_reverse_offset; /* distance to fsverity_descriptor */
+ __u8 magic[8]; /* FS_VERITY_MAGIC */
+} __packed;
+
+#endif /* _UAPI_LINUX_FSVERITY_H */
--
2.18.0
From: Eric Biggers <[email protected]>
Add SHA-512 support to fs-verity. This is primarily a demonstration of
the (small) changes needed to support a new hash algorithm; it's
anticipated that most users will still prefer SHA-256 due to the smaller
space required to store the hashes, though some may prefer SHA-512.
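To make the space trade-off concrete, here is a rough userspace sketch (not
part of this patch) that counts the hash blocks a Merkle tree needs for a
given digest size. The function name merkle_tree_blocks() is hypothetical, and
the sketch assumes that each level is packed into whole blocks and that even a
one-block file gets one hash block; the kernel's actual layout may differ in
such edge cases.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patchset: count the hash blocks
 * needed for a Merkle tree over 'data_blocks' data blocks.
 *
 * Assumption: every level stores one digest per block of the level below,
 * packed into 'block_size'-byte blocks, and levels are added until a level
 * fits in a single block.
 */
uint64_t merkle_tree_blocks(uint64_t data_blocks, unsigned int block_size,
			    unsigned int digest_size)
{
	const uint64_t hashes_per_block = block_size / digest_size;
	uint64_t n = data_blocks;
	uint64_t total = 0;

	do {
		/* Blocks needed at this level: ceil(n / hashes_per_block) */
		n = (n + hashes_per_block - 1) / hashes_per_block;
		total += n;
	} while (n > 1);

	return total;
}
```

Under these assumptions, a 1 GiB file with 4096-byte blocks (262144 data
blocks) needs 2065 hash blocks with 32-byte SHA-256 digests and 4161 with
64-byte SHA-512 digests, i.e. roughly twice the space.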
Signed-off-by: Eric Biggers <[email protected]>
---
fs/verity/fsverity_private.h | 2 +-
fs/verity/hash_algs.c | 5 +++++
include/uapi/linux/fsverity.h | 1 +
3 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/fs/verity/fsverity_private.h b/fs/verity/fsverity_private.h
index c553f99dc4973..1046b87b12dee 100644
--- a/fs/verity/fsverity_private.h
+++ b/fs/verity/fsverity_private.h
@@ -30,7 +30,7 @@
* Largest digest size among all hash algorithms supported by fs-verity. This
* can be increased if needed.
*/
-#define FS_VERITY_MAX_DIGEST_SIZE SHA256_DIGEST_SIZE
+#define FS_VERITY_MAX_DIGEST_SIZE SHA512_DIGEST_SIZE
/* A hash algorithm supported by fs-verity */
struct fsverity_hash_alg {
diff --git a/fs/verity/hash_algs.c b/fs/verity/hash_algs.c
index 424a26ee2f3c2..e16d767070fec 100644
--- a/fs/verity/hash_algs.c
+++ b/fs/verity/hash_algs.c
@@ -18,6 +18,11 @@ struct fsverity_hash_alg fsverity_hash_algs[] = {
.digest_size = 32,
.cryptographic = true,
},
+ [FS_VERITY_ALG_SHA512] = {
+ .name = "sha512",
+ .digest_size = 64,
+ .cryptographic = true,
+ },
};
/*
diff --git a/include/uapi/linux/fsverity.h b/include/uapi/linux/fsverity.h
index 24ebb8b6ea0d4..64846763f7aef 100644
--- a/include/uapi/linux/fsverity.h
+++ b/include/uapi/linux/fsverity.h
@@ -28,6 +28,7 @@ struct fsverity_digest {
/* Supported hash algorithms */
#define FS_VERITY_ALG_SHA256 1
+#define FS_VERITY_ALG_SHA512 2
/* Metadata stored near the end of verity files, after the Merkle tree */
/* This structure is 64 bytes long */
--
2.18.0
From: Eric Biggers <[email protected]>
Add functions that verify data pages that have been read from an
fs-verity file, against that file's Merkle tree. These will be called
from filesystems' ->readpage() and ->readpages() methods.
Since data verification can block, a workqueue is provided for these
methods to enqueue verification work from their bio completion callback.
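The index arithmetic used to locate a block's hash at a given tree level can be
tried out in userspace with the sketch below. It mirrors the shifts and masks
described for hash_at_level() in this patch, but struct tree_params, struct
hash_loc, and the array size of 8 are hypothetical stand-ins for the relevant
fsverity_info fields, chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the fsverity_info fields used here */
struct tree_params {
	unsigned int log_arity;		/* log2(hashes per hash block) */
	unsigned int digest_size;	/* size of each hash in bytes */
	uint64_t lvl_region_idx[8];	/* first block index of each level */
};

/* Hypothetical result type: where the wanted hash lives */
struct hash_loc {
	uint64_t hindex;	/* index of the hash block */
	unsigned int hoffset;	/* byte offset of the hash within that block */
};

struct hash_loc hash_at_level(const struct tree_params *p, uint64_t dindex,
			      unsigned int level)
{
	/* Offset of the hash within the level's region, in hashes */
	uint64_t hoffset_in_lvl = dindex >> (level * p->log_arity);
	struct hash_loc loc;

	/* Block containing the hash, relative to the start of the file */
	loc.hindex = p->lvl_region_idx[level] +
		     (hoffset_in_lvl >> p->log_arity);

	/* Position within that block, converted from hashes to bytes */
	loc.hoffset = (unsigned int)(hoffset_in_lvl &
				     ((1ULL << p->log_arity) - 1)) *
		      p->digest_size;
	return loc;
}
```

With log_arity = 7 and 32-byte hashes, data block 65668 at level 1 lands 4
blocks into the level 1 region at byte offset 32, which matches the worked
example in the comments of hash_at_level().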
Signed-off-by: Eric Biggers <[email protected]>
---
fs/verity/Makefile | 2 +-
fs/verity/fsverity_private.h | 3 +
fs/verity/setup.c | 26 ++-
fs/verity/verify.c | 310 +++++++++++++++++++++++++++++++++++
include/linux/fsverity.h | 23 +++
5 files changed, 362 insertions(+), 2 deletions(-)
create mode 100644 fs/verity/verify.c
diff --git a/fs/verity/Makefile b/fs/verity/Makefile
index 39e123805c827..a6c7cefb61ab7 100644
--- a/fs/verity/Makefile
+++ b/fs/verity/Makefile
@@ -1,3 +1,3 @@
obj-$(CONFIG_FS_VERITY) += fsverity.o
-fsverity-y := hash_algs.o setup.o
+fsverity-y := hash_algs.o setup.o verify.o
diff --git a/fs/verity/fsverity_private.h b/fs/verity/fsverity_private.h
index a18ff645695f4..c553f99dc4973 100644
--- a/fs/verity/fsverity_private.h
+++ b/fs/verity/fsverity_private.h
@@ -96,4 +96,7 @@ static inline bool set_fsverity_info(struct inode *inode,
return true;
}
+/* verify.c */
+extern struct workqueue_struct *fsverity_read_workqueue;
+
#endif /* _FSVERITY_PRIVATE_H */
diff --git a/fs/verity/setup.c b/fs/verity/setup.c
index e675c52898d5b..84cc2edeca25b 100644
--- a/fs/verity/setup.c
+++ b/fs/verity/setup.c
@@ -824,18 +824,42 @@ EXPORT_SYMBOL_GPL(fsverity_full_i_size);
static int __init fsverity_module_init(void)
{
+ int err;
+
+ /*
+ * Use an unbound workqueue to allow bios to be verified in parallel
+ * even when they happen to complete on the same CPU. This sacrifices
+ * locality, but it's worthwhile since hashing is CPU-intensive.
+ *
+ * Also use a high-priority workqueue to prioritize verification work,
+ * which blocks reads from completing, over regular application tasks.
+ */
+ err = -ENOMEM;
+ fsverity_read_workqueue = alloc_workqueue("fsverity_read_queue",
+ WQ_UNBOUND | WQ_HIGHPRI,
+ num_online_cpus());
+ if (!fsverity_read_workqueue)
+ goto error;
+
+ err = -ENOMEM;
fsverity_info_cachep = KMEM_CACHE(fsverity_info, SLAB_RECLAIM_ACCOUNT);
if (!fsverity_info_cachep)
- return -ENOMEM;
+ goto error_free_workqueue;
fsverity_check_hash_algs();
pr_debug("Initialized fs-verity\n");
return 0;
+
+error_free_workqueue:
+ destroy_workqueue(fsverity_read_workqueue);
+error:
+ return err;
}
static void __exit fsverity_module_exit(void)
{
+ destroy_workqueue(fsverity_read_workqueue);
kmem_cache_destroy(fsverity_info_cachep);
fsverity_exit_hash_algs();
}
diff --git a/fs/verity/verify.c b/fs/verity/verify.c
new file mode 100644
index 0000000000000..1452dd05f75d3
--- /dev/null
+++ b/fs/verity/verify.c
@@ -0,0 +1,310 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * fs/verity/verify.c: fs-verity data verification functions,
+ * i.e. hooks for ->readpages()
+ *
+ * Copyright (C) 2018 Google LLC
+ *
+ * Originally written by Jaegeuk Kim and Michael Halcrow;
+ * heavily rewritten by Eric Biggers.
+ */
+
+#include "fsverity_private.h"
+
+#include <crypto/hash.h>
+#include <linux/bio.h>
+#include <linux/pagemap.h>
+#include <linux/ratelimit.h>
+#include <linux/scatterlist.h>
+
+struct workqueue_struct *fsverity_read_workqueue;
+
+/**
+ * hash_at_level() - compute the location of the block's hash at the given level
+ *
+ * @vi: (in) the file's verity info
+ * @dindex: (in) the index of the data block being verified
+ * @level: (in) the level of hash we want
+ * @hindex: (out) the index of the hash block containing the wanted hash
+ * @hoffset: (out) the byte offset to the wanted hash within the hash block
+ */
+static void hash_at_level(const struct fsverity_info *vi, pgoff_t dindex,
+ unsigned int level, pgoff_t *hindex,
+ unsigned int *hoffset)
+{
+ pgoff_t hoffset_in_lvl;
+
+ /*
+ * Compute the offset of the hash within the level's region, in hashes.
+ * For example, with 4096-byte blocks and 32-byte hashes, there are
+ * 4096/32 = 128 = 2^7 hashes per hash block, i.e. log_arity = 7. Then,
+ * if the data block index is 65668 and we want the level 1 hash, it is
+ * located at 65668 >> 7 = 513 hashes into the level 1 region.
+ */
+ hoffset_in_lvl = dindex >> (level * vi->log_arity);
+
+ /*
+ * Compute the index of the hash block containing the wanted hash.
+ * Continuing the above example, the block would be at index 513 >> 7 =
+ * 4 within the level 1 region. To this we'd add the index at which the
+ * level 1 region starts.
+ */
+ *hindex = vi->hash_lvl_region_idx[level] +
+ (hoffset_in_lvl >> vi->log_arity);
+
+ /*
+ * Finally, compute the index of the hash within the block rather than
+ * the region, and multiply by the hash size to turn it into a byte
+ * offset. Continuing the above example, the hash would be at byte
+ * offset (513 & ((1 << 7) - 1)) * 32 = 32 within the block.
+ */
+ *hoffset = (hoffset_in_lvl & ((1 << vi->log_arity) - 1)) *
+ vi->hash_alg->digest_size;
+}
+
+/* Extract a hash from a hash page */
+static void extract_hash(struct page *hpage, unsigned int hoffset,
+ unsigned int hsize, u8 *out)
+{
+ void *virt = kmap_atomic(hpage);
+
+ memcpy(out, virt + hoffset, hsize);
+ kunmap_atomic(virt);
+}
+
+static int hash_page(const struct fsverity_info *vi, struct ahash_request *req,
+ struct page *page, u8 *out)
+{
+ struct scatterlist sg[3];
+ DECLARE_CRYPTO_WAIT(wait);
+ int err;
+
+ sg_init_table(sg, 1);
+ sg_set_page(&sg[0], page, PAGE_SIZE, 0);
+
+ ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP |
+ CRYPTO_TFM_REQ_MAY_BACKLOG,
+ crypto_req_done, &wait);
+ ahash_request_set_crypt(req, sg, out, PAGE_SIZE);
+
+ err = crypto_ahash_import(req, vi->hashstate);
+ if (err)
+ return err;
+
+ return crypto_wait_req(crypto_ahash_finup(req), &wait);
+}
+
+static inline int compare_hashes(const u8 *want_hash, const u8 *real_hash,
+ int digest_size, struct inode *inode,
+ pgoff_t index, int level, const char *algname)
+{
+ if (memcmp(want_hash, real_hash, digest_size) == 0)
+ return 0;
+
+ pr_warn_ratelimited("VERIFICATION FAILURE! ino=%lu, index=%lu, level=%d, want_hash=%s:%*phN, real_hash=%s:%*phN\n",
+ inode->i_ino, index, level,
+ algname, digest_size, want_hash,
+ algname, digest_size, real_hash);
+ return -EBADMSG;
+}
+
+/*
+ * Verify a single data page against the file's Merkle tree.
+ *
+ * In principle, we need to verify the entire path to the root node. But as an
+ * optimization, we cache the hash pages in the file's page cache, similar to
+ * data pages. Therefore, we can stop verifying as soon as a verified hash page
+ * is seen while ascending the tree.
+ *
+ * Note that unlike data pages, hash pages are marked Uptodate *before* they are
+ * verified; instead, the Checked bit is set on hash pages that have been
+ * verified. Multiple tasks may race to verify a hash page and mark it Checked,
+ * but it doesn't matter. The use of the Checked bit also implies that the hash
+ * block size must equal PAGE_SIZE (for now).
+ */
+static bool verify_page(struct inode *inode, const struct fsverity_info *vi,
+ struct ahash_request *req, struct page *data_page)
+{
+ pgoff_t index = data_page->index;
+ int level = 0;
+ u8 _want_hash[FS_VERITY_MAX_DIGEST_SIZE];
+ const u8 *want_hash = NULL;
+ u8 real_hash[FS_VERITY_MAX_DIGEST_SIZE];
+ struct page *hpages[FS_VERITY_MAX_LEVELS];
+ unsigned int hoffsets[FS_VERITY_MAX_LEVELS];
+ int err;
+
+ /* The page must not be unlocked until verification has completed. */
+ if (WARN_ON_ONCE(!PageLocked(data_page)))
+ return false;
+
+ /*
+ * Since ->i_size is overridden with ->data_i_size, and fs-verity avoids
+ * recursing into itself when reading hash pages, we shouldn't normally
+ * get here with a page beyond ->data_i_size. But, it can happen if a
+ * read is issued at or beyond EOF since the VFS doesn't check i_size
+ * before calling ->readpage(). Thus, just skip verification if the
+ * page is beyond ->data_i_size.
+ */
+ if (index >= (vi->data_i_size + PAGE_SIZE - 1) >> PAGE_SHIFT) {
+ pr_debug("Page %lu is in metadata region\n", index);
+ return true;
+ }
+
+ pr_debug_ratelimited("Verifying data page %lu...\n", index);
+
+ /*
+ * Starting at the leaves, ascend the tree saving hash pages along the
+ * way until we find a verified hash page, indicated by PageChecked; or
+ * until we reach the root.
+ */
+ for (level = 0; level < vi->depth; level++) {
+ pgoff_t hindex;
+ unsigned int hoffset;
+ struct page *hpage;
+
+ hash_at_level(vi, index, level, &hindex, &hoffset);
+
+ pr_debug_ratelimited("Level %d: hindex=%lu, hoffset=%u\n",
+ level, hindex, hoffset);
+
+ hpage = read_mapping_page(inode->i_mapping, hindex, NULL);
+ if (IS_ERR(hpage)) {
+ err = PTR_ERR(hpage);
+ goto out;
+ }
+
+ if (PageChecked(hpage)) {
+ extract_hash(hpage, hoffset, vi->hash_alg->digest_size,
+ _want_hash);
+ want_hash = _want_hash;
+ put_page(hpage);
+ pr_debug_ratelimited("Hash page already checked, want %s:%*phN\n",
+ vi->hash_alg->name,
+ vi->hash_alg->digest_size,
+ want_hash);
+ break;
+ }
+ pr_debug_ratelimited("Hash page not yet checked\n");
+ hpages[level] = hpage;
+ hoffsets[level] = hoffset;
+ }
+
+ if (!want_hash) {
+ want_hash = vi->root_hash;
+ pr_debug("Want root hash: %s:%*phN\n", vi->hash_alg->name,
+ vi->hash_alg->digest_size, want_hash);
+ }
+
+ /* Descend the tree verifying hash pages */
+ for (; level > 0; level--) {
+ struct page *hpage = hpages[level - 1];
+ unsigned int hoffset = hoffsets[level - 1];
+
+ err = hash_page(vi, req, hpage, real_hash);
+ if (err)
+ goto out;
+ err = compare_hashes(want_hash, real_hash,
+ vi->hash_alg->digest_size,
+ inode, index, level - 1,
+ vi->hash_alg->name);
+ if (err)
+ goto out;
+ SetPageChecked(hpage);
+ extract_hash(hpage, hoffset, vi->hash_alg->digest_size,
+ _want_hash);
+ want_hash = _want_hash;
+ put_page(hpage);
+ pr_debug("Verified hash page at level %d, now want %s:%*phN\n",
+ level - 1, vi->hash_alg->name,
+ vi->hash_alg->digest_size, want_hash);
+ }
+
+ /* Finally, verify the data page */
+ err = hash_page(vi, req, data_page, real_hash);
+ if (err)
+ goto out;
+ err = compare_hashes(want_hash, real_hash, vi->hash_alg->digest_size,
+ inode, index, -1, vi->hash_alg->name);
+out:
+ for (; level > 0; level--)
+ put_page(hpages[level - 1]);
+ if (err) {
+ pr_warn_ratelimited("Error verifying page; ino=%lu, index=%lu (err=%d)\n",
+ inode->i_ino, data_page->index, err);
+ return false;
+ }
+ return true;
+}
+
+/**
+ * fsverity_verify_page - verify a data page
+ *
+ * Verify a page that has just been read from a file against that file's Merkle
+ * tree. The page is assumed to be a pagecache page.
+ *
+ * Return: true if the page is valid, else false.
+ */
+bool fsverity_verify_page(struct page *data_page)
+{
+ struct inode *inode = data_page->mapping->host;
+ const struct fsverity_info *vi = get_fsverity_info(inode);
+ struct ahash_request *req;
+ bool valid;
+
+ req = ahash_request_alloc(vi->hash_alg->tfm, GFP_KERNEL);
+ if (unlikely(!req))
+ return false;
+
+ valid = verify_page(inode, vi, req, data_page);
+
+ ahash_request_free(req);
+
+ return valid;
+}
+EXPORT_SYMBOL_GPL(fsverity_verify_page);
+
+/**
+ * fsverity_verify_bio - verify a 'read' bio that has just completed
+ *
+ * Verify a set of pages that have just been read from a file against that
+ * file's Merkle tree. The pages are assumed to be pagecache pages. Pages that
+ * fail verification are set to the Error state. Verification is skipped for
+ * pages already in the Error state, e.g. due to fscrypt decryption failure.
+ */
+void fsverity_verify_bio(struct bio *bio)
+{
+ struct inode *inode = bio_first_page_all(bio)->mapping->host;
+ const struct fsverity_info *vi = get_fsverity_info(inode);
+ struct ahash_request *req;
+ struct bio_vec *bv;
+ int i;
+
+ req = ahash_request_alloc(vi->hash_alg->tfm, GFP_KERNEL);
+ if (unlikely(!req)) {
+ bio_for_each_segment_all(bv, bio, i)
+ SetPageError(bv->bv_page);
+ return;
+ }
+
+ bio_for_each_segment_all(bv, bio, i) {
+ struct page *page = bv->bv_page;
+
+ if (!PageError(page) && !verify_page(inode, vi, req, page))
+ SetPageError(page);
+ }
+
+ ahash_request_free(req);
+}
+EXPORT_SYMBOL_GPL(fsverity_verify_bio);
+
+/**
+ * fsverity_enqueue_verify_work - enqueue work on the fs-verity workqueue
+ *
+ * Enqueue verification work for asynchronous processing.
+ */
+void fsverity_enqueue_verify_work(struct work_struct *work)
+{
+ queue_work(fsverity_read_workqueue, work);
+}
+EXPORT_SYMBOL_GPL(fsverity_enqueue_verify_work);
diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
index 3af55241046aa..56341f10aa965 100644
--- a/include/linux/fsverity.h
+++ b/include/linux/fsverity.h
@@ -28,6 +28,11 @@ extern int fsverity_prepare_getattr(struct inode *inode);
extern void fsverity_cleanup_inode(struct inode *inode);
extern loff_t fsverity_full_i_size(const struct inode *inode);
+/* verify.c */
+extern bool fsverity_verify_page(struct page *page);
+extern void fsverity_verify_bio(struct bio *bio);
+extern void fsverity_enqueue_verify_work(struct work_struct *work);
+
#else /* !__FS_HAS_VERITY */
/* setup.c */
@@ -57,6 +62,24 @@ static inline loff_t fsverity_full_i_size(const struct inode *inode)
return i_size_read(inode);
}
+/* verify.c */
+
+static inline bool fsverity_verify_page(struct page *page)
+{
+ WARN_ON(1);
+ return false;
+}
+
+static inline void fsverity_verify_bio(struct bio *bio)
+{
+ WARN_ON(1);
+}
+
+static inline void fsverity_enqueue_verify_work(struct work_struct *work)
+{
+ WARN_ON(1);
+}
+
#endif /* !__FS_HAS_VERITY */
#endif /* _LINUX_FSVERITY_H */
--
2.18.0
From: Eric Biggers <[email protected]>
Add a function for filesystems to call to implement the
FS_IOC_ENABLE_VERITY ioctl. This ioctl performs various validations
(e.g., checking that the file isn't open for writing), then calls back
into the filesystem to set the verity bit on the inode.
This ioctl is used to mark a file as being fs-verity protected, after
userspace has appended the Merkle tree and other metadata to the file.
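
For illustration only, a minimal userspace sketch of how a caller might
invoke this ioctl. The helper name enable_verity() is hypothetical, and
the request number is passed in as a parameter because its value comes
from the uapi header added elsewhere in this series and is not
reproduced here. The sketch only reflects two points from this patch:
the file must not be open for writing (including by the caller), and
the argument is reserved and must be NULL.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of this patchset.  'enable_request' stands
 * in for FS_IOC_ENABLE_VERITY, whose numeric value is not shown here.
 *
 * Returns 0 on success or a negative errno value on failure.
 */
static int enable_verity(const char *path, unsigned long enable_request)
{
	int fd;
	int err = 0;

	assert(path != NULL);

	/*
	 * Open read-only: the ioctl refuses to run (-ETXTBSY in this patch)
	 * while any writable file descriptor exists, including our own.
	 */
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	/* The argument is reserved, so pass NULL. */
	if (ioctl(fd, enable_request, NULL) != 0)
		err = -errno;

	close(fd);
	return err;
}
```

The Merkle tree and other metadata must already have been appended and
the writable descriptor closed before a helper like this is called.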
Signed-off-by: Eric Biggers <[email protected]>
---
fs/verity/Makefile | 2 +-
fs/verity/ioctl.c | 121 +++++++++++++++++++++++++++++++++++++++
include/linux/fsverity.h | 11 ++++
3 files changed, 133 insertions(+), 1 deletion(-)
create mode 100644 fs/verity/ioctl.c
diff --git a/fs/verity/Makefile b/fs/verity/Makefile
index a6c7cefb61ab7..6450925e3a8b7 100644
--- a/fs/verity/Makefile
+++ b/fs/verity/Makefile
@@ -1,3 +1,3 @@
obj-$(CONFIG_FS_VERITY) += fsverity.o
-fsverity-y := hash_algs.o setup.o verify.o
+fsverity-y := hash_algs.o ioctl.o setup.o verify.o
diff --git a/fs/verity/ioctl.c b/fs/verity/ioctl.c
new file mode 100644
index 0000000000000..993f2afdcc734
--- /dev/null
+++ b/fs/verity/ioctl.c
@@ -0,0 +1,121 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * fs/verity/ioctl.c: fs-verity ioctls
+ *
+ * Copyright (C) 2018 Google LLC
+ *
+ * Originally written by Jaegeuk Kim and Michael Halcrow;
+ * heavily rewritten by Eric Biggers.
+ */
+
+#include "fsverity_private.h"
+
+#include <linux/capability.h>
+#include <linux/mm.h>
+#include <linux/mount.h>
+#include <linux/uaccess.h>
+
+/**
+ * fsverity_ioctl_enable - enable fs-verity on a file
+ *
+ * Set the verity bit on a file. Userspace must have already appended verity
+ * metadata to the file.
+ *
+ * Enabling fs-verity makes the file contents immutable, and the filesystem
+ * doesn't allow disabling it (other than by replacing the file).
+ *
+ * To avoid races with the file contents being modified, no processes must have
+ * the file open for writing. This includes the caller!
+ *
+ * Return: 0 on success, -errno on failure
+ */
+int fsverity_ioctl_enable(struct file *filp, const void __user *arg)
+{
+ struct inode *inode = file_inode(filp);
+ struct fsverity_info *vi;
+ int err;
+
+ /*
+ * In principle we need only check 'inode_owner_or_capable(inode)',
+ * which would allow non-root users to enable fs-verity on their own
+ * files. But to be conservative, for now restrict this to root-only.
+ */
+ if (!capable(CAP_SYS_ADMIN))
+ return -EACCES;
+
+ if (arg) /* argument is reserved */
+ return -EINVAL;
+
+ if (S_ISDIR(inode->i_mode))
+ return -EISDIR;
+
+ if (!S_ISREG(inode->i_mode))
+ return -EINVAL;
+
+ err = mnt_want_write_file(filp);
+ if (err)
+ goto out;
+
+ /*
+ * Temporarily lock out writers via writable file descriptors or
+ * truncate(). This should stabilize the contents of the file as well
+ * as its size. Note that at the end of this ioctl we will unlock
+ * writers, but at that point the verity bit will be set (if the ioctl
+ * succeeded), preventing future writers.
+ */
+ err = deny_write_access(filp);
+ if (err) /* -ETXTBSY */
+ goto out_drop_write;
+
+ /*
+ * fsync so that the verity bit can't be persisted to disk prior to the
+ * data, causing verification errors after a crash.
+ */
+ err = vfs_fsync(filp, 1);
+ if (err)
+ goto out_allow_write;
+
+ /* Serialize concurrent use of this ioctl on the same inode */
+ inode_lock(inode);
+
+ if (get_fsverity_info(inode)) { /* fs-verity already enabled? */
+ err = -EEXIST;
+ goto out_unlock;
+ }
+
+ /* Validate the verity metadata */
+ vi = create_fsverity_info(inode, true);
+ if (IS_ERR(vi)) {
+ err = PTR_ERR(vi);
+ if (err == -EINVAL) /* distinguish "invalid metadata" case */
+ err = -EBADMSG;
+ goto out_unlock;
+ }
+
+ /* Set the verity bit */
+ err = inode->i_sb->s_vop->set_verity(inode, vi->data_i_size);
+ if (err)
+ goto out_free_vi;
+
+ /* Invalidate all cached pages, forcing re-verification */
+ truncate_inode_pages(inode->i_mapping, 0);
+
+ /*
+ * Set ->i_verity_info, unless another task managed to do it already
+ * between ->set_verity() and here.
+ */
+ if (set_fsverity_info(inode, vi))
+ vi = NULL;
+ err = 0;
+out_free_vi:
+ free_fsverity_info(vi);
+out_unlock:
+ inode_unlock(inode);
+out_allow_write:
+ allow_write_access(filp);
+out_drop_write:
+ mnt_drop_write_file(filp);
+out:
+ return err;
+}
+EXPORT_SYMBOL_GPL(fsverity_ioctl_enable);
diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
index 56341f10aa965..c710b6b5fb4a6 100644
--- a/include/linux/fsverity.h
+++ b/include/linux/fsverity.h
@@ -21,6 +21,9 @@ struct fsverity_operations {
#if __FS_HAS_VERITY
+/* ioctl.c */
+extern int fsverity_ioctl_enable(struct file *filp, const void __user *arg);
+
/* setup.c */
extern int fsverity_file_open(struct inode *inode, struct file *filp);
extern int fsverity_prepare_setattr(struct dentry *dentry, struct iattr *attr);
@@ -35,6 +38,14 @@ extern void fsverity_enqueue_verify_work(struct work_struct *work);
#else /* !__FS_HAS_VERITY */
+/* ioctl.c */
+
+static inline int fsverity_ioctl_enable(struct file *filp,
+ const void __user *arg)
+{
+ return -EOPNOTSUPP;
+}
+
/* setup.c */
static inline int fsverity_file_open(struct inode *inode, struct file *filp)
--
2.18.0
From: Eric Biggers <[email protected]>
Add a function for filesystems to call to implement the
FS_IOC_MEASURE_VERITY ioctl. This ioctl retrieves the file measurement
hash that fs-verity calculated for the given file and is enforcing for
reads; i.e., reads that don't match this hash will fail.
This ioctl can be used to implement lightweight auditing or
authentication of file hashes in userspace, as an alternative to an
in-kernel policy such as an IMA policy.
Note that due to fs-verity's use of a Merkle tree, opening a file and
executing this ioctl takes constant time, regardless of the file's size.
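
The digest_size handshake in fsverity_ioctl_measure() can be modelled
in plain userspace C as follows. This is only a sketch: the model_*
names, the fixed 64-byte buffer, and the struct layout are assumptions
made for the example, not the uapi definition. It mirrors the logic in
the diff below: the caller passes its buffer capacity in digest_size,
gets -EOVERFLOW if the digest doesn't fit, and otherwise gets the
actual size written back.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define MODEL_MAX_DIGEST 64

/* Userspace stand-in for the in/out argument; layout is illustrative. */
struct model_digest {
	uint16_t digest_algorithm;
	uint16_t digest_size;	/* in: buffer capacity; out: actual size */
	uint8_t digest[MODEL_MAX_DIGEST];
};

/* Stand-in for the per-file state kept for a verity file. */
struct model_verity_info {
	uint16_t algorithm;
	uint16_t digest_size;
	uint8_t measurement[MODEL_MAX_DIGEST];
};

static int model_measure(const struct model_verity_info *vi,
			 struct model_digest *arg)
{
	uint16_t capacity;

	assert(arg != NULL);

	if (!vi)
		return -ENODATA;	/* not a verity file */
	assert(vi->digest_size <= MODEL_MAX_DIGEST);

	capacity = arg->digest_size;
	if (capacity < vi->digest_size)
		return -EOVERFLOW;	/* caller's buffer is too small */

	/* Zero everything, then fill in the algorithm, size, and digest. */
	memset(arg, 0, sizeof(*arg));
	arg->digest_algorithm = vi->algorithm;
	arg->digest_size = vi->digest_size;
	memcpy(arg->digest, vi->measurement, vi->digest_size);
	return 0;
}
```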
Signed-off-by: Eric Biggers <[email protected]>
---
fs/verity/ioctl.c | 49 ++++++++++++++++++++++++++++++++++++++++
fs/verity/setup.c | 4 +++-
include/linux/fsverity.h | 6 +++++
3 files changed, 58 insertions(+), 1 deletion(-)
diff --git a/fs/verity/ioctl.c b/fs/verity/ioctl.c
index 993f2afdcc734..3da9b0f441c13 100644
--- a/fs/verity/ioctl.c
+++ b/fs/verity/ioctl.c
@@ -119,3 +119,52 @@ int fsverity_ioctl_enable(struct file *filp, const void __user *arg)
return err;
}
EXPORT_SYMBOL_GPL(fsverity_ioctl_enable);
+
+/**
+ * fsverity_ioctl_measure - get a verity file's measurement
+ *
+ * The FS_IOC_MEASURE_VERITY ioctl retrieves the file measurement that the
+ * kernel is enforcing for reads from a verity file.
+ *
+ * No privileges are required to use this ioctl, since it is a read-only
+ * operation on a single regular file.
+ *
+ * Return: 0 on success, -errno on failure
+ */
+int fsverity_ioctl_measure(struct file *filp, void __user *_uarg)
+{
+ const struct inode *inode = file_inode(filp);
+ struct fsverity_digest __user *uarg = _uarg;
+ const struct fsverity_info *vi;
+ const struct fsverity_hash_alg *hash_alg;
+ struct fsverity_digest arg;
+
+ vi = get_fsverity_info(inode);
+ if (!vi)
+ return -ENODATA; /* not a verity file */
+ hash_alg = vi->hash_alg;
+
+ /*
+ * The user specifies the digest_size their buffer has space for; we can
+ * return the digest if it fits in the available space. We write back
+ * the actual size, which may be shorter than the user-specified size.
+ */
+
+ if (get_user(arg.digest_size, &uarg->digest_size))
+ return -EFAULT;
+ if (arg.digest_size < hash_alg->digest_size)
+ return -EOVERFLOW;
+
+ memset(&arg, 0, sizeof(arg));
+ arg.digest_algorithm = hash_alg - fsverity_hash_algs;
+ arg.digest_size = hash_alg->digest_size;
+
+ if (copy_to_user(uarg, &arg, sizeof(arg)))
+ return -EFAULT;
+
+ if (copy_to_user(uarg->digest, vi->measurement, hash_alg->digest_size))
+ return -EFAULT;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(fsverity_ioctl_measure);
diff --git a/fs/verity/setup.c b/fs/verity/setup.c
index 84cc2edeca25b..3f5cb9526dbc9 100644
--- a/fs/verity/setup.c
+++ b/fs/verity/setup.c
@@ -842,7 +842,9 @@ static int __init fsverity_module_init(void)
goto error;
err = -ENOMEM;
- fsverity_info_cachep = KMEM_CACHE(fsverity_info, SLAB_RECLAIM_ACCOUNT);
+ fsverity_info_cachep = KMEM_CACHE_USERCOPY(fsverity_info,
+ SLAB_RECLAIM_ACCOUNT,
+ measurement);
if (!fsverity_info_cachep)
goto error_free_workqueue;
diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
index c710b6b5fb4a6..9d3371dbd262f 100644
--- a/include/linux/fsverity.h
+++ b/include/linux/fsverity.h
@@ -23,6 +23,7 @@ struct fsverity_operations {
/* ioctl.c */
extern int fsverity_ioctl_enable(struct file *filp, const void __user *arg);
+extern int fsverity_ioctl_measure(struct file *filp, void __user *arg);
/* setup.c */
extern int fsverity_file_open(struct inode *inode, struct file *filp);
@@ -46,6 +47,11 @@ static inline int fsverity_ioctl_enable(struct file *filp,
return -EOPNOTSUPP;
}
+static inline int fsverity_ioctl_measure(struct file *filp, void __user *arg)
+{
+ return -EOPNOTSUPP;
+}
+
/* setup.c */
static inline int fsverity_file_open(struct inode *inode, struct file *filp)
--
2.18.0
From: Eric Biggers <[email protected]>
Add fs-verity support to f2fs. fs-verity is a filesystem feature that
provides efficient, transparent integrity verification and
authentication of read-only files. It uses a dm-verity-like mechanism
at the file level: a Merkle tree hidden past the end of the file is used
to verify any block in the file in log(filesize) time. It is
implemented mainly by helper functions in fs/verity/.
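
As a rough illustration of the log(filesize) claim, the sketch below
counts the levels of a generic Merkle tree. The helper name and the
parameters used in the note that follows (4096-byte blocks, 32-byte
digests) are assumptions for the example; the exact tree layout used
by this patchset may differ.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of hash-block levels needed to cover 'data_size' bytes when each
 * tree block of 'block_size' bytes holds block_size / digest_size hashes.
 */
static unsigned int merkle_tree_levels(uint64_t data_size,
				       uint32_t block_size,
				       uint32_t digest_size)
{
	uint64_t hashes_per_block;
	uint64_t blocks;
	unsigned int levels = 0;

	assert(digest_size > 0 && block_size / digest_size >= 2);
	hashes_per_block = block_size / digest_size;

	/* Blocks at the current level, starting with the data blocks. */
	blocks = (data_size + block_size - 1) / block_size;
	while (blocks > 1) {
		/* Each level needs one hash per block of the level below. */
		blocks = (blocks + hashes_per_block - 1) / hashes_per_block;
		levels++;
	}
	return levels;
}
```

Under these assumptions a 1 GiB file needs 3 levels, so verifying one
data block touches at most 3 hash blocks regardless of where the block
sits in the file.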
In f2fs, the main change is to the I/O path: ->readpage() and
->readpages() now verify data as it is read from verity files. Pages
that fail verification are marked PG_error and left !PG_uptodate, so
applications see an I/O error.
Hooks are also added to several other f2fs filesystem operations:
* ->open(), to deny opening verity files for writing and to set up
the fsverity_info to prepare for I/O
* ->getattr(), to set up the fsverity_info to make stat() show the
original data size of verity files
* ->setattr(), to deny truncating verity files
* update_inode(), to write out the full file size rather than the
original data size, since for verity files the in-memory ->i_size is
overridden with the original data size.
Finally, the FS_IOC_ENABLE_VERITY and FS_IOC_MEASURE_VERITY ioctls are
wired up. On f2fs, these ioctls require that the filesystem has the
'verity' feature, i.e. it was created with 'mkfs.f2fs -O verity'.
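
The read path decides per bio whether to schedule verification by
comparing the first page index against the number of pages covered by
the original data size. A small userspace model of that check is
sketched below; the model_* names and the 4096-byte page size are
assumptions for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE (1ULL << MODEL_PAGE_SHIFT)

/*
 * Returns true if the page at 'page_index' lies within the original data
 * (and so would get the verity step), false if it lies past the end of it.
 */
static bool model_page_needs_verity(uint64_t data_i_size, uint64_t page_index)
{
	uint64_t nr_data_pages;

	/* Guard the round-up below against overflow. */
	assert(data_i_size <= UINT64_MAX - MODEL_PAGE_SIZE);

	/* Round the data size up to a whole number of pages. */
	nr_data_pages = (data_i_size + MODEL_PAGE_SIZE - 1) >> MODEL_PAGE_SHIFT;
	return page_index < nr_data_pages;
}
```

For a data size of 10000 bytes this selects pages 0 through 2; page 3
and beyond hold the hidden metadata and are skipped by this check.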
Signed-off-by: Eric Biggers <[email protected]>
---
fs/f2fs/Kconfig | 20 +++++++++++++++++
fs/f2fs/data.c | 43 +++++++++++++++++++++++++++++++-----
fs/f2fs/f2fs.h | 17 ++++++++++++---
fs/f2fs/file.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++++
fs/f2fs/inode.c | 3 ++-
fs/f2fs/super.c | 22 +++++++++++++++++++
fs/f2fs/sysfs.c | 11 ++++++++++
7 files changed, 165 insertions(+), 9 deletions(-)
diff --git a/fs/f2fs/Kconfig b/fs/f2fs/Kconfig
index 9a20ef42fadde..c8396c7220f2a 100644
--- a/fs/f2fs/Kconfig
+++ b/fs/f2fs/Kconfig
@@ -81,6 +81,26 @@ config F2FS_FS_ENCRYPTION
efficient since it avoids caching the encrypted and
decrypted pages in the page cache.
+config F2FS_FS_VERITY
+ bool "F2FS Verity"
+ depends on F2FS_FS
+ select FS_VERITY
+ help
+ This option enables fs-verity for f2fs. fs-verity is the
+ dm-verity mechanism implemented at the file level. Userspace
+ can append a Merkle tree (hash tree) to a file, then enable
+ fs-verity on the file. f2fs will then transparently verify
+ any data read from the file against the Merkle tree. The file
+ is also made read-only.
+
+ This serves as an integrity check, but the availability of the
+ Merkle tree root hash also allows efficiently supporting
+ various use cases where normally the whole file would need to
+ be hashed at once, such as auditing and authenticity
+ verification (appraisal).
+
+ If unsure, say N.
+
config F2FS_IO_TRACE
bool "F2FS IO tracer"
depends on F2FS_FS
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 8f931d699287a..fc9ea831f7235 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -59,6 +59,7 @@ static bool __is_cp_guaranteed(struct page *page)
enum bio_post_read_step {
STEP_INITIAL = 0,
STEP_DECRYPT,
+ STEP_VERITY,
};
struct bio_post_read_ctx {
@@ -103,8 +104,23 @@ static void decrypt_work(struct work_struct *work)
bio_post_read_processing(ctx);
}
+static void verity_work(struct work_struct *work)
+{
+ struct bio_post_read_ctx *ctx =
+ container_of(work, struct bio_post_read_ctx, work);
+
+ fsverity_verify_bio(ctx->bio);
+
+ bio_post_read_processing(ctx);
+}
+
static void bio_post_read_processing(struct bio_post_read_ctx *ctx)
{
+ /*
+ * We use different work queues for decryption and for verity because
+ * verity may require reading metadata pages that need decryption, and
+ * we shouldn't recurse to the same workqueue.
+ */
switch (++ctx->cur_step) {
case STEP_DECRYPT:
if (ctx->enabled_steps & (1 << STEP_DECRYPT)) {
@@ -114,6 +130,14 @@ static void bio_post_read_processing(struct bio_post_read_ctx *ctx)
}
ctx->cur_step++;
/* fall-through */
+ case STEP_VERITY:
+ if (ctx->enabled_steps & (1 << STEP_VERITY)) {
+ INIT_WORK(&ctx->work, verity_work);
+ fsverity_enqueue_verify_work(&ctx->work);
+ return;
+ }
+ ctx->cur_step++;
+ /* fall-through */
default:
__read_end_io(ctx->bio);
}
@@ -534,7 +558,7 @@ void f2fs_submit_page_write(struct f2fs_io_info *fio)
}
static struct bio *f2fs_grab_read_bio(struct inode *inode, block_t blkaddr,
- unsigned nr_pages)
+ unsigned nr_pages, pgoff_t first_idx)
{
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
struct bio *bio;
@@ -550,6 +574,11 @@ static struct bio *f2fs_grab_read_bio(struct inode *inode, block_t blkaddr,
if (f2fs_encrypted_file(inode))
post_read_steps |= 1 << STEP_DECRYPT;
+#ifdef CONFIG_F2FS_FS_VERITY
+ if (inode->i_verity_info != NULL &&
+ (first_idx < ((i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT)))
+ post_read_steps |= 1 << STEP_VERITY;
+#endif
if (post_read_steps) {
ctx = mempool_alloc(bio_post_read_ctx_pool, GFP_NOFS);
if (!ctx) {
@@ -571,7 +600,7 @@ static struct bio *f2fs_grab_read_bio(struct inode *inode, block_t blkaddr,
static int f2fs_submit_page_read(struct inode *inode, struct page *page,
block_t blkaddr)
{
- struct bio *bio = f2fs_grab_read_bio(inode, blkaddr, 1);
+ struct bio *bio = f2fs_grab_read_bio(inode, blkaddr, 1, page->index);
if (IS_ERR(bio))
return PTR_ERR(bio);
@@ -1459,8 +1488,8 @@ static int f2fs_mpage_readpages(struct address_space *mapping,
block_in_file = (sector_t)page->index;
last_block = block_in_file + nr_pages;
- last_block_in_file = (i_size_read(inode) + blocksize - 1) >>
- blkbits;
+ last_block_in_file = (fsverity_full_i_size(inode) +
+ blocksize - 1) >> blkbits;
if (last_block > last_block_in_file)
last_block = last_block_in_file;
@@ -1497,6 +1526,9 @@ static int f2fs_mpage_readpages(struct address_space *mapping,
}
} else {
zero_user_segment(page, 0, PAGE_SIZE);
+ if (f2fs_verity_file(inode) &&
+ !fsverity_verify_page(page))
+ goto set_error_page;
if (!PageUptodate(page))
SetPageUptodate(page);
unlock_page(page);
@@ -1514,7 +1546,8 @@ static int f2fs_mpage_readpages(struct address_space *mapping,
bio = NULL;
}
if (bio == NULL) {
- bio = f2fs_grab_read_bio(inode, block_nr, nr_pages);
+ bio = f2fs_grab_read_bio(inode, block_nr, nr_pages,
+ page->index);
if (IS_ERR(bio)) {
bio = NULL;
goto set_error_page;
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 4d8b1de831439..e59781b13c5c8 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -29,6 +29,9 @@
#define __FS_HAS_ENCRYPTION IS_ENABLED(CONFIG_F2FS_FS_ENCRYPTION)
#include <linux/fscrypt.h>
+#define __FS_HAS_VERITY IS_ENABLED(CONFIG_F2FS_FS_VERITY)
+#include <linux/fsverity.h>
+
#ifdef CONFIG_F2FS_CHECK_FS
#define f2fs_bug_on(sbi, condition) BUG_ON(condition)
#else
@@ -146,7 +149,7 @@ struct f2fs_mount_info {
#define F2FS_FEATURE_QUOTA_INO 0x0080
#define F2FS_FEATURE_INODE_CRTIME 0x0100
#define F2FS_FEATURE_LOST_FOUND 0x0200
-#define F2FS_FEATURE_VERITY 0x0400 /* reserved */
+#define F2FS_FEATURE_VERITY 0x0400
#define F2FS_HAS_FEATURE(sb, mask) \
((F2FS_SB(sb)->raw_super->feature & cpu_to_le32(mask)) != 0)
@@ -598,7 +601,7 @@ enum {
#define FADVISE_ENC_NAME_BIT 0x08
#define FADVISE_KEEP_SIZE_BIT 0x10
#define FADVISE_HOT_BIT 0x20
-#define FADVISE_VERITY_BIT 0x40 /* reserved */
+#define FADVISE_VERITY_BIT 0x40
#define file_is_cold(inode) is_file(inode, FADVISE_COLD_BIT)
#define file_wrong_pino(inode) is_file(inode, FADVISE_LOST_PINO_BIT)
@@ -616,6 +619,8 @@ enum {
#define file_is_hot(inode) is_file(inode, FADVISE_HOT_BIT)
#define file_set_hot(inode) set_file(inode, FADVISE_HOT_BIT)
#define file_clear_hot(inode) clear_file(inode, FADVISE_HOT_BIT)
+#define file_is_verity(inode) is_file(inode, FADVISE_VERITY_BIT)
+#define file_set_verity(inode) set_file(inode, FADVISE_VERITY_BIT)
#define DEF_DIR_LEVEL 0
@@ -3294,13 +3299,18 @@ static inline void f2fs_set_encrypted_inode(struct inode *inode)
#endif
}
+static inline bool f2fs_verity_file(struct inode *inode)
+{
+ return file_is_verity(inode);
+}
+
/*
* Returns true if the reads of the inode's data need to undergo some
* postprocessing step, like decryption or authenticity verification.
*/
static inline bool f2fs_post_read_required(struct inode *inode)
{
- return f2fs_encrypted_file(inode);
+ return f2fs_encrypted_file(inode) || f2fs_verity_file(inode);
}
#define F2FS_FEATURE_FUNCS(name, flagname) \
@@ -3318,6 +3328,7 @@ F2FS_FEATURE_FUNCS(flexible_inline_xattr, FLEXIBLE_INLINE_XATTR);
F2FS_FEATURE_FUNCS(quota_ino, QUOTA_INO);
F2FS_FEATURE_FUNCS(inode_crtime, INODE_CRTIME);
F2FS_FEATURE_FUNCS(lost_found, LOST_FOUND);
+F2FS_FEATURE_FUNCS(verity, VERITY);
#ifdef CONFIG_BLK_DEV_ZONED
static inline int get_blkz_type(struct f2fs_sb_info *sbi,
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 6880c6f78d58d..ea86dd35685ff 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -486,6 +486,12 @@ static int f2fs_file_open(struct inode *inode, struct file *filp)
if (err)
return err;
+ if (f2fs_verity_file(inode)) {
+ err = fsverity_file_open(inode, filp);
+ if (err)
+ return err;
+ }
+
filp->f_mode |= FMODE_NOWAIT;
return dquot_file_open(inode, filp);
@@ -684,6 +690,22 @@ int f2fs_getattr(const struct path *path, struct kstat *stat,
struct f2fs_inode *ri;
unsigned int flags;
+ if (f2fs_verity_file(inode)) {
+ /*
+ * For fs-verity we need to override i_size with the original
+ * data i_size. This requires I/O to the file which with
+ * fscrypt requires that the key be set up. But, if the key is
+ * unavailable just continue on without the i_size override.
+ */
+ int err = fscrypt_require_key(inode);
+
+ if (!err) {
+ err = fsverity_prepare_getattr(inode);
+ if (err)
+ return err;
+ }
+ }
+
if (f2fs_has_extra_attr(inode) &&
f2fs_sb_has_inode_crtime(inode->i_sb) &&
F2FS_FITS_IN_INODE(ri, fi->i_extra_isize, i_crtime)) {
@@ -767,6 +789,12 @@ int f2fs_setattr(struct dentry *dentry, struct iattr *attr)
if (err)
return err;
+ if (f2fs_verity_file(inode)) {
+ err = fsverity_prepare_setattr(dentry, attr);
+ if (err)
+ return err;
+ }
+
if (is_quota_modification(inode, attr)) {
err = dquot_initialize(inode);
if (err)
@@ -2851,6 +2879,30 @@ static int f2fs_ioc_precache_extents(struct file *filp, unsigned long arg)
return f2fs_precache_extents(file_inode(filp));
}
+static int f2fs_ioc_enable_verity(struct file *filp, unsigned long arg)
+{
+ struct inode *inode = file_inode(filp);
+
+ f2fs_update_time(F2FS_I_SB(inode), REQ_TIME);
+
+ if (!f2fs_sb_has_verity(inode->i_sb)) {
+ f2fs_msg(inode->i_sb, KERN_WARNING,
+ "Can't enable fs-verity on inode %lu: the fs-verity feature is disabled on this filesystem.\n",
+ inode->i_ino);
+ return -EOPNOTSUPP;
+ }
+
+ return fsverity_ioctl_enable(filp, (const void __user *)arg);
+}
+
+static int f2fs_ioc_measure_verity(struct file *filp, unsigned long arg)
+{
+ if (!f2fs_sb_has_verity(file_inode(filp)->i_sb))
+ return -EOPNOTSUPP;
+
+ return fsverity_ioctl_measure(filp, (void __user *)arg);
+}
+
long f2fs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
{
if (unlikely(f2fs_cp_error(F2FS_I_SB(file_inode(filp)))))
@@ -2907,6 +2959,10 @@ long f2fs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
return f2fs_ioc_set_pin_file(filp, arg);
case F2FS_IOC_PRECACHE_EXTENTS:
return f2fs_ioc_precache_extents(filp, arg);
+ case FS_IOC_ENABLE_VERITY:
+ return f2fs_ioc_enable_verity(filp, arg);
+ case FS_IOC_MEASURE_VERITY:
+ return f2fs_ioc_measure_verity(filp, arg);
default:
return -ENOTTY;
}
@@ -3013,6 +3069,8 @@ long f2fs_compat_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
case F2FS_IOC_GET_PIN_FILE:
case F2FS_IOC_SET_PIN_FILE:
case F2FS_IOC_PRECACHE_EXTENTS:
+ case FS_IOC_ENABLE_VERITY:
+ case FS_IOC_MEASURE_VERITY:
break;
default:
return -ENOIOCTLCMD;
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
index f121c864f4c0d..e363e9f0c699e 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -407,7 +407,7 @@ void f2fs_update_inode(struct inode *inode, struct page *node_page)
ri->i_uid = cpu_to_le32(i_uid_read(inode));
ri->i_gid = cpu_to_le32(i_gid_read(inode));
ri->i_links = cpu_to_le32(inode->i_nlink);
- ri->i_size = cpu_to_le64(i_size_read(inode));
+ ri->i_size = cpu_to_le64(fsverity_full_i_size(inode));
ri->i_blocks = cpu_to_le64(SECTOR_TO_BLOCK(inode->i_blocks) + 1);
if (et) {
@@ -618,6 +618,7 @@ void f2fs_evict_inode(struct inode *inode)
}
out_clear:
fscrypt_put_encryption_info(inode);
+ fsverity_cleanup_inode(inode);
clear_inode(inode);
}
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 3995e926ba3a3..52a0de200fb79 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1943,6 +1943,25 @@ static const struct fscrypt_operations f2fs_cryptops = {
};
#endif
+#ifdef CONFIG_F2FS_FS_VERITY
+static int f2fs_set_verity(struct inode *inode, loff_t data_i_size)
+{
+ int err;
+
+ err = f2fs_convert_inline_inode(inode);
+ if (err)
+ return err;
+
+ file_set_verity(inode);
+ f2fs_mark_inode_dirty_sync(inode, true);
+ return 0;
+}
+
+static const struct fsverity_operations f2fs_verityops = {
+ .set_verity = f2fs_set_verity,
+};
+#endif /* CONFIG_F2FS_FS_VERITY */
+
static struct inode *f2fs_nfs_get_inode(struct super_block *sb,
u64 ino, u32 generation)
{
@@ -2758,6 +2777,9 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
sb->s_op = &f2fs_sops;
#ifdef CONFIG_F2FS_FS_ENCRYPTION
sb->s_cop = &f2fs_cryptops;
+#endif
+#ifdef CONFIG_F2FS_FS_VERITY
+ sb->s_vop = &f2fs_verityops;
#endif
sb->s_xattr = f2fs_xattr_handlers;
sb->s_export_op = &f2fs_export_ops;
diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
index 2e7e611deaef2..f11aa34a8be18 100644
--- a/fs/f2fs/sysfs.c
+++ b/fs/f2fs/sysfs.c
@@ -119,6 +119,9 @@ static ssize_t features_show(struct f2fs_attr *a,
if (f2fs_sb_has_lost_found(sb))
len += snprintf(buf + len, PAGE_SIZE - len, "%s%s",
len ? ", " : "", "lost_found");
+ if (f2fs_sb_has_verity(sb))
+ len += snprintf(buf + len, PAGE_SIZE - len, "%s%s",
+ len ? ", " : "", "verity");
len += snprintf(buf + len, PAGE_SIZE - len, "\n");
return len;
}
@@ -333,6 +336,7 @@ enum feat_id {
FEAT_QUOTA_INO,
FEAT_INODE_CRTIME,
FEAT_LOST_FOUND,
+ FEAT_VERITY,
};
static ssize_t f2fs_feature_show(struct f2fs_attr *a,
@@ -349,6 +353,7 @@ static ssize_t f2fs_feature_show(struct f2fs_attr *a,
case FEAT_QUOTA_INO:
case FEAT_INODE_CRTIME:
case FEAT_LOST_FOUND:
+ case FEAT_VERITY:
return snprintf(buf, PAGE_SIZE, "supported\n");
}
return 0;
@@ -429,6 +434,9 @@ F2FS_FEATURE_RO_ATTR(flexible_inline_xattr, FEAT_FLEXIBLE_INLINE_XATTR);
F2FS_FEATURE_RO_ATTR(quota_ino, FEAT_QUOTA_INO);
F2FS_FEATURE_RO_ATTR(inode_crtime, FEAT_INODE_CRTIME);
F2FS_FEATURE_RO_ATTR(lost_found, FEAT_LOST_FOUND);
+#ifdef CONFIG_F2FS_FS_VERITY
+F2FS_FEATURE_RO_ATTR(verity, FEAT_VERITY);
+#endif
#define ATTR_LIST(name) (&f2fs_attr_##name.attr)
static struct attribute *f2fs_attrs[] = {
@@ -485,6 +493,9 @@ static struct attribute *f2fs_feat_attrs[] = {
ATTR_LIST(quota_ino),
ATTR_LIST(inode_crtime),
ATTR_LIST(lost_found),
+#ifdef CONFIG_F2FS_FS_VERITY
+ ATTR_LIST(verity),
+#endif
NULL,
};
--
2.18.0
From: Theodore Ts'o <[email protected]>
Make ext4_mpage_readpages() verify data as it is read from fs-verity
files, using the helper functions from fs/verity/.
As in the corresponding f2fs patch, staying compatible with fscrypt
required refactoring the decryption workflow into a generic "post-read
processing" workflow, which can do decryption, verification, or both.
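
The step sequencing of this workflow can be modelled in userspace as
below. This is a simplified sketch, not the kernel code: the model_*
names are hypothetical, and each step runs synchronously and records
itself in a trace, whereas the real code queues each step on a
workqueue and continues from the work function.

```c
#include <assert.h>
#include <stddef.h>

enum model_step {
	MODEL_STEP_INITIAL = 0,
	MODEL_STEP_DECRYPT,
	MODEL_STEP_VERITY,
};

struct model_ctx {
	unsigned int cur_step;
	unsigned int enabled_steps;	/* bitmask of (1 << step) */
	unsigned int trace[4];		/* steps that actually ran, in order */
	size_t trace_len;
	int finished;			/* set once the read is completed */
};

static void model_post_read(struct model_ctx *ctx);

/* Stand-in for a work function: "do" the step, then move on. */
static void model_run_step(struct model_ctx *ctx, enum model_step step)
{
	assert(ctx->trace_len < 4);
	ctx->trace[ctx->trace_len++] = step;
	model_post_read(ctx);
}

static void model_post_read(struct model_ctx *ctx)
{
	switch (++ctx->cur_step) {
	case MODEL_STEP_DECRYPT:
		if (ctx->enabled_steps & (1u << MODEL_STEP_DECRYPT)) {
			model_run_step(ctx, MODEL_STEP_DECRYPT);
			return;
		}
		ctx->cur_step++;
		/* fall through */
	case MODEL_STEP_VERITY:
		if (ctx->enabled_steps & (1u << MODEL_STEP_VERITY)) {
			model_run_step(ctx, MODEL_STEP_VERITY);
			return;
		}
		ctx->cur_step++;
		/* fall through */
	default:
		ctx->finished = 1;	/* models the final end-of-read step */
	}
}
```

With both bits set the trace is decrypt then verity; with only the
verity bit set, the decrypt case falls through and verity runs alone.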
Signed-off-by: Theodore Ts'o <[email protected]>
(EB: various fixes and other changes)
Signed-off-by: Eric Biggers <[email protected]>
---
fs/ext4/ext4.h | 2 +
fs/ext4/inode.c | 3 +
fs/ext4/readpage.c | 207 ++++++++++++++++++++++++++++++++++++++-------
fs/ext4/super.c | 6 ++
4 files changed, 187 insertions(+), 31 deletions(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 335c99e781728..f8db4b8bf133c 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3077,6 +3077,8 @@ static inline void ext4_set_de_type(struct super_block *sb,
extern int ext4_mpage_readpages(struct address_space *mapping,
struct list_head *pages, struct page *page,
unsigned nr_pages);
+extern int __init ext4_init_post_read_processing(void);
+extern void ext4_destroy_post_read_processing(void);
/* symlink.c */
extern const struct inode_operations ext4_encrypted_symlink_inode_operations;
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index bb8f50230d055..cbee798d0de17 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3848,6 +3848,9 @@ static ssize_t ext4_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
return 0;
#endif
+ if (ext4_verity_inode(inode))
+ return 0;
+
/*
* If we are doing data journalling we don't support O_DIRECT
*/
diff --git a/fs/ext4/readpage.c b/fs/ext4/readpage.c
index 19b87a8de6ff3..7750e22c90e39 100644
--- a/fs/ext4/readpage.c
+++ b/fs/ext4/readpage.c
@@ -47,6 +47,11 @@
#include "ext4.h"
+#define NUM_PREALLOC_POST_READ_CTXS 128
+
+static struct kmem_cache *bio_post_read_ctx_cache;
+static mempool_t *bio_post_read_ctx_pool;
+
static inline bool ext4_bio_encrypted(struct bio *bio)
{
#ifdef CONFIG_EXT4_FS_ENCRYPTION
@@ -56,6 +61,124 @@ static inline bool ext4_bio_encrypted(struct bio *bio)
#endif
}
+/* postprocessing steps for read bios */
+enum bio_post_read_step {
+ STEP_INITIAL = 0,
+ STEP_DECRYPT,
+ STEP_VERITY,
+};
+
+struct bio_post_read_ctx {
+ struct bio *bio;
+ struct work_struct work;
+ unsigned int cur_step;
+ unsigned int enabled_steps;
+};
+
+static void __read_end_io(struct bio *bio)
+{
+ struct page *page;
+ struct bio_vec *bv;
+ int i;
+
+ bio_for_each_segment_all(bv, bio, i) {
+ page = bv->bv_page;
+
+ /* PG_error was set if any post_read step failed */
+ if (bio->bi_status || PageError(page)) {
+ ClearPageUptodate(page);
+ SetPageError(page);
+ } else {
+ SetPageUptodate(page);
+ }
+ unlock_page(page);
+ }
+ if (bio->bi_private)
+ mempool_free(bio->bi_private, bio_post_read_ctx_pool);
+ bio_put(bio);
+}
+
+static void bio_post_read_processing(struct bio_post_read_ctx *ctx);
+
+static void decrypt_work(struct work_struct *work)
+{
+ struct bio_post_read_ctx *ctx =
+ container_of(work, struct bio_post_read_ctx, work);
+
+ fscrypt_decrypt_bio(ctx->bio);
+
+ bio_post_read_processing(ctx);
+}
+
+static void verity_work(struct work_struct *work)
+{
+ struct bio_post_read_ctx *ctx =
+ container_of(work, struct bio_post_read_ctx, work);
+
+ fsverity_verify_bio(ctx->bio);
+
+ bio_post_read_processing(ctx);
+}
+
+static void bio_post_read_processing(struct bio_post_read_ctx *ctx)
+{
+ /*
+ * We use different work queues for decryption and for verity because
+ * verity may require reading metadata pages that need decryption, and
+ * we shouldn't recurse to the same workqueue.
+ */
+ switch (++ctx->cur_step) {
+ case STEP_DECRYPT:
+ if (ctx->enabled_steps & (1 << STEP_DECRYPT)) {
+ INIT_WORK(&ctx->work, decrypt_work);
+ fscrypt_enqueue_decrypt_work(&ctx->work);
+ return;
+ }
+ ctx->cur_step++;
+ /* fall-through */
+ case STEP_VERITY:
+ if (ctx->enabled_steps & (1 << STEP_VERITY)) {
+ INIT_WORK(&ctx->work, verity_work);
+ fsverity_enqueue_verify_work(&ctx->work);
+ return;
+ }
+ ctx->cur_step++;
+ /* fall-through */
+ default:
+ __read_end_io(ctx->bio);
+ }
+}
+
+static struct bio_post_read_ctx *get_bio_post_read_ctx(struct inode *inode,
+ struct bio *bio,
+ pgoff_t index)
+{
+ unsigned int post_read_steps = 0;
+ struct bio_post_read_ctx *ctx = NULL;
+
+ if (ext4_encrypted_inode(inode) && S_ISREG(inode->i_mode))
+ post_read_steps |= 1 << STEP_DECRYPT;
+#ifdef CONFIG_EXT4_FS_VERITY
+ if (inode->i_verity_info != NULL &&
+ (index < ((i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT)))
+ post_read_steps |= 1 << STEP_VERITY;
+#endif
+ if (post_read_steps) {
+ ctx = mempool_alloc(bio_post_read_ctx_pool, GFP_NOFS);
+ if (!ctx)
+ return ERR_PTR(-ENOMEM);
+ ctx->bio = bio;
+ ctx->enabled_steps = post_read_steps;
+ bio->bi_private = ctx;
+ }
+ return ctx;
+}
+
+static bool bio_post_read_required(struct bio *bio)
+{
+ return bio->bi_private && !bio->bi_status;
+}
+
/*
* I/O completion handler for multipage BIOs.
*
@@ -70,30 +193,31 @@ static inline bool ext4_bio_encrypted(struct bio *bio)
*/
static void mpage_end_io(struct bio *bio)
{
- struct bio_vec *bv;
- int i;
+ if (bio_post_read_required(bio)) {
+ struct bio_post_read_ctx *ctx = bio->bi_private;
- if (ext4_bio_encrypted(bio)) {
- if (bio->bi_status) {
- fscrypt_release_ctx(bio->bi_private);
- } else {
- fscrypt_enqueue_decrypt_bio(bio->bi_private, bio);
- return;
- }
+ ctx->cur_step = STEP_INITIAL;
+ bio_post_read_processing(ctx);
+ return;
}
- bio_for_each_segment_all(bv, bio, i) {
- struct page *page = bv->bv_page;
+ __read_end_io(bio);
+}
- if (!bio->bi_status) {
- SetPageUptodate(page);
- } else {
- ClearPageUptodate(page);
- SetPageError(page);
- }
- unlock_page(page);
+static inline loff_t ext4_readpage_limit(struct inode *inode)
+{
+#ifdef CONFIG_EXT4_FS_VERITY
+ if (ext4_verity_inode(inode)) {
+ if (inode->i_verity_info)
+ /* limit to end of metadata region */
+ return fsverity_full_i_size(inode);
+ /*
+ * fsverity_info is currently being set up and no user reads are
+ * allowed yet. It's easiest to just not enforce a limit yet.
+ */
+ return inode->i_sb->s_maxbytes;
}
-
- bio_put(bio);
+#endif
+ return i_size_read(inode);
}
int ext4_mpage_readpages(struct address_space *mapping,
@@ -140,7 +264,8 @@ int ext4_mpage_readpages(struct address_space *mapping,
block_in_file = (sector_t)page->index << (PAGE_SHIFT - blkbits);
last_block = block_in_file + nr_pages * blocks_per_page;
- last_block_in_file = (i_size_read(inode) + blocksize - 1) >> blkbits;
+ last_block_in_file = (ext4_readpage_limit(inode) +
+ blocksize - 1) >> blkbits;
if (last_block > last_block_in_file)
last_block = last_block_in_file;
page_block = 0;
@@ -240,19 +365,15 @@ int ext4_mpage_readpages(struct address_space *mapping,
bio = NULL;
}
if (bio == NULL) {
- struct fscrypt_ctx *ctx = NULL;
+ struct bio_post_read_ctx *ctx;
- if (ext4_encrypted_inode(inode) &&
- S_ISREG(inode->i_mode)) {
- ctx = fscrypt_get_ctx(inode, GFP_NOFS);
- if (IS_ERR(ctx))
- goto set_error_page;
- }
bio = bio_alloc(GFP_KERNEL,
min_t(int, nr_pages, BIO_MAX_PAGES));
- if (!bio) {
- if (ctx)
- fscrypt_release_ctx(ctx);
+ if (!bio)
+ goto set_error_page;
+ ctx = get_bio_post_read_ctx(inode, bio, page->index);
+ if (IS_ERR(ctx)) {
+ bio_put(bio);
goto set_error_page;
}
bio_set_dev(bio, bdev);
@@ -292,3 +413,27 @@ int ext4_mpage_readpages(struct address_space *mapping,
submit_bio(bio);
return 0;
}
+
+int __init ext4_init_post_read_processing(void)
+{
+ bio_post_read_ctx_cache = KMEM_CACHE(bio_post_read_ctx, 0);
+ if (!bio_post_read_ctx_cache)
+ goto fail;
+ bio_post_read_ctx_pool =
+ mempool_create_slab_pool(NUM_PREALLOC_POST_READ_CTXS,
+ bio_post_read_ctx_cache);
+ if (!bio_post_read_ctx_pool)
+ goto fail_free_cache;
+ return 0;
+
+fail_free_cache:
+ kmem_cache_destroy(bio_post_read_ctx_cache);
+fail:
+ return -ENOMEM;
+}
+
+void ext4_destroy_post_read_processing(void)
+{
+ mempool_destroy(bio_post_read_ctx_pool);
+ kmem_cache_destroy(bio_post_read_ctx_cache);
+}
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index c2f372c634ccb..0a17f7c6f630a 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -6006,6 +6006,10 @@ static int __init ext4_init_fs(void)
if (err)
return err;
+ err = ext4_init_post_read_processing();
+ if (err)
+ goto out6;
+
err = ext4_init_pageio();
if (err)
goto out5;
@@ -6044,6 +6048,8 @@ static int __init ext4_init_fs(void)
out4:
ext4_exit_pageio();
out5:
+ ext4_destroy_post_read_processing();
+out6:
ext4_exit_es();
return err;
--
2.18.0
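A note on the last_block_in_file hunk in the patch above: the expression rounds a byte limit up to a whole number of blocks, so a partially filled final block is still counted. The following is only a standalone sketch of that arithmetic; the helper name blocks_spanned() is hypothetical and plain uint64_t stands in for the kernel's sector_t.

```c
#include <stdint.h>

/*
 * Round a byte limit up to a whole number of blocks of size (1 << blkbits).
 * Hypothetical userspace helper; it mirrors the shape of
 *   (ext4_readpage_limit(inode) + blocksize - 1) >> blkbits
 * from the hunk above, nothing more.
 */
uint64_t blocks_spanned(uint64_t limit_bytes, unsigned int blkbits)
{
	uint64_t blocksize = (uint64_t)1 << blkbits;

	return (limit_bytes + blocksize - 1) >> blkbits;
}
```

With 4096-byte blocks (blkbits = 12), a limit of 4096 bytes spans one block and a limit of 4097 bytes spans two.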
From: Eric Biggers <[email protected]>
Add optional support for:
- At module initialization time, creating an ".fs-verity" keyring to
which trusted X.509 certificates can be added via sys_add_key().
- Parsing a signed file measurement from the fs-verity metadata (as a
PKCS7_SIGNATURE unauthenticated extension item), and verifying it
against the certificates in the ".fs-verity" keyring.
- Registering a sysctl fs.verity.require_signatures. This can be set to
enforce that all fs-verity files have a valid signature.
This is meant as a relatively simple mechanism that can be used to
provide an authenticity guarantee for fs-verity files, as an alternative
to IMA-appraisal. Userspace programs still need to check that the
fs-verity bit is set in order to get an authenticity guarantee.
Signed-off-by: Eric Biggers <[email protected]>
---
fs/verity/Kconfig | 17 ++++
fs/verity/Makefile | 2 +
fs/verity/fsverity_private.h | 34 +++++++
fs/verity/setup.c | 63 +++++++++++-
fs/verity/signature.c | 187 ++++++++++++++++++++++++++++++++++
include/uapi/linux/fsverity.h | 10 ++
6 files changed, 311 insertions(+), 2 deletions(-)
create mode 100644 fs/verity/signature.c
diff --git a/fs/verity/Kconfig b/fs/verity/Kconfig
index 308d733a9401b..485488021ac16 100644
--- a/fs/verity/Kconfig
+++ b/fs/verity/Kconfig
@@ -34,3 +34,20 @@ config FS_VERITY_DEBUG
Enable debugging messages related to fs-verity by default.
Say N unless you are an fs-verity developer.
+
+config FS_VERITY_BUILTIN_SIGNATURES
+ bool "FS Verity builtin signature support"
+ depends on FS_VERITY
+ select SYSTEM_DATA_VERIFICATION
+ help
+ Support verifying signatures of verity files against the X.509
+ certificates that have been loaded into the ".fs-verity"
+ kernel keyring.
+
+ This is meant as a relatively simple mechanism that can be
+ used to provide an authenticity guarantee for verity files, as
+ an alternative to IMA appraisal. Userspace programs still
+ need to check that the verity bit is set in order to get an
+ authenticity guarantee.
+
+ If unsure, say N.
diff --git a/fs/verity/Makefile b/fs/verity/Makefile
index 6450925e3a8b7..d293ea2a1b393 100644
--- a/fs/verity/Makefile
+++ b/fs/verity/Makefile
@@ -1,3 +1,5 @@
obj-$(CONFIG_FS_VERITY) += fsverity.o
fsverity-y := hash_algs.o ioctl.o setup.o verify.o
+
+fsverity-$(CONFIG_FS_VERITY_BUILTIN_SIGNATURES) += signature.o
diff --git a/fs/verity/fsverity_private.h b/fs/verity/fsverity_private.h
index 1046b87b12dee..73a3f04776fce 100644
--- a/fs/verity/fsverity_private.h
+++ b/fs/verity/fsverity_private.h
@@ -63,6 +63,7 @@ struct fsverity_info {
u8 root_hash[FS_VERITY_MAX_DIGEST_SIZE]; /* Merkle tree root hash */
u8 measurement[FS_VERITY_MAX_DIGEST_SIZE]; /* file measurement */
bool have_root_hash; /* have root hash from disk? */
+ bool have_signed_measurement; /* have measurement from signature? */
/* Starting blocks for each tree level. 'depth-1' is the root level. */
u64 hash_lvl_region_idx[FS_VERITY_MAX_LEVELS];
@@ -96,6 +97,39 @@ static inline bool set_fsverity_info(struct inode *inode,
return true;
}
+/* signature.c */
+#ifdef CONFIG_FS_VERITY_BUILTIN_SIGNATURES
+extern int fsverity_require_signatures;
+
+int fsverity_parse_pkcs7_signature_extension(struct fsverity_info *vi,
+ const void *raw_pkcs7,
+ size_t size);
+
+int __init fsverity_signature_init(void);
+
+void __exit fsverity_signature_exit(void);
+#else /* CONFIG_FS_VERITY_BUILTIN_SIGNATURES */
+
+#define fsverity_require_signatures 0
+
+static inline int
+fsverity_parse_pkcs7_signature_extension(struct fsverity_info *vi,
+ const void *raw_pkcs7, size_t size)
+{
+ pr_warn("PKCS#7 signatures not supported in this kernel build!\n");
+ return -EINVAL;
+}
+
+static inline int fsverity_signature_init(void)
+{
+ return 0;
+}
+
+static inline void fsverity_signature_exit(void)
+{
+}
+#endif /* !CONFIG_FS_VERITY_BUILTIN_SIGNATURES */
+
/* verify.c */
extern struct workqueue_struct *fsverity_read_workqueue;
diff --git a/fs/verity/setup.c b/fs/verity/setup.c
index 3f5cb9526dbc9..6a11cdcbd01d4 100644
--- a/fs/verity/setup.c
+++ b/fs/verity/setup.c
@@ -132,6 +132,10 @@ static const struct extension_type {
[FS_VERITY_EXT_SALT] = {
.parse = parse_salt_extension,
},
+ [FS_VERITY_EXT_PKCS7_SIGNATURE] = {
+ .parse = fsverity_parse_pkcs7_signature_extension,
+ .unauthenticated = true,
+ },
};
static int do_parse_extensions(struct fsverity_info *vi,
@@ -449,6 +453,54 @@ static int compute_measurement(const struct fsverity_info *vi,
return err;
}
+/*
+ * Compute the file's measurement; then, if a signature was present, verify that
+ * the signed measurement matches the actual one.
+ */
+static int
+verify_file_measurement(struct fsverity_info *vi,
+ const struct fsverity_descriptor *desc,
+ int desc_auth_len,
+ struct page *desc_pages[MAX_DESCRIPTOR_PAGES],
+ int nr_desc_pages)
+{
+ u8 measurement[FS_VERITY_MAX_DIGEST_SIZE];
+ int err;
+
+ err = compute_measurement(vi, desc, desc_auth_len, desc_pages,
+ nr_desc_pages, measurement);
+ if (err) {
+ pr_warn("Error computing fs-verity measurement: %d\n", err);
+ return err;
+ }
+
+ if (!vi->have_signed_measurement) {
+ pr_debug("Computed measurement: %s:%*phN (used desc_auth_len %d)\n",
+ vi->hash_alg->name, vi->hash_alg->digest_size,
+ measurement, desc_auth_len);
+ if (fsverity_require_signatures) {
+ pr_warn("require_signatures=1, rejecting unsigned file!\n");
+ return -EBADMSG;
+ }
+ memcpy(vi->measurement, measurement, vi->hash_alg->digest_size);
+ return 0;
+ }
+
+ if (!memcmp(measurement, vi->measurement, vi->hash_alg->digest_size)) {
+ pr_debug("Verified measurement: %s:%*phN (used desc_auth_len %d)\n",
+ vi->hash_alg->name, vi->hash_alg->digest_size,
+ measurement, desc_auth_len);
+ return 0;
+ }
+
+ pr_warn("FILE CORRUPTED (actual measurement mismatches signed measurement): "
+ "want %s:%*phN, real %s:%*phN (used desc_auth_len %d)\n",
+ vi->hash_alg->name, vi->hash_alg->digest_size, vi->measurement,
+ vi->hash_alg->name, vi->hash_alg->digest_size, measurement,
+ desc_auth_len);
+ return -EBADMSG;
+}
+
static struct fsverity_info *alloc_fsverity_info(void)
{
return kmem_cache_zalloc(fsverity_info_cachep, GFP_NOFS);
@@ -693,8 +745,8 @@ struct fsverity_info *create_fsverity_info(struct inode *inode, bool enabling)
err = compute_tree_depth_and_offsets(vi);
if (err)
goto out;
- err = compute_measurement(vi, desc, desc_auth_len, desc_pages,
- nr_desc_pages, vi->measurement);
+ err = verify_file_measurement(vi, desc, desc_auth_len,
+ desc_pages, nr_desc_pages);
out:
if (desc)
unmap_fsverity_descriptor(desc, desc_pages, nr_desc_pages);
@@ -848,11 +900,17 @@ static int __init fsverity_module_init(void)
if (!fsverity_info_cachep)
goto error_free_workqueue;
+ err = fsverity_signature_init();
+ if (err)
+ goto error_free_info_cache;
+
fsverity_check_hash_algs();
pr_debug("Initialized fs-verity\n");
return 0;
+error_free_info_cache:
+ kmem_cache_destroy(fsverity_info_cachep);
error_free_workqueue:
destroy_workqueue(fsverity_read_workqueue);
error:
@@ -863,6 +921,7 @@ static void __exit fsverity_module_exit(void)
{
destroy_workqueue(fsverity_read_workqueue);
kmem_cache_destroy(fsverity_info_cachep);
+ fsverity_signature_exit();
fsverity_exit_hash_algs();
}
diff --git a/fs/verity/signature.c b/fs/verity/signature.c
new file mode 100644
index 0000000000000..bb8407e7914c8
--- /dev/null
+++ b/fs/verity/signature.c
@@ -0,0 +1,187 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * fs/verity/signature.c: verification of builtin signatures
+ *
+ * Copyright (C) 2018 Google LLC
+ *
+ * Written by Eric Biggers.
+ */
+
+#include "fsverity_private.h"
+
+#include <linux/cred.h>
+#include <linux/key.h>
+#include <linux/verification.h>
+
+/*
+ * /proc/sys/fs/verity/require_signatures
+ * If 1, all verity files must have a valid builtin signature.
+ */
+int fsverity_require_signatures;
+
+/*
+ * Keyring that contains the trusted X.509 certificates.
+ *
+ * Only root (kuid=0) can modify this. Also, root may use
+ * keyctl_restrict_keyring() to prevent any more additions.
+ */
+static struct key *fsverity_keyring;
+
+static int extract_measurement(void *ctx, const void *data, size_t len,
+ size_t asn1hdrlen)
+{
+ struct fsverity_info *vi = ctx;
+ const struct fsverity_digest_disk *d;
+ const struct fsverity_hash_alg *hash_alg;
+
+ if (len < sizeof(*d)) {
+ pr_warn("Signed file measurement has unrecognized format\n");
+ return -EBADMSG;
+ }
+ d = (const void *)data;
+
+ hash_alg = fsverity_get_hash_alg(le16_to_cpu(d->digest_algorithm));
+ if (IS_ERR(hash_alg))
+ return PTR_ERR(hash_alg);
+
+ if (le16_to_cpu(d->digest_size) != hash_alg->digest_size) {
+ pr_warn("Wrong digest_size in signed measurement: wanted %u for algorithm %s, but got %u\n",
+ hash_alg->digest_size, hash_alg->name,
+ le16_to_cpu(d->digest_size));
+ return -EBADMSG;
+ }
+
+ if (len < sizeof(*d) + hash_alg->digest_size) {
+ pr_warn("Signed file measurement is truncated\n");
+ return -EBADMSG;
+ }
+
+ if (hash_alg != vi->hash_alg) {
+ pr_warn("Signed file measurement uses %s, but file uses %s\n",
+ hash_alg->name, vi->hash_alg->name);
+ return -EBADMSG;
+ }
+
+ memcpy(vi->measurement, d->digest, hash_alg->digest_size);
+ vi->have_signed_measurement = true;
+ return 0;
+}
+
+/**
+ * fsverity_parse_pkcs7_signature_extension - verify the signed file measurement
+ *
+ * Verify a signed fsverity_measurement against the certificates in the
+ * fs-verity keyring. The signature is given as a PKCS#7 formatted message, and
+ * the signed data is included in the message (not detached).
+ *
+ * Return: 0 if the signature checks out and the signed measurement is
+ * well-formed and uses the expected hash algorithm; -EBADMSG on signature
+ * verification failure or malformed data; else another -errno code.
+ */
+int fsverity_parse_pkcs7_signature_extension(struct fsverity_info *vi,
+ const void *raw_pkcs7, size_t size)
+{
+ int err;
+
+ if (vi->have_signed_measurement) {
+ pr_warn("Found multiple PKCS#7 signatures\n");
+ return -EBADMSG;
+ }
+
+ if (!vi->hash_alg->cryptographic) {
+ /* Might as well check this... */
+ pr_warn("Found signed %s file measurement, but %s isn't a cryptographic hash algorithm.\n",
+ vi->hash_alg->name, vi->hash_alg->name);
+ return -EBADMSG;
+ }
+
+ err = verify_pkcs7_signature(NULL, 0, raw_pkcs7, size, fsverity_keyring,
+ VERIFYING_UNSPECIFIED_SIGNATURE,
+ extract_measurement, vi);
+ if (err)
+ pr_warn("PKCS#7 signature verification error: %d\n", err);
+
+ return err;
+}
+
+#ifdef CONFIG_SYSCTL
+static int zero;
+static int one = 1;
+static struct ctl_table_header *fsverity_sysctl_header;
+
+static struct ctl_path fsverity_sysctl_path[] = {
+ { .procname = "fs", },
+ { .procname = "verity", },
+ { }
+};
+
+static struct ctl_table fsverity_sysctl_table[] = {
+ {
+ .procname = "require_signatures",
+ .data = &fsverity_require_signatures,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = &zero,
+ .extra2 = &one,
+ },
+ { }
+};
+
+static int __init fsverity_sysctl_init(void)
+{
+ fsverity_sysctl_header = register_sysctl_paths(fsverity_sysctl_path,
+ fsverity_sysctl_table);
+ if (!fsverity_sysctl_header) {
+ pr_warn("sysctl registration failed!\n");
+ return -ENOMEM;
+ }
+ return 0;
+}
+
+static void __exit fsverity_sysctl_exit(void)
+{
+ unregister_sysctl_table(fsverity_sysctl_header);
+}
+#else /* CONFIG_SYSCTL */
+static inline int fsverity_sysctl_init(void)
+{
+ return 0;
+}
+
+static inline void fsverity_sysctl_exit(void)
+{
+}
+#endif /* !CONFIG_SYSCTL */
+
+int __init fsverity_signature_init(void)
+{
+ struct key *ring;
+ int err;
+
+ ring = keyring_alloc(".fs-verity", KUIDT_INIT(0), KGIDT_INIT(0),
+ current_cred(),
+ ((KEY_POS_ALL & ~KEY_POS_SETATTR) |
+ KEY_USR_VIEW | KEY_USR_READ |
+ KEY_USR_WRITE | KEY_USR_SEARCH | KEY_USR_SETATTR),
+ KEY_ALLOC_NOT_IN_QUOTA, NULL, NULL);
+ if (IS_ERR(ring))
+ return PTR_ERR(ring);
+
+ err = fsverity_sysctl_init();
+ if (err)
+ goto error_put_ring;
+
+ fsverity_keyring = ring;
+ return 0;
+
+error_put_ring:
+ key_put(ring);
+ return err;
+}
+
+void __exit fsverity_signature_exit(void)
+{
+ key_put(fsverity_keyring);
+ fsverity_sysctl_exit();
+}
diff --git a/include/uapi/linux/fsverity.h b/include/uapi/linux/fsverity.h
index b1afd205bbf87..3d97181f50a77 100644
--- a/include/uapi/linux/fsverity.h
+++ b/include/uapi/linux/fsverity.h
@@ -56,6 +56,7 @@ struct fsverity_descriptor {
/* Extension types */
#define FS_VERITY_EXT_ROOT_HASH 1
#define FS_VERITY_EXT_SALT 2
+#define FS_VERITY_EXT_PKCS7_SIGNATURE 3
/* Header of each extension (variable-length metadata item) */
struct fsverity_extension {
@@ -78,6 +79,15 @@ struct fsverity_extension {
/* FS_VERITY_EXT_SALT payload is just a byte array, any size */
+/*
+ * FS_VERITY_EXT_PKCS7_SIGNATURE payload is a DER-encoded PKCS#7 message
+ * containing the signed file measurement in the following format:
+ */
+struct fsverity_digest_disk {
+ __le16 digest_algorithm;
+ __le16 digest_size;
+ __u8 digest[];
+};
/* Fields stored at the very end of the file */
struct fsverity_footer {
--
2.18.0
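For readers following extract_measurement() in the patch above, the order of its checks can be modeled outside the kernel. This is a rough userspace sketch, not the kernel code: parse_signed_measurement(), get_le16() and digest_size_for() are hypothetical names, the 4-byte header length is taken from the two __le16 fields of struct fsverity_digest_disk, and the SHA-256 digest size of 32 is an assumption since only the SHA-512 entry is visible in this part of the series.

```c
#include <stddef.h>
#include <stdint.h>

/* Algorithm numbers as defined in the UAPI header of this series. */
#define ALG_SHA256 1
#define ALG_SHA512 2

/* Read a little-endian 16-bit value regardless of host byte order. */
static uint16_t get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/* Digest size for a known algorithm number, or 0 if unknown (assumed table). */
static size_t digest_size_for(uint16_t alg)
{
	switch (alg) {
	case ALG_SHA256:
		return 32;
	case ALG_SHA512:
		return 64;
	default:
		return 0;
	}
}

/*
 * Validate a buffer laid out like struct fsverity_digest_disk:
 * le16 algorithm, le16 digest size, then the digest bytes.
 * Returns a pointer to the digest on success, NULL on any failed check.
 */
const uint8_t *parse_signed_measurement(const uint8_t *buf, size_t len,
					uint16_t expected_alg,
					size_t *digest_len)
{
	size_t want;

	if (len < 4)
		return NULL;		/* header truncated */
	want = digest_size_for(get_le16(buf));
	if (want == 0)
		return NULL;		/* unknown algorithm */
	if ((size_t)get_le16(buf + 2) != want)
		return NULL;		/* digest_size does not match algorithm */
	if (len < 4 + want)
		return NULL;		/* digest truncated */
	if (get_le16(buf) != expected_alg)
		return NULL;		/* signed with a different algorithm */
	*digest_len = want;
	return buf + 4;
}
```

The sketch collapses every failure into NULL, whereas the patch returns -EBADMSG or the error from the algorithm lookup and logs a reason for each case.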
From: Eric Biggers <[email protected]>
Add CRC-32C support to fs-verity, to provide a faster alternative to
SHA-256 for users who want integrity-only (not authenticity), i.e. who
want to detect only accidental corruption, not malicious changes.
CRC-32C is chosen over CRC-32 because the CRC-32C polynomial is believed
to provide slightly better error-detection properties; and CRC-32C is
just as fast (or can be just as fast) as CRC-32, or even faster e.g. on
some x86 processors that have a CRC-32C instruction but not CRC-32.
We use "crc32c" from the crypto API, so the polynomial convention is
bitwise little-endian, the digest is bytewise little-endian, and the CRC
bits are inverted at the beginning and end (which is desirable).
Signed-off-by: Eric Biggers <[email protected]>
---
fs/verity/hash_algs.c | 4 ++++
include/uapi/linux/fsverity.h | 1 +
2 files changed, 5 insertions(+)
diff --git a/fs/verity/hash_algs.c b/fs/verity/hash_algs.c
index e16d767070fec..3fd4bba7c4aa6 100644
--- a/fs/verity/hash_algs.c
+++ b/fs/verity/hash_algs.c
@@ -23,6 +23,10 @@ struct fsverity_hash_alg fsverity_hash_algs[] = {
.digest_size = 64,
.cryptographic = true,
},
+ [FS_VERITY_ALG_CRC32C] = {
+ .name = "crc32c",
+ .digest_size = 4,
+ },
};
/*
diff --git a/include/uapi/linux/fsverity.h b/include/uapi/linux/fsverity.h
index 64846763f7aef..b1afd205bbf87 100644
--- a/include/uapi/linux/fsverity.h
+++ b/include/uapi/linux/fsverity.h
@@ -29,6 +29,7 @@ struct fsverity_digest {
/* Supported hash algorithms */
#define FS_VERITY_ALG_SHA256 1
#define FS_VERITY_ALG_SHA512 2
+#define FS_VERITY_ALG_CRC32C 3 /* for integrity only */
/* Metadata stored near the end of verity files, after the Merkle tree */
/* This structure is 64 bytes long */
--
2.18.0
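To make the convention described in the commit message above concrete, here is a slow bit-at-a-time sketch of CRC-32C with the same properties: reflected (bitwise little-endian) polynomial, bits inverted at the start and end, and the digest stored least significant byte first. The function names are hypothetical, and this is an illustration only; the patch itself relies on the kernel crypto API's "crc32c", not on code like this.

```c
#include <stddef.h>
#include <stdint.h>

/* Reflected form of the CRC-32C (Castagnoli) polynomial. */
#define CRC32C_POLY_REFLECTED 0x82F63B78u

/* Bit-at-a-time CRC-32C: initial value all ones, final value inverted. */
uint32_t crc32c_bitwise(const void *data, size_t len)
{
	const uint8_t *p = data;
	uint32_t crc = 0xFFFFFFFFu;
	int bit;

	while (len--) {
		crc ^= *p++;
		for (bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^
			      (CRC32C_POLY_REFLECTED & (0u - (crc & 1u)));
	}
	return ~crc;
}

/* Store the CRC as a 4-byte digest, least significant byte first. */
void crc32c_digest(const void *data, size_t len, uint8_t out[4])
{
	uint32_t crc = crc32c_bitwise(data, len);

	out[0] = (uint8_t)(crc & 0xFF);
	out[1] = (uint8_t)((crc >> 8) & 0xFF);
	out[2] = (uint8_t)((crc >> 16) & 0xFF);
	out[3] = (uint8_t)((crc >> 24) & 0xFF);
}
```

The standard check input "123456789" gives 0xE3069283, so its 4-byte digest is 83 92 06 e3.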
From: Theodore Ts'o <[email protected]>
Add basic fs-verity support to ext4. fs-verity is a filesystem feature
that provides efficient, transparent integrity verification and
authentication of read-only files. It uses a dm-verity like mechanism
at the file level: a Merkle tree hidden past the end of the file is used
to verify any block in the file in log(filesize) time. It is
implemented mainly by helper functions in fs/verity/.
This patch adds everything except the data verification hooks that will
be needed in ->readpages().
On ext4, enabling fs-verity on a file requires that the filesystem has
the 'verity' feature, e.g. that it was formatted with
'mkfs.ext4 -O verity' or had 'tune2fs -O verity' run on it.
This requires e2fsprogs 1.44.4-2 or later.
Signed-off-by: Theodore Ts'o <[email protected]>
(EB: lots of changes, including adding the verity feature flag and
storing the data i_size on disk to make it an RO_COMPAT feature)
Signed-off-by: Eric Biggers <[email protected]>
---
fs/ext4/Kconfig | 20 ++++++++++++
fs/ext4/ext4.h | 20 +++++++++++-
fs/ext4/file.c | 6 ++++
fs/ext4/inode.c | 8 +++++
fs/ext4/ioctl.c | 12 ++++++++
fs/ext4/super.c | 81 +++++++++++++++++++++++++++++++++++++++++++++++++
fs/ext4/sysfs.c | 6 ++++
7 files changed, 152 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/Kconfig b/fs/ext4/Kconfig
index a453cc87082b5..5a76125ac0f8a 100644
--- a/fs/ext4/Kconfig
+++ b/fs/ext4/Kconfig
@@ -111,6 +111,26 @@ config EXT4_FS_ENCRYPTION
default y
depends on EXT4_ENCRYPTION
+config EXT4_FS_VERITY
+ bool "Ext4 Verity"
+ depends on EXT4_FS
+ select FS_VERITY
+ help
+ This option enables fs-verity for ext4. fs-verity is the
+ dm-verity mechanism implemented at the file level. Userspace
+ can append a Merkle tree (hash tree) to a file, then enable
+ fs-verity on the file. ext4 will then transparently verify
+ any data read from the file against the Merkle tree. The file
+ is also made read-only.
+
+ This serves as an integrity check, but the availability of the
+ Merkle tree root hash also allows efficiently supporting
+ various use cases where normally the whole file would need to
+ be hashed at once, such as auditing and authenticity
+ verification (appraisal).
+
+ If unsure, say N.
+
config EXT4_DEBUG
bool "EXT4 debugging support"
depends on EXT4_FS
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 7c7123f265c25..335c99e781728 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -43,6 +43,9 @@
#define __FS_HAS_ENCRYPTION IS_ENABLED(CONFIG_EXT4_FS_ENCRYPTION)
#include <linux/fscrypt.h>
+#define __FS_HAS_VERITY IS_ENABLED(CONFIG_EXT4_FS_VERITY)
+#include <linux/fsverity.h>
+
/*
* The fourth extended filesystem constants/structures
*/
@@ -394,6 +397,7 @@ struct flex_groups {
#define EXT4_TOPDIR_FL 0x00020000 /* Top of directory hierarchies*/
#define EXT4_HUGE_FILE_FL 0x00040000 /* Set to each huge file */
#define EXT4_EXTENTS_FL 0x00080000 /* Inode uses extents */
+#define EXT4_VERITY_FL 0x00100000 /* Verity protected inode */
#define EXT4_EA_INODE_FL 0x00200000 /* Inode used for large EA */
#define EXT4_EOFBLOCKS_FL 0x00400000 /* Blocks allocated beyond EOF */
#define EXT4_INLINE_DATA_FL 0x10000000 /* Inode has inline data. */
@@ -461,6 +465,7 @@ enum {
EXT4_INODE_TOPDIR = 17, /* Top of directory hierarchies*/
EXT4_INODE_HUGE_FILE = 18, /* Set to each huge file */
EXT4_INODE_EXTENTS = 19, /* Inode uses extents */
+ EXT4_INODE_VERITY = 20, /* Verity protected inode */
EXT4_INODE_EA_INODE = 21, /* Inode used for large EA */
EXT4_INODE_EOFBLOCKS = 22, /* Blocks allocated beyond EOF */
EXT4_INODE_INLINE_DATA = 28, /* Data in inode. */
@@ -506,6 +511,7 @@ static inline void ext4_check_flag_values(void)
CHECK_FLAG_VALUE(TOPDIR);
CHECK_FLAG_VALUE(HUGE_FILE);
CHECK_FLAG_VALUE(EXTENTS);
+ CHECK_FLAG_VALUE(VERITY);
CHECK_FLAG_VALUE(EA_INODE);
CHECK_FLAG_VALUE(EOFBLOCKS);
CHECK_FLAG_VALUE(INLINE_DATA);
@@ -1632,6 +1638,7 @@ static inline void ext4_clear_state_flags(struct ext4_inode_info *ei)
#define EXT4_FEATURE_RO_COMPAT_METADATA_CSUM 0x0400
#define EXT4_FEATURE_RO_COMPAT_READONLY 0x1000
#define EXT4_FEATURE_RO_COMPAT_PROJECT 0x2000
+#define EXT4_FEATURE_RO_COMPAT_VERITY 0x8000
#define EXT4_FEATURE_INCOMPAT_COMPRESSION 0x0001
#define EXT4_FEATURE_INCOMPAT_FILETYPE 0x0002
@@ -1720,6 +1727,7 @@ EXT4_FEATURE_RO_COMPAT_FUNCS(bigalloc, BIGALLOC)
EXT4_FEATURE_RO_COMPAT_FUNCS(metadata_csum, METADATA_CSUM)
EXT4_FEATURE_RO_COMPAT_FUNCS(readonly, READONLY)
EXT4_FEATURE_RO_COMPAT_FUNCS(project, PROJECT)
+EXT4_FEATURE_RO_COMPAT_FUNCS(verity, VERITY)
EXT4_FEATURE_INCOMPAT_FUNCS(compression, COMPRESSION)
EXT4_FEATURE_INCOMPAT_FUNCS(filetype, FILETYPE)
@@ -1775,7 +1783,8 @@ EXT4_FEATURE_INCOMPAT_FUNCS(encrypt, ENCRYPT)
EXT4_FEATURE_RO_COMPAT_BIGALLOC |\
EXT4_FEATURE_RO_COMPAT_METADATA_CSUM|\
EXT4_FEATURE_RO_COMPAT_QUOTA |\
- EXT4_FEATURE_RO_COMPAT_PROJECT)
+ EXT4_FEATURE_RO_COMPAT_PROJECT |\
+ EXT4_FEATURE_RO_COMPAT_VERITY)
#define EXTN_FEATURE_FUNCS(ver) \
static inline bool ext4_has_unknown_ext##ver##_compat_features(struct super_block *sb) \
@@ -2271,6 +2280,15 @@ static inline bool ext4_encrypted_inode(struct inode *inode)
return ext4_test_inode_flag(inode, EXT4_INODE_ENCRYPT);
}
+static inline bool ext4_verity_inode(struct inode *inode)
+{
+#ifdef CONFIG_EXT4_FS_VERITY
+ return ext4_test_inode_flag(inode, EXT4_INODE_VERITY);
+#else
+ return false;
+#endif
+}
+
#ifdef CONFIG_EXT4_FS_ENCRYPTION
static inline int ext4_fname_setup_filename(struct inode *dir,
const struct qstr *iname,
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 7f8023340eb8c..97a6a7699cff6 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -444,6 +444,12 @@ static int ext4_file_open(struct inode * inode, struct file * filp)
if (ret)
return ret;
+ if (ext4_verity_inode(inode)) {
+ ret = fsverity_file_open(inode, filp);
+ if (ret)
+ return ret;
+ }
+
/*
* Set up the jbd2_inode if we are opening the inode for
* writing and the journal is present
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 4efe77286ecd5..bb8f50230d055 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4651,6 +4651,8 @@ static bool ext4_should_use_dax(struct inode *inode)
return false;
if (ext4_encrypted_inode(inode))
return false;
+ if (ext4_verity_inode(inode))
+ return false;
return true;
}
@@ -5436,6 +5438,12 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
if (error)
return error;
+ if (ext4_verity_inode(inode)) {
+ error = fsverity_prepare_setattr(dentry, attr);
+ if (error)
+ return error;
+ }
+
if (is_quota_modification(inode, attr)) {
error = dquot_initialize(inode);
if (error)
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index a7074115d6f68..55d54a176107e 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -983,6 +983,16 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
case EXT4_IOC_GET_ENCRYPTION_POLICY:
return fscrypt_ioctl_get_policy(filp, (void __user *)arg);
+ case FS_IOC_ENABLE_VERITY:
+ if (!ext4_has_feature_verity(sb))
+ return -EOPNOTSUPP;
+ return fsverity_ioctl_enable(filp, (const void __user *)arg);
+
+ case FS_IOC_MEASURE_VERITY:
+ if (!ext4_has_feature_verity(sb))
+ return -EOPNOTSUPP;
+ return fsverity_ioctl_measure(filp, (void __user *)arg);
+
case EXT4_IOC_FSGETXATTR:
{
struct fsxattr fa;
@@ -1101,6 +1111,8 @@ long ext4_compat_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
case EXT4_IOC_SET_ENCRYPTION_POLICY:
case EXT4_IOC_GET_ENCRYPTION_PWSALT:
case EXT4_IOC_GET_ENCRYPTION_POLICY:
+ case FS_IOC_ENABLE_VERITY:
+ case FS_IOC_MEASURE_VERITY:
case EXT4_IOC_SHUTDOWN:
case FS_IOC_GETFSMAP:
break;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index b7f7922061be8..c2f372c634ccb 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1112,6 +1112,7 @@ void ext4_clear_inode(struct inode *inode)
EXT4_I(inode)->jinode = NULL;
}
fscrypt_put_encryption_info(inode);
+ fsverity_cleanup_inode(inode);
}
static struct inode *ext4_nfs_get_inode(struct super_block *sb,
@@ -1283,6 +1284,83 @@ static const struct fscrypt_operations ext4_cryptops = {
};
#endif
+#ifdef CONFIG_EXT4_FS_VERITY
+static int ext4_set_verity(struct inode *inode, loff_t data_i_size)
+{
+ int err;
+ handle_t *handle;
+ struct ext4_iloc iloc;
+
+ err = ext4_convert_inline_data(inode);
+ if (err)
+ return err;
+
+ /* Remove extents past EOF; see ext4_get_verity_full_size() */
+ err = ext4_truncate(inode);
+ if (err)
+ return err;
+
+ handle = ext4_journal_start(inode, EXT4_HT_INODE, 1);
+ if (IS_ERR(handle))
+ return PTR_ERR(handle);
+ err = ext4_reserve_inode_write(handle, inode, &iloc);
+ if (err == 0) {
+ ext4_set_inode_flag(inode, EXT4_INODE_VERITY);
+ EXT4_I(inode)->i_disksize = data_i_size;
+ err = ext4_mark_iloc_dirty(handle, inode, &iloc);
+ }
+ ext4_journal_stop(handle);
+
+ return err;
+}
+
+/*
+ * Retrieve the full size of a verity file. This is the size of the original
+ * data plus the verity metadata such as the Merkle tree. To find this, we have
+ * to find the end of the last extent. This is needed because in ext4, in order
+ * to make verity an RO_COMPAT filesystem feature, the i_disksize of verity
+ * inodes is set to the data size rather than the full size.
+ */
+static int ext4_get_verity_full_size(struct inode *inode,
+ loff_t *full_i_size_ret)
+{
+ struct ext4_ext_path *path;
+ struct ext4_extent *last_extent;
+ u32 end_lblk;
+ int err;
+
+ if (ext4_has_inline_data(inode)) {
+ EXT4_ERROR_INODE(inode, "verity file has inline data");
+ return -EFSCORRUPTED;
+ }
+
+ path = ext4_find_extent(inode, EXT_MAX_BLOCKS - 1, NULL, 0);
+ if (IS_ERR(path))
+ return PTR_ERR(path);
+
+ last_extent = path[path->p_depth].p_ext;
+ if (!last_extent) {
+ EXT4_ERROR_INODE(inode, "verity file has no extents");
+ err = -EFSCORRUPTED;
+ goto out_drop_path;
+ }
+
+ end_lblk = le32_to_cpu(last_extent->ee_block) +
+ ext4_ext_get_actual_len(last_extent);
+ *full_i_size_ret = (loff_t)end_lblk << inode->i_blkbits;
+ err = 0;
+out_drop_path:
+ ext4_ext_drop_refs(path);
+ kfree(path);
+ return err;
+}
+
+static const struct fsverity_operations ext4_verityops = {
+ .set_verity = ext4_set_verity,
+ .get_full_i_size = ext4_get_verity_full_size,
+};
+#endif /* CONFIG_EXT4_FS_VERITY */
+
#ifdef CONFIG_QUOTA
static const char * const quotatypes[] = INITQFNAMES;
#define QTYPE2NAME(t) (quotatypes[t])
@@ -4104,6 +4182,9 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
#ifdef CONFIG_EXT4_FS_ENCRYPTION
sb->s_cop = &ext4_cryptops;
#endif
+#ifdef CONFIG_EXT4_FS_VERITY
+ sb->s_vop = &ext4_verityops;
+#endif
#ifdef CONFIG_QUOTA
sb->dq_op = &ext4_quota_operations;
if (ext4_has_feature_quota(sb))
diff --git a/fs/ext4/sysfs.c b/fs/ext4/sysfs.c
index f34da0bb8f174..3f3175367b696 100644
--- a/fs/ext4/sysfs.c
+++ b/fs/ext4/sysfs.c
@@ -223,6 +223,9 @@ EXT4_ATTR_FEATURE(meta_bg_resize);
#ifdef CONFIG_EXT4_FS_ENCRYPTION
EXT4_ATTR_FEATURE(encryption);
#endif
+#ifdef CONFIG_EXT4_FS_VERITY
+EXT4_ATTR_FEATURE(verity);
+#endif
EXT4_ATTR_FEATURE(metadata_csum_seed);
static struct attribute *ext4_feat_attrs[] = {
@@ -231,6 +234,9 @@ static struct attribute *ext4_feat_attrs[] = {
ATTR_LIST(meta_bg_resize),
#ifdef CONFIG_EXT4_FS_ENCRYPTION
ATTR_LIST(encryption),
+#endif
+#ifdef CONFIG_EXT4_FS_VERITY
+ ATTR_LIST(verity),
#endif
ATTR_LIST(metadata_csum_seed),
NULL,
--
2.18.0
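On the "log(filesize) time" claim in the commit message above: the number of hash blocks touched when verifying one data block is bounded by the depth of the Merkle tree. The sketch below estimates that depth under a stated assumption, namely that every tree block packs block_size / digest_size child hashes and the top level is a single block. The function name is hypothetical and the on-disk layout in this series may differ in detail.

```c
#include <stdint.h>

/*
 * Estimate the number of Merkle tree levels above the data blocks.
 * Assumes each tree block holds block_size / digest_size hashes.
 * Returns 0 for an empty file or a degenerate fan-out.
 */
unsigned int merkle_tree_levels(uint64_t data_size, uint32_t block_size,
				uint32_t digest_size)
{
	uint64_t hashes_per_block;
	uint64_t blocks;
	unsigned int levels = 0;

	if (data_size == 0 || block_size == 0 || digest_size == 0)
		return 0;
	hashes_per_block = block_size / digest_size;
	if (hashes_per_block < 2)
		return 0;

	/* Number of data blocks, rounding up a partial final block. */
	blocks = (data_size + block_size - 1) / block_size;

	/* Each level hashes the blocks of the level below it. */
	do {
		blocks = (blocks + hashes_per_block - 1) / hashes_per_block;
		levels++;
	} while (blocks > 1);

	return levels;
}
```

With 4096-byte blocks and 32-byte SHA-256 digests the fan-out is 128, so a 1 GiB file needs only three levels under this model.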
On 08/24/2018 09:16 AM, Eric Biggers wrote:
> +/* ========== Ioctls ========== */
> +
> +struct fsverity_digest {
> + __u16 digest_algorithm;
> + __u16 digest_size; /* input/output */
> + __u8 digest[];
> +};
> +
> +#define FS_IOC_ENABLE_VERITY _IO('f', 133)
> +#define FS_IOC_MEASURE_VERITY _IOWR('f', 134, struct fsverity_digest)
Hi,
Please update Documentation/ioctl/ioctl-number.txt also.
thanks,
--
~Randy
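For context on what would be recorded in ioctl-number.txt: the quoted definitions use type character 'f' with command numbers 133 and 134. The sketch below rebuilds them in userspace and lets the standard _IOC_* decoding macros pull the fields back out. The struct and function names are hypothetical mirrors of the quoted code, with stdint types in place of __u16/__u8, and the numbers are simply those proposed in this RFC.

```c
#include <stdint.h>
#include <linux/ioctl.h>

/* Userspace mirror of the quoted struct fsverity_digest. */
struct fsverity_digest_model {
	uint16_t digest_algorithm;
	uint16_t digest_size;	/* input/output */
	uint8_t digest[];
};

/* FS_IOC_ENABLE_VERITY as proposed: no argument struct is encoded. */
unsigned long verity_enable_cmd(void)
{
	return _IO('f', 133);
}

/* FS_IOC_MEASURE_VERITY as proposed: read/write, sized by the header. */
unsigned long verity_measure_cmd(void)
{
	return _IOWR('f', 134, struct fsverity_digest_model);
}
```

Decoding with _IOC_TYPE() and _IOC_NR() yields the ('f', 133) and ('f', 134) pairs; the size field of the second command covers only the 4-byte fixed header because the digest is a flexible array member.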
On Fri, Aug 24, 2018, at 12:16 PM, Eric Biggers wrote:
> From: Eric Biggers <[email protected]>
>
> fs-verity is a filesystem feature that provides efficient, transparent
> integrity verification and authentication of read-only files. It uses a
> dm-verity like mechanism at the file level: a Merkle tree hidden past
> the end of the file is used to verify any block in the file in
> log(filesize) time. It is implemented mainly by helper functions in
> fs/verity/ that will be shared by multiple filesystems.
>
> Essentially, fs-verity reports a file's hash in constant time, but reads
> that would violate that hash fail at runtime. This is useful when only
> a portion of the file is actually accessed, as only the accessed portion
> has to be hashed, and the latency to the first read is much reduced over
> a full file hash. On top of this hashing mechanism, auditing or
> authentication policies can be implemented to log or verify file hashes.
>
> Note that in general, fs-verity is *not* a replacement for IMA.
> fs-verity is a lower-level feature, primarily a way to hash a file;
> whereas IMA deals more with higher-level policy logic, like defining
> which files are "measured" and what to do with those measurements. We
> plan for IMA to support fs-verity measurements as an alternative to the
> traditional full file hash. Still, some users find fs-verity useful by
> itself, so it's also usable without IMA in simple cases, e.g. in cases
> where just retrieving the file measurement via an ioctl is enough.
>
> A structure containing the properties of the Merkle tree -- such as the
> hash algorithm used, the block size, and the root hash -- is also stored
> on-disk, following the Merkle tree. The actual file measurement hash
> that fs-verity reports is the hash of this structure.
>
> All fs-verity metadata is written by userspace; the kernel only reads
> it. Extended attributes aren't used because the Merkle tree may be much
> larger than XATTR_SIZE_MAX, we want the hash pages to be cached in the
> page cache as usual, and in the case of fs-verity combined with fscrypt
> we want the metadata to be encrypted to avoid leaking plaintext hashes.
> The fs-verity metadata is hidden from userspace by overriding the i_size
> of the in-memory VFS inode; ext4 additionally will override the on-disk
> i_size in order to make verity a RO_COMPAT filesystem feature.
>
> This initial patch only adds the fs-verity Kconfig option, UAPI, and
> setup code, e.g. the ->open() hook that parses the fs-verity descriptor.
This first patch also adds a bit of core logic in the simple
fsverity_prepare_setattr(), which ends up being called by ext4 later.
While I'm not too familiar with the VFS, as far as I can tell from
inspecting Linus' git master, pretty much any change (timestamps,
hardlinks) ends up calling notify_change(), which calls the fs-specific
handler, and in the verity case that basically denies everything, right?
Previously I brought up many uses for "content immutable" files:
https://marc.info/?l=linux-fsdevel&m=151698481512084&w=2
The discussion sort of died out, but did you have any opinion on, e.g.,
my proposal to use the Unix mode bits as a way to describe levels of
mutability?
Let's say that your new _VERITY inode flag becomes "_WRITEPROT" or
something a bit more generic; the Unix mode bits could then be reused to
control levels of inode mutability.
For example, it seems to me we could define u+w as "hardlinks are OK".
There shouldn't be any reason ext4/f2fs couldn't hardlink a
verity-protected inode, right? Or if for some reason that is hard, we
could disallow that to start, but at least have the core VFS support
_WRITEPROT inodes?
On Fri, Aug 24, 2018 at 01:42:29PM -0400, Colin Walters wrote:
>
> While I'm not too familiar with the vfs, as far as I can
> tell from inspection of Linus' git master is that pretty much any change (timestamp, hardlinks) ends up
> calling notify_change() which calls the fs-specific one, and in
> the verity case basically denies everything, right?
That's not correct. The verity case only denies truncate, because
changing the data of the file would break the Merkle tree checksums.
The metadata of the file is not made immutable. So a
verity-protected file can be deleted, renamed, can have hard links,
and the timestamps can be set via utimes(), etc.
Cheers,
- Ted
Hi,
On 2018/8/25 0:16, Eric Biggers wrote:
> +/**
> + * fsverity_verify_page - verify a data page
> + *
> + * Verify a page that has just been read from a file against that file's Merkle
> + * tree. The page is assumed to be a pagecache page.
> + *
> + * Return: true if the page is valid, else false.
> + */
> +bool fsverity_verify_page(struct page *data_page)
> +{
> + struct inode *inode = data_page->mapping->host;
> + const struct fsverity_info *vi = get_fsverity_info(inode);
> + struct ahash_request *req;
> + bool valid;
> +
> + req = ahash_request_alloc(vi->hash_alg->tfm, GFP_KERNEL);
> + if (unlikely(!req))
> + return false;
> +
> + valid = verify_page(inode, vi, req, data_page);
> +
> + ahash_request_free(req);
> +
> + return valid;
> +}
> +EXPORT_SYMBOL_GPL(fsverity_verify_page);
> +
> +/**
> + * fsverity_verify_bio - verify a 'read' bio that has just completed
> + *
> + * Verify a set of pages that have just been read from a file against that
> + * file's Merkle tree. The pages are assumed to be pagecache pages. Pages that
> + * fail verification are set to the Error state. Verification is skipped for
> + * pages already in the Error state, e.g. due to fscrypt decryption failure.
> + */
> +void fsverity_verify_bio(struct bio *bio)
> +{
> + struct inode *inode = bio_first_page_all(bio)->mapping->host;
> + const struct fsverity_info *vi = get_fsverity_info(inode);
> + struct ahash_request *req;
> + struct bio_vec *bv;
> + int i;
> +
> + req = ahash_request_alloc(vi->hash_alg->tfm, GFP_KERNEL);
> + if (unlikely(!req)) {
> + bio_for_each_segment_all(bv, bio, i)
> + SetPageError(bv->bv_page);
> + return;
> + }
> +
> + bio_for_each_segment_all(bv, bio, i) {
> + struct page *page = bv->bv_page;
> +
> + if (!PageError(page) && !verify_page(inode, vi, req, page))
> + SetPageError(page);
> + }
> +
> + ahash_request_free(req);
> +}
> +EXPORT_SYMBOL_GPL(fsverity_verify_bio);
Out of curiosity, I quickly scanned the fs-verity source code and have some minor questions.
If I have something wrong, please point it out; thanks in advance.
My first question is: is there any way to skip verifying some pages in a bio?
I am thinking of a filesystem in which metadata and data pages are mixed: they could be submitted together in one bio, but the metadata could be unsuitable for this kind of verification.
My second question is related to the first: is there any way to verify a partial page?
Taking scalability into consideration, some files could be totally or partially inlined in metadata.
Is there any way to deal with them in the per-file approach, or at least support for it in the interface?
Finally, I hope filesystems could select the on-disk position of the hash tree and 'struct fsverity_descriptor'
rather than having them fixed at the end of verity files. I think it would be better if fs-verity prepared such support and interfaces. :(
Thanks,
Gao Xiang
On Sat, Aug 25, 2018 at 10:29:26AM +0800, Gao Xiang wrote:
>
> My first question is that 'Is there any way to skip to verify pages in a bio?'
> I am thinking about
> If metadata and data page are mixed in a filesystem of such kind, they could submit together in a bio, but metadata could be unsuitable for such kind of verification.
>
> The second question is related to the first question --- 'Is there any way to verify a partial page?'
> Take scalability into consideration, some files could be totally inlined or partially inlined in metadata.
> Is there any way to deal with them in per-file approach? at least --- support for the interface?
A requirement of both fscrypt and fsverity is that block size
== page size, and that all data is stored in blocks. Inline data is
not supported.
The files that are intended for use with fsverity are large files
(such as APK files), so optimizing for files smaller than a block was
not a design goal.
- Ted
Hi Ted,
On 2018/8/25 11:45, Theodore Y. Ts'o wrote:
> On Sat, Aug 25, 2018 at 10:29:26AM +0800, Gao Xiang wrote:
>> My first question is that 'Is there any way to skip to verify pages in a bio?'
>> I am thinking about
>> If metadata and data page are mixed in a filesystem of such kind, they could submit together in a bio, but metadata could be unsuitable for such kind of verification.
>>
>> The second question is related to the first question --- 'Is there any way to verify a partial page?'
>> Take scalability into consideration, some files could be totally inlined or partially inlined in metadata.
>> Is there any way to deal with them in per-file approach? at least --- support for the interface?
> A requirement of both fscrypt and fsverity is that is that block size
> == page size, and that all data is stored in blocks. Inline data is
> not supported.
>
> The files that are intended for use with fsverity are large files
> (such as APK files), so optimizing for files smaller than a block was
> not a design goal.
Thanks for your quick reply. :)
I had seen the background on why Google/Android is introducing fs-verity before.
But I have some concerns about the current implementation (if this is a suitable place to discuss them, thanks):
1) Since it is a libfs-like library, I think requiring bios is too strict for its future fs users.
Bios could already be organized in a filesystem-specific way, and could include other pages that do not need to be verified.
For example, a filesystem may organize its bios for decompression, with some data living in metadata.
It could be hard for such a filesystem to use this libfs-like fsverity interface.
2) My last question,
"At last, I hope filesystems could select the on-disk position of hash tree and 'struct fsverity_descriptor'
rather than fixed in the end of verity files...I think if fs-verity preparing such support and interfaces could be better.....",
also applies to files that are partially or totally encoded (e.g. compressed, or whatever).
I think the hash tree does not need to be compressed, so it would be better if its location could be selected by users (filesystems, of course).
Thanks,
Gao Xiang.
> - Ted
Hi Gao,
On Sat, Aug 25, 2018 at 10:29:26AM +0800, Gao Xiang wrote:
> Hi,
>
> On 2018/8/25 0:16, Eric Biggers wrote:
> > +/**
> > + * fsverity_verify_page - verify a data page
> > + *
> > + * Verify a page that has just been read from a file against that file's Merkle
> > + * tree. The page is assumed to be a pagecache page.
> > + *
> > + * Return: true if the page is valid, else false.
> > + */
> > +bool fsverity_verify_page(struct page *data_page)
> > +{
> > + struct inode *inode = data_page->mapping->host;
> > + const struct fsverity_info *vi = get_fsverity_info(inode);
> > + struct ahash_request *req;
> > + bool valid;
> > +
> > + req = ahash_request_alloc(vi->hash_alg->tfm, GFP_KERNEL);
> > + if (unlikely(!req))
> > + return false;
> > +
> > + valid = verify_page(inode, vi, req, data_page);
> > +
> > + ahash_request_free(req);
> > +
> > + return valid;
> > +}
> > +EXPORT_SYMBOL_GPL(fsverity_verify_page);
> > +
> > +/**
> > + * fsverity_verify_bio - verify a 'read' bio that has just completed
> > + *
> > + * Verify a set of pages that have just been read from a file against that
> > + * file's Merkle tree. The pages are assumed to be pagecache pages. Pages that
> > + * fail verification are set to the Error state. Verification is skipped for
> > + * pages already in the Error state, e.g. due to fscrypt decryption failure.
> > + */
> > +void fsverity_verify_bio(struct bio *bio)
> > +{
> > + struct inode *inode = bio_first_page_all(bio)->mapping->host;
> > + const struct fsverity_info *vi = get_fsverity_info(inode);
> > + struct ahash_request *req;
> > + struct bio_vec *bv;
> > + int i;
> > +
> > + req = ahash_request_alloc(vi->hash_alg->tfm, GFP_KERNEL);
> > + if (unlikely(!req)) {
> > + bio_for_each_segment_all(bv, bio, i)
> > + SetPageError(bv->bv_page);
> > + return;
> > + }
> > +
> > + bio_for_each_segment_all(bv, bio, i) {
> > + struct page *page = bv->bv_page;
> > +
> > + if (!PageError(page) && !verify_page(inode, vi, req, page))
> > + SetPageError(page);
> > + }
> > +
> > + ahash_request_free(req);
> > +}
> > +EXPORT_SYMBOL_GPL(fsverity_verify_bio);
>
> Out of curiosity, I quickly scanned the fs-verity source code and some minor question out there....
>
> If something is wrong, please point out, thanks in advance...
>
> My first question is that 'Is there any way to skip to verify pages in a bio?'
> I am thinking about
> If metadata and data page are mixed in a filesystem of such kind, they could submit together in a bio, but metadata could be unsuitable for such kind of verification.
>
Pages below i_size are verified, pages above are not.
With my patches, ext4 and f2fs won't actually submit pages in both areas in the
same bio, and they won't call the fs-verity verification function for bios in
the data area. But even if they did, there's also a check in verify_page() that
skips the verification if the page is above i_size.
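A minimal sketch of the "below i_size is verified, above is not" rule, as a userspace C function. The name sketch_page_needs_verification() and the fixed 4 KiB page size are assumptions for illustration; this is not the verify_page() code from the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/*
 * Return true if the page at page_index starts below i_size, i.e. it holds
 * file data that must be checked against the Merkle tree. Pages starting at
 * or beyond i_size hold only verity metadata and are skipped.
 */
static bool sketch_page_needs_verification(uint64_t page_index, uint64_t i_size)
{
	/* Guard against overflow when converting the index to a byte offset. */
	assert(page_index <= (UINT64_MAX >> SKETCH_PAGE_SHIFT));

	uint64_t page_start = page_index << SKETCH_PAGE_SHIFT;

	return page_start < i_size;
}
```

For a 10000-byte file, pages 0 through 2 contain data and are verified, while page 3 and later are not.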
> The second question is related to the first question --- 'Is there any way to verify a partial page?'
> Take scalability into consideration, some files could be totally inlined or partially inlined in metadata.
> Is there any way to deal with them in per-file approach? at least --- support for the interface?
Well, one problem is that inline data has its own separate I/O path; see
ext4_readpage_inline() and f2fs_read_inline_data(). So it would be a large
effort to support features like encryption and verity which require
postprocessing after reads, and probably not worthwhile especially for verity
which is primarily intended for large files.
A somewhat separate question is whether the zero padding to a block boundary
after i_size, before the Merkle tree begins, is needed. The answer is yes,
since mixing data and metadata in the same page would cause problems. First,
userspace would be able to mmap the page and see some of the metadata rather
than zeroes. That's not a huge problem, but it breaks the standard behavior.
Second, any page containing data cannot be set Uptodate until it's been
verified. So, a special case would be needed to handle reading the part of the
metadata that's located in a data page.
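The padding rule can be sketched as simple arithmetic: the metadata starts at i_size rounded up to the next block boundary. The helper names below are hypothetical and the block size is assumed to be a power of two:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Offset at which the verity metadata begins: i_size rounded up to a
 * multiple of block_size (assumed to be a nonzero power of two).
 */
static uint64_t sketch_metadata_start(uint64_t i_size, uint64_t block_size)
{
	assert(block_size != 0 && (block_size & (block_size - 1)) == 0);
	assert(i_size <= UINT64_MAX - (block_size - 1));

	return (i_size + block_size - 1) & ~(block_size - 1);
}

/* Number of zero bytes between the end of the data and the metadata. */
static uint64_t sketch_zero_padding(uint64_t i_size, uint64_t block_size)
{
	return sketch_metadata_start(i_size, block_size) - i_size;
}
```

With 4096-byte blocks, a 10000-byte file gets 2288 bytes of zero padding and its metadata starts at offset 12288, so no page ever mixes data with metadata.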
> At last, I hope filesystems could select the on-disk position of hash tree and 'struct fsverity_descriptor'
> rather than fixed in the end of verity files...I think if fs-verity preparing such support and interfaces could be better.....hmmm... :(
In theory it would be a much cleaner design to store verity metadata separately
from the data. But the Merkle tree can be very large. For example, a 1 GB file
using SHA-512 would have a 16.6 MB Merkle tree. So the Merkle tree can't be an
extended attribute, since the xattrs API requires xattrs to be small (<= 64 KB),
and most filesystems further limit xattr sizes in their on-disk format to as
little as 4 KB. Furthermore, even if both of these limits were to be increased,
the xattrs functions (both the syscalls, and the internal functions that
filesystems have) are all based around getting/setting the entire xattr value.
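To make the size estimate concrete, here is a rough sketch that counts Merkle tree blocks level by level. It assumes each hash block is fully packed with digests and that a file fitting in one block needs no tree blocks; the function name is hypothetical and the on-disk layout in the patchset may differ in detail:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count the hash blocks in a Merkle tree over file_size bytes, where each
 * tree block of block_size bytes holds block_size / digest_size digests.
 * Levels are added bottom-up until a single root block remains.
 */
static uint64_t sketch_merkle_tree_blocks(uint64_t file_size,
					  uint64_t block_size,
					  uint64_t digest_size)
{
	assert(block_size != 0 && digest_size != 0);

	uint64_t hashes_per_block = block_size / digest_size;

	/* Need a fan-out of at least 2, or the loop would never shrink. */
	assert(hashes_per_block >= 2);

	/* Start with the number of data blocks, rounding up. */
	uint64_t blocks = file_size / block_size +
			  (file_size % block_size != 0);
	uint64_t total = 0;

	while (blocks > 1) {
		blocks = (blocks + hashes_per_block - 1) / hashes_per_block;
		total += blocks;
	}
	return total;
}
```

For a 1 GiB file with 4096-byte blocks and 64-byte SHA-512 digests this yields 4096 + 64 + 1 = 4161 tree blocks, about 17 million bytes, which is in the same range as the figure quoted above and far beyond any xattr size limit. With 32-byte SHA-256 digests the tree is roughly half that size.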
Also when used with fscrypt, we want the Merkle tree and fsverity_descriptor to
be encrypted, so they don't leak plaintext hashes. And we want the Merkle
tree to be paged into memory, just like the file contents, to take advantage of
the usual Linux memory management.
What we really need is *streams*, like NTFS has. But the filesystems we're
targeting don't support streams, nor does the Linux syscall interface have any
API for accessing streams, nor does the VFS support them.
Adding streams support to all those things would be a huge multi-year effort,
controversial, and almost certainly not worth it just for fs-verity.
So simply storing the verity metadata past i_size seems like the best solution
for now.
That being said, in the future we could pretty easily swap out the calls to
read_mapping_page() with something else if a particular filesystem wanted to
store the metadata somewhere else. We actually even originally had a function
->read_metadata_page() in the filesystem's fsverity_operations, but it turned
out to be unnecessary, so I replaced it with a direct call to
read_mapping_page(). It could be changed back at any time.
- Eric
Hi Colin,
On Fri, Aug 24, 2018 at 01:42:29PM -0400, Colin Walters wrote:
>
> On Fri, Aug 24, 2018, at 12:16 PM, Eric Biggers wrote:
> > From: Eric Biggers <[email protected]>
> >
> > fs-verity is a filesystem feature that provides efficient, transparent
> > integrity verification and authentication of read-only files. It uses a
> > dm-verity like mechanism at the file level: a Merkle tree hidden past
> > the end of the file is used to verify any block in the file in
> > log(filesize) time. It is implemented mainly by helper functions in
> > fs/verity/ that will be shared by multiple filesystems.
> >
> > Essentially, fs-verity reports a file's hash in constant time, but reads
> > that would violate that hash fail at runtime. This is useful when only
> > a portion of the file is actually accessed, as only the accessed portion
> > has to be hashed, and the latency to the first read is much reduced over
> > a full file hash. On top of this hashing mechanism, auditing or
> > authentication policies can be implemented to log or verify file hashes.
> >
> > Note that in general, fs-verity is *not* a replacement for IMA.
> > fs-verity is a lower-level feature, primarily a way to hash a file;
> > whereas IMA deals more with higher-level policy logic, like defining
> > which files are "measured" and what to do with those measurements. We
> > plan for IMA to support fs-verity measurements as an alternative to the
> > traditional full file hash. Still, some users find fs-verity useful by
> > itself, so it's also usable without IMA in simple cases, e.g. in cases
> > where just retrieving the file measurement via an ioctl is enough.
> >
> > A structure containing the properties of the Merkle tree -- such as the
> > hash algorithm used, the block size, and the root hash -- is also stored
> > on-disk, following the Merkle tree. The actual file measurement hash
> > that fs-verity reports is the hash of this structure.
> >
> > All fs-verity metadata is written by userspace; the kernel only reads
> > it. Extended attributes aren't used because the Merkle tree may be much
> > larger than XATTR_SIZE_MAX, we want the hash pages to be cached in the
> > page cache as usual, and in the case of fs-verity combined with fscrypt
> > we want the metadata to be encrypted to avoid leaking plaintext hashes.
> > The fs-verity metadata is hidden from userspace by overriding the i_size
> > of the in-memory VFS inode; ext4 additionally will override the on-disk
> > i_size in order to make verity a RO_COMPAT filesystem feature.
> >
> > This initial patch only adds the fs-verity Kconfig option, UAPI, and
> > setup code, e.g. the ->open() hook that parses the fs-verity descriptor.
>
> This first patch also adds a bit of core logic in the
> simple fsverity_prepare_setattr() which ends up being called
> by ext4 later.
>
> While I'm not too familiar with the vfs, as far as I can
> tell from inspection of Linus' git master is that pretty much any change (timestamp, hardlinks) ends up
> calling notify_change() which calls the fs-specific one, and in
> the verity case basically denies everything, right?
>
> Previously I brought up many uses for "content immutable" files:
> https://marc.info/?l=linux-fsdevel&m=151698481512084&w=2
>
> The discussion sort of died out but...did you have any opinion
> on e.g. my proposal to use the Unix mode bits as a way to describe
> levels of mutablility?
>
> Let's say that your new _VERITY inode flag becomes "_WRITEPROT"
> or something a bit more generic.
>
> Do you have any thoughts on my proposal to reuse the Unix mode
> bits to control levels of inode mutability?
>
> For example, it seems to me we could define u+w as "hardlinks are OK".
> There shouldn't be any reason ext4/f2fs couldn't hardlink a verity-protected
> inode right? Or if for some reason that is hard, we could disallow that to
> start, but at least have the core VFS support _WRITEPROT inodes?
>
As Ted pointed out, only truncates are denied on fs-verity files, not other
metadata changes like chmod().
Think of it this way: the purpose of fs-verity is *not* to make files immutable.
It's to hash them. We can't allow people to change the thing being hashed,
since that would invalidate the hash. So while fs-verity does make the file
contents immutable, it's actually a requirement for it being hashed (measured),
rather than the end goal. There's no reason from fs-verity's perspective to
make anything else immutable.
That being said, in the future, we could allow declaring file metadata like the
Unix mode bits in the authenticated portion of the fs-verity descriptor, so they
would be included in the file hash. fs-verity would then need to enforce that
the declared mode matches the actual one and that the actual one cannot be
changed. Extended attributes could be included in the hash in the same way.
But that's out of scope for now, as so far users only need the file contents to
be hashed.
- Eric
On Sat, Aug 25, 2018 at 12:00:04PM +0800, Gao Xiang wrote:
>
> But I have some consideration than the current implementation.... (if it is suitable to discuss, thanks...)
>
> 1) Since it is the libfs-like library, I think bio-strict is too strict for its future fs users.
Well, it's always possible to expand fscrypt and
fs-verity to be more flexible in the future. For example, Chandan
Rajendra from IBM has been working on a set of patches to support file
systems that have a block size smaller than a page size. This turns
out to be important on Power architecture with 64k page sizes.
Fundamentally, a Merkle tree is a data structure that works on fixed
size chunks, both for the data blocks and the hash tree. The natural
size to use is the page size, since data is cached in the page cache.
So a file system can store data in any number of places, but
ultimately, most interesting file systems are ones where you can
execute ELF binaries out of said file system with demand paging, which
in turn means that mmap has to work, which in turn means that file
data will be stored in the page cache. This is true of f2fs, btrfs,
ext4, xfs, etc. So basically, fs-verity will be verifying the page
before it is marked as uptodate. Right now, all of the file systems
that we are interested in trigger the call to ask fsverity to verify
the page via the bio endio callback function.
Some other file systems could theoretically call that function after
assembling the page from a dozen random locations in a b-tree. In
that case, it could call fsverity after assembling the page in the
page cache. But I'd suggest worrying about it when such a file system
comes out of the woodwork, and someone is willing to do the work to
integrate fs-verity in that file system.
> 2) My last question
> "At last, I hope filesystems could select the on-disk position of hash tree and 'struct fsverity_descriptor'
> rather than fixed in the end of verity files...I think if fs-verity preparing such support and interfaces could be better....."
> is also for some files partially or totally encoded (eg. compressed, or whatever ...)
Well, the userspace interface for instantiating a fs-verity file is
that it writes the file data with the fs-verity metadata (which
consists of the Merkle tree with a fs-verity header at the end of the
file). The program (which might be a package manager such as dpkg or
rpm) would then call an ioctl which would cause the file system to
read the fs-verity header and make only the file data visible, and the
file system would then verify the data as it is read into the page
cache.
That is the userspace API to the fs-verity system. That has to remain
the same, regardless of which file system is in use. We need a common
interface so that, whether it is the Android APK management system or
some distribution package manager, the program can instantiate an fs-verity
protected file the same way regardless of the file system in use.
There is a very simple, easy way to implement this in the file system,
and f2fs and ext4 both do it that way --- which is to simply change
the i_size exposed to the userspace when you stat the file, and we use
the file system's existing mechanism to map logical block numbers to
physical block numbers to read the Merkle tree.
If the file system wants to import that file data and store it
somewhere else random --- perhaps it breaks it apart into a zillion
tiny pieces and puts it in a b-tree --- a file system implementor is
free to do that. I personally think it is a completely insane thing
to do, but there is nothing in the fs-verity design that *prohibits*
that.
Regards,
- Ted
On 2018/8/25 0:16, Eric Biggers wrote:
> From: Eric Biggers <[email protected]>
> #ifdef CONFIG_F2FS_CHECK_FS
> #define f2fs_bug_on(sbi, condition) BUG_ON(condition)
> #else
> @@ -146,7 +149,7 @@ struct f2fs_mount_info {
> #define F2FS_FEATURE_QUOTA_INO 0x0080
> #define F2FS_FEATURE_INODE_CRTIME 0x0100
> #define F2FS_FEATURE_LOST_FOUND 0x0200
> -#define F2FS_FEATURE_VERITY 0x0400 /* reserved */
> +#define F2FS_FEATURE_VERITY 0x0400
>
> #define F2FS_HAS_FEATURE(sb, mask) \
> ((F2FS_SB(sb)->raw_super->feature & cpu_to_le32(mask)) != 0)
> @@ -598,7 +601,7 @@ enum {
> #define FADVISE_ENC_NAME_BIT 0x08
> #define FADVISE_KEEP_SIZE_BIT 0x10
> #define FADVISE_HOT_BIT 0x20
> -#define FADVISE_VERITY_BIT 0x40 /* reserved */
> +#define FADVISE_VERITY_BIT 0x40
As I suggested before, how about moving f2fs' verity bit from i_fadvise to the more
generic i_flags field, like ext4? That way we can a) keep more bits free for
demands that really need file advise fields, and b) keep the i_flags bits in line
with ext4. I'm not sure, but if users want to know whether a file is a verity one, it
will be easy for f2fs to export the status through FS_IOC_GETFLAGS.
#define EXT4_VERITY_FL 0x00100000 /* Verity protected inode */
#define F2FS_VERITY_FL 0x00100000 /* Verity protected inode */
Thanks,
Hi Eric,
Thanks for your detailed reply.
My English is not very good; I cannot write as logically and quickly as you, and I may use some words improperly.
I just want to express my personal concerns; please understand, thanks. :)
On 2018/8/25 12:16, Eric Biggers wrote:
> Hi Gao,
>
> On Sat, Aug 25, 2018 at 10:29:26AM +0800, Gao Xiang wrote:
>> Hi,
>>
>> On 2018/8/25 0:16, Eric Biggers wrote:
>>> +/**
>>> + * fsverity_verify_page - verify a data page
>>> + *
>>> + * Verify a page that has just been read from a file against that file's Merkle
>>> + * tree. The page is assumed to be a pagecache page.
>>> + *
>>> + * Return: true if the page is valid, else false.
>>> + */
>>> +bool fsverity_verify_page(struct page *data_page)
>>> +{
>>> + struct inode *inode = data_page->mapping->host;
>>> + const struct fsverity_info *vi = get_fsverity_info(inode);
>>> + struct ahash_request *req;
>>> + bool valid;
>>> +
>>> + req = ahash_request_alloc(vi->hash_alg->tfm, GFP_KERNEL);
>>> + if (unlikely(!req))
>>> + return false;
>>> +
>>> + valid = verify_page(inode, vi, req, data_page);
>>> +
>>> + ahash_request_free(req);
>>> +
>>> + return valid;
>>> +}
>>> +EXPORT_SYMBOL_GPL(fsverity_verify_page);
>>> +
>>> +/**
>>> + * fsverity_verify_bio - verify a 'read' bio that has just completed
>>> + *
>>> + * Verify a set of pages that have just been read from a file against that
>>> + * file's Merkle tree. The pages are assumed to be pagecache pages. Pages that
>>> + * fail verification are set to the Error state. Verification is skipped for
>>> + * pages already in the Error state, e.g. due to fscrypt decryption failure.
>>> + */
>>> +void fsverity_verify_bio(struct bio *bio)
>>> +{
>>> + struct inode *inode = bio_first_page_all(bio)->mapping->host;
>>> + const struct fsverity_info *vi = get_fsverity_info(inode);
>>> + struct ahash_request *req;
>>> + struct bio_vec *bv;
>>> + int i;
>>> +
>>> + req = ahash_request_alloc(vi->hash_alg->tfm, GFP_KERNEL);
>>> + if (unlikely(!req)) {
>>> + bio_for_each_segment_all(bv, bio, i)
>>> + SetPageError(bv->bv_page);
>>> + return;
>>> + }
>>> +
>>> + bio_for_each_segment_all(bv, bio, i) {
>>> + struct page *page = bv->bv_page;
>>> +
>>> + if (!PageError(page) && !verify_page(inode, vi, req, page))
>>> + SetPageError(page);
>>> + }
>>> +
>>> + ahash_request_free(req);
>>> +}
>>> +EXPORT_SYMBOL_GPL(fsverity_verify_bio);
>>
>> Out of curiosity, I quickly scanned the fs-verity source code and some minor question out there....
>>
>> If something is wrong, please point out, thanks in advance...
>>
>> My first question is that 'Is there any way to skip to verify pages in a bio?'
>> I am thinking about
>> If metadata and data page are mixed in a filesystem of such kind, they could submit together in a bio, but metadata could be unsuitable for such kind of verification.
>>
>
> Pages below i_size are verified, pages above are not.
>
> With my patches, ext4 and f2fs won't actually submit pages in both areas in the
> same bio, and they won't call the fs-verity verification function for bios in
> the data area. But even if they did, there's also a check in verify_page() that
I think you mean the hash area?
Yes, I understand your design. It is wonderful work for ext4/f2fs for now, as Ted said.
> skips the verification if the page is above i_size.
>
I think it may not be as simple as you said in all cases.
If a fs submits contiguous accesses with different mappings (something like mixed FILE_MAPPING and META_MAPPING),
the page->index values are actually unreliable (they could be a logical page index for FILE_MAPPING and a physical page index for META_MAPPING),
and the data is organized by design across multiple bios for a fs-specific use (such as compression).
You couldn't do the `if the page is above i_size' check in that case, and it could be hard to integrate.
>> The second question is related to the first question --- 'Is there any way to verify a partial page?'
>> Take scalability into consideration, some files could be totally inlined or partially inlined in metadata.
>> Is there any way to deal with them in per-file approach? at least --- support for the interface?
>
> Well, one problem is that inline data has its own separate I/O path; see
> ext4_readpage_inline() and f2fs_read_inline_data(). So it would be a large
> effort to support features like encryption and verity which require
> postprocessing after reads, and probably not worthwhile especially for verity
> which is primarily intended for large files.
Yes, for the current users, ext4 and f2fs, it is absolutely wonderful.
I have to admit I am curious about Google's fs-verity roadmap for future Android
(I need to work out whether it is designed to replace dm-verity; currently I think not),
since it matters a lot for whether our EROFS should support fs-verity in the near future.
I could describe some EROFS use cases if you have some time to discuss.
EROFS uses a more aggressive inline approach, which means it does not only inline data for small files.
It is designed to inline the last page of any file when that tail is reasonably small (e.g. only a few bytes), e.g.
                                         IN FILE_MAPPING
IN META_MAPPING                          blk-aligned
+--------------------------------------+ +--------+--------+     +----------+.......+
|inode A+inlined-last data.. inode B...| | page 0 | page 1 | ... | page n-1 . page n.
+--------------------------------------+ +--------+--------+     +----------+.......+
|------------------------------------------------------------------------------------------|\
In principle, this approach could also be used by read-write file systems to save more storage space.
I think it is still easy for uncompressed files if you do the zero padding as you said below.
But considering _compression_, especially VLE compression, I think it should not rely on `bio' directly, because:
1) endio completes with compressed data rather than FILE_MAPPING plain data; these pages could come from META_MAPPING
(for caching compressed pages on purpose) or FILE_MAPPING (for decompressing in place to save redundant META_MAPPING memory).
I think the data should be decompressed first and then passed to fs-verity, but there could be more file pages than compressed pages involved
(e.g. 128kb->32kb: we submit 8 pages but decompression ends with 32 pages), so it is no longer the original bio
(actually I think it is not the bio concept anymore).
2) EROFS VLE is more complicated: we could end a bio with a compressed page but decompress only a partial file page, e.g.
     +-------------------+--------------------+
 ... | compressed page X |compressed page X+1 |
     +-------------------+--------------------+
          end of bio Y   /   bio Y+1
              \          |          /
          +-------------------------+
          |  plain data (file page) |
          +-------------------------+
which means a bio could decompress only part of a page; the page could become Uptodate through two bios rather than one.
I have no idea how to apply fs-verity in a case like this...
`it could call fsverity after assembling the page in the page cache.`, as Ted said, would apply in that case.
>
> A somewhat separate question is whether the zero padding to a block boundary
> after i_size, before the Merkle tree begins, is needed. The answer is yes,
> since mixing data and metadata in the same page would cause problems. First,
> userspace would be able to mmap the page and see some of the metadata rather
> than zeroes. That's not a huge problem, but it breaks the standard behavior.
> Second, any page containing data cannot be set Uptodate until it's been
> verified. So, a special case would be needed to handle reading the part of the
> metadata that's located in a data page.
Yes, after thinking it over, I agree there should be zero padding to a block boundary,
as you said, because of the Uptodate and mmap scenarios if you directly use the inode (file) mapping for verification.
>
>> At last, I hope filesystems could select the on-disk position of hash tree and 'struct fsverity_descriptor'
>> rather than fixed in the end of verity files...I think if fs-verity preparing such support and interfaces could be better.....hmmm... :(
>
> In theory it would be a much cleaner design to store verity metadata separately
> from the data. But the Merkle tree can be very large. For example, a 1 GB file
> using SHA-512 would have a 16.6 MB Merkle tree. So the Merkle tree can't be an
> extended attribute, since the xattrs API requires xattrs to be small (<= 64 KB),
> and most filesystems further limit xattr sizes in their on-disk format to as
> little as 4 KB. Furthermore, even if both of these limits were to be increased,
> the xattrs functions (both the syscalls, and the internal functions that
> filesystems have) are all based around getting/setting the entire xattr value.
>
> Also when used with fscrypt, we want the Merkle tree and fsverity_descriptor to
> be encrypted, so they doesn't leak plaintext hashes. And we want the Merkle
> tree to be paged into memory, just like the file contents, to take advantage of
> the usual Linux memory management.
>
> What we really need is *streams*, like NTFS has. But the filesystems we're
> targetting don't support streams, nor does the Linux syscall interface have any
> API for accessing streams, nor does the VFS support them.
>
> Adding streams support to all those things would be a huge multi-year effort,
> controversial, and almost certainly not worth it just for fs-verity.
>
> So simply storing the verity metadata past i_size seems like the best solution
> for now.
>
> That being said, in the future we could pretty easily swap out the calls to
> read_mapping_page() with something else if a particular filesystem wanted to
> store the metadata somewhere else. We actually even originally had a function
> ->read_metadata_page() in the filesystem's fsverity_operations, but it turned
> out to be unnecessary and I replaced it with directly calling
> read_mapping_page(), but it could be changed back at any time.
OK, I got it.
I have to look into that and think it over again. Thanks again for your reply. :)
Thanks,
Gao Xiang
>
> - Eric
>
Hi Gao,
On Sat, Aug 25, 2018 at 02:31:16PM +0800, Gao Xiang wrote:
> Hi Eric,
>
> Thanks for your detailed reply.
>
> My English is not very good; I cannot write as logically and quickly as you, and I may use some words improperly.
> I just want to express my personal concerns. Please understand, thanks. :)
>
> On 2018/8/25 12:16, Eric Biggers wrote:
> > Hi Gao,
> >
> > On Sat, Aug 25, 2018 at 10:29:26AM +0800, Gao Xiang wrote:
> >> Hi,
> >>
> >> On 2018/8/25 0:16, Eric Biggers wrote:
> >>> +/**
> >>> + * fsverity_verify_page - verify a data page
> >>> + *
> >>> + * Verify a page that has just been read from a file against that file's Merkle
> >>> + * tree. The page is assumed to be a pagecache page.
> >>> + *
> >>> + * Return: true if the page is valid, else false.
> >>> + */
> >>> +bool fsverity_verify_page(struct page *data_page)
> >>> +{
> >>> + struct inode *inode = data_page->mapping->host;
> >>> + const struct fsverity_info *vi = get_fsverity_info(inode);
> >>> + struct ahash_request *req;
> >>> + bool valid;
> >>> +
> >>> + req = ahash_request_alloc(vi->hash_alg->tfm, GFP_KERNEL);
> >>> + if (unlikely(!req))
> >>> + return false;
> >>> +
> >>> + valid = verify_page(inode, vi, req, data_page);
> >>> +
> >>> + ahash_request_free(req);
> >>> +
> >>> + return valid;
> >>> +}
> >>> +EXPORT_SYMBOL_GPL(fsverity_verify_page);
> >>> +
> >>> +/**
> >>> + * fsverity_verify_bio - verify a 'read' bio that has just completed
> >>> + *
> >>> + * Verify a set of pages that have just been read from a file against that
> >>> + * file's Merkle tree. The pages are assumed to be pagecache pages. Pages that
> >>> + * fail verification are set to the Error state. Verification is skipped for
> >>> + * pages already in the Error state, e.g. due to fscrypt decryption failure.
> >>> + */
> >>> +void fsverity_verify_bio(struct bio *bio)
> >>> +{
> >>> + struct inode *inode = bio_first_page_all(bio)->mapping->host;
> >>> + const struct fsverity_info *vi = get_fsverity_info(inode);
> >>> + struct ahash_request *req;
> >>> + struct bio_vec *bv;
> >>> + int i;
> >>> +
> >>> + req = ahash_request_alloc(vi->hash_alg->tfm, GFP_KERNEL);
> >>> + if (unlikely(!req)) {
> >>> + bio_for_each_segment_all(bv, bio, i)
> >>> + SetPageError(bv->bv_page);
> >>> + return;
> >>> + }
> >>> +
> >>> + bio_for_each_segment_all(bv, bio, i) {
> >>> + struct page *page = bv->bv_page;
> >>> +
> >>> + if (!PageError(page) && !verify_page(inode, vi, req, page))
> >>> + SetPageError(page);
> >>> + }
> >>> +
> >>> + ahash_request_free(req);
> >>> +}
> >>> +EXPORT_SYMBOL_GPL(fsverity_verify_bio);
> >>
> >> Out of curiosity, I quickly scanned the fs-verity source code and have some minor questions...
> >>
> >> If I have something wrong, please point it out, thanks in advance...
> >>
> >> My first question is: 'Is there any way to skip verifying some pages in a bio?'
> >> I am thinking of the following case:
> >> if metadata and data pages are mixed in a filesystem, they could be submitted together in one bio, but the metadata could be unsuitable for this kind of verification.
> >>
> >
> > Pages below i_size are verified, pages above are not.
> >
> > With my patches, ext4 and f2fs won't actually submit pages in both areas in the
> > same bio, and they won't call the fs-verity verification function for bios in
> > the data area. But even if they did, there's also a check in verify_page() that
>
> I think you mean the hash area?
Yes, I meant the hash area.
> > skips the verification if the page is above i_size.
> >
>
> I think it might not be as simple as you said for all cases.
>
> If some fs submits contiguous accesses with different MAPPINGs (something like
> mixed FILE_MAPPING and META_MAPPING), their page->index values are actually
> unreliable (they could be a logical page index for FILE_MAPPING and a physical page
> index for META_MAPPING), and the data is organized by design across multiple bios for a
> fs-specific use (such as compression).
>
> You couldn't do such verification `if the page is above i_size', and it could
> be hard to integrate somehow.
We do have to be very careful here, but the same restriction already exists with
fscrypt which both f2fs and ext4 already support too. With fscrypt, each page
is decrypted with the key from page->mapping->host->i_crypt_info and the
initialization vector from page->index. With fs-verity, each page is verified
using the Merkle tree state from page->mapping->host->i_verify_info and the
block location from page->index. So, they are very similar.
On f2fs, any pages submitted via META_MAPPING just skip both fscrypt and
fs-verity since the "meta_inode" doesn't have either feature enabled. That's
done intentionally, so that garbage collection can move the blocks on-disk.
Regular reads aren't done via META_MAPPING.
>
> >> The second question is related to the first: 'Is there any way to verify a partial page?'
> >> Taking scalability into consideration, some files could be totally or partially inlined in metadata.
> >> Is there any way to deal with them in the per-file approach? Or at least support for it in the interface?
> >
> > Well, one problem is that inline data has its own separate I/O path; see
> > ext4_readpage_inline() and f2fs_read_inline_data(). So it would be a large
> > effort to support features like encryption and verity which require
> > postprocessing after reads, and probably not worthwhile especially for verity
> > which is primarily intended for large files.
>
> Yes, for the current users, ext4 and f2fs, it is absolutely wonderful.
>
>
> I have to admit I am curious about Google's fs-verity roadmap for future Android
> (I have to figure out whether it is designed to replace dm-verity; currently I think it is not),
>
> since it matters a lot for whether our EROFS should support fs-verity in the near future...
>
>
> I could give some EROFS use cases if you have time to discuss.
>
> EROFS uses a more aggressive inline approach, which means it does not only inline data for small files.
> It is designed to inline the last page, which is reasonably small (e.g. only a few bytes), for all files, e.g.
>
> IN FILE_MAPPING
> IN META_MAPPING blk-aligned
> +--------------------------------------| +--------+--------+ +----------+.......+
> |inode A+inlined-last data.. inode B...| | page 0 | page 1 | ... | page n-1 . page n.
> +--------------------------------------+ +--------+--------+ +----------+.......+
> |------------------------------------------------------------------------------------------|\
>
> In principle, this approach could also be used for read-write file systems to save more storage space.
> I think it is still easy for uncompressed files if you do the zero padding as you said below.
>
> But when considering _compression_... especially compression in VLE, I think it should not rely on `bio' directly, because:
> 1) endio completes with compressed data rather than FILE_MAPPING plain data; these pages could come from META_MAPPING
> (for caching compressed pages on purpose) or FILE_MAPPING (for decompressing in-place to save redundant META_MAPPING memory).
>
> I think the data should be decompressed first and then verified by fs-verity, but there could be more file pages than compressed pages involved
> (e.g. 128kb->32kb: we submit 8 pages but decompression ends with 32 pages), so it would not be the original bio any more...
> (actually I think it is not the bio concept anymore...)
>
> 2) EROFS VLE is more complicated: we could end a bio with a compressed page but decompress only part of a file page, e.g.
> +-------------------+--------------------+
> ... | compressed page X |compressed page X+1 |
> +-------------------|--------------------+
> end of bio Y/ bio Y+1
> \ | /
> +-------------------------+
> | plain data (file page)|
> +-------------------------+
> which means a bio could decompress only part of a page's data, and the page could become Uptodate through two bios rather than one.
> I have no idea how to do fs-verity in this case...
>
> `it could call fsverity after assembling the page in the page cache.` as Ted said in that case.
>
I don't know of any plan to use fs-verity on Android's system partition or to
replace dm-verity on the system partition. The use cases so far have been
verifying files on /data, like APK files.
So I don't think you need to support fs-verity in EROFS.
Re: the compression, I don't see how it would be much of a problem (even if you
did need or want to add fs-verity support). Assuming that the verification is
done over the uncompressed version of the data, you wouldn't verify the pages
directly from the bio's page list since those would contain compressed data.
But even without fs-verity you'd need to decompress the data into pagecache
pages... so you could just call fsverity_verify_page() on each of those
decompressed pages before unlocking them and setting them Uptodate. You don't
*have* to call fsverity_verify_bio() to do the verification; it's just a helper
for the case where the list of pages to verify happens to be in a completed bio.
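The ordering described above (decompress, verify each plaintext page, and only then mark it Uptodate and unlock it) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct model_page, model_verify_page() and model_finish_decompressed_pages() are hypothetical stand-ins, not kernel APIs, and the data_ok flag stands in for the real Merkle tree check that fsverity_verify_page() would perform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pagecache page; not the kernel's struct page. */
struct model_page {
    bool locked;   /* page lock is held by the read path */
    bool uptodate; /* contents may be exposed to readers */
    bool error;    /* read, decompression, or verification failed */
    bool data_ok;  /* stand-in for "contents match the Merkle tree" */
};

/* Stand-in for fsverity_verify_page(); like the real one, it expects a locked page. */
static bool model_verify_page(const struct model_page *page)
{
    return page->locked && page->data_ok;
}

/*
 * Finish a read once the plaintext pages have been filled by decompression.
 * Each page is verified while still locked; it becomes Uptodate only if
 * verification succeeds, and is unlocked last. Pages already in the Error
 * state are not verified. Returns the number of pages left in Error.
 */
static size_t model_finish_decompressed_pages(struct model_page *pages, size_t n)
{
    size_t failed = 0;

    for (size_t i = 0; i < n; i++) {
        struct model_page *page = &pages[i];

        if (!page->error && model_verify_page(page)) {
            page->uptodate = true;
        } else {
            page->error = true;
            failed++;
        }
        page->locked = false; /* unlock only after the state is final */
    }
    return failed;
}
```

Because the loop works on whichever plaintext pages are complete, it does not matter how many bios supplied the compressed input for a given page.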
> >
> > A somewhat separate question is whether the zero padding to a block boundary
> > after i_size, before the Merkle tree begins, is needed. The answer is yes,
> > since mixing data and metadata in the same page would cause problems. First,
> > userspace would be able to mmap the page and see some of the metadata rather
> > than zeroes. That's not a huge problem, but it breaks the standard behavior.
> > Second, any page containing data cannot be set Uptodate until it's been
> > verified. So, a special case would be needed to handle reading the part of the
> > metadata that's located in a data page.
>
> Yes, after thinking it over, I agree there should be zero padding to a block boundary,
> as you said, because of the Uptodate and mmap scenarios if you directly use the inode's (file) mapping for verification.
>
>
> >
> >> Finally, I hope filesystems could choose the on-disk position of the hash tree and 'struct fsverity_descriptor'
> >> rather than having them fixed at the end of verity files... I think it would be better if fs-verity prepared such support and interfaces... hmm... :(
> >
> > In theory it would be a much cleaner design to store verity metadata separately
> > from the data. But the Merkle tree can be very large. For example, a 1 GB file
> > using SHA-512 would have a 16.6 MB Merkle tree. So the Merkle tree can't be an
> > extended attribute, since the xattrs API requires xattrs to be small (<= 64 KB),
> > and most filesystems further limit xattr sizes in their on-disk format to as
> > little as 4 KB. Furthermore, even if both of these limits were to be increased,
> > the xattrs functions (both the syscalls, and the internal functions that
> > filesystems have) are all based around getting/setting the entire xattr value.
> >
> > Also when used with fscrypt, we want the Merkle tree and fsverity_descriptor to
> > be encrypted, so they don't leak plaintext hashes. And we want the Merkle
> > tree to be paged into memory, just like the file contents, to take advantage of
> > the usual Linux memory management.
> >
> > What we really need is *streams*, like NTFS has. But the filesystems we're
> > targeting don't support streams, nor does the Linux syscall interface have any
> > API for accessing streams, nor does the VFS support them.
> >
> > Adding streams support to all those things would be a huge multi-year effort,
> > controversial, and almost certainly not worth it just for fs-verity.
> >
> > So simply storing the verity metadata past i_size seems like the best solution
> > for now.
> >
> > That being said, in the future we could pretty easily swap out the calls to
> > read_mapping_page() with something else if a particular filesystem wanted to
> > store the metadata somewhere else. We actually even originally had a function
> > ->read_metadata_page() in the filesystem's fsverity_operations, but it turned
> > out to be unnecessary and I replaced it with directly calling
> > read_mapping_page(), but it could be changed back at any time.
>
> OK, I got it.
>
> I have to look into that and think it over again. Thanks again for your reply. :)
>
> Thanks,
> Gao Xiang
>
- Eric
Hi Ted,
Thanks for your detailed reply. Sorry about my English; my wording may not be very logical.
Composing a page from tiny pieces in a B-tree is too far off for us as well, and you are right:
fs-verity is complete for >99% of cases for existing file systems, and there is no need to worry about this currently.
As I mentioned in my reply to Eric, I am actually curious about Google's fs-verity roadmap
for future Android. I need to analyze whether it is limited to APKs on the read-write partitions
and will not replace dm-verity in the near future, since fs-verity has some conflicts
with EROFS, which I am working on, as I mentioned in the email to Eric.
I think it takes more than just handling FILE_MAPPING and being bio-strict for compression use.
On 2018/8/25 13:06, Theodore Y. Ts'o wrote:
> But I'd suggest worrying about it when such a file system
> comes out of the woodwork, and someone is willing to do the work to
> integrate fsverity in that file system.
>
Yes, we are now handling partial pages due to compression use.
A fs could submit bios with pages from different mappings (FILE_MAPPING [compress in-place, without caching
compressed pages, to reduce extra memory overhead] or META_MAPPING [for caching compressed pages]), and
they could be decompressed into many full pages and (possibly) a partial page (in-place or out-of-place).
So in principle, since we have the BIO_MAX_PAGES limitation, a filemap page could become Uptodate
only after two bios have ended and been decompressed, and other runtime limitations could also split a bio into two bios for encoded cases.
Therefore, I think in that case we could not just consider FILE_MAPPING and one bio, and as you said, `In
that case, it could call fsverity after assembling the page in the page cache.' It should be done in this way.
> Well, the userspace interface for instantiating a fs-verity file is
> that it writes the file data with the fs-verity metadata (which
> consists of the Merkle tree with a fs-verity header at the end of the
> file). The program (which might be a package manager such as dpkg or
> rpm) would then call an ioctl which would cause the file system to
> read the fs-verity header and make only the file data visible, and the
> file system would then verify the data as it is read into the page
> cache.
Thanks for your reply again, I think fs-verity is good enough for now.
However, I need to think more about fs-verity itself... :(
Thanks,
Gao Xiang
Hi Eric,
On 2018/8/25 15:18, Eric Biggers wrote:
> We do have to be very careful here, but the same restriction already exists with
> fscrypt which both f2fs and ext4 already support too. With fscrypt, each page
> is decrypted with the key from page->mapping->host->i_crypt_info and the
> initialization vector from page->index. With fs-verity, each page is verified
> using the Merkle tree state from page->mapping->host->i_verify_info and the
> block location from page->index. So, they are very similar.
>
> On f2fs, any pages submitted via META_MAPPING just skip both fscrypt and
> fs-verity since the "meta_inode" doesn't have either feature enabled. That's
> done intentionally, so that garbage collection can move the blocks on-disk.
> Regular reads aren't done via META_MAPPING.
>
I think you deal with the existing cases quite well; I was just thinking about EROFS... :)
> I don't know of any plan to use fs-verity on Android's system partition or to
> replace dm-verity on the system partition. The use cases so far have been
> verifying files on /data, like APK files.
>
> So I don't think you need to support fs-verity in EROFS.
>
Thanks for the information about fs-verity; it is quite useful for us.
Actually, I had been worrying about that for months... :)
> Re: the compression, I don't see how it would be much of a problem (even if you
> did need or want to add fs-verity support). Assuming that the verification is
> done over the uncompressed version of the data, you wouldn't verify the pages
> directly from the bio's page list since those would contain compressed data.
> But even without fs-verity you'd need to decompress the data into pagecache
> pages... so you could just call fsverity_verify_page() on each of those
> decompressed pages before unlocking them and setting them Uptodate. You don't
> *have* to call fsverity_verify_bio() to do the verification; it's just a helper
> for the case where the list of pages to verify happens to be in a completed bio.
>
I haven't looked into all the patches yet; I will look into them carefully once I finish my current work.
It is wonderful to have such a helper --- fsverity_verify_page :)
I have no other questions currently, and I look forward to your final implementation.
Best Regards,
Gao Xiang
> - Eric
Hi Ted,
Please ignore the following email; Eric has replied to me. :)
I need to dig into these fs-verity patches later. Best wishes to fs-verity.
Thanks,
Gao Xiang
On 2018/8/25 15:33, Gao Xiang wrote:
> Hi Ted,
>
> Thanks for your detailed reply. Sorry about my English; my wording may not be very logical.
>
> Composing a page from tiny pieces in a B-tree is too far off for us as well, and you are right:
> fs-verity is complete for >99% of cases for existing file systems, and there is no need to worry about this currently.
>
> As I mentioned in my reply to Eric, I am actually curious about Google's fs-verity roadmap
> for future Android. I need to analyze whether it is limited to APKs on the read-write partitions
> and will not replace dm-verity in the near future, since fs-verity has some conflicts
> with EROFS, which I am working on, as I mentioned in the email to Eric.
>
> I think it takes more than just handling FILE_MAPPING and being bio-strict for compression use.
>
> On 2018/8/25 13:06, Theodore Y. Ts'o wrote:
>> But I'd suggest worrying about it when such a file system
>> comes out of the woodwork, and someone is willing to do the work to
>> integrate fsverity in that file system.
>>
> Yes, we are now handling partial pages due to compression use.
>
> A fs could submit bios with pages from different mappings (FILE_MAPPING [compress in-place, without caching
> compressed pages, to reduce extra memory overhead] or META_MAPPING [for caching compressed pages]), and
> they could be decompressed into many full pages and (possibly) a partial page (in-place or out-of-place).
>
> So in principle, since we have the BIO_MAX_PAGES limitation, a filemap page could become Uptodate
> only after two bios have ended and been decompressed, and other runtime limitations could also split a bio into two bios for encoded cases.
>
> Therefore, I think in that case we could not just consider FILE_MAPPING and one bio, and as you said, `In
> that case, it could call fsverity after assembling the page in the page cache.' It should be done in this way.
>
>> Well, the userspace interface for instantiating a fs-verity file is
>> that it writes the file data with the fs-verity metadata (which
>> consists of the Merkle tree with a fs-verity header at the end of the
>> file). The program (which might be a package manager such as dpkg or
>> rpm) would then call an ioctl which would cause the file system to
>> read the fs-verity header and make only the file data visible, and the
>> file system would then verify the data as it is read into the page
>> cache.
> Thanks for your reply again, I think fs-verity is good enough for now.
> However, I need to think more about fs-verity itself... :(
>
> Thanks,
> Gao Xiang
On Sat, Aug 25, 2018 at 03:43:43PM +0800, Gao Xiang wrote:
> > I don't know of any plan to use fs-verity on Android's system partition or to
> > replace dm-verity on the system partition. The use cases so far have been
> > verifying files on /data, like APK files.
> >
> > So I don't think you need to support fs-verity in EROFS.
>
> Thanks for the information about fs-verity; it is quite useful for us.
> Actually, I had been worrying about that for months... :)
I'll be even clearer --- I can't *imagine* any situation where it
would make sense to use fs-verity on the Android system partition.
Remember, for OTA to work the system image has to be bit-for-bit
identical to the official golden image for that release. So the
system image has to be completely locked down from any modification
(to data or metadata), and that means dm-verity and *NOT* fs-verity.
The initial use of fs-verity (as you can see if you look at AOSP) will
be to protect a small number of privileged APK's that are stored on
the data partition. Previously, they were verified when they were
downloaded, and never again.
Part of the goal which we are trying to achieve here is that even if
the kernel gets compromised by a 0-day, a successful reboot should
restore the system to a known state. That is, the secure bootloader
checks the signature of the kernel, and then in turn, dm-verity will
verify the root Merkle hash protecting the system partition, and
fs-verity will protect the privileged APK's. If malware modifies any
of these components in an attempt to be persistent, the modifications
would be detected, and the worst it could do is to cause subsequent
reboots to fail until the phone's software could be reflashed.
Cheers,
- Ted
Hi Ted,
Sorry for the late reply...
On 2018/8/26 1:06, Theodore Y. Ts'o wrote:
> On Sat, Aug 25, 2018 at 03:43:43PM +0800, Gao Xiang wrote:
>>> I don't know of any plan to use fs-verity on Android's system partition or to
>>> replace dm-verity on the system partition. The use cases so far have been
>>> verifying files on /data, like APK files.
>>>
>>> So I don't think you need to support fs-verity in EROFS.
>>
>> Thanks for the information about fs-verity; it is quite useful for us.
>> Actually, I had been worrying about that for months... :)
>
> I'll be even clearer --- I can't *imagine* any situation where it
> would make sense to use fs-verity on the Android system partition.
> Remember, for OTA to work the system image has to be bit-for-bit
> identical to the official golden image for that release. So the
> system image has to be completely locked down from any modification
> (to data or metadata), and that means dm-verity and *NOT* fs-verity.
I think so too, mainly because of the security reasons you gave above.
In addition, I think it is mandatory that the Android system partition
should also _never_ suffer from filesystem corruption by design (except
for storage device corruption or malware). Therefore I think the
bit-for-bit read-only, identical-verity requirement is quite strong
for Android, which will make the Android system steady and as solid as
a rock.
But I needed to confirm my personal thoughts on this topic. :)
>
> The initial use of fs-verity (as you can see if you look at AOSP) will
> be to protect a small number of privileged APK's that are stored on
> the data partition. Previously, they were verified when they were
> downloaded, and never again.
>
> Part of the goal which we are trying to achieve here is that even if
> the kernel gets compromised by a 0-day, a successful reboot should
> restore the system to a known state. That is, the secure bootloader
> checks the signature of the kernel, and then in turn, dm-verity will
> verify the root Merkle hash protecting the system partition, and
> fs-verity will protect the privileged APK's. If malware modifies any
> of these components in an attempt to be persistent, the modifications
> would be detected, and the worst it could do is to cause subsequent
> reboots to fail until the phone's software could be reflashed.
>
Yeah, I have seen the fs-verity presentation and materials from
Android bootcamp and other official channels before.
Thanks for your kind and detailed explanation. :)
Best regards,
Gao Xiang
> Cheers,
>
> - Ted
>
> On Aug 24, 2018, at 12:16 PM, Eric Biggers <[email protected]> wrote:
>
> From: Eric Biggers <[email protected]>
>
> Add functions that verify data pages that have been read from a
> fs-verity file, against that file's Merkle tree. These will be called
> from filesystems' ->readpage() and ->readpages() methods.
>
> Since data verification can block, a workqueue is provided for these
> methods to enqueue verification work from their bio completion callback.
>
> Signed-off-by: Eric Biggers <[email protected]>
> ---
> fs/verity/Makefile | 2 +-
> fs/verity/fsverity_private.h | 3 +
> fs/verity/setup.c | 26 ++-
> fs/verity/verify.c | 310 +++++++++++++++++++++++++++++++++++
> include/linux/fsverity.h | 23 +++
> 5 files changed, 362 insertions(+), 2 deletions(-)
> create mode 100644 fs/verity/verify.c
>
> diff --git a/fs/verity/Makefile b/fs/verity/Makefile
> index 39e123805c827..a6c7cefb61ab7 100644
> --- a/fs/verity/Makefile
> +++ b/fs/verity/Makefile
> @@ -1,3 +1,3 @@
> obj-$(CONFIG_FS_VERITY) += fsverity.o
>
> -fsverity-y := hash_algs.o setup.o
> +fsverity-y := hash_algs.o setup.o verify.o
> diff --git a/fs/verity/fsverity_private.h b/fs/verity/fsverity_private.h
> index a18ff645695f4..c553f99dc4973 100644
> --- a/fs/verity/fsverity_private.h
> +++ b/fs/verity/fsverity_private.h
> @@ -96,4 +96,7 @@ static inline bool set_fsverity_info(struct inode *inode,
> return true;
> }
>
> +/* verify.c */
> +extern struct workqueue_struct *fsverity_read_workqueue;
> +
> #endif /* _FSVERITY_PRIVATE_H */
> diff --git a/fs/verity/setup.c b/fs/verity/setup.c
> index e675c52898d5b..84cc2edeca25b 100644
> --- a/fs/verity/setup.c
> +++ b/fs/verity/setup.c
> @@ -824,18 +824,42 @@ EXPORT_SYMBOL_GPL(fsverity_full_i_size);
>
> static int __init fsverity_module_init(void)
> {
> + int err;
> +
> + /*
> + * Use an unbound workqueue to allow bios to be verified in parallel
> + * even when they happen to complete on the same CPU. This sacrifices
> + * locality, but it's worthwhile since hashing is CPU-intensive.
> + *
> + * Also use a high-priority workqueue to prioritize verification work,
> + * which blocks reads from completing, over regular application tasks.
> + */
> + err = -ENOMEM;
> + fsverity_read_workqueue = alloc_workqueue("fsverity_read_queue",
> + WQ_UNBOUND | WQ_HIGHPRI,
> + num_online_cpus());
> + if (!fsverity_read_workqueue)
> + goto error;
> +
> + err = -ENOMEM;
> fsverity_info_cachep = KMEM_CACHE(fsverity_info, SLAB_RECLAIM_ACCOUNT);
> if (!fsverity_info_cachep)
> - return -ENOMEM;
> + goto error_free_workqueue;
>
> fsverity_check_hash_algs();
>
> pr_debug("Initialized fs-verity\n");
> return 0;
> +
> +error_free_workqueue:
> + destroy_workqueue(fsverity_read_workqueue);
> +error:
> + return err;
> }
>
> static void __exit fsverity_module_exit(void)
> {
> + destroy_workqueue(fsverity_read_workqueue);
> kmem_cache_destroy(fsverity_info_cachep);
> fsverity_exit_hash_algs();
> }
> diff --git a/fs/verity/verify.c b/fs/verity/verify.c
> new file mode 100644
> index 0000000000000..1452dd05f75d3
> --- /dev/null
> +++ b/fs/verity/verify.c
> @@ -0,0 +1,310 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * fs/verity/verify.c: fs-verity data verification functions,
> + * i.e. hooks for ->readpages()
> + *
> + * Copyright (C) 2018 Google LLC
> + *
> + * Originally written by Jaegeuk Kim and Michael Halcrow;
> + * heavily rewritten by Eric Biggers.
> + */
> +
> +#include "fsverity_private.h"
> +
> +#include <crypto/hash.h>
> +#include <linux/bio.h>
> +#include <linux/pagemap.h>
> +#include <linux/ratelimit.h>
> +#include <linux/scatterlist.h>
> +
> +struct workqueue_struct *fsverity_read_workqueue;
> +
> +/**
> + * hash_at_level() - compute the location of the block's hash at the given level
> + *
> + * @vi: (in) the file's verity info
> + * @dindex: (in) the index of the data block being verified
> + * @level: (in) the level of hash we want
> + * @hindex: (out) the index of the hash block containing the wanted hash
> + * @hoffset: (out) the byte offset to the wanted hash within the hash block
> + */
> +static void hash_at_level(const struct fsverity_info *vi, pgoff_t dindex,
> + unsigned int level, pgoff_t *hindex,
> + unsigned int *hoffset)
> +{
> + pgoff_t hoffset_in_lvl;
> +
> + /*
> + * Compute the offset of the hash within the level's region, in hashes.
> + * For example, with 4096-byte blocks and 32-byte hashes, there are
> + * 4096/32 = 128 = 2^7 hashes per hash block, i.e. log_arity = 7. Then,
> + * if the data block index is 65668 and we want the level 1 hash, it is
> + * located at 65668 >> 7 = 513 hashes into the level 1 region.
> + */
> + hoffset_in_lvl = dindex >> (level * vi->log_arity);
> +
> + /*
> + * Compute the index of the hash block containing the wanted hash.
> + * Continuing the above example, the block would be at index 513 >> 7 =
> + * 4 within the level 1 region. To this we'd add the index at which the
> + * level 1 region starts.
> + */
> + *hindex = vi->hash_lvl_region_idx[level] +
> + (hoffset_in_lvl >> vi->log_arity);
> +
> + /*
> + * Finally, compute the index of the hash within the block rather than
> + * the region, and multiply by the hash size to turn it into a byte
> + * offset. Continuing the above example, the hash would be at byte
> + * offset (513 & ((1 << 7) - 1)) * 32 = 32 within the block.
> + */
> + *hoffset = (hoffset_in_lvl & ((1 << vi->log_arity) - 1)) *
> + vi->hash_alg->digest_size;
> +}
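The arithmetic in the hash_at_level() comments can be checked with a small userspace program. The sketch below mirrors only the shifts and masks; struct hash_location and locate_hash() are hypothetical names, and the per-level region start (vi->hash_lvl_region_idx[level] in the patch) is left out, so the block index here is relative to the start of the level's region.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type; the patch returns hindex and hoffset instead. */
struct hash_location {
    uint64_t offset_in_level; /* position within the level's region, in hashes */
    uint64_t block_in_level;  /* hash block index relative to the region start */
    uint32_t byte_offset;     /* byte offset of the hash within that block */
};

/*
 * Userspace mirror of the index arithmetic in hash_at_level().
 * log_arity is log2(hashes per hash block), e.g. 7 for 4096-byte blocks
 * and 32-byte digests.
 */
static struct hash_location locate_hash(uint64_t dindex, unsigned int level,
                                        unsigned int log_arity,
                                        uint32_t digest_size)
{
    struct hash_location loc;

    /* Keep every shift count below the width of uint64_t. */
    assert(log_arity > 0 && (level + 1) * log_arity < 64);

    loc.offset_in_level = dindex >> (level * log_arity);
    loc.block_in_level = loc.offset_in_level >> log_arity;
    loc.byte_offset =
        (uint32_t)(loc.offset_in_level & ((1ULL << log_arity) - 1)) *
        digest_size;
    return loc;
}
```

With the values from the comments (data block index 65668, level 1, log_arity 7, 32-byte hashes), this gives 513 hashes into the region, block 4 within the region, and byte offset 32, matching the numbers stated there.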
> +
> +/* Extract a hash from a hash page */
> +static void extract_hash(struct page *hpage, unsigned int hoffset,
> + unsigned int hsize, u8 *out)
> +{
> + void *virt = kmap_atomic(hpage);
> +
> + memcpy(out, virt + hoffset, hsize);
> + kunmap_atomic(virt);
> +}
> +
> +static int hash_page(const struct fsverity_info *vi, struct ahash_request *req,
> + struct page *page, u8 *out)
> +{
> + struct scatterlist sg[3];
> + DECLARE_CRYPTO_WAIT(wait);
> + int err;
> +
> + sg_init_table(sg, 1);
> + sg_set_page(&sg[0], page, PAGE_SIZE, 0);
> +
> + ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP |
> + CRYPTO_TFM_REQ_MAY_BACKLOG,
> + crypto_req_done, &wait);
> + ahash_request_set_crypt(req, sg, out, PAGE_SIZE);
> +
> + err = crypto_ahash_import(req, vi->hashstate);
> + if (err)
> + return err;
> +
> + return crypto_wait_req(crypto_ahash_finup(req), &wait);
> +}
> +
> +static inline int compare_hashes(const u8 *want_hash, const u8 *real_hash,
> + int digest_size, struct inode *inode,
> + pgoff_t index, int level, const char *algname)
> +{
> + if (memcmp(want_hash, real_hash, digest_size) == 0)
> + return 0;
> +
> + pr_warn_ratelimited("VERIFICATION FAILURE! ino=%lu, index=%lu, level=%d, want_hash=%s:%*phN, real_hash=%s:%*phN\n",
> + inode->i_ino, index, level,
> + algname, digest_size, want_hash,
> + algname, digest_size, real_hash);
> + return -EBADMSG;
> +}
> +
> +/*
> + * Verify a single data page against the file's Merkle tree.
> + *
> + * In principle, we need to verify the entire path to the root node. But as an
> + * optimization, we cache the hash pages in the file's page cache, similar to
> + * data pages. Therefore, we can stop verifying as soon as a verified hash page
> + * is seen while ascending the tree.
> + *
> + * Note that unlike data pages, hash pages are marked Uptodate *before* they are
> + * verified; instead, the Checked bit is set on hash pages that have been
> + * verified. Multiple tasks may race to verify a hash page and mark it Checked,
> + * but it doesn't matter. The use of the Checked bit also implies that the hash
> + * block size must equal PAGE_SIZE (for now).
> + */
> +static bool verify_page(struct inode *inode, const struct fsverity_info *vi,
> + struct ahash_request *req, struct page *data_page)
> +{
> + pgoff_t index = data_page->index;
> + int level = 0;
> + u8 _want_hash[FS_VERITY_MAX_DIGEST_SIZE];
> + const u8 *want_hash = NULL;
> + u8 real_hash[FS_VERITY_MAX_DIGEST_SIZE];
> + struct page *hpages[FS_VERITY_MAX_LEVELS];
> + unsigned int hoffsets[FS_VERITY_MAX_LEVELS];
> + int err;
> +
> + /* The page must not be unlocked until verification has completed. */
> + if (WARN_ON_ONCE(!PageLocked(data_page)))
> + return false;
> +
> + /*
> + * Since ->i_size is overridden with ->data_i_size, and fs-verity avoids
> + * recursing into itself when reading hash pages, we shouldn't normally
> + * get here with a page beyond ->data_i_size. But, it can happen if a
> + * read is issued at or beyond EOF since the VFS doesn't check i_size
> + * before calling ->readpage(). Thus, just skip verification if the
> + * page is beyond ->data_i_size.
> + */
> + if (index >= (vi->data_i_size + PAGE_SIZE - 1) >> PAGE_SHIFT) {
> + pr_debug("Page %lu is in metadata region\n", index);
> + return true;
> + }
> +
> + pr_debug_ratelimited("Verifying data page %lu...\n", index);
> +
> + /*
> + * Starting at the leaves, ascend the tree saving hash pages along the
> + * way until we find a verified hash page, indicated by PageChecked; or
> + * until we reach the root.
> + */
> + for (level = 0; level < vi->depth; level++) {
> + pgoff_t hindex;
> + unsigned int hoffset;
> + struct page *hpage;
> +
> + hash_at_level(vi, index, level, &hindex, &hoffset);
> +
> + pr_debug_ratelimited("Level %d: hindex=%lu, hoffset=%u\n",
> + level, hindex, hoffset);
> +
> + hpage = read_mapping_page(inode->i_mapping, hindex, NULL);
> + if (IS_ERR(hpage)) {
> + err = PTR_ERR(hpage);
> + goto out;
> + }
> +
> + if (PageChecked(hpage)) {
> + extract_hash(hpage, hoffset, vi->hash_alg->digest_size,
> + _want_hash);
> + want_hash = _want_hash;
> + put_page(hpage);
> + pr_debug_ratelimited("Hash page already checked, want %s:%*phN\n",
> + vi->hash_alg->name,
> + vi->hash_alg->digest_size,
> + want_hash);
> + break;
> + }
> + pr_debug_ratelimited("Hash page not yet checked\n");
> + hpages[level] = hpage;
> + hoffsets[level] = hoffset;
> + }
> +
> + if (!want_hash) {
> + want_hash = vi->root_hash;
> + pr_debug("Want root hash: %s:%*phN\n", vi->hash_alg->name,
> + vi->hash_alg->digest_size, want_hash);
> + }
> +
> + /* Descend the tree verifying hash pages */
> + for (; level > 0; level--) {
> + struct page *hpage = hpages[level - 1];
> + unsigned int hoffset = hoffsets[level - 1];
> +
> + err = hash_page(vi, req, hpage, real_hash);
> + if (err)
> + goto out;
> + err = compare_hashes(want_hash, real_hash,
> + vi->hash_alg->digest_size,
> + inode, index, level - 1,
> + vi->hash_alg->name);
> + if (err)
> + goto out;
> + SetPageChecked(hpage);
> + extract_hash(hpage, hoffset, vi->hash_alg->digest_size,
> + _want_hash);
> + want_hash = _want_hash;
> + put_page(hpage);
> + pr_debug("Verified hash page at level %d, now want %s:%*phN\n",
> + level - 1, vi->hash_alg->name,
> + vi->hash_alg->digest_size, want_hash);
> + }
> +
> + /* Finally, verify the data page */
> + err = hash_page(vi, req, data_page, real_hash);
> + if (err)
> + goto out;
> + err = compare_hashes(want_hash, real_hash, vi->hash_alg->digest_size,
> + inode, index, -1, vi->hash_alg->name);
> +out:
> + for (; level > 0; level--)
> + put_page(hpages[level - 1]);
> + if (err) {
> + pr_warn_ratelimited("Error verifying page; ino=%lu, index=%lu (err=%d)\n",
> + inode->i_ino, data_page->index, err);
> + return false;
> + }
> + return true;
> +}
> +
> +/**
> + * fsverity_verify_page - verify a data page
> + *
> + * Verify a page that has just been read from a file against that file's Merkle
> + * tree. The page is assumed to be a pagecache page.
> + *
> + * Return: true if the page is valid, else false.
> + */
> +bool fsverity_verify_page(struct page *data_page)
> +{
> + struct inode *inode = data_page->mapping->host;
> + const struct fsverity_info *vi = get_fsverity_info(inode);
> + struct ahash_request *req;
> + bool valid;
> +
> + req = ahash_request_alloc(vi->hash_alg->tfm, GFP_KERNEL);
> + if (unlikely(!req))
> + return false;
> +
> + valid = verify_page(inode, vi, req, data_page);
> +
> + ahash_request_free(req);
> +
> + return valid;
> +}
> +EXPORT_SYMBOL_GPL(fsverity_verify_page);
> +
> +/**
> + * fsverity_verify_bio - verify a 'read' bio that has just completed
> + *
> + * Verify a set of pages that have just been read from a file against that
> + * file's Merkle tree. The pages are assumed to be pagecache pages. Pages that
> + * fail verification are set to the Error state. Verification is skipped for
> + * pages already in the Error state, e.g. due to fscrypt decryption failure.
> + */
> +void fsverity_verify_bio(struct bio *bio)
Hi Eric-
This kind of API won't work for remote filesystems, which do not use
"struct bio" to do their I/O. Could a remote filesystem solely use
fsverity_verify_page instead?
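To make the question concrete, here is a rough userspace sketch of the kind of per-page loop I have in mind. It is not kernel code: struct mock_page, mock_verify_page() and verify_pages() are made-up stand-ins, and the only thing taken from the patch is the policy of fsverity_verify_bio() (skip pages already in the Error state, flag pages that fail verification).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for struct page; holds only what the sketch needs. */
struct mock_page {
	unsigned long index;
	bool error;	/* models PageError() / SetPageError() */
	bool corrupt;	/* test knob: makes the mock verifier fail */
};

/* Stand-in for fsverity_verify_page(): returns true if the page is valid. */
static bool mock_verify_page(const struct mock_page *page)
{
	return !page->corrupt;
}

/*
 * Hypothetical read-completion helper for a filesystem that has no
 * struct bio: walk a plain array of pages, skip those already in error,
 * and mark the ones that fail verification.
 * Returns the number of pages left in the error state.
 */
static size_t verify_pages(struct mock_page *pages, size_t n)
{
	size_t bad = 0;

	for (size_t i = 0; i < n; i++) {
		if (!pages[i].error && !mock_verify_page(&pages[i]))
			pages[i].error = true;
		if (pages[i].error)
			bad++;
	}
	return bad;
}
```

If something along these lines is acceptable, the bio variant would just be a convenience wrapper for block-based filesystems.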
> +{
> + struct inode *inode = bio_first_page_all(bio)->mapping->host;
> + const struct fsverity_info *vi = get_fsverity_info(inode);
> + struct ahash_request *req;
> + struct bio_vec *bv;
> + int i;
> +
> + req = ahash_request_alloc(vi->hash_alg->tfm, GFP_KERNEL);
> + if (unlikely(!req)) {
> + bio_for_each_segment_all(bv, bio, i)
> + SetPageError(bv->bv_page);
> + return;
> + }
> +
> + bio_for_each_segment_all(bv, bio, i) {
> + struct page *page = bv->bv_page;
> +
> + if (!PageError(page) && !verify_page(inode, vi, req, page))
> + SetPageError(page);
> + }
> +
> + ahash_request_free(req);
> +}
> +EXPORT_SYMBOL_GPL(fsverity_verify_bio);
> +
> +/**
> + * fsverity_enqueue_verify_work - enqueue work on the fs-verity workqueue
> + *
> + * Enqueue verification work for asynchronous processing.
> + */
> +void fsverity_enqueue_verify_work(struct work_struct *work)
> +{
> + queue_work(fsverity_read_workqueue, work);
> +}
> +EXPORT_SYMBOL_GPL(fsverity_enqueue_verify_work);
> diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
> index 3af55241046aa..56341f10aa965 100644
> --- a/include/linux/fsverity.h
> +++ b/include/linux/fsverity.h
> @@ -28,6 +28,11 @@ extern int fsverity_prepare_getattr(struct inode *inode);
> extern void fsverity_cleanup_inode(struct inode *inode);
> extern loff_t fsverity_full_i_size(const struct inode *inode);
>
> +/* verify.c */
> +extern bool fsverity_verify_page(struct page *page);
> +extern void fsverity_verify_bio(struct bio *bio);
> +extern void fsverity_enqueue_verify_work(struct work_struct *work);
> +
> #else /* !__FS_HAS_VERITY */
>
> /* setup.c */
> @@ -57,6 +62,24 @@ static inline loff_t fsverity_full_i_size(const struct inode *inode)
> return i_size_read(inode);
> }
>
> +/* verify.c */
> +
> +static inline bool fsverity_verify_page(struct page *page)
> +{
> + WARN_ON(1);
> + return false;
> +}
> +
> +static inline void fsverity_verify_bio(struct bio *bio)
> +{
> + WARN_ON(1);
> +}
> +
> +static inline void fsverity_enqueue_verify_work(struct work_struct *work)
> +{
> + WARN_ON(1);
> +}
> +
> #endif /* !__FS_HAS_VERITY */
>
> #endif /* _LINUX_FSVERITY_H */
> --
> 2.18.0
>
--
Chuck Lever
[email protected]
Hi Eric-
Context: I'm working on IMA support for NFSv4, and would like to
use fs-verity (or some Merkle tree-like mechanism) eventually to
help address the performance impacts of using IMA with large NFS
files.
> On Aug 24, 2018, at 12:16 PM, Eric Biggers <[email protected]> wrote:
>
> From: Eric Biggers <[email protected]>
>
> fs-verity is a filesystem feature that provides efficient, transparent
> integrity verification and authentication of read-only files. It uses a
> dm-verity like mechanism at the file level: a Merkle tree hidden past
> the end of the file is used to verify any block in the file in
> log(filesize) time. It is implemented mainly by helper functions in
> fs/verity/ that will be shared by multiple filesystems.
This description suggests that the only way fs-verity can work is
by placing the Merkle tree data after EOF. Further, this organization
is exposed to user space, making it a fixed part of the fs-verity
kernel/user space API.
Remote filesystems -- especially NFS -- would prefer to manage the
Merkle tree data in other ways. The NFSv4 protocol, for example,
supports named streams (as some other filesystems do), and could
store the Merkle trees in those. Or, a new pNFS layout type could be
constructed where Merkle trees are stored separately from a file's
content -- perhaps even on a separate file server.
File servers can store this data as the servers' local filesystems
require.
Sharing how the Merkle tree is created and used is sensible, but
IMHO the filesystem implementations should be allowed to store this
tree however they find convenient. The Merkle trees should be
exposed via a clean API, not as part of the file's content.
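As a strawman only, the separation could look something like the userspace sketch below. None of these names exist in the patch: merkle_store_ops, the two backends, and merkle_fetch_hash() are hypothetical, and the 4096-byte hash block size is an assumption. The point is that the shared code asks for "tree block N" and each filesystem decides where that block lives.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HASH_BLOCK_SIZE 4096u	/* assumed: hash block size == page size */

/* Hypothetical per-filesystem hook: fetch one Merkle tree block by index. */
struct merkle_store_ops {
	/* Copy tree block 'index' into 'buf'. Returns 0, or -1 if out of range. */
	int (*read_hash_block)(const void *ctx, uint64_t index, uint8_t *buf);
};

/* Backend A: tree stored in the same image as the data, past the data's end. */
struct past_eof_ctx {
	const uint8_t *image;
	uint64_t image_size;
	uint64_t tree_start;	/* byte offset of the first tree block */
};

static int past_eof_read(const void *ctx, uint64_t index, uint8_t *buf)
{
	const struct past_eof_ctx *c = ctx;
	uint64_t nblocks;

	if (c->tree_start > c->image_size)
		return -1;
	nblocks = (c->image_size - c->tree_start) / HASH_BLOCK_SIZE;
	if (index >= nblocks)
		return -1;
	memcpy(buf, c->image + c->tree_start + index * HASH_BLOCK_SIZE,
	       HASH_BLOCK_SIZE);
	return 0;
}

/* Backend B: tree kept apart from the data, e.g. in a named stream. */
struct stream_ctx {
	const uint8_t *tree;
	uint64_t nblocks;
};

static int stream_read(const void *ctx, uint64_t index, uint8_t *buf)
{
	const struct stream_ctx *c = ctx;

	if (index >= c->nblocks)
		return -1;
	memcpy(buf, c->tree + index * HASH_BLOCK_SIZE, HASH_BLOCK_SIZE);
	return 0;
}

static const struct merkle_store_ops past_eof_ops = {
	.read_hash_block = past_eof_read,
};
static const struct merkle_store_ops stream_ops = {
	.read_hash_block = stream_read,
};

/*
 * Shared code: copy the digest found at byte offset 'hoffset' of tree
 * block 'hindex' into 'out', without knowing where the tree is stored.
 */
static int merkle_fetch_hash(const struct merkle_store_ops *ops,
			     const void *ctx, uint64_t hindex,
			     unsigned int hoffset, unsigned int digest_size,
			     uint8_t *out)
{
	uint8_t block[HASH_BLOCK_SIZE];
	int err;

	if (digest_size > HASH_BLOCK_SIZE ||
	    hoffset > HASH_BLOCK_SIZE - digest_size)
		return -1;
	err = ops->read_hash_block(ctx, hindex, block);
	if (err)
		return err;
	memcpy(out, block + hoffset, digest_size);
	return 0;
}
```

With an interface like this, the past-EOF layout becomes one backend among several rather than part of the kernel/user space contract.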
> Essentially, fs-verity reports a file's hash in constant time, but reads
> that would violate that hash fail at runtime. This is useful when only
> a portion of the file is actually accessed, as only the accessed portion
> has to be hashed, and the latency to the first read is much reduced over
> a full file hash. On top of this hashing mechanism, auditing or
> authentication policies can be implemented to log or verify file hashes.
>
> Note that in general, fs-verity is *not* a replacement for IMA.
> fs-verity is a lower-level feature, primarily a way to hash a file;
> whereas IMA deals more with higher-level policy logic, like defining
> which files are "measured" and what to do with those measurements. We
> plan for IMA to support fs-verity measurements as an alternative to the
> traditional full file hash. Still, some users find fs-verity useful by
> itself, so it's also usable without IMA in simple cases, e.g. in cases
> where just retrieving the file measurement via an ioctl is enough.
>
> A structure containing the properties of the Merkle tree -- such as the
> hash algorithm used, the block size, and the root hash -- is also stored
> on-disk, following the Merkle tree. The actual file measurement hash
> that fs-verity reports is the hash of this structure.
>
> All fs-verity metadata is written by userspace; the kernel only reads
> it. Extended attributes aren't used because the Merkle tree may be much
> larger than XATTR_SIZE_MAX, we want the hash pages to be cached in the
> page cache as usual, and in the case of fs-verity combined with fscrypt
> we want the metadata to be encrypted to avoid leaking plaintext hashes.
> The fs-verity metadata is hidden from userspace by overriding the i_size
> of the in-memory VFS inode; ext4 additionally will override the on-disk
> i_size in order to make verity a RO_COMPAT filesystem feature.
>
> This initial patch only adds the fs-verity Kconfig option, UAPI, and
> setup code, e.g. the ->open() hook that parses the fs-verity descriptor.
> The actual ->readpages() data verification, the ioctls, ext4 and f2fs
> support, and other functionality comes in later patches.
>
> Signed-off-by: Eric Biggers <[email protected]>
> ---
> fs/Kconfig | 2 +
> fs/Makefile | 1 +
> fs/verity/Kconfig | 36 ++
> fs/verity/Makefile | 3 +
> fs/verity/fsverity_private.h | 99 ++++
> fs/verity/hash_algs.c | 106 +++++
> fs/verity/setup.c | 846 ++++++++++++++++++++++++++++++++++
> include/linux/fs.h | 9 +
> include/linux/fsverity.h | 62 +++
> include/uapi/linux/fsverity.h | 86 ++++
> 10 files changed, 1250 insertions(+)
> create mode 100644 fs/verity/Kconfig
> create mode 100644 fs/verity/Makefile
> create mode 100644 fs/verity/fsverity_private.h
> create mode 100644 fs/verity/hash_algs.c
> create mode 100644 fs/verity/setup.c
> create mode 100644 include/linux/fsverity.h
> create mode 100644 include/uapi/linux/fsverity.h
>
> diff --git a/fs/Kconfig b/fs/Kconfig
> index ac474a61be379..ddadc4e999429 100644
> --- a/fs/Kconfig
> +++ b/fs/Kconfig
> @@ -105,6 +105,8 @@ config MANDATORY_FILE_LOCKING
>
> source "fs/crypto/Kconfig"
>
> +source "fs/verity/Kconfig"
> +
> source "fs/notify/Kconfig"
>
> source "fs/quota/Kconfig"
> diff --git a/fs/Makefile b/fs/Makefile
> index 293733f61594b..10b37f651ffde 100644
> --- a/fs/Makefile
> +++ b/fs/Makefile
> @@ -32,6 +32,7 @@ obj-$(CONFIG_USERFAULTFD) += userfaultfd.o
> obj-$(CONFIG_AIO) += aio.o
> obj-$(CONFIG_FS_DAX) += dax.o
> obj-$(CONFIG_FS_ENCRYPTION) += crypto/
> +obj-$(CONFIG_FS_VERITY) += verity/
> obj-$(CONFIG_FILE_LOCKING) += locks.o
> obj-$(CONFIG_COMPAT) += compat.o compat_ioctl.o
> obj-$(CONFIG_BINFMT_AOUT) += binfmt_aout.o
> diff --git a/fs/verity/Kconfig b/fs/verity/Kconfig
> new file mode 100644
> index 0000000000000..308d733a9401b
> --- /dev/null
> +++ b/fs/verity/Kconfig
> @@ -0,0 +1,36 @@
> +config FS_VERITY
> + tristate "FS Verity (file-based integrity/authentication)"
> + depends on BLOCK
> + select CRYPTO
> + # SHA-256 is selected as it's intended to be the default hash algorithm.
> + # To avoid bloat, other wanted algorithms must be selected explicitly.
> + select CRYPTO_SHA256
> + help
> + This option enables fs-verity. fs-verity is the dm-verity
> + mechanism implemented at the file level. On supported
> + filesystems, userspace can append a Merkle tree (hash tree) to
> + a file, then enable fs-verity on the file. The filesystem
> + will then transparently verify any data read from the file
> + against the Merkle tree. The file is also made read-only.
> +
> + This serves as an integrity check, but the availability of the
> + Merkle tree root hash also allows efficiently supporting
> + various use cases where normally the whole file would need to
> + be hashed at once, such as: (a) auditing (logging the file's
> + hash), or (b) authenticity verification (comparing the hash
> + against a known good value, e.g. from a digital signature).
> +
> + fs-verity is especially useful on large files where not all
> + the contents may actually be needed. Also, fs-verity verifies
> + data each time it is paged back in, which provides better
> + protection against malicious disks vs. an ahead-of-time hash.
> +
> + If unsure, say N.
> +
> +config FS_VERITY_DEBUG
> + bool "FS Verity debugging"
> + depends on FS_VERITY
> + help
> + Enable debugging messages related to fs-verity by default.
> +
> + Say N unless you are an fs-verity developer.
> diff --git a/fs/verity/Makefile b/fs/verity/Makefile
> new file mode 100644
> index 0000000000000..39e123805c827
> --- /dev/null
> +++ b/fs/verity/Makefile
> @@ -0,0 +1,3 @@
> +obj-$(CONFIG_FS_VERITY) += fsverity.o
> +
> +fsverity-y := hash_algs.o setup.o
> diff --git a/fs/verity/fsverity_private.h b/fs/verity/fsverity_private.h
> new file mode 100644
> index 0000000000000..a18ff645695f4
> --- /dev/null
> +++ b/fs/verity/fsverity_private.h
> @@ -0,0 +1,99 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * fs-verity: read-only file-based integrity/authentication
> + *
> + * Copyright (C) 2018 Google LLC
> + */
> +
> +#ifndef _FSVERITY_PRIVATE_H
> +#define _FSVERITY_PRIVATE_H
> +
> +#ifdef CONFIG_FS_VERITY_DEBUG
> +#define DEBUG
> +#endif
> +
> +#define pr_fmt(fmt) "fs-verity: " fmt
> +
> +#include <crypto/sha.h>
> +#define __FS_HAS_VERITY 1
> +#include <linux/fsverity.h>
> +
> +/*
> + * Maximum depth of the Merkle tree. Up to 64 levels are theoretically possible
> + * with a very small block size, but we'd like to limit stack usage during
> + * verification, and in practice this is plenty. E.g., with SHA-256 and 4K
> + * blocks, a file with size UINT64_MAX bytes needs just 8 levels.
> + */
> +#define FS_VERITY_MAX_LEVELS 16
> +
> +/*
> + * Largest digest size among all hash algorithms supported by fs-verity. This
> + * can be increased if needed.
> + */
> +#define FS_VERITY_MAX_DIGEST_SIZE SHA256_DIGEST_SIZE
> +
> +/* A hash algorithm supported by fs-verity */
> +struct fsverity_hash_alg {
> + struct crypto_ahash *tfm; /* allocated on demand */
> + const char *name;
> + unsigned int digest_size;
> + bool cryptographic;
> +};
> +
> +/**
> + * fsverity_info - cached verity metadata for an inode
> + *
> + * When a verity file is first opened, an instance of this struct is allocated
> + * and stored in ->i_verity_info. It caches various values from the verity
> + * metadata, such as the tree topology and the root hash, which are needed to
> + * efficiently verify data read from the file. Once created, it remains until
> + * the inode is evicted.
> + *
> + * (The tree pages themselves are not cached here, though they may be cached in
> + * the inode's page cache.)
> + */
> +struct fsverity_info {
> + const struct fsverity_hash_alg *hash_alg; /* hash algorithm */
> + u8 block_bits; /* log2(block size) */
> + u8 log_arity; /* log2(hashes per hash block) */
> + u8 depth; /* depth of the Merkle tree */
> + u8 *hashstate; /* salted initial hash state */
> + u64 data_i_size; /* original file size */
> + u64 full_i_size; /* full file size including metadata */
> + u8 root_hash[FS_VERITY_MAX_DIGEST_SIZE]; /* Merkle tree root hash */
> + u8 measurement[FS_VERITY_MAX_DIGEST_SIZE]; /* file measurement */
> + bool have_root_hash; /* have root hash from disk? */
> +
> + /* Starting blocks for each tree level. 'depth-1' is the root level. */
> + u64 hash_lvl_region_idx[FS_VERITY_MAX_LEVELS];
> +};
> +
> +/* hash_algs.c */
> +extern struct fsverity_hash_alg fsverity_hash_algs[];
> +const struct fsverity_hash_alg *fsverity_get_hash_alg(unsigned int num);
> +void __init fsverity_check_hash_algs(void);
> +void __exit fsverity_exit_hash_algs(void);
> +
> +/* setup.c */
> +struct fsverity_info *create_fsverity_info(struct inode *inode, bool enabling);
> +void free_fsverity_info(struct fsverity_info *vi);
> +
> +static inline struct fsverity_info *get_fsverity_info(const struct inode *inode)
> +{
> + /* pairs with cmpxchg_release() in set_fsverity_info() */
> + return smp_load_acquire(&inode->i_verity_info);
> +}
> +
> +static inline bool set_fsverity_info(struct inode *inode,
> + struct fsverity_info *vi)
> +{
> + /* pairs with smp_load_acquire() in get_fsverity_info() */
> + if (cmpxchg_release(&inode->i_verity_info, NULL, vi) != NULL)
> + return false;
> +
> + /* Set the in-memory i_size to the data size */
> + i_size_write(inode, vi->data_i_size);
> + return true;
> +}
> +
> +#endif /* _FSVERITY_PRIVATE_H */
> diff --git a/fs/verity/hash_algs.c b/fs/verity/hash_algs.c
> new file mode 100644
> index 0000000000000..424a26ee2f3c2
> --- /dev/null
> +++ b/fs/verity/hash_algs.c
> @@ -0,0 +1,106 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * fs/verity/hash_algs.c: fs-verity hash algorithm management
> + *
> + * Copyright (C) 2018 Google LLC
> + *
> + * Written by Eric Biggers.
> + */
> +
> +#include "fsverity_private.h"
> +
> +#include <crypto/hash.h>
> +
> +/* The list of hash algorithms supported by fs-verity */
> +struct fsverity_hash_alg fsverity_hash_algs[] = {
> + [FS_VERITY_ALG_SHA256] = {
> + .name = "sha256",
> + .digest_size = 32,
> + .cryptographic = true,
> + },
> +};
> +
> +/*
> + * Translate the given fs-verity hash algorithm number into a struct describing
> + * the algorithm, and ensure it has a hash transform ready to go. The hash
> + * transforms are allocated on-demand firstly to not waste resources when they
> + * aren't needed, and secondly because the fs-verity module may be loaded
> + * earlier than the needed crypto modules.
> + */
> +const struct fsverity_hash_alg *fsverity_get_hash_alg(unsigned int num)
> +{
> + struct fsverity_hash_alg *alg;
> + struct crypto_ahash *tfm;
> + int err;
> +
> + if (num >= ARRAY_SIZE(fsverity_hash_algs) ||
> + !fsverity_hash_algs[num].digest_size) {
> + pr_warn("Unknown hash algorithm: %u\n", num);
> + return ERR_PTR(-EINVAL);
> + }
> + alg = &fsverity_hash_algs[num];
> +retry:
> + /* pairs with cmpxchg_release() below */
> + tfm = smp_load_acquire(&alg->tfm);
> + if (tfm)
> + return alg;
> + /*
> + * Using the shash API would make things a bit simpler, but the ahash
> + * API is preferable as it allows the use of crypto accelerators.
> + */
> + tfm = crypto_alloc_ahash(alg->name, 0, 0);
> + if (IS_ERR(tfm)) {
> + if (PTR_ERR(tfm) == -ENOENT)
> + pr_warn("Algorithm %u (%s) is unavailable\n",
> + num, alg->name);
> + else
> + pr_warn("Error allocating algorithm %u (%s): %ld\n",
> + num, alg->name, PTR_ERR(tfm));
> + return ERR_CAST(tfm);
> + }
> +
> + err = -EINVAL;
> + if (WARN_ON(alg->digest_size != crypto_ahash_digestsize(tfm)))
> + goto err_free_tfm;
> +
> + pr_info("%s using implementation \"%s\"\n", alg->name,
> + crypto_hash_alg_common(tfm)->base.cra_driver_name);
> +
> + /* pairs with smp_load_acquire() above */
> + if (cmpxchg_release(&alg->tfm, NULL, tfm) != NULL) {
> + crypto_free_ahash(tfm);
> + goto retry;
> + }
> +
> + return alg;
> +
> +err_free_tfm:
> + crypto_free_ahash(tfm);
> + return ERR_PTR(err);
> +}
> +
> +void __init fsverity_check_hash_algs(void)
> +{
> + int i;
> +
> + /*
> + * Sanity check the digest sizes (could be a build-time check, but
> + * they're in an array)
> + */
> + for (i = 0; i < ARRAY_SIZE(fsverity_hash_algs); i++) {
> + struct fsverity_hash_alg *alg = &fsverity_hash_algs[i];
> +
> + if (!alg->digest_size)
> + continue;
> + BUG_ON(alg->digest_size > FS_VERITY_MAX_DIGEST_SIZE);
> + BUG_ON(!is_power_of_2(alg->digest_size));
> + }
> +}
> +
> +void __exit fsverity_exit_hash_algs(void)
> +{
> + int i;
> +
> + for (i = 0; i < ARRAY_SIZE(fsverity_hash_algs); i++)
> + crypto_free_ahash(fsverity_hash_algs[i].tfm);
> +}
> diff --git a/fs/verity/setup.c b/fs/verity/setup.c
> new file mode 100644
> index 0000000000000..e675c52898d5b
> --- /dev/null
> +++ b/fs/verity/setup.c
> @@ -0,0 +1,846 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * fs/verity/setup.c: fs-verity module initialization and descriptor parsing
> + *
> + * Copyright (C) 2018 Google LLC
> + *
> + * Originally written by Jaegeuk Kim and Michael Halcrow;
> + * heavily rewritten by Eric Biggers.
> + */
> +
> +#include "fsverity_private.h"
> +
> +#include <crypto/hash.h>
> +#include <linux/highmem.h>
> +#include <linux/list_sort.h>
> +#include <linux/module.h>
> +#include <linux/pagemap.h>
> +#include <linux/scatterlist.h>
> +#include <linux/vmalloc.h>
> +
> +static struct kmem_cache *fsverity_info_cachep;
> +
> +static void dump_fsverity_descriptor(const struct fsverity_descriptor *desc)
> +{
> + pr_debug("magic = %.*s\n", (int)sizeof(desc->magic), desc->magic);
> + pr_debug("major_version = %u\n", desc->major_version);
> + pr_debug("minor_version = %u\n", desc->minor_version);
> + pr_debug("log_data_blocksize = %u\n", desc->log_data_blocksize);
> + pr_debug("log_tree_blocksize = %u\n", desc->log_tree_blocksize);
> + pr_debug("data_algorithm = %u\n", le16_to_cpu(desc->data_algorithm));
> + pr_debug("tree_algorithm = %u\n", le16_to_cpu(desc->tree_algorithm));
> + pr_debug("flags = %#x\n", le32_to_cpu(desc->flags));
> + pr_debug("orig_file_size = %llu\n", le64_to_cpu(desc->orig_file_size));
> + pr_debug("auth_ext_count = %u\n", le16_to_cpu(desc->auth_ext_count));
> +}
> +
> +/* Precompute the salted initial hash state */
> +static int set_salt(struct fsverity_info *vi, const u8 *salt, size_t saltlen)
> +{
> + struct crypto_ahash *tfm = vi->hash_alg->tfm;
> + struct ahash_request *req;
> + unsigned int reqsize = sizeof(*req) + crypto_ahash_reqsize(tfm);
> + struct scatterlist sg;
> + DECLARE_CRYPTO_WAIT(wait);
> + u8 *saltbuf;
> + int err;
> +
> + vi->hashstate = kmalloc(crypto_ahash_statesize(tfm), GFP_KERNEL);
> + if (!vi->hashstate)
> + return -ENOMEM;
> + /* On error, vi->hashstate is freed by free_fsverity_info() */
> +
> + /*
> + * Allocate a hash request buffer. Also reserve space for a copy of
> + * the salt, since the given 'salt' may point into vmap'ed memory, so
> + * sg_init_one() may not work on it.
> + */
> + req = kmalloc(reqsize + saltlen, GFP_KERNEL);
> + if (!req)
> + return -ENOMEM;
> + saltbuf = (u8 *)req + reqsize;
> + memcpy(saltbuf, salt, saltlen);
> + sg_init_one(&sg, saltbuf, saltlen);
> +
> + ahash_request_set_tfm(req, tfm);
> + ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP |
> + CRYPTO_TFM_REQ_MAY_BACKLOG,
> + crypto_req_done, &wait);
> + ahash_request_set_crypt(req, &sg, NULL, saltlen);
> +
> + err = crypto_wait_req(crypto_ahash_init(req), &wait);
> + if (err)
> + goto out;
> + err = crypto_wait_req(crypto_ahash_update(req), &wait);
> + if (err)
> + goto out;
> + err = crypto_ahash_export(req, vi->hashstate);
> +out:
> + kfree(req);
> + return err;
> +}
> +
> +/*
> + * Copy in the root hash stored on disk.
> + *
> + * Note that the root hash could be computed by hashing the root block of the
> + * Merkle tree. But it works out a bit simpler to store the hash separately;
> + * then it gets included in the file measurement without special-casing it, and
> + * the root block gets verified on the ->readpages() path like the other blocks.
> + */
> +static int parse_root_hash_extension(struct fsverity_info *vi,
> + const void *hash, size_t size)
> +{
> + const struct fsverity_hash_alg *alg = vi->hash_alg;
> +
> + if (vi->have_root_hash) {
> + pr_warn("Multiple root hashes were found!\n");
> + return -EINVAL;
> + }
> + if (size != alg->digest_size) {
> + pr_warn("Wrong root hash size; got %zu bytes, but expected %u for hash algorithm %s\n",
> + size, alg->digest_size, alg->name);
> + return -EINVAL;
> + }
> + memcpy(vi->root_hash, hash, size);
> + vi->have_root_hash = true;
> + pr_debug("Root hash: %s:%*phN\n", alg->name,
> + alg->digest_size, vi->root_hash);
> + return 0;
> +}
> +
> +static int parse_salt_extension(struct fsverity_info *vi,
> + const void *salt, size_t saltlen)
> +{
> + if (vi->hashstate) {
> + pr_warn("Multiple salts were found!\n");
> + return -EINVAL;
> + }
> + return set_salt(vi, salt, saltlen);
> +}
> +
> +/* The available types of extensions (variable-length metadata items) */
> +static const struct extension_type {
> + int (*parse)(struct fsverity_info *vi, const void *_ext,
> + size_t extra_len);
> + size_t base_len; /* length of fixed-size part of payload, if any */
> + bool unauthenticated; /* true if not included in file measurement */
> +} extension_types[] = {
> + [FS_VERITY_EXT_ROOT_HASH] = {
> + .parse = parse_root_hash_extension,
> + },
> + [FS_VERITY_EXT_SALT] = {
> + .parse = parse_salt_extension,
> + },
> +};
> +
> +static int do_parse_extensions(struct fsverity_info *vi,
> + const struct fsverity_extension **ext_hdr_p,
> + const void *end, int count, bool authenticated)
> +{
> + const struct fsverity_extension *ext_hdr = *ext_hdr_p;
> + int i;
> + int err;
> +
> + for (i = 0; i < count; i++) {
> + const struct extension_type *type;
> + u32 len, rounded_len;
> + u16 type_code;
> +
> + if (end - (const void *)ext_hdr < sizeof(*ext_hdr)) {
> + pr_warn("Extension list overflows buffer\n");
> + return -EINVAL;
> + }
> + type_code = le16_to_cpu(ext_hdr->type);
> + if (type_code >= ARRAY_SIZE(extension_types) ||
> + !extension_types[type_code].parse) {
> + pr_warn("Unknown extension type: %u\n", type_code);
> + return -EINVAL;
> + }
> + type = &extension_types[type_code];
> + if (authenticated != !type->unauthenticated) {
> + pr_warn("Extension type %u must be %sauthenticated\n",
> + type_code, type->unauthenticated ? "un" : "");
> + return -EINVAL;
> + }
> + if (ext_hdr->reserved) {
> + pr_warn("Reserved bits set in extension header\n");
> + return -EINVAL;
> + }
> + len = le32_to_cpu(ext_hdr->length);
> + if (len < sizeof(*ext_hdr)) {
> + pr_warn("Invalid length in extension header\n");
> + return -EINVAL;
> + }
> + rounded_len = round_up(len, 8);
> + if (rounded_len == 0 ||
> + rounded_len > end - (const void *)ext_hdr) {
> + pr_warn("Extension item overflows buffer\n");
> + return -EINVAL;
> + }
> + if (len < sizeof(*ext_hdr) + type->base_len) {
> + pr_warn("Extension length too small for type\n");
> + return -EINVAL;
> + }
> + err = type->parse(vi, ext_hdr + 1,
> + len - sizeof(*ext_hdr) - type->base_len);
> + if (err)
> + return err;
> + ext_hdr = (const void *)ext_hdr + rounded_len;
> + }
> + *ext_hdr_p = ext_hdr;
> + return 0;
> +}
> +
> +/*
> + * Parse the extension items following the fixed-size portion of the fs-verity
> + * descriptor. The fsverity_info is updated accordingly.
> + *
> + * Return: On success, the size of the authenticated portion of the descriptor
> + * (the fixed-size portion plus the authenticated extensions).
> + * Otherwise, a -errno value.
> + */
> +static int parse_extensions(struct fsverity_info *vi,
> + const struct fsverity_descriptor *desc,
> + int desc_len)
> +{
> + const struct fsverity_extension *ext_hdr = (const void *)(desc + 1);
> + const void *end = (const void *)desc + desc_len;
> + u16 auth_ext_count = le16_to_cpu(desc->auth_ext_count);
> + int auth_desc_len;
> + int err;
> +
> + err = do_parse_extensions(vi, &ext_hdr, end, auth_ext_count, true);
> + if (err)
> + return err;
> + auth_desc_len = (void *)ext_hdr - (void *)desc;
> +
> + /*
> + * Unauthenticated extensions (optional). Careful: an attacker able to
> + * corrupt the file can change these arbitrarily without being detected.
> + * Thus, only specific types of extensions are whitelisted here --
> + * namely, the ones containing a signature of the file measurement,
> + * which by definition can't be included in the file measurement itself.
> + */
> + if (end - (void *)ext_hdr >= 8) {
> + u16 unauth_ext_count = le16_to_cpup((__le16 *)ext_hdr);
> +
> + ext_hdr = (void *)ext_hdr + 8;
> + err = do_parse_extensions(vi, &ext_hdr, end,
> + unauth_ext_count, false);
> + if (err)
> + return err;
> + }
> +
> + return auth_desc_len;
> +}
> +
> +/*
> + * Parse an fs-verity descriptor, loading information into the fsverity_info.
> + *
> + * Return: On success, the size of the authenticated portion of the descriptor
> + * (the fixed-size portion plus the authenticated extensions).
> + * Otherwise, a -errno value.
> + */
> +static int parse_fsverity_descriptor(struct fsverity_info *vi,
> + const struct fsverity_descriptor *desc,
> + int desc_len, loff_t desc_start)
> +{
> + unsigned int alg_num;
> + unsigned int hashes_per_block;
> + u64 orig_file_size;
> + int desc_auth_len;
> + int err;
> +
> + BUILD_BUG_ON(sizeof(*desc) != 64);
> +
> + /* magic */
> + if (memcmp(desc->magic, FS_VERITY_MAGIC, sizeof(desc->magic))) {
> + pr_warn("Wrong magic bytes\n");
> + return -EINVAL;
> + }
> +
> + /* major_version */
> + if (desc->major_version != 1) {
> + pr_warn("Unsupported major version (%u)\n",
> + desc->major_version);
> + return -EINVAL;
> + }
> +
> + /* minor_version */
> + if (desc->minor_version != 0) {
> + pr_warn("Unsupported minor version (%u)\n",
> + desc->minor_version);
> + return -EINVAL;
> + }
> +
> + /* data_algorithm and tree_algorithm */
> + alg_num = le16_to_cpu(desc->data_algorithm);
> + if (alg_num != le16_to_cpu(desc->tree_algorithm)) {
> + pr_warn("Unimplemented case: data (%u) and tree (%u) hash algorithms differ\n",
> + alg_num, le16_to_cpu(desc->tree_algorithm));
> + return -EINVAL;
> + }
> + vi->hash_alg = fsverity_get_hash_alg(alg_num);
> + if (IS_ERR(vi->hash_alg))
> + return PTR_ERR(vi->hash_alg);
> +
> + /* log_data_blocksize and log_tree_blocksize */
> + if (desc->log_data_blocksize != PAGE_SHIFT) {
> + pr_warn("Unsupported log_blocksize (%u). Need block_size == PAGE_SIZE.\n",
> + desc->log_data_blocksize);
> + return -EINVAL;
> + }
> + if (desc->log_tree_blocksize != desc->log_data_blocksize) {
> + pr_warn("Unimplemented case: data (%u) and tree (%u) block sizes differ\n",
> + desc->log_data_blocksize, desc->log_tree_blocksize);
> + return -EINVAL;
> + }
> + vi->block_bits = desc->log_data_blocksize;
> + hashes_per_block = (1 << vi->block_bits) / vi->hash_alg->digest_size;
> + if (!is_power_of_2(hashes_per_block)) {
> + pr_warn("Unimplemented case: hashes per block (%u) isn't a power of 2\n",
> + hashes_per_block);
> + return -EINVAL;
> + }
> + vi->log_arity = ilog2(hashes_per_block);
> +
> + /* flags */
> + if (desc->flags) {
> + pr_warn("Unsupported flags (%#x)\n", le32_to_cpu(desc->flags));
> + return -EINVAL;
> + }
> +
> + /* reserved fields */
> + if (desc->reserved1 ||
> + memchr_inv(desc->reserved2, 0, sizeof(desc->reserved2))) {
> + pr_warn("Reserved bits set in fsverity_descriptor\n");
> + return -EINVAL;
> + }
> +
> + /*
> + * orig_file_size. For filesystems that set the on-disk i_size to
> + * data_i_size rather than to full_i_size, this field is redundant --
> + * though it still must be included in the file measurement! Make sure
> + * it's really the same.
> + */
> + orig_file_size = le64_to_cpu(desc->orig_file_size);
> + if (vi->data_i_size) {
> + if (orig_file_size != vi->data_i_size) {
> + pr_warn("fsverity_descriptor.orig_file_size (%llu) doesn't match i_size (%llu)!\n",
> + orig_file_size, vi->data_i_size);
> + return -EINVAL;
> + }
> + } else {
> + vi->data_i_size = orig_file_size;
> + }
> + if (vi->data_i_size == 0) {
> + pr_warn("Original file size is 0; this is not supported\n");
> + return -EINVAL;
> + }
> + if (vi->data_i_size > desc_start) {
> + pr_warn("Original file size is too large (%llu)\n",
> + vi->data_i_size);
> + return -EINVAL;
> + }
> +
> + /* extensions */
> + desc_auth_len = parse_extensions(vi, desc, desc_len);
> + if (desc_auth_len < 0)
> + return desc_auth_len;
> +
> + if (!vi->have_root_hash) {
> + pr_warn("Root hash wasn't found!\n");
> + return -EINVAL;
> + }
> +
> + /* Use an empty salt if no salt was found in the extensions list */
> + if (!vi->hashstate) {
> + err = set_salt(vi, "", 0);
> + if (err)
> + return err;
> + }
> +
> + return desc_auth_len;
> +}
> +
> +/*
> + * Calculate the depth of the Merkle tree, then create a map from level to the
> + * block offset at which that level's hash blocks start. Level 'depth - 1' is
> + * the root and is stored first in the file, in the first block following the
> + * original data. Level 0 is the level directly "above" the data blocks and is
> + * stored last in the file, just before the fsverity_descriptor.
> + */
> +static int compute_tree_depth_and_offsets(struct fsverity_info *vi)
> +{
> + unsigned int hashes_per_block = 1 << vi->log_arity;
> + u64 blocks = (vi->data_i_size + (1 << vi->block_bits) - 1) >>
> + vi->block_bits;
> + u64 offset = blocks;
> + int depth = 0;
> + int i;
> +
> + while (blocks > 1) {
> + if (depth >= FS_VERITY_MAX_LEVELS) {
> + pr_warn("Too many tree levels (max is %d)\n",
> + FS_VERITY_MAX_LEVELS);
> + return -EINVAL;
> + }
> + blocks = (blocks + hashes_per_block - 1) >> vi->log_arity;
> + vi->hash_lvl_region_idx[depth++] = blocks;
> + }
> + vi->depth = depth;
> +
> + for (i = depth - 1; i >= 0; i--) {
> + u64 next_count = vi->hash_lvl_region_idx[i];
> +
> + vi->hash_lvl_region_idx[i] = offset;
> + pr_debug("Level %d is [%llu..%llu] (%llu blocks)\n",
> + i, offset, offset + next_count - 1, next_count);
> + offset += next_count;
> + }
> + return 0;
> +}
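
To make the layout computed by compute_tree_depth_and_offsets() concrete, here
is a minimal user-space sketch of the same arithmetic. The names
compute_layout, struct tree_layout and MAX_LEVELS are made up for illustration
(MAX_LEVELS = 64 is only an assumed stand-in for FS_VERITY_MAX_LEVELS), and
unlike the patch it keeps the per-level block counts and start offsets in two
separate arrays instead of reusing one:

```c
#include <assert.h>
#include <stdint.h>

#define MAX_LEVELS 64	/* assumed stand-in for FS_VERITY_MAX_LEVELS */

struct tree_layout {
	int depth;
	uint64_t level_start[MAX_LEVELS];	/* first block index of each level */
	uint64_t level_blocks[MAX_LEVELS];	/* number of hash blocks per level */
};

/* Returns 0 on success, -1 if the tree would need more than MAX_LEVELS. */
static int compute_layout(uint64_t data_size, unsigned int block_bits,
			  unsigned int log_arity, struct tree_layout *t)
{
	uint64_t block_size, hashes_per_block, blocks, offset;
	int depth = 0;
	int i;

	assert(block_bits < 64 && log_arity > 0 && log_arity < 64);

	block_size = (uint64_t)1 << block_bits;
	hashes_per_block = (uint64_t)1 << log_arity;
	blocks = (data_size + block_size - 1) >> block_bits;
	offset = blocks;	/* the tree starts right after the data blocks */

	/* Walk upward: each level holds one hash per block of the level below */
	while (blocks > 1) {
		if (depth >= MAX_LEVELS)
			return -1;
		blocks = (blocks + hashes_per_block - 1) >> log_arity;
		t->level_blocks[depth++] = blocks;
	}
	t->depth = depth;

	/* The root level (depth - 1) is stored first, level 0 last */
	for (i = depth - 1; i >= 0; i--) {
		t->level_start[i] = offset;
		offset += t->level_blocks[i];
	}
	return 0;
}
```

For a 1 MiB file with 4096-byte blocks and SHA-256 (128 hashes per block, so
log_arity = 7), this gives 256 data blocks, a 1-block root level (level 1) at
block index 256, and a 2-block level 0 starting at block index 257.
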
> +
> +/* Arbitrary limit, can be increased if needed */
> +#define MAX_DESCRIPTOR_PAGES 16
> +
> +/*
> + * Compute the file's measurement by hashing the first 'desc_auth_len' bytes of
> + * the fs-verity descriptor (which includes the Merkle tree root hash as an
> + * authenticated extension item).
> + *
> + * Note: 'desc' may point into vmap'ed memory, so it can't be passed directly to
> + * sg_set_buf() for the ahash API. Instead, we pass the pages directly.
> + */
> +static int compute_measurement(const struct fsverity_info *vi,
> + const struct fsverity_descriptor *desc,
> + int desc_auth_len,
> + struct page *desc_pages[MAX_DESCRIPTOR_PAGES],
> + int nr_desc_pages, u8 *measurement)
> +{
> + struct ahash_request *req;
> + DECLARE_CRYPTO_WAIT(wait);
> + struct scatterlist sg[MAX_DESCRIPTOR_PAGES];
> + int offset, len, remaining;
> + int i;
> + int err;
> +
> + req = ahash_request_alloc(vi->hash_alg->tfm, GFP_KERNEL);
> + if (!req)
> + return -ENOMEM;
> +
> + sg_init_table(sg, nr_desc_pages);
> + offset = offset_in_page(desc);
> + remaining = desc_auth_len;
> + for (i = 0; i < nr_desc_pages && remaining; i++) {
> + len = min_t(int, PAGE_SIZE - offset, remaining);
> + sg_set_page(&sg[i], desc_pages[i], len, offset);
> + remaining -= len;
> + offset = 0;
> + }
> +
> + ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP |
> + CRYPTO_TFM_REQ_MAY_BACKLOG,
> + crypto_req_done, &wait);
> + ahash_request_set_crypt(req, sg, measurement, desc_auth_len);
> + err = crypto_wait_req(crypto_ahash_digest(req), &wait);
> + ahash_request_free(req);
> + return err;
> +}
> +
> +static struct fsverity_info *alloc_fsverity_info(void)
> +{
> + return kmem_cache_zalloc(fsverity_info_cachep, GFP_NOFS);
> +}
> +
> +void free_fsverity_info(struct fsverity_info *vi)
> +{
> + if (!vi)
> + return;
> + kfree(vi->hashstate);
> + kmem_cache_free(fsverity_info_cachep, vi);
> +}
> +
> +/**
> + * find_fsverity_footer - find the fsverity_footer in the last page of the file
> + *
> + * To find the fsverity_footer we have to scan backwards from the end, skipping
> + * zero bytes. This is needed because some filesystems (e.g. ext4) set the
> + * on-disk i_size to data_i_size rather than to full_i_size, and full_i_size is
> + * instead gotten indirectly via the end of the last extent. This causes
> + * full_i_size to be rounded up to the end of the filesystem block.
> + *
> + * Return: pointer to the footer if found, else NULL
> + */
> +static const struct fsverity_footer *
> +find_fsverity_footer(const u8 *last_virt, size_t last_validsize)
> +{
> + const u8 *p = last_virt + last_validsize;
> + const struct fsverity_footer *ftr;
> +
> + /* Find the last nonzero byte, which should be ftr->magic[7] */
> + do {
> + if (p <= last_virt)
> + return NULL;
> + } while (*--p == 0);
> +
> + BUILD_BUG_ON(sizeof(ftr->magic) != 8);
> + BUILD_BUG_ON(offsetof(struct fsverity_footer, magic[8]) !=
> + sizeof(*ftr));
> + if (p - last_virt < offsetof(struct fsverity_footer, magic[7]))
> + return NULL;
> + ftr = container_of(p, struct fsverity_footer, magic[7]);
> + if (memcmp(ftr->magic, FS_VERITY_MAGIC, sizeof(ftr->magic)))
> + return NULL;
> + return ftr;
> +}
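
The backwards scan in find_fsverity_footer() can be tried out in user space
with something like the sketch below. It is only an approximation under stated
assumptions: struct footer and find_footer() are hypothetical names, the
footer is modelled as plain byte arrays to avoid relying on __packed, and an
offset is returned instead of a pointer:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAGIC "FSVerity"
#define MAGIC_LEN 8

/* User-space mirror of struct fsverity_footer (12 bytes, no padding) */
struct footer {
	uint8_t desc_reverse_offset[4];	/* little-endian on disk */
	uint8_t magic[MAGIC_LEN];
};

_Static_assert(sizeof(struct footer) == 12, "footer must be 12 bytes");

/*
 * Skip trailing zero padding, then check that the bytes just before it form
 * a footer.  Returns the footer's byte offset within buf, or -1 if none.
 */
static long find_footer(const uint8_t *buf, size_t valid_size)
{
	size_t end = valid_size;

	assert(buf != NULL);

	/* Find the end of the last nonzero byte, which should be magic[7] */
	while (end > 0 && buf[end - 1] == 0)
		end--;

	if (end < sizeof(struct footer))
		return -1;
	if (memcmp(buf + end - MAGIC_LEN, MAGIC, MAGIC_LEN) != 0)
		return -1;
	return (long)(end - sizeof(struct footer));
}
```

The scan is unambiguous because the last byte of the magic ('y') is nonzero,
so zero padding after the footer can never be mistaken for part of it.
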
> +
> +/**
> + * map_fsverity_descriptor - map an inode's fs-verity descriptor into memory
> + *
> + * If the descriptor fits in one page, we use kmap; otherwise we use vmap.
> + * unmap_fsverity_descriptor() must be called later to unmap it.
> + *
> + * It's assumed that the file contents cannot be modified concurrently.
> + * (This is guaranteed by either deny_write_access() or by the verity bit.)
> + *
> + * Return: the virtual address of the start of the descriptor, in virtually
> + * contiguous memory. Also fills in desc_pages and returns in *desc_len the
> + * length of the descriptor including all extensions, and in *desc_start the
> + * offset of the descriptor from the start of the file, in bytes.
> + */
> +static const struct fsverity_descriptor *
> +map_fsverity_descriptor(struct inode *inode, loff_t full_i_size,
> + struct page *desc_pages[MAX_DESCRIPTOR_PAGES],
> + int *nr_desc_pages, int *desc_len, loff_t *desc_start)
> +{
> + const int last_validsize = ((full_i_size - 1) & ~PAGE_MASK) + 1;
> + const pgoff_t last_pgoff = (full_i_size - 1) >> PAGE_SHIFT;
> + struct page *last_page;
> + const void *last_virt;
> + const struct fsverity_footer *ftr;
> + pgoff_t first_pgoff;
> + u32 desc_reverse_offset;
> + pgoff_t pgoff;
> + const void *desc_virt;
> + int i;
> + int err;
> +
> + *nr_desc_pages = 0;
> + *desc_len = 0;
> + *desc_start = 0;
> +
> + if (full_i_size <= 0) {
> + pr_warn("File is empty!\n");
> + return ERR_PTR(-EINVAL);
> + }
> +
> + last_page = read_mapping_page(inode->i_mapping, last_pgoff, NULL);
> + if (IS_ERR(last_page)) {
> + pr_warn("Error reading last page: %ld\n", PTR_ERR(last_page));
> + return ERR_CAST(last_page);
> + }
> + last_virt = kmap(last_page);
> +
> + ftr = find_fsverity_footer(last_virt, last_validsize);
> + if (!ftr) {
> + pr_warn("No verity metadata found\n");
> + err = -EINVAL;
> + goto err_out;
> + }
> + full_i_size -= (last_virt + last_validsize - sizeof(*ftr)) -
> + (void *)ftr;
> +
> + desc_reverse_offset = le32_to_cpu(ftr->desc_reverse_offset);
> + if (desc_reverse_offset <
> + sizeof(struct fsverity_descriptor) + sizeof(*ftr) ||
> + desc_reverse_offset > full_i_size) {
> + pr_warn("Unexpected desc_reverse_offset: %u\n",
> + desc_reverse_offset);
> + err = -EINVAL;
> + goto err_out;
> + }
> + *desc_start = full_i_size - desc_reverse_offset;
> + if (*desc_start & 7) {
> + pr_warn("fs-verity descriptor is misaligned (desc_start=%lld)\n",
> + *desc_start);
> + err = -EINVAL;
> + goto err_out;
> + }
> +
> + first_pgoff = *desc_start >> PAGE_SHIFT;
> + if (last_pgoff - first_pgoff >= MAX_DESCRIPTOR_PAGES) {
> + pr_warn("fs-verity descriptor is too long (%lu pages)\n",
> + last_pgoff - first_pgoff + 1);
> + err = -EINVAL;
> + goto err_out;
> + }
> +
> + *desc_len = desc_reverse_offset - sizeof(__le32);
> +
> + if (first_pgoff == last_pgoff) {
> + /* Single-page descriptor; use the already-kmapped last page */
> + desc_pages[0] = last_page;
> + *nr_desc_pages = 1;
> + return last_virt + (*desc_start & ~PAGE_MASK);
> + }
> +
> + /* Multi-page descriptor; map the additional pages into memory */
> +
> + for (pgoff = first_pgoff; pgoff < last_pgoff; pgoff++) {
> + struct page *page;
> +
> + page = read_mapping_page(inode->i_mapping, pgoff, NULL);
> + if (IS_ERR(page)) {
> + err = PTR_ERR(page);
> + pr_warn("Error reading descriptor page: %d\n", err);
> + goto err_out;
> + }
> + desc_pages[(*nr_desc_pages)++] = page;
> + }
> +
> + desc_pages[(*nr_desc_pages)++] = last_page;
> + kunmap(last_page);
> + last_page = NULL;
> +
> + desc_virt = vmap(desc_pages, *nr_desc_pages, VM_MAP, PAGE_KERNEL_RO);
> + if (!desc_virt) {
> + err = -ENOMEM;
> + goto err_out;
> + }
> +
> + return desc_virt + (*desc_start & ~PAGE_MASK);
> +
> +err_out:
> + for (i = 0; i < *nr_desc_pages; i++)
> + put_page(desc_pages[i]);
> + if (last_page) {
> + kunmap(last_page);
> + put_page(last_page);
> + }
> + return ERR_PTR(err);
> +}
> +
> +static void
> +unmap_fsverity_descriptor(const struct fsverity_descriptor *desc,
> + struct page *desc_pages[MAX_DESCRIPTOR_PAGES],
> + int nr_desc_pages)
> +{
> + int i;
> +
> + if (is_vmalloc_addr(desc)) {
> + vunmap((void *)((unsigned long)desc & PAGE_MASK));
> + } else {
> + WARN_ON(nr_desc_pages != 1);
> + kunmap(desc_pages[0]);
> + }
> + for (i = 0; i < nr_desc_pages; i++)
> + put_page(desc_pages[i]);
> +}
> +
> +/*
> + * Read the file's fs-verity descriptor and create an fsverity_info for it.
> + */
> +struct fsverity_info *create_fsverity_info(struct inode *inode, bool enabling)
> +{
> + loff_t full_i_size;
> + struct fsverity_info *vi;
> + const struct fsverity_descriptor *desc = NULL;
> + struct page *desc_pages[MAX_DESCRIPTOR_PAGES];
> + int nr_desc_pages;
> + int desc_len;
> + loff_t desc_start;
> + int desc_auth_len;
> + int err;
> +
> + vi = alloc_fsverity_info();
> + if (!vi)
> + return ERR_PTR(-ENOMEM);
> +
> + full_i_size = i_size_read(inode);
> +
> + if (inode->i_sb->s_vop->get_full_i_size && !enabling) {
> + /*
> + * For filesystems that set the on-disk i_size to data_i_size
> + * rather than to full_i_size, we have to get full_i_size from
> + * somewhere else, e.g. the end of the last extent.
> + */
> + vi->data_i_size = full_i_size;
> + err = inode->i_sb->s_vop->get_full_i_size(inode, &full_i_size);
> + if (err)
> + goto out;
> + }
> + vi->full_i_size = full_i_size;
> + pr_debug("full_i_size=%lld\n", full_i_size);
> +
> + desc = map_fsverity_descriptor(inode, full_i_size, desc_pages,
> + &nr_desc_pages, &desc_len, &desc_start);
> + if (IS_ERR(desc)) {
> + err = PTR_ERR(desc);
> + desc = NULL;
> + goto out;
> + }
> +
> + dump_fsverity_descriptor(desc);
> + desc_auth_len = parse_fsverity_descriptor(vi, desc, desc_len,
> + desc_start);
> + if (desc_auth_len < 0) {
> + err = desc_auth_len;
> + goto out;
> + }
> +
> + err = compute_tree_depth_and_offsets(vi);
> + if (err)
> + goto out;
> + err = compute_measurement(vi, desc, desc_auth_len, desc_pages,
> + nr_desc_pages, vi->measurement);
> +out:
> + if (desc)
> + unmap_fsverity_descriptor(desc, desc_pages, nr_desc_pages);
> + if (err) {
> + free_fsverity_info(vi);
> + vi = ERR_PTR(err);
> + }
> + return vi;
> +}
> +
> +/* Ensure the inode has an ->i_verity_info */
> +static int setup_fsverity_info(struct inode *inode)
> +{
> + struct fsverity_info *vi = get_fsverity_info(inode);
> +
> + if (vi)
> + return 0;
> +
> + vi = create_fsverity_info(inode, false);
> + if (IS_ERR(vi))
> + return PTR_ERR(vi);
> +
> + if (!set_fsverity_info(inode, vi))
> + free_fsverity_info(vi);
> + return 0;
> +}
> +
> +/**
> + * fsverity_file_open - prepare to open a verity file
> + * @inode: the inode being opened
> + * @filp: the struct file being set up
> + *
> + * When opening a verity file, deny the open if it is for writing. Otherwise,
> + * set up the inode's ->i_verity_info (if not already done) by parsing the
> + * verity metadata at the end of the file.
> + *
> + * When combined with fscrypt, this must be called after fscrypt_file_open().
> + * Otherwise, we won't have the key set up to decrypt the verity metadata.
> + *
> + * Return: 0 on success, -errno on failure
> + */
> +int fsverity_file_open(struct inode *inode, struct file *filp)
> +{
> + if (filp->f_mode & FMODE_WRITE) {
> + pr_debug("Denying opening verity file (ino %lu) for write\n",
> + inode->i_ino);
> + return -EPERM;
> + }
> +
> + return setup_fsverity_info(inode);
> +}
> +EXPORT_SYMBOL_GPL(fsverity_file_open);
> +
> +/**
> + * fsverity_prepare_setattr - prepare to change a verity inode's attributes
> + * @dentry: dentry through which the inode is being changed
> + * @attr: attributes to change
> + *
> + * Verity files are immutable, so deny truncates. This isn't covered by the
> + * open-time check because sys_truncate() takes a path, not a file descriptor.
> + *
> + * Return: 0 on success, -errno on failure
> + */
> +int fsverity_prepare_setattr(struct dentry *dentry, struct iattr *attr)
> +{
> + if (attr->ia_valid & ATTR_SIZE) {
> + pr_debug("Denying truncate of verity file (ino %lu)\n",
> + d_inode(dentry)->i_ino);
> + return -EPERM;
> + }
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(fsverity_prepare_setattr);
> +
> +/**
> + * fsverity_prepare_getattr - prepare to get a verity inode's attributes
> + * @inode: the inode for which the attributes are being retrieved
> + *
> + * For filesystems that set the on-disk i_size to full_i_size rather than to
> + * data_i_size, to make st_size exclude the verity metadata even before the file
> + * has been opened for the first time we need to grab the original data size
> + * from the fs-verity descriptor. Currently, to implement this we just set up
> + * the ->i_verity_info, like in the ->open() hook.
> + *
> + * However, when combined with fscrypt, on an encrypted file this must only be
> + * called if the encryption key has been set up!
> + *
> + * Return: 0 on success, -errno on failure
> + */
> +int fsverity_prepare_getattr(struct inode *inode)
> +{
> + return setup_fsverity_info(inode);
> +}
> +EXPORT_SYMBOL_GPL(fsverity_prepare_getattr);
> +
> +/**
> + * fsverity_cleanup_inode - free the inode's verity info, if present
> + *
> + * Filesystems must call this on inode eviction to free ->i_verity_info.
> + */
> +void fsverity_cleanup_inode(struct inode *inode)
> +{
> + free_fsverity_info(inode->i_verity_info);
> + inode->i_verity_info = NULL;
> +}
> +EXPORT_SYMBOL_GPL(fsverity_cleanup_inode);
> +
> +/**
> + * fsverity_full_i_size - get the full (on-disk) file size
> + *
> + * If the inode has had its in-memory ->i_size overridden for fs-verity (to
> + * exclude the metadata at the end of the file), then return the full i_size
> + * which is stored on-disk. Otherwise, just return the in-memory ->i_size.
> + *
> + * Return: the full (on-disk) file size
> + */
> +loff_t fsverity_full_i_size(const struct inode *inode)
> +{
> + struct fsverity_info *vi = get_fsverity_info(inode);
> +
> + if (vi)
> + return vi->full_i_size;
> +
> + return i_size_read(inode);
> +}
> +EXPORT_SYMBOL_GPL(fsverity_full_i_size);
> +
> +static int __init fsverity_module_init(void)
> +{
> + fsverity_info_cachep = KMEM_CACHE(fsverity_info, SLAB_RECLAIM_ACCOUNT);
> + if (!fsverity_info_cachep)
> + return -ENOMEM;
> +
> + fsverity_check_hash_algs();
> +
> + pr_debug("Initialized fs-verity\n");
> + return 0;
> +}
> +
> +static void __exit fsverity_module_exit(void)
> +{
> + kmem_cache_destroy(fsverity_info_cachep);
> + fsverity_exit_hash_algs();
> +}
> +
> +module_init(fsverity_module_init);
> +module_exit(fsverity_module_exit);
> +MODULE_LICENSE("GPL v2");
> +MODULE_DESCRIPTION("fs-verity: read-only file-based integrity/authentication");
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 805bf22898cf2..26764ebcb7724 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -61,6 +61,8 @@ struct workqueue_struct;
> struct iov_iter;
> struct fscrypt_info;
> struct fscrypt_operations;
> +struct fsverity_info;
> +struct fsverity_operations;
>
> extern void __init inode_init(void);
> extern void __init inode_init_early(void);
> @@ -671,6 +673,10 @@ struct inode {
> struct fscrypt_info *i_crypt_info;
> #endif
>
> +#if IS_ENABLED(CONFIG_FS_VERITY)
> + struct fsverity_info *i_verity_info;
> +#endif
> +
> void *i_private; /* fs or device private pointer */
> } __randomize_layout;
>
> @@ -1369,6 +1375,9 @@ struct super_block {
> const struct xattr_handler **s_xattr;
> #if IS_ENABLED(CONFIG_FS_ENCRYPTION)
> const struct fscrypt_operations *s_cop;
> +#endif
> +#if IS_ENABLED(CONFIG_FS_VERITY)
> + const struct fsverity_operations *s_vop;
> #endif
> struct hlist_bl_head s_roots; /* alternate root dentries for NFS */
> struct list_head s_mounts; /* list of mounts; _not_ for fs use */
> diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
> new file mode 100644
> index 0000000000000..3af55241046aa
> --- /dev/null
> +++ b/include/linux/fsverity.h
> @@ -0,0 +1,62 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * fs-verity: read-only file-based integrity/authentication
> + *
> + * Copyright (C) 2018 Google, Inc.
> + */
> +
> +#ifndef _LINUX_FSVERITY_H
> +#define _LINUX_FSVERITY_H
> +
> +#include <linux/fs.h>
> +#include <uapi/linux/fsverity.h>
> +
> +/*
> + * fs-verity operations for filesystems
> + */
> +struct fsverity_operations {
> + int (*set_verity)(struct inode *inode, loff_t data_i_size);
> + int (*get_full_i_size)(struct inode *inode, loff_t *full_i_size_ret);
> +};
> +
> +#if __FS_HAS_VERITY
> +
> +/* setup.c */
> +extern int fsverity_file_open(struct inode *inode, struct file *filp);
> +extern int fsverity_prepare_setattr(struct dentry *dentry, struct iattr *attr);
> +extern int fsverity_prepare_getattr(struct inode *inode);
> +extern void fsverity_cleanup_inode(struct inode *inode);
> +extern loff_t fsverity_full_i_size(const struct inode *inode);
> +
> +#else /* !__FS_HAS_VERITY */
> +
> +/* setup.c */
> +
> +static inline int fsverity_file_open(struct inode *inode, struct file *filp)
> +{
> + return -EOPNOTSUPP;
> +}
> +
> +static inline int fsverity_prepare_setattr(struct dentry *dentry,
> + struct iattr *attr)
> +{
> + return -EOPNOTSUPP;
> +}
> +
> +static inline int fsverity_prepare_getattr(struct inode *inode)
> +{
> + return -EOPNOTSUPP;
> +}
> +
> +static inline void fsverity_cleanup_inode(struct inode *inode)
> +{
> +}
> +
> +static inline loff_t fsverity_full_i_size(const struct inode *inode)
> +{
> + return i_size_read(inode);
> +}
> +
> +#endif /* !__FS_HAS_VERITY */
> +
> +#endif /* _LINUX_FSVERITY_H */
> diff --git a/include/uapi/linux/fsverity.h b/include/uapi/linux/fsverity.h
> new file mode 100644
> index 0000000000000..24ebb8b6ea0d4
> --- /dev/null
> +++ b/include/uapi/linux/fsverity.h
> @@ -0,0 +1,86 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> +/*
> + * fs-verity (file-based verity) support
> + *
> + * Copyright (C) 2018 Google LLC
> + */
> +#ifndef _UAPI_LINUX_FSVERITY_H
> +#define _UAPI_LINUX_FSVERITY_H
> +
> +#include <linux/limits.h>
> +#include <linux/ioctl.h>
> +#include <linux/types.h>
> +
> +/* ========== Ioctls ========== */
> +
> +struct fsverity_digest {
> + __u16 digest_algorithm;
> + __u16 digest_size; /* input/output */
> + __u8 digest[];
> +};
> +
> +#define FS_IOC_ENABLE_VERITY _IO('f', 133)
> +#define FS_IOC_MEASURE_VERITY _IOWR('f', 134, struct fsverity_digest)
> +
> +/* ========== On-disk format ========== */
> +
> +#define FS_VERITY_MAGIC "FSVerity"
> +
> +/* Supported hash algorithms */
> +#define FS_VERITY_ALG_SHA256 1
> +
> +/* Metadata stored near the end of verity files, after the Merkle tree */
> +/* This structure is 64 bytes long */
> +struct fsverity_descriptor {
> + __u8 magic[8]; /* must be FS_VERITY_MAGIC */
> + __u8 major_version; /* must be 1 */
> + __u8 minor_version; /* must be 0 */
> + __u8 log_data_blocksize;/* log2(data-bytes-per-hash), e.g. 12 for 4KB */
> + __u8 log_tree_blocksize;/* log2(tree-bytes-per-hash), e.g. 12 for 4KB */
> + __le16 data_algorithm; /* hash algorithm for data blocks */
> + __le16 tree_algorithm; /* hash algorithm for tree blocks */
> + __le32 flags; /* flags */
> + __le32 reserved1; /* must be 0 */
> + __le64 orig_file_size; /* size of the original, unpadded data */
> + __le16 auth_ext_count; /* number of authenticated extensions */
> + __u8 reserved2[30]; /* must be 0 */
> +};
> +/* followed by list of 'auth_ext_count' authenticated extensions */
> +/*
> + * then followed by '__le16 unauth_ext_count' padded to next 8-byte boundary,
> + * then a list of 'unauth_ext_count' (may be 0) unauthenticated extensions
> + */
> +
> +/* Extension types */
> +#define FS_VERITY_EXT_ROOT_HASH 1
> +#define FS_VERITY_EXT_SALT 2
> +
> +/* Header of each extension (variable-length metadata item) */
> +struct fsverity_extension {
> + /*
> + * Length in bytes, including this header but excluding padding to next
> + * 8-byte boundary that is applied when advancing to the next extension.
> + */
> + __le32 length;
> + __le16 type; /* Type of this extension (see codes above) */
> + __le16 reserved; /* Reserved, must be 0 */
> +};
> +/* followed by the payload of 'length - 8' bytes */
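
As an illustration of the "length excludes padding, but advancing rounds up to
8 bytes" rule, here is a small user-space sketch that walks an extension list.
parse_extensions() itself is not shown in this part of the patch, so this is
not its implementation; find_extension(), get_le16(), get_le32() and
EXT_HDR_SIZE are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXT_HDR_SIZE 8	/* __le32 length + __le16 type + __le16 reserved */

static uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint16_t get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Find the first extension of the given type among 'count' extensions that
 * start at buf.  Returns the payload's offset within buf and stores its
 * length in *payload_len, or returns -1 if it is absent or malformed.
 */
static long find_extension(const uint8_t *buf, size_t buf_len,
			   unsigned int count, uint16_t type,
			   size_t *payload_len)
{
	size_t pos = 0;
	unsigned int i;

	assert(buf != NULL && payload_len != NULL);

	for (i = 0; i < count; i++) {
		uint32_t len;

		if (buf_len - pos < EXT_HDR_SIZE)
			return -1;
		len = get_le32(buf + pos);
		if (len < EXT_HDR_SIZE || len > buf_len - pos)
			return -1;
		if (get_le16(buf + pos + 4) == type) {
			*payload_len = len - EXT_HDR_SIZE;
			return (long)(pos + EXT_HDR_SIZE);
		}
		/* Advance past the item, rounded up to an 8-byte boundary */
		pos += ((size_t)len + 7) & ~(size_t)7;
		if (pos > buf_len)
			return -1;
	}
	return -1;
}
```

For example, a salt extension with a 3-byte payload has length 11 but occupies
16 bytes, so the extension after it starts at offset 16.
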
> +
> +/* Extension payload formats */
> +
> +/*
> + * FS_VERITY_EXT_ROOT_HASH payload is just a byte array, with size equal to the
> + * digest size of the hash algorithm given in the fsverity_descriptor
> + */
> +
> +/* FS_VERITY_EXT_SALT payload is just a byte array, any size */
> +
> +
> +/* Fields stored at the very end of the file */
> +struct fsverity_footer {
> + __le32 desc_reverse_offset; /* distance to fsverity_descriptor */
> + __u8 magic[8]; /* FS_VERITY_MAGIC */
> +} __packed;
> +
> +#endif /* _UAPI_LINUX_FSVERITY_H */
> --
> 2.18.0
>
--
Chuck Lever
[email protected]
Hi Chuck,
On Sun, Aug 26, 2018 at 11:55:57AM -0400, Chuck Lever wrote:
> > +
> > +/**
> > + * fsverity_verify_page - verify a data page
> > + *
> > + * Verify a page that has just been read from a file against that file's Merkle
> > + * tree. The page is assumed to be a pagecache page.
> > + *
> > + * Return: true if the page is valid, else false.
> > + */
> > +bool fsverity_verify_page(struct page *data_page)
> > +{
> > + struct inode *inode = data_page->mapping->host;
> > + const struct fsverity_info *vi = get_fsverity_info(inode);
> > + struct ahash_request *req;
> > + bool valid;
> > +
> > + req = ahash_request_alloc(vi->hash_alg->tfm, GFP_KERNEL);
> > + if (unlikely(!req))
> > + return false;
> > +
> > + valid = verify_page(inode, vi, req, data_page);
> > +
> > + ahash_request_free(req);
> > +
> > + return valid;
> > +}
> > +EXPORT_SYMBOL_GPL(fsverity_verify_page);
> > +
> > +/**
> > + * fsverity_verify_bio - verify a 'read' bio that has just completed
> > + *
> > + * Verify a set of pages that have just been read from a file against that
> > + * file's Merkle tree. The pages are assumed to be pagecache pages. Pages that
> > + * fail verification are set to the Error state. Verification is skipped for
> > + * pages already in the Error state, e.g. due to fscrypt decryption failure.
> > + */
> > +void fsverity_verify_bio(struct bio *bio)
>
> Hi Eric-
>
> This kind of API won't work for remote filesystems, which do not use
> "struct bio" to do their I/O. Could a remote filesystem solely use
> fsverity_verify_page instead?
>
Yes, filesystems don't have to use fsverity_verify_bio(). They can call
fsverity_verify_page() on each page instead. I will clarify this in the next
revision of the patchset.
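
Roughly, a filesystem that has no bio at read completion would loop over its
pages along these lines. This is only a user-space sketch of the control flow:
struct mock_page, verify_page_fn, verify_read_pages() and toy_verify() are
hypothetical stand-ins, not kernel APIs, and the "error" flag merely models
the page Error state:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; only what the sketch needs */
struct mock_page {
	const unsigned char *data;
	size_t len;
	bool error;	/* models the page's Error state */
};

/* Hypothetical per-page verifier, shaped like fsverity_verify_page() */
typedef bool (*verify_page_fn)(const struct mock_page *page);

/*
 * Verify each page of a completed read individually.  Pages already marked
 * as failed are skipped; pages that fail verification are marked as failed.
 * Returns the number of pages that passed.
 */
static size_t verify_read_pages(struct mock_page *pages, size_t nr_pages,
				verify_page_fn verify)
{
	size_t ok = 0;
	size_t i;

	assert(verify != NULL);

	for (i = 0; i < nr_pages; i++) {
		if (pages[i].error)
			continue;
		if (verify(&pages[i]))
			ok++;
		else
			pages[i].error = true;
	}
	return ok;
}

/* Toy check used only for demonstration: a page is "valid" if non-empty */
static bool toy_verify(const struct mock_page *page)
{
	return page->len > 0;
}
```
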
- Eric
Hi Chuck,
On Sun, Aug 26, 2018 at 12:22:08PM -0400, Chuck Lever wrote:
> Hi Eric-
>
> Context: I'm working on IMA support for NFSv4, and would like to
> use fs-verity (or some Merkle tree-like mechanism) eventually to
> help address the performance impacts of using IMA with large NFS
> files.
>
>
> > On Aug 24, 2018, at 12:16 PM, Eric Biggers <[email protected]> wrote:
> >
> > From: Eric Biggers <[email protected]>
> >
> > fs-verity is a filesystem feature that provides efficient, transparent
> > integrity verification and authentication of read-only files. It uses a
> > dm-verity like mechanism at the file level: a Merkle tree hidden past
> > the end of the file is used to verify any block in the file in
> > log(filesize) time. It is implemented mainly by helper functions in
> > fs/verity/ that will be shared by multiple filesystems.
>
> This description suggests that the only way fs-verity can work is
> by placing the Merkle tree data after EOF. Further, this organization
> is exposed to user space, making it a fixed part of the
> fs-verity kernel/user space API.
>
> Remote filesystems -- esp. NFS -- would prefer to manage the Merkle
> tree data in other ways. The NFSv4 protocol, for example, supports
> named streams (as some other filesystems do), and could store the
> Merkle trees in those. Or, a new pNFS layout type could be constructed
> where Merkle trees are stored separately from a file's
> content -- perhaps even on a separate file server.
>
> File servers can store this data as the servers' local filesystems
> require.
>
> Sharing how the Merkle tree is created and used is sensible, but
> IMHO the filesystem implementations should be allowed to store this
> tree however they find convenient. The Merkle trees should be
> exposed via a clean API, not as part of the file's content.
>
There has also been discussion of this on the thread for patch 02/10.
"A Merkle tree hidden past the end of the file" describes how ext4 and f2fs are
proposed to implement it, and it describes the file format expected by
FS_IOC_ENABLE_VERITY. But, at FS_IOC_ENABLE_VERITY time, a filesystem could
copy the verity metadata to somewhere else if it wanted, e.g. into a file
stream, and then truncate the file to its original size.
Afterwards, fs-verity doesn't really care where the metadata is stored.
Currently it does actually assume it's beyond EOF since it calls
read_mapping_page() directly, but that could be replaced at any time with
indirection via a method fsverity_operations.read_metadata_page().
We actually had such a method originally, but it turned out to be unnecessary
for ext4 and f2fs, so I had dropped it for now.
I will make this clearer in the next revision of the patchset, and maybe even
consider reintroducing ->read_metadata_page() to make it clear that filesystems
don't necessarily have to store the metadata beyond EOF.
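
To illustrate the kind of indirection I mean, here is a user-space sketch with
an optional hook and a "past EOF" default. Everything in it (struct
mock_inode, struct verity_ops, PAGE_SZ, and the exact signature of the hook)
is hypothetical and only meant to show the shape; it is not the interface the
patchset had or will have:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SZ 16	/* tiny page size, for demonstration only */

struct mock_inode;

/* Hypothetical operations table; read_metadata_page is optional */
struct verity_ops {
	int (*read_metadata_page)(const struct mock_inode *inode,
				  size_t index, unsigned char *out);
};

struct mock_inode {
	const struct verity_ops *ops;
	const unsigned char *file;	/* file contents, incl. data past EOF */
	size_t data_pages;		/* number of pages of original data */
	const unsigned char *stream;	/* separate metadata store, if any */
};

/* Default: metadata page 'index' lives in the file, right after the data */
static int read_past_eof(const struct mock_inode *inode, size_t index,
			 unsigned char *out)
{
	memcpy(out, inode->file + (inode->data_pages + index) * PAGE_SZ,
	       PAGE_SZ);
	return 0;
}

/* Alternative: the filesystem keeps the metadata in a separate stream */
static int read_from_stream(const struct mock_inode *inode, size_t index,
			    unsigned char *out)
{
	memcpy(out, inode->stream + index * PAGE_SZ, PAGE_SZ);
	return 0;
}

/* Core code goes through the hook if the filesystem provides one */
static int read_metadata_page(const struct mock_inode *inode, size_t index,
			      unsigned char *out)
{
	assert(inode != NULL && out != NULL);

	if (inode->ops && inode->ops->read_metadata_page)
		return inode->ops->read_metadata_page(inode, index, out);
	return read_past_eof(inode, index, out);
}
```

With this shape, ext4 and f2fs would simply leave the hook unset, while a
filesystem that stores the tree elsewhere would supply its own reader.
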
Thanks,
- Eric
Hi Chao,
On Sat, Aug 25, 2018 at 01:54:08PM +0800, Chao Yu wrote:
> On 2018/8/25 0:16, Eric Biggers wrote:
> > From: Eric Biggers <[email protected]>
> > #ifdef CONFIG_F2FS_CHECK_FS
> > #define f2fs_bug_on(sbi, condition) BUG_ON(condition)
> > #else
> > @@ -146,7 +149,7 @@ struct f2fs_mount_info {
> > #define F2FS_FEATURE_QUOTA_INO 0x0080
> > #define F2FS_FEATURE_INODE_CRTIME 0x0100
> > #define F2FS_FEATURE_LOST_FOUND 0x0200
> > -#define F2FS_FEATURE_VERITY 0x0400 /* reserved */
> > +#define F2FS_FEATURE_VERITY 0x0400
> >
> > #define F2FS_HAS_FEATURE(sb, mask) \
> > ((F2FS_SB(sb)->raw_super->feature & cpu_to_le32(mask)) != 0)
> > @@ -598,7 +601,7 @@ enum {
> > #define FADVISE_ENC_NAME_BIT 0x08
> > #define FADVISE_KEEP_SIZE_BIT 0x10
> > #define FADVISE_HOT_BIT 0x20
> > -#define FADVISE_VERITY_BIT 0x40 /* reserved */
> > +#define FADVISE_VERITY_BIT 0x40
>
> > As I suggested before, how about moving f2fs' verity_bit from i_fadvise to the more
> > generic i_flags field like ext4, so we can a) keep more bits free for those
> > demands which really need file advise fields, and b) use i_flags bits in line
> > with ext4. Not sure, but if a user wants to know whether a file is a verity one, it
> > will be easy for f2fs to export the status through FS_IOC_SETFLAGS.
>
> #define EXT4_VERITY_FL 0x00100000 /* Verity protected inode */
>
> #define F2FS_VERITY_FL 0x00100000 /* Verity protected inode */
>
I don't like using i_advise much either, but I actually don't see either
location being much better than the other at the moment. The real problem is an
artificial one: the i_flags in f2fs's on-disk format are being assumed to use
the same numbering scheme as ext4's on-disk format, which makes it seem that
they have to be in sync, and that all new ext4 flags (say, EA_INODE) also
reserve bits in f2fs and vice versa, when they in fact do not. Instead, f2fs
should use its own numbering for its i_flags, and it should map them to/from
whatever is needed for common APIs like FS_IOC_{GET,SET}FLAGS and
FS_IOC_FS{GET,SET}XATTR.
So putting the verity flag in *either* location (i_advise or i_flags) is just
kicking the can down the road. If I get around to it I will send a patch that
cleans up the f2fs flags properly...
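
As a rough sketch of what I mean by mapping, the filesystem would keep its own
on-disk numbering and translate at the API boundary. All the constants and
function names below are hypothetical placeholders, not the real f2fs or
FS_IOC_GETFLAGS values:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical filesystem-private on-disk flag numbering */
#define DISK_FL_COMPRESS	0x01
#define DISK_FL_VERITY		0x02

/* Hypothetical values of the same flags as seen through a common API */
#define API_FL_COMPRESS		0x00000004
#define API_FL_VERITY		0x00100000

static_assert((DISK_FL_COMPRESS & DISK_FL_VERITY) == 0,
	      "on-disk flags must not overlap");

static const struct {
	uint32_t disk;
	uint32_t api;
} flag_map[] = {
	{ DISK_FL_COMPRESS, API_FL_COMPRESS },
	{ DISK_FL_VERITY,   API_FL_VERITY },
};

#define FLAG_MAP_LEN (sizeof(flag_map) / sizeof(flag_map[0]))

/* Translate on-disk flags to API flags; unmapped bits are dropped */
static uint32_t disk_to_api_flags(uint32_t disk)
{
	uint32_t api = 0;
	size_t i;

	for (i = 0; i < FLAG_MAP_LEN; i++)
		if (disk & flag_map[i].disk)
			api |= flag_map[i].api;
	return api;
}

/* Translate API flags to on-disk flags; unmapped bits are dropped */
static uint32_t api_to_disk_flags(uint32_t api)
{
	uint32_t disk = 0;
	size_t i;

	for (i = 0; i < FLAG_MAP_LEN; i++)
		if (api & flag_map[i].api)
			disk |= flag_map[i].disk;
	return disk;
}
```

With a table like this, a new flag in one filesystem no longer reserves a bit
in the other's on-disk format.
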
Thanks,
- Eric
Hi Eric,
On 2018/8/27 1:04, Eric Biggers wrote:
> Hi Chuck,
>
> On Sun, Aug 26, 2018 at 11:55:57AM -0400, Chuck Lever wrote:
>>> +
>>> +/**
>>> + * fsverity_verify_page - verify a data page
>>> + *
>>> + * Verify a page that has just been read from a file against that file's Merkle
>>> + * tree. The page is assumed to be a pagecache page.
>>> + *
>>> + * Return: true if the page is valid, else false.
>>> + */
>>> +bool fsverity_verify_page(struct page *data_page)
>>> +{
>>> + struct inode *inode = data_page->mapping->host;
>>> + const struct fsverity_info *vi = get_fsverity_info(inode);
>>> + struct ahash_request *req;
>>> + bool valid;
>>> +
>>> + req = ahash_request_alloc(vi->hash_alg->tfm, GFP_KERNEL);
Some minor suggestions occurred to me when I looked at this part of the code
again before going to sleep...
1) How about introducing an iterator callback to avoid calling
ahash_request_alloc and ahash_request_free once per page? (With many pages,
this could be somewhat slower than fsverity_verify_bio...)
2) How about adding a gfp_t input argument, since I don't know whether
GFP_KERNEL is suitable for all use cases?
It seems there could be more fsverity_verify_page users as well as
fsverity_verify_bio ;)
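
To show what I mean by 1), here is a tiny user-space sketch in which the
hashing context is allocated once per batch instead of once per page. The
names (struct hash_ctx, hash_ctx_alloc, verify_one, verify_batch) are
hypothetical, struct hash_ctx only stands in for an ahash_request, and the
per-page check is a toy; a gfp_t-style argument as in 2) is left out:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical reusable hashing context, standing in for an ahash_request */
struct hash_ctx {
	unsigned long pages_hashed;
};

static unsigned long ctx_allocs;	/* counts allocations, for illustration */

static struct hash_ctx *hash_ctx_alloc(void)
{
	struct hash_ctx *ctx = calloc(1, sizeof(*ctx));

	if (ctx)
		ctx_allocs++;
	return ctx;
}

/* Toy per-page check: a page is "valid" if its first byte is nonzero */
static bool verify_one(struct hash_ctx *ctx, const unsigned char *page)
{
	ctx->pages_hashed++;
	return page[0] != 0;
}

/*
 * Verify a batch of pages with a single context allocation.  valid[i]
 * receives the result for pages[i].  Returns false if allocation failed.
 */
static bool verify_batch(const unsigned char *const *pages, size_t nr,
			 bool *valid)
{
	struct hash_ctx *ctx;
	size_t i;

	assert(pages != NULL && valid != NULL);

	ctx = hash_ctx_alloc();
	if (!ctx)
		return false;
	for (i = 0; i < nr; i++)
		valid[i] = verify_one(ctx, pages[i]);
	free(ctx);
	return true;
}
```
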
Sorry for the interruption...
Thanks,
Gao Xiang
>>> + if (unlikely(!req))
>>> + return false;
>>> +
>>> + valid = verify_page(inode, vi, req, data_page);
>>> +
>>> + ahash_request_free(req);
>>> +
>>> + return valid;
>>> +}
>>> +EXPORT_SYMBOL_GPL(fsverity_verify_page);
>>> +
>>> +/**
>>> + * fsverity_verify_bio - verify a 'read' bio that has just completed
>>> + *
>>> + * Verify a set of pages that have just been read from a file against that
>>> + * file's Merkle tree. The pages are assumed to be pagecache pages. Pages that
>>> + * fail verification are set to the Error state. Verification is skipped for
>>> + * pages already in the Error state, e.g. due to fscrypt decryption failure.
>>> + */
>>> +void fsverity_verify_bio(struct bio *bio)
>>
>> Hi Eric-
>>
>> This kind of API won't work for remote filesystems, which do not use
>> "struct bio" to do their I/O. Could a remote filesystem solely use
>> fsverity_verify_page instead?
>>
>
> Yes, filesystems don't have to use fsverity_verify_bio(). They can call
> fsverity_verify_page() on each page instead. I will clarify this in the next
> revision of the patchset.
>
> - Eric
>
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
Hi Eric,
On 2018/8/27 1:35, Eric Biggers wrote:
> Hi Chao,
>
> On Sat, Aug 25, 2018 at 01:54:08PM +0800, Chao Yu wrote:
>> On 2018/8/25 0:16, Eric Biggers wrote:
>>> From: Eric Biggers <[email protected]>
>>> #ifdef CONFIG_F2FS_CHECK_FS
>>> #define f2fs_bug_on(sbi, condition) BUG_ON(condition)
>>> #else
>>> @@ -146,7 +149,7 @@ struct f2fs_mount_info {
>>> #define F2FS_FEATURE_QUOTA_INO 0x0080
>>> #define F2FS_FEATURE_INODE_CRTIME 0x0100
>>> #define F2FS_FEATURE_LOST_FOUND 0x0200
>>> -#define F2FS_FEATURE_VERITY 0x0400 /* reserved */
>>> +#define F2FS_FEATURE_VERITY 0x0400
>>>
>>> #define F2FS_HAS_FEATURE(sb, mask) \
>>> ((F2FS_SB(sb)->raw_super->feature & cpu_to_le32(mask)) != 0)
>>> @@ -598,7 +601,7 @@ enum {
>>> #define FADVISE_ENC_NAME_BIT 0x08
>>> #define FADVISE_KEEP_SIZE_BIT 0x10
>>> #define FADVISE_HOT_BIT 0x20
>>> -#define FADVISE_VERITY_BIT 0x40 /* reserved */
>>> +#define FADVISE_VERITY_BIT 0x40
>>
>> As I suggested before, how about moving f2fs' verity_bit from i_fadvise to the
>> more generic i_flags field, like ext4? That way we a) leave more bits for
>> demands that really need file advise fields, and b) keep the i_flags bits in
>> line with ext4. Also, if a user wants to know whether a file is a verity file,
>> it will be easy for f2fs to export the status through FS_IOC_SETFLAGS.
>>
>> #define EXT4_VERITY_FL 0x00100000 /* Verity protected inode */
>>
>> #define F2FS_VERITY_FL 0x00100000 /* Verity protected inode */
>>
>
> I don't like using i_advise much either, but I actually don't see either
> location being much better than the other at the moment. The real problem is an
> artificial one: the i_flags in f2fs's on-disk format are being assumed to use
Yeah, but since most of the flags copied from vfs/ext4 are not actually used in
f2fs, and the 0x00100000 bit is also unused now, we can just define it directly
for the verity bit.
Can we do the cleanup and remapping in the ioctl interface for those unused
flags later?
Thanks,
> the same numbering scheme as ext4's on-disk format, which makes it seem that
> they have to be in sync, and that all new ext4 flags (say, EA_INODE) also
> reserve bits in f2fs and vice versa, when they in fact do not. Instead, f2fs
> should use its own numbering for its i_flags, and it should map them to/from
> whatever is needed for common APIs like FS_IOC_{GET,SET}FLAGS and
> FS_IOC_FS{GET,SET}XATTR.
>
> So putting the verity flag in *either* location (i_advise or i_flags) is just
> kicking the can down the road. If I get around to it I will send a patch that
> cleans up the f2fs flags properly...
>
> Thanks,
>
> - Eric
>
> _______________________________________________
> Linux-f2fs-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>
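A minimal sketch of the remapping Eric describes above, in which the filesystem keeps its own numbering for on-disk flags and translates to and from the values used by FS_IOC_{GET,SET}FLAGS. The DISK_* bits are purely hypothetical and are not f2fs's real layout; the API_* values mirror the corresponding FS_*_FL constants from <linux/fs.h>, with the verity bit taken from the EXT4_VERITY_FL value quoted in this thread. The function names are illustrative only.

```c
#include <stddef.h>
#include <stdint.h>

/* Values visible through FS_IOC_GETFLAGS/FS_IOC_SETFLAGS (as in <linux/fs.h>). */
#define API_IMMUTABLE_FL 0x00000010u
#define API_APPEND_FL    0x00000020u
#define API_NOATIME_FL   0x00000080u
#define API_VERITY_FL    0x00100000u /* same value as EXT4_VERITY_FL above */

/* Hypothetical filesystem-private on-disk numbering (NOT f2fs's real layout). */
#define DISK_IMMUTABLE 0x01u
#define DISK_APPEND    0x02u
#define DISK_NOATIME   0x04u
#define DISK_VERITY    0x08u

/* One row per flag that exists in both numbering schemes. */
static const struct {
	uint32_t disk;
	uint32_t api;
} flag_map[] = {
	{ DISK_IMMUTABLE, API_IMMUTABLE_FL },
	{ DISK_APPEND,    API_APPEND_FL },
	{ DISK_NOATIME,   API_NOATIME_FL },
	{ DISK_VERITY,    API_VERITY_FL },
};

#define FLAG_MAP_LEN (sizeof(flag_map) / sizeof(flag_map[0]))

/* Translate on-disk flags to the values reported to userspace. */
uint32_t disk_to_api_flags(uint32_t disk)
{
	uint32_t api = 0;
	size_t i;

	for (i = 0; i < FLAG_MAP_LEN; i++)
		if (disk & flag_map[i].disk)
			api |= flag_map[i].api;
	return api;
}

/* Translate userspace flags to on-disk flags; unmapped bits are dropped. */
uint32_t api_to_disk_flags(uint32_t api)
{
	uint32_t disk = 0;
	size_t i;

	for (i = 0; i < FLAG_MAP_LEN; i++)
		if (api & flag_map[i].api)
			disk |= flag_map[i].disk;
	return disk;
}
```

With a table like this, adding a flag to one filesystem's on-disk format would not implicitly reserve a bit in another's; only the table row ties the two numbering schemes together.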
On 08/27, Chao Yu wrote:
> Hi Eric,
>
> On 2018/8/27 1:35, Eric Biggers wrote:
> > [...]
>
> Yeah, but since most of the flags copied from vfs/ext4 are not actually used in
> f2fs, and the 0x00100000 bit is also unused now, we can just define it directly
> for the verity bit.
>
> Can we do the cleanup and remapping in the ioctl interface for those unused
> flags later?
No, it was reserved by f2fs-tools, and I think this should be aligned to the
encryption bit. Moreover, we guarantee i_flags less strictly from power-cut than
i_advise.
>
> Thanks,
>
> > [...]
On 2018/8/28 15:27, Jaegeuk Kim wrote:
> On 08/27, Chao Yu wrote:
>> Hi Eric,
>>
>> [...]
>
> No, it was reserved by f2fs-tools,
That's not a problem, since we didn't use that reserved bit in any of images
now, there is no backward compatibility issue.
> and I think this should be aligned to the encryption bit.
Alright, we could, but if so, i_advise will run out of space earlier, after that
we have to add real advice bit into i_inline or i_flags, that would be a little
weird.
For encryption bit, as a common vfs feature flag, in the beginning of encryption
development, it will be better to set it into i_flags, IMO, but now, we have to
keep it as it was.
> Moreover, we guarantee i_flags less strictly from power-cut than i_advise.
IMO, in power-cut scenario, it needs to keep both i_flags and i_advise being
recoverable strictly. Any condition that we can not recover i_flags?
Thanks,
>
>>
>> Thanks,
>>
>>> [...]
On 08/28, Chao Yu wrote:
> On 2018/8/28 15:27, Jaegeuk Kim wrote:
> > On 08/27, Chao Yu wrote:
> >> Hi Eric,
> >>
> >> [...]
> >
> > No, it was reserved by f2fs-tools,
>
> That's not a problem, since we didn't use that reserved bit in any of images
> now, there is no backward compatibility issue.
We're using that.
>
> > and I think this should be aligned to the encryption bit.
>
> Alright, we could, but if so, i_advise will run out of space earlier, after that
> we have to add real advice bit into i_inline or i_flags, that would be a little
> weird.
>
> For encryption bit, as a common vfs feature flag, in the beginning of encryption
> development, it will be better to set it into i_flags, IMO, but now, we have to
> keep it as it was.
>
> > Moreover, we guarantee i_flags less strictly from power-cut than i_advise.
>
> IMO, in power-cut scenario, it needs to keep both i_flags and i_advise being
> recoverable strictly. Any condition that we can not recover i_flags?
In __f2fs_ioc_setflags, f2fs_mark_inode_dirty_sync(inode, false);
>
> Thanks,
>
> >
> >>
> >> Thanks,
> >>
> >>> [...]
On 2018/8/29 1:01, Jaegeuk Kim wrote:
> On 08/28, Chao Yu wrote:
>> On 2018/8/28 15:27, Jaegeuk Kim wrote:
>>> On 08/27, Chao Yu wrote:
>>>> Hi Eric,
>>>>
>>>> [...]
>>>
>>> No, it was reserved by f2fs-tools,
>>
>> That's not a problem, since we didn't use that reserved bit in any of images
>> now, there is no backward compatibility issue.
>
> We're using that.
Oops, if it is already in production, I agree to keep it in i_advise; otherwise,
we can still discuss its location.
>
>>
>>> and I think this should be aligned to the encryption bit.
>>
>> Alright, we could, but if so, i_advise will run out of space earlier, after that
>> we have to add real advice bit into i_inline or i_flags, that would be a little
>> weird.
>>
>> For encryption bit, as a common vfs feature flag, in the beginning of encryption
>> development, it will be better to set it into i_flags, IMO, but now, we have to
>> keep it as it was.
>>
>>> Moreover, we guarantee i_flags less strictly from power-cut than i_advise.
>>
>> IMO, in power-cut scenario, it needs to keep both i_flags and i_advise being
>> recoverable strictly. Any condition that we can not recover i_flags?
>
> In __f2fs_ioc_setflags, f2fs_mark_inode_dirty_sync(inode, false);
Ah, that's right, do you remember why we treat them with different recoverable
level?
Thanks,
>
>>
>> Thanks,
>>
>>>
>>>>
>>>> Thanks,
>>>>
>>>>> [...]
On 08/29, Chao Yu wrote:
> On 2018/8/29 1:01, Jaegeuk Kim wrote:
> > On 08/28, Chao Yu wrote:
> >> On 2018/8/28 15:27, Jaegeuk Kim wrote:
> >>> On 08/27, Chao Yu wrote:
> >>>> Hi Eric,
> >>>>
> >>>> [...]
> >>>
> >>> No, it was reserved by f2fs-tools,
> >>
> >> That's not a problem, since we didn't use that reserved bit in any of images
> >> now, there is no backward compatibility issue.
> >
> > We're using that.
>
> Oops, if it is already in production, I agree to keep it in i_advise; otherwise,
> we can still discuss its location.
>
> >
> >>
> >>> and I think this should be aligned to the encryption bit.
> >>
> >> Alright, we could, but if so, i_advise will run out of space earlier, after that
> >> we have to add real advice bit into i_inline or i_flags, that would be a little
> >> weird.
> >>
> >> For encryption bit, as a common vfs feature flag, in the beginning of encryption
> >> development, it will be better to set it into i_flags, IMO, but now, we have to
> >> keep it as it was.
> >>
> >>> Moreover, we guarantee i_flags less strictly from power-cut than i_advise.
> >>
> >> IMO, in power-cut scenario, it needs to keep both i_flags and i_advise being
> >> recoverable strictly. Any condition that we can not recover i_flags?
> >
> > In __f2fs_ioc_setflags, f2fs_mark_inode_dirty_sync(inode, false);
>
> Ah, that's right, do you remember why we treat them with different recoverable
> level?
Because I thought such flags wouldn't be critical across power cuts, and that
guaranteeing them via write_inode() or fsync() would be enough for us.
>
> Thanks,
>
> >
> >>
> >> Thanks,
> >>
> >>>
> >>>>
> >>>> Thanks,
> >>>>
> >>>>> [...]
On Fri, 2018-08-24 at 09:16 -0700, Eric Biggers wrote:
[...]
> Since fs-verity provides the Merkle tree root hash in constant time and
> verifies data blocks on-demand, it is useful for efficiently verifying
> the authenticity of, or "appraising", large files of which only a small
> portion may be accessed -- such as Android application (APK) files. It
> can also be useful in "audit" use cases where file hashes are logged.
> fs-verity also provides better protection against malicious disk
> firmware than an ahead-of-time hash, since fs-verity re-verifies data
> each time it's paged in.
[...]
> Feedback on the design and implementation is greatly appreciated.
Hi,
I've looked at the series and the slides linked from the recent lwn.net
article, but I'm not sure how fs-verity intends to protect against
malicious firmware (or offline modification). Similar to IMA/EVM,
fs-verity doesn't seem to include the name/location of the file in its
verification. So the firmware/an attacker could replace one
fs-verity-protected file with another (maybe a trusted system APK with
another one for which a vulnerability was discovered, or /sbin/init with
/bin/sh).
Is the expected root hash of the file provided from somewhere else, so
this is not a problem on Android? Or is this problem out-of-scope for
fs-verity?
For IMA/EVM, there were patches by Dmitry to address this class of
attacks (they were not merged, though):
https://lwn.net/Articles/574221/
Thanks,
Jan
[1] https://events.linuxfoundation.org/wp-content/uploads/2017/11/fs-verify_Mike-Halcrow_Eric-Biggers.pdf
--
Pengutronix e.K. | |
Industrial Linux Solutions | http://www.pengutronix.de/ |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
Hi Jan,
On Fri, Aug 31, 2018 at 10:05:23PM +0200, Jan Lübbe wrote:
> On Fri, 2018-08-24 at 09:16 -0700, Eric Biggers wrote:
> [...]
> > Since fs-verity provides the Merkle tree root hash in constant time and
> > verifies data blocks on-demand, it is useful for efficiently verifying
> > the authenticity of, or "appraising", large files of which only a small
> > portion may be accessed -- such as Android application (APK) files. It
> > can also be useful in "audit" use cases where file hashes are logged.
> > fs-verity also provides better protection against malicious disk
> > firmware than an ahead-of-time hash, since fs-verity re-verifies data
> > each time it's paged in.
> [...]
> > Feedback on the design and implementation is greatly appreciated.
>
> Hi,
>
> I've looked at the series and the slides linked from the recent lwn.net
> article, but I'm not sure how fs-verity intends to protect against
> malicious firmware (or offline modification). Similar to IMA/EVM,
> fs-verity doesn't seem to include the name/location of the file in its
> verification. So the firmware/an attacker could replace one
> fs-verity-protected file with another (maybe a trusted system APK with
> another one for which a vulnerability was discovered, or /sbin/init with
> /bin/sh).
>
> Is the expected root hash of the file provided from somewhere else, so
> this is not a problem on Android? Or is this problem out-of-scope for
> fs-verity?
>
> For IMA/EVM, there were patches by Dmitry to address this class of
> attacks (they were not merged, though):
> https://lwn.net/Articles/574221/
>
> Thanks,
> Jan
>
> [1] https://events.linuxfoundation.org/wp-content/uploads/2017/11/fs-verify_Mike-Halcrow_Eric-Biggers.pdf
Well, it's up to the user of fs-verity.
If you know that, say, /bin/sh is supposed to have a fs-verity file measurement
of sha256:01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b, then
you can just check that.
Or, after IMA support is added, users will be able to configure the fs-verity
measurements to go into the IMA measurement log and/or audit log, just like
regular file hashes. Those records include both the file path and hash.
On the other hand, if the policy you want to enforce is just that a particular
file is using fs-verity and that its hash has been signed by a particular key,
then indeed, if there are multiple hashes that were signed with that key, an
attacker can replace the contents with a different signed contents. But that's
not really fs-verity's fault; it's really the type of policy that the user chose
to use on top of it, as part of their overall system security architecture.
Yes, the initial plan for Android APK verification is to just use that type of
policy, probably using the in-kernel signature verification support included in
patch 07/10. So it will therefore have that limitation. Still, it will be a
massive improvement over the status quo where attackers can make arbitrary
changes to APK files post-installation, e.g. to persistently inject arbitrary
malware.
It will also be possible to improve the security properties of APK verification
in the future by doing additional checks, such as optionally including
additional file metadata in the fs-verity measurement (fs-verity's design allows
for this; they can be added as authenticated extensions), or even simply
checking for the correct metadata from trusted userspace code running in the
/system partition. Or perhaps the trusted userspace code could download the
expected fs-verity measurement of the APK from the app store given the app name.
There are lots of options. But we have to start somewhere, and fs-verity is a
tool that seems to be needed in any case.
- Eric
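The first option above, checking a file against a measurement that is already known for its path, can be sketched roughly as follows. This is only an illustration under stated assumptions: the table, the struct, and the function name are hypothetical, the single entry reuses the /bin/sh example digest from this mail, and how the measured digest is obtained from the kernel is deliberately left out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical policy entry: a path and the measurement expected for it. */
struct expected_measurement {
	const char *path;
	const char *digest; /* "<algorithm>:<hex digest>" */
};

/* Example policy; the single entry reuses the digest quoted in the thread. */
static const struct expected_measurement policy[] = {
	{ "/bin/sh",
	  "sha256:01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b" },
};

#define POLICY_LEN (sizeof(policy) / sizeof(policy[0]))

/*
 * Return 1 if 'measured' is the measurement expected for 'path'.
 * Return 0 on mismatch or if the path has no policy entry, so that
 * unknown files are rejected rather than silently accepted.
 */
int measurement_is_expected(const char *path, const char *measured)
{
	size_t i;

	for (i = 0; i < POLICY_LEN; i++)
		if (strcmp(policy[i].path, path) == 0)
			return strcmp(policy[i].digest, measured) == 0;
	return 0;
}
```

Because the lookup is keyed by path, swapping one validly signed file for another would change the measurement seen at that path and fail the check, which is the property the signature-only policy lacks.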
Hi,
On Fri, Aug 24, 2018 at 9:16 PM, Eric Biggers <[email protected]> wrote:
> Hi Gao,
>
> On Sat, Aug 25, 2018 at 10:29:26AM +0800, Gao Xiang wrote:
>> Hi,
>>
>> At last, I hope filesystems could select the on-disk position of hash tree and 'struct fsverity_descriptor'
>> rather than fixed in the end of verity files...I think if fs-verity preparing such support and interfaces could be better.....hmmm... :(
>
> In theory it would be a much cleaner design to store verity metadata separately
> from the data. But the Merkle tree can be very large. For example, a 1 GB file
> using SHA-512 would have a 16.6 MB Merkle tree. So the Merkle tree can't be an
> extended attribute, since the xattrs API requires xattrs to be small (<= 64 KB),
> and most filesystems further limit xattr sizes in their on-disk format to as
> little as 4 KB. Furthermore, even if both of these limits were to be increased,
> the xattrs functions (both the syscalls, and the internal functions that
> filesystems have) are all based around getting/setting the entire xattr value.
>
> Also when used with fscrypt, we want the Merkle tree and fsverity_descriptor to
> be encrypted, so they don't leak plaintext hashes. And we want the Merkle
> tree to be paged into memory, just like the file contents, to take advantage of
> the usual Linux memory management.
>
> What we really need is *streams*, like NTFS has. But the filesystems we're
> targeting don't support streams, nor does the Linux syscall interface have any
> API for accessing streams, nor does the VFS support them.
>
> Adding streams support to all those things would be a huge multi-year effort,
> controversial, and almost certainly not worth it just for fs-verity.
>
> So simply storing the verity metadata past i_size seems like the best solution
> for now.
>
> That being said, in the future we could pretty easily swap out the calls to
> read_mapping_page() with something else if a particular filesystem wanted to
> store the metadata somewhere else. We originally even had a
> ->read_metadata_page() method in the filesystem's fsverity_operations. It
> turned out to be unnecessary, so I replaced it with direct calls to
> read_mapping_page(), but that could be changed back at any time.
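
As a rough check on the size argument in the quoted reply, the sketch below
estimates how large a Merkle tree gets for a given file size. It assumes
4096-byte tree blocks, each block packed with fixed-size digests, and every
level padded to whole blocks; these assumptions and the helper name
merkle_tree_size() are illustrative and not taken from the patchset, so the
exact figure may differ from the one quoted above.

```python
def merkle_tree_size(file_size, block_size=4096, digest_size=64):
    """Estimate Merkle tree size; returns (total_bytes, blocks_per_level).

    Assumed layout (hypothetical, for illustration only): each tree block
    holds block_size // digest_size digests, and levels are built upward
    until a single block remains, whose hash would be the root hash.
    """
    hashes_per_block = block_size // digest_size
    # Number of data blocks in the file, rounding up.
    blocks = -(-file_size // block_size)
    levels = []
    while blocks > 1:
        # Each level needs one digest per block of the level below it.
        blocks = -(-blocks // hashes_per_block)
        levels.append(blocks)
    return sum(levels) * block_size, levels


if __name__ == "__main__":
    one_gib = 1 << 30
    for name, digest_size in (("SHA-256", 32), ("SHA-512", 64)):
        total, levels = merkle_tree_size(one_gib, digest_size=digest_size)
        print(f"{name}: {total} bytes, blocks per level: {levels}")
```

Under these assumptions a 1 GiB file with SHA-512 needs 4161 tree blocks
(about 16.25 MiB), which is orders of magnitude beyond the 64 KB xattr limit
mentioned above.
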
What about an xattr not to hold the Merkle tree, but to contain a
suitable reference to a file/inode+offset that contains it (+ toplevel
hash for said tree/file or the descriptor/struct)?
If you also expose said file in the directory structure, things such
as backups might be easier to handle. For where the tree is appended
to the file, you could self-reference.
-Olof
On Sat, Aug 25, 2018, at 12:48 AM, Eric Biggers wrote:
>
> As Ted pointed out, only truncates are denied on fs-verity files, not other
> metadata changes like chmod().
>
> Think of it this way: the purpose of fs-verity is *not* to make files immutable.
> It's to hash them.
Sorry for my unfamiliarity with Android internals but - in earlier discussion
I believe it was mentioned that APKs (zip files?) are being targeted here, right?
Now AIUI, Zip files have an internal header that contains e.g. the size and
indexes into the internal files. So if someone added random data to the end
of a zip file, nothing is going to end up actually reading it.
However, there are file formats that use the size of the file reported by stat();
at least OSTree does this with serializing GVariant. I'm sure there are others -
I'd imagine at least some things parsing ELF do this?
In such a case, we really want to deny appending to the file as well.
Unless there's some mechanism to deny applications reading not-verified
data?
And "hidden" data after fs-verity protected files would be a nice place
for persistent malware to hide.
Does anyone know of a use case for appending to an fs-verity file?
The slides here:
https://events.linuxfoundation.org/wp-content/uploads/2017/11/fs-verify_Mike-Halcrow_Eric-Biggers.pdf
even say "File becomes read-only!"
If not, then here's a strawman: Require that at FS_IOC_ENABLE_VERITY time
the file does not have any +w bits set (and I guess no ACLs that do so...
that may get ugly).
I think that would make it easier to later factor out a "_CONTENTS_IMMUTABLE"
flag.
Hi Colin,
On Fri, Sep 14, 2018 at 09:15:30AM -0400, Colin Walters wrote:
> On Sat, Aug 25, 2018, at 12:48 AM, Eric Biggers wrote:
> >
> > As Ted pointed out, only truncates are denied on fs-verity files, not other
> > metadata changes like chmod().
> >
> > Think of it this way: the purpose of fs-verity is *not* to make files immutable.
> > It's to hash them.
>
> Sorry for my unfamiliarity with Android internals but - in earlier discussion
> I believe it was mentioned that APKs (zip files?) are being targeted here, right?
>
> Now AIUI, Zip files have an internal header that contains e.g. the size and
> indexes into the internal files. So if someone added random data to the end
> of a zip file, nothing is going to end up actually reading it.
>
> However, there are file formats that use the size of the file reported by stat();
> at least OSTree does this with serializing GVariant. I'm sure there are others -
> I'd imagine at least some things parsing ELF do this?
> In such a case, we really want to deny appending to the file as well.
>
> Unless there's some mechanism to deny applications reading not-verified
> data?
>
> And "hidden" data after fs-verity protected files would be a nice place
> for persistent malware to hide.
>
> Does anyone know of a use case for appending to an fs-verity file?
>
> The slides here:
> https://events.linuxfoundation.org/wp-content/uploads/2017/11/fs-verify_Mike-Halcrow_Eric-Biggers.pdf
> even say "File becomes read-only!"
>
> If not, then here's a strawman: Require that at FS_IOC_ENABLE_VERITY time
> the file does not have any +w bits set (and I guess no ACLs that do so...
> that may get ugly).
>
> I think that would make it easier to later factor out a "_CONTENTS_IMMUTABLE"
> flag.
>
After the verity bit is enabled, the verity metadata is not visible to
userspace. Yes, that means i_size is adjusted too. Also all contents
modifications are denied, including appends.
- Eric
On Fri, Sep 14, 2018 at 09:21:43AM -0700, Eric Biggers wrote:
> >
> > Now AIUI, Zip files have an internal header that contains e.g. the size and
> > indexes into the internal files. So if someone added random data to the end
> > of a zip file, nothing is going to end up actually reading it.
>
> After the verity bit is enabled, the verity metadata is not visible to
> userspace. Yes, that means i_size is adjusted too. Also all contents
> modifications are denied, including appends.
One of the reasons why this is important is that ZIP files *also*
have a central directory at the end. And in the case of APK files,
there is an in-band signature block which is located between the
end of the last file and the central directory, which can be located
by starting at the end of the file, finding the length of the central
directory, and then backing up to find the signature block.
- Ted
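
The end-relative lookup described above can be sketched as follows. This is a
minimal illustration, not Android's actual parser: it finds the ZIP
end-of-central-directory (EOCD) record by searching backward from the end of
the data, then reads the central directory's size and offset from it. The
helper names are hypothetical, ZIP64 and multi-disk archives are ignored, and
locating the APK signing block itself is not attempted; per the description
above it would sit just before the offset this returns.

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"               # end-of-central-directory signature
EOCD_FMT = "<4sHHHHIIH"                # fixed-size part of the EOCD record
EOCD_SIZE = struct.calcsize(EOCD_FMT)  # 22 bytes
MAX_COMMENT = 0xFFFF                   # the trailing comment is at most 65535 bytes


def find_central_directory(data):
    """Locate the central directory by working backward from the end."""
    start = max(0, len(data) - EOCD_SIZE - MAX_COMMENT)
    # Naive search: take the last signature match in the tail of the data.
    pos = data.rfind(EOCD_SIG, start)
    if pos < 0:
        raise ValueError("no end-of-central-directory record found")
    fields = struct.unpack_from(EOCD_FMT, data, pos)
    entries, cd_size, cd_offset, comment_len = fields[4:8]
    return {
        "eocd_pos": pos,
        "entries": entries,
        "cd_size": cd_size,
        "cd_offset": cd_offset,
        # True only if the record (plus comment) ends exactly at end of data.
        "ends_at_eof": pos + EOCD_SIZE + comment_len == len(data),
    }


def make_sample_zip():
    """Build a small two-entry archive in memory for demonstration."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("a.txt", "hello")
        zf.writestr("b.txt", "world")
    return buf.getvalue()


if __name__ == "__main__":
    data = make_sample_zip()
    print(find_central_directory(data))
    # Trailing bytes after the archive break the "ends at EOF" property.
    print(find_central_directory(data + b"junk")["ends_at_eof"])
```

Because everything is found relative to the end of the file, a reader that
saw extra bytes past the original i_size would start its search from the
wrong place, which is why the reported size has to match the verified
contents exactly.
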