Received: by 2002:a05:6358:16cc:b0:ea:6187:17c9 with SMTP id r12csp2274896rwl; Fri, 6 Jan 2023 04:26:37 -0800 (PST) X-Google-Smtp-Source: AMrXdXskzFBSxLDRgwsPodiMeYmkwYt8Yntp4OMzkVxxuHBP3ZIgVLMuBxRiVz45JWytdUXmJAGJ X-Received: by 2002:a05:6402:f02:b0:46f:a2c2:405b with SMTP id i2-20020a0564020f0200b0046fa2c2405bmr58293340eda.37.1673007996811; Fri, 06 Jan 2023 04:26:36 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1673007996; cv=none; d=google.com; s=arc-20160816; b=syglpLbXzUet/5UK7C1b+WDupBHzcxF9HchpX6fSEc1BDiLiIuaMcvNsMAaKODjJxC PDuT/aEsVUKDYwk9xdzw+CVPBEhOIkQNO2R6SeJ7qZ3FbkLJco7rkHYUlCjJLxM0He6s 30Dv0aBKrLRXhozOkhCNyfNLQH8WmO1PAUbezeDIMljcvDqnjssjAPzJEoObuPIZp6rH yrOOOjUoUdG0s4LzGZisAWIHiZ2EECjY0h47rlkTEN1a6uVAN8zx+QAga/I9HoiwermO utxtWOyH+nnDvOTFK2hGTRLUqw4Q810Gl68FUaKfYSJ4VLg6J0uoDlmXsFT/j7q84KSS d5pA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:user-agent:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :dkim-signature; bh=LzsbYm1v1GieXeKfZ+NGxeHC8we7FOSWZE8i0hStJ+g=; b=HL86Wh009olHlVnwszIlax5JSEVjCu3XY2TVUsRMUmrvcRpTY8WFxdvwkAjJwCIwe0 gkVk03LsJn++K0CGyy2oxv0NSa4NbHu9fzlJK9slfwhuQZDjjWcX/26mbzJQZfDfU6vc UYtm2NT8uJW6jB1/7iTOxQ62Bnol8LtkPhlTExTozFTixRE2KaClt1U+Oqxp8jXksDXU NllPzxUtvFTcz9KplpZLqpJFqaZxCo2fPY9/8uoSLosZoIl3HhKmIn+u8+3WiMYsVgYn wDxIYhvJ6gHU/8KQBYBhQTpffS18ETQPjp1/q1hD8xr1JY33u/Qr+k3uiGw6ypJdcywj CK+Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=HbDp6hgV; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id i7-20020a50fc07000000b0046b1abd7884si1291367edr.108.2023.01.06.04.26.23; Fri, 06 Jan 2023 04:26:36 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=HbDp6hgV; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232599AbjAFMTw (ORCPT + 56 others); Fri, 6 Jan 2023 07:19:52 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52968 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234184AbjAFMTX (ORCPT ); Fri, 6 Jan 2023 07:19:23 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1BF80736DA for ; Fri, 6 Jan 2023 04:18:48 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1673007527; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=LzsbYm1v1GieXeKfZ+NGxeHC8we7FOSWZE8i0hStJ+g=; b=HbDp6hgVTV0rBghJ93iF3tDhx9k7yXD/Zysd4gwo6wAruz+y90jYNxB2UpKS/yQ7wmqtu+ PWDdwp9avm6VWMfJ62IM7YWfeQBF0/5HC2w0uuMkMVMVg2boRhFyQR5FFiiDQlwMJN2GqY r6Pk6tQLc+mXa0teUoNsZZtFvWhWFUY= Received: from mail-io1-f70.google.com (mail-io1-f70.google.com [209.85.166.70]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-669-UtJUuF4GMDGCL46VWuQLSQ-1; Fri, 06 Jan 2023 07:18:46 -0500 X-MC-Unique: UtJUuF4GMDGCL46VWuQLSQ-1 Received: by mail-io1-f70.google.com with SMTP id u1-20020a5d8181000000b006ee29a8c421so729541ion.19 for ; Fri, 06 Jan 2023 04:18:46 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=user-agent:in-reply-to:content-disposition:mime-version:references :message-id:subject:cc:to:from:date:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=LzsbYm1v1GieXeKfZ+NGxeHC8we7FOSWZE8i0hStJ+g=; b=zaqIM9noa+A5XkWjdg5J3YJEoZhQKMOPQJ5PN5xKsMZJBA+S9BUBeh2tEDxW6YYIOP pfJebTjCGUa/puuuOJ17okJCFnMDazwuBsT5cEG5LKgWLP4PF5hL+GRwH3y2cLn9xtOH ISWK199yows68pDJWHhCZ8AKvR5Brx1h3oDKwUP9SJIrQpDZiwFFutnMl+/UbnneDxiA ykimUx37R4mzqdH8HXg8FlGeq6PeCvTOxvS9veLjWBH9bD+NMoBBhCE8kMpJL/O+W9tx lMTfdABvaAFjBmysasD5SksceB2GDDitN+uIZH5rSb4u7ma0C7e+y+25EG8H2l/eXU/v 7sGg== X-Gm-Message-State: AFqh2kqZ1oqrQWyrzjYwaQTBLrQxY3N2XNfyvLwutqDjtzl5CNviSnxB QRdul6990aeTd6Y9JTnLGrb9S3lDNiI2p9eTfpFQ38Vrl8TPK8NDVAPSu0u+94mvGtRyJ+hA9uB +s+oy3GkUEZSeJ+uq8jpS3u1K X-Received: by 2002:a92:c90b:0:b0:304:d0b7:e56c with SMTP id t11-20020a92c90b000000b00304d0b7e56cmr37671686ilp.14.1673007523894; Fri, 06 Jan 2023 04:18:43 -0800 (PST) X-Received: by 2002:a92:c90b:0:b0:304:d0b7:e56c with SMTP id t11-20020a92c90b000000b00304d0b7e56cmr37671669ilp.14.1673007523452; Fri, 06 Jan 2023 04:18:43 -0800 (PST) Received: from x1 (c-73-214-169-22.hsd1.pa.comcast.net. [73.214.169.22]) by smtp.gmail.com with ESMTPSA id c16-20020a92c8d0000000b0030c423515bfsm347633ilq.27.2023.01.06.04.18.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 06 Jan 2023 04:18:42 -0800 (PST) Date: Fri, 6 Jan 2023 07:18:41 -0500 From: Brian Masney To: Alexander Larsson Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, gscrivan@redhat.com Subject: Re: [PATCH 4/6] composefs: Add filesystem implementation Message-ID: References: <1f0bd3e3a0c68ee19dd96ee0d573bb113428f1b6.1669631086.git.alexl@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1f0bd3e3a0c68ee19dd96ee0d573bb113428f1b6.1669631086.git.alexl@redhat.com> User-Agent: Mutt/2.2.7 (2022-08-07) X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Nov 28, 2022 at 12:17:12PM +0100, Alexander Larsson wrote: > This is the basic inode and filesystem implementation. > > Signed-off-by: Alexander Larsson > Signed-off-by: Giuseppe Scrivano Note: I'm looking at this from the VFS viewpoint since I haven't done anything in this subsystem. Just looking for some generic suggestions. > --- > fs/composefs/cfs.c | 941 +++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 941 insertions(+) > create mode 100644 fs/composefs/cfs.c > > diff --git a/fs/composefs/cfs.c b/fs/composefs/cfs.c > new file mode 100644 > index 000000000000..4ed355ab079d > --- /dev/null > +++ b/fs/composefs/cfs.c > @@ -0,0 +1,941 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * composefs > + * > + * Copyright (C) 2000 Linus Torvalds. > + * 2000 Transmeta Corp. > + * Copyright (C) 2021 Giuseppe Scrivano > + * Copyright (C) 2022 Alexander Larsson > + * > + * This file is released under the GPL. > + */ > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include "cfs-internals.h" > + > +MODULE_LICENSE("GPL"); > +MODULE_AUTHOR("Giuseppe Scrivano "); An entry for composefs should be added to the MAINTAINERS file as well. > + > +#define CFS_MAX_STACK 500 > + > +struct cfs_info { > + struct cfs_context_s cfs_ctx; > + > + char *base_path; > + > + size_t n_bases; > + struct vfsmount **bases; > + > + u32 verity_check; /* 0 == none, 1 == if specified in image, 2 == require in image */ > + bool has_digest; > + u8 digest[SHA256_DIGEST_SIZE]; /* fs-verity digest */ > +}; > + > +struct cfs_inode { > + /* must be first for clear in cfs_alloc_inode to work */ > + struct inode vfs_inode; [ snip ] > +static struct inode *cfs_alloc_inode(struct super_block *sb) > +{ > + struct cfs_inode *cino = > + alloc_inode_sb(sb, cfs_inode_cachep, GFP_KERNEL); > + > + if (!cino) > + return NULL; > + > + memset((u8 *)cino + sizeof(struct inode), 0, > + sizeof(struct cfs_inode) - sizeof(struct inode)); Why not use container_of() to look up the vfs_inode and then you can get rid of the restriction of this being first. This may also break with structure randomization turned on. > + > + struct cfs_inode_data_s inode_data; > +}; > + > +static inline struct cfs_inode *CFS_I(struct inode *inode) CFS_I in upper case doesn't match naming conventions in the rest of the kernel. > +{ > + return container_of(inode, struct cfs_inode, vfs_inode); > +} > + > +static struct file empty_file; > + > +static const struct file_operations cfs_file_operations; > +static const struct vm_operations_struct generic_file_vm_ops; > + > +static const struct super_operations cfs_ops; > +static const struct file_operations cfs_dir_operations; > +static const struct inode_operations cfs_dir_inode_operations; > +static const struct inode_operations cfs_file_inode_operations; > +static const struct inode_operations cfs_link_inode_operations; > + > +static const struct xattr_handler *cfs_xattr_handlers[]; > +static const struct export_operations cfs_export_operations; > + > +static const struct address_space_operations cfs_aops = { > + .direct_IO = noop_direct_IO, > +}; > + > +static ssize_t cfs_listxattr(struct dentry *dentry, char *names, size_t size); > + > +/* copied from overlayfs. */ > +static unsigned int cfs_split_basedirs(char *str) > +{ > + unsigned int ctr = 1; > + char *s, *d; > + > + for (s = d = str;; s++, d++) { > + if (*s == '\\') { > + s++; > + } else if (*s == ':') { > + *d = '\0'; > + ctr++; > + continue; > + } > + *d = *s; > + if (!*s) > + break; > + } > + return ctr; > +} To expand on the comment, this is ovl_split_lowerdirs in fs/overlayfs/super.c. It'd be nice if there was a common place where this could go. > + > +static struct inode *cfs_make_inode(struct cfs_context_s *ctx, > + struct super_block *sb, ino_t ino_num, > + struct cfs_inode_s *ino, > + const struct inode *dir) > +{ > + struct cfs_xattr_header_s *xattrs = NULL; > + struct cfs_inode *cino; > + struct inode *inode = NULL; > + struct cfs_inode_data_s inode_data = { 0 }; > + int ret, res; Put in reverse Christmas tree order. > + > + res = cfs_init_inode_data(ctx, ino, ino_num, &inode_data); > + if (res < 0) { > + ret = res; > + goto fail; Just return ERR_PTR(res) here. xattrs is set to NULL at this point and the fail path attempts to free it. Yes, there's a check in kfree for this. The code in cfs_inode_data_put appears to handle this NULL case fine as well. > + } > + > + inode = new_inode(sb); > + if (inode) { > + inode_init_owner(&init_user_ns, inode, dir, ino->st_mode); > + inode->i_mapping->a_ops = &cfs_aops; > + > + cino = CFS_I(inode); > + cino->inode_data = inode_data; > + > + inode->i_ino = ino_num; > + set_nlink(inode, ino->st_nlink); > + inode->i_rdev = ino->st_rdev; > + inode->i_uid = make_kuid(current_user_ns(), ino->st_uid); > + inode->i_gid = make_kgid(current_user_ns(), ino->st_gid); > + inode->i_mode = ino->st_mode; > + inode->i_atime = ino->st_mtim; > + inode->i_mtime = ino->st_mtim; > + inode->i_ctime = ino->st_ctim; > + > + switch (ino->st_mode & S_IFMT) { > + case S_IFREG: > + inode->i_op = &cfs_file_inode_operations; > + inode->i_fop = &cfs_file_operations; > + inode->i_size = ino->st_size; > + break; > + case S_IFLNK: > + inode->i_link = cino->inode_data.path_payload; > + inode->i_op = &cfs_link_inode_operations; > + inode->i_fop = &cfs_file_operations; > + break; > + case S_IFDIR: > + inode->i_op = &cfs_dir_inode_operations; > + inode->i_fop = &cfs_dir_operations; > + inode->i_size = 4096; > + break; > + case S_IFCHR: > + case S_IFBLK: > + if (current_user_ns() != &init_user_ns) { > + ret = -EPERM; > + goto fail; > + } > + fallthrough; > + default: > + inode->i_op = &cfs_file_inode_operations; > + init_special_inode(inode, ino->st_mode, ino->st_rdev); > + break; > + } > + } > + return inode; > + > +fail: > + if (inode) > + iput(inode); > + kfree(xattrs); > + cfs_inode_data_put(&inode_data); > + return ERR_PTR(ret); > +} > + > +static struct inode *cfs_get_root_inode(struct super_block *sb) > +{ > + struct cfs_info *fsi = sb->s_fs_info; > + struct cfs_inode_s ino_buf; > + struct cfs_inode_s *ino; > + u64 index; > + > + ino = cfs_get_root_ino(&fsi->cfs_ctx, &ino_buf, &index); > + if (IS_ERR(ino)) > + return ERR_CAST(ino); > + > + return cfs_make_inode(&fsi->cfs_ctx, sb, index, ino, NULL); > +} > + > +static bool cfs_iterate_cb(void *private, const char *name, int name_len, > + u64 ino, unsigned int dtype) > +{ > + struct dir_context *ctx = private; > + > + if (!dir_emit(ctx, name, name_len, ino, dtype)) > + return 0; > + > + ctx->pos++; > + return 1; > +} > + > +static int cfs_iterate(struct file *file, struct dir_context *ctx) > +{ > + struct inode *inode = file->f_inode; > + struct cfs_info *fsi = inode->i_sb->s_fs_info; > + struct cfs_inode *cino = CFS_I(inode); > + > + if (!dir_emit_dots(file, ctx)) > + return 0; > + > + return cfs_dir_iterate(&fsi->cfs_ctx, inode->i_ino, &cino->inode_data, > + ctx->pos - 2, cfs_iterate_cb, ctx); > +} > + > +static struct dentry *cfs_lookup(struct inode *dir, struct dentry *dentry, > + unsigned int flags) > +{ > + struct cfs_info *fsi = dir->i_sb->s_fs_info; > + struct cfs_inode *cino = CFS_I(dir); > + struct cfs_inode_s ino_buf; > + struct inode *inode; > + struct cfs_inode_s *ino_s; > + u64 index; > + int ret; > + > + if (dentry->d_name.len > NAME_MAX) > + return ERR_PTR(-ENAMETOOLONG); > + > + ret = cfs_dir_lookup(&fsi->cfs_ctx, dir->i_ino, &cino->inode_data, > + dentry->d_name.name, dentry->d_name.len, &index); > + if (ret < 0) > + return ERR_PTR(ret); > + if (ret == 0) > + goto return_negative; > + > + ino_s = cfs_get_ino_index(&fsi->cfs_ctx, index, &ino_buf); > + if (IS_ERR(ino_s)) > + return ERR_CAST(ino_s); > + > + inode = cfs_make_inode(&fsi->cfs_ctx, dir->i_sb, index, ino_s, dir); > + if (IS_ERR(inode)) > + return ERR_CAST(inode); > + > + return d_splice_alias(inode, dentry); > + > +return_negative: > + d_add(dentry, NULL); > + return NULL; > +} > + > +static const struct file_operations cfs_dir_operations = { > + .llseek = generic_file_llseek, > + .read = generic_read_dir, > + .iterate_shared = cfs_iterate, > +}; > + > +static const struct inode_operations cfs_dir_inode_operations = { > + .lookup = cfs_lookup, > + .listxattr = cfs_listxattr, > +}; > + > +static const struct inode_operations cfs_link_inode_operations = { > + .get_link = simple_get_link, > + .listxattr = cfs_listxattr, > +}; > + > +static void digest_to_string(const u8 *digest, char *buf) > +{ > + static const char hexchars[] = "0123456789abcdef"; > + u32 i, j; > + > + for (i = 0, j = 0; i < SHA256_DIGEST_SIZE; i++, j += 2) { > + u8 byte = digest[i]; > + > + buf[j] = hexchars[byte >> 4]; > + buf[j + 1] = hexchars[byte & 0xF]; > + } > + buf[j] = '\0'; > +} Use %*phN printk() format with snprintf(). See the section 'Raw buffer as a hex string' in Documentation/core-api/printk-formats.rst. > + > +static int digest_from_string(const char *digest_str, u8 *digest) > +{ > + size_t i, j; > + > + for (i = 0, j = 0; i < SHA256_DIGEST_SIZE; i += 1, j += 2) { > + int big, little; > + > + if (digest_str[j] == 0 || digest_str[j + 1] == 0) > + return -EINVAL; /* Too short string */ > + > + big = cfs_xdigit_value(digest_str[j]); > + little = cfs_xdigit_value(digest_str[j + 1]); > + > + if (big == -1 || little == -1) > + return -EINVAL; /* Not hex digit */ > + > + digest[i] = (big << 4) | little; > + } > + > + if (digest_str[j] != 0) > + return -EINVAL; /* Too long string */ > + > + return 0; > +} I mentioned this an another patch in this series but replace with hex2bin() from lib/hexdump.c. > + > +/* > + * Display the mount options in /proc/mounts. > + */ > +static int cfs_show_options(struct seq_file *m, struct dentry *root) > +{ > + struct cfs_info *fsi = root->d_sb->s_fs_info; > + > + if (fsi->base_path) > + seq_show_option(m, "basedir", fsi->base_path); > + if (fsi->has_digest) { > + char buf[SHA256_DIGEST_SIZE * 2 + 1]; > + > + digest_to_string(fsi->digest, buf); > + seq_show_option(m, "digest", buf); > + } > + if (fsi->verity_check != 0) > + seq_printf(m, ",verity_check=%u", fsi->verity_check); > + > + return 0; > +} > + > +static struct kmem_cache *cfs_inode_cachep; > + > +static void cfs_destroy_inode(struct inode *inode) > +{ > + struct cfs_inode *cino = CFS_I(inode); > + > + cfs_inode_data_put(&cino->inode_data); > +} > + > +static void cfs_free_inode(struct inode *inode) > +{ > + struct cfs_inode *cino = CFS_I(inode); > + > + kmem_cache_free(cfs_inode_cachep, cino); > +} > + > +static void cfs_put_super(struct super_block *sb) > +{ > + struct cfs_info *fsi = sb->s_fs_info; > + > + cfs_ctx_put(&fsi->cfs_ctx); > + if (fsi->bases) { > + kern_unmount_array(fsi->bases, fsi->n_bases); > + kfree(fsi->bases); > + } > + kfree(fsi->base_path); > + > + kfree(fsi); > +} > + > +static int cfs_statfs(struct dentry *dentry, struct kstatfs *buf) > +{ > + struct cfs_info *fsi = dentry->d_sb->s_fs_info; > + int err = 0; > + > + /* We return the free space, etc from the first base dir. */ > + if (fsi->n_bases > 0) { > + struct path root = { .mnt = fsi->bases[0], > + .dentry = fsi->bases[0]->mnt_root }; > + err = vfs_statfs(&root, buf); > + } > + > + if (!err) { > + buf->f_namelen = NAME_MAX; > + buf->f_type = dentry->d_sb->s_magic; > + } > + > + return err; > +} > + > +static const struct super_operations cfs_ops = { > + .statfs = cfs_statfs, > + .drop_inode = generic_delete_inode, > + .show_options = cfs_show_options, > + .put_super = cfs_put_super, > + .destroy_inode = cfs_destroy_inode, > + .alloc_inode = cfs_alloc_inode, > + .free_inode = cfs_free_inode, > +}; > + > +enum cfs_param { > + Opt_base_path, > + Opt_digest, > + Opt_verity_check, > +}; > + > +const struct fs_parameter_spec cfs_parameters[] = { > + fsparam_string("basedir", Opt_base_path), > + fsparam_string("digest", Opt_digest), > + fsparam_u32("verity_check", Opt_verity_check), > + {} > +}; > + > +static int cfs_parse_param(struct fs_context *fc, struct fs_parameter *param) > +{ > + struct fs_parse_result result; > + struct cfs_info *fsi = fc->s_fs_info; > + int opt, r; > + > + opt = fs_parse(fc, cfs_parameters, param, &result); > + if (opt == -ENOPARAM) > + return vfs_parse_fs_param_source(fc, param); > + if (opt < 0) > + return opt; > + > + switch (opt) { > + case Opt_base_path: > + kfree(fsi->base_path); > + /* Take ownership. */ > + fsi->base_path = param->string; > + param->string = NULL; > + break; > + case Opt_digest: > + r = digest_from_string(param->string, fsi->digest); > + if (r < 0) > + return r; > + fsi->has_digest = true; > + fsi->verity_check = 2; /* Default to full verity check */ > + break; > + case Opt_verity_check: > + if (result.uint_32 > 2) > + return invalfc(fc, "Invalid verity_check mode"); > + fsi->verity_check = result.uint_32; > + break; > + } > + > + return 0; > +} > + > +static struct vfsmount *resolve_basedir(const char *name) > +{ > + struct path path = {}; > + struct vfsmount *mnt; > + int err = -EINVAL; > + > + if (!*name) { > + pr_err("empty basedir\n"); > + goto out; > + } > + err = kern_path(name, LOOKUP_FOLLOW, &path); > + if (err) { > + pr_err("failed to resolve '%s': %i\n", name, err); > + goto out; > + } > + > + mnt = clone_private_mount(&path); > + err = PTR_ERR(mnt); > + if (IS_ERR(mnt)) { > + pr_err("failed to clone basedir\n"); > + goto out_put; > + } > + > + path_put(&path); > + > + /* Don't inherit atime flags */ > + mnt->mnt_flags &= ~(MNT_NOATIME | MNT_NODIRATIME | MNT_RELATIME); > + > + return mnt; > + > +out_put: > + path_put(&path); > +out: > + return ERR_PTR(err); > +} > + > +static int cfs_fill_super(struct super_block *sb, struct fs_context *fc) > +{ > + struct cfs_info *fsi = sb->s_fs_info; > + struct vfsmount **bases = NULL; > + size_t numbasedirs = 0; > + struct inode *inode; > + struct vfsmount *mnt; > + int ret; > + > + if (sb->s_root) > + return -EINVAL; > + > + /* Set up the inode allocator early */ > + sb->s_op = &cfs_ops; > + sb->s_flags |= SB_RDONLY; > + sb->s_magic = CFS_MAGIC; > + sb->s_xattr = cfs_xattr_handlers; > + sb->s_export_op = &cfs_export_operations; > + > + if (fsi->base_path) { > + char *lower, *splitlower = NULL; > + size_t i; > + > + ret = -ENOMEM; > + splitlower = kstrdup(fsi->base_path, GFP_KERNEL); > + if (!splitlower) > + goto fail; > + > + ret = -EINVAL; > + numbasedirs = cfs_split_basedirs(splitlower); > + if (numbasedirs > CFS_MAX_STACK) { > + pr_err("too many lower directories, limit is %d\n", > + CFS_MAX_STACK); > + kfree(splitlower); > + goto fail; > + } > + > + ret = -ENOMEM; > + bases = kcalloc(numbasedirs, sizeof(struct vfsmount *), > + GFP_KERNEL); > + if (!bases) { > + kfree(splitlower); > + goto fail; > + } > + > + lower = splitlower; > + for (i = 0; i < numbasedirs; i++) { > + mnt = resolve_basedir(lower); > + if (IS_ERR(mnt)) { > + ret = PTR_ERR(mnt); > + kfree(splitlower); > + goto fail; > + } > + bases[i] = mnt; > + > + lower = strchr(lower, '\0') + 1; > + } > + kfree(splitlower); > + } > + > + /* Must be inited before calling cfs_get_inode. */ > + ret = cfs_init_ctx(fc->source, fsi->has_digest ? fsi->digest : NULL, > + &fsi->cfs_ctx); > + if (ret < 0) > + goto fail; > + > + inode = cfs_get_root_inode(sb); > + if (IS_ERR(inode)) { > + ret = PTR_ERR(inode); > + goto fail; > + } > + sb->s_root = d_make_root(inode); > + > + ret = -ENOMEM; > + if (!sb->s_root) > + goto fail; > + > + sb->s_maxbytes = MAX_LFS_FILESIZE; > + sb->s_blocksize = PAGE_SIZE; > + sb->s_blocksize_bits = PAGE_SHIFT; > + > + sb->s_time_gran = 1; > + > + fsi->bases = bases; > + fsi->n_bases = numbasedirs; > + return 0; > +fail: > + if (bases) { > + size_t i; > + > + for (i = 0; i < numbasedirs; i++) { You can now do 'for (size_t i = 0; ...' > + if (bases[i]) > + kern_unmount(bases[i]); > + } > + kfree(bases); > + } > + cfs_ctx_put(&fsi->cfs_ctx); > + return ret; > +} > + > +static int cfs_get_tree(struct fs_context *fc) > +{ > + return get_tree_nodev(fc, cfs_fill_super); > +} > + > +static const struct fs_context_operations cfs_context_ops = { > + .parse_param = cfs_parse_param, > + .get_tree = cfs_get_tree, > +}; > + > +static struct file *open_base_file(struct cfs_info *fsi, struct inode *inode, > + struct file *file) > +{ > + struct cfs_inode *cino = CFS_I(inode); > + struct file *real_file; > + char *real_path = cino->inode_data.path_payload; > + size_t i; > + > + for (i = 0; i < fsi->n_bases; i++) { > + real_file = file_open_root_mnt(fsi->bases[i], real_path, > + file->f_flags, 0); > + if (!IS_ERR(real_file) || PTR_ERR(real_file) != -ENOENT) > + return real_file; > + } > + > + return ERR_PTR(-ENOENT); > +} > + > +static int cfs_open_file(struct inode *inode, struct file *file) > +{ > + struct cfs_inode *cino = CFS_I(inode); > + struct cfs_info *fsi = inode->i_sb->s_fs_info; > + struct file *real_file; > + struct file *faked_file; > + char *real_path = cino->inode_data.path_payload; > + > + if (WARN_ON(!file)) > + return -EIO; > + > + if (file->f_flags & (O_WRONLY | O_RDWR | O_CREAT | O_EXCL | O_TRUNC)) > + return -EROFS; > + > + if (!real_path) { > + file->private_data = &empty_file; > + return 0; > + } > + > + if (fsi->verity_check >= 2 && !cino->inode_data.has_digest) { > + pr_warn("WARNING: composefs image file '%pd' specified no fs-verity digest\n", > + file->f_path.dentry); > + return -EIO; > + } > + > + real_file = open_base_file(fsi, inode, file); > + > + if (IS_ERR(real_file)) > + return PTR_ERR(real_file); > + > + /* If metadata records a digest for the file, ensure it is there > + * and correct before using the contents. > + */ > + if (cino->inode_data.has_digest && fsi->verity_check >= 1) { > + u8 verity_digest[FS_VERITY_MAX_DIGEST_SIZE]; > + enum hash_algo verity_algo; > + int res; > + > + res = fsverity_get_digest(d_inode(real_file->f_path.dentry), > + verity_digest, &verity_algo); > + if (res < 0) { > + pr_warn("WARNING: composefs backing file '%pd' has no fs-verity digest\n", > + real_file->f_path.dentry); > + fput(real_file); > + return -EIO; > + } > + if (verity_algo != HASH_ALGO_SHA256 || > + memcmp(cino->inode_data.digest, verity_digest, > + SHA256_DIGEST_SIZE) != 0) { > + pr_warn("WARNING: composefs backing file '%pd' has the wrong fs-verity digest\n", > + real_file->f_path.dentry); > + fput(real_file); > + return -EIO; > + } > + } > + > + faked_file = open_with_fake_path(&file->f_path, file->f_flags, > + real_file->f_inode, current_cred()); > + fput(real_file); > + > + if (IS_ERR(faked_file)) > + return PTR_ERR(faked_file); > + > + file->private_data = faked_file; > + return 0; > +} > + > +static unsigned long cfs_mmu_get_unmapped_area(struct file *file, > + unsigned long addr, > + unsigned long len, > + unsigned long pgoff, > + unsigned long flags) > +{ > + struct file *realfile = file->private_data; > + > + if (realfile == &empty_file) > + return 0; > + > + return current->mm->get_unmapped_area(file, addr, len, pgoff, flags); > +} > + > +static int cfs_release_file(struct inode *inode, struct file *file) > +{ > + struct file *realfile = file->private_data; > + > + if (WARN_ON(!realfile)) > + return -EIO; > + > + if (realfile == &empty_file) > + return 0; > + > + fput(file->private_data); > + > + return 0; > +} > + > +static int cfs_mmap(struct file *file, struct vm_area_struct *vma) > +{ > + struct file *realfile = file->private_data; > + int ret; > + > + if (realfile == &empty_file) > + return 0; > + > + if (!realfile->f_op->mmap) > + return -ENODEV; > + > + if (WARN_ON(file != vma->vm_file)) > + return -EIO; > + > + vma_set_file(vma, realfile); > + > + ret = call_mmap(vma->vm_file, vma); > + > + return ret; > +} > + > +static ssize_t cfs_read_iter(struct kiocb *iocb, struct iov_iter *iter) > +{ > + struct file *file = iocb->ki_filp; > + struct file *realfile = file->private_data; > + int ret; > + > + if (realfile == &empty_file) > + return 0; > + > + if (!realfile->f_op->read_iter) > + return -ENODEV; > + > + iocb->ki_filp = realfile; > + ret = call_read_iter(realfile, iocb, iter); > + iocb->ki_filp = file; > + > + return ret; > +} > + > +static int cfs_fadvise(struct file *file, loff_t offset, loff_t len, int advice) > +{ > + struct file *realfile = file->private_data; > + > + if (realfile == &empty_file) > + return 0; > + > + return vfs_fadvise(realfile, offset, len, advice); > +} > + > +static int cfs_encode_fh(struct inode *inode, u32 *fh, int *max_len, > + struct inode *parent) > +{ > + int len = 3; > + u64 nodeid; > + u32 generation; > + > + if (*max_len < len) { > + *max_len = len; > + return FILEID_INVALID; > + } > + > + nodeid = inode->i_ino; > + generation = inode->i_generation; > + > + fh[0] = (u32)(nodeid >> 32); > + fh[1] = (u32)(nodeid & 0xffffffff); > + fh[2] = generation; > + > + *max_len = len; > + > + return 0x91; > +} > + > +static struct dentry *cfs_fh_to_dentry(struct super_block *sb, struct fid *fid, > + int fh_len, int fh_type) > +{ > + struct cfs_info *fsi = sb->s_fs_info; > + struct inode *ino; > + u64 inode_index; > + u32 generation; > + > + if (fh_type != 0x91 || fh_len < 3) 0x91 is referenced above as well. Should this be a #define? Brian